Conversation
|
@copilot build, run the tests, summarize failures and take them into account to apply a tentative fix. |
|
@achamayou I've opened a new pull request, #7732, to work on those changes. Once the pull request is ready, I'll request review from you. |
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR adds initial end-to-end coverage for IPv6 loopback usage in the test suite, and updates host/node networking code paths to better handle IPv6 addresses (notably in TCP connection setup and certificate SAN generation).
Changes:
- Add an e2e test that starts a network with RPC interfaces bound to
::1and exercises common endpoints. - Adjust test infra to avoid binding the IPv4-only “client interface” used for partition simulation when the node RPC host is IPv6.
- Improve IPv6 handling in node certificate SAN construction and in host TCP client socket creation.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
tests/infra/node.py |
Skips node_client_host binding when RPC host is IPv6 to avoid incompatible partition-simulation client interface. |
tests/e2e_logging.py |
Registers a new common_ipv6 sub-test using the logging sample app. |
tests/e2e_common_endpoints.py |
Adds run_ipv6() which forces nodes to bind to ::1 (skipping if IPv6 loopback is unavailable). |
src/node/node_state.h |
Strips brackets from IPv6 hosts when constructing default SANs; updates IPv6-related comment in is_ip(). |
src/host/tcp.h |
Defers client socket creation to use the resolved address family (IPv4 vs IPv6). |
Comments suppressed due to low confidence (1)
src/node/node_state.h:2347
- The is_ip() heuristic does not correctly detect general IPv6 literals: it only returns true when the final ':' component is purely numeric, so valid IPv6 addresses ending with hex letters (eg "2001:db8::a") will be treated as DNS names. This can produce incorrect SAN types in generated node certificates and break TLS validation when connecting by IP. Consider replacing this heuristic with a real IPv4/IPv6 parse (eg inet_pton / uv_inet_pton) after stripping brackets (and any zone ID if present).
bool is_ip(const std::string_view& hostname)
{
// IP address components are purely numeric. DNS names may be largely
// numeric, but at least the final component (TLD) must not be
// all-numeric. So this distinguishes "1.2.3.4" (an IP address) from
// "1.2.3.c4m" (a DNS name). "1.2.3." is invalid for either, and will
// throw. Handles IPv6 by splitting on ':' after splitting on '.'.
const auto final_component =
ccf::nonstd::split(ccf::nonstd::split(hostname, ".").back(), ":")
.back();
if (final_component.empty())
{
throw std::runtime_error(fmt::format(
"{} has a trailing period, is not a valid hostname", hostname));
}
return std::ranges::all_of(
final_component, [](char c) { return c >= '0' && c <= '9'; });
You can also share your feedback on Copilot code review. Take the survey.
| // Create the client socket with the correct address family, but only | ||
| // if client_bind() hasn't already created one via uv_tcp_bind(). | ||
| if (is_client && !client_host.has_value() && addr_current != nullptr) | ||
| { | ||
| int rc = 0; |
There was a problem hiding this comment.
In connect_resolved(), a new OS socket is created and passed to uv_tcp_open() every time this method is called. However on async connect failure, on_connect() advances addr_current and calls connect_resolved() again without closing/resetting the existing uv_handle FD. This will likely make the retry path fail (uv_tcp_open on an already-open handle) and can leak FDs. Consider creating/opening the socket only once per connection attempt (or closing + reinitialising the uv handle before retry), and ensure any newly created socket is closed on error paths (setsockopt/uv_tcp_open failures).
| // Create the client socket with the correct address family, but only | ||
| // if client_bind() hasn't already created one via uv_tcp_bind(). | ||
| if (is_client && !client_host.has_value() && addr_current != nullptr) | ||
| { |
There was a problem hiding this comment.
connect_resolved() only sets TCP_USER_TIMEOUT when !client_host.has_value(). When client_interface/client_host is set (client_bind path), the handle is created via uv_tcp_bind() and this option is never applied, changing behaviour for node-to-node connections when partition simulation is enabled. If TCP_USER_TIMEOUT is required, it should also be applied after a successful uv_tcp_bind() (eg via uv_fileno + setsockopt) or by creating/opening the socket before binding with the correct family.
#3215