ruby/ruby-3.4.0-Revert-Set-AI_ADDRCONFIG-when-making-getaddrinfo.patch
Jarek Prokop aa7e8debee Revert adding AI_ADDRCONFIG flag to getaddrinfo(3) calls.
It is causing problems when network is in certain, valid, configuration.
The point of it is to workaround a glibc bug that is not present for us.
https://bugs.ruby-lang.org/issues/20208

To reproduce, have IPv6 enabled localhost but only IPv4
interfaces (such as eth0, name and count of them may vary based on system)
to disable public interfaces via sysctl:
`$ sysctl "net.ipv6.conf.<interface>.disable_ipv6=1"`
Where <interface> is NOT lo but can be eth0 or other.

Confirm with command `$ ip addr` command that only `lo`
interface has IPv6 address.

and run the following with Ruby 3.3:
~~~
$ ruby -rnet/http -e 'http = Net::HTTP.new("localhost", 8080); http.local_host = Addrinfo.tcp("localhost", 8080).ip_address; p http.get("/")'
~~~
The error is then related to EAFNOSUPPORT:
```
Errno::EAFNOSUPPORT: Failed to open TCP connection to localhost:37337 (Address family not supported by protocol - bind(2) for "::1" port 45395)
```

running above Ruby command with `strace` suggest that
there is an attempt to bind with AF_INET6
using an AF_INET socket, however, only if the reproducer above
attempts to set the `http.local_host` with Addrinfo.

From upstream issue #20208:
"This is calling getaddrinfo to resolve "localhost" for us to use
it as the local side of the connection. Because Ruby does not know
what you intend to do with this IP address, it does not make the request
with AI_ADDRCONFIG. Thus, you get an IPv6 result returned,
since there is an IPv6 addres for localhost!"
2024-01-26 13:54:20 +01:00

242 lines
9.9 KiB
Diff

From c3655b89e7c06555a2e0bf13affb8a63a49f4296 Mon Sep 17 00:00:00 2001
From: Jarek Prokop <jprokop@redhat.com>
Date: Fri, 26 Jan 2024 11:19:48 +0100
Subject: [PATCH] Revert "Set AI_ADDRCONFIG when making getaddrinfo(3) calls
for outgoing conns (#7295)"
This reverts commit d2ba8ea54a4089959afdeecdd963e3c4ff391748.
The purpose of the commit is to workaround a GLIBC bug [0] still present
in older Ubuntu [1]. C8S/RHEL 8 has the fix for some time [2] and the
Ruby workaround is causing problems for us [3]. Therefore we can
revert it for EL8, EL9, and Fedora distros.
[0] https://sourceware.org/bugzilla/show_bug.cgi?id=26600
[1] https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1961697
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1868106
[3] https://bugs.ruby-lang.org/issues/20208
---
ext/socket/extconf.rb | 2 -
ext/socket/ipsocket.c | 11 +--
test/socket/test_tcp.rb | 164 ----------------------------------------
3 files changed, 2 insertions(+), 175 deletions(-)
diff --git a/ext/socket/extconf.rb b/ext/socket/extconf.rb
index 544bed5298..1ca52da366 100644
--- a/ext/socket/extconf.rb
+++ b/ext/socket/extconf.rb
@@ -607,8 +607,6 @@ def %(s) s || self end
EOS
end
- have_const('AI_ADDRCONFIG', headers)
-
case with_config("lookup-order-hack", "UNSPEC")
when "INET"
$defs << "-DLOOKUP_ORDER_HACK_INET"
diff --git a/ext/socket/ipsocket.c b/ext/socket/ipsocket.c
index 0a693655b4..0c13620258 100644
--- a/ext/socket/ipsocket.c
+++ b/ext/socket/ipsocket.c
@@ -54,22 +54,15 @@ init_inetsock_internal(VALUE v)
VALUE connect_timeout = arg->connect_timeout;
struct timeval tv_storage;
struct timeval *tv = NULL;
- int remote_addrinfo_hints = 0;
if (!NIL_P(connect_timeout)) {
tv_storage = rb_time_interval(connect_timeout);
tv = &tv_storage;
}
- if (type == INET_SERVER) {
- remote_addrinfo_hints |= AI_PASSIVE;
- }
-#ifdef HAVE_CONST_AI_ADDRCONFIG
- remote_addrinfo_hints |= AI_ADDRCONFIG;
-#endif
-
arg->remote.res = rsock_addrinfo(arg->remote.host, arg->remote.serv,
- family, SOCK_STREAM, remote_addrinfo_hints);
+ family, SOCK_STREAM,
+ (type == INET_SERVER) ? AI_PASSIVE : 0);
/*
diff --git a/test/socket/test_tcp.rb b/test/socket/test_tcp.rb
index 35d361f060..7f9dc53cae 100644
--- a/test/socket/test_tcp.rb
+++ b/test/socket/test_tcp.rb
@@ -140,168 +140,4 @@ def test_accept_multithread
server_threads.each(&:join)
end
end
-
- def test_ai_addrconfig
- # This test verifies that we pass AI_ADDRCONFIG to the DNS resolver when making
- # an outgoing connection.
- # The verification of this is unfortunately incredibly convoluted. We perform the
- # test by setting up a fake DNS server to receive queries. Then, we construct
- # an environment which has only IPv4 addresses and uses that fake DNS server. We
- # then attempt to make an outgoing TCP connection. Finally, we verify that we
- # only received A and not AAAA queries on our fake resolver.
- # This test can only possibly work on Linux, and only when run as root. If either
- # of these conditions aren't met, the test will be skipped.
-
- # The construction of our IPv6-free environment must happen in a child process,
- # which we can put in its own network & mount namespaces.
-
- omit "This test is disabled. It is retained to show the original intent of [ruby-core:110870]"
-
- IO.popen("-") do |test_io|
- if test_io.nil?
- begin
- # Child program
- require 'fiddle'
- require 'resolv'
- require 'open3'
-
- libc = Fiddle.dlopen(nil)
- begin
- unshare = Fiddle::Function.new(libc['unshare'], [Fiddle::TYPE_INT], Fiddle::TYPE_INT)
- rescue Fiddle::DLError
- # Test can't run because we don't have unshare(2) in libc
- # This will be the case on not-linux, and also on very old glibc versions (or
- # possibly other libc's that don't expose this syscall wrapper)
- $stdout.write(Marshal.dump({result: :skip, reason: "unshare(2) or mount(2) not in libc"}))
- exit
- end
-
- # Move our test process into a new network & mount namespace.
- # This environment will be configured to be IPv6 free and point DNS resolution
- # at a fake DNS server.
- # (n.b. these flags are CLONE_NEWNS | CLONE_NEWNET)
- ret = unshare.call(0x00020000 | 0x40000000)
- errno = Fiddle.last_error
- if ret == -1 && errno == Errno::EPERM::Errno
- # Test can't run because we're not root.
- $stdout.write(Marshal.dump({result: :skip, reason: "insufficient permissions to unshare namespaces"}))
- exit
- elsif ret == -1 && (errno == Errno::ENOSYS::Errno || errno == Errno::EINVAL::Errno)
- # No unshare(2) in the kernel (or kernel too old to know about this namespace type)
- $stdout.write(Marshal.dump({result: :skip, reason: "errno #{errno} calling unshare(2)"}))
- exit
- elsif ret == -1
- # Unexpected failure
- raise "errno #{errno} calling unshare(2)"
- end
-
- # Set up our fake DNS environment. Clean out /etc/hosts...
- fake_hosts_file = Tempfile.new('ruby_test_hosts')
- fake_hosts_file.write <<~HOSTS
- 127.0.0.1 localhost
- ::1 localhost
- HOSTS
- fake_hosts_file.flush
-
- # Have /etc/resolv.conf point to 127.0.0.1...
- fake_resolv_conf = Tempfile.new('ruby_test_resolv')
- fake_resolv_conf.write <<~RESOLV
- nameserver 127.0.0.1
- RESOLV
- fake_resolv_conf.flush
-
- # Also stub out /etc/nsswitch.conf; glibc can have other resolver modules
- # (like systemd-resolved) configured in there other than just using dns,
- # so rewrite it to remove any `hosts:` lines and add one which just uses
- # dns.
- real_nsswitch_conf = File.read('/etc/nsswitch.conf') rescue ""
- fake_nsswitch_conf = Tempfile.new('ruby_test_nsswitch')
- real_nsswitch_conf.lines.reject { _1 =~ /^\s*hosts:/ }.each do |ln|
- fake_nsswitch_conf.puts ln
- end
- fake_nsswitch_conf.puts "hosts: files myhostname dns"
- fake_nsswitch_conf.flush
-
- # This is needed to make sure our bind-mounds aren't visible outside this process.
- system 'mount', '--make-rprivate', '/', exception: true
- # Bind-mount the fake files over the top of the real files.
- system 'mount', '--bind', '--make-private', fake_hosts_file.path, '/etc/hosts', exception: true
- system 'mount', '--bind', '--make-private', fake_resolv_conf.path, '/etc/resolv.conf', exception: true
- system 'mount', '--bind', '--make-private', fake_nsswitch_conf.path, '/etc/nsswitch.conf', exception: true
-
- # Create a dummy interface with only an IPv4 address
- system 'ip', 'link', 'add', 'dummy0', 'type', 'dummy', exception: true
- system 'ip', 'addr', 'add', '192.168.1.2/24', 'dev', 'dummy0', exception: true
- system 'ip', 'link', 'set', 'dummy0', 'up', exception: true
- system 'ip', 'link', 'set', 'lo', 'up', exception: true
-
- # Disable IPv6 on this interface (this is needed to disable the link-local
- # IPv6 address)
- File.open('/proc/sys/net/ipv6/conf/dummy0/disable_ipv6', 'w') do |f|
- f.puts "1"
- end
-
- # Create a fake DNS server which will receive the DNS queries triggered by TCPSocket.new
- fake_dns_server_socket = UDPSocket.new
- fake_dns_server_socket.bind('127.0.0.1', 53)
- received_dns_queries = []
- fake_dns_server_thread = Thread.new do
- Socket.udp_server_loop_on([fake_dns_server_socket]) do |msg, msg_src|
- request = Resolv::DNS::Message.decode(msg)
- received_dns_queries << request
- response = request.dup.tap do |r|
- r.qr = 0
- r.rcode = 3 # NXDOMAIN
- end
- msg_src.reply response.encode
- end
- end
-
- # Make a request which will hit our fake DNS swerver - this needs to be in _another_
- # process because glibc will cache resolver info across the fork otherwise.
- load_path_args = $LOAD_PATH.flat_map { ['-I', _1] }
- Open3.capture3('/proc/self/exe', *load_path_args, '-rsocket', '-e', <<~RUBY)
- TCPSocket.open('www.example.com', 4444)
- RUBY
-
- fake_dns_server_thread.kill
- fake_dns_server_thread.join
-
- have_aaaa_qs = received_dns_queries.any? do |query|
- query.question.any? do |question|
- question[1] == Resolv::DNS::Resource::IN::AAAA
- end
- end
-
- have_a_q = received_dns_queries.any? do |query|
- query.question.any? do |question|
- question[0].to_s == "www.example.com"
- end
- end
-
- if have_aaaa_qs
- $stdout.write(Marshal.dump({result: :fail, reason: "got AAAA queries, expected none"}))
- elsif !have_a_q
- $stdout.write(Marshal.dump({result: :fail, reason: "got no A query for example.com"}))
- else
- $stdout.write(Marshal.dump({result: :success}))
- end
- rescue => ex
- $stdout.write(Marshal.dump({result: :fail, reason: ex.full_message}))
- ensure
- # Make sure the child process does not transfer control back into the test runner.
- exit!
- end
- else
- test_result = Marshal.load(test_io.read)
-
- case test_result[:result]
- when :skip
- omit test_result[:reason]
- when :fail
- fail test_result[:reason]
- end
- end
- end
- end
end if defined?(TCPSocket)
--
2.43.0