From 33064cd077cf6fa386f0a5a840c2161868da7b3a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ond=C5=99ej=20Sur=C3=BD?= <ondrej@isc.org>
Date: Tue, 8 Feb 2022 12:42:34 +0100
Subject: [PATCH] Run .closehandle_cb asynchrounosly in nmhandle_detach_cb()

When sock->closehandle_cb is set, we need to run nmhandle_detach_cb()
asynchronously to ensure correct order of multiple packets processing in
the isc__nm_process_sock_buffer().  When not run asynchronously, it
would cause:

  a) out-of-order processing of the return codes from processbuffer();

  b) stack growth because the next TCP DNS message read callback will
     be called from within the current TCP DNS message read callback.

The sock->closehandle_cb is set to isc__nm_resume_processing() for TCP
sockets which calls isc__nm_process_sock_buffer().  If the read callback
(called from isc__nm_process_sock_buffer()->processbuffer()) doesn't
attach to the nmhandle (f.e. because it wants to drop the processing or
we send the response directly via uv_try_write()), the
isc__nm_resume_processing() (via .closehandle_cb) would call
isc__nm_process_sock_buffer() recursively.

The below shortened code path shows how the stack can grow:

 1: ns__client_request(handle, ...);
 2: isc_nm_tcpdns_sequential(handle);
 3: ns_query_start(client, handle);
 4:   query_lookup(qctx);
 5:     query_send(qctcx->client);
 6:       isc__nmhandle_detach(&client->reqhandle);
 7:         nmhandle_detach_cb(&handle);
 8:           sock->closehandle_cb(sock); // isc__nm_resume_processing
 9:             isc__nm_process_sock_buffer(sock);
10:               processbuffer(sock); // isc__nm_tcpdns_processbuffer
11:                 isc_nmhandle_attach(req->handle, &handle);
12:                 isc__nm_readcb(sock, req, ISC_R_SUCCESS);
13:                   isc__nm_async_readcb(NULL, ...);
14:                     uvreq->cb.recv(...); // ns__client_request

Instead, if 'sock->closehandle_cb' is set, we need to run detach the
handle asynchroniously in 'isc__nmhandle_detach', so that on line 8 in
the code flow above does not start this recursion. This ensures the
correct order when processing multiple packets in the function
'isc__nm_process_sock_buffer()' and prevents the stack growth.

When not run asynchronously, the out-of-order processing leaves the
first TCP socket open until all requests on the stream have been
processed.

If the pipelining is disabled on the TCP via `keep-response-order`
configuration option, named would keep the first socket in lingering
CLOSE_WAIT state when the client sends an incomplete packet and then
closes the connection from the client side.

(cherry picked from commit afee2b5a7bc933a2d987907fc327a9f118fdbd17)
---
 lib/isc/netmgr/netmgr.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/isc/netmgr/netmgr.c b/lib/isc/netmgr/netmgr.c
index 3283eb6e4f..0ed3182fb6 100644
--- a/lib/isc/netmgr/netmgr.c
+++ b/lib/isc/netmgr/netmgr.c
@@ -1746,8 +1746,12 @@ isc__nmhandle_detach(isc_nmhandle_t **handlep FLARG) {
 	handle = *handlep;
 	*handlep = NULL;
 
+	/*
+	 * If the closehandle_cb is set, it needs to run asynchronously to
+	 * ensure correct ordering of the isc__nm_process_sock_buffer().
+	 */
 	sock = handle->sock;
-	if (sock->tid == isc_nm_tid()) {
+	if (sock->tid == isc_nm_tid() && sock->closehandle_cb == NULL) {
 		nmhandle_detach_cb(&handle FLARG_PASS);
 	} else {
 		isc__netievent_detach_t *event =
-- 
2.34.1