From 01f3f96d80d83cbbecabf509e2d1dc506e57dbde Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Fri, 8 May 2026 10:27:32 +0000
Subject: [PATCH] fix(conn): route to error_handler on partial-write-then-error
 (P3)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 3 of the graceful-shutdown / graceful-reload plan documented in
roadmap/plan-graceful-shutdown.md on the roadmap branch — Pattern D'
(write-path contract for non-TLS).

src/nxt_conn_write.c
--------------------

nxt_conn_io_write() in master ran:

    if (ret == 0 || sb.sent != 0) {
        /*
         * ret == NXT_ERROR is ignored here if some data was sent,
         * the error will be handled on the next nxt_conn_write() call.
         */
        c->sent += sb.sent;
        nxt_work_queue_add(... ready_handler ...);
        return;
    }

so a write that returned NXT_ERROR after a partial send (e.g. ECONNRESET
on the second writev/sendfile of the same nxt_conn_io_write call after
the first succeeded) was silently routed to ready_handler, deferring
detection until the next event-loop tick.  The TLS path
(nxt_openssl_conn_test_error at src/nxt_openssl.c:1543-1607) already
fails fast on the equivalent condition; this commit extends that
contract to plain HTTP.

The new routing:

    if (ret == 0 || (ret != NXT_ERROR && sb.sent != 0)) {
        /* normal completion path - unchanged */
        ...
    }

    if (ret != NXT_ERROR) {
        return;
    }

    if (sb.sent != 0) {
        c->sent += sb.sent;
        nxt_log(task, NXT_LOG_INFO,
                "conn write: peer closed mid-response fd:%d sent:%O",
                c->socket.fd, sb.sent);
    }

    nxt_fd_event_block_write(engine, &c->socket);
    /* falls through to error: label */

NXT_AGAIN with sb.sent != 0 still routes to ready_handler (correct -
caller should retry).  Only NXT_ERROR with sb.sent != 0 is the new
fast-fail path, with one INFO record so operators can correlate the
routing decision with the underlying writev/send/sendfile syscall log
emitted at nxt_socket_error_level (also INFO for ECONNRESET-class).

src/nxt_router.c
----------------

In nxt_router_get_mmap_handler(), the NULL-app-pointer branch logged at
ALERT severity (a //FIXME placeholder).  This branch fires legitimately
during graceful reload when an app's reply port outlives the app
struct - it is a benign race, not corruption.  Demoted to INFO with a
comment so reload-under-load doesn't flood the log with ALERT entries.

The companion FIXME at nxt_router.c:5914 (mmap-id-out-of-range) is
left at ALERT: it represents a real protocol violation and should not
be hidden.  See P3 plan note in plan-graceful-shutdown.md.

Scope notes
-----------

Of the six sites originally listed under P3 in plan-graceful-shutdown.md:

- src/nxt_conn_write.c (the actual h1proto write loop, one level deeper
  than the plan's "nxt_h1proto.c write loop" pointer): fixed here.
- src/nxt_router.c:5898 (get_mmap_handler NULL-app): demoted here.
- src/nxt_router.c:5914 (mmap-id-out-of-range): left at ALERT - protocol
  violation, not peer-close.
- src/nxt_port_socket.c:749, :892, :1345: re-classified as port-socket
  read-side (IPC) buf-alloc backpressure, not write-path D'.  Belongs
  in a separate IPC-resilience PR; tracked but out of scope here.

No regression test in this commit
---------------------------------

The new branch fires only when one nxt_conn_io_write() call has
(a) one successful sendbuf followed by (b) one NXT_ERROR sendbuf in
its do-while loop.  On loopback with SO_LINGER{1,0}+close, the RST
arrives before Unit's first sendbuf in 100% of attempts, so sb.sent
stays 0 and the new branch is unreachable from a black-box test.

Deterministic regression coverage requires malloc-failure / syscall-
failure injection so the second sendbuf can be forced to NXT_ERROR
after the first succeeds.  That harness is designed in
roadmap/plan-malloc-injection.md (PR #9) and will be the natural
follow-up consumer.

The fix itself is review-verified by inspection of the two-line
boolean change and the four lines of new logging.

Tests
-----

./configure --tests
make -j$(nproc)                                          clean build
python3 -m pytest test/test_idle_close_wait.py           2 pass
python3 -m pytest test/test_static.py                    18 pass, 1 skip

The static.py suite exercises the share + writev/sendfile paths the
fix touches; no behavioural change observed for normal-completion or
NXT_AGAIN-mid-response traffic.
---
 src/nxt_conn_write.c | 23 +++++++++++++++++------
 src/nxt_router.c     | 13 ++++++++++---
 2 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/src/nxt_conn_write.c b/src/nxt_conn_write.c
index 7d0a579f2..2a92f0f5d 100644
--- a/src/nxt_conn_write.c
+++ b/src/nxt_conn_write.c
@@ -118,12 +118,8 @@ nxt_conn_io_write(nxt_task_t *task, void *obj, void *data)
         }
     }
 
-    if (ret == 0 || sb.sent != 0) {
-        /*
-         * ret == 0 means a sync buffer was processed.
-         * ret == NXT_ERROR is ignored here if some data was sent,
-         * the error will be handled on the next nxt_conn_write() call.
-         */
+    if (ret == 0 || (ret != NXT_ERROR && sb.sent != 0)) {
+        /* ret == 0 means a sync buffer was processed. */
         c->sent += sb.sent;
         nxt_work_queue_add(c->write_work_queue, c->write_state->ready_handler,
                            task, c, data);
@@ -134,6 +130,21 @@ nxt_conn_io_write(nxt_task_t *task, void *obj, void *data)
         return;
     }
 
+    if (sb.sent != 0) {
+        /*
+         * Pattern D′: route to error_handler immediately when a write
+         * returns NXT_ERROR after a partial send (e.g. peer closed
+         * mid-response).  Pre-fix, this case fell through to
+         * ready_handler and the error was deferred to the next
+         * nxt_conn_io_write() call via c->socket.error.  Mirrors the
+         * TLS path's fail-fast contract; see nxt_openssl_conn_test_error.
+         */
+        c->sent += sb.sent;
+        nxt_log(task, NXT_LOG_INFO,
+                "conn write: peer closed mid-response fd:%d sent:%O",
+                c->socket.fd, sb.sent);
+    }
+
     nxt_fd_event_block_write(engine, &c->socket);
 
 error:
diff --git a/src/nxt_router.c b/src/nxt_router.c
index c994f3707..3f2161db7 100644
--- a/src/nxt_router.c
+++ b/src/nxt_router.c
@@ -5892,10 +5892,17 @@ nxt_router_get_mmap_handler(nxt_task_t *task, nxt_port_recv_msg_t *msg)
     nxt_assert(port->type == NXT_PROCESS_APP);
 
     if (nxt_slow_path(port->app == NULL)) {
-        nxt_alert(task, "get_mmap_handler: app == NULL for reply port %PI:%d",
-                  port->pid, port->id);
+        /*
+         * Pattern D′: a NULL app pointer here is a benign race during
+         * worker shutdown / graceful reload (the app freed before its
+         * reply port was drained), not a corruption.  Demoted from
+         * ALERT so reload-under-load doesn't pollute the log.
+         */
+        nxt_log(task, NXT_LOG_INFO,
+                "get_mmap_handler: app == NULL for reply port %PI:%d "
+                "(peer likely shutting down)",
+                port->pid, port->id);
 
-        // FIXME
         nxt_port_socket_write(task, port, NXT_PORT_MSG_RPC_ERROR,
                               -1, msg->port_msg.stream, 0, NULL);