Fix WASIX socket connect to detect connection failures#6059

Draft
Copilot wants to merge 5 commits into main from copilot/fix-wasix-socket-connection

Contributor

Copilot AI commented Jan 15, 2026

Description

Socket connect operations in WASIX always returned success, even when connections failed (e.g., ECONNREFUSED). This broke applications that probe ports for availability.

Root Cause

poll_write_ready in LocalTcpStream only checked POLLOUT and POLLHUP flags. When TCP connections fail, the OS sets POLLERR, which was being ignored.

Changes

  • Added POLLERR to poll flags in poll_write_ready (lib/virtual-net/src/host.rs)
  • When POLLERR is detected, retrieve actual error via getsockopt(SO_ERROR) and propagate as NetworkError
  • Handle edge cases: getsockopt failure, POLLERR with SO_ERROR=0

Before:

connect(fd, addr, len)  // Returns 0 (success) even for closed ports
errno  // 0

After:

connect(fd, addr, len)  // Returns -1 (failure) for closed ports
errno  // 111 (ECONNREFUSED)

This follows standard POSIX socket error handling and enables port probing logic.
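
The three cases listed under Changes can be sketched in plain C. This is only an illustration of the described behavior, not the Rust code in lib/virtual-net/src/host.rs: the helper name `write_ready_status` is hypothetical, and `EIO` stands in here for the generic `NetworkError::IOError` fallback.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the PR): map the poll() revents of a
 * connecting socket to 0 (no error reported) or an errno-style code. */
int write_ready_status(int fd, short revents) {
    /* Caller is expected to pass revents that poll() actually reported. */
    assert((revents & (POLLOUT | POLLHUP | POLLERR)) != 0);

    if (revents & POLLERR) {
        int pending = 0;
        socklen_t len = sizeof(pending);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &pending, &len) != 0) {
            return errno; /* getsockopt itself failed */
        }
        /* POLLERR with SO_ERROR == 0 is unexpected; report a generic error. */
        return pending != 0 ? pending : EIO;
    }
    return 0;
}
```

Reading `SO_ERROR` also clears the pending error on the socket, so a helper like this should be called once per reported event and its result propagated rather than re-queried.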

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • registry.wasmer.io
    • Triggering command: /home/REDACTED/work/wasmer/wasmer/target/debug/deps/wasmer_wasix-539a270b3a08905e /home/REDACTED/work/wasmer/wasmer/target/debug/deps/wasmer_wasix-539a270b3a08905e 5-cgu.0.rcgu.o stup/toolchains//home/REDACTED/work/wasmer/wasmer/target/debug/build/blake3-de6d20debuginfo=2 n/rustc -f39�� o ld/syn-f84641bab/home/REDACTED/work/wasmer/wasmer/target/debug/deps/wasmer_derive-feature="std" f/num_enum-0.7.5/src/lib.rs c6597.tracing_test_macro.db42951b32e2e2f8-cgu.0.rcgu.o c6597.tracing_test_macro.db42951b32e2e2f8-cgu.1.rcgu.o own-linux-gnu/li--64 c6597.28uagvhqakf0ge67mb1wav86y.rcgu.o own-�� a8ae0c185c/out/1/tmp/ccDYIUaT.s 6b65.rlib (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

This section details the original issue you should resolve

<issue_title>WASIX socket connect always returns success even if it failed</issue_title>
<issue_description>Some applications (in our case, Langflow) attempt to find an open host port to run on by probing random ports. However, this currently does not work, as our connection always succeeds even if a port is not open.

Below is a simplified C version of code that works on native, but not on WASIX.

#include <arpa/inet.h>
#include <errno.h>
#include <netdb.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int connect_ex(const char *host, int port) {
    struct addrinfo hints;
    struct addrinfo *result = NULL;
    char port_str[16];
    int fd = -1;
    int status = 0;

    snprintf(port_str, sizeof(port_str), "%d", port);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    int gai = getaddrinfo(host, port_str, &hints, &result);
    if (gai != 0) {
        /* getaddrinfo reports failure through its return value, not errno. */
        printf("getaddrinfo failed: %s\n", gai_strerror(gai));
        return gai;
    }

    fd = socket(result->ai_family, result->ai_socktype, result->ai_protocol);
    if (fd < 0) {
        status = errno;
        printf("socket failed (errno = %d)\n", status);
        freeaddrinfo(result);
        return status;
    }

    int connect_result = connect(fd, result->ai_addr, result->ai_addrlen);
    if (connect_result != 0) {
        status = errno;
    }
    printf("connect returned %d (errno = %d)\n", connect_result, status);



    close(fd);
    freeaddrinfo(result);
    return status;
}

static bool is_port_in_use(int port, const char *host) {
    return connect_ex(host, port) == 0;
}

static int get_free_port(int port, const char *host) {
    while (is_port_in_use(port, host)) {
        port += 1;
    }
    return port;
}

int main(int argc, char **argv) {
    const char *host = "localhost";
    int port = 8000;
    char *end = NULL;

    if (argc > 1) {
        port = (int)strtol(argv[1], &end, 10);
        if (end == argv[1] || *end != '\0') {
            fprintf(stderr, "Invalid port: %s\n", argv[1]);
            return 1;
        }
    }

    if (argc > 2) {
        host = argv[2];
    }

    printf("Checking port %d, %s\n", port, host);

    if (is_port_in_use(port, host)) {
        printf("Port %d, %s is already in use. Finding a free port...\n", port, host);
        port = get_free_port(port, host);
    }

    printf("Using port %d, %s\n", port, host);
    return 0;
}

Compile and run on native with

clang freeport.c -o freeport.native
./freeport.native

Checking port 8000, localhost
connect returned -1 (errno = 111)
Using port 8000, localhost

Compile and run on WASIX

wasixcc -fwasm-exceptions freeport.c -o freeport
wasmer run --net ./freeport


Checking port 8000, localhost
connect returned 0 (errno = 0)
Port 8000, localhost is already in use. Finding a free port...
connect returned 0 (errno = 0)
connect returned 0 (errno = 0)
...

</issue_description>

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 2 commits January 15, 2026 09:02
Handle POLLERR in poll_write_ready to detect failed TCP connections.
When POLLERR is detected, use getsockopt with SO_ERROR to retrieve
the actual socket error and propagate it properly. This fixes the
issue where socket connections always appeared successful even when
they failed (e.g., connection refused).

Co-authored-by: zebreus <1557253+zebreus@users.noreply.github.com>
- Use explicit cast to libc::c_void for better clarity
- Properly handle getsockopt failure case with last_os_error
- Handle edge case where POLLERR is set but SO_ERROR is 0
- Improve error handling robustness with better fallback logic

Co-authored-by: zebreus <1557253+zebreus@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix WASIX socket connect to return correct failure status" to "Fix WASIX socket connect to detect connection failures" Jan 15, 2026
Copilot AI requested a review from zebreus January 15, 2026 09:11
Contributor

zebreus commented Jan 23, 2026

@copilot check again

zebreus assigned theduke and unassigned zebreus Mar 2, 2026
Contributor

marxin commented Mar 2, 2026

Closing as the PR is not ready.

marxin closed this Mar 2, 2026
Contributor

zebreus commented Mar 2, 2026

@theduke wanted to take a look at this PR

zebreus reopened this Mar 2, 2026
Collaborator

theduke commented Mar 26, 2026

@copilot fix CI failures (lints)

marxin marked this pull request as ready for review March 27, 2026 11:24
Copilot AI review requested due to automatic review settings March 27, 2026 11:24
Contributor

Copilot AI left a comment

Pull request overview

Fixes WASIX TCP connect reporting success even when the underlying connection fails (e.g., ECONNREFUSED) by improving error detection in write-readiness polling.

Changes:

  • Extend LocalTcpStream::poll_write_ready polling mask to include POLLERR.
  • When POLLERR is observed, query SO_ERROR via getsockopt and propagate the resulting error as NetworkError.
  • Add handling for getsockopt failure and the edge case where POLLERR is set but SO_ERROR == 0.
Comments suppressed due to low confidence (1)

lib/virtual-net/src/host.rs:756

  • poll_write_ready only calls getsockopt(SO_ERROR) when POLLERR is set. For nonblocking connect(2), poll/select commonly reports the socket as writable (POLLOUT) on both success and failure, and the recommended way to distinguish the two is always checking SO_ERROR once the fd becomes writable. As-is, a failed connect could still be treated as ready if the OS sets POLLOUT without POLLERR. Consider checking SO_ERROR whenever revents includes POLLOUT (and/or POLLERR) before returning Ok(10240).
        match libc_poll(
            stream.as_raw_fd(),
            libc::POLLOUT | libc::POLLHUP | libc::POLLERR,
        ) {
            Some(val) if (val & libc::POLLERR) != 0 => {
                // Get the actual socket error using SO_ERROR
                let mut error: libc::c_int = 0;
                let mut len = std::mem::size_of::<libc::c_int>() as libc::socklen_t;
                let result = unsafe {
                    libc::getsockopt(
                        stream.as_raw_fd(),
                        libc::SOL_SOCKET,
                        libc::SO_ERROR,
                        &mut error as *mut libc::c_int as *mut libc::c_void,
                        &mut len,
                    )
                };
                if result != 0 {
                    // getsockopt itself failed
                    let io_error = std::io::Error::last_os_error();
                    return Poll::Ready(Err(io_err_into_net_error(io_error)));
                }
                if error != 0 {
                    // Socket has a pending error
                    let io_error = std::io::Error::from_raw_os_error(error);
                    return Poll::Ready(Err(io_err_into_net_error(io_error)));
                }
                // POLLERR was set but SO_ERROR is 0 - this shouldn't normally happen,
                // but we'll treat it as a generic IO error
                return Poll::Ready(Err(NetworkError::IOError));
            }
            Some(val) if (val & libc::POLLHUP) != 0 => {
                return Poll::Ready(Ok(0));
            }
            Some(val) if (val & libc::POLLOUT) != 0 => return Poll::Ready(Ok(10240)),
            _ => {}
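
The alternative raised in this comment, reading SO_ERROR whenever the socket becomes writable rather than only on POLLERR, might look like the following in C. It is a sketch under stated assumptions: `connect_checked` is a hypothetical name, the target is fixed to the IPv4 loopback address for brevity, and none of this is taken from host.rs.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: nonblocking connect to 127.0.0.1:port.
 * Returns 0 on success, otherwise an errno value describing the failure. */
int connect_checked(uint16_t port, int timeout_ms) {
    struct sockaddr_in addr;
    int status = 0;

    assert(port != 0);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return errno;
    }
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        status = errno;
        close(fd);
        return status;
    }

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        if (errno != EINPROGRESS) {
            status = errno; /* failed immediately, e.g. ECONNREFUSED */
        } else {
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            int n;
            do {
                n = poll(&pfd, 1, timeout_ms);
            } while (n < 0 && errno == EINTR);

            if (n < 0) {
                status = errno;
            } else if (n == 0) {
                status = ETIMEDOUT;
            } else {
                /* Any reported event ends the wait; SO_ERROR decides
                 * whether the connect succeeded (0) or failed. */
                int pending = 0;
                socklen_t len = sizeof(pending);
                if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &pending, &len) != 0) {
                    status = errno;
                } else {
                    status = pending;
                }
            }
        }
    }
    close(fd);
    return status;
}
```

Because the outcome comes from SO_ERROR rather than from a specific revents bit, this shape does not depend on whether a given OS reports a failed connect as POLLOUT, POLLERR, or both.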

Comment on lines +721 to +725
        match libc_poll(
            stream.as_raw_fd(),
            libc::POLLOUT | libc::POLLHUP | libc::POLLERR,
        ) {
            Some(val) if (val & libc::POLLERR) != 0 => {
Copilot AI Mar 27, 2026

This change fixes a subtle readiness/connection-error behavior; please add a regression test to cover it. A stable approach is to create a nonblocking TCP socket to a known-closed localhost port (e.g., bind a TcpListener to 127.0.0.1:0, capture the port, drop the listener, then attempt connect), wrap it as LocalTcpStream, and assert poll_write_ready resolves to Err(NetworkError::ConnectionRefused) (or appropriate mapped error) instead of reporting writable.
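
The "known-closed port" setup suggested here can be sketched with plain POSIX calls. The review describes it in Rust terms (`TcpListener`, `LocalTcpStream`); the C version below only illustrates the bind-to-port-0, record, close, then connect sequence, and both helper names are hypothetical.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical test helper: bind to port 0 so the OS picks a free loopback
 * port, read it back, then close the socket so nothing listens there.
 * Returns 0 on failure. */
uint16_t pick_closed_port(void) {
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    uint16_t port = 0;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return 0;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        getsockname(fd, (struct sockaddr *)&addr, &len) == 0) {
        port = ntohs(addr.sin_port);
    }
    close(fd);
    return port;
}

/* Hypothetical probe: blocking connect to 127.0.0.1:port.
 * Returns 0 if the connection succeeded, otherwise the errno from connect. */
int probe_loopback_port(uint16_t port) {
    struct sockaddr_in addr;
    int status = 0;

    assert(port != 0);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return errno;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        status = errno; /* capture before close() can overwrite errno */
    }
    close(fd);
    return status;
}
```

A test built this way would expect `probe_loopback_port(pick_closed_port())` to report `ECONNREFUSED`. There is a small window in which another process could claim the port between the close and the connect, so the setup is stable in practice but not strictly race-free.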

marxin requested review from Arshia001 and removed request for zebreus March 31, 2026 07:09
Collaborator

theduke left a comment

NOTE: this is an incomplete/incorrect fix and requires more work.

theduke marked this pull request as draft April 1, 2026 10:00

Development

Successfully merging this pull request may close these issues.

WASIX socket connect always returns success even if it failed

5 participants