SocketInput::DoRead reports stale errno on clean connection close (recv() == 0)
Summary
When the remote peer closes the TCP connection cleanly, recv() returns 0. The error handling code at this point calls getSocketErrorCode() (which reads errno / WSAGetLastError()), but recv() returning 0 is not an error — it is an EOF indication — and the OS does not update errno in this case. The result is a std::system_error with a stale, misleading error code from a previous syscall.
Reproduction
A minimal TCP server that accepts a connection, reads the client Hello, then immediately closes:
int client_fd = accept(server_fd, nullptr, nullptr);
// optionally recv() the Hello
close(client_fd);
Client code:
clickhouse::ClientOptions opts;
opts.SetHost("127.0.0.1");
opts.SetPort(port);
clickhouse::Client client(opts); // throws
Actual exception message:
closed: Operation now in progress
Expected exception message:
Connection closed by peer
"Operation now in progress" is EINPROGRESS — completely unrelated to a connection close.
Root Cause
In socket.cpp, SocketInput::DoRead:
size_t SocketInput::DoRead(void* buf, size_t len) {
    const ssize_t ret = ::recv(s_, (char*)buf, (int)len, 0);
    if (ret > 0) {
        return (size_t)ret;
    }
    if (ret == 0) {
        throw std::system_error(getSocketErrorCode(), getErrorCategory(), "closed");
        //                      ^^^^^^^^^^^^^^^^^^^^ BUG: errno is stale
    }
    throw std::system_error(getSocketErrorCode(), getErrorCategory(), "can't receive string data");
}
When recv() returns 0, the POSIX specification does not require errno to be set. The value of errno remains whatever it was from the last syscall that failed. In this case, the typical call sequence is:
1. SocketConnect → connect() in non-blocking mode → errno set to EINPROGRESS
2. Poll() succeeds → getsockopt(SO_ERROR) returns 0 → socket switched to blocking
3. SendHello() → send() succeeds → errno unchanged (success doesn't clear errno)
4. ReceiveHello() → recv() returns 0 (peer closed) → errno still EINPROGRESS
5. getSocketErrorCode() returns EINPROGRESS → exception says "Operation now in progress"
Depending on timing and platform, the stale value could be any previous error code, making the exception message non-deterministic and misleading.
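The stale-errno effect can be observed in isolation. The following is a minimal POSIX-only sketch, not code from clickhouse-cpp: the function name errno_after_clean_close is hypothetical, a socketpair() stands in for the TCP connection, and errno is seeded by hand with EINPROGRESS to play the role of the earlier non-blocking connect(). It relies on the common behavior (for example on Linux) that a successful recv() leaves errno untouched.

```cpp
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns the errno value visible right after recv()
// reports a clean close (return value 0), or -1 if the demo setup fails.
int errno_after_clean_close() {
    int fds[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
        return -1;
    }

    ::close(fds[1]);       // the "peer" closes its end cleanly
    errno = EINPROGRESS;   // assumption: simulates the value left by connect()

    char buf[16];
    const ssize_t ret = ::recv(fds[0], buf, sizeof(buf), 0);
    const int observed = errno;  // capture before any other call can change it

    ::close(fds[0]);
    return ret == 0 ? observed : -1;
}
```

On a typical Linux system the function returns EINPROGRESS, which is the value getSocketErrorCode() would pick up in the ret == 0 branch.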
Impact
- Debugging difficulty: The misleading error code sends developers on the wrong path. "Operation now in progress" suggests a non-blocking socket issue, not a closed connection.
- Error handling: Callers catching std::system_error and inspecting .code().value() get an incorrect error code, making programmatic retry/recovery logic unreliable.
- Non-determinism: The stale errno value varies depending on which syscall last set it, making the bug platform- and timing-dependent.
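To illustrate the error-handling point, here is a sketch of the kind of caller-side check that misbehaves. The function should_reconnect and its retry policy are hypothetical and not part of clickhouse-cpp; the sketch assumes a POSIX platform where the exception carries raw errno values.

```cpp
#include <cerrno>
#include <system_error>

// Hypothetical caller-side policy: reconnect only when the error code says
// the peer dropped the connection.
bool should_reconnect(const std::system_error& e) {
    const int value = e.code().value();
    return value == ECONNRESET || value == EPIPE;
}
```

With the stale code, an exception built from EINPROGRESS makes should_reconnect return false even though the peer did close the connection. An exception built from ECONNRESET gives the expected true.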
Suggested Fix
For the recv() == 0 case, use a well-defined error code instead of reading errno:
if (ret == 0) {
    throw std::system_error(
        ECONNRESET, getErrorCategory(), "connection closed by peer");
}
Or, with explicit per-platform handling so that Windows reports the Winsock code:
if (ret == 0) {
#if defined(_win_)
    throw std::system_error(
        WSAECONNRESET, windowsErrorCategory::category(), "connection closed by peer");
#else
    throw std::system_error(
        ECONNRESET, std::system_category(), "connection closed by peer");
#endif
}
Alternatively, a custom error code/category could be used to distinguish a clean close (FIN) from a reset (RST), but at minimum the stale getSocketErrorCode() call must be removed from this path.
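One possible shape for such a custom category is sketched below, using only the standard <system_error> facilities. Every name in it (ConnectionError, ConnectionErrorCategory, connection_category, throw_closed_by_peer) is hypothetical and does not exist in clickhouse-cpp. It shows one way to give the clean-close case its own code, not a proposal for the final API.

```cpp
#include <string>
#include <system_error>

// Hypothetical error enumeration for connection-level conditions.
enum class ConnectionError {
    ClosedByPeer = 1,  // recv() returned 0: orderly shutdown (FIN), not a reset
};

// Hypothetical category that turns ConnectionError values into messages.
class ConnectionErrorCategory : public std::error_category {
public:
    const char* name() const noexcept override { return "connection"; }

    std::string message(int ev) const override {
        switch (static_cast<ConnectionError>(ev)) {
            case ConnectionError::ClosedByPeer:
                return "connection closed by peer";
        }
        return "unknown connection error";
    }
};

// Single shared instance; error categories are compared by address.
const std::error_category& connection_category() {
    static const ConnectionErrorCategory instance;
    return instance;
}

std::error_code make_error_code(ConnectionError e) {
    return {static_cast<int>(e), connection_category()};
}

namespace std {
// Lets ConnectionError convert implicitly to std::error_code.
template <>
struct is_error_code_enum<ConnectionError> : true_type {};
}  // namespace std

// What the ret == 0 branch could call instead of reading errno.
[[noreturn]] void throw_closed_by_peer() {
    throw std::system_error(make_error_code(ConnectionError::ClosedByPeer), "recv");
}
```

A caller could then test e.code() == ConnectionError::ClosedByPeer to recognize a clean close, while real resets would still arrive as ECONNRESET / WSAECONNRESET in the platform category.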
Environment
- Library: clickhouse-cpp (all current versions)
- Affected platforms: All (Linux, macOS, Windows)
- Affected file: clickhouse/base/socket.cpp, SocketInput::DoRead