fix: avoid incrementing readErrCount for temporary errors and reset o…#1007
yangdm0209 wants to merge 1 commit into gorilla:release-1.5
…n successful read

- Do not increment readErrCount for temporary errors such as timeouts, as they do not indicate actual connection failures.
- Reset readErrCount after successful read operations to reflect the true health status of the connection.
- This prevents readErrCount from growing due to normal timeout logic and provides a more accurate error state.
Although the underlying network connection may still be usable after returning an error, a The The None of the above prevents the normal use of a read deadline to detect when the peer is not sending data as expected. Edit: Because the websocket.Conn does call read on the underlying connection after a read error is encountered, the statement
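The point about the underlying network connection remaining usable after a deadline error can be illustrated with the standard library alone. The following is a minimal sketch using net.Pipe as a stand-in transport; the helper names readWithDeadline and timeoutThenRetry are hypothetical, and the sketch says nothing about how websocket.Conn itself behaves after a read error.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// readWithDeadline (hypothetical helper) reads once from c and fails with a
// timeout error if no data arrives within d.
func readWithDeadline(c net.Conn, d time.Duration) (string, error) {
	if err := c.SetReadDeadline(time.Now().Add(d)); err != nil {
		return "", err
	}
	buf := make([]byte, 64)
	n, err := c.Read(buf)
	return string(buf[:n]), err
}

// timeoutThenRetry (hypothetical helper) lets one read time out, then retries
// on the same net.Conn with a fresh deadline once the peer sends data.
func timeoutThenRetry() (timedOut bool, msg string, err error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	// Nothing has been written yet, so this read runs into its deadline.
	_, err = readWithDeadline(client, 20*time.Millisecond)
	var ne net.Error
	timedOut = errors.As(err, &ne) && ne.Timeout()

	// net.Pipe writes block until the other end reads, so write in a goroutine.
	go server.Write([]byte("ping"))

	// Extend the deadline and read again on the same connection.
	msg, err = readWithDeadline(client, time.Second)
	return timedOut, msg, err
}

func main() {
	timedOut, msg, err := timeoutThenRetry()
	fmt.Println("first read timed out:", timedOut)
	fmt.Println("second read:", msg, err)
}
```

Here the expired deadline is reported as a net.Error with Timeout() == true, and a later read with a new deadline succeeds on the same raw connection.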
Previously any error out of advanceFrame was latched into c.readErr,
which is terminal: the for-loop condition guards every future read
on the same field. So a single expired ReadDeadline poisoned the
connection and all subsequent NextReader calls returned the cached
timeout error - even if the application extended the deadline and
tried again.
Two fixes:
1. On a net.Error with Timeout() == true, return the error to the
caller without setting c.readErr. The caller can SetReadDeadline
and call NextReader again on a healthy connection.
2. Reset c.readErrCount to 0 after any successful frame read
(including control frames), so transient blips don't slowly
accumulate toward the 1000-read spin-panic threshold.
Upstream PR #1007 and issue #474.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
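The two fixes described in this commit message can be sketched on a simplified model. This is not the gorilla/websocket source: the conn type, its advance field, and the scripted helper are hypothetical, and in this model readErrCount only grows on reads attempted after a terminal error has been latched. The 1000 threshold is taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// spinLimit mirrors the 1000-read threshold mentioned in the commit message.
const spinLimit = 1000

var (
	errTimeout = os.ErrDeadlineExceeded         // a net.Error with Timeout() == true
	errBroken  = errors.New("connection reset") // hypothetical permanent failure
)

// conn is a hypothetical model of the read-side state, not the library type.
type conn struct {
	readErr      error        // terminal error, latched once set
	readErrCount int          // reads attempted after a terminal error
	advance      func() error // hypothetical frame source
}

// isTimeout reports whether err is a net.Error whose Timeout method returns true.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// nextFrame reads one frame and applies both fixes.
func (c *conn) nextFrame() error {
	if c.readErr != nil {
		c.readErrCount++
		if c.readErrCount >= spinLimit {
			panic("repeated read on failed connection")
		}
		return c.readErr
	}
	err := c.advance()
	switch {
	case err == nil:
		c.readErrCount = 0 // fix 2: a successful read clears the counter
	case isTimeout(err):
		// fix 1: report the timeout without latching it in readErr
	default:
		c.readErr = err // any other error is terminal
	}
	return err
}

// scripted returns a frame source that yields errs in order, then nil forever.
func scripted(errs ...error) func() error {
	i := 0
	return func() error {
		if i >= len(errs) {
			return nil
		}
		err := errs[i]
		i++
		return err
	}
}

func main() {
	c := &conn{advance: scripted(errTimeout, nil, errBroken)}
	fmt.Println(c.nextFrame()) // timeout, returned but not latched
	fmt.Println(c.nextFrame()) // successful read after the timeout
	fmt.Println(c.nextFrame()) // permanent error, latched
	fmt.Println(c.nextFrame()) // same error again, served from the latch
	fmt.Println("readErrCount:", c.readErrCount)
}
```

In this model a caller that sees a timeout can extend its deadline and call nextFrame again, while any non-timeout error still poisons the connection as before.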
Commit title
fix: avoid incrementing readErrCount for temporary errors and reset on successful read
Description
- Do not increment readErrCount for temporary errors such as timeouts, as they do not indicate actual connection failures.
- Reset readErrCount after successful read operations to reflect the true health status of the connection.
- This prevents readErrCount from growing due to normal timeout logic and provides a more accurate error state.
What type of PR is this? (check all applicable)
Description
This PR addresses an issue where readErrCount was incremented on all errors, including temporary errors such as timeouts. Since timeouts often occur as part of normal connection management, especially when read deadlines are used, counting them as errors could lead to misleading high error counts and unnecessary connection termination or misreporting.
Now, only permanent errors cause readErrCount to increase, and a successful read resets this counter, better reflecting the real stability of the connection.
Related Tickets & Documents
Added/updated tests?
Run verifications and test
make verify is passing
make test is passing
Checklist (before submitting):