Skip to content

fix: correct error type on timeout and prevent silent message loss on macOS#13

Open
hobostay wants to merge 1 commit into
bytedance:mainfrom
hobostay:fix-bugs
Open

fix: correct error type on timeout and prevent silent message loss on macOS#13
hobostay wants to merge 1 commit into
bytedance:mainfrom
hobostay:fix-bugs

Conversation

@hobostay
Copy link
Copy Markdown
Contributor

Summary

  • Fix incorrect JoinError type in Rule::join: When the lookup times out after 5 retries, the code was returning JoinError::VersionMismatch(Version((0, 0, 0))) instead of JoinError::Timeout. This misleads callers into thinking there's a version mismatch when the real problem is a network/service timeout.
  • Fix silent message loss on macOS: In platform/macos/mod.rs, when mach_msg_send fails with a non-retryable error and the remote is not dead, the code was returning Ok(()) instead of Err(Error::Disconnect). This causes the caller (bus controller) to believe the message was sent successfully when it was actually dropped. Now returns Err(Error::Disconnect) to be consistent with the Linux implementation.
  • Simplify bus controller buffer logic: The message_buffer_swap field was redundant. Since mem::take already empties self.message_buffer, retained messages can be pushed directly to it without needing a swap buffer and the associated mem::swap call.

Test plan

  • All existing unit tests pass (cargo test — 8 tests passed)
  • Manual testing on macOS to verify mach_msg_send error handling
  • Verify timeout error reporting in multi-process scenarios

🤖 Generated with Claude Code

… macOS

1. Fix incorrect JoinError type in Rule::join: when lookup times out
   after 5 retries, the code was returning VersionMismatch(0,0,0)
   instead of Timeout.

2. Fix silent message loss on macOS: when mach_msg_send fails with a
   non-retryable error and the remote is not dead, the code returned
   Ok(()) instead of Err(Error::Disconnect), causing the caller to
   incorrectly believe the message was sent.

3. Simplify bus controller buffer logic: the message_buffer_swap field
   was unnecessary since after mem::take, self.message_buffer is empty
   and can directly receive retained messages.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@xiaopengli89
Copy link
Copy Markdown
Member

  1. Different versions of ipmb may have protocol changes, which usually cause handshake timeouts. Therefore, if you fail to connect after more than 5 attempts, it is very likely that the version is incompatible.
  2. This is an error handling issue that needs improvement. However, it can't return "disconnect" because this would cause a reconnection, even though the connection isn't actually dead and can continue sending other messages.
  3. This is a performance optimization based on double buffering.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants