Skip to content

fix(igc): buffer rx packets on irq#691

Open
ytakano wants to merge 5 commits into
tier4:mainfrom
ytakano:igc_irq_rx_buffering
Open

fix(igc): buffer rx packets on irq#691
ytakano wants to merge 5 commits into
tier4:mainfrom
ytakano:igc_irq_rx_buffering

Conversation

@ytakano
Copy link
Copy Markdown
Collaborator

@ytakano ytakano commented May 15, 2026

This is PR4 in the igc incremental stack (stacked on top of PR3 / igc_observability; PR1–3 are already merged).

PR4 changes (this commit)

Decouples RX packet delivery from immediate upper-layer consumption by
introducing a software ring queue drained during interrupt handling.

Key modifications to awkernel_drivers/src/pcie/intel/igc.rs:

  • Adds read_queue: RingQ<EtherFrameBuf> (32-packet capacity) to Rx
  • Replaces igc_recv() with igc_rx_recv(que_id, rx: &mut Rx) that
  • recv() fast-path pops from read_queue; falls back to igc_rx_recv()
  • intr() calls igc_rx_recv() on RX IRQ, buffering packets before upper-layer consumption
  • Resets read_queue in igc_allocate_queues(), igc_setup_rx_desc_ring(), and igc_stop()

applications/tests/test_network adjusted: INTERFACE_ID → 1, addresses → 192.168.100.x, tests narrowed to tcp_listen_test + new udp_recv_test.

PR4 review follow-up (also in this commit)

Addressed Copilot review feedback:

  • Fixed no-op self.dump() in igc_up(): captures and logs the returned String via log::debug! (regression introduced during PR3 refactoring when dump() was changed to return String instead of logging directly)
  • Changed let _ = rx.read_queue.push(packet) to increment dropped_pkts on failure, making any unexpected push failure observable via the existing drop counter (the while !is_full() guard makes this path unreachable in practice)
  • Removed "via log::debug!" from debug_dump_interface() doc comment; the log level is device-defined
  • Updated (netdump if) help text to (netdump id) to avoid the keyword-like if

Not adopted: kept warn! in the default NetDevice::debug_dump() (PR3 design decision — operators should see a clear message when calling netdump on a device that has not implemented the hook); did not alter the PR
description for debug_dump/netdump additions (those are PR3 changes visible because PR4 is stacked on PR3).

Deferred to later PRs

  • TX reclaim IRQ path (PR5+): can_send() remains a reclaiming gate
  • Multi-queue support (PR5)
  • TX checksum offload (PR6)

Testing

  • make clippy — clean
  • make x86_64 RELEASE=1 — successful
  • make check_x86_64 — passed
  • Real hardware: IGC ports initialized; "link status changed: Up (Full Duplex)" at 1000 Mbps confirmed

ytakano added 5 commits April 27, 2026 11:54
Clone the selected network device while holding the net manager read lock, then release the lock before invoking the debug dump path.

This keeps potentially slow diagnostic dumping outside the shared manager lock and matches the existing interface operation pattern.
Signed-off-by: Yuuki Takano <ytakanoster@gmail.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to improve the Intel IGC driver RX path by buffering received packets into a software ring queue during IRQ handling, decoupling descriptor consumption from immediate upper-layer reads. It also introduces an on-demand “netdump” facility by adding a NetDevice::debug_dump() hook, a debug_dump_interface() helper, and a corresponding shell command.

Changes:

  • Add an RX-side software ring queue (RingQ<EtherFrameBuf>) to IGC and drain completed RX descriptors into it from the IRQ path; update recv() to pop from this queue first.
  • Add a generic NetDevice::debug_dump() API + awkernel_lib::net::debug_dump_interface() helper, and wire a shell (netdump <interface_id>) command to invoke it.
  • Refactor IgcInner::dump() to return a String so callers decide how/when to emit logs.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
awkernel_lib/src/net/net_device.rs Adds NetDevice::debug_dump() default implementation (new API surface).
awkernel_lib/src/net.rs Adds debug_dump_interface() helper that looks up an interface and invokes debug_dump() outside the NET_MANAGER lock.
awkernel_drivers/src/pcie/intel/igc.rs Implements RX buffering via RingQ, calls RX drain from IRQ, updates recv(), and changes dump() to return String.
applications/awkernel_shell/src/lib.rs Registers a new BLisp-exported (netdump ...) command and FFI to call debug_dump_interface().
applications/awkernel_shell/Cargo.toml Adds num-bigint / num-traits dependencies to parse BLisp Int into u64 for netdump.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread awkernel_lib/src/net/net_device.rs
Comment thread awkernel_lib/src/net.rs
Comment thread applications/awkernel_shell/src/lib.rs
Comment thread awkernel_drivers/src/pcie/intel/igc.rs
Comment thread awkernel_drivers/src/pcie/intel/igc.rs
Comment thread awkernel_lib/src/net/net_device.rs
@ytakano ytakano changed the title feat(igc): buffer RX packets on IRQ fix(igc): buffer rx packets on irq May 15, 2026
@ytakano ytakano marked this pull request as ready for review May 15, 2026 05:28
@ytakano ytakano requested review from atsushi421 and nokosaaan May 15, 2026 05:28
{
let mut node = MCSNode::new();
let mut rx = inner.queue_info.que[que_id].rx.lock(&mut node);
inner.igc_rx_recv(que_id, &mut rx)?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using ? here short-circuits the IMS/EIMS re-enable writes at the end of intr(). Since the ICR read has already ACKed the interrupt at the device, an error here can leave LSC/per-queue interrupts un-rearmed and silently drop subsequent IRQs. Capture the rx/tx results, still perform the rearm writes, and only then propagate the error.

if rx.read_buf.is_none() {
return Ok(None);
return Ok(());
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old igc_recv() had a defensive if que_id != 0 { return Ok(None); }, but igc_rx_recv() drops that guard while the PR body explicitly defers multi-queue to PR5. If irqs_to_queues ever yields a non-zero que_id before PR5 lands, this will touch ring state that is not yet wired up. Reinstate the guard (or a debug_assert_eq!(que_id, 0)) until PR5 enables multi-queue.

Some(u16::from_le(unsafe { desc.wb.upper.vlan }))
} else {
None
while !rx.read_queue.is_full() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A full memory barrier on every descriptor in the drain loop is more expensive than necessary. The barrier only needs to pair once with the device's DMA write-back; subsequent descriptors are picked up on later iterations or later IRQs. Move membar_sync() out of the loop (or replace the status-word read with an acquire-ordered load).

.igc_recv(que_id)
.or(Err(net_device::NetDevError::DeviceError))
{
let mut node = MCSNode::new();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recv() locks rx, pops, drops the lock, then re-locks it to call igc_rx_recv() and pop again. The two acquisitions are pure overhead and also open a window where intr() can mutate read_queue between them. Collapse into a single critical section: lock once, try pop(), and only call igc_rx_recv() (and pop again) if pop returned None.

lines.push_str("(task) ; print tasks\r\n");
lines.push_str("(interrupt) ; print interrupt information\r\n");
lines.push_str("(ifconfig) ; print network interfaces\r\n");
lines.push_str("(netdump if); dump device registers\r\n");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description states this was changed from (netdump if) to (netdump id) to avoid the keyword-like if, but the diff still says if. The change appears to have been lost; please apply it (and consider clarifying that the argument is the numeric interface_id).


rx.slots += 1;
rx.next_to_check += 1;
if rx.next_to_check == rx.rx_desc_ring.as_ref().len() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says this was changed to increment dropped_pkts on push failure, but the diff still uses let _ = rx.read_queue.push(packet);. The promised observability is missing — any unexpected push failure remains invisible. Replace with if rx.read_queue.push(packet).is_err() { rx.dropped_pkts = rx.dropped_pkts.saturating_add(1); break; } (the break also avoids spinning until the next is_full() check).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants