IR Design and Implementation Plan #5

@morehouse

Problem

The current EncryptedBytesScenario encrypts and sends raw bytes as BOLT messages. Because random bytes rarely form a valid message, let alone a valid message sequence, this mostly exercises message parsing and never reaches deeper protocol states.

To fuzz deeper, we need the fuzzer to become structure aware, able to send and receive sequences of valid (or semi-valid) messages that get past the initial parsing code and exercise deeper protocol logic. Ideally the fuzzer would be smart enough to generate the common message sequences to open channels, send or receive HTLCs, close channels, etc. Many of these message sequences have dependencies between the messages -- e.g., commitment_signed must contain a channel ID that matches a previously-opened channel, as well as signatures generated from the previously negotiated keys and commitment states. We want the fuzzer to be able to satisfy these dependencies as well.

Solution

We can use an intermediate representation (IR) to capture the type and structure knowledge needed to fuzz deeper. The fuzzer can then use this IR to generate and mutate short programs to be executed in the Nyx VM.

The following design is inspired by ideas from both syzkaller and fuzzamoto.

Architecture

We add a new crate smite-ir/ to contain the IR, custom mutators, and program generators, and a new scenario IrScenario<T, S> under smite-scenarios/ to execute IR programs in the Nyx VM.

The smite-ir crate is intended to be engine-agnostic -- no dependency on AFL++ or LibAFL. We intend to use AFL++ at first with a thin custom mutator wrapper library libsmite_ir_mutator.so loaded via AFL_CUSTOM_MUTATOR_LIBRARY. In the future, we may migrate to LibAFL by simply replacing the wrapper. See "Appendix: Fuzzing Engine Tradeoffs" for details.

Our custom mutators and generators create new Programs and serialize them for the fuzzing engine, which then sends each Program to the Nyx VM via shared memory. Inside the VM, our IrScenario deserializes the Program and executes it instruction by instruction. IrScenario then checks for and reports any crashes or hangs, and the VM is reset to its snapshot before the next Program is processed.

Example IR Program

This program executes the channel funding flow up to the point where the target sends funding_signed. We will refer to this example in subsequent sections.

# Generate channel keys (6 key pairs)
v0 = GeneratePrivateKey(0x00)
v1 = DerivePoint(v0)
v2 = GeneratePrivateKey(0x01)
v3 = DerivePoint(v2)
v4 = GeneratePrivateKey(0x02)
v5 = DerivePoint(v4)
v6 = GeneratePrivateKey(0x03)
v7 = DerivePoint(v6)
v8 = GeneratePrivateKey(0x04)
v9 = DerivePoint(v8)
v10 = GeneratePrivateKey(0x05)
v11 = DerivePoint(v10)

# Channel parameters
v12 = GenerateTemporaryChannelId(0x00)
v13 = LoadChainHash()
v14 = LoadAmount(100_000)
v15 = LoadAmount(0)
v16 = LoadAmount(546)
v17 = LoadAmount(10_000_000)
v18 = LoadAmount(1_000)
v19 = LoadAmount(1)
v20 = LoadFeeratePerKw(2_500)
v21 = LoadCsvDelay(144)
v22 = LoadFeatures()

# Build and send open_channel
v23 = BuildOpenChannel(v13, v12, v14, v15, v16, v17, v18, v19, v20, v21, v1, v3, v5, v7, v9, v11, v22)
SendMessage(v23)

# Receive accept_channel, extract target's keys
v25 = RecvAcceptChannel()
v26 = ExtractFundingPubkey(v25)
v27 = ExtractFirstPerCommitmentPoint(v25)

# Create funding output via bitcoind
v28 = CreateFundingOutput(v1, v26, v14)

# Sign commitment transaction
v29 = BuildCommitmentTx(v28, v1, v26, v21)
v30 = ComputeCommitmentSighash(v29)
v31 = Sign(v0, v30)

# Build and send funding_created
v32 = BuildFundingCreated(v12, v28, v31)
SendMessage(v32)

# Receive funding_signed
v34 = RecvFundingSigned()
...

Key things to notice:

  • SSA form: each instruction produces at most one variable, numbered by instruction index. v1 = DerivePoint(v0) means instruction 1 takes v0 as input and produces v1.
  • Variable gaps (no v24, v33) indicate void instructions like SendMessage that have side effects but no output.
  • Compound variables: RecvAcceptChannel() produces v25, an AcceptChannelData containing all parsed response fields. ExtractFundingPubkey(v25) pulls one field into a primitive Point for use by later Build operations.
  • Possible mutations: InputSwapMutator could swap v3 (revocation_basepoint) with v5 (payment_basepoint) in BuildOpenChannel, since both are Point. OperationParamMutator could change LoadAmount(100_000) to LoadAmount(7_654).

Core Concepts

Program and Instructions

A Program is an ordered list of Instructions. Programs are serialized with postcard for transport between AFL++ and the VM. Snapshot state (target pubkey, chain hash, block height, channel keys if a channel is already open) lives in a separate ProgramContext that is supplied to the executor at run time.

An Instruction is an Operation plus input variable indices. In the example, BuildFundingCreated(v12, v28, v31) has operation BuildFundingCreated and inputs [12, 28, 31].

Operations

Each Operation falls into one of four categories:

  1. Load: produce a variable from embedded data or snapshot context (LoadAmount(100_000), LoadChainHash, LoadContextChannelId).
  2. Compute: derive a variable from inputs (DerivePoint, Sign, HashPaymentPreimage, Extract*).
  3. Build: construct a BOLT message from inputs (BuildOpenChannel, BuildCommitmentSigned).
  4. Act: produce side effects against the target (SendMessage, RecvAcceptChannel, MineBlocks, Reconnect).

Embedded literal data lives in the Operation itself (e.g., LoadAmount(100_000), GeneratePrivateKey(0x05)).
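
One way to picture the four categories is an enum whose variants carry their literal data. This is a sketch under assumptions: the variant subset, the Category enum, and the category() and has_literal() helpers are illustrative names, not the crate's actual API.

```rust
/// The four operation categories described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    Load,
    Compute,
    Build,
    Act,
}

/// A small, hypothetical subset of operations. Literal data is embedded
/// in the variant itself rather than passed as an input variable.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Operation {
    LoadAmount(u64),
    LoadChainHash,
    DerivePoint,
    Sign,
    BuildOpenChannel,
    SendMessage,
    RecvAcceptChannel,
    MineBlocks,
}

impl Operation {
    /// Maps each operation to its category.
    fn category(&self) -> Category {
        match self {
            Operation::LoadAmount(_) | Operation::LoadChainHash => Category::Load,
            Operation::DerivePoint | Operation::Sign => Category::Compute,
            Operation::BuildOpenChannel => Category::Build,
            Operation::SendMessage
            | Operation::RecvAcceptChannel
            | Operation::MineBlocks => Category::Act,
        }
    }

    /// Only operations carrying an embedded literal are candidates for
    /// parameter mutation.
    fn has_literal(&self) -> bool {
        matches!(self, Operation::LoadAmount(_))
    }
}
```

Keeping literals inside the variant means a parameter mutator only has to pattern-match on the operation, without touching the instruction's input list.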

Recv Operations

Recv* operations (e.g., RecvAcceptChannel) read from the connection and produce compound variables containing all parsed fields. The executor's receive loop automatically answers each ping with a pong and returns the first non-ping message. If the returned message matches the expected type, it is parsed into a compound variable. If it doesn't match (e.g., the target sends error instead of accept_channel), the executor stops immediately.
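
The receive loop can be sketched as follows. The Connection trait, the Message enum, and the MockConn type are hypothetical stand-ins for the real transport and BOLT message types; only the control flow (answer pings, return the first non-ping message, fail on a type mismatch) comes from the description above.

```rust
use std::collections::VecDeque;

/// Hypothetical decoded message; a real implementation would carry
/// the full BOLT payloads.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Message {
    Ping,
    Pong,
    AcceptChannel,
    Error,
}

/// Hypothetical connection abstraction standing in for the real transport.
trait Connection {
    fn recv(&mut self) -> Option<Message>;
    fn send(&mut self, msg: Message);
}

#[derive(Debug, PartialEq, Eq)]
enum RecvError {
    /// The connection closed or timed out before a message arrived.
    NoMessage,
    /// The first non-ping message was not of the expected type.
    Unexpected(Message),
}

/// Answers pings with pongs and returns the first non-ping message,
/// failing if it is not the expected type.
fn recv_expected(conn: &mut dyn Connection, expected: &Message) -> Result<Message, RecvError> {
    loop {
        match conn.recv() {
            None => return Err(RecvError::NoMessage),
            Some(Message::Ping) => conn.send(Message::Pong),
            Some(msg) if msg == *expected => return Ok(msg),
            Some(msg) => return Err(RecvError::Unexpected(msg)),
        }
    }
}

/// In-memory connection used only to exercise the loop.
struct MockConn {
    incoming: VecDeque<Message>,
    sent: Vec<Message>,
}

impl Connection for MockConn {
    fn recv(&mut self) -> Option<Message> {
        self.incoming.pop_front()
    }
    fn send(&mut self, msg: Message) {
        self.sent.push(msg);
    }
}
```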

To minimize gossip noise, we can disable option_gossip_queries in our init features and drain any initial gossip messages received during pre-snapshot setup. In some scenarios it may also be helpful to use gossip_timestamp_filter to request the target to refrain from sending us gossip.

Variables

A Variable is a typed runtime value produced by the executor -- ChannelId, Point, PrivateKey, Signature, Amount, Message, etc. Variables correspond to the runtime SSA outputs produced by each executed Instruction in a Program.

Compound variables (e.g., AcceptChannelData, FundingSignedData) bundle all fields from a parsed target response. Extract* operations pull individual fields into primitive types:

v25 = RecvAcceptChannel()              # -> AcceptChannelData (compound)
v26 = ExtractFundingPubkey(v25)        # -> Point (primitive)

Each Variable has a VariableType we can use to ensure mutations are type-safe.
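
A rough sketch of how Variable, VariableType, and an Extract* operation might relate. The field layout of AcceptChannelData and the var_type() and extract_funding_pubkey() helpers are assumptions made for illustration; the real compound type would hold every parsed accept_channel field.

```rust
/// Type tags used to keep mutations type-safe.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum VariableType {
    ChannelId,
    Point,
    Amount,
    AcceptChannelData,
}

/// Hypothetical compound value: only two of the parsed fields are shown.
#[derive(Debug, Clone, PartialEq, Eq)]
struct AcceptChannelData {
    funding_pubkey: [u8; 33],
    first_per_commitment_point: [u8; 33],
}

/// Runtime values held in the executor's variable store.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Variable {
    ChannelId([u8; 32]),
    Point([u8; 33]),
    Amount(u64),
    AcceptChannel(AcceptChannelData),
}

impl Variable {
    /// Returns the type tag for this runtime value.
    fn var_type(&self) -> VariableType {
        match self {
            Variable::ChannelId(_) => VariableType::ChannelId,
            Variable::Point(_) => VariableType::Point,
            Variable::Amount(_) => VariableType::Amount,
            Variable::AcceptChannel(_) => VariableType::AcceptChannelData,
        }
    }
}

/// Sketch of ExtractFundingPubkey: pulls one primitive field out of the
/// compound variable, or returns `None` if the input has the wrong type.
fn extract_funding_pubkey(var: &Variable) -> Option<Variable> {
    match var {
        Variable::AcceptChannel(data) => Some(Variable::Point(data.funding_pubkey)),
        _ => None,
    }
}
```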

Executor

The Executor is used by IrScenario to walk a Program instruction by instruction, executing the specified actions and maintaining a Vec<Variable> store.

Unlike fuzzamoto's Compiler (which pre-compiles the full program into flat actions on the host), we choose to directly interpret Programs in the VM. This simplifies fuzzing of the many interactive flows in the Lightning protocol, which require us to construct later messages using data sent by the target in earlier messages.

fn run(&mut self, input: &[u8]) -> ScenarioResult {
    // Inputs that do not deserialize into a Program are skipped.
    let program = match postcard::from_bytes::<Program>(input) {
        Ok(p) => p,
        Err(_) => return ScenarioResult::Skip,
    };
    let mut executor = Executor::new(&self.context);
    match executor.execute(&program, &mut self.conn, &mut self.bitcoind) {
        Ok(()) => {}
        Err(ExecuteError::Connection(_)) => {
            // A connection error is only a finding if the target process died;
            // otherwise continue to the final ping-pong check.
            if self.target.check_alive().is_err() {
                return ScenarioResult::Fail("target crashed".into());
            }
        }
        Err(ExecuteError::Timeout) => {
            return ScenarioResult::Fail("target hung".into());
        }
        // Any other execution error (e.g., an unexpected message) is not a bug.
        Err(_) => return ScenarioResult::Skip,
    }
    // Final ping-pong catches delayed crashes
    if let Err(e) = ping_pong(&mut self.conn) { ... }
    ScenarioResult::Ok
}
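
The run function above delegates to Executor::execute. A reduced sketch of that interpreter loop might look like the following. It makes several assumptions that are not in the design: BuildAmountMessage is a made-up stand-in for the Build* family, an in-memory outbox replaces the connection, and the store is a Vec<Option<Variable>> so that void instructions keep their slot and variable numbers stay equal to instruction indices.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Variable {
    Amount(u64),
    Message(Vec<u8>),
}

#[derive(Debug, Clone)]
enum Operation {
    LoadAmount(u64),
    /// Hypothetical stand-in for the Build* family: encodes one amount.
    BuildAmountMessage,
    SendMessage,
}

#[derive(Debug, Clone)]
struct Instruction {
    op: Operation,
    inputs: Vec<usize>,
}

#[derive(Debug, PartialEq, Eq)]
enum ExecuteError {
    /// The instruction at this index had a missing or wrongly typed input.
    BadInput(usize),
}

struct Executor {
    /// One slot per executed instruction; `None` for void instructions.
    store: Vec<Option<Variable>>,
    /// Stand-in for the connection: messages that would be sent to the target.
    outbox: Vec<Vec<u8>>,
}

impl Executor {
    fn new() -> Self {
        Executor { store: Vec::new(), outbox: Vec::new() }
    }

    /// Resolves the n-th input of `instr` to an already computed variable.
    fn input(&self, at: usize, instr: &Instruction, n: usize) -> Result<&Variable, ExecuteError> {
        instr
            .inputs
            .get(n)
            .and_then(|&i| self.store.get(i))
            .and_then(|slot| slot.as_ref())
            .ok_or(ExecuteError::BadInput(at))
    }

    /// Walks the program in order, pushing one store slot per instruction.
    fn execute(&mut self, program: &[Instruction]) -> Result<(), ExecuteError> {
        for (at, instr) in program.iter().enumerate() {
            let output = match &instr.op {
                Operation::LoadAmount(v) => Some(Variable::Amount(*v)),
                Operation::BuildAmountMessage => match self.input(at, instr, 0)? {
                    Variable::Amount(v) => Some(Variable::Message(v.to_be_bytes().to_vec())),
                    _ => return Err(ExecuteError::BadInput(at)),
                },
                Operation::SendMessage => {
                    let bytes = match self.input(at, instr, 0)? {
                        Variable::Message(bytes) => bytes.clone(),
                        _ => return Err(ExecuteError::BadInput(at)),
                    };
                    self.outbox.push(bytes);
                    None
                }
            };
            self.store.push(output);
        }
        Ok(())
    }
}
```

Because each instruction is interpreted only after all earlier ones have run, a later Build operation can consume data that the target sent in response to an earlier message.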

Generators

Generators produce type-correct instruction sequences that represent protocol interactions. Each generator knows the shape of a protocol flow (what messages to construct, what keys to generate, what order to send/recv) but delegates value selection and variable reuse to ProgramBuilder.

ProgramBuilder

ProgramBuilder is the shared infrastructure that all generators use. It maintains:

  • The instruction list being built (append-only, SSA)
  • A type-indexed variable registry tracking all produced variables (direct primitives and extractable compound fields)
  • The pick_variable() method for probabilistic variable selection

Generators call builder methods -- they never manipulate instruction indices or variable references directly:

// Generator asks builder for variables.
// Builder decides: reuse existing? extract from compound? generate fresh?
let chan_id = builder.pick_variable(VariableType::ChannelId, rng);
let feerate = builder.pick_variable(VariableType::Amount, rng);

// Generator tells builder what instruction to emit.
let msg_idx = builder.append(Operation::BuildUpdateFee, &[chan_id, feerate]);

This separation means generators encode protocol knowledge (e.g., open_channel needs 6 key pairs) while the builder encodes fuzzing strategy (e.g., picks which variables to reuse or generate randomly).
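
To make the builder's bookkeeping concrete, here is a minimal sketch of an append-only instruction list with a type-indexed registry. The output_type() and variables_of() helpers and the exact field layout are assumptions; a real builder would also track extractable compound fields.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum VariableType {
    ChannelId,
    Amount,
    Message,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum Operation {
    LoadContextChannelId,
    LoadAmount(u64),
    BuildUpdateFee,
}

impl Operation {
    /// Type of the variable this operation produces, if any.
    fn output_type(&self) -> Option<VariableType> {
        match self {
            Operation::LoadContextChannelId => Some(VariableType::ChannelId),
            Operation::LoadAmount(_) => Some(VariableType::Amount),
            Operation::BuildUpdateFee => Some(VariableType::Message),
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Instruction {
    op: Operation,
    inputs: Vec<usize>,
}

#[derive(Default)]
struct ProgramBuilder {
    /// Append-only instruction list; an instruction's index is its variable number.
    instructions: Vec<Instruction>,
    /// Registry of produced variables, indexed by type, oldest first.
    by_type: HashMap<VariableType, Vec<usize>>,
}

impl ProgramBuilder {
    /// Appends an instruction, registers its output, and returns its index.
    fn append(&mut self, op: Operation, inputs: &[usize]) -> usize {
        let idx = self.instructions.len();
        if let Some(ty) = op.output_type() {
            self.by_type.entry(ty).or_default().push(idx);
        }
        self.instructions.push(Instruction { op, inputs: inputs.to_vec() });
        idx
    }

    /// All variables of the requested type produced so far.
    fn variables_of(&self, ty: VariableType) -> &[usize] {
        self.by_type.get(&ty).map(Vec::as_slice).unwrap_or(&[])
    }
}
```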

Resource-Aware Variable Selection

When a generator needs a variable, ProgramBuilder::pick_variable() chooses one of three strategies at random, weighted as follows:

  • Reuse recent (75%): Recently-created variables are more likely to be useful for exercising multi-message protocol flows.
  • Reuse any (15%): Cross-pollinates between protocol flows.
  • Generate fresh (10%): Emits instructions that produce a new valid value of the requested type.
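
The weighting above could be implemented along these lines. This is a sketch under assumptions: the caller supplies the random values (roll and pick) so the example needs no RNG crate, RECENT_WINDOW is an invented constant for what counts as "recent", and returning None signals that the caller should emit fresh instructions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    ReuseRecent,
    ReuseAny,
    GenerateFresh,
}

/// Maps a random roll onto the 75/15/10 weights. Taking `roll % 100`
/// introduces a negligible modulo bias, which is acceptable for a sketch.
fn strategy_for_roll(roll: u32) -> Strategy {
    match roll % 100 {
        0..=74 => Strategy::ReuseRecent,
        75..=89 => Strategy::ReuseAny,
        _ => Strategy::GenerateFresh,
    }
}

/// Hypothetical size of the window of variables considered "recent".
const RECENT_WINDOW: usize = 4;

/// Picks an existing variable index from `candidates` (ordered oldest to
/// newest), or returns `None` when a fresh value should be generated.
fn pick_variable(candidates: &[usize], roll: u32, pick: usize) -> Option<usize> {
    if candidates.is_empty() {
        // Nothing to reuse: the only option is to generate a fresh value.
        return None;
    }
    let pool = match strategy_for_roll(roll) {
        Strategy::GenerateFresh => return None,
        Strategy::ReuseAny => candidates,
        Strategy::ReuseRecent => {
            let start = candidates.len().saturating_sub(RECENT_WINDOW);
            &candidates[start..]
        }
    };
    Some(pool[pick % pool.len()])
}
```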

Generator Types

Generators are organized into three types: message, action, and flow generators.

Message generators (e.g., OpenChannelMsg, FundingCreatedMsg, ChannelReadyMsg, UpdateAddHtlcMsg) emit the instructions for a single protocol message: load parameters, build the message, send the message, and optionally receive the response. These are building blocks for generating interesting protocol flows.

Action generators (e.g., MineBlocksAction) perform a single action, such as mining blocks via bitcoind.

Flow generators (e.g., ChannelOpenFlow, HtlcAddFlow, HtlcFulfillFlow, InteractiveTxFlow) compose message and action generators in sequence, threading variables between them via ProgramBuilder. They are the easiest way for the fuzzer to reach deep protocol states when many constraints (matching keys, valid signatures, correct sequencing) need to align.
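
The composition of generators can be sketched with a shared trait. Everything here is illustrative: the Builder records operation names only, the Generator trait and Flow struct are assumed shapes, and the steps chosen for channel_open_flow() are a guess at what a ChannelOpenFlow might contain.

```rust
/// Minimal stand-in for ProgramBuilder: records operation names only.
#[derive(Default)]
struct Builder {
    ops: Vec<&'static str>,
}

/// Hypothetical common interface for message, action, and flow generators.
trait Generator {
    fn generate(&self, builder: &mut Builder);
}

/// Message generator: build, send, and receive the response.
struct OpenChannelMsg;
impl Generator for OpenChannelMsg {
    fn generate(&self, builder: &mut Builder) {
        builder.ops.extend(["BuildOpenChannel", "SendMessage", "RecvAcceptChannel"]);
    }
}

struct FundingCreatedMsg;
impl Generator for FundingCreatedMsg {
    fn generate(&self, builder: &mut Builder) {
        builder.ops.extend(["BuildFundingCreated", "SendMessage", "RecvFundingSigned"]);
    }
}

/// Action generator: a single side effect.
struct MineBlocksAction;
impl Generator for MineBlocksAction {
    fn generate(&self, builder: &mut Builder) {
        builder.ops.push("MineBlocks");
    }
}

/// Flow generator: runs its steps in order against the same builder.
struct Flow {
    steps: Vec<Box<dyn Generator>>,
}

impl Generator for Flow {
    fn generate(&self, builder: &mut Builder) {
        for step in &self.steps {
            step.generate(builder);
        }
    }
}

/// Assumed composition of a channel-open flow, for illustration only.
fn channel_open_flow() -> Flow {
    Flow {
        steps: vec![
            Box::new(OpenChannelMsg),
            Box::new(FundingCreatedMsg),
            Box::new(MineBlocksAction),
        ],
    }
}
```

Since Flow itself implements Generator in this sketch, flows can be nested inside larger flows without special handling.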

Standalone vs. Insertion

Generators are used both to generate programs from scratch and to insert new code into existing programs via the GeneratorInsertionMutator.

Mutators

Mutators transform existing programs while preserving structural validity.

Planned Mutators

| Mutator | What it does |
|---|---|
| OperationParamMutator | Pick a random instruction with mutable parameters and mutate its embedded literal. Type-aware: amounts get boundary values (0, 1, u64::MAX) and random ranges; byte arrays get bit flips, insertions, and deletions; feerates/delays get truncated or maximized. |
| InputSwapMutator | Replace a variable reference in a random instruction with a different variable of the same VariableType. |
| InstructionReorderMutator | Swap two Act instructions (SendMessage, MineBlocks, etc.) that have no data dependency between them. |
| SpliceMutator | Pick a random program from the corpus and interleave its instruction subsequence into the current program at a random point, adjusting variable indices. |
| InstructionDeleteMutator | Remove a random instruction. |
| GeneratorInsertionMutator | Insert a freshly generated instruction subsequence (via a generator) at a random point. |
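
As an illustration of a type-safe mutation, the following sketches the core of an InputSwapMutator. It assumes each instruction records its output type directly and that the caller supplies the random choice (pick); restricting candidates to earlier instructions is one way to keep the program in SSA form after the swap.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VariableType {
    Point,
    Amount,
}

/// Reduced instruction: only what the mutator needs to know.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Instruction {
    /// Type of the produced variable, or `None` for void instructions.
    output: Option<VariableType>,
    inputs: Vec<usize>,
}

/// Replaces input `pos` of instruction `at` with a different, earlier
/// variable of the same type. Returns false if no such variable exists.
fn input_swap(program: &mut [Instruction], at: usize, pos: usize, pick: usize) -> bool {
    // Look up the variable currently referenced at this input position.
    let current = match program.get(at).and_then(|i| i.inputs.get(pos)) {
        Some(&c) => c,
        None => return false,
    };
    // The replacement must have the same type as the current input.
    let wanted = match program.get(current).and_then(|i| i.output) {
        Some(t) => t,
        None => return false,
    };
    // Only earlier instructions qualify, which preserves SSA ordering.
    let candidates: Vec<usize> = (0..at)
        .filter(|&j| j != current && program[j].output == Some(wanted))
        .collect();
    if candidates.is_empty() {
        return false;
    }
    program[at].inputs[pos] = candidates[pick % candidates.len()];
    true
}
```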

Snapshot Setup

Different fuzzing goals require different starting states. IrScenario<T, S> is parameterized by a SnapshotSetup trait:

trait SnapshotSetup<T: Target> {
    fn setup(target: &T, conn: &mut NoiseConnection) -> Result<ProgramContext, ScenarioError>;
}

Snapshot setup is Rust code that drives the target through an initial deterministic protocol sequence using NoiseConnection and bitcoin-cli directly. IrScenario calls setup() before snapshotting the VM state.

Sample snapshot variants:

| Setup | Snapshot state | IR fuzzes... |
|---|---|---|
| PostInitSetup | After handshake + init exchange | open_channel, gossip, any first message |
| PostChannelOpenSetup | After channel is funded + ready | HTLCs, commitment rounds, fees, closure |
| InteractiveTxSetup | Mid-negotiation (after open_channel2 + accept_channel2) | tx_[add/remove]_[input/output], tx_complete |

ProgramContext carries setup state into the executor. For example:

struct ProgramContext {
    // Always present
    target_pubkey: [u8; 33],
    chain_hash: [u8; 32],
    block_height: u32,
    target_features: Vec<u8>,

    // Present after PostChannelOpenSetup
    channel_id: Option<[u8; 32]>,
    local_keys: Option<ChannelKeys>,
    remote_keys: Option<ChannelKeys>,
    funding_outpoint: Option<OutPoint>,
    commitment_number: Option<u64>,
}

LoadContext* operations access these fields, erroring at execution time if absent for the current snapshot variant. Each setup variant is a separate binary (e.g., ldk_ir_post_init, ldk_ir_post_channel), enabling independent fuzzing campaigns with different corpora.
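
A small sketch of how a LoadContext* operation might fail when the snapshot variant does not provide a field. The reduced ProgramContext, the MissingContext error variant, and the function names are assumptions for illustration.

```rust
#[derive(Debug, PartialEq, Eq)]
enum ExecuteError {
    /// The requested field is not present in this snapshot variant.
    MissingContext(&'static str),
}

/// Reduced context: one always-present field and one optional field.
struct ProgramContext {
    chain_hash: [u8; 32],
    channel_id: Option<[u8; 32]>,
}

/// LoadChainHash can never fail because the field is always present.
fn load_chain_hash(ctx: &ProgramContext) -> [u8; 32] {
    ctx.chain_hash
}

/// LoadContextChannelId errors at execution time if the snapshot
/// was taken before a channel was opened.
fn load_context_channel_id(ctx: &ProgramContext) -> Result<[u8; 32], ExecuteError> {
    ctx.channel_id.ok_or(ExecuteError::MissingContext("channel_id"))
}
```

Under this scheme, a program that uses LoadContextChannelId against a post-init snapshot would simply stop with an execution error instead of reporting a finding.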


Implementation Plan

I've put together rough milestones structured as "vertical slices" -- each milestone delivers a working end-to-end system for a narrow set of messages. After Milestone 1 is completed, most of the other milestones could be developed in parallel.

  • Milestone 1: open_channel End-to-End
  • Milestone 2: Funding Flow
  • Milestone 3: HTLC and Commitment Operations
  • Milestone 4: Co-op Channel Closes
  • Milestone 5: Channel Reestablish
  • Milestone 6: Gossip Messages
  • Milestone 7: Advanced Mutators
  • Milestone 8: Interactive Tx Protocol
  • Milestone 9+: Advanced Features

Milestone 1: open_channel End-to-End

The "minimum viable product". Minimal implementations of IR, mutators, generators, executor, etc. to enable basic fuzzing. The fuzzer can generate structurally valid open_channel messages via IR, send them to the target, read the accept_channel response, and extract variables from it.

Milestone 2: Funding Flow

The fuzzer can complete the channel establishment sequence through funding confirmation, including valid signing and mining of the funding transaction.

Milestone 3: HTLC and Commitment Operations

The fuzzer can add, fulfill, and fail HTLCs on an open channel, including the commitment dance.

Milestone 4: Co-op Channel Closes

The fuzzer can cooperatively close channels.

Milestone 5: Channel Reestablish

The fuzzer can disconnect, reconnect, and successfully resume channels via channel_reestablish.

Milestone 6: Gossip Messages

The fuzzer can send gossip messages with valid signatures.

Milestone 7: Advanced Mutators

Mutators exist for instruction reordering, instruction deletion, inserting generated snippets, and splicing two programs together.

Milestone 8: Interactive Tx Protocol

The fuzzer can complete dual-funded channel negotiations.

Milestone 9+: Advanced Features

Endless possibilities here, but some ideas:

  • Build valid onion packets.
  • Channel state oracle: detect various protocol violations during execution (e.g., accepting HTLCs on a shutting-down channel).
  • Add "constraint-based generators" that always respect protocol constraints (e.g., increasing commitment numbers, valid HTLC IDs).
  • Multi-channel scenarios.

Appendix: Fuzzing Engine Tradeoffs

Smite currently uses AFL++ with Nyx for all targets. The IR design is intended to work with AFL++ today and is structured to enable LibAFL migration later.

AFL++ with custom mutator -- current approach:

  • Advantages: No migration needed. Nyx integration, queue management, crash dedup, and UI all work today. Simple C ABI.
  • Disadvantages: Must use AFL_CUSTOM_MUTATOR_ONLY=1 (AFL++ byte-level havoc would corrupt IR structure). Serialization overhead on every mutation round. Must implement afl_custom_trim for instruction-level trimming (byte-level trimming destroys IR).

LibAFL with Nyx executor -- future approach:

  • Advantages: First-class IR support (mutators operate directly on Program structs, no serialization per mutation). Structural trimming and splicing. Feedback-driven generation. Fuzzamoto uses this approach.
  • Disadvantages: More implementation work.
