Problem
The current EncryptedBytesScenario encrypts and sends raw bytes as BOLT messages. This mostly tests message parsing, never reaching deeper protocol states.
To fuzz deeper, we need the fuzzer to become structure aware, able to send and receive sequences of valid (or semi-valid) messages that get past the initial parsing code and exercise deeper protocol logic. Ideally the fuzzer would be smart enough to generate the common message sequences to open channels, send or receive HTLCs, close channels, etc. Many of these message sequences have dependencies between the messages -- e.g., commitment_signed must contain a channel ID that matches a previously-opened channel, as well as signatures generated from the previously negotiated keys and commitment states. We want the fuzzer to be able to satisfy these dependencies as well.
Solution
We can use an intermediate representation (IR) to capture the type and structure knowledge needed to fuzz deeper. The fuzzer can then use this IR to generate and mutate short programs to be executed in the Nyx VM.
The following design is inspired by ideas from both syzkaller and fuzzamoto.
Architecture
We add a new crate smite-ir/ to contain the IR, custom mutators, and program generators, and a new scenario IrScenario<T, S> under smite-scenarios/ to execute IR programs in the Nyx VM.

The smite-ir crate is intended to be engine-agnostic -- no dependency on AFL++ or LibAFL. We intend to use AFL++ at first with a thin custom mutator wrapper library libsmite_ir_mutator.so loaded via AFL_CUSTOM_MUTATOR_LIBRARY. In the future, we may migrate to LibAFL by simply replacing the wrapper. See "Appendix: Fuzzing Engine Tradeoffs" for details.
Our custom mutators and generators create new Programs and serialize them for the fuzzing engine, which then sends each Program to the Nyx VM via shared memory. Inside the VM, our IrScenario deserializes the Program and executes it line-by-line. IrScenario then checks for and reports any crashes or hangs and resets the VM snapshot before processing the next Program.
Example IR Program
This program executes the channel funding flow up to the point where the target sends funding_signed. We will refer to this example in subsequent sections.
# Generate channel keys (6 key pairs)
v0 = GeneratePrivateKey(0x00)
v1 = DerivePoint(v0)
v2 = GeneratePrivateKey(0x01)
v3 = DerivePoint(v2)
v4 = GeneratePrivateKey(0x02)
v5 = DerivePoint(v4)
v6 = GeneratePrivateKey(0x03)
v7 = DerivePoint(v6)
v8 = GeneratePrivateKey(0x04)
v9 = DerivePoint(v8)
v10 = GeneratePrivateKey(0x05)
v11 = DerivePoint(v10)
# Channel parameters
v12 = GenerateTemporaryChannelId(0x00)
v13 = LoadChainHash()
v14 = LoadAmount(100_000)
v15 = LoadAmount(0)
v16 = LoadAmount(546)
v17 = LoadAmount(10_000_000)
v18 = LoadAmount(1_000)
v19 = LoadAmount(1)
v20 = LoadFeeratePerKw(2_500)
v21 = LoadCsvDelay(144)
v22 = LoadFeatures()
# Build and send open_channel
v23 = BuildOpenChannel(v13, v12, v14, v15, v16, v17, v18, v19, v20, v21, v1, v3, v5, v7, v9, v11, v22)
SendMessage(v23)
# Receive accept_channel, extract target's keys
v25 = RecvAcceptChannel()
v26 = ExtractFundingPubkey(v25)
v27 = ExtractFirstPerCommitmentPoint(v25)
# Create funding output via bitcoind
v28 = CreateFundingOutput(v1, v26, v14)
# Sign commitment transaction
v29 = BuildCommitmentTx(v28, v1, v26, v21)
v30 = ComputeCommitmentSighash(v29)
v31 = Sign(v0, v30)
# Build and send funding_created
v32 = BuildFundingCreated(v12, v28, v31)
SendMessage(v32)
# Receive funding_signed
v34 = RecvFundingSigned()
...
Key things to notice:
- SSA form: each instruction produces at most one variable, numbered by instruction index. v1 = DerivePoint(v0) means instruction 1 takes v0 as input and produces v1.
- Variable gaps (no v24, v33) indicate void instructions like SendMessage that have side effects but no output.
- Compound variables: RecvAcceptChannel() produces v25, an AcceptChannelData containing all parsed response fields. ExtractFundingPubkey(v25) pulls one field into a primitive Point for use by later Build operations.
- Possible mutations: InputSwapMutator could swap v3 (revocation_basepoint) with v5 (payment_basepoint) in BuildOpenChannel, since both are Point. OperationParamMutator could change LoadAmount(100_000) to LoadAmount(7_654).
Core Concepts
Program and Instructions
A Program is an ordered list of Instructions. Programs are serialized with postcard for transport between AFL++ and the VM. Snapshot state (target pubkey, chain hash, block height, channel keys if a channel is already open) lives in a separate ProgramContext that is supplied to the executor at run time.
An Instruction is an Operation plus input variable indices. In the example, BuildFundingCreated(v12, v28, v31) has operation BuildFundingCreated and inputs [12, 28, 31].
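As a rough illustration of the Program and Instruction shapes described above, the following sketch models a tiny subset of operations and checks the SSA rule that every input must refer to an earlier instruction that produces a value. The field names and the helpers produces_output and first_invalid are assumptions made for this example, and postcard serialization is left out to keep it dependency-free.

```rust
/// Hypothetical subset of operations, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Operation {
    GeneratePrivateKey(u8),
    DerivePoint,
    SendMessage,
}

impl Operation {
    /// Void operations have side effects but produce no variable.
    fn produces_output(&self) -> bool {
        !matches!(self, Operation::SendMessage)
    }
}

/// An operation plus the indices of the variables it consumes.
#[derive(Debug, Clone, PartialEq)]
struct Instruction {
    operation: Operation,
    inputs: Vec<usize>,
}

/// An ordered list of instructions; variable N is the output of instruction N.
#[derive(Debug, Clone, PartialEq, Default)]
struct Program {
    instructions: Vec<Instruction>,
}

impl Program {
    /// Returns the index of the first instruction whose inputs break SSA
    /// ordering (forward reference or reference to a void instruction).
    fn first_invalid(&self) -> Option<usize> {
        for (idx, inst) in self.instructions.iter().enumerate() {
            for &input in &inst.inputs {
                let ok = input < idx
                    && self.instructions[input].operation.produces_output();
                if !ok {
                    return Some(idx);
                }
            }
        }
        None
    }
}
```

A check like this could run after every mutation so that structurally invalid programs never reach the VM.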
Operations
An Operation is one of four categories:
- Load: produce a variable from embedded data or snapshot context (LoadAmount(100_000), LoadChainHash, LoadContextChannelId).
- Compute: derive a variable from inputs (DerivePoint, Sign, HashPaymentPreimage, Extract*).
- Build: construct a BOLT message from inputs (BuildOpenChannel, BuildCommitmentSigned).
- Act: produce side effects against the target (SendMessage, RecvAcceptChannel, MineBlocks, Reconnect).
Embedded literal data lives in the Operation itself (e.g., LoadAmount(100_000), GeneratePrivateKey(0x05)).
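One way the four categories could be encoded is sketched below. The Category enum and the category and has_literal methods are hypothetical names for this example, and only operations that the list above assigns to a category are included.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    Load,
    Compute,
    Build,
    Act,
}

/// Illustrative subset of operations named in this document.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Operation {
    LoadAmount(u64),
    LoadChainHash,
    DerivePoint,
    Sign,
    BuildOpenChannel,
    BuildCommitmentSigned,
    SendMessage,
    RecvAcceptChannel,
    MineBlocks,
    Reconnect,
}

impl Operation {
    /// Maps each operation to the category given in the list above.
    fn category(&self) -> Category {
        match self {
            Operation::LoadAmount(_) | Operation::LoadChainHash => Category::Load,
            Operation::DerivePoint | Operation::Sign => Category::Compute,
            Operation::BuildOpenChannel | Operation::BuildCommitmentSigned => Category::Build,
            Operation::SendMessage
            | Operation::RecvAcceptChannel
            | Operation::MineBlocks
            | Operation::Reconnect => Category::Act,
        }
    }

    /// True when the operation embeds a literal that a parameter mutator
    /// could change (assumption: only LoadAmount in this subset).
    fn has_literal(&self) -> bool {
        matches!(self, Operation::LoadAmount(_))
    }
}
```

Keeping the literal inside the enum variant means a parameter mutator can rewrite it without touching any variable indices.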
Recv Operations
Recv* operations (e.g., RecvAcceptChannel) read from the connection and produce compound variables containing all parsed fields. The executor's receive loop auto-responds to pings (pong) and returns the first non-ping message. If the returned message matches the expected type, it is parsed into a compound variable. If it doesn't match (e.g., target sends error instead of accept_channel), the executor stops immediately.
To minimize gossip noise, we can disable option_gossip_queries in our init features and drain any initial gossip messages received during pre-snapshot setup. In some scenarios it may also be helpful to use gossip_timestamp_filter to request the target to refrain from sending us gossip.
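The receive behavior described in this section can be sketched against a mock connection. MockConn, Msg, RecvError, and recv_expected are stand-ins invented for this example; the real executor would read from the encrypted connection and parse actual BOLT messages.

```rust
use std::collections::VecDeque;

/// Toy message kinds; the real executor would parse BOLT messages.
#[derive(Debug, Clone, PartialEq)]
enum Msg {
    Ping,
    Pong,
    AcceptChannel,
    Error,
}

#[derive(Debug, PartialEq)]
enum RecvError {
    /// The target sent something other than the expected message.
    Unexpected(Msg),
    /// No more messages are available.
    Closed,
}

/// Stand-in for the real connection: a queue of inbound messages and a
/// log of what we sent back.
struct MockConn {
    inbound: VecDeque<Msg>,
    sent: Vec<Msg>,
}

/// Answers pings with pongs, then returns the first non-ping message if it
/// matches `expected`; any other message stops execution with an error.
fn recv_expected(conn: &mut MockConn, expected: &Msg) -> Result<Msg, RecvError> {
    loop {
        let msg = conn.inbound.pop_front().ok_or(RecvError::Closed)?;
        if msg == Msg::Ping {
            conn.sent.push(Msg::Pong);
            continue;
        }
        return if &msg == expected {
            Ok(msg)
        } else {
            Err(RecvError::Unexpected(msg))
        };
    }
}
```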
Variables
A Variable is a typed runtime value produced by the executor -- ChannelId, Point, PrivateKey, Signature, Amount, Message, etc. Variables correspond to the runtime SSA outputs produced by each executed Instruction in a Program.
Compound variables (e.g., AcceptChannelData, FundingSignedData) bundle all fields from a parsed target response. Extract* operations pull individual fields into primitive types:
v25 = RecvAcceptChannel() # -> AcceptChannelData (compound)
v26 = ExtractFundingPubkey(v25) # -> Point (primitive)
Each Variable has a VariableType we can use to ensure mutations are type-safe.
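A minimal sketch of how variables and their types might relate is shown below. The variant layout, the var_type method, and the use of raw 33-byte arrays for points are assumptions for illustration, not the planned definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VariableType {
    Point,
    Amount,
    AcceptChannelData,
}

/// Compound value holding parsed accept_channel fields (two shown here).
#[derive(Debug, Clone, PartialEq)]
struct AcceptChannelData {
    funding_pubkey: [u8; 33],
    first_per_commitment_point: [u8; 33],
}

#[derive(Debug, Clone, PartialEq)]
enum Variable {
    Point([u8; 33]),
    Amount(u64),
    AcceptChannel(AcceptChannelData),
}

impl Variable {
    /// The static type tag that mutators can compare before swapping inputs.
    fn var_type(&self) -> VariableType {
        match self {
            Variable::Point(_) => VariableType::Point,
            Variable::Amount(_) => VariableType::Amount,
            Variable::AcceptChannel(_) => VariableType::AcceptChannelData,
        }
    }
}

/// Extract-style operation: pulls one primitive field out of a compound.
/// Returns None when the input is not an AcceptChannelData.
fn extract_funding_pubkey(v: &Variable) -> Option<Variable> {
    match v {
        Variable::AcceptChannel(data) => Some(Variable::Point(data.funding_pubkey)),
        _ => None,
    }
}
```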
Executor
The Executor is used by IrScenario to walk a Program instruction by instruction, executing the specified actions and maintaining a Vec<Variable> store.
Unlike fuzzamoto's Compiler (which pre-compiles the full program into flat actions on the host), we choose to directly interpret Programs in the VM. This simplifies fuzzing of the many interactive flows in the Lightning protocol, which require us to construct later messages using data sent by the target in earlier messages.
fn run(&mut self, input: &[u8]) -> ScenarioResult {
    // Inputs that do not deserialize into a Program are skipped.
    let program = match postcard::from_bytes::<Program>(input) {
        Ok(p) => p,
        Err(_) => return ScenarioResult::Skip,
    };
    let mut executor = Executor::new(&self.context);
    match executor.execute(&program, &mut self.conn, &mut self.bitcoind) {
        Ok(()) => {}
        // A connection error only counts as a crash if the target is no longer alive.
        Err(ExecuteError::Connection(_)) => {
            if self.target.check_alive().is_err() {
                return ScenarioResult::Fail("target crashed".into());
            }
        }
        Err(ExecuteError::Timeout) => {
            return ScenarioResult::Fail("target hung".into());
        }
        Err(_) => return ScenarioResult::Skip,
    }
    // Final ping-pong catches delayed crashes
    if let Err(e) = ping_pong(&mut self.conn) { ... }
    ScenarioResult::Ok
}
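To make the variable store concrete, here is a toy interpreter loop, written under the assumption that the store keeps one slot per instruction (None for void instructions) so that variable indices always equal instruction indices. BuildToyMessage, the sent log, and the error variants are invented for this sketch and do not talk to a real target.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Variable {
    Amount(u64),
    Message(Vec<u8>),
}

enum Operation {
    LoadAmount(u64),
    /// Hypothetical Build op: encodes one amount as 8 big-endian bytes.
    BuildToyMessage,
    SendMessage,
}

struct Instruction {
    operation: Operation,
    inputs: Vec<usize>,
}

#[derive(Debug, PartialEq)]
enum ExecuteError {
    MissingInput(usize),
    TypeMismatch(usize),
}

struct Executor {
    /// One slot per executed instruction; void instructions store None.
    store: Vec<Option<Variable>>,
    /// Stand-in for the connection: records every message "sent".
    sent: Vec<Vec<u8>>,
}

impl Executor {
    fn new() -> Self {
        Executor { store: Vec::new(), sent: Vec::new() }
    }

    /// Looks up the variable referenced by input position `pos`.
    fn input(&self, inst: &Instruction, pos: usize) -> Result<&Variable, ExecuteError> {
        let idx = *inst.inputs.get(pos).ok_or(ExecuteError::MissingInput(pos))?;
        self.store
            .get(idx)
            .and_then(|slot| slot.as_ref())
            .ok_or(ExecuteError::MissingInput(idx))
    }

    fn execute(&mut self, program: &[Instruction]) -> Result<(), ExecuteError> {
        for (idx, inst) in program.iter().enumerate() {
            let output = match &inst.operation {
                Operation::LoadAmount(sats) => Some(Variable::Amount(*sats)),
                Operation::BuildToyMessage => {
                    let arg = self.input(inst, 0)?.clone();
                    match arg {
                        Variable::Amount(sats) => {
                            Some(Variable::Message(sats.to_be_bytes().to_vec()))
                        }
                        _ => return Err(ExecuteError::TypeMismatch(idx)),
                    }
                }
                Operation::SendMessage => {
                    let arg = self.input(inst, 0)?.clone();
                    match arg {
                        Variable::Message(bytes) => {
                            self.sent.push(bytes);
                            None
                        }
                        _ => return Err(ExecuteError::TypeMismatch(idx)),
                    }
                }
            };
            self.store.push(output);
        }
        Ok(())
    }
}
```

Because each instruction is interpreted only after its predecessors have run, a later Build operation can consume values that only exist once the target has replied.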
Generators
Generators produce type-correct instruction sequences that represent protocol interactions. Each generator knows the shape of a protocol flow (what messages to construct, what keys to generate, what order to send/recv) but delegates value selection and variable reuse to ProgramBuilder.
ProgramBuilder
ProgramBuilder is the shared infrastructure that all generators use. It maintains:
- The instruction list being built (append-only, SSA)
- A type-indexed variable registry tracking all produced variables (direct primitives and extractable compound fields)
- The pick_variable() method for probabilistic variable selection
Generators call builder methods -- they never manipulate instruction indices or variable references directly:
// Generator asks builder for variables.
// Builder decides: reuse existing? extract from compound? generate fresh?
let chan_id = builder.pick_variable(VariableType::ChannelId, rng);
let feerate = builder.pick_variable(VariableType::Amount, rng);
// Generator tells builder what instruction to emit.
let msg_idx = builder.append(Operation::BuildUpdateFee, &[chan_id, feerate]);
This separation means generators encode protocol knowledge (e.g., open_channel needs 6 key pairs) while the builder encodes fuzzing strategy (e.g., picks which variables to reuse or generate randomly).
Resource-Aware Variable Selection
When a generator needs a variable, ProgramBuilder::pick_variable() selects randomly from different strategies according to their weight:
- Reuse recent (75%): Recently-created variables are more likely to be useful for exercising multi-message protocol flows.
- Reuse any (15%): Cross-pollinates between protocol flows.
- Generate fresh (10%): Emits instructions that produce a new valid value of the requested type.
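The weighted choice above could look roughly like the sketch below. To stay dependency-free it takes pre-drawn random numbers (roll and pick) instead of an RNG, and it assumes "recent" means the last RECENT_WINDOW candidates; both are choices made for this example rather than part of the design.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    ReuseRecent,
    ReuseAny,
    GenerateFresh,
}

/// Assumed size of the "recent" window; the design does not specify one.
const RECENT_WINDOW: usize = 4;

/// Maps a roll (reduced modulo 100) to a strategy using 75/15/10 weights.
fn choose_strategy(roll: u32) -> Strategy {
    match roll % 100 {
        0..=74 => Strategy::ReuseRecent,
        75..=89 => Strategy::ReuseAny,
        _ => Strategy::GenerateFresh,
    }
}

/// `candidates` holds indices of existing variables of the requested type,
/// oldest first. Returns Some(index) to reuse a variable, or None to tell
/// the caller to emit instructions that generate a fresh value.
fn pick_variable(candidates: &[usize], roll: u32, pick: usize) -> Option<usize> {
    if candidates.is_empty() {
        // Nothing to reuse, so a fresh value is the only option.
        return None;
    }
    match choose_strategy(roll) {
        Strategy::ReuseRecent => {
            let start = candidates.len().saturating_sub(RECENT_WINDOW);
            let recent = &candidates[start..];
            Some(recent[pick % recent.len()])
        }
        Strategy::ReuseAny => Some(candidates[pick % candidates.len()]),
        Strategy::GenerateFresh => None,
    }
}
```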
Generator Types
Generators are organized into three types.
Message generators emit the instructions for a single protocol message: load parameters, build the message, send it, and optionally receive the response. They are the building blocks for larger protocol flows. Examples: OpenChannelMsg, FundingCreatedMsg, ChannelReadyMsg, UpdateAddHtlcMsg.
Action generators perform a single action, such as mining blocks via bitcoind. Example: MineBlocksAction.
Flow generators compose message and action generators in sequence, threading variables between them via ProgramBuilder. They are the easiest way for the fuzzer to reach deep protocol states when many constraints (matching keys, valid signatures, correct sequencing) need to align. Examples: ChannelOpenFlow, HtlcAddFlow, HtlcFulfillFlow, InteractiveTxFlow.
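The composition of generator types might be expressed with a shared trait, as in this simplified sketch. The Generator trait, the Flow struct, and the steps chosen for channel_open_flow are assumptions for illustration, and the stand-in ProgramBuilder records operation names only, so variable threading is not shown.

```rust
/// Simplified stand-in for ProgramBuilder: records operation names only.
#[derive(Default)]
struct ProgramBuilder {
    ops: Vec<&'static str>,
}

impl ProgramBuilder {
    /// Appends an operation and returns its instruction index.
    fn append(&mut self, op: &'static str) -> usize {
        self.ops.push(op);
        self.ops.len() - 1
    }
}

/// Hypothetical common interface for message, action, and flow generators.
trait Generator {
    fn generate(&self, builder: &mut ProgramBuilder);
}

/// Message generator: build, send, and receive the response.
struct OpenChannelMsg;
impl Generator for OpenChannelMsg {
    fn generate(&self, builder: &mut ProgramBuilder) {
        builder.append("BuildOpenChannel");
        builder.append("SendMessage");
        builder.append("RecvAcceptChannel");
    }
}

struct FundingCreatedMsg;
impl Generator for FundingCreatedMsg {
    fn generate(&self, builder: &mut ProgramBuilder) {
        builder.append("BuildFundingCreated");
        builder.append("SendMessage");
        builder.append("RecvFundingSigned");
    }
}

/// Action generator: a single side-effecting step.
struct MineBlocksAction;
impl Generator for MineBlocksAction {
    fn generate(&self, builder: &mut ProgramBuilder) {
        builder.append("MineBlocks");
    }
}

/// Flow generator: runs its steps in order against the same builder.
struct Flow {
    steps: Vec<Box<dyn Generator>>,
}

impl Generator for Flow {
    fn generate(&self, builder: &mut ProgramBuilder) {
        for step in &self.steps {
            step.generate(builder);
        }
    }
}

/// Assumed composition of a channel-open flow, for illustration.
fn channel_open_flow() -> Flow {
    Flow {
        steps: vec![
            Box::new(OpenChannelMsg),
            Box::new(FundingCreatedMsg),
            Box::new(MineBlocksAction),
        ],
    }
}
```

Since a Flow is itself a Generator in this sketch, flows could also be nested or handed to an insertion mutator without special handling.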
Standalone vs. Insertion
Generators are used both to generate programs from scratch and to insert new code into existing programs as part of the GeneratorInsertionMutator.
Mutators
Mutators transform existing programs while preserving structural validity.
Planned Mutators
| Mutator | What it does |
| --- | --- |
| OperationParamMutator | Pick a random instruction with mutable parameters and mutate its embedded literal. Type-aware: amounts get boundary values (0, 1, u64::MAX) and random ranges; byte arrays get bit flips, insertions, and deletions; feerates/delays get truncated or maximized. |
| InputSwapMutator | Replace a variable reference in a random instruction with a different variable of the same VariableType. |
| InstructionReorderMutator | Swap two Act instructions (SendMessage, MineBlocks, etc.) that have no data dependency between them. |
| SpliceMutator | Pick a random program from the corpus and interleave its instruction subsequence into the current program at a random point, adjusting variable indices. |
| InstructionDeleteMutator | Remove a random instruction. |
| GeneratorInsertionMutator | Insert a freshly generated instruction subsequence (via a generator) at a random point. |
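As one example of a type-preserving mutation, the sketch below shows how an input swap could be restricted to earlier variables of the same type. The Instruction layout (with an explicit output type) and the input_swap signature are assumptions for this example; random choices are passed in as plain numbers to avoid an RNG dependency.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VariableType {
    Point,
    Amount,
}

#[derive(Debug, Clone, PartialEq)]
struct Instruction {
    name: &'static str,
    inputs: Vec<usize>,
    /// Type of the produced variable, or None for void instructions.
    output: Option<VariableType>,
}

/// Replaces input `slot` of instruction `target` with a different earlier
/// variable of the same type. Returns false (leaving the program unchanged)
/// when no such variable exists.
fn input_swap(program: &mut [Instruction], target: usize, slot: usize, pick: usize) -> bool {
    let current = match program.get(target).and_then(|inst| inst.inputs.get(slot)) {
        Some(&idx) => idx,
        None => return false,
    };
    let wanted = match program.get(current).and_then(|inst| inst.output) {
        Some(ty) => ty,
        None => return false,
    };
    // Only variables defined before `target` keep the program in SSA order.
    let candidates: Vec<usize> = (0..target)
        .filter(|&idx| idx != current && program[idx].output == Some(wanted))
        .collect();
    if candidates.is_empty() {
        return false;
    }
    program[target].inputs[slot] = candidates[pick % candidates.len()];
    true
}
```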
Snapshot Setup
Different fuzzing goals require different starting states. IrScenario<T, S> is parameterized by a SnapshotSetup trait:
trait SnapshotSetup<T: Target> {
    fn setup(target: &T, conn: &mut NoiseConnection) -> Result<ProgramContext, ScenarioError>;
}
Snapshot setup is Rust code that drives the target through an initial deterministic protocol sequence using NoiseConnection and bitcoin-cli directly. IrScenario calls setup() before snapshotting the VM state.
Sample snapshot variants:
| Setup | Snapshot state | IR fuzzes... |
| --- | --- | --- |
| PostInitSetup | After handshake + init exchange | open_channel, gossip, any first message |
| PostChannelOpenSetup | After channel is funded + ready | HTLCs, commitment rounds, fees, closure |
| InteractiveTxSetup | Mid-negotiation (after open_channel2 + accept_channel2) | tx_[add/remove]_[input/output], tx_complete |
ProgramContext carries setup state into the executor. For example:
struct ProgramContext {
    // Always present
    target_pubkey: [u8; 33],
    chain_hash: [u8; 32],
    block_height: u32,
    target_features: Vec<u8>,
    // Present after PostChannelOpenSetup
    channel_id: Option<[u8; 32]>,
    local_keys: Option<ChannelKeys>,
    remote_keys: Option<ChannelKeys>,
    funding_outpoint: Option<OutPoint>,
    commitment_number: Option<u64>,
}
LoadContext* operations access these fields, erroring at execution time if absent for the current snapshot variant. Each setup variant is a separate binary (e.g., ldk_ir_post_init, ldk_ir_post_channel), enabling independent fuzzing campaigns with different corpora.
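The execution-time error for absent context fields might be handled along these lines. The trimmed-down ProgramContext, the MissingContext error variant, and the function name are hypothetical and exist only for this sketch.

```rust
#[derive(Debug, PartialEq)]
enum ExecuteError {
    /// The current snapshot variant did not populate the named field.
    MissingContext(&'static str),
}

/// Reduced context with one always-present and one optional field.
struct ProgramContext {
    chain_hash: [u8; 32],
    channel_id: Option<[u8; 32]>,
}

/// LoadContextChannelId: succeeds only when setup opened a channel.
fn load_context_channel_id(ctx: &ProgramContext) -> Result<[u8; 32], ExecuteError> {
    ctx.channel_id.ok_or(ExecuteError::MissingContext("channel_id"))
}

/// LoadChainHash: always available, so it cannot fail.
fn load_chain_hash(ctx: &ProgramContext) -> [u8; 32] {
    ctx.chain_hash
}
```

With an error like this, a program that was generated for a post-channel-open snapshot but run against a post-init snapshot fails cleanly instead of producing a bogus value.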
Implementation Plan
I've put together rough milestones structured as "vertical slices" -- each milestone delivers a working end-to-end system for a narrow set of messages. After Milestone 1 is completed, most of the other milestones could be developed in parallel.
Milestone 1: open_channel End-to-End
The "minimum viable product". Minimal implementations of IR, mutators, generators, executor, etc. to enable basic fuzzing. The fuzzer can generate structurally valid open_channel messages via IR, send them to the target, read the accept_channel response, and extract variables from it.
Milestone 2: Funding Flow
The fuzzer can complete the channel establishment sequence through funding confirmation, including valid signing and mining of the funding transaction.
Milestone 3: HTLC and Commitment Operations
The fuzzer can add, fulfill, and fail HTLCs on an open channel, including the commitment dance.
Milestone 4: Co-op Channel Closes
The fuzzer can co-op close channels.
Milestone 5: Channel Reestablish
The fuzzer can disconnect, reconnect, and successfully resume channels via channel_reestablish.
Milestone 6: Gossip Messages
The fuzzer can send gossip messages with valid signatures.
Milestone 7: Advanced Mutators
Mutators exist for instruction reordering, instruction deletion, inserting generated snippets, and splicing two programs together.
Milestone 8: Interactive Tx Protocol
The fuzzer can complete dual-funded channel negotiations.
Milestone 9+: Advanced Features
Endless possibilities here, but some ideas:
- Build valid onion packets.
- Channel state oracle: detect various protocol violations during execution (e.g., accepting HTLCs on a shutting-down channel).
- Add "constraint-based generators" that always respect protocol constraints (e.g., increasing commitment numbers, valid HTLC IDs).
- Multi-channel scenarios.
Appendix: Fuzzing Engine Tradeoffs
Smite currently uses AFL++ with Nyx for all targets. The IR design is intended to work with AFL++ today and is structured to enable LibAFL migration later.
AFL++ with custom mutator -- current approach:
- Advantages: No migration needed. Nyx integration, queue management, crash dedup, and UI all work today. Simple C ABI.
- Disadvantages: Must use AFL_CUSTOM_MUTATOR_ONLY=1 (AFL++ byte-level havoc would corrupt IR structure). Serialization overhead on every mutation round. Must implement afl_custom_trim for instruction-level trimming (byte-level trimming destroys IR).
LibAFL with Nyx executor -- future approach:
- Advantages: First-class IR support (mutators operate directly on Program structs, no serialization per mutation). Structural trimming and splicing. Feedback-driven generation. Fuzzamoto uses this approach.
- Disadvantages: More implementation work.
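Instruction-level trimming, which either engine would need, has to respect the fact that variables are numbered by instruction index. The sketch below shows one possible removal step that refuses to drop an instruction whose output is still used and renumbers later references otherwise. The Instruction layout and the remove_instruction function are assumptions for this example, and the AFL++ or LibAFL glue around it is not shown.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Instruction {
    name: &'static str,
    inputs: Vec<usize>,
}

/// Returns a copy of `program` without instruction `remove`, shifting later
/// variable indices down by one. Returns None when `remove` is out of range
/// or when another instruction still consumes its output.
fn remove_instruction(program: &[Instruction], remove: usize) -> Option<Vec<Instruction>> {
    if remove >= program.len() {
        return None;
    }
    let mut out = Vec::with_capacity(program.len() - 1);
    for (idx, inst) in program.iter().enumerate() {
        if idx == remove {
            continue;
        }
        let mut inputs = Vec::with_capacity(inst.inputs.len());
        for &input in &inst.inputs {
            if input == remove {
                // Dropping this instruction would leave a dangling reference.
                return None;
            }
            inputs.push(if input > remove { input - 1 } else { input });
        }
        out.push(Instruction { name: inst.name, inputs });
    }
    Some(out)
}
```

A trimming loop could try each index in turn and keep a removal only when the shortened program still reaches the same coverage.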