diff --git a/doc/llm/CLAUDE.md b/doc/llm/CLAUDE.md index c33dc1825..0d7dd179c 100644 --- a/doc/llm/CLAUDE.md +++ b/doc/llm/CLAUDE.md @@ -181,6 +181,45 @@ SEARCH (fdom _) SEARCH (_ %/ _) ``` +### When the agent can't hold a persistent pipe + +Some LLM front-ends (Claude Code, Codex CLI, etc.) expose the shell as +one-shot command execution — the agent *cannot* keep stdin/stdout open +to `easycrypt llm` across turns. Naively, every tactic would then +require re-`LOAD`-ing the file prefix. + +The `scripts/llm/` directory contains a reference fifo-backed daemon +that lets many short-lived shell invocations drive one persistent EC +REPL. Start it once at the beginning of a session, then issue each +tactic as a separate `ec_send.sh` call. See `scripts/llm/README.md` for +the full pattern. + +### Iterate interactively, then codify + +For long proofs — especially program-logic bodies with 5+ chained +`ecall`s, or big algebraic derivations — write tactics one at a time +in the REPL, inspecting `GOALS` after each, and only commit the final +sequence back to the file once it works. Two reasons: + +- **SMT budget.** Committing a long batch and issuing one LOAD + compounds SMT cost across every intermediate side-condition at once. + The same tactics often pass individually with a shorter per-call + timeout. +- **Cheap backtracking.** Use `CHECKPOINT` at every branching point + and `REVERT` liberally — it's O(1), whereas re-LOAD re-compiles the + whole prefix. + +Pattern: + +``` +LOAD "file.ec" 100 ← note uuid:N +CHECKPOINT before_body + +GOALS ← inspect + +REVERT before_body ← if the branch fails +``` + ## EasyCrypt proof strategy ### General approach @@ -246,6 +285,64 @@ SEARCH (_ %/ _) - `rewrite lemma in H` modifies hypothesis `H` in place (it does not consume it). If you need to preserve the original, copy it first: `have H' := H; rewrite lemma in H'`. +- **`have H := lemma args` may leave unevaluated lambdas.** Lemma + applications that produce `init`/`map`/`filter` terms often carry + β-redexes into the introduced hypothesis. Later `rewrite H` can then + hang indefinitely while the kernel tries to unify. Force + normalization at introduction with `have /= H := lemma args`. +- **Tactic-repetition (`!tac`) over-applies on nested structures.** + `!mapiE`, `!initiE`, `!allP` descend into inner layers at different + array sizes, generate side-conditions that can't be closed by `/#`, + and either leave stale goals or hang. Use the exact count needed + (usually 1 or 2 for an equality: `rewrite mapiE 1:/# /= mapiE 1:/#`). +- **`1:/#` vs `1:smt(hints)` in rewrite chains.** Inside a + `rewrite lemma 1:...` position only `/#` parses; `1:smt(hints)` is + a parse error. Break the chain: + `rewrite lemma 1:/#; first smt(hints). rewrite next_lemma.` +- **`suff` not `suffices`.** The surface syntax uses `suff H : P.`; + `suffices` triggers a parse error. +- **`rewrite allP` only touches the first `all`.** When unfolding + range predicates on both sides of an implication + (`all P x => all Q x`), use `!allP` — otherwise the subsequent + `=> H j Hj` fails with "nothing to introduce" because the conclusion + is still wrapped in `all`. +- **`congr` vs `congr 1`.** `congr` walks down an array equality all + the way to list/element equality; `congr 1` takes one structural + step. Reach for `congr 1` when `congr` descends too far and exposes + stale `init`/`mkseq` structure. + +### Reconciling syntactically distinct constants + +When a word-type modulus (`W13.modulus`, `W8.modulus`, …) appears on +one side of a goal and its numeric literal (`8192`, `256`, …) on the +other — for instance after one subterm was lowered through a circuit +lemma while the other came from the spec — `smt()` treats them as +distinct atoms and won't bridge. Normalize explicitly: + +```easycrypt +have W13_eq : W13.modulus = 8192 by smt(W13.ge2_modulus). +rewrite W13_eq. +``` + +Same pattern for `modulus_W`, cardinalities of finite types, and other +constants that have both a symbolic and a numeric form. + +### Ring/field commutativity: use the fully-qualified path + +Bare `mulrC`/`addrC` may not resolve against a concrete ring clone's +theory. When working in a cloned ring (`Zq`, `Fq`, polynomial rings +over Zq, …), the fully-qualified name of the `ComRing`/`ZModule` lemma +usually succeeds where the short form fails: + +```easycrypt +by rewrite Zq.ComRing.mulrC. +``` + +Use `SEARCH` to find the right path: + +``` +SEARCH (_ * _) (commutative _) +``` ## EasyCrypt language overview diff --git a/scripts/llm/README.md b/scripts/llm/README.md new file mode 100644 index 000000000..600add1ed --- /dev/null +++ b/scripts/llm/README.md @@ -0,0 +1,107 @@ +# LLM-oriented tooling for `easycrypt llm` + +This directory contains two example clients for the interactive REPL +exposed by `easycrypt llm` (see `doc/llm/CLAUDE.md` for the wire +protocol). They are layered: each can be useful independently, depending +on how your driver (human, script, or LLM agent) is structured. + +| Layer | File | Use when | +|-------|------|----------| +| Python binding | `ec_llm.py` | Writing a Python program that drives one EasyCrypt REPL. | +| Fifo daemon | `daemon.py` + `ec_send.sh` | Driving one persistent REPL from *many* short-lived shell invocations — notably LLM agents that issue one bash tool call per tactic. | + +## Why a fifo daemon? + +`easycrypt llm` is a stateful, long-running process. Python programs can +keep its stdin/stdout pipe open across commands, but many LLM-agent +front-ends (Claude Code, Codex CLI, etc.) only expose the shell via +one-shot command execution. Without a persistent intermediary, every +tactic would require re-LOADing the file prefix — minutes wasted per +step on a large project. + +The daemon opens `easycrypt llm` once, listens on a named pipe for JSON +requests, and writes JSON responses to a second named pipe. The +companion `ec_send.sh` is the shell-side client: it takes a single JSON +request, forwards it through the pipes, and prints the structured +response. Combined, they let an agent drive a persistent REPL with one +bash call per step. + +## Python binding (Layer 1) + +```python +from ec_llm import ECLLM + +with ECLLM(cwd="/path/to/project", extra_args=["-p", "Z3"]) as ec: + ok, resp = ec.load("proofs/myfile.ec", 42, nosmt=True) + print(ec.goals()) + ok, resp = ec.tactic("smt().") + print(resp) +``` + +Everything in the `easycrypt llm` protocol is exposed as a method: +`load`, `checkpoint`, `revert`, `undo`, `goals`, `goals_all`, `tactic`, +`tactic_multiline`, `search`, `quiet`, `close`. Status parsing tracks +the running uuid in `ec.uuid`. + +CLI mode provides a quick "LOAD and report" check: + +``` +python3 ec_llm.py proofs/myfile.ec 42 --cwd /path/to/project --nosmt +``` + +## Fifo daemon (Layer 2) + +Start the daemon once (most easily in the background): + +``` +# Create the fifos if needed and start the daemon +mkfifo /tmp/ec_in /tmp/ec_out 2>/dev/null || true +python3 scripts/llm/daemon.py \ + --cwd /path/to/project \ + --prover Z3 --timeout 5 & +``` + +Drive it with `ec_send.sh`: + +``` +./ec_send.sh '{"op":"load","arg":"proofs/myfile.ec","arg2":"42","nosmt":true}' +./ec_send.sh '{"op":"goals"}' +./ec_send.sh '{"op":"tactic","arg":"smt()."}' +./ec_send.sh '{"op":"checkpoint","arg":"before_split"}' +./ec_send.sh '{"op":"tactic","arg":"split."}' +./ec_send.sh '{"op":"revert","arg":"before_split"}' +``` + +For payloads with shell-hostile content (EC escapes like `/\\`), +write the JSON to a file and use `--file`: + +``` +cat > /tmp/req.json <<'EOF' +{"op":"tactic","arg":"rewrite /wpoly_srng !allP => H j Hj; smt(hint)."} +EOF +./ec_send.sh --file /tmp/req.json +``` + +## Operational tips + +- **Per-session setup.** Create fifos once per session. The daemon keeps + them; don't delete them while it's running. +- **Recovering from a dead daemon.** If the EasyCrypt process crashes + (stack overflow, OOM on SMT), `ec_send.sh` will hang. Kill the daemon + process, recreate the fifos (`rm -f /tmp/ec_in /tmp/ec_out; mkfifo + /tmp/ec_in /tmp/ec_out`), and restart. +- **Fast SMT feedback.** Pass `--timeout 1 --prover Z3` to the daemon + during exploration; fragile SMT calls surface immediately. Raise the + timeout when codifying a final proof. +- **Parallel sessions.** To run multiple daemons side-by-side, give each + a distinct pair of fifo paths and set `EC_IN_FIFO` / `EC_OUT_FIFO` + for the `ec_send.sh` caller. +- **Logging.** Exceptions inside the daemon are written to `/tmp/ec_daemon.log` + (or `--log`). Check there first if a request seems silently dropped. + +## Security note + +The daemon trusts whatever shows up on its input fifo — it forwards the +`arg` payload straight to EasyCrypt as tactic text. Don't expose the +fifos on a shared filesystem, and don't run the daemon as a privileged +user. diff --git a/scripts/llm/daemon.py b/scripts/llm/daemon.py new file mode 100755 index 000000000..16a681dba --- /dev/null +++ b/scripts/llm/daemon.py @@ -0,0 +1,142 @@ +#!/usr/bin/env python3 +""" +Fifo-backed daemon exposing one persistent `easycrypt llm` instance to many +shell callers. Typical use case: an LLM agent (Claude Code, etc.) that +invokes a bash tool *once per tactic* — the agent can't hold stdin/stdout +open across turns, so without a daemon every tactic would re-LOAD the +whole file prefix. + +Protocol (matches the JSON line format produced by `ec_send.sh`): + + request (one line on stdin-fifo): + {"op": "", "arg": "...", "arg2": "...", "nosmt": bool} + + response (one line on stdout-fifo): + {"ok": bool, "uuid": N, "resp": "..."} + +Supported ops: + load arg=path, arg2=line (string or int), optional nosmt: bool + tactic arg="proc." + tactic_multiline arg="" (wraps in /) + undo + revert arg="" + checkpoint arg="" + goals + goals_all + search arg="" + quiet arg="on"|"off" + close + +Usage: + # start daemon (in background) + python3 scripts/llm/daemon.py --cwd /path/to/project \\ + --in-fifo /tmp/ec_in --out-fifo /tmp/ec_out & + + # then drive with scripts/llm/ec_send.sh (see that file for details) + +The fifos are created if they don't exist. Only one connected writer and +one connected reader are expected at a time — the client (`ec_send.sh`) +enforces this by opening them sequentially. +""" + +import argparse +import json +import os +import sys +import traceback + +HERE = os.path.dirname(os.path.abspath(__file__)) +sys.path.insert(0, HERE) +from ec_llm import ECLLM # noqa: E402 + + +def _ensure_fifo(path): + if not os.path.exists(path): + os.mkfifo(path) + elif not os.path.exists(path): # raced out; recreate + os.mkfifo(path) + + +def _handle(req, ec): + op = req.get("op") + arg = req.get("arg", "") + arg2 = req.get("arg2", "") + + if op == "load": + ok, resp = ec.load(arg, int(arg2), nosmt=bool(req.get("nosmt", False))) + elif op == "tactic": + ok, resp = ec.tactic(arg) + elif op == "tactic_multiline": + ok, resp = ec.tactic_multiline(arg) + elif op == "undo": + ok, resp = ec.undo() + elif op == "revert": + ok, resp = ec.revert(arg) + elif op == "checkpoint": + ok, resp = ec.checkpoint(arg) + elif op == "goals": + resp, ok = ec.goals(), True + elif op == "goals_all": + resp, ok = ec.goals_all(), True + elif op == "search": + resp, ok = ec.search(arg), True + elif op == "quiet": + ok, resp = ec.quiet(on=(arg.lower() == "on")) + elif op == "close": + ec.close() + return {"ok": True, "uuid": ec.uuid, "resp": "bye"}, True + else: + return {"ok": False, "uuid": ec.uuid, "resp": f"unknown op: {op}"}, False + + return {"ok": ok, "uuid": ec.uuid, "resp": resp}, False + + +def main(): + p = argparse.ArgumentParser(description=__doc__.splitlines()[1]) + p.add_argument("--cwd", default=None, help="Project root (default: cwd)") + p.add_argument("--in-fifo", default="/tmp/ec_in", help="Request fifo path") + p.add_argument("--out-fifo", default="/tmp/ec_out", help="Response fifo path") + p.add_argument("--log", default="/tmp/ec_daemon.log", help="Log file") + p.add_argument("--prover", default=None, help="SMT prover (e.g. Z3)") + p.add_argument("--timeout", type=int, default=None, help="Per-goal SMT timeout (s)") + args = p.parse_args() + + _ensure_fifo(args.in_fifo) + _ensure_fifo(args.out_fifo) + + extra = [] + if args.prover: + extra += ["-p", args.prover] + if args.timeout is not None: + extra += ["-timeout", str(args.timeout)] + + ec = ECLLM(cwd=args.cwd, extra_args=extra) + with open(args.log, "w") as lf: + lf.write(f"EC started. uuid={ec.uuid}\n") + + while True: + try: + # Opening a fifo for reading blocks until a writer opens it. + with open(args.in_fifo, "r") as fi: + line = fi.readline() + if not line.strip(): + continue + req = json.loads(line) + result, should_exit = _handle(req, ec) + with open(args.out_fifo, "w") as fo: + fo.write(json.dumps(result) + "\n") + if should_exit: + break + except Exception: + with open(args.log, "a") as lf: + lf.write("EXCEPTION: " + traceback.format_exc() + "\n") + try: + with open(args.out_fifo, "w") as fo: + fo.write(json.dumps({"ok": False, "uuid": -1, + "resp": traceback.format_exc()}) + "\n") + except Exception: + pass + + +if __name__ == "__main__": + main() diff --git a/scripts/llm/ec_llm.py b/scripts/llm/ec_llm.py new file mode 100644 index 000000000..09f34fba3 --- /dev/null +++ b/scripts/llm/ec_llm.py @@ -0,0 +1,205 @@ +""" +Python binding for the `easycrypt llm` interactive REPL. + +Usage: + from ec_llm import ECLLM + + ec = ECLLM(cwd="path/to/project") # start REPL in project dir, wait for READY + ec.load("proofs/foo.ec", 42) # load through line 42 + ec.load("proofs/foo.ec", 100, nosmt=True) # fast prefix load, skip SMT + ec.checkpoint("c0") # name current state + ec.goals() # first subgoal + remaining count + ec.goals_all() # all subgoals + ec.tactic("apply H.") # send one tactic (must end with '.') + ec.revert("c0") # instant jump back to saved state + ec.search("(fdom _)") # pattern-based lemma search + ec.close() + +Protocol (from `doc/llm/CLAUDE.md`): + - Each response ends with a line containing only ``. + - `OK [uuid:N] ... ` — success, uuid incremented. + - `ERROR [uuid:N] ... ` — failure, uuid unchanged. + +Run EasyCrypt from a directory that contains (or is under) a valid +`easycrypt.project`, otherwise `LOAD` will not resolve library paths. +""" + +import os +import re +import subprocess + + +class ECLLM: + """Thin Python wrapper around one `easycrypt llm` process.""" + + def __init__(self, cwd=None, extra_args=None, exe="easycrypt"): + """ + Start an `easycrypt llm` subprocess and block until READY. + + Args: + cwd: Working directory (should contain / be under easycrypt.project). + Defaults to os.getcwd(). + extra_args: Extra CLI flags passed to `easycrypt llm`, e.g. + `["-p", "Z3", "-timeout", "5"]`. + exe: Name or path of the EasyCrypt binary (default: `easycrypt`). + """ + if cwd is None: + cwd = os.getcwd() + args = [exe, "llm"] + if extra_args: + args.extend(extra_args) + self.proc = subprocess.Popen( + args, + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.STDOUT, + text=True, + bufsize=1, + cwd=cwd, + ) + self.uuid = 0 + banner = self._read_until_end() + if "READY" not in banner: + raise RuntimeError(f"Expected READY, got:\n{banner}") + + # ------------------------------------------------------------------ protocol + + def _read_until_end(self): + """Read lines until a line that is exactly `` (stripped).""" + lines = [] + for raw in self.proc.stdout: + line = raw.rstrip("\n") + if line.strip() == "": + break + lines.append(line) + return "\n".join(lines) + + def _send(self, command): + """Send one command line and return the full response body.""" + self.proc.stdin.write(command + "\n") + self.proc.stdin.flush() + return self._read_until_end() + + def _parse_status(self, response): + """Update self.uuid and return True on OK, False on ERROR. + + LOAD responses may include theory/axiom dumps before the status + line, so we search the entire response. + """ + m_ok = re.search(r'^OK\s+\[uuid:(\d+)\]', response, re.MULTILINE) + m_err = re.search(r'^ERROR\s+\[uuid:(\d+)\]', response, re.MULTILINE) + if m_ok: + self.uuid = int(m_ok.group(1)) + return True + if m_err: + return False + # Fallback: tolerate early-banner responses + return response.lstrip().upper().startswith("OK") + + # ------------------------------------------------------------------ commands + + def load(self, path, line, nosmt=False): + """LOAD `path` through `line`. Returns `(ok, response)`.""" + cmd = f'LOAD "{path}" {line}' + if nosmt: + cmd += " -nosmt" + resp = self._send(cmd) + return self._parse_status(resp), resp + + def checkpoint(self, name): + """Save the current uuid under `name` for later REVERT.""" + resp = self._send(f"CHECKPOINT {name}") + return self._parse_status(resp), resp + + def revert(self, name_or_uuid): + """Jump back to a named checkpoint or a uuid.""" + resp = self._send(f"REVERT {name_or_uuid}") + return self._parse_status(resp), resp + + def undo(self): + """Undo the last successful step.""" + resp = self._send("UNDO") + return self._parse_status(resp), resp + + def goals(self): + """Return the first subgoal + remaining count.""" + return self._send("GOALS") + + def goals_all(self): + """Return all open subgoals.""" + return self._send("GOALS ALL") + + def tactic(self, tac): + """Send one EasyCrypt statement (must end with `.`). Returns `(ok, response)`.""" + resp = self._send(tac) + return self._parse_status(resp), resp + + def tactic_multiline(self, body): + """Send a multi-line EasyCrypt block delimited by ``/``.""" + payload = "\n" + body.rstrip() + "\n" + resp = self._send(payload) + return self._parse_status(resp), resp + + def search(self, pattern): + """Run `SEARCH pattern` — pattern syntax per the EasyCrypt manual.""" + return self._send(f"SEARCH {pattern}") + + def quiet(self, on=True): + """Turn automatic goal display after each tactic on/off.""" + resp = self._send("QUIET ON" if on else "QUIET OFF") + return self._parse_status(resp), resp + + def close(self): + """Terminate the subprocess (sends QUIT, then waits).""" + try: + self.proc.stdin.write("QUIT\n") + self.proc.stdin.flush() + self.proc.stdin.close() + except (BrokenPipeError, ValueError): + pass + try: + self.proc.wait(timeout=5) + except subprocess.TimeoutExpired: + self.proc.kill() + + # Context manager support + def __enter__(self): + return self + + def __exit__(self, exc_type, exc, tb): + self.close() + + +# -------------------------------------------------------------------------- CLI + +def _main(): + import argparse + p = argparse.ArgumentParser( + description="Quick LOAD check: compile a file through a line and report status." + ) + p.add_argument("file", help="EasyCrypt source file (relative to --cwd)") + p.add_argument("line", type=int, help="Line number to load through") + p.add_argument("--cwd", default=None, help="Project root (default: cwd)") + p.add_argument("--nosmt", action="store_true", help="Skip SMT during LOAD") + p.add_argument("--timeout", type=int, default=None, help="Per-goal SMT timeout (s)") + p.add_argument("--prover", default=None, help="SMT prover (e.g. Z3)") + p.add_argument("--quiet", action="store_true", help="Suppress response body") + args = p.parse_args() + + extra = [] + if args.prover: + extra += ["-p", args.prover] + if args.timeout is not None: + extra += ["-timeout", str(args.timeout)] + + with ECLLM(cwd=args.cwd, extra_args=extra) as ec: + ok, resp = ec.load(args.file, args.line, nosmt=args.nosmt) + status = "OK" if ok else "ERROR" + print(f"{status} uuid={ec.uuid}") + if not args.quiet: + print(resp) + raise SystemExit(0 if ok else 1) + + +if __name__ == "__main__": + _main() diff --git a/scripts/llm/ec_send.sh b/scripts/llm/ec_send.sh new file mode 100755 index 000000000..3daefb2df --- /dev/null +++ b/scripts/llm/ec_send.sh @@ -0,0 +1,36 @@ +#!/bin/bash +# Send one JSON request to the EasyCrypt LLM daemon, print the response. +# +# Usage: +# ec_send.sh '{"op":"tactic","arg":"proc."}' # inline JSON +# ec_send.sh --file /path/to/request.json # JSON from file +# +# The --file form is useful for tactic payloads containing shell-meta +# characters (e.g. EC escapes like /\\) that would otherwise need +# elaborate quoting. +# +# Fifo paths default to /tmp/ec_in, /tmp/ec_out and can be overridden via +# env vars EC_IN_FIFO, EC_OUT_FIFO. The daemon must already be running +# and listening on the same fifos. +set -eu + +IN_FIFO="${EC_IN_FIFO:-/tmp/ec_in}" +OUT_FIFO="${EC_OUT_FIFO:-/tmp/ec_out}" + +if [ "${1:-}" = "--file" ]; then + REQ=$(cat "$2") +else + REQ="$1" +fi + +# Write in background so we can read the response; fifos block until both +# ends are connected. +printf '%s\n' "$REQ" > "$IN_FIFO" & +WRITER=$! +RESP=$(cat "$OUT_FIFO") +wait "$WRITER" + +python3 -c "import sys, json +r = json.loads(sys.argv[1]) +print('OK' if r['ok'] else 'ERR', 'uuid=', r['uuid']) +print(r['resp'])" "$RESP"