feat(platform-integrations): unify plugin code under a single canonical source #235
illeatmyhat wants to merge 31 commits into main
Captures the design that came out of the planning session for #219: treat platform-integrations/ as generated output from a new plugin-source/ canonical tree, rendered via Jinja2, with a CI gate enforcing render-equality. Records the alternatives weighed (symlinks, separate repo, gitignored output, Go/Rust tooling) and their rejection reasons so the decision isn't relitigated later. Establishes docs/adr/ as the project's ADR home. Refs #219
CodeRabbit: review skipped, too many files. This PR contains 173 files, which is 23 over the limit of 150.
…ipeline

Adds the canonical source tree (plugin-source/) and the build pipeline that renders it into platform-integrations/. The first managed slice is the four identical lib/*.py helpers shared by claude and claw-code; the byte-identical render produces no diff vs the previously committed copies.

What's wired:
- plugin-source/MANIFEST.toml declares platforms (claude, claw-code, codex, bob) and the per-file render targets. Verbatim entries only for now; Jinja2 templating and per-platform overlays land in subsequent commits.
- scripts/build_plugins.py renders the manifest and detects drift. Stdlib only (tomllib, filecmp, shutil); no new project deps.
- justfile gains compile-plugins and check-plugins-rendered recipes.
- Pre-commit gains a plugins-rendered hook scoped to plugin-source/, platform-integrations/, and scripts/build_plugins.py.
- CI gains a check-plugins-rendered job.
- tests/platform_integrations/test_build_pipeline.py covers manifest loading, full-render output, and drift detection (positive and negative cases).

Codex and bob declare plugin_root entries but no managed files yet — those land when those platforms' content is migrated in later commits. The existing install.sh continues to do the runtime lib copy for them in the meantime.

Refs #219
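The render-then-compare gate described above can be sketched with the same stdlib modules the commit names. This is a minimal illustration for verbatim entries only; `render`, `find_drift`, and the `(source, target)` pair shape are hypothetical names, not the actual scripts/build_plugins.py API.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def render(entries, source_root: Path, out_root: Path) -> None:
    """Copy each (source, target) pair verbatim into out_root."""
    for source, target in entries:
        dest = out_root / target
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source_root / source, dest)


def find_drift(entries, source_root: Path, committed_root: Path) -> list[str]:
    """Render into a scratch dir and list targets that differ from the committed tree."""
    drifted = []
    with tempfile.TemporaryDirectory() as scratch:
        scratch_root = Path(scratch)
        render(entries, source_root, scratch_root)
        for _, target in entries:
            committed = committed_root / target
            # shallow=False forces a byte-for-byte comparison.
            if not committed.is_file() or not filecmp.cmp(
                scratch_root / target, committed, shallow=False
            ):
                drifted.append(target)
    return drifted
```

A CI job or pre-commit hook would exit non-zero whenever `find_drift` returns a non-empty list.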
…in-source

Sweeps the six skill scripts that are byte-identical between claude and claw-code today into plugin-source/skills/<name>/scripts/. The render remains byte-identical to committed platform-integrations/, so this is a pure source relocation — no behavior change.

Migrated:
- learn/scripts/save_entities.py
- publish/scripts/publish.py
- subscribe/scripts/subscribe.py
- unsubscribe/scripts/unsubscribe.py
- sync/scripts/sync.py
- save-trajectory/scripts/save_trajectory.py

Not yet migrated (left for the Jinja2 commit):
- recall/scripts/retrieve_entities.py — varies across all four platforms.
- learn/scripts/on_stop.py and on_stop.sh — claude-only hooks.
- save-trajectory/scripts/on_stop.py — claude-only hook.
- All SKILL.md files — diverge across platforms.
- codex and bob copies of these scripts — diverge from claude/claw-code due to runtime-environment differences (lib path discovery, hook contracts).

Refs #219
Adds Jinja2 rendering for source files ending in .j2. Each platform's
[platforms.<name>] table in MANIFEST.toml now accepts arbitrary keys
beyond plugin_root; everything else is forwarded to the template as a
context variable, alongside `platform = "<name>"`. Verbatim copy
remains the default for non-.j2 sources.
Demonstrates the mechanism on skills/learn/SKILL.md, the first templated
file. Two real per-platform variations are now expressed in one shared
.j2 template:
- forked_context (bool) — claude wraps learn in a forked execution model
and needs a "Step 0: Load the Conversation" section that reads the
stop-hook transcript; claw-code does not. The bool gates a {% if %}
block plus a small inline phrasing tweak in Step 1.
- save_entities_invocation (str) — claude invokes the save script via
${CLAUDE_PLUGIN_ROOT}/...; claw-code does a config-home lookup dance.
The string is substituted in three places (Method 1/2/3 examples).
Render produces byte-identical output to the previously committed
SKILL.md files for both claude and claw-code; drift gate stays green.
Build-pipeline tests grow a TestJinjaTemplating class that asserts a
shared .j2 source produces platform-specific output; existing tests
updated for the renamed Manifest.platforms attribute (was
platform_roots) and split into "every target rendered" plus "verbatim
files match source byte-for-byte".
This is commit 3a of the migration plan; commit 3b will sweep the
remaining drifted SKILL.md files and the per-platform script variation
(retrieve_entities.py, codex/bob save_entities.py, on_stop.* hooks).
Refs #219
Sweeps the remaining seven SKILL.md files (recall, publish, subscribe,
unsubscribe, sync, save-trajectory, save) into shared .j2 templates that
render byte-identically (modulo one trivial whitespace fix, see below)
to the previously committed claude and claw-code copies.
The dominant per-platform variation across these files is the script
invocation snippet — claude expands ${CLAUDE_PLUGIN_ROOT} via its plugin
runtime; claw-code does a config-home discovery dance wrapped in
sh -lc '...'. Rather than store the long claw-code shell command as a
manifest variable for each skill, this introduces a shared Jinja2
macro (plugin-source/_macros.j2 :: invoke(skill, script, args)) that
emits the platform-appropriate form. `args` accepts None, a string, or
a list — when given a list, claude renders one arg per line with
backslash continuation (matches the existing publish/subscribe
formatting); claw-code stays single-line because the whole command is
inside sh -lc '...'.
The remaining variation is captured in the existing forked_context flag, two
new per-platform manifest keys, and an inline conditional block in recall:
- forked_context (bool) — Step 0 of learn loads a forked-context
transcript on claude; not relevant on claw-code.
- save_example_script_root (str) — placeholder root used in save's
example invocations (${CLAUDE_PLUGIN_ROOT}/skills vs ~/.claw/skills).
- user_skills_dir (str) — where the save skill writes the new skill
(~/.claude/skills vs ~/.claw/skills).
- recall's "How It Works" prose differs in step 1-2 wording (claude
fires on user prompt submit; claw-code fires on PreToolUse) and
references "Claude" vs "the agent" in two places. Inline {% if %}.
learn/SKILL.md.j2 (introduced in the previous commit) is migrated from
its bespoke `save_entities_invocation` manifest var to the shared
invoke() macro. The save_entities_invocation key is dropped.
One incidental cleanup: save/SKILL.md had four trailing spaces on two
blank lines inside an embedded python code-block example (legacy of an
earlier editor). The .j2 template renders those lines without the
trailing whitespace; the committed claude+claw-code copies are updated
to match. No semantic change.
Codex and bob SKILL.md files are not migrated in this commit — their
prose diverges substantially (different audience LLMs, different
hook contracts) and they need either deeper conditionals or
per-platform overlay files. Those land in commit 3c alongside the
script-synthesis work.
Refs #219
Three files exist only on the claude tree (not on claw-code, codex, or bob): the forked-context stop hooks for `learn` and `save-trajectory`. Bringing them under build management uses the per-platform overlay pattern — manifest entries with `platforms = ["claude"]` and a single source path under plugin-source/. The renderer emits them only into the claude tree; the drift gate enforces byte-identity.

Files: learn/scripts/on_stop.py, learn/scripts/on_stop.sh, save-trajectory/scripts/on_stop.py.

Mypy now also excludes plugin-source/ (it already excluded platform-integrations/). The two on_stop.py files share a module name, which the existing exclusion handled in the rendered tree but not in the source tree.

Notes on what is NOT in this commit:
- save_entities.py for codex is *not* synthesized in this commit. Codex's variant ignores incoming owner/visibility values from stdin (see test_codex_sharing.py::test_save_ignores_incoming_owner_and_visibility), while claude/claw-code preserve them if set. That is a deliberate per-platform security stance, not drift, and collapsing it would either change codex behavior or introduce a new behavior-flag knob — worth its own PR with explicit user buy-in.
- retrieve_entities.py is also not synthesized here. Beyond the lib-path discovery prelude (which the shim pattern would cover), the bodies legitimately differ across platforms: claude logs env vars and argv for debugging while codex doesn't, codex calls find_entities_dir while claude calls find_recall_entity_dirs, and the output header text varies. Synthesis warrants a focused commit.
- Codex and bob SKILL.md files remain hand-edited in platform-integrations/. Their prose is tuned for different audience LLMs and would mostly require Pattern B (per-platform overlay files) rather than Jinja2 conditionals; deferring until the broader migration shape settles.

Refs #219
Bob is the only platform that used colon-prefixed names on disk (.bob/skills/evolve-lite:<x>/, .bob/commands/evolve-lite:<x>.md). Windows treats `:` as a drive separator and rejects it in path components, so the existing layout couldn't be checked out or installed on Windows. Other platforms (claude, codex, claw-code) synthesize the colon namespace from a plugin manifest and don't have the issue.

Renames every colon-prefixed source path to a hyphen-prefixed name (evolve-lite-<x>) and updates every reference: bob's custom_modes.yaml prompt, bob's command-file frontmatter, install.sh's BobInstaller glob patterns and status output, and the affected tests in tests/platform_integrations/.

User-facing slash-command surface change for Bob users: /evolve-lite:learn → /evolve-lite-learn (etc). Other platforms are unchanged because their plugin manifests still synthesize the colon form for the user-facing namespace.

The sole reference to evolve-lite:recall left intact is in install.sh's CodexInstaller post-install message — codex's plugin manifest still produces /evolve-lite:recall as the slash command, so the hyphenated name there would be wrong.

Pre-existing test failures unrelated to this rename:
- test_bob_sharing.py and test_sync.py and test_codex_sharing.py expect "invalid subscription name" in stdout but sync.py logs "invalid name" to stderr. This drift exists on main (verified before the rename) across all three platforms; same 5 failures before and after. Out of scope here, will need its own commit.

The rename FIXES one pre-existing test: test_skill_directory_names.py::test_bob_lite_skills_follow_naming_convention (which now matches the new evolve-lite- prefix expectation).

Refs #219
Drops Step 0 of the evolve-lite mode prompt, which used to enumerate specific .bob/skills/<skill>/SKILL.md paths the agent had to read up front. The relationship between the mode and the skills it depends on was largely a coincidence of the prompt — the mode's job is the workflow contract (recall → work → save-trajectory → learn → complete); the skill registry is whatever Bob's runtime resolves under .bob/skills/.

Replaces the path enumeration with a generic instruction to read each skill's SKILL.md before first invocation. Workflow steps still call the relevant skills by name (recall, save-trajectory, learn, plus the optional sharing skills), since the mode's contract is precisely "use these skills in this order." Names, not paths.

This finishes the migration plan from #219:
1. ✅ Build pipeline + render-equality gate (commit 1)
2. ✅ Migrate identical claude/claw-code skill scripts (commit 2)
3a. ✅ Jinja2 templating + first per-platform .j2 (commit 3a)
3b. ✅ Sweep remaining claude/claw-code SKILL.md prose (commit 3b)
3c. ✅ Claude-only on_stop overlay files (commit 3c)
4. ✅ Bob colon-prefix rename for Windows compat (commit 4)
5. ✅ Decouple custom_modes.yaml from skill paths (this commit)

Followups outside this PR's scope:
- Synthesize codex's save_entities.py and the four-platform retrieve_entities.py (real semantic synthesis, deserves focused PR)
- Migrate codex/bob SKILL.md content into plugin-source as Pattern B per-platform overlays
- Move claw-code's installed-path convention off colons (separate Windows-compat issue, parallel to bob's)
- Resolve the pre-existing "invalid subscription name" stdout/stderr drift across claude/codex/bob (5 failing tests on main, untouched by this PR)

Refs #219
Resolves four pre-existing test failures across claude, codex, and bob sync tests that asserted "invalid subscription name" appeared in stdout when an entry in evolve.config.yaml had an unsafe name (e.g. '../evil', '.', '..').

Root cause: every platform's sync.py used `normalize_repos(cfg)`, which routes through `_coerce_repo` in lib/config.py. _coerce_repo silently filtered invalid entries (after a stray stderr print with a slightly different phrasing — "ignoring repo entry 'X' — invalid name") and returned None. The downstream "skipped — invalid subscription name" branch in each sync.py ran on already-filtered entries, so it never fired. The user saw "No subscriptions configured" and a stderr log with a different message; the tests saw neither in stdout.

Fix:
- lib/config.py: drop the stderr prints inside _coerce_repo. They were leaky from a library function (callers, not the lib, should decide where to surface a rejection). Add `classify_repo_entry` which returns (repo, rejection) for one raw entry — exactly one is non-None — so callers can iterate raw `cfg["repos"]` and report rejections per their own UX.
- claude/claw-code/codex/bob sync.py: replace `normalize_repos(cfg)` with manual iteration over raw entries via classify_repo_entry. Rejection reasons are added to the same `summaries` list that already collects per-repo sync results, so they appear in the user-visible "Synced N repo(s): …" stdout line. Dedup by name is preserved inline.
- test_config.py::test_invalid_scope_entries_dropped: replaced its capsys assertion (which depended on the now-removed stderr print) with a direct call to classify_repo_entry that returns the same rejection reason structurally.
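The caller-side pattern from the fix can be sketched as follows. Only the `(repo, rejection)` return contract and the `summaries` list come from the commit message; the entry shape (a mapping with a `name` key), the `is_safe_name` rule, and the exact summary strings are assumptions made for illustration.

```python
from typing import Optional


def is_safe_name(name: object) -> bool:
    """Hypothetical validity rule: a non-empty single path component."""
    return (
        isinstance(name, str)
        and name not in ("", ".", "..")
        and "/" not in name
        and "\\" not in name
    )


def classify_repo_entry(raw: object) -> tuple[Optional[dict], Optional[str]]:
    """Return (repo, None) for a usable entry, or (None, reason) for a rejected one."""
    if not isinstance(raw, dict):
        return None, "entry is not a mapping"
    name = raw.get("name")
    if not is_safe_name(name):
        return None, f"{name!r}: skipped (invalid subscription name)"
    return dict(raw), None


def summarize(cfg: dict) -> list[str]:
    """Walk raw entries so rejections reach the caller's own output."""
    summaries, seen = [], set()
    for raw in cfg.get("repos", []):
        repo, rejection = classify_repo_entry(raw)
        if rejection is not None:
            summaries.append(rejection)  # surfaced by the caller, not the library
            continue
        if repo["name"] in seen:  # dedup by name, kept inline
            continue
        seen.add(repo["name"])
        summaries.append(f"{repo['name']}: ok")
    return summaries
```

Because the library function no longer prints, each platform's sync.py decides where a rejection is shown.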
Test impact:
- Fixes test_sync.py::test_skips_invalid_subscription_name
- Fixes test_bob_sharing.py::test_skips_invalid_subscription_name
- Fixes test_bob_sharing.py::test_rejects_dot_and_double_dot_names
- Fixes test_codex_sharing.py::test_sync_skips_invalid_subscription_name
- One pre-existing failure remains: test_subscribe_warns_when_audit_write_fails in test_codex_sharing.py. That test asserts subscribe.py warns and continues when the audit log can't be written; the current subscribe.py rolls back and exits 1 (claude and codex both). That's a separate design decision (fail-open UX vs fail-closed security) that deserves its own focused commit.

Refs #219
Replaces the SKILL.codex.md / SKILL.bob.md per-platform-overlay approach (the dropped c6c76a0) with a single SKILL.md.j2 per skill that renders for all four platforms. Codex's prose is the canonical base — it is the most refined / production-tested variant — and Jinja2 branches handle the genuinely platform-specific bits.

What this does for each cross-platform skill (learn, publish, recall, subscribe, sync, unsubscribe):
- Frontmatter description switches to codex's trigger-oriented wording across all platforms (claude/claw-code/bob previously carried a more passive "Analyze ..." description).
- claude keeps `context: fork` in the frontmatter via a Jinja branch.
- learn keeps Step 0 (forked-context transcript loading) for claude only via the existing `forked_context` flag.
- recall adopts codex's "Required Action / Completion Rule / Required Visible Completion Note / Failure Conditions" guards on every platform, with a per-platform "How It Works" branch that describes claude's UserPromptSubmit hook, claw-code's PreToolUse hook, codex's optional codex_hooks integration, and bob's manual workflow respectively.
- sync gains a "Notes" implementation-detail section sourced from bob's prose (additive, applies to all platforms).
- unsubscribe keeps the claude/claw-code-only `--force` addendum inside a `{% if platform in ["claude", "claw-code"] %}` branch because only those platforms' unsubscribe.py refuses to remove a write-scope clone without it.

save-trajectory now also renders for bob (codex has no save-trajectory skill). The Write+temp-file pattern from claude applies to bob too — bob's prior heredoc form had the same escaping fragility claude's note warned against.
The macro layer (_macros.j2):
- `invoke(skill, script, args)` gains codex and bob branches:
    codex → python3 "$(git rev-parse --show-toplevel ...)/plugins/.../<script>"
    bob → python3 .bob/skills/evolve-lite-<skill>/scripts/<script>
  Codex paths now standardise on the git-rev-parse form (codex's pre-existing prose mixed that with bare relative paths).
- new `skill_ref(name)` macro expands to the platform-appropriate cross-reference syntax: `/evolve-lite:<name>` for claude / claw-code, `evolve-lite:<name>` for codex, `evolve-lite-<name>` for bob.

MANIFEST.toml:
- Adds `forked_context = false` to the codex and bob platform tables so StrictUndefined doesn't trip on the `{% if forked_context %}` branch in learn.
- For each cross-platform skill, the [[files]] entries collapse from "claude/claw-code .j2 + codex overlay + bob overlay" (3 sources) into a single source with two target rows — one for [claude, claw-code, codex] hitting `skills/<skill>/SKILL.md`, one for [bob] hitting `skills/evolve-lite-<skill>/SKILL.md` (post-rename folder).

The codex/bob/claude/claw-code on-disk SKILL.md outputs are now all freshly rendered from these unified sources. The drift gate (`just check-plugins-rendered`) is green; platform_integrations tests still pass at 307/308 (the same pre-existing `test_subscribe_warns_when_audit_write_fails` failure tracked elsewhere).

Refs #219
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
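The `skill_ref(name)` mapping is small enough to restate as plain Python. This is only an equivalent sketch of the three output forms listed above, not the Jinja2 macro itself; the explicit `platform` parameter and the error on unknown platforms are assumptions (the macro presumably reads `platform` from the template context).

```python
def skill_ref(platform: str, name: str) -> str:
    """Return the cross-reference syntax a given platform uses for a skill."""
    if platform in ("claude", "claw-code"):
        return f"/evolve-lite:{name}"
    if platform == "codex":
        return f"evolve-lite:{name}"
    if platform == "bob":
        return f"evolve-lite-{name}"
    raise ValueError(f"unknown platform: {platform!r}")
```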
c6c76a0 to 2363dc5
…antics
All four platforms now render save_entities.py from a single source in
plugin-source/. The unified script adopts codex's strict-overwrite
ownership stamping verbatim:
    entity["owner"] = args.user or "unknown"
    entity["visibility"] = "private"
This replaces the older preserve-if-set form claude/claw-code carried:
    if args.user and not entity.get("owner"):
        entity["owner"] = args.user
    if not entity.get("visibility"):
        entity["visibility"] = "private"
Why strict wins, per the timeline: claude's preserve-if-set form
landed 2026-04-21 in #188 (6f79732 "feat(evolve-lite): add entity
sharing skills and CI tests"). Codex's strict form landed 2026-04-22
in #196 (cd4204c "feat(codex): add lite sharing skills and
session-start sync"), whose commit body explicitly lists "fix(codex):
tighten sharing script safeguards" and "fix(codex): harden sharing
scripts and tests". Codex was a deliberate second pass on the same
script after the spoofing risk was identified — untrusted upstream
input (a prompt-injected agent) must not be able to dictate `owner`
or `visibility` on the resulting on-disk entity.
The strict semantics are pinned by
test_codex_sharing.py::test_save_ignores_incoming_owner_and_visibility,
which still passes against the unified source. No test pinned the
preserve-if-set behavior on the claude/claw-code side, so dropping
that branch costs nothing observable and closes the spoofing vector
on those two platforms as well.
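To make the difference concrete, the two stamping forms quoted above can be wrapped in functions and fed a spoofed entity. The function names and the copy-before-mutate style are illustrative assumptions; the assignment logic is taken from the snippets in this commit message.

```python
from typing import Optional


def stamp_strict(entity: dict, user: Optional[str]) -> dict:
    """Codex's form: always overwrite owner and visibility."""
    stamped = dict(entity)
    stamped["owner"] = user or "unknown"
    stamped["visibility"] = "private"
    return stamped


def stamp_preserving(entity: dict, user: Optional[str]) -> dict:
    """The older claude/claw-code form: keep incoming values if they are set."""
    stamped = dict(entity)
    if user and not stamped.get("owner"):
        stamped["owner"] = user
    if not stamped.get("visibility"):
        stamped["visibility"] = "private"
    return stamped


# An entity whose fields were dictated by untrusted upstream input.
spoofed = {"content": "note", "owner": "admin", "visibility": "public"}
strict = stamp_strict(spoofed, "alice")
preserved = stamp_preserving(spoofed, "alice")
```

With the strict form the spoofed `owner` and `visibility` are discarded; with the preserving form they survive onto disk.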
Lib-path discovery is also unified: the walk-up loop checks
`<ancestor>/lib` (claude / claw-code / codex installed layout),
`<ancestor>/evolve-lib` (bob's installed layout), and the existing
`<ancestor>/platform-integrations/claude/plugins/evolve-lite/lib`
monorepo-dev fallback that codex's variant carried. One discovery
prelude works for every platform, no Jinja branching needed.
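A walk-up discovery loop of this kind might look like the sketch below. The three candidate locations come from the paragraph above; the function name, the return type, and the choice to start at the script's own directory are assumptions.

```python
from pathlib import Path
from typing import Optional

# Candidate lib locations relative to each ancestor, in the order described above.
CANDIDATES = (
    "lib",  # claude / claw-code / codex installed layout
    "evolve-lib",  # bob's installed layout
    "platform-integrations/claude/plugins/evolve-lite/lib",  # monorepo-dev fallback
)


def find_lib_dir(start: Path) -> Optional[Path]:
    """Walk from start up to the filesystem root and return the first lib dir found."""
    for ancestor in [start, *start.parents]:
        for rel in CANDIDATES:
            candidate = ancestor / rel
            if candidate.is_dir():
                return candidate
    return None
```

A script would typically call `find_lib_dir(Path(__file__).resolve().parent)` and prepend the result's parent to `sys.path` before importing from the lib package.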
MANIFEST: save_entities.py expands from `["claude", "claw-code"]` to
two entries — one targeting `skills/learn/scripts/save_entities.py`
for [claude, claw-code, codex], one targeting
`skills/evolve-lite-learn/scripts/save_entities.py` for [bob]
(post-rename folder).
Tests: 307/308 platform_integrations pass — same baseline as before
(the one pre-existing failure
test_codex_sharing.py::test_subscribe_warns_when_audit_write_fails
predates this branch).
Refs #219
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All four platforms now render retrieve_entities.py from a single
source in plugin-source/. The unified script adopts codex's prose
and structure verbatim where the variants diverged, with two
deliberate concessions to preserve other-platform behavior.
Synthesis decisions, by divergence point:
Lib-path discovery — same walk-up loop as save_entities.py:
`<ancestor>/lib` (claude / claw-code / codex), `<ancestor>/evolve-lib`
(bob), and the existing `<ancestor>/platform-integrations/claude/...`
monorepo-dev fallback. One discovery prelude, no Jinja branching.
find_recall_entity_dirs vs find_entities_dir — codex's
`find_entities_dir` wins. Both functions resolve to the same
canonical `<evolve_dir>/entities` path today, so the multi-root
list form (claude/claw-code) collapses to the single-dir form with
no observable behavior change.
Output header — codex's "## Evolve entities for this task / Review
these stored entities and apply any that are relevant to the user's
request:" propagates to all four platforms (claude/claw-code/bob
previously emitted the shorter "## Entities for this task" form).
Two test header pins updated to match: SCRIPT_VARIANTS in
test_retrieve.py and the bob-side assertion in test_bob_sharing.py.
Item formatting — codex's plain `Rationale:` / `When:` lines win
over claude/claw-code/bob's italicised `_Rationale: ..._` /
`_When: ..._` form.
Subscribed-source detection — codex's relative-path approach
(`md.relative_to(entities_dir).parts`) wins over the
search-for-"entities"-in-parts logic claude carried.
Symlink + .git filtering — preserved as additive defensive features
even though codex didn't have the .git skip. Skipping git
bookkeeping when a write-scope clone lives under
entities/subscribed/{name}/.git/ is the right thing to do, and it
doesn't conflict with codex's behavior on a clean entities tree.
Stdin handling — codex's strict "json.load + return on
JSONDecodeError" is preserved (the
test_handles_invalid_json_stdin_gracefully test pins this on every
variant). Empty stdin is treated as "no input, continue with entity
loading" rather than an error, so bob's manual-invocation path
(which never pipes anything upstream) keeps working without an
`echo {}` workaround.
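The stdin rule above distinguishes empty input from malformed input. A minimal sketch, assuming a helper that reports both cases to its caller (the helper name and the `(ok, payload)` return shape are hypothetical, and it uses `json.loads` on the text that was read rather than `json.load` on the stream):

```python
import io
import json
from typing import Optional, TextIO


def read_hook_input(stream: TextIO) -> tuple[bool, Optional[object]]:
    """Return (ok, payload).

    Empty input is fine and yields no payload, so entity loading can continue.
    Malformed JSON is not fine, so the caller should return early.
    """
    raw = stream.read()
    if not raw.strip():
        return True, None
    try:
        return True, json.loads(raw)
    except json.JSONDecodeError:
        return False, None
```

The explicit emptiness check matters because parsing an empty string raises `JSONDecodeError`, which would otherwise make a manual invocation with nothing piped in look like invalid input.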
Argv dump — claude carried a "=== Command-Line Arguments ===" log
block; codex didn't. Dropped, codex wins.
"# Made with Bob" footer — dropped.
MANIFEST: retrieve_entities.py adds two entries — one targeting
`skills/recall/scripts/retrieve_entities.py` for [claude, claw-code,
codex], one targeting `skills/evolve-lite-recall/scripts/retrieve_entities.py`
for [bob]. Same shape as save_entities.py from the previous commit.
Tests: 307/308 platform_integrations pass — same baseline (the one
pre-existing failure
test_codex_sharing.py::test_subscribe_warns_when_audit_write_fails
predates this branch).
Refs #219
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ajectory
All five remaining sharing/recall scripts now render from a single
source under plugin-source/. The unified scripts adopt codex's
variants verbatim where codex and claude diverged, with two
mechanical changes per file:
1. The lib-path-discovery prelude is replaced with the same
walk-up loop introduced in the save_entities.py and
retrieve_entities.py commits — checks `<ancestor>/lib`
(claude/claw-code/codex), `<ancestor>/evolve-lib` (bob), then
the existing `<ancestor>/platform-integrations/claude/.../lib`
monorepo-dev fallback codex's variants carried.
2. The "(Codex)" / "(Bob)" docstring annotations and the trailing
"# Made with Bob" comment are dropped.
Per-script trade-offs codex-wins introduces on claude/claw-code:
publish.py
- No behavior delta vs the prior plugin-source version that
mattered to existing tests; codex and claude were already very
close here. Soft-warn-on-audit-failure semantics preserved.
subscribe.py
- codex's `project_root` derives from `evolve_dir.resolve()`
(handles a non-".evolve"-named EVOLVE_DIR) instead of always
using `str(evolve_dir.resolve().parent)`.
- codex re-raises rather than printing "Error: failed to record
subscription — clone removed:" when save_config fails. The
rollback semantics are unchanged (clone is removed, repos
list popped); only the user-visible error string differs.
Updated test_rolls_back_clone_if_config_write_fails to drop
its message-string check; the rollback behavior it actually
cares about still passes.
- Argument help text loses claude's longer descriptions; codex's
terser arg help propagates.
sync.py
- Drops claude's `git -c safe.directory={repo_path}` flag from
the inner `_git` helper. No test pinned this; its only effect
is whether sync works inside a repo owned by a different uid
than the running process (matters in shared-filesystem
installs, doesn't matter in the test sandbox).
- Drops claude's head_before / head_after short-circuit and
always counts a delta after a fetch+rebase/reset; the
subscribed-base path-traversal check codex carried in the
main loop is added on top of the lib-level rejection list, so
both layers of name validation now apply.
- codex's audit_root indirection (handles a non-".evolve"-named
EVOLVE_DIR for the audit log path) propagates to all
platforms.
unsubscribe.py
- codex's combined `is_valid_repo_name` + path-traversal check
replaces claude's two separate-step form. Same observable
validation; the rejection error string is identical.
- codex's `project_root` derivation matches the subscribe.py
change above.
save_trajectory.py
- codex has no save-trajectory skill, so the canonical here is
claude's existing plugin-source variant (lazy log creation,
atomic O_EXCL claim, file-arg-or-stdin input). Bob's prior
variant was simpler and used naive `open()`; replacing it
with the claude version is a strict improvement (handles
same-second collisions, supports the tmp-file-input pattern
the SKILL.md prose now describes for all platforms).
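The "atomic O_EXCL claim" mentioned for save_trajectory.py refers to a standard technique: ask the OS to create the file only if it does not exist, and pick another name on collision. The sketch below shows that technique in general terms; the function name, the numeric-suffix scheme, and the file mode are assumptions, not the script's actual naming logic.

```python
import os
from pathlib import Path


def claim_unique_path(directory: Path, stem: str, suffix: str = ".json") -> Path:
    """Atomically create an empty file with a unique name and return its path."""
    directory.mkdir(parents=True, exist_ok=True)
    attempt = 0
    while True:
        name = f"{stem}{suffix}" if attempt == 0 else f"{stem}-{attempt}{suffix}"
        path = directory / name
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so two writers
            # that pick the same name in the same second cannot both succeed.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            attempt += 1
            continue
        os.close(fd)
        return path
```

A plain `open(path, "w")` would silently overwrite the earlier file in the same-second case, which is the collision the commit says the claude variant handles.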
MANIFEST: each of the five scripts gains a second [[files]] entry
mirroring the save_entities.py / retrieve_entities.py pattern — one
target for [claude, claw-code, codex] under
`skills/<skill>/scripts/<script>.py`, one target for [bob] under
`skills/evolve-lite-<skill>/scripts/<script>.py`.
Tests: 307/308 platform_integrations pass — same baseline (the one
pre-existing failure
test_codex_sharing.py::test_subscribe_warns_when_audit_write_fails
remains; it pins soft-warn audit semantics on subscribe.py while
both claude and codex variants implement hard-fail, which is a
separate fail-open-vs-fail-closed design call).
After this commit every Python script under platform-integrations/<platform>/
is rendered from plugin-source/. The only files still outside build
management are infrastructure that has no unification opportunity:
README.md, .claude-plugin/ / .codex-plugin/ manifests, bob's
commands/ directory, bob's custom_modes.yaml, and the parallel
evolve-full/ plugin tree.
Refs #219
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make `target` optional in MANIFEST.toml [[files]] entries. When omitted the renderer falls back to source minus a trailing `.j2`. Drops 14 lines from MANIFEST.toml without changing the rendered output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
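The fallback rule is a one-liner. A sketch, with a hypothetical function name:

```python
from typing import Optional


def default_target(source: str, target: Optional[str] = None) -> str:
    """Use the explicit target if given, else the source path minus a trailing .j2."""
    if target is not None:
        return target
    return source.removesuffix(".j2")
```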
Switch the learn template to {% if forked_context | default(false) %} so
non-claude platforms no longer need to declare forked_context = false
just to satisfy StrictUndefined. Drops three lines from MANIFEST.toml
and makes the platform definitions only declare what differs from the
default.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `[[platforms.<name>.target_rewrites]]` — a list of (regex, replacement) substitutions the renderer applies to each entry's target path under that platform. Use it on bob to map `skills/<name>/` to `skills/evolve-lite-<name>/` so the platform definition (not 14 duplicate manifest entries) carries the folder-rename rule from commit 07a171c. Collapses every `[[files]]` pair (one for claude/claw-code/codex, one for bob's prefixed target) into a single entry that lists every receiving platform. Drops MANIFEST.toml from 232 lines to 132 with no change to the rendered output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
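A target rewrite of this kind reduces to applying `re.sub` pairs in order. The sketch below assumes a specific regex for bob's rule; the commit message only states the mapping from `skills/<name>/` to `skills/evolve-lite-<name>/`, so the pattern and names here are illustrative.

```python
import re

# Hypothetical encoding of bob's folder-rename rule as (regex, replacement) pairs.
BOB_REWRITES = [(r"^skills/([^/]+)/", r"skills/evolve-lite-\1/")]


def rewrite_target(target: str, rewrites: list[tuple[str, str]]) -> str:
    """Apply each (regex, replacement) substitution to the target path, in order."""
    for pattern, replacement in rewrites:
        target = re.sub(pattern, replacement, target)
    return target
```

Platforms with no rewrites pass an empty list and get the target back unchanged.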
Make `platforms` optional in [[files]] entries. When omitted the renderer fans the entry out to every platform declared in the manifest. Drops the `platforms = ["claude", "claw-code", "codex", "bob"]` line from the 12 fully-shared entries — the common case for skill scripts and SKILL.md templates after the bob duplicates collapsed. MANIFEST.toml is now 132 lines (from 232 at the start of this batch); no change to the rendered output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add bob and codex to the lib/ entries in MANIFEST.toml. Each platform's plugin tree now ships its own copy of lib/__init__.py, lib/audit.py, lib/config.py, lib/entity_io.py — codex and bob no longer rely on a walk-up to claude's monorepo lib. Simplify the script preludes accordingly: drop the `platform-integrations/claude/plugins/evolve-lite/lib/` fallback from the walk-up loop; the local lib/ or evolve-lib/ sibling is always present now. Update install.sh — bob now sources its lib from its own plugin tree instead of reaching into claude's; codex's redundant claude-lib copy goes away (the plugin copytree already includes lib/). Drop the PYTHONPATH=claude-lib injection in test_bob_sharing.py — bob's scripts find their own lib via the walk-up. Tests pass without it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…atform

Drop platform restrictions from the four entries that previously covered partial subsets:
- save (was claude+claw-code only) → all four
- save-trajectory script + SKILL.md (was missing codex) → all four
- on_stop.py / on_stop.sh hooks (was claude only) → all four

For platforms where these don't have full runtime support today, the files ship as inert artifacts. Per-platform behavior tightening (e.g. making save-trajectory work under codex, plumbing on_stop hook contracts on non-claude platforms) is tracked as follow-up issues.

Add user_skills_dir / save_example_script_root context vars for codex and bob so the save SKILL.md template renders. The codex/bob prose is tilted toward project-local skill paths rather than user home — fix later.

Wrap the `context: fork` frontmatter line in the save template with a claude-only branch (matching save-trajectory's pattern).

Add commands/evolve-lite-save.md to bob's plugin tree to satisfy the "every skill has a command file" gate now that bob has evolve-lite-save.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restructure plugin-source/ so every skill folder lives under a shared `evolve-lite/` parent: plugin-source/skills/<name>/ → plugin-source/skills/evolve-lite/<name>/. Mirror this in the rendered output for claude/claw-code/codex; bob keeps its flat skills/evolve-lite-<name>/ layout via the existing target_rewrite (pattern updated to match the new source path).

Plugin metadata follows:
- claude/codex plugin.json: skills key now points at ./skills/evolve-lite/
- claw-code plugin.json: gains a `skills` key pointing at the same path
- claude hooks/hooks.json + claw-code hooks/retrieve_entities.sh: shell paths inserted with the evolve-lite/ segment
- _macros.j2 invoke() macro: claude and codex paths gain the same segment (claw-code uses runtime colon notation independent of source layout; bob's flat installed path also unchanged)
- install.sh: codex hook commands rewritten to the new path; status output reflects the nested layout

Tests updated mechanically — every hardcoded skills/<name>/ reference in tests/platform_integrations/ now reads skills/evolve-lite/<name>/. The bob path-rewrite pattern is exercised end-to-end: source skills flow through the rewrite and end up at skills/evolve-lite-<name>/ under platform-integrations/bob/.

Tests: 307/308 baseline maintained (the pre-existing test_subscribe_warns_when_audit_write_fails is unchanged).

Validation note: claude / claw-code / codex plugin loaders are assumed to honor the `"skills": "./skills/evolve-lite/"` key. Bob's runtime is unaffected — the rewrite produces the same flat .bob/skills/<name>/ layout as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move per-platform configuration into a PLATFORMS dict at the top of scripts/build_plugins.py. The renderer now walks plugin-source/ and fans every file out to every platform: no manifest entries, no explicit `platforms = [...]` lists, no `target = "..."` overrides. Files at plugin-source/ root that are not shipped (_macros.j2, README.md) are listed as RESERVED_SOURCES.
The build pipeline keeps the same public surface (`load_manifest()`, `render_to()`, `check_drift()`, `Manifest`/`PlatformConfig`/`FileEntry` dataclasses) so tests and external callers stay working. The MANIFEST_PATH constant is gone; the perturbation drift test no longer needs to patch it.
Bob's path rewrite stays the only structural divergence, encoded inline in PLATFORMS as `[(pattern, replacement)]`.
Adding a new skill now requires only creating its directory under plugin-source/skills/evolve-lite/; the build picks it up automatically.
Tests: 307/308 baseline maintained.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
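A minimal sketch of the fan-out described above, not the actual build_plugins.py. The plugin roots (`out/claude`, `out/bob`), the `target_for()` helper, and the exact rewrite regex are illustrative assumptions; only the PLATFORMS / RESERVED_SOURCES names and the `[(pattern, replacement)]` shape come from the commit message.

```python
import re
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Root-level source files that are never shipped (names from the commit message).
RESERVED_SOURCES = {"_macros.j2", "README.md"}


@dataclass(frozen=True)
class PlatformConfig:
    plugin_root: str  # hypothetical output root for this sketch
    # Ordered (pattern, replacement) pairs applied to the source-relative path.
    target_rewrite: list[tuple[str, str]] = field(default_factory=list)

    def target_for(self, rel_path: str) -> str:
        """Apply the rewrites, then place the file under plugin_root."""
        for pattern, replacement in self.target_rewrite:
            rel_path = re.sub(pattern, replacement, rel_path)
        return str(PurePosixPath(self.plugin_root) / rel_path)


PLATFORMS = {
    "claude": PlatformConfig("out/claude"),
    # Assumed regex: flatten skills/evolve-lite/<name>/ to skills/evolve-lite-<name>/.
    "bob": PlatformConfig(
        "out/bob",
        [(r"^skills/evolve-lite/([^/]+)/", r"skills/evolve-lite-\1/")],
    ),
}


def fan_out(source_files: list[str]) -> dict[tuple[str, str], str]:
    """Map every non-reserved source file to one target per platform."""
    targets = {}
    for rel_path in sorted(source_files):
        if rel_path in RESERVED_SOURCES:
            continue
        for name, cfg in PLATFORMS.items():
            targets[(name, rel_path)] = cfg.target_for(rel_path)
    return targets
```

With this shape, adding a skill directory changes only the input file list; no per-file entry is needed.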
The script and the source tree it walks now live together. Add build_plugins.py to RESERVED_SOURCES so the renderer skips itself, and exclude any __pycache__/ directory the interpreter creates from the source walk. Update consumer paths in justfile, .pre-commit-config.yaml, the GitHub Actions workflow, and the test harness.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-platform plugin.json files become generated artifacts of a single source-of-truth plugin-source/plugin.toml, rendered through pydantic input + output models. Drift gate covers them alongside the existing tree-walk. [plugin] holds host-agnostic metadata (only name + version required); [claude] / [claw-code] / [codex] tables hold genuinely host-specific fields. All models are extra="allow", so undeclared TOML keys flow through: [plugin] extras fan out to every host's top-level, host-table extras go to that host only, [codex] extras land in codex's interface block. Bob has no plugin.json output. Refs #219.
… text
Apply evolve-lite:<skill> as bob's runtime skill name across SKILL.md
frontmatter, _macros.j2 skill_ref, custom_modes.yaml workflow steps, and
the commands/*.md slash-command definitions, so bob's UX matches claude
and codex (`/evolve-lite:learn`). On-disk folder layout stays
hyphenated (.bob/skills/evolve-lite-<skill>/) so the plugin tree
installs cleanly on Windows, which rejects colons in path components.
Also folds in the in-flight learn/recall polish: recall switches to
verbatim entity quoting in forked-context renders (the parent agent
can't see intermediate Read results) and uses ${EVOLVE_DIR:-.evolve}
consistently; learn's Step 0 finds the most recent trajectory by
scanning ${EVOLVE_DIR}/trajectories/ instead of parsing on_stop's
transcript_path marker, so the skill is robust to any trajectory the
save-trajectory hook (or skill) wrote.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the 8 hand-maintained bob slash-command definitions from platform-integrations/bob/evolve-lite/commands/ into plugin-source/commands/ so they're now driven by the same fan-out build that already covers SKILL.md, scripts, and per-platform metadata.
Adds a target_excludes pattern list to PlatformConfig: claude / claw-code / codex declare `^commands/` to opt out of the new subtree since they have their own command surfaces (plugin.json, $-registry); bob alone keeps it. The renderer skips excluded files in both render_to and check_drift, so pre-commit drift detection keeps working without seeing claude/codex as "missing" the bob-only files.
Output content is byte-identical to the prior hand-maintained commands directory (this is purely a source-layout move + build-time filter), verified via `build_plugins.py check`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conflict: platform-integrations/.../skills/learn/SKILL.md was deleted on this branch (path moved to skills/evolve-lite/learn/, generated from plugin-source/) but modified on main by #236 and #243. Re-applied main's intent in plugin-source/skills/evolve-lite/learn/ and re-rendered:
- on_stop.py: derive session_id, emit only "The saved trajectory path is: ..." marker; drop the live-transcript marker (#236, #243).
- SKILL.md.j2 Step 0: read the saved trajectory with the Read tool, no cat/head/wc/python3 -c shell-outs; exit zero if missing rather than reaching into ~/.claude/projects/ (#243).
- SKILL.md.j2: new Step 4 "Review Existing Guidelines" using Glob+Read, forbidding cat/find/for-loops on the entity tree (#243).
- SKILL.md.j2 Step 6: add trajectory field to the entity JSON schema so every guideline carries a back-reference to its source transcript (#236).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mands
Three changes that compose:
1. Bob commands are now derived 1:1 from the skills walk in build_plugins.py
instead of static plugin-source/commands/*.md files. Each command file
uses `description` from the skill's SKILL.md frontmatter (bob's command
schema only honors `description` / `argument-hints` — the slash-command
identifier comes from the file name). The body references the on-disk
folder name (`evolve-lite-<skill>`, dash form) since that's what bob
resolves skills by; folders stay colon-free for Windows compatibility.
2. Per-platform routing: any source file under `plugin-source/_<platform>/`
ships only to that platform, with the `_<platform>/` prefix stripped
from the output target. This is how single-platform artifacts now live
alongside the universal sources:
- _bob/custom_modes.yaml (bob's mandatory workflow definition)
- _bob/README.md
- _claude/hooks/hooks.json (Stop / UserPromptSubmit / SessionStart)
- _claude/README.md
- _claw-code/hooks/retrieve_entities.sh
- _claw-code/README.md
- _codex/README.md
3. render_to() now wipes each platform's plugin_root before writing, so
files removed from plugin-source/ (renamed skills, deleted scripts,
obsolete commands) cannot linger as orphans. Together with (2), this
makes platform-integrations/ fully derivable from plugin-source/.
Also drops the now-dead `^commands/` target_excludes from claude / claw-code
/ codex, since the static commands directory is gone.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
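The `_<platform>/` routing rule in (2) can be sketched as a small pure function. This is an illustration under assumptions: `route()` is a hypothetical name, and bob's path rewrite and the RESERVED_SOURCES filter are left out for brevity.

```python
from pathlib import PurePosixPath

PLATFORMS = ("bob", "claude", "claw-code", "codex")


def route(rel_path: str) -> dict[str, str]:
    """Return {platform: target path relative to that platform's plugin_root}."""
    parts = PurePosixPath(rel_path).parts
    head = parts[0]
    if head.startswith("_") and head[1:] in PLATFORMS and len(parts) > 1:
        # Single-platform artifact: strip the _<platform>/ prefix.
        return {head[1:]: str(PurePosixPath(*parts[1:]))}
    # Universal source: ship to every platform at the same relative path.
    return {name: rel_path for name in PLATFORMS}
```

For example, `route("_claude/hooks/hooks.json")` maps only to claude at `hooks/hooks.json`, while a file under `skills/` maps to all four platforms.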
The previous TestRender / TestCheckDrift suite had two brittle patterns that caused the CI failures from commit fc94634:
- Iterated `manifest.files × entry.platforms` and asserted every combo was emitted, without honoring `cfg.excludes(...)`. Failed the moment any platform had a non-empty `target_excludes`.
- Picked the alphabetically-first verbatim entry × first platform for drift detection, so when that file happened to be excluded for that platform, the perturbation landed in a path check_drift skipped and no `drift:` message was emitted.
Redesign:
- Shared `isolated_repo` / `rendered_repo` fixtures monkeypatch REPO_ROOT and PLUGIN_SOURCE_DIR so render / check operate against an isolated tmp tree.
- Headline invariant test: render then check is silent and returns 0.
- Property tests: render is idempotent; render wipes orphans under each plugin_root.
- Iteration tests now honor `cfg.excludes(...)` so reintroducing an exclude can't silently break the contract.
- Drift tests address files by name (claude learn/SKILL.md, claude learn/scripts/on_stop.py, bob's evolve-lite-learn.md) instead of by `next(...)` over a sorted manifest, so they're stable as the file tree changes.
- New TestPerPlatformRouting covers the `_<platform>/` prefix convention; new TestBobCommandGeneration covers the 1:1 skill → command auto-generation, the dash-form body, the description-from-frontmatter rule, and the no-`name:` constraint.
Also adds tests/smoke_skills.py (the three-platform skill harness) as part of the PR test plan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
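The invariants this suite pins (render then check is silent, orphans are wiped, perturbed or missing files are reported) can be illustrated with a stripped-down stand-in. `sketch_render` and `sketch_check` are hypothetical helpers that copy files verbatim; the real `render_to()` / `check_drift()` also handle templating and per-platform routing, and their signatures may differ.

```python
import filecmp
import shutil
from pathlib import Path


def sketch_render(source: Path, plugin_root: Path) -> None:
    """Wipe plugin_root, then copy the source tree into it verbatim."""
    if plugin_root.exists():
        shutil.rmtree(plugin_root)  # orphan wipe: nothing stale survives
    shutil.copytree(source, plugin_root)


def sketch_check(source: Path, plugin_root: Path) -> list[str]:
    """Return drift messages; an empty list means the trees match."""
    messages = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = plugin_root / rel
        if not dst.is_file():
            messages.append(f"missing managed file: {rel.as_posix()}")
        elif not filecmp.cmp(src, dst, shallow=False):  # compare bytes, not stat
            messages.append(f"drift: {rel.as_posix()}")
    return messages
```

Addressing files by explicit relative path, as the redesigned drift tests do, keeps assertions like `["drift: skills/SKILL.md"]` stable as the tree grows.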
|
I tested the Bob and Claude Code integrations end-to-end — both work as expected. One thing I'd like to understand better: how should we shape edits that are mostly shared but have a narrow platform-specific seam? Example: #239 adds audit-log influence tracking — the audit schema, Under the new model, is the intended pattern a shared source with a small Jinja-conditional (or per-platform macro) for the session_id derivation, rather than a full overlay file? The |
|
@vinodmut I think in general we should try to ship the same script to all agents even if parts of it are irrelevant. Unlike the web, we aren't driven to minimize file sizes. As it turns out, Claude exposes environment variables to scripts that it runs, apparently like
CLAUDECODE=1 # flag that the script is running under Claude Code
CLAUDE_CODE_ENTRYPOINT=cli # how Claude Code was launched
CLAUDE_CODE_EXECPATH=~/.local/share/claude/versions/2.1.126 # path to the running Claude Code binary
The other CLIs probably do too, and if they don't, maybe we should add a part to the SKILL.md that identifies the running platform.
|
Makes sense — runtime env-var detection beats compile-time Jinja conditionals here, and keeps SKILL.md readable. A single One note for #239: the seam is really "how do I identify the current session's transcript" — Claude gives us |
|
@vinodmut
{%- if platform == "claude" -%}
{%- set AGENT = "Claude" -%}
{%- elif platform == "claw-code" -%}
{%- set AGENT = "Claw" -%}
{%- elif platform == "codex" -%}
{%- set AGENT = "Codex" -%}
{%- elif platform == "bob" -%}
{%- set AGENT = "Bob" -%}
{%- endif -%} |
|
It's true that Bob doesn't even support hooks so we'd have to find a different way to deal with transcripts |
|
+1 — a tiny compiled-in |
Drops the MainThread group from the live region (it was redrawing the entire view on every orchestrator log line, which stacked duplicate `── MainThread ──` headers when long lines wrapped past the cursor-up wipe). Inlines per-skill `✓/✗ name detail` lines into each platform's section as steps complete, matching the old summary format — and removes the post-run summary block since the same info now lives in the sections themselves. Bob's install-only message also corrected: --resume works upstream again; the real reason we skip skill execution is that bob has no way to run slash commands non-interactively from a one-shot prompt.
Removes the line truncation in LiveGroupedHandler in favor of a wrap-aware redraw. `_last_lines` is now a physical-row count (each buffered line contributes ceil(len / term_width) rows), so the cursor-up wipe (\033[nF) still lands on the start of the live region when lines wrap onto multiple rows. Terminal width is re-read on every render so window resizes mid-run don't desync the wipe math.
Unifies the four hand-edited plugin copies under `platform-integrations/(bob,claude,codex,claw-code)` behind a single canonical source at `plugin-source/`, rendered by a Python+Jinja2 build script. `platform-integrations/` is treated as generated output, with a render-equality gate enforced by pre-commit and CI.
Implements the design captured in #219.
What changed
- `plugin-source/` is the source of truth. A skill's `SKILL.md.j2`, its scripts, and its descriptions all live in one place; per-platform output is fanned out by `plugin-source/build_plugins.py` with a per-platform Jinja context (forked-context flags, skill-dir paths, etc.).
- `plugin-source/plugin.toml`. Each host's `plugin.json` (or absence thereof) is projected from a single TOML; per-host extras live in `[claude]` / `[claw-code]` / `[codex]` tables.
- `_<platform>/` prefix. Anything under `plugin-source/_<platform>/...` ships only to that platform, with the `_<platform>/` prefix stripped from the output. This is how single-platform artifacts live alongside the universal sources without a separate manifest:
  - `_bob/custom_modes.yaml`: bob's mandatory workflow definition
  - `_claude/hooks/hooks.json`: `Stop` / `UserPromptSubmit` / `SessionStart` hooks
  - `_claw-code/hooks/retrieve_entities.sh`: optional PreToolUse hook
  - `_<platform>/README.md`: each platform's plugin-facing README
- Bob commands are generated from the skills walk. There is no `plugin-source/commands/` directory; `_bob_command_targets()` walks the skill folders and emits one `evolve-lite-<skill>.md` per skill. Frontmatter uses only `description` (pulled from the skill's `SKILL.md` frontmatter, since bob's command schema only honors `description` / `argument-hints`); the body references the on-disk folder name (`evolve-lite-<skill>`, dash form) since that's what bob resolves against. Folders stay colon-free for Windows compatibility.
- `render_to()` wipes each platform's `plugin_root` before writing, so renamed skills, deleted scripts, or obsolete commands cannot linger as orphans. Together with the routing convention above, this makes `platform-integrations/` fully derivable from `plugin-source/`.
- Bob skills use the colon form at runtime (`name: evolve-lite:<skill>` in frontmatter, referenced as `evolve-lite:<skill>` in prose) while their on-disk folder remains hyphenated (`.bob/skills/evolve-lite-<skill>/`) for Windows compatibility.
- The `plugins-rendered` pre-commit hook and CI job run `build_plugins.py check`, which exits non-zero if committed `platform-integrations/` differs from a fresh render.
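As a rough illustration of the bob command generation described in the list above: the sketch below is not `_bob_command_targets()` itself. `frontmatter_description()` and `bob_command()` are hypothetical helpers, the frontmatter parser is deliberately naive (no YAML library), and the body wording is invented.

```python
def frontmatter_description(skill_md: str) -> str:
    """Pull the description value out of a SKILL.md frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md has no frontmatter block")
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "description":
            return value.strip()
    raise ValueError("frontmatter has no description")


def bob_command(skill: str, skill_md: str) -> tuple[str, str]:
    """Return (file name, file content) for one generated bob command."""
    folder = f"evolve-lite-{skill}"  # dash form: colon-free for Windows
    description = frontmatter_description(skill_md)
    body = f"Use the {folder} skill.\n"  # hypothetical wording
    # Frontmatter carries only description; the file name is the command id.
    content = f"---\ndescription: {description}\n---\n{body}"
    return f"{folder}.md", content
```

A skill named `learn` would produce `evolve-lite-learn.md`, with no `name:` key in its frontmatter even though the source SKILL.md declares `name: evolve-lite:learn`.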
A redesigned 20-test suite at `tests/platform_integrations/test_build_pipeline.py` pins the headline invariant (`render → check` is silent), idempotence, orphan wipe, per-platform routing, bob command generation, and drift detection on perturbed/missing files.
How to validate locally
1. Generate the plugins. Write `plugin-source/` out to `platform-integrations/` with `just compile-plugins` or, equivalently, `uv run python plugin-source/build_plugins.py render`. Expect every per-platform tree under `platform-integrations/<host>/` to be regenerated.
2. Verify no drift. Confirm the committed tree matches a fresh render with `just check-plugins-rendered` or `uv run python plugin-source/build_plugins.py check`. Silent on success; exits non-zero with a `drift:` / `missing managed file:` message otherwise. The same check runs in pre-commit and CI.
3. Run the unit tests for the build pipeline (fast, hermetic, ~6s):
4. Run the cross-platform smoke harness. This runs the install flow + `learn` / `recall` / `publish` on real CLIs (claude / codex / bob), so a `--no-live` flag is supported for offline runs that exercise everything except live model calls: `--keep` leaves the temp install dir on exit so you can poke at the rendered plugin layout under `<tempdir>/<platform>/`.
5. Optional sanity checks on the rendered tree. Quickly inspect a few outputs:
Notes for reviewers
tests/smoke_skills.py.
🤖 Generated with Claude Code