Custom tools: user-defined tools via .nanocoder/tools/ #520

@will-lamerton

Custom Tools

Let users define their own tools - with input schemas, validators, execution logic, and formatters - by dropping files into .nanocoder/tools/. Same precedent as custom commands, but tools execute code rather than inject prompts.


The Problem

Today users can extend Nanocoder three ways:

  • Custom commands (.nanocoder/commands/) - markdown that becomes prompt context. No execution.
  • MCP servers - full tool execution, but requires running a separate process, writing a server, and configuring mcpServers in agents.config.json. Heavyweight for a one-off helper.
  • Patching the source - add a tool to source/tools/, rebuild. Not viable for end-users.

There is no lightweight middle ground: "I want a tool the model can call that runs kubectl get pods -n {{ namespace }} and returns the output, with namespace validated and approval required." Today that requires either an MCP server or forking the project.


The Vision

A user drops .nanocoder/tools/k8s-pods.md into their repo:

---
name: k8s_pods
description: List pods in a Kubernetes namespace. Returns kubectl output as text.
parameters:
  namespace:
    type: string
    required: true
    description: The Kubernetes namespace
    pattern: ^[a-z0-9-]+$
    maxLength: 63
  selector:
    type: string
    description: Optional label selector (e.g. "app=api")
approval: never
read_only: true
---

kubectl get pods -n {{ namespace }} {{# selector }}-l "{{ selector }}"{{/ selector }}

Restart Nanocoder. The tool appears alongside built-ins, gets fed to the LLM via the AI SDK with a real JSON Schema, validates inputs before invocation, runs the script, and returns stdout as the tool result.

For power users who need richer formatters or non-shell logic, a .ts form (Phase 2) lets them export a full NanocoderToolExport.


Why Now

The tool architecture is in good shape after the AI SDK v6 migration. Tools are plain NanocoderToolExport objects that flow through ToolManager → ToolRegistry. MCP already proves external tools can be loaded into the same registry with the same approval/validator/formatter contracts. A file-based loader is the natural next step - and lets people stop reaching for MCP for trivial wrappers.

Beyond the immediate ergonomics win, this is the foundation for two strategic moves:

  1. Extensibility as a first-class concern. Today, extending Nanocoder past MCP requires forking. A file-based tool loader puts user-authored tools on equal footing with built-ins (same registry, same contracts, same approval flow) and gives us a coherent story for "how do I add behavior to Nanocoder" that doesn't end in "run a separate server."
  2. A path to an open marketplace. Once tools live as portable, declarative files (and Phase 2 TS modules), they're shareable artifacts - copy a markdown file into .nanocoder/tools/, done. The same loader architecture extends naturally to subagents (source/subagents/ already loads markdown definitions via subagent-loader.ts). A future community registry for both - user-authored tools, user-authored subagents, and bundles that combine the two - becomes a packaging problem on top of an existing format, not a new architecture.

Architecture

Anatomy of a Built-In Tool (Recap)

Every Nanocoder tool is a NanocoderToolExport (source/types/core.ts:113, unchanged):

interface NanocoderToolExport {
  name: string;
  tool: AISDKCoreTool;          // AI SDK v6 tool with description, inputSchema, execute, needsApproval
  formatter?: ToolFormatter;     // (args, result?) => ReactElement | string
  streamingFormatter?: StreamingFormatter;
  validator?: ToolValidator;     // (args) => Promise<{valid:true} | {valid:false; error:string}>
  readOnly?: boolean;
}

These are collected in source/tools/index.ts as allToolExports and passed to ToolManager at construction (source/tools/tool-manager.ts:64). MCP tools are added later via registry.registerMany(...) (source/tools/tool-manager.ts:99). Custom tools will go through the same registerMany path - no new code path through the registry.

Where Custom Tools Plug In

ToolManager.constructor()                     ← static built-ins
  ↓
await ToolManager.initializeMCP(servers)      ← MCP tools (existing)
  ↓
await ToolManager.initializeCustomTools()     ← NEW: file-based custom tools
  ↓
registry now contains all three sources, indistinguishable downstream

Confirmation flow (source/components/tool-confirmation.tsx), execution (source/message-handler.ts), approval policy, mode filtering, profiles, and tool-name resolution all keep working unchanged because they only touch the unified registry.

Subagent Integration (Free)

Subagents are constructed with a reference to the same ToolManager instance (source/subagents/subagent-executor.ts:46,52) and resolve tools via toolManager.getAllTools() and toolManager.getToolEntry(name) (:200,:481). Because custom tools register into the unified registry before subagent execution begins, subagents see custom tools automatically with no additional wiring. The existing per-subagent allowedTools / disallowedTools filter (:208) works on them like any other tool. This is a load-bearing property of the marketplace vision: a custom subagent shipped alongside a custom tool will compose without coordination, because they meet at the registry. Add an end-to-end test that proves this - subagent invokes custom tool, both loaded from disk - so a future refactor cannot silently regress it.

Discovery & Layering

Mirror custom-commands precedent (source/custom-commands/loader.ts:35 - projectCommandsDir/personalCommandsDir pattern):

| Priority | Path |
| --- | --- |
| 1 (highest) | <cwd>/.nanocoder/tools/ |
| 2 | ~/.config/nanocoder/tools/ (or platform equivalent via getConfigPath()) |

Project tools shadow personal tools by name. Same conflict resolution as commands - keep the rule consistent.
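
The shadowing rule can be sketched as a merge keyed by tool name. This is a minimal illustration, not the loader itself: LoadedTool, layerTools, and the sample file paths are hypothetical names introduced here.

```typescript
// Hypothetical shape: only the fields needed to illustrate shadowing.
interface LoadedTool {
  name: string;
  sourcePath: string;
}

// Personal tools are inserted first, then project tools overwrite
// any entry with the same name (project shadows personal).
function layerTools(personal: LoadedTool[], project: LoadedTool[]): LoadedTool[] {
  const byName = new Map<string, LoadedTool>();
  for (const tool of personal) byName.set(tool.name, tool);
  for (const tool of project) byName.set(tool.name, tool);
  return Array.from(byName.values());
}

// Sample data (assumed file names).
const personalTools: LoadedTool[] = [
  {name: 'k8s_pods', sourcePath: '~/.config/nanocoder/tools/k8s-pods.md'},
  {name: 'gh_prs', sourcePath: '~/.config/nanocoder/tools/gh-prs.md'},
];
const projectTools: LoadedTool[] = [
  {name: 'k8s_pods', sourcePath: '.nanocoder/tools/k8s-pods.md'},
];

const layered = layerTools(personalTools, projectTools);
```

Here layered keeps two tools, and k8s_pods resolves to the project copy.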


File Format - Phase 1 (Shell Tools)

A custom tool is a markdown file with YAML frontmatter and a shell-script body. One file, one tool. No directory-as-tool support in Phase 1.

---
name: <required, snake_case, must match regex ^[a-z][a-z0-9_]*$>
description: <required, what the tool does - fed to the LLM>
parameters:
  <param_name>:
    type: string | number | integer | boolean | array
    description: <surfaced to the LLM>
    required: true | false              # default false
    default: <value>                    # used when not provided
    enum: [a, b, c]                     # restrict values
    pattern: ^regex$                    # string only
    minLength: <n>                      # string only
    maxLength: <n>                      # string only
    min: <n>                            # number/integer only
    max: <n>                            # number/integer only
    items: { type: string }             # array only
approval: never | always | destructive  # default: always
read_only: true | false                 # default: approval != always; surfaces to ToolEntry.readOnly
timeout_ms: <n>                          # default: 30000, max: 300000
cwd: <path>                              # default: project root; supports ${VAR} substitution
env:                                     # extra env vars; ${VAR} substitution allowed
  FOO: bar
shell: bash | sh                         # default: bash if available, else sh
---

# Body is a shell script. Parameters interpolate as {{ name }}.
# Use {{# name }}…{{/ name }} for "if defined" sections (Mustache-style).

kubectl get pods -n {{ namespace }} {{# selector }}-l "{{ selector }}"{{/ selector }}

Parameter → JSON Schema

The loader converts parameters to the AI SDK inputSchema (a JSON Schema). One source of truth: the same description and constraints surface to the model and to the validator.
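
A minimal sketch of that conversion, assuming the parameter fields listed in the frontmatter format above. ParamDef is a hypothetical stand-in for the proposed CustomToolParameterDef, and mapping min/max onto the JSON Schema keywords minimum/maximum is an assumption of this sketch.

```typescript
type ParamType = 'string' | 'number' | 'integer' | 'boolean' | 'array';

// Hypothetical stand-in for one entry under `parameters:`.
interface ParamDef {
  type: ParamType;
  description?: string;
  required?: boolean;
  default?: unknown;
  enum?: unknown[];
  pattern?: string;
  minLength?: number;
  maxLength?: number;
  min?: number;
  max?: number;
  items?: {type: ParamType};
}

function buildJsonSchema(parameters: Record<string, ParamDef>) {
  const properties: Record<string, Record<string, unknown>> = {};
  const required: string[] = [];
  for (const [name, def] of Object.entries(parameters)) {
    const prop: Record<string, unknown> = {type: def.type};
    if (def.description !== undefined) prop.description = def.description;
    if (def.default !== undefined) prop.default = def.default;
    if (def.enum !== undefined) prop.enum = def.enum;
    if (def.pattern !== undefined) prop.pattern = def.pattern;
    if (def.minLength !== undefined) prop.minLength = def.minLength;
    if (def.maxLength !== undefined) prop.maxLength = def.maxLength;
    // JSON Schema spells numeric bounds as minimum/maximum.
    if (def.min !== undefined) prop.minimum = def.min;
    if (def.max !== undefined) prop.maximum = def.max;
    if (def.items !== undefined) prop.items = def.items;
    properties[name] = prop;
    if (def.required) required.push(name);
  }
  return {type: 'object' as const, properties, required};
}

// The k8s_pods parameters from the example at the top of this issue.
const k8sSchema = buildJsonSchema({
  namespace: {
    type: 'string',
    required: true,
    description: 'The Kubernetes namespace',
    pattern: '^[a-z0-9-]+$',
    maxLength: 63,
  },
  selector: {type: 'string', description: 'Optional label selector (e.g. "app=api")'},
});
```

The resulting object would then be wrapped with the AI SDK's jsonSchema() helper, as in the Phase 2 example further down.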

Synthesized Validator

A ToolValidator is generated from the parameter declarations:

  • required missing → {valid:false, error:"⚒ Missing required parameter: <name>"}
  • type mismatch (e.g. number got string) → typed error
  • pattern, minLength, maxLength, min, max, enum violations → specific errors
  • Unknown extra params → silently dropped (consistent with how the AI SDK handles dynamic tool args)

Errors use the same emoji-prefixed style as webSearchValidator (source/tools/web-search.tsx:197).
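
The rules above could be synthesized roughly as follows. This is a sketch under assumptions: ParamDef is a hypothetical stand-in for the parameter declaration, and only the "Missing required parameter" message comes from this issue; the other error strings are placeholders.

```typescript
// Hypothetical stand-in for one entry under `parameters:`.
interface ParamDef {
  type: 'string' | 'number' | 'integer' | 'boolean' | 'array';
  required?: boolean;
  enum?: unknown[];
  pattern?: string;
  minLength?: number;
  maxLength?: number;
  min?: number;
  max?: number;
}

type ValidationResult = {valid: true} | {valid: false; error: string};

const fail = (message: string): ValidationResult => ({valid: false, error: `⚒ ${message}`});

function typeMatches(type: ParamDef['type'], value: unknown): boolean {
  switch (type) {
    case 'string': return typeof value === 'string';
    case 'number': return typeof value === 'number' && Number.isFinite(value);
    case 'integer': return typeof value === 'number' && Number.isInteger(value);
    case 'boolean': return typeof value === 'boolean';
    case 'array': return Array.isArray(value);
  }
}

// Synchronous core; unknown extra args are simply never inspected.
function validateArgs(params: Record<string, ParamDef>, args: Record<string, unknown>): ValidationResult {
  for (const [name, def] of Object.entries(params)) {
    const value = args[name];
    if (value === undefined || value === null) {
      if (def.required) return fail(`Missing required parameter: ${name}`);
      continue;
    }
    if (!typeMatches(def.type, value)) return fail(`Parameter ${name} must be of type ${def.type}`);
    if (def.enum && !def.enum.includes(value)) return fail(`Parameter ${name} must be one of: ${def.enum.join(', ')}`);
    if (typeof value === 'string') {
      if (def.pattern && !new RegExp(def.pattern).test(value)) return fail(`Parameter ${name} does not match ${def.pattern}`);
      if (def.minLength !== undefined && value.length < def.minLength) return fail(`Parameter ${name} is shorter than ${def.minLength}`);
      if (def.maxLength !== undefined && value.length > def.maxLength) return fail(`Parameter ${name} is longer than ${def.maxLength}`);
    }
    if (typeof value === 'number') {
      if (def.min !== undefined && value < def.min) return fail(`Parameter ${name} is below ${def.min}`);
      if (def.max !== undefined && value > def.max) return fail(`Parameter ${name} is above ${def.max}`);
    }
  }
  return {valid: true};
}

// Async wrapper matching the Promise-returning validator shape shown in the recap.
const buildValidator = (params: Record<string, ParamDef>) =>
  async (args: Record<string, unknown>): Promise<ValidationResult> => validateArgs(params, args);

const k8sParams: Record<string, ParamDef> = {
  namespace: {type: 'string', required: true, pattern: '^[a-z0-9-]+$', maxLength: 63},
  selector: {type: 'string'},
};
```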

Execution Handler

The synthesized handler:

  1. Resolves cwd and merges env (with ${VAR} substitution from process.env).
  2. Renders the body template, replacing each {{ name }} with its shell-quoted value. Where possible, pass values as child_process.execFile-style arguments; for the Mustache-style body, run a single bash -c <rendered> only after every value has been escaped with shell-quote or an equivalent.
  3. Spawns the shell with the rendered script, applies timeout_ms, captures stdout + stderr.
  4. On exit code 0: returns stdout (truncated at a token-budget limit, e.g. 32k chars, like other tools).
  5. On non-zero exit: throws Error("Custom tool failed (exit ${code}): ${stderr}"). The conversation loop already surfaces tool errors to the LLM as the tool result.
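
Steps 3 to 5 can be illustrated with a small synchronous sketch. It uses Node's spawnSync purely to keep the example short; the plan below calls for child_process.spawn, and a blocking call would not suit the interactive UI. runShellTool, RunOptions, and the truncation marker are hypothetical names, and the 32000-character limit is just the example figure from step 4.

```typescript
import {spawnSync} from 'node:child_process';

// Example budget taken from step 4; the real limit is undecided.
const MAX_OUTPUT_CHARS = 32000;

// Hypothetical options bag mirroring the frontmatter fields.
interface RunOptions {
  shell?: string;
  timeoutMs?: number;
  cwd?: string;
  env?: Record<string, string>;
}

function runShellTool(renderedScript: string, options: RunOptions = {}): string {
  const result = spawnSync(options.shell ?? 'sh', ['-c', renderedScript], {
    cwd: options.cwd,
    env: {...process.env, ...options.env},
    timeout: options.timeoutMs ?? 30000,
    encoding: 'utf8',
  });
  // Spawn failures and timeouts are reported through result.error.
  if (result.error) {
    throw new Error(`Custom tool failed to run: ${result.error.message}`);
  }
  if (result.status !== 0) {
    throw new Error(`Custom tool failed (exit ${result.status}): ${result.stderr}`);
  }
  return result.stdout.length > MAX_OUTPUT_CHARS
    ? `${result.stdout.slice(0, MAX_OUTPUT_CHARS)}\n[output truncated]`
    : result.stdout;
}
```

The script passed in is assumed to be already rendered and escaped; this function does no quoting of its own.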

Security: the body runs as the user. Same trust model as execute_bash. Parameter values are shell-quoted before substitution to block injection - but a custom tool author can still write a malicious script. Custom tools live in the user's repo, just like custom commands; treat them as user-authored code.

Approval

Map approval: to needsApproval on the AI SDK tool:

  • never → needsApproval: false (still subject to mode-based overrides via getEffectiveTools)
  • always → needsApproval: true
  • destructive → reuse createFileToolApproval(name) semantics from source/utils/tool-approval.ts:8 so it auto-approves in auto-accept and yolo but prompts in normal

read_only: true lights up parallelization (mirrors webSearchTool.readOnly).
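
The three approval values reduce to a small decision function. This is a sketch of the intended semantics only: needsApprovalFor and the Mode union are hypothetical, the real destructive behavior would come from createFileToolApproval, and other modes such as plan are left out.

```typescript
type ApprovalPolicy = 'never' | 'always' | 'destructive';

// Assumed mode names, taken from the wording in this issue.
type Mode = 'normal' | 'auto-accept' | 'yolo';

// Returns whether the user should be prompted before the tool runs.
function needsApprovalFor(policy: ApprovalPolicy, mode: Mode): boolean {
  switch (policy) {
    case 'never':
      return false;
    case 'always':
      return true;
    case 'destructive':
      // Auto-approves in auto-accept and yolo, prompts in normal.
      return mode === 'normal';
  }
}
```

A loader could pass a fixed boolean for never and always, and a callback built on this function for destructive.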

Default Formatter

Phase 1 ships one shared formatter for all shell tools - no per-tool React components. It renders:

⚒ <tool_name>
Args: <key=value pairs, truncated>
Output: ~<n> tokens

Style copied from web-search.tsx:184 for visual consistency. If the user wants a custom UI they upgrade to Phase 2.
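
A plain-string version of that shared formatter might look like the sketch below. formatShellToolCall is a hypothetical name, the 80-character argument cap is arbitrary, and the token count uses a rough characters-divided-by-four estimate, which is an assumption rather than how other tools count tokens.

```typescript
const MAX_ARGS_CHARS = 80; // arbitrary cap for this sketch

function formatShellToolCall(
  toolName: string,
  args: Record<string, unknown>,
  output?: string,
): string {
  const pairs = Object.entries(args)
    .map(([key, value]) => `${key}=${JSON.stringify(value)}`)
    .join(' ');
  const shownArgs =
    pairs.length > MAX_ARGS_CHARS ? `${pairs.slice(0, MAX_ARGS_CHARS - 1)}…` : pairs;
  const lines = [`⚒ ${toolName}`, `Args: ${shownArgs}`];
  if (output !== undefined) {
    // Rough estimate: about four characters per token.
    lines.push(`Output: ~${Math.ceil(output.length / 4)} tokens`);
  }
  return lines.join('\n');
}
```

Before execution (no result yet) the Output line is simply omitted.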


File Format - Phase 2 (TypeScript Module Tools)

For users who need:

  • A real Ink formatter
  • Logic that isn't shell (HTTP calls, parsing, state)
  • A streaming formatter

Drop a .ts file into .nanocoder/tools/ that default-exports a NanocoderToolExport:

import {tool, jsonSchema, type NanocoderToolExport} from 'nanocoder/tools';

export default {
  name: 'jira_ticket',
  tool: tool({
    description: 'Fetch a Jira ticket by key.',
    inputSchema: jsonSchema({
      type: 'object',
      properties: { key: { type: 'string' } },
      required: ['key'],
    }),
    needsApproval: false,
    execute: async ({ key }) => {
      const r = await fetch(`https://example.atlassian.net/rest/api/3/issue/${key}`);
      return await r.text();
    },
  }),
  readOnly: true,
} satisfies NanocoderToolExport;

Loader uses tsx's programmatic API (already a dev dep - see package.json "tsx" usage) or Node's built-in type stripping (--experimental-strip-types, available since Node 22.6 and enabled by default from Node 23.6) to dynamic-import the file. Open question: shipping a stable public import surface (nanocoder/tools) requires the project to expose helpers as a package export. Not blocking for Phase 1 since .ts tools are deferred.

Phase 2 risk: arbitrary code execution at startup. Same trust model as .nanocoder/commands/, but more dangerous because TS modules can do anything on import (not just on invocation). Mitigation:

  • Surface a one-time confirmation on first encountering a TS tool in a directory (similar to the directory-trust gate via useDirectoryTrust).
  • Log loaded TS tools at startup so the user sees what got pulled in.

Implementation Plan

Phase 1: Shell Tools (MVP)

1. Types - extend source/types/

  • Add source/types/custom-tools.ts:
    • CustomToolMetadata (parsed frontmatter shape)
    • CustomToolParameterDef (one entry under parameters:)
  • Re-export from source/types/index.ts.

2. Parser - source/custom-tools/parser.ts

  • Reuse YAML frontmatter logic from source/custom-commands/parser.ts:44 - extract into a shared util in source/utils/frontmatter.ts and have both call it. (Don't duplicate the multi-line / dash-array YAML code; refactor it out.)
  • Validate metadata against CustomToolMetadata. On invalid: logError and skip (consistent with custom-commands behavior on parse failure).

3. Schema synthesis - source/custom-tools/schema-builder.ts

  • buildJsonSchema(metadata: CustomToolMetadata): object → AI SDK inputSchema-compatible JSON Schema.
  • buildValidator(metadata: CustomToolMetadata): ToolValidator → synthesized validation with typed error messages.

4. Body interpolation - source/custom-tools/template.ts

  • renderBody(body: string, args: Record<string, unknown>): string
  • Support {{ name }} (shell-quoted substitution) and {{# name }}...{{/ name }} (conditional include if param truthy).
  • Use a small inline implementation; do not pull in mustache.js for this.
  • All substitutions must shell-escape via a tested escape function (write a focused unit test against injection vectors: ; rm -rf /, backticks, $(), newlines).
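
A minimal inline implementation along these lines, assuming POSIX single-quote escaping as the escape strategy (the issue leaves the exact escape function open). shellQuote is a hypothetical helper name.

```typescript
// Wrap a value in single quotes; embedded single quotes become '\''.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

function renderBody(body: string, args: Record<string, unknown>): string {
  // Pass 1: keep or drop {{# name }}...{{/ name }} sections based on truthiness.
  const withSections = body.replace(
    /\{\{#\s*(\w+)\s*\}\}([\s\S]*?)\{\{\/\s*\1\s*\}\}/g,
    (_match: string, name: string, inner: string) => (args[name] ? inner : ''),
  );
  // Pass 2: substitute {{ name }} with the shell-quoted value.
  return withSections.replace(
    /\{\{\s*(\w+)\s*\}\}/g,
    (_match: string, name: string) => shellQuote(String(args[name] ?? '')),
  );
}

const k8sTemplate =
  'kubectl get pods -n {{ namespace }} {{# selector }}-l {{ selector }}{{/ selector }}';

const withSelector = renderBody(k8sTemplate, {namespace: 'prod', selector: 'app=api'});
const hostile = renderBody('echo {{ text }}', {text: "a'; rm -rf /"});
```

With this quoting scheme the placeholder is not wrapped in extra double quotes inside the template, since the substituted value already arrives single-quoted; wrapping it again would pass the single quotes through to the command literally. Substituted values are never re-scanned, so a value containing {{ other }} stays literal text.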

5. Handler synthesis - source/custom-tools/handler.ts

  • buildHandler(metadata, body, sourceDir): ToolHandler
  • Spawn via child_process.spawn with the chosen shell; apply timeout_ms; capture stdout/stderr; truncate output; return string or throw.
  • Honor cwd and merged env with ${VAR} substitution.
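
The ${VAR} substitution for cwd and env can be sketched as a pure function. substituteVars and resolveEnv are hypothetical names, and replacing an unset variable with an empty string is an assumption; the issue does not say what should happen in that case.

```typescript
// Replace ${NAME} references with values from `vars`.
// Assumption: unset variables expand to an empty string.
function substituteVars(input: string, vars: Record<string, string | undefined>): string {
  return input.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, name: string) => vars[name] ?? '',
  );
}

// Merge the tool's extra env over the base env, expanding references
// against the base env only (no chained expansion between extras).
function resolveEnv(
  base: Record<string, string | undefined>,
  extra: Record<string, string>,
): Record<string, string | undefined> {
  const merged: Record<string, string | undefined> = {...base};
  for (const [key, value] of Object.entries(extra)) {
    merged[key] = substituteVars(value, base);
  }
  return merged;
}

const sampleBase = {HOME: '/home/dev', PATH: '/usr/bin'};
const sampleEnv = resolveEnv(sampleBase, {KUBECONFIG: '${HOME}/.kube/config', FOO: 'bar'});
const sampleCwd = substituteVars('${HOME}/work', sampleBase);
```

In the handler, base would be process.env; it is passed in explicitly here so the sketch stays testable.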

6. Loader - source/custom-tools/loader.ts

  • class CustomToolLoader mirroring CustomCommandLoader (source/custom-commands/loader.ts:24):
    • Constructor takes projectRoot.
    • loadTools(): NanocoderToolExport[] - scans both directories, parses, synthesizes, returns exports.
    • Personal first, then project (project shadows personal by name).
    • On duplicate name within the same directory: logError, keep the first.
  • Skip files that don't end in .md (Phase 1 ignores .ts until Phase 2).
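
The per-directory filtering rules could be sketched as below. The name regex and the read_file example come from this issue; Candidate, LoadError, selectTools, and the error wording are hypothetical. The built-in collision check reflects the decision recorded under Open Questions.

```typescript
const TOOL_NAME_PATTERN = /^[a-z][a-z0-9_]*$/;

// Hypothetical shapes for this sketch.
interface Candidate {
  file: string;
  name: string;
}
interface LoadError {
  file: string;
  error: string;
}

function selectTools(
  candidates: Candidate[],
  builtInNames: ReadonlySet<string>,
): {accepted: Candidate[]; errors: LoadError[]} {
  const accepted: Candidate[] = [];
  const errors: LoadError[] = [];
  const seen = new Set<string>();
  for (const candidate of candidates) {
    // Phase 1 silently ignores anything that is not markdown.
    if (!candidate.file.endsWith('.md')) continue;
    if (!TOOL_NAME_PATTERN.test(candidate.name)) {
      errors.push({file: candidate.file, error: `Invalid tool name: ${candidate.name}`});
      continue;
    }
    if (builtInNames.has(candidate.name)) {
      errors.push({file: candidate.file, error: `Name collides with built-in tool: ${candidate.name}`});
      continue;
    }
    if (seen.has(candidate.name)) {
      // Duplicate within the same directory: keep the first.
      errors.push({file: candidate.file, error: `Duplicate tool name: ${candidate.name}`});
      continue;
    }
    seen.add(candidate.name);
    accepted.push(candidate);
  }
  return {accepted, errors};
}

const selection = selectTools(
  [
    {file: 'k8s-pods.md', name: 'k8s_pods'},
    {file: 'k8s-pods-copy.md', name: 'k8s_pods'},
    {file: 'reader.md', name: 'read_file'},
    {file: 'Bad.md', name: 'Bad-Name'},
    {file: 'jira.ts', name: 'jira_ticket'},
  ],
  new Set(['read_file']),
);
```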

7. Integration - source/tools/tool-manager.ts

Add a method paralleling initializeMCP:

async initializeCustomTools(projectRoot?: string): Promise<{loaded: string[]; errors: Array<{file: string; error: string}>}> {
  const loader = new CustomToolLoader(projectRoot);
  const exports = loader.loadTools();
  const entries = exports.map(toToolEntry);   // reuse helper from ToolRegistry.fromToolExports
  this.registry.registerMany(entries);
  return { loaded: exports.map(e => e.name), errors: loader.getErrors() };
}

Wire it into useAppInitialization (source/hooks/useAppInitialization.tsx:71) after initializeMCP. Surface load errors via logError and a status line at startup ("Loaded N custom tools"). The --plain non-Ink shell (source/plain/shell.ts) must also load custom tools and emit equivalent errors via stderr; verify both paths.

8. Slash command - /tools (or extend an existing one)

Add a /tools command that lists all available tools by source (built-in / MCP / custom) so users can verify their custom tools loaded. Lives in source/commands/tools.tsx, registered via source/commands/lazy-registry.ts.

9. Mode, profile, and config interaction

Custom tools should respect the existing exclusion logic in ToolManager.MODE_EXCLUDED_TOOLS (source/tools/tool-manager.ts:26). Decisions:

  • Default: custom tools are NOT included in any tune profile's allowlist (so tune.toolProfile = "minimal" and especially "nano" - the strictest 5-tool budget at source/tools/tool-profiles.ts:32 - exclude them). Users opt in via project-level tune config that names the tool explicitly.
  • plan mode: include a custom tool only if its approval is never and its read_only is true; exclude everything else. (A read-only no-approval tool is safe in plan mode; anything else is mutation-risk.)
  • scheduler mode (internal; tool-manager.ts:53, used by source/schedule/ for cron-driven runs): scheduler already disables ask_user and agent. Apply the same posture to custom tools - exclude any whose approval is not never, because the run is non-interactive and there is no human to confirm.
  • disabledTools config (source/types/config.ts:165, applied in tool-manager.ts:117): custom tools must be filterable via top-level disabledTools the same way built-ins and MCP tools are. No special handling - they live in the unified registry, so this falls out for free, but include a test that confirms it.
  • alwaysAllow config (source/types/config.ts:159): the breaking: remove redundant nanocoderTools.alwaysAllow change (commit e10b530) made the top-level alwaysAllow list the single source of pre-approval. Custom tools participate by name automatically - no per-tool config surface needed, which is consistent with the "no new config surface in Phase 1" stance below.
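
The plan and scheduler rules amount to one predicate per mode. This sketch follows the parenthetical reading of the plan-mode rule: only tools that are both read-only and approval-free remain. CustomToolPolicy, ToolMode, and isCustomToolAllowed are hypothetical names; profile, disabledTools, and alwaysAllow handling are not modeled.

```typescript
// Hypothetical view of the two frontmatter fields that matter here.
interface CustomToolPolicy {
  name: string;
  approval: 'never' | 'always' | 'destructive';
  readOnly: boolean;
}

// Assumed mode names for this sketch.
type ToolMode = 'normal' | 'plan' | 'scheduler';

function isCustomToolAllowed(tool: CustomToolPolicy, mode: ToolMode): boolean {
  switch (mode) {
    case 'plan':
      // Only read-only tools that never prompt are safe while planning.
      return tool.approval === 'never' && tool.readOnly;
    case 'scheduler':
      // Non-interactive run: nobody is available to confirm a prompt.
      return tool.approval === 'never';
    case 'normal':
      return true;
  }
}

const sampleTools: CustomToolPolicy[] = [
  {name: 'k8s_pods', approval: 'never', readOnly: true},
  {name: 'cache_warm', approval: 'never', readOnly: false},
  {name: 'deploy', approval: 'always', readOnly: false},
];

const planTools = sampleTools.filter(t => isCustomToolAllowed(t, 'plan')).map(t => t.name);
const schedulerTools = sampleTools.filter(t => isCustomToolAllowed(t, 'scheduler')).map(t => t.name);
```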

10. Documentation

  • Add docs/custom-tools.md with the full file format, examples (kubectl, gh, jq, curl wrappers), and the security model.
  • Update CLAUDE.md "Command System" section with a sibling "Custom Tools" subsection.
  • Add an example .nanocoder/tools/README.md to the docs. Decision: the loader does not auto-write this file on first run; it is only documented.

11. Tests (AVA)

  • source/custom-tools/parser.spec.ts - frontmatter parsing, error cases.
  • source/custom-tools/schema-builder.spec.ts - JSON Schema output for each parameter type, validator behavior on each constraint.
  • source/custom-tools/template.spec.ts - interpolation, shell-escape correctness, conditional sections, injection vectors.
  • source/custom-tools/handler.spec.ts - execution, timeout, exit codes, env merging, cwd resolution.
  • source/custom-tools/loader.spec.ts - directory layering, name shadowing, duplicate handling, malformed file skip.
  • One end-to-end spec that loads a real markdown file from a temp dir, registers it through ToolManager, and confirms it appears in getAllTools() with correct schema.

Phase 2: TypeScript Module Tools (Follow-up)

Defer until Phase 1 ships and we have user feedback. Outline:

  • Loader detects .ts / .js files; dynamic-imports them.
  • Validates default export against NanocoderToolExport (zod or runtime shape check).
  • Exposes nanocoder/tools as a package export so authors can import {tool, jsonSchema} without depth into source/.
  • One-time-per-directory trust prompt before loading any TS tool (parallel to useDirectoryTrust).
  • Decide build path: tsx programmatic API vs Node native type stripping vs requiring .js only. Recommendation: require .js initially (zero new infra), add .ts once we know what users want.

Open Questions

  1. Hot-reload? Custom commands today require restart. Same for tools, or watch the directory? Recommend: same as commands (restart). Hot-reload adds complexity for marginal benefit; ship it later if asked.

  2. Cross-platform shells. On Windows, bash may not exist. Phase 1 should fall back to sh and document that custom tools are best-effort on Windows. PowerShell support = Phase 3.

  3. Argument passing model. Two viable options:

    • (A) Template body - {{ param }} interpolation as designed above. Familiar, Mustache-like.
    • (B) Env vars - params injected as $NANOCODER_PARAM_<NAME>, body is a plain script. Safer (no string substitution) but uglier and less Mustache-friendly.

    Recommend (A) for ergonomics, with rigorously tested shell-escape. (B) is the fallback if escape correctness becomes a maintenance liability.

  4. Streaming output. Some tools (e.g. tail -f-ish things) want to stream. Phase 1: capture-then-return. Phase 2 with TS modules: full streamingFormatter access. Don't try to invent streaming for shell tools in Phase 1.

  5. Package-level naming collisions. A custom tool named read_file would shadow a built-in. Decision: disallow custom tool names that match built-in tool names. MCP tools already deduplicate by name; do the same here, but treat a collision as a hard failure for that tool (logError and skip it) so a built-in is never silently replaced.

  6. Config schema in agents.config.json. Do we expose a customTools.enabled flag, an extra search path, or per-tool overrides? Resolved: none in Phase 1. The existing top-level disabledTools (source/types/config.ts:165) and alwaysAllow (:159) lists already cover the two real needs - turning a custom tool off and pre-approving it - without inventing new config. If someone wants to disable all custom tools they delete the directory.
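
For comparison with the template approach, option (B) from question 3 could look like the sketch below. The NANOCODER_PARAM_ prefix is from this issue; paramsToEnv and the JSON encoding of non-string values are assumptions.

```typescript
// Map validated args to NANOCODER_PARAM_<NAME> environment variables.
function paramsToEnv(args: Record<string, unknown>): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [name, value] of Object.entries(args)) {
    // Absent parameters are left unset so the script can test for them.
    if (value === undefined || value === null) continue;
    const key = `NANOCODER_PARAM_${name.toUpperCase()}`;
    // Assumption: non-string values are passed as JSON text.
    env[key] = typeof value === 'string' ? value : JSON.stringify(value);
  }
  return env;
}

const paramEnv = paramsToEnv({namespace: 'prod', limit: 5, selector: undefined});
```

The tool body then becomes a plain script such as kubectl get pods -n "$NANOCODER_PARAM_NAMESPACE", with no string substitution into the script text.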


What This Is Not

  • Not an MCP replacement. MCP is for tools that need their own process, state, or are shared across Nanocoder users. Custom tools are for personal/project-local helpers.
  • Not a sandbox. A custom tool runs with the user's full shell privileges. Trust model = "you wrote this file or you trust the repo it came from."
  • Not for distributing tools. No tool registry, no nanocoder install <pkg>. Out of scope.

Rough Sequencing

| Step | Effort | Notes |
| --- | --- | --- |
| Refactor frontmatter parser into shared util | S | Pure refactor, no behavior change |
| Types + schema builder + validator synthesis | M | Mostly mechanical |
| Template interpolation + shell-escape | M | The risky bit - heavy testing |
| Handler (spawn + timeout + capture) | S | Reuse patterns from execute-bash.tsx |
| Loader + ToolManager integration | M | Mirror initializeMCP |
| /tools slash command | S | New command, lazy-registered |
| Mode/profile interaction wiring | S | Two ifs in tool-manager.ts |
| Docs + tests | M | |
| Phase 2 (TS modules) | M-L | Defer until feedback |
