Custom tools: user-defined tools via .nanocoder/tools/

# Custom Tools

Let users define their own tools - with input schemas, validators, execution logic, and formatters - by dropping files into `.nanocoder/tools/`. Same precedent as custom commands, but tools execute code rather than inject prompts.

---

## The Problem

Today users can extend Nanocoder three ways:

- **Custom commands** (`.nanocoder/commands/`) - markdown that becomes prompt context. No execution.
- **MCP servers** - full tool execution, but they require running a separate process, writing a server, and configuring `mcpServers` in `agents.config.json`. Heavyweight for a one-off helper.
- **Patching the source** - add a tool to `source/tools/`, rebuild. Not viable for end-users.

There is no lightweight middle ground: *"I want a tool the model can call that runs `kubectl get pods -n {{ namespace }}` and returns the output, with `namespace` validated and an explicit approval policy."* Today that requires either an MCP server or forking the project.

---

## The Vision

A user drops `.nanocoder/tools/k8s-pods.md` into their repo:

```markdown
---
name: k8s_pods
description: List pods in a Kubernetes namespace. Returns kubectl output as text.
parameters:
  namespace:
    type: string
    required: true
    description: The Kubernetes namespace
    pattern: ^[a-z0-9-]+$
    maxLength: 63
  selector:
    type: string
    description: Optional label selector (e.g. "app=api")
approval: never
read_only: true
---

kubectl get pods -n {{ namespace }} {{# selector }}-l {{ selector }}{{/ selector }}
```

Restart Nanocoder. The tool appears alongside built-ins, gets fed to the LLM via the AI SDK with a real JSON Schema, validates inputs before invocation, runs the script, and returns stdout as the tool result.

For power users who need richer formatters or non-shell logic, a `.ts` form (Phase 2) lets them export a full `NanocoderToolExport`.

---

## Why Now

The tool architecture is in good shape after the AI SDK v6 migration. Tools are plain `NanocoderToolExport` objects that flow through `ToolManager` → `ToolRegistry`. MCP already proves *external* tools can be loaded into the same registry with the same approval/validator/formatter contracts. A file-based loader is the natural next step - and lets people stop reaching for MCP for trivial wrappers.

Beyond the immediate ergonomics win, this is the foundation for two strategic moves:

1. **Extensibility as a first-class concern.** Today, extending Nanocoder past MCP requires forking. A file-based tool loader puts user-authored tools on equal footing with built-ins (same registry, same contracts, same approval flow) and gives us a coherent story for "how do I add behavior to Nanocoder" that doesn't end in "run a separate server."
2. **A path to an open marketplace.** Once tools live as portable, declarative files (and Phase 2 TS modules), they're shareable artifacts - copy a markdown file into `.nanocoder/tools/`, done. The same loader architecture extends naturally to subagents (`source/subagents/` already loads markdown definitions via `subagent-loader.ts`). A future community registry for both - user-authored tools, user-authored subagents, and bundles that combine the two - becomes a packaging problem on top of an existing format, not a new architecture.

---

## Architecture

### Anatomy of a Built-In Tool (Recap)

Every Nanocoder tool is a `NanocoderToolExport` (`source/types/core.ts:113`, unchanged):

```ts
interface NanocoderToolExport {
  name: string;
  tool: AISDKCoreTool;          // AI SDK v6 tool with description, inputSchema, execute, needsApproval
  formatter?: ToolFormatter;     // (args, result?) => ReactElement | string
  streamingFormatter?: StreamingFormatter;
  validator?: ToolValidator;     // (args) => Promise<{valid:true} | {valid:false; error:string}>
  readOnly?: boolean;
}
```

These are collected in `source/tools/index.ts` as `allToolExports` and passed to `ToolManager` at construction (`source/tools/tool-manager.ts:64`). MCP tools are added later via `registry.registerMany(...)` (`source/tools/tool-manager.ts:99`). **Custom tools will go through the same `registerMany` path** - no new code path through the registry.

### Where Custom Tools Plug In

```
ToolManager.constructor()                     ← static built-ins
  ↓
await ToolManager.initializeMCP(servers)      ← MCP tools (existing)
  ↓
await ToolManager.initializeCustomTools()     ← NEW: file-based custom tools
  ↓
registry now contains all three sources, indistinguishable downstream
```

Confirmation flow (`source/components/tool-confirmation.tsx`), execution (`source/message-handler.ts`), approval policy, mode filtering, profiles, and tool-name resolution all keep working unchanged because they only touch the unified registry.

### Subagent Integration (Free)

Subagents are constructed with a reference to the same `ToolManager` instance (`source/subagents/subagent-executor.ts:46,52`) and resolve tools via `toolManager.getAllTools()` and `toolManager.getToolEntry(name)` (`:200,:481`). Because custom tools register into the unified registry *before* subagent execution begins, **subagents see custom tools automatically with no additional wiring**. The existing per-subagent `allowedTools` / `disallowedTools` filter (`:208`) works on them like any other tool. This is a load-bearing property of the marketplace vision: a custom subagent shipped alongside a custom tool will compose without coordination, because they meet at the registry. Add an end-to-end test that proves this - subagent invokes custom tool, both loaded from disk - so a future refactor cannot silently regress it.

### Discovery & Layering

Mirror custom-commands precedent (`source/custom-commands/loader.ts:35` - `projectCommandsDir`/`personalCommandsDir` pattern):

| Priority | Path |
|---|---|
| 1 (highest) | `<cwd>/.nanocoder/tools/` |
| 2 | `~/.config/nanocoder/tools/` (or platform equivalent via `getConfigPath()`) |

Project tools shadow personal tools by `name`. Same conflict resolution as commands - keep the rule consistent.
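
A minimal sketch of the shadowing rule, assuming a simplified `LoadedTool` shape and a hypothetical `mergeLayers` helper (neither exists in the codebase today). It also applies the keep-first rule for duplicates inside a single directory that the loader plan describes:

```ts
// Hypothetical shape: only the fields needed to illustrate shadowing.
interface LoadedTool {
  name: string;
  sourcePath: string;
}

// Within one directory, the first definition of a name wins.
function dedupeKeepFirst(tools: LoadedTool[]): Map<string, LoadedTool> {
  const byName = new Map<string, LoadedTool>();
  for (const tool of tools) {
    if (!byName.has(tool.name)) byName.set(tool.name, tool);
  }
  return byName;
}

// Across directories, project definitions replace personal ones by name.
function mergeLayers(personal: LoadedTool[], project: LoadedTool[]): LoadedTool[] {
  const merged = dedupeKeepFirst(personal);
  dedupeKeepFirst(project).forEach((tool, name) => merged.set(name, tool));
  return Array.from(merged.values());
}
```

Loading personal tools first and overwriting with project tools keeps the precedence rule in one place, so the rest of the loader never needs to know which directory a tool came from.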

---

## File Format - Phase 1 (Shell Tools)

A custom tool is a markdown file with YAML frontmatter and a shell-script body. **One file, one tool.** No directory-as-tool support in Phase 1.

```markdown
---
name: <required, snake_case, must match regex ^[a-z][a-z0-9_]*$>
description: <required, what the tool does - fed to the LLM>
parameters:
  <param_name>:
    type: string | number | integer | boolean | array
    description: <surfaced to the LLM>
    required: true | false              # default false
    default: <value>                    # used when not provided
    enum: [a, b, c]                     # restrict values
    pattern: ^regex$                    # string only
    minLength: <n>                      # string only
    maxLength: <n>                      # string only
    min: <n>                            # number/integer only
    max: <n>                            # number/integer only
    items: { type: string }             # array only
approval: never | always | destructive  # default: always
read_only: true | false                 # default: true when approval != always; surfaces as ToolEntry.readOnly
timeout_ms: <n>                          # default: 30000, max: 300000
cwd: <path>                              # default: project root; supports ${VAR} substitution
env:                                     # extra env vars; ${VAR} substitution allowed
  FOO: bar
shell: bash | sh                         # default: bash if available, else sh
---

# Body is a shell script. Parameters interpolate as {{ name }}.
# Use {{# name }}…{{/ name }} for "if defined" sections (Mustache-style).

kubectl get pods -n {{ namespace }} {{# selector }}-l {{ selector }}{{/ selector }}
```

### Parameter → JSON Schema

The loader converts `parameters` to the AI SDK `inputSchema` (a JSON Schema). One source of truth: the same description and constraints surface to the model and to the validator.
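
As an illustration, the conversion could look roughly like the sketch below. The `ParamDef` type is an assumed reading of the frontmatter fields listed above, and this `buildJsonSchema` takes the `parameters` map directly rather than the full metadata object, purely to keep the example short. Note that the frontmatter's `min`/`max` would map to JSON Schema's `minimum`/`maximum`.

```ts
type ParamType = 'string' | 'number' | 'integer' | 'boolean' | 'array';

// Assumed shape of one entry under `parameters:`.
interface ParamDef {
  type: ParamType;
  description?: string;
  required?: boolean;
  default?: unknown;
  enum?: unknown[];
  pattern?: string;
  minLength?: number;
  maxLength?: number;
  min?: number;
  max?: number;
  items?: {type: ParamType};
}

interface ObjectSchema {
  type: 'object';
  properties: Record<string, Record<string, unknown>>;
  required: string[];
}

function buildJsonSchema(parameters: Record<string, ParamDef>): ObjectSchema {
  const properties: Record<string, Record<string, unknown>> = {};
  const required: string[] = [];
  for (const [name, def] of Object.entries(parameters)) {
    const prop: Record<string, unknown> = {type: def.type};
    if (def.description !== undefined) prop.description = def.description;
    if (def.default !== undefined) prop.default = def.default;
    if (def.enum !== undefined) prop.enum = def.enum;
    if (def.pattern !== undefined) prop.pattern = def.pattern;
    if (def.minLength !== undefined) prop.minLength = def.minLength;
    if (def.maxLength !== undefined) prop.maxLength = def.maxLength;
    // Frontmatter uses min/max; JSON Schema names them minimum/maximum.
    if (def.min !== undefined) prop.minimum = def.min;
    if (def.max !== undefined) prop.maximum = def.max;
    if (def.items !== undefined) prop.items = def.items;
    properties[name] = prop;
    if (def.required) required.push(name);
  }
  return {type: 'object', properties, required};
}
```

The resulting object is plain JSON Schema, so it could be wrapped with the AI SDK's `jsonSchema(...)` helper the same way the Phase 2 example does.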

### Synthesized Validator

A `ToolValidator` is generated from the parameter declarations:

- `required` missing → `{valid:false, error:"⚒ Missing required parameter: <name>"}`
- type mismatch (e.g. `number` got string) → typed error
- `pattern`, `minLength`, `maxLength`, `min`, `max`, `enum` violations → specific errors
- Unknown extra params → silently dropped (consistent with how the AI SDK handles dynamic tool args)

Errors use the same emoji-prefixed style as `webSearchValidator` (`source/tools/web-search.tsx:197`).
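
A rough sketch of the synthesized validator follows. The `ParamDef` type is a trimmed, assumed version of the frontmatter shape, and every message other than the "Missing required parameter" one is illustrative wording, not something the proposal fixes.

```ts
type ValidationResult = {valid: true} | {valid: false; error: string};

// Trimmed, assumed parameter shape; only the constraint fields are used here.
interface ParamDef {
  type: 'string' | 'number' | 'integer' | 'boolean' | 'array';
  required?: boolean;
  enum?: unknown[];
  pattern?: string;
  minLength?: number;
  maxLength?: number;
  min?: number;
  max?: number;
}

const fail = (message: string): ValidationResult => ({valid: false, error: `⚒ ${message}`});

function typeMatches(type: ParamDef['type'], value: unknown): boolean {
  switch (type) {
    case 'string': return typeof value === 'string';
    case 'number': return typeof value === 'number' && Number.isFinite(value);
    case 'integer': return typeof value === 'number' && Number.isInteger(value);
    case 'boolean': return typeof value === 'boolean';
    case 'array': return Array.isArray(value);
    default: return false;
  }
}

// Only declared parameters are inspected, so unknown extras are ignored.
function validateArgs(params: Record<string, ParamDef>, args: Record<string, unknown>): ValidationResult {
  for (const [name, def] of Object.entries(params)) {
    const value = args[name];
    if (value === undefined || value === null) {
      if (def.required) return fail(`Missing required parameter: ${name}`);
      continue;
    }
    if (!typeMatches(def.type, value)) return fail(`Parameter ${name} must be of type ${def.type}`);
    if (def.enum && !def.enum.includes(value)) return fail(`Parameter ${name} must be one of the allowed values`);
    if (typeof value === 'string') {
      if (def.pattern !== undefined && !new RegExp(def.pattern).test(value)) return fail(`Parameter ${name} does not match ${def.pattern}`);
      if (def.minLength !== undefined && value.length < def.minLength) return fail(`Parameter ${name} is shorter than ${def.minLength}`);
      if (def.maxLength !== undefined && value.length > def.maxLength) return fail(`Parameter ${name} is longer than ${def.maxLength}`);
    }
    if (typeof value === 'number') {
      if (def.min !== undefined && value < def.min) return fail(`Parameter ${name} is below ${def.min}`);
      if (def.max !== undefined && value > def.max) return fail(`Parameter ${name} is above ${def.max}`);
    }
  }
  return {valid: true};
}

// Async wrapper matching the ToolValidator contract shown in the recap.
const buildValidator = (params: Record<string, ParamDef>) =>
  async (args: Record<string, unknown>): Promise<ValidationResult> => validateArgs(params, args);
```

Keeping the synchronous `validateArgs` separate from the async wrapper makes the constraint logic easy to unit-test without awaiting anything.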

### Execution Handler

The synthesized handler:

1. Resolves `cwd` and merges `env` (with `${VAR}` substitution from `process.env`).
2. Renders the body template: each `{{ name }}` is replaced with the **shell-quoted** value (escaped with `shell-quote` or an equivalent tested function), and the rendered script is passed to the shell as a single `-c` argument. Where a tool needs no template logic, prefer `child_process.execFile`-style argument arrays, which avoid shell parsing altogether.
3. Spawns the shell with the rendered script, applies `timeout_ms`, captures stdout + stderr.
4. On exit code 0: returns stdout (truncated at a token-budget limit, e.g. 32k chars, like other tools).
5. On non-zero exit: throws `Error("Custom tool failed (exit ${code}): ${stderr}")`. The conversation loop already surfaces tool errors to the LLM as the tool result.

**Security**: the body runs *as the user*. Same trust model as `execute_bash`. Parameter values are shell-quoted before substitution to block injection - but a custom tool author can still write a malicious script. Custom tools live in the user's repo, just like custom commands; treat them as user-authored code.
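
The spawn, timeout, and exit-code steps above might be sketched as follows. `runScript` and `RunOptions` are hypothetical names; the script is assumed to be already rendered and escaped, and the truncation limit is passed in rather than fixed.

```ts
import {spawn} from 'node:child_process';

// Hypothetical options bag; the fields mirror the frontmatter keys.
interface RunOptions {
  shell: string;            // e.g. 'bash' or 'sh'
  cwd?: string;
  env?: Record<string, string>;
  timeoutMs: number;
  maxChars: number;         // truncation limit for returned stdout
}

function runScript(script: string, opts: RunOptions): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(opts.shell, ['-c', script], {
      cwd: opts.cwd,
      env: {...process.env, ...opts.env},
    });
    // The script gets no interactive input.
    child.stdin?.end();

    let stdout = '';
    let stderr = '';
    child.stdout?.on('data', (chunk: Buffer) => { stdout += chunk.toString(); });
    child.stderr?.on('data', (chunk: Buffer) => { stderr += chunk.toString(); });

    // Reject as soon as the deadline passes instead of waiting for 'close',
    // which can be delayed by grandchildren that still hold the pipes.
    const timer = setTimeout(() => {
      child.kill('SIGKILL');
      reject(new Error(`Custom tool timed out after ${opts.timeoutMs}ms`));
    }, opts.timeoutMs);

    child.on('error', err => {
      clearTimeout(timer);
      reject(err);
    });
    child.on('close', code => {
      clearTimeout(timer);
      if (code !== 0) {
        reject(new Error(`Custom tool failed (exit ${code}): ${stderr}`));
        return;
      }
      resolve(stdout.slice(0, opts.maxChars));
    });
  });
}
```

One caveat with this sketch: killing the shell does not necessarily kill processes it started, so a production handler would probably need process-group handling on top of this.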

### Approval

Map `approval:` to `needsApproval` on the AI SDK tool:

- `never` → `needsApproval: false` (still subject to mode-based overrides via `getEffectiveTools`)
- `always` → `needsApproval: true`
- `destructive` → reuse `createFileToolApproval(name)` semantics from `source/utils/tool-approval.ts:8` so it auto-approves in `auto-accept` and `yolo` but prompts in `normal`

`read_only: true` lights up parallelization (mirrors `webSearchTool.readOnly`).
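
To make the mapping concrete, here is a small sketch. It does not call `createFileToolApproval`; instead it assumes a hypothetical `getMode` accessor and mimics the described behavior (prompt in `normal`, auto-approve in `auto-accept` and `yolo`). It also assumes `needsApproval` may be either a boolean or a function.

```ts
type ApprovalPolicy = 'never' | 'always' | 'destructive';

// Only the modes named in the approval rules; other modes are out of scope here.
type Mode = 'normal' | 'auto-accept' | 'yolo';

type NeedsApproval = boolean | (() => boolean);

// `getMode` is a hypothetical accessor for the current development mode.
function toNeedsApproval(policy: ApprovalPolicy, getMode: () => Mode): NeedsApproval {
  switch (policy) {
    case 'never':
      return false;
    case 'always':
      return true;
    case 'destructive':
      // Evaluated per call, so a mode switch mid-session takes effect.
      return () => getMode() === 'normal';
  }
}
```

Returning a function for `destructive` defers the decision to invocation time, which matters because the mode can change after the tool is registered.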

### Default Formatter

Phase 1 ships one shared formatter for all shell tools - no per-tool React components. It renders:

```
⚒ <tool_name>
Args: <key=value pairs, truncated>
Output: ~<n> tokens
```

Style copied from `web-search.tsx:184` for visual consistency. If the user wants a custom UI they upgrade to Phase 2.
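
A plain-string version of that shared formatter could look like this sketch. The 40-character argument truncation and the four-characters-per-token estimate are assumptions chosen for illustration, and `formatCustomToolResult` is a hypothetical name.

```ts
// Shorten long values so the summary stays on one line.
function truncate(text: string, max: number): string {
  return text.length > max ? text.slice(0, max - 1) + '…' : text;
}

// Returns a string, which the ToolFormatter contract allows alongside ReactElement.
function formatCustomToolResult(
  toolName: string,
  args: Record<string, unknown>,
  result?: string,
): string {
  const pairs = Object.entries(args)
    .map(([key, value]) => `${key}=${truncate(String(value), 40)}`)
    .join(', ');
  const lines = [`⚒ ${toolName}`, `Args: ${pairs}`];
  if (result !== undefined) {
    // Rough estimate only: about four characters per token.
    lines.push(`Output: ~${Math.ceil(result.length / 4)} tokens`);
  }
  return lines.join('\n');
}
```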

---

## File Format - Phase 2 (TypeScript Module Tools)

For users who need:

- A real Ink formatter
- Logic that isn't shell (HTTP calls, parsing, state)
- A streaming formatter

Drop a `.ts` file into `.nanocoder/tools/` that **default-exports a `NanocoderToolExport`**:

```ts
import {tool, jsonSchema, type NanocoderToolExport} from 'nanocoder/tools';

export default {
  name: 'jira_ticket',
  tool: tool({
    description: 'Fetch a Jira ticket by key.',
    inputSchema: jsonSchema({
      type: 'object',
      properties: { key: { type: 'string' } },
      required: ['key'],
    }),
    needsApproval: false,
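    execute: async ({ key }) => {
      // Encode the key so it cannot alter the request path.
      const issue = encodeURIComponent(String(key));
      const r = await fetch(`https://example.atlassian.net/rest/api/3/issue/${issue}`);
      // Surface HTTP failures as tool errors instead of returning an error page.
      if (!r.ok) throw new Error(`Jira request failed: ${r.status}`);
      return await r.text();
    },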
  }),
  readOnly: true,
} satisfies NanocoderToolExport;
```

Loader uses `tsx`'s programmatic API (already a dev dep - see `package.json` "tsx" usage) or strips types via `--experimental-strip-types` (Node 22.6+; type stripping is on by default from Node 23.6) to dynamic-import the file. **Open question**: shipping a stable public import surface (`nanocoder/tools`) requires the project to expose helpers as a package export. Not blocking for Phase 1 since `.ts` tools are deferred.

**Phase 2 risk**: arbitrary code execution at startup. Same trust model as `.nanocoder/commands/`, but more dangerous because TS modules can do anything on import (not just on invocation). Mitigation:

- Surface a one-time confirmation on first encountering a TS tool in a directory (similar to the directory-trust gate via `useDirectoryTrust`).
- Log loaded TS tools at startup so the user sees what got pulled in.

---

## Implementation Plan

### Phase 1: Shell Tools (MVP)

**1. Types - extend `source/types/`**

- Add `source/types/custom-tools.ts`:
  - `CustomToolMetadata` (parsed frontmatter shape)
  - `CustomToolParameterDef` (one entry under `parameters:`)
- Re-export from `source/types/index.ts`.

**2. Parser - `source/custom-tools/parser.ts`**

- Reuse YAML frontmatter logic from `source/custom-commands/parser.ts:44` - extract into a shared util in `source/utils/frontmatter.ts` and have both call it. (Don't duplicate the multi-line / dash-array YAML code; refactor it out.)
- Validate metadata against `CustomToolMetadata`. On invalid: `logError` and skip (consistent with custom-commands behavior on parse failure).

**3. Schema synthesis - `source/custom-tools/schema-builder.ts`**

- `buildJsonSchema(metadata: CustomToolMetadata): object` → AI SDK `inputSchema`-compatible JSON Schema.
- `buildValidator(metadata: CustomToolMetadata): ToolValidator` → synthesized validation with typed error messages.

**4. Body interpolation - `source/custom-tools/template.ts`**

- `renderBody(body: string, args: Record<string, unknown>): string`
- Support `{{ name }}` (shell-quoted substitution) and `{{# name }}...{{/ name }}` (conditional include if param truthy).
- Use a small inline implementation; do not pull in mustache.js for this.
- All substitutions must shell-escape via a tested escape function (write a focused unit test against injection vectors: `; rm -rf /`, backticks, `$()`, newlines).
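
A minimal sketch of such an inline renderer, assuming POSIX single-quote escaping and treating missing values as an empty quoted string (both are assumptions, not decisions made in this proposal):

```ts
// POSIX-style quoting: wrap in single quotes and escape embedded single quotes.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

function renderBody(body: string, args: Record<string, unknown>): string {
  // Pass 1: conditional sections. Keep the inner text only if the param is truthy.
  const withSections = body.replace(
    /\{\{#\s*(\w+)\s*\}\}([\s\S]*?)\{\{\/\s*\1\s*\}\}/g,
    (_match, name: string, inner: string) => (args[name] ? inner : ''),
  );
  // Pass 2: value substitution. Every value is quoted, with no opt-out.
  return withSections.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, name: string) => {
    const value = args[name];
    return value === undefined || value === null ? "''" : shellQuote(String(value));
  });
}
```

Both passes use function replacers, so characters like `$&` in a value are never interpreted as replacement patterns, and substituted values are not re-scanned for further placeholders. Because the renderer adds the quotes itself, a template that also wraps a placeholder in double quotes would pass the single quotes through to the command literally.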

**5. Handler synthesis - `source/custom-tools/handler.ts`**

- `buildHandler(metadata, body, sourceDir): ToolHandler`
- Spawn via `child_process.spawn` with the chosen shell; apply `timeout_ms`; capture stdout/stderr; truncate output; return string or throw.
- Honor `cwd` and merged `env` with `${VAR}` substitution.
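
The `${VAR}` substitution for `cwd` and `env` could be as small as the sketch below. Expanding unset variables to an empty string is an assumption (the proposal does not say what should happen), and `expandVars` / `resolveEnv` are hypothetical names.

```ts
type Env = Record<string, string | undefined>;

// Replace ${NAME} with the value from `env`; unset names become ''.
function expandVars(input: string, env: Env): string {
  return input.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match, name: string) => env[name] ?? '',
  );
}

// Expand each declared value against the base environment, then layer it on top.
function resolveEnv(extra: Record<string, string>, base: Env): Env {
  const resolved: Env = {...base};
  for (const [key, value] of Object.entries(extra)) {
    resolved[key] = expandVars(value, base);
  }
  return resolved;
}
```

In the handler, `base` would presumably be `process.env`, and the same `expandVars` call would be applied to the `cwd` value before it is resolved against the project root.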

**6. Loader - `source/custom-tools/loader.ts`**

- `class CustomToolLoader` mirroring `CustomCommandLoader` (`source/custom-commands/loader.ts:24`):
  - Constructor takes `projectRoot`.
  - `loadTools(): NanocoderToolExport[]` - scans both directories, parses, synthesizes, returns exports.
  - Personal first, then project (project shadows personal by `name`).
  - On duplicate name within the same directory: `logError`, keep the first.
- Skip files that don't end in `.md` (Phase 1 ignores `.ts` until Phase 2).

**7. Integration - `source/tools/tool-manager.ts`**

Add a method paralleling `initializeMCP`:

```ts
async initializeCustomTools(projectRoot?: string): Promise<{loaded: string[]; errors: Array<{file: string; error: string}>}> {
  const loader = new CustomToolLoader(projectRoot);
  const exports = loader.loadTools();
  const entries = exports.map(toToolEntry);   // reuse helper from ToolRegistry.fromToolExports
  this.registry.registerMany(entries);
  return { loaded: exports.map(e => e.name), errors: loader.getErrors() };
}
```

Wire it into `useAppInitialization` (`source/hooks/useAppInitialization.tsx:71`) after `initializeMCP`. Surface load errors via `logError` and a status line at startup ("Loaded N custom tools"). The `--plain` non-Ink shell (`source/plain/shell.ts`) must also load custom tools and emit equivalent errors via stderr; verify both paths.

**8. Slash command - `/tools` (or extend an existing one)**

Add a `/tools` command that lists all available tools by source (built-in / MCP / custom) so users can verify their custom tools loaded. Lives in `source/commands/tools.tsx`, registered via `source/commands/lazy-registry.ts`.

**9. Mode, profile, and config interaction**

Custom tools should respect the existing exclusion logic in `ToolManager.MODE_EXCLUDED_TOOLS` (`source/tools/tool-manager.ts:26`). Decisions:

- **Default**: custom tools are NOT included in any tune profile's allowlist (so `tune.toolProfile = "minimal"` and especially `"nano"` - the strictest 5-tool budget at `source/tools/tool-profiles.ts:32` - exclude them). Users opt in via project-level tune config that names the tool explicitly.
- **`plan` mode**: exclude every custom tool unless its `approval` is `never` and its `read_only` is `true`. (A read-only no-approval tool is safe in plan mode; anything else is mutation-risk.)
- **`scheduler` mode** (internal; `tool-manager.ts:53`, used by `source/schedule/` for cron-driven runs): scheduler already disables `ask_user` and `agent`. Apply the same posture to custom tools - exclude any whose `approval` is not `never`, because the run is non-interactive and there is no human to confirm.
- **`disabledTools` config** (`source/types/config.ts:165`, applied in `tool-manager.ts:117`): custom tools must be filterable via top-level `disabledTools` the same way built-ins and MCP tools are. No special handling - they live in the unified registry, so this falls out for free, but include a test that confirms it.
- **`alwaysAllow` config** (`source/types/config.ts:159`): the `breaking: remove redundant nanocoderTools.alwaysAllow` change (commit `e10b530`) made the top-level `alwaysAllow` list the single source of pre-approval. Custom tools participate by name automatically - no per-tool config surface needed, which is consistent with the "no new config surface in Phase 1" stance below.
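
The two mode rules can be expressed as a single predicate. This sketch reads the plan-mode rule as "only tools that are both no-approval and read-only survive", and uses hypothetical names (`CustomToolPolicy`, `isCustomToolAllowed`) rather than anything in `tool-manager.ts`.

```ts
// Hypothetical view of the two frontmatter fields that drive mode filtering.
interface CustomToolPolicy {
  approval: 'never' | 'always' | 'destructive';
  readOnly: boolean;
}

type Mode = 'normal' | 'auto-accept' | 'yolo' | 'plan' | 'scheduler';

function isCustomToolAllowed(tool: CustomToolPolicy, mode: Mode): boolean {
  switch (mode) {
    case 'plan':
      // Only tools that cannot mutate and never prompt are safe while planning.
      return tool.approval === 'never' && tool.readOnly;
    case 'scheduler':
      // Non-interactive runs have nobody to confirm a prompt.
      return tool.approval === 'never';
    default:
      return true;
  }
}
```

Tune-profile allowlists and `disabledTools` are separate filters applied by name, so they are deliberately left out of this predicate.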

**10. Documentation**

- Add `docs/custom-tools.md` with the full file format, examples (kubectl, gh, jq, curl wrappers), and the security model.
- Update `CLAUDE.md` "Command System" section with a sibling "Custom Tools" subsection.
- Add an example to `.nanocoder/tools/README.md`. **Decision**: the loader does not auto-write this file on first run; just document it.

**11. Tests (AVA)**

- `source/custom-tools/parser.spec.ts` - frontmatter parsing, error cases.
- `source/custom-tools/schema-builder.spec.ts` - JSON Schema output for each parameter type, validator behavior on each constraint.
- `source/custom-tools/template.spec.ts` - interpolation, shell-escape correctness, conditional sections, **injection vectors**.
- `source/custom-tools/handler.spec.ts` - execution, timeout, exit codes, env merging, cwd resolution.
- `source/custom-tools/loader.spec.ts` - directory layering, name shadowing, duplicate handling, malformed file skip.
- One end-to-end spec that loads a real markdown file from a temp dir, registers it through `ToolManager`, and confirms it appears in `getAllTools()` with correct schema.

---

### Phase 2: TypeScript Module Tools (Follow-up)

Defer until Phase 1 ships and we have user feedback. Outline:

- Loader detects `.ts` / `.js` files; dynamic-imports them.
- Validates default export against `NanocoderToolExport` (zod or runtime shape check).
- Exposes `nanocoder/tools` as a package export so authors can `import {tool, jsonSchema}` without deep imports into `source/`.
- One-time-per-directory trust prompt before loading any TS tool (parallel to `useDirectoryTrust`).
- Decide build path: `tsx` programmatic API vs Node native type stripping vs requiring `.js` only. **Recommendation**: require `.js` initially (zero new infra), add `.ts` once we know what users want.

---

## Open Questions

1. **Hot-reload?** Custom commands today require restart. Same for tools, or watch the directory? Recommend: **same as commands** (restart). Hot-reload adds complexity for marginal benefit; ship it later if asked.

2. **Cross-platform shells.** On Windows, `bash` may not exist. Phase 1 should fall back to `sh` and document that custom tools are best-effort on Windows. PowerShell support = Phase 3.

3. **Argument passing model.** Two viable options:
   - **(A) Template body** - `{{ param }}` interpolation as designed above. Familiar, Mustache-like.
   - **(B) Env vars** - params injected as `$NANOCODER_PARAM_<NAME>`, body is a plain script. Safer (no string substitution) but uglier and less Mustache-friendly.

   Recommend (A) for ergonomics, with rigorously tested shell-escape. (B) is the fallback if escape correctness becomes a maintenance liability.

4. **Streaming output.** Some tools (e.g. `tail -f`-ish things) want to stream. Phase 1: capture-then-return. Phase 2 with TS modules: full `streamingFormatter` access. Don't try to invent streaming for shell tools in Phase 1.

5. **Package-level naming collisions.** A custom tool named `read_file` would shadow a built-in. Decision: **disallow** custom tool names that match built-in tool names; `logError` and skip the custom tool. MCP tools already deduplicate by name; do the same here, but always reject the custom tool so that it can never silently replace a built-in.

6. **Config schema in `agents.config.json`.** ~~Do we expose a `customTools.enabled` flag, an extra search path, or per-tool overrides?~~ **Resolved**: none in Phase 1. The existing top-level `disabledTools` (`source/types/config.ts:165`) and `alwaysAllow` (`:159`) lists already cover the two real needs - turning a custom tool off and pre-approving it - without inventing new config. If someone wants to disable all custom tools they delete the directory.
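
For open question 3, option (B) could be sketched as below. The `NANOCODER_PARAM_<NAME>` naming comes from the option's description; JSON-encoding non-string values and skipping absent ones are assumptions, and `paramsToEnv` is a hypothetical helper.

```ts
// Turn validated tool args into environment variables for a plain script body.
function paramsToEnv(args: Record<string, unknown>): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [name, value] of Object.entries(args)) {
    // Absent params stay unset so scripts can test with ${VAR:-default}.
    if (value === undefined || value === null) continue;
    env[`NANOCODER_PARAM_${name.toUpperCase()}`] =
      typeof value === 'string' ? value : JSON.stringify(value);
  }
  return env;
}
```

With this model the body would reference `"$NANOCODER_PARAM_NAMESPACE"` directly, and no value ever passes through string substitution into the script text.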

---

## What This Is Not

- **Not an MCP replacement.** MCP is for tools that need their own process, state, or are shared across Nanocoder users. Custom tools are for personal/project-local helpers.
- **Not a sandbox.** A custom tool runs with the user's full shell privileges. Trust model = "you wrote this file or you trust the repo it came from."
- **Not for distributing tools.** No tool registry, no `nanocoder install <pkg>`. Out of scope for this proposal; any future registry would be separate work on top of this file format.

---

## Rough Sequencing

| Step | Effort | Notes |
|---|---|---|
| Refactor frontmatter parser into shared util | S | Pure refactor, no behavior change |
| Types + schema builder + validator synthesis | M | Mostly mechanical |
| Template interpolation + shell-escape | M | The risky bit - heavy testing |
| Handler (spawn + timeout + capture) | S | Reuse patterns from `execute-bash.tsx` |
| Loader + ToolManager integration | M | Mirror `initializeMCP` |
| `/tools` slash command | S | New command, lazy-registered |
| Mode/profile interaction wiring | S | Two `if`s in `tool-manager.ts` |
| Docs + tests | M | |
| Phase 2 (TS modules) | M-L | Defer until feedback |

