mdkb

Local memory, search, and code intelligence — integrated with Claude Code and Codex via CLI, lifecycle hooks, and MCP.

mdkb indexes your project's docs, source code, and persistent knowledge into a local hybrid search engine — then exposes it to Claude Code, Codex, or any MCP client so the AI finds what it needs instead of guessing.

No cloud APIs. No token-heavy context dumps. Just fast, local, relevant retrieval.

What it does

Hybrid search — BM25 + semantic vectors over your markdown docs
Code intelligence — tree-sitter parsing for 14 languages, call graphs, symbol search
Persistent memory — AI-created knowledge entries that survive across sessions, including time-bound reminder entries with due-date surfacing and prior entries for behavioral patterns (30-day TTL default)
Lifecycle hooks — proactive context injection and reindex enqueue via Claude Code / Codex CLI hooks (no tool call required)
Markdown-native memory — export/import memory entries as a folder of .md files for review, git tracking, or bulk edit
Unified diagnostics — mdkb stats renders a static ASCII dashboard (index health, collections, memory, code, sessions, hooks)
Zero config serving — auto-indexes on startup, watches for file changes, auto-VACUUMs on drift

Recent highlights (3.0.0 / 2.2.0 / 2.0.0)

Full details in CHANGES.md.

3.0.0 (breaking) — Hook dispatch via daemon IPC (Unix socket JSON-RPC instead of in-process execution); reindex-queue.jsonl removed (PostToolUse sends paths directly to daemon watcher channel); hook event logging to hook-events.jsonl; per-event configurable latency thresholds; spawn_blocking for CPU-bound hook work.
2.2.0 — prior entry type for behavioral patterns (30d TTL default, excluded from default searches); mdkb cheatsheet AI-friendly command reference; --entry-type filter on mdkb search; PreToolUse Grep hook suggests CLI commands (works without MCP); optimized injected text (~185 fewer tokens per turn).
2.0.0 (breaking) — mdkb status removed (use mdkb stats); mdkb memory export/import round-trip entries as .md files with YAML frontmatter; unified ASCII stats dashboard with --format json and --no-color.
1.4.0 — reminder entry type with due_in (surfaced in session warmup once due); schema migration v9 → v10; input hardening (reject control chars in titles/tags).

Installation

Homebrew (macOS/Linux)

brew install sstraus/tap/mdkb

From source

cargo install --path .

Pre-built binaries

Download from Releases — macOS (arm64/x64), Linux (arm64/x64), Windows (x64).

Quick Start

cd your-project
mdkb init
mdkb collection add docs ./docs
mdkb update

Connect to Claude Code

# Project-scoped (recommended)
mdkb setup mcp claude --scope local

# Or user-scoped (global)
mdkb setup mcp claude --scope user

Restart Claude Code after setup. The MCP server auto-indexes on startup and watches for file changes.

Hooks (optional, recommended)

MCP gives the assistant tools; hooks make it use them. Hooks also work standalone without MCP — the PreToolUse Grep interceptor suggests CLI commands via current_exe(), and SessionStart points to mdkb cheatsheet for the full command reference.

Register the lifecycle dispatcher so Claude gets a memory warmup at session start, relevant context on every prompt, and Grep-to-mdkb suggestions — without having to call search first:

# Claude Code, project-scoped (writes .claude/settings.local.json)
mdkb setup hooks claude --scope local

# Claude Code, user-scoped / global (writes ~/.claude/settings.json)
mdkb setup hooks claude --scope user

# Codex CLI (writes ~/.codex/hooks.json)
mdkb setup hooks codex

# Preview the merged settings JSON without writing
mdkb setup hooks claude --scope local --dry-run

# Disable specific events at install time
mdkb setup hooks claude --disable post-tool-use
mdkb setup hooks claude --disable user-prompt-submit,post-tool-use

Restart the host CLI after setup. Re-running is idempotent: existing hook entries are replaced, unrelated settings preserved. Events: session-start, user-prompt-submit, pre-tool-use (Grep interceptor), post-tool-use. Full contract, config, and opt-out in docs/hooks.md.

Binary path caveat

mdkb setup mcp … and mdkb setup hooks … hard-code the absolute path of the binary that ran the setup. If you later move or rebuild the binary, the recorded command breaks. For stable global installs, first run cargo install --path . (binary lands in ~/.cargo/bin/mdkb), then run setup from that binary.

Uninstalling

# Remove all Claude Code registrations (MCP + hooks)
mdkb setup remove claude --scope local   # per-project
mdkb setup remove claude --scope user    # global

# Remove individually
mdkb setup remove mcp claude --scope local
mdkb setup remove mcp codex
mdkb setup remove hooks claude --scope local
mdkb setup remove hooks codex

Soft alternatives before uninstalling: create an empty .mdkbignore-hooks marker at the repo root to silence hooks for that working tree, or toggle session_start_enabled / user_prompt_submit_enabled / post_tool_use_enabled in .mdkb/config.toml.

Manual MCP Setup

Add to your Claude Code MCP config (.claude/mcp.json or ~/.claude/mcp.json):

{
  "mcpServers": {
    "mdkb": {
      "type": "stdio",
      "command": "/path/to/mdkb",
      "args": ["serve"],
      "cwd": "/path/to/your/project"
    }
  }
}

The cwd must point to a directory with .mdkb/ initialized.
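The same entry can be generated from a script when several projects need it. The following is a minimal Python sketch, not part of mdkb: `build_mcp_entry` is a hypothetical helper that reproduces the JSON above and refuses a `cwd` that has no `.mdkb/` directory.

```python
import json
from pathlib import Path


def build_mcp_entry(binary: str, project_dir: str) -> dict:
    """Return the mcpServers fragment shown above for one project.

    Hypothetical helper for illustration; mdkb itself does not ship it.
    """
    project = Path(project_dir)
    # The cwd must contain an initialized .mdkb/ directory.
    if not (project / ".mdkb").is_dir():
        raise FileNotFoundError(
            f"{project} has no .mdkb/ directory; run `mdkb init` first"
        )
    return {
        "mcpServers": {
            "mdkb": {
                "type": "stdio",
                "command": binary,
                "args": ["serve"],
                "cwd": str(project),
            }
        }
    }


def render(binary: str, project_dir: str) -> str:
    """Serialize the entry as indented JSON, ready to paste into the config."""
    return json.dumps(build_mcp_entry(binary, project_dir), indent=2)
```

Calling `render("/path/to/mdkb", "/path/to/your/project")` returns the JSON text; merging it into an existing config file is left to the caller.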

MCP Tools (11)

Tool	Description
`search`	Hybrid search across docs+memory (default), or scoped to `docs`, `memory`, `code`, `symbols`. `scope="memory"` accepts `min_confidence` to filter decayed entries
`get`	Retrieve by ID, path, memory slug, glob pattern, or comma-separated list
`code_graph`	Call graph queries: `calls`, `callers`, or `impact` (transitive)
`status`	Index health, collections, and code index stats
`update`	Differential reindex of all collections and source code
`memory_write`	Create or update a memory entry (supports `ttl`, `due_in` for reminders, near-duplicate rejection)
`memory_write_batch`	Create or update multiple memory entries at once (max 20)
`memory_confirm`	Atomic Bayesian signal — `outcome="confirmed"` / `"refuted"` bumps `confirmations` and `last_confirmed_at` without rewriting content
`memory_delete`	Delete a memory entry
`memory_list`	List memory entries sorted by recency, popularity, or creation date
`usage`	Session and lifetime token ledger (per-tool call counts, token totals, truncation stats)

Search Scopes

Scope	What it searches
(omit)	Docs + memory combined (default)
`docs`	Hybrid BM25 + semantic over markdown documents
`memory`	Full-text over memory entries
`symbols`	Exact symbol lookup by name, filterable by `kind` and `file`
`code`	Semantic code search across indexed symbols
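
The `docs` scope merges a BM25 ranking with a semantic (vector) ranking. This README does not say how mdkb fuses the two, so the sketch below uses reciprocal rank fusion (RRF), a common technique, purely as an illustration. The function name, the sample file names, and the constant `k = 60` are assumptions, not mdkb settings.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is a conventional constant, not an
    mdkb setting.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["auth.md", "setup.md", "hooks.md"]     # keyword ranking
vector_hits = ["hooks.md", "auth.md", "memory.md"]  # semantic ranking
fused = rrf_fuse([bm25_hits, vector_hits])
```

Documents that rank well in both lists rise to the top, which is the practical benefit of a hybrid search over either ranking alone.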

Memory

Persistent AI knowledge that survives across sessions — decisions, patterns, solved problems:

Confidence scoring — entries decay over time unless re-confirmed (0-1 score based on age, access count, source type)
Duplicate detection — near-duplicate entries are rejected before writing
Revision tracking — manual entries track up to 3 revision diffs
TTL (time-to-live) — pass ttl (seconds) to memory_write for auto-expiring entries. Expired entries are filtered from searches and listings but remain accessible via get(id) with an [EXPIRED] marker, so they can be inspected or renewed. Omit ttl for permanent entries.
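
The TTL rule above can be modeled in a few lines. This is a Python sketch of the described behavior, not mdkb's implementation: the `Entry` class, the helper names, and the exact placement of the `[EXPIRED]` marker are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    id: str
    title: str
    created_at: float          # Unix timestamp (seconds)
    ttl: Optional[int] = None  # seconds; None means permanent

    def is_expired(self, now: float) -> bool:
        # Assumption: an entry expires once created_at + ttl has passed.
        return self.ttl is not None and now >= self.created_at + self.ttl


def list_entries(entries, now):
    """Listings and searches skip expired entries."""
    return [e for e in entries if not e.is_expired(now)]


def get_entry(entries, entry_id, now):
    """Direct lookup still returns expired entries, flagged with [EXPIRED]."""
    for e in entries:
        if e.id == entry_id:
            return f"[EXPIRED] {e.title}" if e.is_expired(now) else e.title
    return None
```

Keeping expired entries reachable by ID is what makes renewal possible: the entry can be inspected and written again with a fresh `ttl`.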

Entry types: topic (concepts), problem (solutions), decision (architectural choices), reminder (time-bound — see below), prior (behavioral patterns — 30-day TTL default, excluded from default searches), handoff (session handover — no default TTL).

Reminders

Create with memory_write(id, title, content, entry_type="reminder", due_in=<seconds>) (or mdkb memory add --entry-type reminder --due-in N). While due_at > now the reminder is hidden from searches and listings. Once due, it appears in the session warmup index prefixed [reminder:DUE] {id}: {title} so the MCP client sees it on the next turn. The AI is instructed to ask for confirmation before deleting and to snooze via memory_write with a new due_in (same id updates the record).
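
The visibility rule for reminders can be sketched as follows. This Python model is illustrative only: the `Reminder` class and both helpers are hypothetical, and only the `due_at` comparison and the `[reminder:DUE] {id}: {title}` line format come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Reminder:
    id: str
    title: str
    due_at: float  # Unix timestamp: creation time + due_in


def warmup_lines(reminders, now):
    """Return warmup index lines for due reminders; pending ones stay hidden."""
    return [
        f"[reminder:DUE] {r.id}: {r.title}"
        for r in reminders
        if r.due_at <= now
    ]


def snooze(reminder, now, due_in):
    """Model a snooze: writing the same id with a new due_in moves due_at."""
    return Reminder(reminder.id, reminder.title, now + due_in)
```

A snoozed reminder drops out of the warmup index again until its new `due_at` passes.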

Priors

Behavioral pattern entries written by external analyzers (e.g., HUD stop hooks). Create with memory_write(id, title, content, entry_type="prior") or mdkb memory add <id> --entry-type prior. Priors default to 30-day TTL and are excluded from all default searches — query them explicitly with mdkb search --scope memory --entry-type prior "query" or search(query, scope="memory", entry_type="prior") via MCP.
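
The opt-in behavior of priors amounts to a filter on the entry type. The sketch below is a simplified Python model with a hypothetical `search_memory` function and plain substring matching standing in for mdkb's real search.

```python
def search_memory(entries, query, entry_type=None):
    """Substring match over (id, entry_type, content) tuples.

    Without entry_type, priors are skipped; with it, only entries of
    that type are returned.
    """
    hits = []
    for entry_id, etype, content in entries:
        if entry_type is None and etype == "prior":
            continue  # priors are opt-in
        if entry_type is not None and etype != entry_type:
            continue
        if query.lower() in content.lower():
            hits.append(entry_id)
    return hits


entries = [
    ("auth-patterns", "topic", "Always use PKCE for public clients"),
    ("prefers-pkce", "prior", "User tends to ask about PKCE before OAuth"),
]
```

Here `search_memory(entries, "pkce")` returns only the topic entry, while passing `entry_type="prior"` returns only the prior.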

Handoffs

Session context transfer entries. Create with memory_write(id, title, content, entry_type="handoff") or mdkb memory add <id> --entry-type handoff. Use --file <path> (CLI) or source_file (MCP) to read content from a file — saves tokens when agents write handoffs to the filesystem. The file path is persisted as source_path metadata. Handoffs have no default TTL; confidence decay handles relevance naturally.

Source types control confidence weighting:

Source Type	Multiplier	Use Case
`official_docs`	1.0	Verified documentation
`user_statement`	0.85	Human-stated facts (default)
`auto_extracted`	0.70	Automated knowledge capture
`inference`	0.65	AI-inferred knowledge
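
To show how a source multiplier could combine with decay, here is a Python sketch. Only the four multipliers come from the table above. The decay formula, the 90-day half-life, and the way access count slows decay are all assumptions made for illustration; mdkb's actual scoring is not documented here.

```python
import math

# Multipliers from the table above.
SOURCE_MULTIPLIER = {
    "official_docs": 1.0,
    "user_statement": 0.85,
    "auto_extracted": 0.70,
    "inference": 0.65,
}


def confidence(age_days, access_count, source_type="user_statement",
               half_life_days=90.0):
    """Illustrative 0-1 confidence score.

    ASSUMPTIONS: exponential decay with a 90-day half-life, and each
    access slowing decay slightly. Neither is taken from mdkb.
    """
    effective_age = age_days / (1.0 + math.log1p(access_count))
    decay = 0.5 ** (effective_age / half_life_days)
    score = SOURCE_MULTIPLIER[source_type] * decay
    return max(0.0, min(1.0, score))
```

Under this model, a `min_confidence` filter on `search` would simply drop entries whose score has fallen below the threshold.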

Code Intelligence

Tree-sitter parsing for 14 languages: Rust, Go, TypeScript, JavaScript, Python, Java, Kotlin, C, C++, C#, PHP, Swift, Lua, and GDScript.

Substring search — find symbols by partial name (FTS5 trigram, works from 3 characters)
Semantic code search — find conceptually similar code using embeddings
Persistent call graph — function calls, callers, and transitive impact radius survive restarts
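
The three-character minimum for substring search follows from how trigram indexes work: a query shorter than one trigram has nothing to look up. The Python sketch below imitates the idea in memory. It is not the FTS5 tokenizer, and the function names are hypothetical.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of 3-character windows in text (lowercased)."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}


def build_index(symbols):
    """Map each trigram to the set of symbol names containing it."""
    index = defaultdict(set)
    for name in symbols:
        for gram in trigrams(name):
            index[gram].add(name)
    return index


def substring_search(index, query):
    """Find symbols containing query; needs at least 3 characters."""
    grams = trigrams(query)
    if not grams:
        return []  # shorter than one trigram: nothing to look up
    # Candidates must contain every trigram of the query...
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # ...and are then verified, since shared trigrams do not guarantee a match.
    return sorted(n for n in candidates if query.lower() in n.lower())


index = build_index(["handle_get", "handle_post", "get_handler", "init"])
```

With this index, `substring_search(index, "handle")` matches all three handler symbols, while a two-character query such as `"ge"` returns nothing.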

Generate semantic embeddings (downloads ~30MB ONNX model on first run):

mdkb embed

CLI Reference

Search

mdkb search "authentication flow"
mdkb search "handler" --scope symbols --kind function
mdkb search "auth handler" --scope code

Collections

mdkb collection add <name> <path> [--pattern <glob>]
mdkb collection remove <name>
mdkb collection rename <old> <new>

Document Retrieval

mdkb get <id|path|slug>
mdkb get 42 --lines 10:50
mdkb get "docs/*.md"

Code Commands

mdkb code index
mdkb code search "handler" --kind fn
mdkb code calls main
mdkb code callers handle_get
mdkb code impact init --depth 5
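
A depth-limited `impact` query can be pictured as a breadth-first walk over the reversed call graph. The Python sketch below assumes that impact means "everything that transitively calls the target"; the function and the sample graph are hypothetical and say nothing about how mdkb stores its graph.

```python
from collections import deque


def impact(calls, target, depth=5):
    """Return functions that transitively call target, up to depth hops.

    calls maps caller -> list of callees.
    """
    # Invert the graph: callee -> set of callers.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    seen = {target}
    affected = []
    queue = deque([(target, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # depth limit reached on this branch
        for caller in sorted(callers.get(node, ())):
            if caller not in seen:  # also guards against cycles
                seen.add(caller)
                affected.append(caller)
                queue.append((caller, dist + 1))
    return affected


calls = {"main": ["run"], "run": ["init", "load"], "cli": ["init"], "init": ["log"]}
```

For this graph, `impact(calls, "init")` returns `cli` and `run` (direct callers) followed by `main`, and `depth=1` stops after the direct callers.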

Memory

mdkb memory add auth-patterns -t "OAuth2 PKCE Flow" -T topic --tags auth,security \
  -c "Always use PKCE for public clients..."
mdkb memory add pay-bill -t "Pay electricity bill" -T reminder --due-in 86400 \
  -c "Monthly utility payment"
mdkb memory list
mdkb memory search "authentication"
mdkb memory history auth-patterns

# Export all entries to .mdkb/memory/entries/ (one .md file per entry)
mdkb memory export
mdkb memory export --dir ./memories --include-expired --overwrite

# Import from a markdown folder (auto-detected) or legacy JSON file
mdkb memory import .mdkb/memory/entries --skip-duplicates
mdkb memory import entries.json --dry-run --skip-duplicates

Stats

mdkb stats is the unified diagnostic dashboard introduced in 2.0.0. It replaces the former mdkb status, which was removed outright rather than aliased.

# Unified ASCII diagnostic dashboard
mdkb stats

# Machine-readable JSON output (safe for pipes and scripts)
mdkb stats --format json

# Plain text (no ANSI color, no Unicode box-drawing)
mdkb stats --no-color

The report is stacked: header (repo, version, db size, last update) → index health → collections → memory (by entry type, reminders DUE / upcoming 7d) → code (by language, top files by tokens) → sessions (totals, top tools) → hooks (slow events last 7d, reindex queue pending). Output auto-detects whether stdout is a TTY; the JSON format is stable for scripting.

Configuration

Configuration lives in .mdkb/config.toml:

[search]
default_limit = 10

[indexing]
debounce_ms = 100
# When true, the doc/collection walker honors .gitignore.
# When false (default), it reads .mdkbignore instead.
respect_gitignore = false

[code.indexing]
# When true (default), the code walker honors .gitignore.
# When false, it reads .mdkbignore instead.
respect_gitignore = true

[mcp]
max_response_tokens = 50000
max_document_tokens = 10000

Environment overrides: MDKB_SEARCH_DEFAULT_LIMIT=20, MDKB_INDEXING_DEBOUNCE_MS=200.
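
The two examples suggest a `MDKB_<SECTION>_<KEY>` naming pattern. The Python sketch below applies overrides under that inferred convention; it is an assumption drawn from the examples, not a documented rule, and it ignores nested sections such as `[code.indexing]`.

```python
import copy


def apply_env_overrides(config, environ, prefix="MDKB_"):
    """Return a copy of config with MDKB_<SECTION>_<KEY> variables applied.

    ASSUMPTION: the naming convention is inferred from the two examples
    above; nested sections are not handled.
    """
    result = copy.deepcopy(config)
    for section, values in result.items():
        for key, current in values.items():
            name = f"{prefix}{section}_{key}".upper()
            if name not in environ:
                continue
            raw = environ[name]
            if isinstance(current, bool):
                # bool("false") would be True, so parse booleans explicitly.
                values[key] = raw.strip().lower() in ("1", "true", "yes")
            else:
                # Coerce to the type of the existing value (int stays int).
                values[key] = type(current)(raw)
    return result


defaults = {
    "search": {"default_limit": 10},
    "indexing": {"debounce_ms": 100, "respect_gitignore": False},
}
```

Looking names up from the known config keys, rather than splitting the variable name on underscores, avoids ambiguity for keys like `debounce_ms` that contain an underscore themselves.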

Controlling what gets indexed

Both the document walker (mdkb update) and the code walker (mdkb code index) share a unified ignore system:

Mode	Files honored	Use when
`respect_gitignore = true`	`.gitignore` (+ `# mdkb:index` force-include)	Your ignore rules are already correct for indexing.
`respect_gitignore = false`	`.mdkbignore` only	You want to index content that `.gitignore` excludes (e.g. `stories/`, generated sources), or you need a different ignore scope from git.

Defaults:

Code indexing: respect_gitignore = true — source trees usually want .gitignore honored (skip target/, node_modules/, etc.).
Document indexing: respect_gitignore = false — project knowledge often lives in gitignored folders (plans, stories, drafts).

# mdkb:index annotation (only active when respect_gitignore = true):

Force-include a gitignored path by prefixing it with a # mdkb:index comment line in .gitignore:

# mdkb:index
generated/
# mdkb:index
docs/api/*.md

Blank lines between the annotation and the pattern are tolerated. The annotation is case-insensitive.
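
The annotation rules can be expressed as a small parser. This Python sketch extracts the force-included patterns from `.gitignore` text. The function name is hypothetical, and two details are assumptions: an unrelated comment cancels a pending annotation, and the space after `#` is optional.

```python
def force_included(gitignore_text):
    """Return patterns that directly follow a `# mdkb:index` annotation.

    Blank lines between annotation and pattern are skipped, and the
    annotation matches case-insensitively, as described above.
    """
    patterns = []
    pending = False  # True after an annotation, until a pattern is seen
    for raw in gitignore_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines do not cancel a pending annotation
        if line.startswith("#"):
            # Assumption: any other comment resets the pending annotation.
            pending = line[1:].strip().lower() == "mdkb:index"
            continue
        if pending:
            patterns.append(line)
            pending = False
    return patterns


sample = (
    "target/\n"
    "# mdkb:index\n"
    "generated/\n"
    "# MDKB:Index\n"
    "\n"
    "docs/api/*.md\n"
    "# build output\n"
    "dist/\n"
)
```

For `sample`, the function returns `generated/` and `docs/api/*.md`; `target/` and `dist/` stay ignored because no annotation precedes them.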

.mdkbignore (only active when respect_gitignore = false):

Uses the same syntax as .gitignore, including !pattern for re-inclusion. Place one at the repo root.

Storage

All data stays local in .mdkb/:

.mdkb/
├── config.toml
├── index.sqlite      # FTS5 + document metadata
├── code.sqlite       # Source code symbols + call graph
└── memory/           # Memory entries (markdown files)

The embedding model (AllMiniLML6V2, ~30MB ONNX) is downloaded on first use and cached locally.

Add .mdkb/ to .gitignore — it can be regenerated with mdkb update && mdkb embed.

License

MIT
