SeekLink is a local semantic search CLI for Markdown vaults. It indexes a folder
of .md files, searches with hybrid keyword + vector retrieval, and returns
line-anchored results that humans and agents can read with simple shell
commands.
It is built for personal knowledge bases, Obsidian-compatible vaults, bilingual English/Chinese notes, and local agent workflows. It is also a useful search layer for Markdown wiki patterns such as Andrej Karpathy's llm-wiki: an agent can search existing pages, read precise line windows, then update the wiki without sending the vault to a hosted service.
Everything runs locally. No API key. No cloud search service. No Obsidian plugin required.
uv tool install seeklink
# or
pip install seeklink

# 1. Build the index first.
seeklink index --vault /path/to/vault
# 2. Search it.
seeklink search "machine learning" --vault /path/to/vault

Daily use is simpler if you set a default vault:
export SEEKLINK_VAULT=/path/to/vault
seeklink index
seeklink search "agent memory systems"
seeklink get notes/agent-memory-patterns.md:1 -C 20

seeklink search and seeklink index automatically use a resident daemon when
SEEKLINK_VAULT is set and --vault is not passed. The daemon keeps the
embedder and optional reranker in memory. seeklink status and seeklink get
always stay cold-start: status only reads SQLite metadata, and get reads the
file directly from disk.
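The routing rule above can be restated as a small sketch. The function name `uses_daemon` and its parameters are hypothetical and not part of SeekLink; the code only mirrors the documented conditions and says nothing about how the real CLI decides.

```python
import os

# Commands that may be served by the resident daemon, per the text above.
DAEMON_COMMANDS = {"search", "index"}


def uses_daemon(command, vault_flag=None, env=None):
    """Return True when a call would take the daemon path (illustrative only)."""
    env = os.environ if env is None else env
    if command not in DAEMON_COMMANDS:
        return False  # status and get always cold-start
    if vault_flag is not None:
        return False  # passing --vault forces the one-shot path
    return bool(env.get("SEEKLINK_VAULT"))
```

An agent that wants warm-model latency would therefore export SEEKLINK_VAULT once and omit --vault on later search and index calls.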
Text search output is stable:
SCORE PATH[:LINE] TITLE
<content preview, one line, up to 120 chars>
- PATH is relative to the vault root.
- LINE is 1-indexed and points to the best matching chunk in the current file.
- Exit code is 0 for success, including no results, and 1 for vault/config errors or missing files.
- Scores are useful for sorting within one query. Do not compare scores across reranker-enabled and reranker-disabled runs.
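For scripts that consume the text format rather than JSON, a parser could look roughly like the sketch below. It assumes the three header fields are separated by whitespace and that PATH contains no spaces, neither of which is confirmed above; `Hit` and `parse_hit` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    score: float
    path: str
    line: Optional[int]
    title: str
    preview: str


def parse_hit(header: str, preview: str = "") -> Hit:
    """Parse one 'SCORE PATH[:LINE] TITLE' header plus its preview line."""
    # Assumption: whitespace-separated fields, PATH without spaces.
    parts = header.split(maxsplit=2)
    score_text, location = parts[0], parts[1]
    title = parts[2] if len(parts) > 2 else ""
    # Only treat the suffix as LINE when it is purely numeric.
    path, sep, tail = location.rpartition(":")
    if sep and tail.isdigit():
        line = int(tail)
    else:
        path, line = location, None
    return Hit(float(score_text), path, line, title, preview.strip())
```

When paths may contain spaces, the --json output is the safer surface to parse.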
Use JSON when an agent needs structured output:
seeklink search "agent memory systems" --vault PATH --json
seeklink status --vault PATH --json

seeklink search "query" --vault PATH [options]

Options:
--top-k N           Number of results. Default: 10.
--json              Emit one machine-readable JSON object.
--tags TAG [TAG]    Filter by tags. AND semantics.
--folder PREFIX     Filter by vault-relative folder prefix.
--rerank-k N|auto   Rerank candidate budget. Default: auto.
--no-rerank         Skip cross-encoder reranking for this query.
--title-weight F    Override title/alias/heading channel weight. Default: 1.5.
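The options above can be assembled into a subprocess call from an agent. The following is a minimal sketch: `build_search_argv` and `run_search` are hypothetical helper names, and the shape of the JSON object is not assumed, so the parsed value is returned untouched.

```python
import json
import subprocess


def build_search_argv(query, vault, top_k=None, tags=(), folder=None, rerank=True):
    """Build the argument list for 'seeklink search' using the documented flags."""
    argv = ["seeklink", "search", query, "--vault", vault, "--json"]
    if top_k is not None:
        argv += ["--top-k", str(top_k)]
    if tags:
        argv += ["--tags", *tags]  # AND semantics across the listed tags
    if folder:
        argv += ["--folder", folder]
    if not rerank:
        argv.append("--no-rerank")
    return argv


def run_search(query, vault, **options):
    """Run the search and return the parsed JSON object (schema not assumed)."""
    proc = subprocess.run(
        build_search_argv(query, vault, **options),
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:  # non-zero signals vault/config errors
        raise RuntimeError(proc.stderr.strip() or "seeklink search failed")
    return json.loads(proc.stdout)
```

Passing the arguments as a list avoids shell quoting problems with queries that contain spaces or quotes.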
Read a precise file window without using the database or daemon:
seeklink get notes/spaced-repetition.md
seeklink get notes/spaced-repetition.md:12
seeklink get notes/spaced-repetition.md:12 -l 40
seeklink get notes/spaced-repetition.md:12 -C 20

-l/--lines prints lines starting at LINE. -C/--context prints lines before
and after LINE, grep-style. Path escapes such as ../.. are rejected.
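The window arithmetic and the path check described above could be sketched as follows. This assumes the numeric argument to -l is a line count and the argument to -C is the number of lines on each side; `line_window` and `resolve_inside` are hypothetical names, not SeekLink functions.

```python
from pathlib import Path


def line_window(total_lines, line, lines=None, context=None):
    """Return (first, last) as 1-indexed inclusive bounds, clamped to the file."""
    if context is not None:
        # grep-style: `context` lines on each side of LINE
        return max(1, line - context), min(total_lines, line + context)
    if lines is not None:
        # `lines` lines starting at LINE
        return line, min(total_lines, line + lines - 1)
    raise ValueError("pass either lines or context")


def resolve_inside(vault, relative):
    """Resolve a vault-relative path and reject anything that escapes the vault."""
    root = Path(vault).resolve()
    target = (root / relative).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"path escapes vault: {relative}")
    return target
```

Resolving before comparing means that sequences such as ../.. and symlinks pointing outside the vault are both caught by the same check.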
seeklink status --vault PATH
seeklink status --vault PATH --json

Status reports index counts, model names, SQLite WAL status, and freshness warnings. It does not load the embedding or reranking models.
seeklink index --vault PATH
seeklink index path/to/file.md --vault PATH

Full-vault indexing skips unchanged files by content hash. Single-file indexing updates one Markdown file.
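Skipping unchanged files by content hash can be illustrated with the sketch below. The hash function (SHA-256) and the in-memory `known_hashes` mapping are assumptions made for the example; the text above does not say which hash SeekLink uses or how it stores it.

```python
import hashlib
from pathlib import Path


def content_hash(path):
    """Hash the file bytes; SHA-256 is an assumption, not SeekLink's documented choice."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def files_to_reindex(vault, known_hashes):
    """Return vault-relative .md paths whose content differs from the stored hash."""
    root = Path(vault)
    changed = []
    for md in sorted(root.rglob("*.md")):
        rel = md.relative_to(root).as_posix()
        if known_hashes.get(rel) != content_hash(md):
            changed.append(rel)  # new or modified file
    return changed
```

Hashing content rather than comparing modification times means a file that was touched or re-synced without changes is still skipped.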
seeklink daemon --vault PATH

You normally do not run this directly. search and index auto-spawn and
auto-restart the daemon when appropriate. Passing --vault to search or
index forces a one-shot cold-start path because the daemon is bound to one
vault at startup.
SeekLink fuses four channels with Reciprocal Rank Fusion:
| Channel | Purpose |
|---|---|
| BM25 / FTS5 | Exact words, code terms, acronyms, CJK lexical matches |
| Vector search | Semantic matches across different wording |
| Title / aliases / headings | Exact note and section lookup |
| Wikilink indegree | Small graph-quality prior from existing [[links]] |
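For readers unfamiliar with Reciprocal Rank Fusion, the general technique can be sketched as below. This is a generic weighted RRF, not SeekLink's implementation: the constant `k=60` is the value commonly used in the RRF literature, and the channel names and the function `rrf_fuse` are illustrative assumptions. Only the 1.5 title weight comes from the option defaults listed earlier.

```python
from collections import defaultdict


def rrf_fuse(rankings, weights=None, k=60):
    """Fuse ranked ID lists: each channel adds weight / (k + rank) per document."""
    weights = weights or {}
    scores = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        w = weights.get(channel, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += w / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical per-channel rankings for one query.
fused = rrf_fuse(
    {"bm25": ["a", "b", "c"], "vector": ["b", "a", "d"], "title": ["b"]},
    weights={"title": 1.5},
)
```

Because RRF uses only ranks, channels with incomparable raw scores (BM25 values, cosine similarities, link counts) can be combined without normalization.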
The default embedder is jinaai/jina-embeddings-v2-base-zh through
fastembed. CJK full-text search uses a jieba FTS5 tokenizer when the local
Python/SQLite build can safely register it; otherwise SeekLink falls back to
SQLite's built-in trigram tokenizer instead of crashing.
On Apple Silicon, SeekLink can rerank candidates with
mlx-community/Qwen3-Reranker-0.6B-mxfp8. Reranking is local and optional. Use
--no-rerank for one query or set SEEKLINK_RERANKER_MODEL="" to disable it
globally.
Markdown frontmatter is optional. When present, SeekLink uses it for tags and aliases:
---
tags: [ai, memory]
aliases: [LLM memory, agent memory]
---

- tags support filtered search: seeklink search "memory" --tags ai
- aliases are indexed for search and used when resolving wikilinks.
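To make the frontmatter shape concrete, here is a deliberately small reader that handles only the inline-list form shown above. It is a sketch, not SeekLink's parser: `read_frontmatter` is a hypothetical name, and real YAML (block lists, quoting, nested keys) needs a proper YAML library.

```python
def read_frontmatter(text):
    """Extract tags and aliases from a leading '---' block with inline lists."""
    meta = {"tags": [], "aliases": []}
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return meta  # frontmatter is optional
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key in meta and value.startswith("[") and value.endswith("]"):
            meta[key] = [item.strip() for item in value[1:-1].split(",") if item.strip()]
    return meta
```

Notes without frontmatter simply yield empty tag and alias lists, which matches the statement that frontmatter is optional.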
SeekLink writes one SQLite database inside the vault:
/path/to/vault/.seeklink/seeklink.db
The database contains source metadata, chunks, FTS5 tables, sqlite-vec vectors,
and a wikilink graph. Delete .seeklink/ and run seeklink index to rebuild.
| Area | Status |
|---|---|
| Python | 3.11, 3.12, 3.13, 3.14 |
| OS | macOS and Linux |
| Windows | Not supported as a first-class path |
| File format | Markdown .md |
| Vault style | Plain folder or Obsidian-compatible vault |
| CJK | Native path via jieba, with trigram fallback on static SQLite builds |
| Reranker | Apple Silicon via MLX; disabled elsewhere |
| Daemon | Single vault per machine |

SeekLink does not aim to provide:
- Hosted or synced multi-user search.
- Non-Markdown sources without conversion.
- A GUI or Obsidian plugin.
- Sub-millisecond search over millions of notes.
- Cloud embedding or reranking APIs.
Agents can use SeekLink through ordinary subprocess calls:
seeklink status --vault PATH
seeklink index --vault PATH
seeklink search "query" --vault PATH --json
seeklink get PATH:LINE -C 20 --vault PATH

For hot loops, the daemon exposes a length-prefixed JSON protocol over the Unix
socket at ~/.rhizome/seeklink.sock. Most agents should prefer the CLI JSON
surface unless they specifically need socket-level latency.
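Length-prefixed JSON framing in general works as in the sketch below. The wire details are assumptions: a 4-byte big-endian length prefix is used here for illustration, and the message fields in the usage note are invented. The daemon's actual prefix width, byte order, and message schema are not specified in this document, so check llms.txt before writing a real client.

```python
import json
import struct

PREFIX = struct.Struct(">I")  # assumption: 4-byte big-endian length


def encode_frame(message):
    """Serialize a message as <length prefix><UTF-8 JSON payload>."""
    payload = json.dumps(message).encode("utf-8")
    return PREFIX.pack(len(payload)) + payload


def decode_frame(buffer):
    """Return (message, remaining_bytes), or (None, buffer) if the frame is incomplete."""
    if len(buffer) < PREFIX.size:
        return None, buffer
    (size,) = PREFIX.unpack(buffer[:PREFIX.size])
    end = PREFIX.size + size
    if len(buffer) < end:
        return None, buffer
    return json.loads(buffer[PREFIX.size:end].decode("utf-8")), buffer[end:]
```

A client would accumulate bytes read from the socket and call `decode_frame` until it returns a message, for example after sending `encode_frame({"cmd": "search", "query": "agent memory"})`, where both field names are hypothetical.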
See llms.txt for the compact agent contract.
Search-quality tests live in tests/blind/; the method is documented in
docs/blind-test.md. Release claims should be backed by
the bundled fixture queries or by clearly labeled private-vault measurements.
git clone https://github.com/simonsysun/seeklink
cd seeklink
uv sync --dev
uv run python -m pytest tests/ -q

Keep runtime dependencies small, keep public docs user-facing, and add a
CHANGELOG.md entry for user-visible changes.
MIT