VEKTOR MEMORY - Slipstream

Persistent memory for AI agents. Local-first. No cloud. No amnesia.


Documentation · Install · Quick Start · MCP Tools · Pricing



VEKTOR fixes the architecture. Not the prompt.

The problems are architectural, not instructional. You cannot prompt your way out of a stateless architecture.

Session starts        ──►  Reconstruct context from logs   ──►  10,000–30,000 tokens burned
                                                                 before a single line of work runs

Cron job fires        ──►  Agent has no memory of last run  ──►  Repeats completed work
                                                                 Loops. Bills stack up.

Add more guardrails   ──►  Longer prompts                   ──►  More tokens
                      ──►  More complexity                  ──►  More failure surface
                      ──►  More maintenance                 ──►  Less time saved

The control paradox: the more control you try to add through prompts, the more expensive and fragile the system becomes. You end up spending more time fixing the automation than the automation saves.


The Solution Stack

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│   DXT       drag-and-drop install · 44 tools registered     │
│             automatically · no JSON editing                 │
│                                                             │
│   MCP       stateless on-demand tool invocation             │
│             no persistent process between runs              │
│             agent wakes, works, terminates cleanly          │
│                                                             │
│   Skill     ~150 tokens of scoped context injected          │
│   Files     only when relevant · unloaded when done         │
│             90% less context overhead per session           │
│                                                             │
│   VEKTOR ◄─ persistent memory graph · BM25 + vector RRF     │
│             recall · self-organising intelligence layer     │
│             state that actually survives between sessions   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

| Layer | Solves | Token impact |
|---|---|---|
| DXT | Setup friction, misconfigured tools | Surfaces only relevant tools per task |
| MCP | Persistent process requirement, cold starts | Stateless invocation on demand |
| Skill Files | Monster prompts, competing instructions | 150 tokens vs 8,000–20,000 |
| VEKTOR | Session amnesia, broken cron jobs, control paradox | 250–4,000 tokens regardless of DB size |

Install

npm install -g vektor-slipstream
npx vektor setup

Or drag vektor-slipstream.dxt directly into Claude Desktop. All 44 tools register automatically. No JSON editing. No path configuration.


Quick Start

const { createMemory } = require('vektor-slipstream');

// CommonJS modules have no top-level await, so the calls run inside an async function.
async function main() {
  const memory = await createMemory({
    agentId: 'my-agent',
    licenceKey: process.env.VEKTOR_LICENCE_KEY,
  });

  // Store a memory
  await memory.remember('User prefers TypeScript. Deployed to prod on Friday.');

  // Recall by semantic similarity -- sub-1ms, fully local
  const results = await memory.recall('deployment preferences', 5);
  // → [{ content, score, id, timestamp }]

  // Traverse the associative memory graph
  const graph = await memory.graph('TypeScript', { hops: 2 });

  // What changed in the last 7 days?
  const delta = await memory.delta('project decisions', 7);

  // Morning briefing from recent memories
  const brief = await memory.briefing();
}

main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});

Before vs After

| | Without VEKTOR | With VEKTOR |
|---|---|---|
| Context cost per session | 15,000–50,000 tokens reconstructing history | 250–4,000 tokens for full semantic recall |
| Cron jobs | Agent repeats completed work -- no memory of last run | Recalls previous run outcome in one call |
| Configuration memory | Forgotten every session | Graph surfaces what worked last time automatically |
| Autonomy vs control | Either full autonomy (dangerous) or manual gates (slow) | Agent learns from outcome history when to proceed vs escalate |
| Between-session state | Persistent process required or state is lost | SQLite persists -- stateless invocation, stateful recall |
| Embedding cost | Cloud API call on every store and recall | $0 -- fully local ONNX, no API key required |
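
The "Cron jobs" row can be illustrated with a small sketch. It uses an in-memory stand-in for the memory object so it runs without a licence key. The `remember` and `recall` calls mirror the Quick Start, but the stub's substring matching, the `run:<job>:done` marker format, and the `runOnce` helper are all assumptions made for illustration, not part of the VEKTOR API.

```javascript
// In-memory stand-in for the object returned by createMemory().
// Only remember() and recall() are modelled, and matching is a naive
// substring search rather than real semantic recall.
function createStubMemory() {
  const items = [];
  return {
    async remember(content) {
      items.push({ content, timestamp: Date.now() });
    },
    async recall(query, limit = 5) {
      return items.filter((m) => m.content.includes(query)).slice(-limit);
    },
  };
}

// Hypothetical helper: run `task` only if no completed run is recorded for jobKey.
async function runOnce(memory, jobKey, task) {
  const marker = `run:${jobKey}:done`;
  const previous = await memory.recall(marker, 1);
  if (previous.length > 0) {
    return { skipped: true };
  }
  const result = await task();
  await memory.remember(`${marker} result=${JSON.stringify(result)}`);
  return { skipped: false, result };
}

// Fire the same job twice, as a cron schedule might after a restart.
async function demo() {
  const memory = createStubMemory();
  const job = async () => ({ invoicesSent: 3 });
  const first = await runOnce(memory, 'invoices-2024-06-01', job);
  const second = await runOnce(memory, 'invoices-2024-06-01', job);
  return { first, second };
}
```

With real semantic recall a fuzzy match is not proof that this exact job ran, so a production version would probably compare the recalled content against the marker before skipping.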

Session Flow

Task triggered (cron / webhook / user action)
        │
        ▼
Skill File injected based on task context          ~150 tokens
        │
        ▼
vektor_recall_rrf called                           ~800 tokens
Top-10 semantically relevant memories returned
        │
        ▼
Agent classifies situation from memory history
        │
   ┌────┴─────────────────────┐
   ▼                          ▼
familiar pattern          novel / previously failed
proceed autonomously       surface for human review
   │                          │
   └────────────┬─────────────┘
                ▼
Execute task via MCP tools
        │
        ▼
Result stored via vektor_store
Memory graph updated with outcome
        │
        ▼
Session ends · SQLite persists everything
        │
        ▼
Next invocation: same startup cost · full outcome history available

Total context overhead for a routine task: under 2,000 tokens. The same task with a monolithic system prompt and history reconstruction: 15,000–50,000 tokens, with no retention of outcome.
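
The branch in the middle of the flow (familiar pattern vs novel or previously failed) could be sketched as below. The `outcome` field, the `minSuccesses` threshold, and the `classifySituation` name are hypothetical; the README does not say how the agent actually makes this decision.

```javascript
// Decide whether to proceed autonomously or surface the task for review,
// based on outcome records recalled from memory.
// The record shape { outcome: 'success' | 'failure' } is an assumption.
function classifySituation(history, minSuccesses = 2) {
  if (history.length === 0) {
    return 'escalate'; // novel: nothing recalled for this task
  }
  const failures = history.filter((h) => h.outcome === 'failure').length;
  if (failures > 0) {
    return 'escalate'; // previously failed at least once
  }
  const successes = history.filter((h) => h.outcome === 'success').length;
  return successes >= minSuccesses ? 'proceed' : 'escalate';
}

const familiar = [{ outcome: 'success' }, { outcome: 'success' }];
const risky = [{ outcome: 'success' }, { outcome: 'failure' }];
console.log(classifySituation(familiar), classifySituation(risky), classifySituation([]));
```

Requiring more than one prior success before proceeding is a conservative default; the threshold is the knob to tune between autonomy and control.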


Performance

| Metric | Value |
|---|---|
| Recall latency | sub-1ms (local SQLite + ONNX) |
| Embedding cost | $0 -- fully local ONNX |
| Embedding latency | ~10ms GPU / ~25ms CPU |
| LoCoMo benchmark | 66.9% adjusted judge accuracy |
| Min tokens for full recall | 250 |
| Max tokens regardless of DB size | 4,000 |
| First run | ~2 min (downloads ~25MB model once) |
| Subsequent boots | <100ms |
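
One way a recall cost can stay bounded regardless of database size is to pack results into a fixed token budget. The sketch below shows that idea only; it is not VEKTOR's implementation. The 4-characters-per-token estimate is a rough heuristic, and `packWithinBudget` is a hypothetical helper.

```javascript
// Rough token estimate: about 4 characters per token for English text.
const estimateTokens = (text) => Math.ceil(text.length / 4);

// Greedily keep the highest-scoring results that still fit in the budget.
function packWithinBudget(results, maxTokens = 4000) {
  const picked = [];
  let used = 0;
  const byScore = [...results].sort((a, b) => b.score - a.score);
  for (const r of byScore) {
    const cost = estimateTokens(r.content);
    if (used + cost > maxTokens) continue; // skip items that do not fit
    picked.push(r);
    used += cost;
  }
  return { picked, used };
}

const sample = [
  { content: 'a'.repeat(40), score: 0.9 },
  { content: 'b'.repeat(40), score: 0.8 },
  { content: 'c'.repeat(20), score: 0.7 },
];
console.log(packWithinBudget(sample, 15).used);
```

Because the budget is fixed, the number of tokens handed to the model depends on `maxTokens`, not on how many memories are stored.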

LoCoMo benchmark results

| Category | Accuracy |
|---|---|
| Multi-hop | 79.1% |
| Adversarial | 70.4% |
| Single-hop | 51.6% |
| Temporal | 46.2% |
| Adjusted total | 66.9% |

CLI Chat -- Persistent Memory Across Every Session

npx vektor chat                                        # auto-detects Ollama
npx vektor chat --provider claude                      # Anthropic Claude
npx vektor chat --provider groq --model llama-3.3-70b-versatile
npx vektor chat --provider openai
npx vektor chat --provider gemini

| Provider | Details |
|---|---|
| ollama | Default -- free, local, no API key. Auto-detects best model. |
| claude | Anthropic Claude -- set ANTHROPIC_API_KEY |
| openai | OpenAI GPT -- set OPENAI_API_KEY |
| groq | Groq LLaMA -- set GROQ_API_KEY (free tier available) |
| gemini | Google Gemini -- set GEMINI_API_KEY |

In-chat commands:

| Command | Action |
|---|---|
| /recall <query> | Search memory mid-conversation |
| /stats | Node count, edges, pinned memories |
| /briefing | Generate memory briefing inline |
| /exit | Exit (Ctrl+C also works) |

One-liner commands:

# Store facts
npx vektor remember "I prefer TypeScript over JavaScript"
npx vektor remember "deadline is Friday" --importance 5
cat meeting-notes.txt | npx vektor remember

# Query
npx vektor ask "what stack am I using?"
npx vektor ask "what did we decide about the database?"

# Autonomous agent
npx vektor agent "summarise everything I know about project Alpha"
npx vektor agent "research AI memory tools" --steps 15 --provider groq

Claude Desktop Extension (DXT)

Install the .dxt extension for zero-config persistent memory in every Claude Desktop session.

Install: drag vektor-slipstream.dxt onto the Claude Desktop Extensions page.

Once installed, Claude automatically:

  • Recalls relevant context at session start
  • Stores facts and decisions during conversation
  • Summarises and consolidates at session end

All 44 tools available. No configuration beyond your licence key.

Download: vektormemory.com/docs/dxt


MCP Tools -- All 44

Memory

| Tool | Function |
|---|---|
| vektor_recall | Semantic + BM25 + graph search across memory |
| vektor_recall_rrf | BM25+RRF dual-channel recall with cross-encoder rerank |
| vektor_store | Store memory with importance score |
| vektor_ingest | Batch ingest conversation turns with session date |
| vektor_graph | Traverse associative memory graph |
| vektor_delta | See what changed on a topic over time |
| vektor_briefing | Generate briefing from recent memories |
| vektor_stats | Memory DB stats -- node count, edges, entities |
| vektor_timeline | Query memories by date range |

Cloak -- Stealth Browser, SSH, Fetch

| Tool | Function |
|---|---|
| cloak_fetch | Stealth headless browser fetch via Playwright |
| cloak_fetch_smart | Checks llms.txt first, falls back to stealth browser |
| cloak_render | Full CSS/DOM layout sensor |
| cloak_diff | Semantic diff of URL since last fetch |
| cloak_diff_text | Structural diff between two text blobs |
| cloak_passport | AES-256-GCM credential vault (get/set/delete/list) |
| cloak_ssh_exec | Execute commands on remote server via SSH |
| cloak_ssh_upload | Upload file to remote server via SFTP |
| cloak_cortex | Scan project directory into memory graph |
| cloak_cortex_anatomy | Get cached file anatomy without rescanning |
| tokens_saved | Token efficiency ROI calculator |

Identity + Behaviour (Anti-Bot Bypass)

| Tool | Function |
|---|---|
| cloak_identity_create | Create persistent browser fingerprint identity |
| cloak_identity_use | Apply saved identity to a fetch call |
| cloak_identity_list | List saved identities with trust summary |
| cloak_inject_behaviour | Human mouse/scroll injection for reCAPTCHA/Cloudflare |
| cloak_behaviour_stats | List available patterns and categories |
| cloak_load_pattern | Load custom recorded behaviour pattern |
| cloak_pattern_stats | Self-improving pattern store tier breakdown |
| cloak_pattern_list | List patterns with scores and tier |
| cloak_pattern_prune | Remove stale/low-scoring patterns |
| cloak_pattern_seed | Seed store with built-in patterns |

CAPTCHA

| Tool | Function |
|---|---|
| cloak_detect_captcha | Detect CAPTCHA type and sitekey |
| cloak_solve_captcha | Solve via vision AI (Claude/GPT-4o/2captcha) |

Compression

| Tool | Function |
|---|---|
| turbo_quant_compress | PolarQuant vector compression (~75% smaller) |
| turbo_quant_stats | Compression ratio and savings stats |

Multimodal

| Tool | Function |
|---|---|
| vektor_text | Text generation (OpenAI/Claude/Groq/Gemini/NVIDIA NIM) |
| vektor_image | Image generation (DALL-E, Stability, NVIDIA) |
| vektor_vision | Image understanding and analysis |
| vektor_speech | Text-to-speech and transcription |
| vektor_search | Web search with memory integration |
| vektor_providers | List available providers and status |

Agent

| Tool | Function |
|---|---|
| vektor_agent_run | Autonomous goal executor with memory |
| vektor_swarm | Multi-agent swarm task |
| vektor_watch | File system watcher -- auto-ingest on change |

All CLI Commands

npx vektor setup       # First-run wizard -- licence, hardware, integrations
npx vektor activate    # Activate licence key on this machine
npx vektor test        # Test memory engine with progress bar
npx vektor status      # System health check
npx vektor mcp         # Start Claude Desktop MCP server
npx vektor rem         # Run REM dream cycle (memory consolidation)
npx vektor chat        # Persistent memory chat (all LLMs)
npx vektor remember    # Store a fact
npx vektor ask         # Query memory + LLM answer
npx vektor agent       # Autonomous goal executor
npx vektor help        # All commands

Claude Code Setup

Add to .claude/settings.json in your project:

{
  "mcpServers": {
    "vektor": {
      "command": "node",
      "args": ["/path/to/node_modules/vektor-slipstream/index.js"],
      "env": {
        "VEKTOR_LICENCE_KEY": "your-licence-key",
        "CLOAK_PROJECT_PATH": "/path/to/your/project"
      }
    }
  }
}
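
If you would rather generate this block than edit it by hand, for example to keep the licence key in an environment variable instead of a committed file, a small Node script can build it. This is a sketch only: `buildVektorMcpConfig` is a hypothetical helper, and the paths are the same placeholders used in the JSON above.

```javascript
const path = require('node:path');

// Build the same structure as the JSON config above.
// All three values are supplied by the caller; nothing is validated
// against VEKTOR itself.
function buildVektorMcpConfig({ packageDir, projectPath, licenceKey }) {
  return {
    mcpServers: {
      vektor: {
        command: 'node',
        args: [path.join(packageDir, 'index.js')],
        env: {
          VEKTOR_LICENCE_KEY: licenceKey,
          CLOAK_PROJECT_PATH: projectPath,
        },
      },
    },
  };
}

const config = buildVektorMcpConfig({
  packageDir: '/path/to/node_modules/vektor-slipstream',
  projectPath: '/path/to/your/project',
  licenceKey: process.env.VEKTOR_LICENCE_KEY || 'your-licence-key',
});

// Print the JSON so it can be reviewed or redirected into a file.
console.log(JSON.stringify(config, null, 2));
```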

What's Included

Memory Core (MAGMA)

  • 4-layer associative graph -- semantic, causal, temporal, entity edges
  • bge-small-en-v1.5 bi-encoder + ms-marco cross-encoder reranker
  • BM25 + stemmed BM25 + RRF fusion -- keyword + semantic dual-channel recall
  • Persistent entity index -- guaranteed named-entity retrieval
  • Foresight extraction -- future-tense statements stored with temporal metadata
  • ADD-only contradiction detection -- full history preserved, no silent overwrites
  • REM dream cycle -- up to 50:1 memory compression
  • Sub-1ms recall -- local SQLite, no network required
  • Local ONNX embeddings -- $0 embedding cost, no API key required
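
For readers unfamiliar with RRF, the fusion step in the list above can be sketched as follows. This is the textbook Reciprocal Rank Fusion formula, not VEKTOR's source code. The constant `k = 60` is the value commonly used in the literature; whether VEKTOR uses it is not stated here, and the memory ids are made up.

```javascript
// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank),
// where rank starts at 1 in each ranked list.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical rankings from a keyword channel and a vector channel.
const bm25Ranking = ['m3', 'm1', 'm7'];
const vectorRanking = ['m1', 'm9', 'm3'];

const fused = rrfFuse([bm25Ranking, vectorRanking]);
console.log(fused.map((r) => r.id));
// → [ 'm1', 'm3', 'm9', 'm7' ]
```

Items that appear in both channels (`m1`, `m3`) outrank items that appear in only one, which is why RRF suits keyword plus semantic dual-channel recall: it needs only ranks, not comparable scores.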

Intelligence Layer (runs automatically, no config)

| Module | Function |
|---|---|
| recall-tune | Adjusts retrieval weights based on which memories produced correct outcomes |
| confidence | Scores memories by reliability across corroborating sources |
| dedup | Removes semantic duplicates, keeps the graph clean |
| selforg | Reorganises memory clusters as new information accumulates |
| rl-memory | Reinforcement signals surface higher-quality memories preferentially |
| briefing-scheduler | Periodic summaries of memory activity |

Integrations

  • Claude Desktop -- DXT extension, 44 tools, auto-memory on every session
  • Claude Code -- MCP server, all 44 tools
  • CLI -- chat, remember, ask, agent commands
  • LangChain -- v1 + v2 adapter included
  • OpenAI Agents SDK -- drop-in integration
  • Groq · Gemini · Ollama · NVIDIA NIM -- provider agnostic

Hardware Auto-Detection

Zero config. VEKTOR detects and uses the best available accelerator:

  • NVIDIA CUDA -- GPU acceleration
  • Apple Silicon -- CoreML
  • CPU -- optimised fallback, works everywhere

Environment Variables

| Variable | Default | Purpose |
|---|---|---|
| VEKTOR_SUMMARIZE | false | Enable LLM session summarisation on ingest |
| VEKTOR_TRIPLES | true | Enable batch triple extraction on ingest |
| VEKTOR_FORESIGHT | true | Extract future-tense foresight signals |
| VEKTOR_TEMPORAL | true | Enable temporal index and date boosting |
| VEKTOR_CONTRADICT | true | Enable ADD-only contradiction detection |
| VEKTOR_DEBUG | -- | Enable verbose retrieval debug output |
| VEKTOR_MODEL | Xenova/bge-small-en-v1.5 | Swap embedding model |
| VEKTOR_RERANK | true | Enable cross-encoder reranking |
Research Foundation

Built on peer-reviewed research:


Pricing

| Plan | Price | Licences |
|---|---|---|
| Solo | $9/mo | 3 |
| Team | $35/mo | 5 |
| Studio | $59/mo | 10 |
| Enterprise | $99/mo | 25 |

Purchase at vektormemory.com


What's New in v1.5.0

Retrieval pipeline rebuilt from scratch:

  • bge-small-en-v1.5 bi-encoder + ms-marco cross-encoder reranker (spec-decode architecture)
  • BM25 + Porter-stemmed BM25 + named entity injection, fused via RRF
  • MAGMA graph layer -- co-occurrence and temporal edges between entities in SQLite
  • Persistent entity index (vektor_entities) for guaranteed named-entity recall
  • Foresight extraction -- future-tense statements stored for temporal queries
  • Question type classifier -- routes single-hop vs multi-hop to optimal retrieval path
  • ADD-only contradiction detection -- conflicting facts survive with timestamps
  • Agentic sufficiency check -- reformulates query if key entities missing from top results

vektormemory.com · Docs · hello@vektormemory.com

Stop prompting like it's 2024. Build agents that remember.

About

VEKTOR MEMORY - Slipstream is a persistent multi-layered memory architecture: 4-layer associative graph memory (MAGMA) with an autonomous REM cycle, a CLI, and DXT-MCP Cloak Tools.
