How to Fix Your Context

As Karpathy said, Context Engineering is the delicate art and science of filling the context window with just the right information for the next step. There are many ways to do this. In Drew Breunig's post "How to Fix Your Context", he outlines 6 common context engineering techniques. This repository demonstrates each technique using LangGraph.

🚀 Quickstart

Prerequisites

Python 3.9 or higher
uv package manager

Installation

Clone the repository and activate a virtual environment:

git clone https://github.com/langchain-ai/how_to_fix_your_context
cd how_to_fix_your_context
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install dependencies:

uv pip install -r requirements.txt

Set up environment variables for the model provider(s) you want to use:

export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"

Background

The Context Problem

Chroma's report on Context Rot explains that LLMs do not treat every token in their context window equally. Across 18 models (including GPT‑4.1, Claude 4, Gemini 2.5, Qwen3, etc.), they show that performance on even very simple tasks degrades—often in non‑uniform and surprising ways—as the input length grows. Drew Breunig outlined four failure modes that help to explain why long contexts fail:

Context Poisoning - Hallucinations or errors that enter the context and get repeatedly referenced
Context Distraction - When context grows so large that models focus more on accumulated history than training
Context Confusion - Superfluous content that influences response quality, as models feel compelled to use all available context
Context Clash - Conflicting information within the accumulated context that degrades reasoning

Context Engineering

Drew outlined six context engineering techniques to help fix these failure modes:

RAG (Retrieval-Augmented Generation)
Tool Loadout
Context Quarantine
Context Pruning
Context Summarization
Context Offloading

We implement each of these techniques in a set of Jupyter notebooks using LangGraph, as outlined below.

LangGraph

LangGraph is a low-level orchestration framework for building AI applications. You can lay out agents or workflows as a set of nodes, define the logic within each one, and define a state object that is passed between them. A StateGraph is LangGraph's primary abstraction for building these stateful workflows and agents. It has three parts:

Nodes are processing steps that receive current state and return updates
Edges connect nodes to create execution flow (linear, conditional, or cyclical)
State serves as a shared scratchpad between nodes

This low-level control makes it easy to implement each of the context engineering techniques.
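To make the node, edge, and state vocabulary concrete, here is a plain-Python toy that mimics the pattern without importing LangGraph. The `run_graph` helper, the node names, and the state keys are all hypothetical and chosen for illustration; LangGraph's actual StateGraph does considerably more than this loop.

```python
from typing import Any, Callable, Dict, Union

State = Dict[str, Any]

def draft(state: State) -> State:
    # A node reads the current state and returns only the keys it updates.
    return {
        "answer": f"Draft answer about {state['topic']}",
        "attempts": state.get("attempts", 0) + 1,
    }

def review(state: State) -> State:
    # Hypothetical rule: approve once the draft has been attempted twice.
    return {"approved": state["attempts"] >= 2}

def route_after_review(state: State) -> str:
    # Conditional edge: loop back to "draft" until the review approves.
    return "END" if state["approved"] else "draft"

Edge = Union[str, Callable[[State], str]]

def run_graph(nodes: Dict[str, Callable[[State], State]],
              edges: Dict[str, Edge],
              entry: str,
              state: State,
              max_steps: int = 10) -> State:
    current = entry
    for _ in range(max_steps):
        if current == "END":
            break
        # Merge the node's update into the shared state (the "scratchpad").
        state = {**state, **nodes[current](state)}
        edge = edges[current]
        current = edge(state) if callable(edge) else edge
    return state

NODES = {"draft": draft, "review": review}
EDGES = {"draft": "review", "review": route_after_review}

final_state = run_graph(NODES, EDGES, "draft", {"topic": "context rot"})
```

The cycle between `draft` and `review` is the cyclical flow mentioned above, and every node sees the same accumulated state.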

1. RAG (Retrieval-Augmented Generation)

Notebook: notebooks/01-rag.ipynb

Retrieval-Augmented Generation (RAG) is the act of selectively adding relevant information to help the LLM generate a better response.

Implementation: Creates a RAG agent using LangGraph with a retrieval tool built from Lilian Weng's blog posts. The agent uses Claude Sonnet to intelligently search for relevant context before answering questions.

Key Components:

Document loading and chunking with RecursiveCharacterTextSplitter
Vector store creation using OpenAI embeddings
LangGraph StateGraph with conditional edges for tool calling
System prompt that guides the agent to clarify research scope before retrieval

Performance: Used 25k tokens for a complex query about reward hacking types, driven by token-heavy tool calls.
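The retrieve-then-answer flow can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the notebook's code: bag-of-words cosine similarity stands in for OpenAI embeddings, the three `DOCS` strings stand in for chunked blog posts, and `retrieve` and `build_prompt` are hypothetical helper names.

```python
import math
import re
from collections import Counter
from typing import List

def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    # Rank chunks by similarity to the query and keep the top k.
    q = Counter(tokenize(query))
    ranked = sorted(chunks,
                    key=lambda c: cosine(q, Counter(tokenize(c))),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: List[str], k: int = 2) -> str:
    # Only the retrieved chunks enter the context, not the whole corpus.
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Stand-in corpus (assumption: real chunks would come from a text splitter).
DOCS = [
    "Reward hacking happens when an agent exploits flaws in the reward function.",
    "Diffusion models generate images by gradually removing noise.",
    "Prompt engineering steers a model without changing its weights.",
]
QUERY = "What is reward hacking?"

top_chunks = retrieve(QUERY, DOCS, k=1)
prompt = build_prompt(QUERY, DOCS, k=1)
```

In the agent version, `retrieve` would be exposed as a tool so the model decides when and what to search for.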

2. Tool Loadout

Notebook: notebooks/02-tool-loadout.ipynb

Tool Loadout is the act of selecting only relevant tool definitions to add to your context.

Implementation: Demonstrates semantic tool selection by indexing all Python math library functions in a vector store and dynamically selecting only relevant tools based on user queries.

Key Components:

Tool registry with UUID mapping for all math functions
Vector store indexing of tool descriptions using embeddings
Dynamic tool binding based on semantic similarity search (limit 5 tools)
Extended state class to track selected tools per conversation

Benefits: Avoids context confusion from overlapping tool descriptions and improves tool selection accuracy compared to loading all available tools.
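A rough sketch of the selection step, assuming keyword overlap as a stand-in for embedding similarity and function names as registry keys in place of UUIDs. `TOOL_REGISTRY` and `select_tools` are hypothetical names, and the docstrings indexed here are whatever the local Python build ships for the math module.

```python
import inspect
import math
import re
from typing import List, Set

def tokenize(text: str) -> Set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

# Registry of every public callable in the math module, keyed by name,
# with the name plus docstring serving as the searchable description.
TOOL_REGISTRY = {
    name: (fn, f"{name} {inspect.getdoc(fn) or ''}")
    for name, fn in vars(math).items()
    if callable(fn) and not name.startswith("_")
}

def select_tools(query: str, limit: int = 5) -> List[str]:
    # Score each tool by how many query words appear in its description.
    wanted = tokenize(query)
    scored = []
    for name, (_, description) in TOOL_REGISTRY.items():
        score = len(wanted & tokenize(description))
        if score:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]

selected = select_tools("square root")
# Only the selected callables would be bound to the model for this turn.
bound_tools = [TOOL_REGISTRY[name][0] for name in selected]
```

The model then sees at most `limit` tool definitions instead of the full registry, which is the point of the technique.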

3. Context Quarantine

Notebook: notebooks/03-context-quarantine.ipynb

Context Quarantine is the act of isolating contexts in their own dedicated threads, each used separately by one or more LLMs.

Implementation: Creates a supervisor multi-agent system using LangGraph Supervisor architecture with specialized agents that have isolated context windows.

Key Components:

Supervisor agent that routes tasks to appropriate specialists
Math expert agent with addition/multiplication tools and focused mathematical prompt
Research expert agent with web search capabilities and research-focused prompt
Clear delegation rules based on task type (research vs. calculations)

Benefits: Each agent operates in its own context window, preventing context clash and distraction. The supervisor coordinates between agents using tool-based handoffs for complex tasks requiring multiple skills.
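The isolation property can be illustrated without any LLM calls. In this sketch the `Agent` and `Supervisor` classes, the digit-based routing rule, and both handlers are hypothetical stand-ins: a real supervisor would let a model choose the specialist, and each handler would be an LLM with its own tools.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Agent:
    name: str
    handle: Callable[[str], str]
    # Private message history: this is the agent's isolated context.
    messages: List[str] = field(default_factory=list)

    def run(self, task: str) -> str:
        self.messages.append(f"user: {task}")
        result = self.handle(task)
        self.messages.append(f"assistant: {result}")
        return result

def math_handler(task: str) -> str:
    # Stand-in for a math expert: multiplies the integers found in the task.
    product = 1
    for number in re.findall(r"\d+", task):
        product *= int(number)
    return str(product)

def research_handler(task: str) -> str:
    # Stand-in for a research expert with web search.
    return f"(stub) notes on: {task}"

class Supervisor:
    def __init__(self, agents: Dict[str, Agent]) -> None:
        self.agents = agents
        self.messages: List[str] = []

    def route(self, task: str) -> str:
        # Toy delegation rule: tasks containing digits go to the math expert.
        return "math" if re.search(r"\d", task) else "research"

    def run(self, task: str) -> str:
        agent = self.agents[self.route(task)]
        result = agent.run(task)
        # Only the final result returns to the supervisor's context.
        self.messages.append(f"{agent.name}: {result}")
        return result

supervisor = Supervisor({
    "math": Agent("math", math_handler),
    "research": Agent("research", research_handler),
})
answer = supervisor.run("multiply 6 and 7")
```

After this call the research agent's history is still empty, so nothing from the math task can distract or clash with later research work.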

4. Context Pruning

Notebook: notebooks/04-context-pruning.ipynb

Context Pruning is the act of removing irrelevant or otherwise unneeded information from the context.

Implementation: Extends the RAG agent with an intelligent pruning step that removes irrelevant content from retrieved documents before passing them to the main LLM.

Key Components:

Tool pruning prompt that instructs a smaller LLM to extract only relevant information
GPT-4o-mini as the pruning model to reduce costs
Extended state class with summary field for context compression
Pruning based on the original user request to maintain relevance

Performance Improvement: Reduced token usage from 25k tokens with basic RAG to 11k tokens for the same query, demonstrating significant context compression while maintaining answer quality.
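As a rough illustration of pruning against the original user request, the sketch below keeps only sentences that share a keyword with the request. This keyword filter is an assumption standing in for the smaller pruning model; `prune`, `keywords`, and the `STOPWORDS` set are hypothetical names, and the sample document is invented.

```python
import re
from typing import Set

STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in",
             "what", "how", "and", "for", "do", "does"}

def keywords(text: str) -> Set[str]:
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS}

def prune(document: str, user_request: str) -> str:
    # Keep a sentence only if it shares a keyword with the user request.
    wanted = keywords(user_request)
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    kept = [s for s in sentences if keywords(s) & wanted]
    return " ".join(kept)

# Invented retrieved document with one irrelevant sentence in the middle.
doc = ("Reward hacking occurs when a policy exploits a flawed reward. "
       "The blog also has a newsletter you can subscribe to. "
       "One type of reward hacking is specification gaming.")

pruned = prune(doc, "What are the types of reward hacking?")
```

The pruned text, not the raw tool output, is what gets appended to the main model's context.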

5. Context Summarization

Notebook: notebooks/05-context-summarization.ipynb

Context Summarization is the act of boiling down an accrued context into a condensed summary.

Implementation: Builds on the RAG agent by adding a summarization step that condenses tool call results to reduce context size while preserving essential information.

Key Components:

Tool summarization prompt that creates comprehensive yet concise versions of documents
GPT-4o-mini as the summarization model for cost efficiency
Guidelines to preserve all key information while eliminating verbosity (50-70% reduction target)
Extended state class with summary field for tracking condensed content

Approach: Unlike pruning, which removes irrelevant content, summarization condenses all information into a more compact format. This makes it suitable when all retrieved content is relevant but verbose.
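The step where a tool result is condensed before it enters the message history might look roughly like this. The summarizer is injected as a plain function; `first_sentences` is a deliberately crude stand-in for the summarization model, and `add_tool_result` and the `max_chars` threshold are hypothetical.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

def first_sentences(text: str, max_sentences: int = 1) -> str:
    # Stand-in summarizer: a real system would call an LLM here.
    sentences = [s.strip() for s in text.split(". ") if s.strip()]
    summary = ". ".join(sentences[:max_sentences])
    return summary if summary.endswith(".") else summary + "."

def add_tool_result(messages: List[Message],
                    raw: str,
                    summarize: Callable[[str], str],
                    max_chars: int = 80) -> List[Message]:
    # Short results pass through; long ones are replaced by their summary.
    content = raw if len(raw) <= max_chars else summarize(raw)
    messages.append({"role": "tool", "content": content})
    return messages

raw_result = ("Reward hacking exploits reward flaws. "
              "It appears in RL and RLHF. "
              "Mitigations include better reward design.")

messages: List[Message] = []
add_tool_result(messages, raw_result, first_sentences)
add_tool_result(messages, "ok", first_sentences)
```

Swapping `first_sentences` for a model call leaves the rest of the flow unchanged, which is why the summarizer is passed in rather than hard-coded.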

6. Context Offloading

Notebook: notebooks/06-context-offloading.ipynb

Context Offloading is the act of storing information outside the LLM's context, usually via a tool that stores and manages the data.

Implementation: Demonstrates two approaches to context offloading - temporary scratchpad storage during a session and persistent cross-thread memory using LangGraph's store interface.

Key Components:

Extended state class with scratchpad field for temporary storage
WriteToScratchpad and ReadFromScratchpad tools for note-taking
InMemoryStore for persistent cross-thread memory
Research workflow that maintains organized notes and builds upon previous research

Two Storage Patterns:

Session Scratchpad: Temporary storage within a single conversation thread
Persistent Memory: Cross-thread storage using namespaced key-value pairs that persist across different conversation sessions

Benefits: Enables agents to maintain research plans, accumulate findings, and access previous work across multiple interactions, similar to how Anthropic's multi-agent researcher and products like ChatGPT implement memory.
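The difference between the two storage patterns can be sketched in plain Python. `Scratchpad`, `KeyValueStore`, and `run_session` are hypothetical names for this illustration; the toy store only loosely mirrors the namespace-and-key shape of a cross-thread store and is not LangGraph's InMemoryStore.

```python
from typing import Dict, List, Optional, Tuple

class Scratchpad:
    """Session-scoped notes that live and die with one conversation thread."""

    def __init__(self) -> None:
        self.notes: List[str] = []

    def write(self, note: str) -> str:
        self.notes.append(note)
        return f"Wrote note #{len(self.notes)}"

    def read(self) -> str:
        return "\n".join(self.notes) or "(empty)"

Namespace = Tuple[str, ...]

class KeyValueStore:
    """Toy cross-thread store with namespaced keys (assumption, in-memory)."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[Namespace, str], dict] = {}

    def put(self, namespace: Namespace, key: str, value: dict) -> None:
        self._data[(namespace, key)] = value

    def get(self, namespace: Namespace, key: str) -> Optional[dict]:
        return self._data.get((namespace, key))

def run_session(store: KeyValueStore, finding: Optional[str] = None) -> str:
    # Each session gets a fresh scratchpad but shares the same store.
    pad = Scratchpad()
    previous = store.get(("research",), "plan")
    if previous:
        pad.write(f"Previous plan: {previous['text']}")
    if finding:
        pad.write(finding)
        store.put(("research",), "plan", {"text": finding})
    return pad.read()

store = KeyValueStore()
first = run_session(store, "Compare pruning and summarization")
second = run_session(store)
```

The second session starts with an empty scratchpad yet still recovers the plan, because the plan was offloaded to the store rather than kept in the first session's context.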

References

How to Fix Your Context by Drew Breunig
How Contexts Fail and How to Fix Them by Drew Breunig
