MCP Server  ·  Local  ·  Zero Cloud

Index once.
Save 93%.

A local MCP server for AI coding agents. AST-aware indexing, semantic search, and automatic compression. Your agent stops re-reading your entire codebase every session.

$ uv tool install code-context-engine
~/my-project
$ cce init

  Code Context Engine · my-project
  ────────────────────────────────
  Git hooks installed (3 hooks, auto-updates)
  MCP server registered in .mcp.json
  CLAUDE.md created
  .gitignore updated

  Indexing project... ██████████████████████████████ 89/89 files 100%

  Indexed 1,247 chunks from 89 files
  Done! Restart Claude Code to activate CCE.
Works with your editor
Claude Code · Cursor · VS Code · Gemini CLI · Codex CLI · OpenCode

See it in action

Install, index, search, and see savings in under 30 seconds.


Verified on FastAPI.
Reproduce it yourself.

20 real coding questions against FastAPI (48 source files, 19K lines). No synthetic queries, no cherry-picking.

Retrieval       93%      full files → relevant chunks
Compression     90%      chunks → signatures
Recall@10       0.80     found the right files
Latency p50     0.4 ms   per search query

Token flow per query (avg):
  Full files          75,355 tokens
  After retrieval      5,381 tokens
  After compression      541 tokens
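
Sanity check: the headline percentages follow directly from that flow.

  Retrieval:    1 − 5,381/75,355 ≈ 0.93 → 93% saved
  Compression:  1 − 541/5,381 ≈ 0.90 → 90% saved
  End to end:   541/75,355 ≈ 0.7% of the original tokens reach the model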

Per-Layer Savings

Each layer measured against its own baseline. Not stacked.

Retrieval · measured 93%
Chunk Compression · measured 90%
Output Compression · estimated 65%
Grammar · measured 13%

Reproduce it

$ pip install code-context-engine
$ python benchmarks/run_benchmark.py \
    --repo https://github.com/fastapi/fastapi.git \
    --source-dir fastapi

Full results: benchmarks/results/fastapi.md

Three commands.
Permanent savings.

01
🗂
Index your codebase

Tree-sitter parses your code into semantic chunks (functions, classes, modules). Stored locally with vector embeddings. Git hooks keep the index current automatically. The chunking step is sketched below.

cce init
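
A minimal sketch of that chunking step, assuming the py-tree-sitter (0.22+) and tree-sitter-python packages; CCE's real chunker covers more languages and metadata:

# A sketch, not CCE's actual chunker: split a Python file into one
# chunk per top-level function or class using tree-sitter.
import tree_sitter_python
from tree_sitter import Language, Parser

parser = Parser(Language(tree_sitter_python.language()))

def chunk_file(path: str):
    """Yield one chunk per top-level function or class in a Python file."""
    source = open(path, "rb").read()
    tree = parser.parse(source)
    for node in tree.root_node.children:
        if node.type in ("function_definition", "class_definition"):
            yield {
                "file": path,
                "kind": node.type,  # "function_definition" or "class_definition"
                "lines": (node.start_point[0] + 1, node.end_point[0] + 1),
                "text": source[node.start_byte:node.end_byte].decode(),
            }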
02
🔍
Claude searches, not reads

Instead of reading whole files, Claude calls context_search via MCP. Hybrid vector + keyword search finds the relevant chunks. Gets the 800 tokens it needs, not an 8,000-token dump. An example call is sketched below.

context_search "payment"
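
On the wire this is an MCP tools/call round trip. The method and tool name are real; the argument and result fields below are illustrative, not CCE's actual schema:

request = {
    "method": "tools/call",
    "params": {
        "name": "context_search",
        "arguments": {"query": "payment", "top_k": 5},  # parameter names assumed
    },
}
response = {  # illustrative result shape: a handful of scored chunks, not whole files
    "chunks": [
        {"file": "billing/stripe.py", "symbol": "charge_card", "score": 0.91},
    ],
}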
03
📊
Track real savings

Every query is recorded. cce savings shows exactly how many tokens CCE saved you, with dollar estimates from live Anthropic pricing. The math is sketched below.

cce savings
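
A back-of-the-envelope version of that estimate. The per-token price here is an assumed flat rate; the real tool pulls live pricing:

INPUT_PRICE_PER_MTOK = 15.00  # USD per million input tokens (assumed Opus-class rate)

def dollars_saved(baseline_tokens: int, actual_tokens: int) -> float:
    """Tokens avoided, converted to dollars at the assumed input price."""
    return (baseline_tokens - actual_tokens) / 1_000_000 * INPUT_PRICE_PER_MTOK

# Average FastAPI-benchmark query: 75,355 tokens as full files vs. 541 after CCE
print(f"${dollars_saved(75_355, 541):.2f}")  # ≈ $1.12 saved per query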

Everything Claude needs.
Nothing it doesn't.

🌳
AST-Aware Chunking

Tree-sitter parses Python, JavaScript, TypeScript, PHP, Go, Rust, and Java into semantic chunks. Functions, classes, imports. No raw file dumps, no context waste.

🔎
Hybrid Search + Graph Expansion

Vector similarity + BM25 keyword search merged via Reciprocal Rank Fusion. Then CCE walks one hop on the code graph: if auth.py is a hit, the utils.py it imports comes along too.
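
The fusion step itself is tiny. A sketch using the conventional k = 60 constant (CCE's exact parameters aren't documented here):

def rrf_merge(vector_hits: list, keyword_hits: list, k: int = 60) -> list:
    """Merge two best-first rankings of chunk ids; fused score sums 1/(k + rank)."""
    scores: dict = {}
    for ranking in (vector_hits, keyword_hits):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

rrf_merge(["auth.py:login", "utils.py:hash_pw"], ["db.py:query", "auth.py:login"])
# → ["auth.py:login", "db.py:query", "utils.py:hash_pw"]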

🧠
Smart Compression

With Ollama running locally, chunks are summarized by phi3:mini. Without it, smart truncation extracts signatures and docstrings. Four output levels: off, lite, standard, max.
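
The truncation fallback is roughly this, sketched with the standard-library ast module on Python 3.9+ (CCE's real heuristics, and the Ollama path, are more involved):

import ast

def compress_chunk(source: str) -> str:
    """Keep signatures and first docstring lines; drop every body."""
    out = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            out.append(f"class {node.name}:")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            out.append(f"def {node.name}({ast.unparse(node.args)}):")
        else:
            continue
        doc = ast.get_docstring(node)
        if doc:
            out.append(f'    """{doc.splitlines()[0]}"""')
    return "\n".join(out)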

🔗
Session Memory

Recall decisions and code areas across sessions via session_recall, record_decision, and record_code_area. No re-explaining your architecture every session.
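
Illustrative payloads for those tools. The tool names are CCE's; every argument field below is an assumed schema, not the documented one:

record_decision_call = {
    "name": "record_decision",
    "arguments": {                                # all fields assumed
        "decision": "JWT auth with 15-minute expiry",
        "context": "stateless workers; see auth/tokens.py",
    },
}
session_recall_call = {
    "name": "session_recall",
    "arguments": {"query": "auth token expiry"},  # assumed field
}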

🛡
Security by Default

Secret files (.env, *.pem) are never indexed. Content is scanned for API keys and credentials. PII is scrubbed from memory writes. Path traversal protection on all inputs.
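
In spirit, the first two guards are a filename denylist plus a content scan. A sketch with assumed patterns; CCE's actual rules are broader:

import fnmatch
import re

DENYLIST = (".env", "*.pem", "*.key")  # assumed glob set
SECRET_RE = re.compile(r"sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16}")  # common API-key shapes

def safe_to_index(path: str, content: str) -> bool:
    name = path.rsplit("/", 1)[-1]
    if any(fnmatch.fnmatch(name, pat) for pat in DENYLIST):
        return False  # secret files are never indexed
    return SECRET_RE.search(content) is None  # skip files that hold credentials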

📊
Web Dashboard

Run cce dashboard for donut charts, bar graphs, file health, and live 5-second polling. Or use cce savings for a quick terminal summary.

How CCE stacks up

Feature                                      No tool   Caveman   CCE (default)   CCE + Ollama
Compress output tokens                          ✗         ✓           ✓               ✓
Compress input tokens                           ✗         ✗           ✓               ✓
Codebase indexing                               ✗         ✗           ✓               ✓
Session memory                                  ✗         ✗           ✓               ✓
LLM summarization                               ✗         ✗           ✗               ✓
Cost per session (Opus 4, medium project)     $2.25     $1.28       $0.68           $0.45

Stop paying to
re-read code.

One install. Automatic savings. Everything local.

$ uv tool install code-context-engine
1  uv tool install code-context-engine
2  cce init
3  Restart Claude Code
→  Saving tokens