A local MCP server for AI coding agents. AST-aware indexing, semantic search, and automatic compression. Your agent stops re-reading your entire codebase every session.
Install, index, search, and see savings in under 30 seconds.
20 real coding questions against FastAPI (48 source files, 19K lines). No synthetic queries, no cherry-picking.
Each layer measured against its own baseline. Not stacked.
Full results: benchmarks/results/fastapi.md
Tree-sitter parses your code into semantic chunks (functions, classes, modules). Stored locally with vector embeddings. Git hooks keep the index current automatically.
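The chunking idea fits in a few lines. CCE uses tree-sitter across languages; this sketch uses Python's stdlib `ast` module instead (an assumption for illustration, not CCE's actual parser) to show what a semantic chunk looks like:

```python
import ast

def chunk_python_source(source: str) -> list[dict]:
    """Split a Python module into function/class chunks (name + line span).

    Illustrative only: CCE parses with tree-sitter; this uses stdlib ast.
    """
    chunks = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "kind": type(node).__name__,  # e.g. "FunctionDef"
                "name": node.name,
                "start": node.lineno,
                "end": node.end_lineno,
            })
    return chunks

src = "class A:\n    def f(self):\n        return 1\n"
print(chunk_python_source(src))
```

Each chunk, not each file, is what gets embedded and stored, which is why search can return a single function instead of the module that contains it.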
Run `cce init` to index. Instead of reading whole files, Claude calls context_search via MCP. Hybrid vector + keyword search finds the relevant chunks. The agent gets the 800 tokens it needs, not an 8,000-token dump.
Every query is recorded. cce savings shows exactly how many tokens CCE saved you, with dollar estimates from live Anthropic pricing.
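The dollar estimate is straightforward arithmetic over per-million-token prices. A rough sketch (the price constant below is a placeholder, not live Anthropic pricing):

```python
# Placeholder price, NOT live Anthropic pricing: assumed $ per 1M input tokens.
INPUT_PRICE_PER_MTOK = 15.00

def dollars_saved(tokens_saved: int, price_per_mtok: float = INPUT_PRICE_PER_MTOK) -> float:
    """Convert a count of avoided input tokens into an estimated dollar figure."""
    return tokens_saved / 1_000_000 * price_per_mtok

# e.g. an 8,000-token file dump replaced by an 800-token search result
print(round(dollars_saved(8_000 - 800), 4))  # 0.108
```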
Tree-sitter parses Python, JavaScript, TypeScript, PHP, Go, Rust, and Java into semantic chunks. Functions, classes, imports. No raw file dumps, no context waste.
Vector similarity + BM25 keyword search merged via Reciprocal Rank Fusion. Then CCE walks one hop on the code graph: if auth.py is a hit, the utils.py it imports comes along too.
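Reciprocal Rank Fusion itself is simple: each ranked list contributes 1/(k + rank) per document, and the merged order follows the summed scores. A minimal sketch (file names and the default k = 60 are illustrative):

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked result lists via Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    documents ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["auth.py", "session.py", "utils.py"]
bm25_hits = ["utils.py", "auth.py", "config.py"]
print(rrf_merge([vector_hits, bm25_hits]))
# → ['auth.py', 'utils.py', 'session.py', 'config.py']
```

auth.py wins because both retrievers rank it highly; a document that appears in only one list needs a very good rank there to beat one that appears in both.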
With Ollama running locally, chunks are summarized by phi3:mini. Without it, smart truncation extracts signatures and docstrings. Four output levels: off, lite, standard, max.
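The no-Ollama fallback can be approximated like this: keep signatures and docstrings, drop bodies. A sketch using the stdlib `ast` module, not CCE's actual truncation code, and it ignores the four output levels:

```python
import ast

def smart_truncate(source: str) -> str:
    """Compress a Python module to signatures + docstrings (sketch only).

    Assumes single-line def/class headers; multi-line signatures are
    out of scope for this illustration.
    """
    lines = source.splitlines()
    out = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            out.append(lines[node.lineno - 1])  # the def/class header line
            doc = ast.get_docstring(node)
            if doc:
                out.append(f'    """{doc}"""')
    return "\n".join(out)

src = 'def fetch(url):\n    """Get a URL."""\n    return url\n'
print(smart_truncate(src))
```

The output keeps what an agent needs to decide whether to request the full chunk: what the function is called, what it takes, and what it claims to do.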
Recall decisions and code areas across sessions via session_recall, record_decision, and record_code_area. No re-explaining your architecture every session.
Secret files (.env, *.pem) are never indexed. Content is scanned for API keys and credentials. PII is scrubbed from memory writes. Path traversal protection on all inputs.
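Content scanning for credentials usually means matching known key shapes. The patterns below are common illustrative examples, not CCE's actual rule set:

```python
import re

# Illustrative patterns only; CCE's real detection rules are not published here.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                 # OpenAI-style API key
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]

def contains_secret(text: str) -> bool:
    """Return True if any known credential pattern appears in the text."""
    return any(p.search(text) for p in SECRET_PATTERNS)

print(contains_secret("api_key = 'sk-" + "a" * 24 + "'"))  # True
print(contains_secret("def add(a, b): return a + b"))      # False
```

A chunk that trips one of these checks would be skipped at index time, so the secret never lands in the vector store.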
Run cce dashboard for donut charts, bar graphs, file health, and live 5-second polling. Or use cce savings for a quick terminal summary.
| Feature | No tool | Caveman | CCE (default) | CCE + Ollama |
|---|---|---|---|---|
| Compress output tokens | ✗ | ✓ | ✓ | ✓ |
| Compress input tokens | ✗ | ✗ | ✓ | ✓ |
| Codebase indexing | ✗ | ✗ | ✓ | ✓ |
| Session memory | ✗ | ✗ | ✓ | ✓ |
| LLM summarization | ✗ | ✗ | ✗ | ✓ |
| Cost per session (Opus 4, medium project) | $2.25 | $1.28 | $0.68 | $0.45 |
One install. Automatic savings. Everything local.