Index your codebase with tree-sitter. Give Claude Code, Cursor, Copilot, and Codex the exact context they need - not entire directories.
Built on tree-sitter. Designed for MCP. Zero cloud dependencies.
tree-sitter extracts symbols, docstrings, and signatures from Python, JavaScript, TypeScript, Go, Rust, Java, C, and C++. Every function, class, method, and type definition.
One command writes .claude/mcp.json, .cursor/mcp.json, and .vscode/settings.json automatically. Three MCP tools ready immediately.
Weighted scoring detects whether you want to implement, debug, refactor, test, or explain - and picks the optimal retrieval strategy. Not a simple regex match.
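As a rough illustration of weighted scoring, the sketch below sums per-keyword weights for each intent and picks the highest total. The keyword table, the weights, the `explain` fallback, and the name `classify_intent` are all assumptions made for this example, not semtree's actual classifier.

```python
# Hypothetical keyword weights per intent; the real weights are not documented here.
INTENT_KEYWORDS: dict[str, dict[str, float]] = {
    "implement": {"add": 2.0, "implement": 3.0, "create": 2.0, "build": 1.5},
    "debug": {"fix": 3.0, "bug": 3.0, "error": 2.0, "crash": 2.5, "fails": 2.0},
    "refactor": {"refactor": 3.0, "rename": 2.0, "extract": 1.5, "simplify": 2.0},
    "test": {"test": 3.0, "coverage": 2.0, "mock": 2.0, "assert": 1.5},
    "explain": {"explain": 3.0, "why": 1.5, "how": 1.5, "what": 1.0},
}


def classify_intent(query: str) -> tuple[str, dict[str, float]]:
    """Return the best-scoring intent and the full score table."""
    words = [w.strip(".,!?;:'\"") for w in query.lower().split()]
    scores = {
        intent: sum(weights.get(word, 0.0) for word in words)
        for intent, weights in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=lambda intent: scores[intent])
    if scores[best] == 0.0:
        best = "explain"  # assumed fallback when no keyword matches
    return best, scores
```

With these weights, "add a test for the bug fix" scores highest for `debug` even though it also contains "add" and "test", which is the kind of tie-breaking a single pattern match cannot express.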
Every symbol is annotated with the git author and date from git blame. Your AI assistant knows who last touched a function and when.
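One way such annotations could be derived is by parsing the output of `git blame --porcelain <file>`. The sketch below is an assumption about the approach, not semtree's code: `parse_blame` and `SAMPLE` are hypothetical, and the parser assumes 40-character SHA-1 commit ids.

```python
from datetime import datetime, timezone

HEX = set("0123456789abcdef")


def parse_blame(porcelain: str) -> dict[int, tuple[str, str]]:
    """Map each final line number to (author, ISO date) from porcelain blame output."""
    commits: dict[str, dict[str, str]] = {}  # metadata cached per commit
    result: dict[int, tuple[str, str]] = {}
    sha, final_line = None, 0
    for raw in porcelain.splitlines():
        if raw.startswith("\t"):
            # A tab-prefixed content line closes the current entry.
            info = commits.get(sha, {})
            stamp = int(info.get("author-time", "0"))
            day = datetime.fromtimestamp(stamp, tz=timezone.utc).date().isoformat()
            result[final_line] = (info.get("author", "unknown"), day)
            continue
        key, _, value = raw.partition(" ")
        if len(key) == 40 and set(key) <= HEX:
            # Header: <sha> <original line> <final line> [<group size>]
            sha, final_line = key, int(value.split()[1])
            commits.setdefault(sha, {})
        elif sha is not None and key in ("author", "author-time"):
            commits[sha][key] = value
    return result


# Hand-written sample in porcelain layout; the second entry reuses the first commit.
SAMPLE = "\n".join([
    "a" * 40 + " 1 1 2",
    "author Ada Lovelace",
    "author-mail <ada@example.com>",
    "author-time 1700000000",
    "author-tz +0000",
    "summary Add limiter",
    "filename limiter.py",
    "\tclass RateLimiter:",
    "a" * 40 + " 2 2",
    "\t    pass",
])
```

Looking up a symbol's starting line in the returned mapping gives the author and date to attach to it.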
The file-watcher applies a 2-second cooldown so rapid consecutive saves do not trigger redundant re-indexing. Incremental SHA-1 hashing skips unchanged files.
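The two mechanisms can be sketched independently of any watcher library. This is a minimal illustration under stated assumptions: `Debouncer` and `changed_files` are hypothetical names, the clock is injectable so the behaviour can be checked without sleeping, and file contents are passed in as bytes rather than read from disk.

```python
import hashlib
import time
from typing import Callable


class Debouncer:
    """Allow a run only if the cooldown has elapsed since the last allowed run."""

    def __init__(self, cooldown: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.cooldown = cooldown
        self.clock = clock
        self._last_run: float | None = None

    def should_run(self) -> bool:
        now = self.clock()
        if self._last_run is not None and now - self._last_run < self.cooldown:
            return False  # still inside the cooldown window
        self._last_run = now
        return True


def changed_files(contents: dict[str, bytes], known: dict[str, str]) -> list[str]:
    """Return paths whose SHA-1 digest differs from the stored one, updating `known`."""
    changed = []
    for path, data in contents.items():
        digest = hashlib.sha1(data).hexdigest()
        if known.get(path) != digest:
            known[path] = digest
            changed.append(path)
    return changed
```

A watcher callback would first ask `should_run()` and, if allowed, re-parse only the paths returned by `changed_files`.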
A lock file prevents two concurrent processes from corrupting the SQLite database. Safe to run from file-watchers, pre-commit hooks, and CI pipelines simultaneously.
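A common way to build this kind of guard is an exclusively created lock file. The sketch below shows the general technique only; `index_lock`, `LockHeld`, and the lock path are hypothetical, and semtree's own locking may differ, for example in how it handles stale locks left by a crashed process.

```python
import os
from contextlib import contextmanager


class LockHeld(RuntimeError):
    """Raised when another process already holds the lock."""


@contextmanager
def index_lock(path: str):
    """Hold an exclusive lock file for the duration of the with-block."""
    try:
        # O_EXCL makes creation fail atomically if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise LockHeld(f"index is locked by another process: {path}") from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield
    finally:
        os.close(fd)
        os.remove(path)
```

Wrapping the indexing step in `with index_lock(...)` means a second process fails fast with `LockHeld` instead of writing to the database at the same time.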
A simple three-step pipeline from your codebase to AI assistant context.
tree-sitter walks your project files, extracts symbols with git blame annotations, and stores everything in a local SQLite database.
The intent classifier scores your query, selects the best retrieval strategy, and builds a token-budgeted context string with only the relevant symbols.
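Token budgeting can be pictured as greedy packing: take symbols in order of relevance and skip any that would overflow the budget. The sketch below is illustrative only. `build_context` is a hypothetical name, and `estimate_tokens` is a crude stand-in (about four characters per token) rather than a real tokenizer such as tiktoken.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate; a real tokenizer would be more accurate."""
    return max(1, len(text) // 4)


def build_context(ranked: list[tuple[float, str]], budget: int) -> str:
    """Pack the highest-scoring snippets into a string within the token budget."""
    parts: list[str] = []
    used = 0
    for _score, snippet in sorted(ranked, key=lambda pair: pair[0], reverse=True):
        cost = estimate_tokens(snippet)
        if used + cost > budget:
            continue  # too large for the remaining budget; try smaller snippets
        parts.append(snippet)
        used += cost
    # Separators are not counted against the budget in this sketch.
    return "\n\n".join(parts)
```

Skipping rather than stopping at the first oversized snippet lets smaller, lower-ranked symbols still fill the remaining space.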
The MCP server exposes three tools that Claude Code, Cursor, and Codex call automatically. Or pipe the output of semtree context directly into any prompt.
Install, index, and connect to your AI assistant with three commands.
    $ pip install "semtree[all]"

    # Index your project
    $ cd your-project/
    $ semtree index

    # Configure all AI assistants
    $ semtree setup --target all

    # Query context from CLI
    $ semtree context "add rate limiting"

    # Search symbols
    $ semtree search "RateLimiter" -k class

    # Store project rules
    $ semtree memory add rule style "Use dataclasses"
The [all] extra includes tree-sitter parsers, tiktoken for accurate token counting, and the MCP server. Requires Python 3.11+.
Incremental by default - only changed files are re-parsed. Use --force to rebuild the full index. A .ctx/ directory is created locally.
semtree setup --target claude writes .claude/mcp.json. Restart Claude Code and the index_project, get_context, and search_symbols tools appear automatically.
Use semtree context "your task" to get a token-budgeted context string you can paste directly into any prompt or pipe into a script.
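For scripted use, the command can be called from Python and its output spliced into a prompt. This sketch assumes `semtree` is on the PATH and prints the context string to standard output; `fetch_context`, `build_prompt`, and the prompt layout are hypothetical.

```python
import subprocess


def fetch_context(task: str) -> str:
    """Run `semtree context <task>` and return whatever it prints to stdout."""
    result = subprocess.run(
        ["semtree", "context", task],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


def build_prompt(task: str, context: str) -> str:
    """Combine the retrieved context with the task description."""
    return f"Context:\n{context.strip()}\n\nTask: {task}"
```

A script would then call `build_prompt(task, fetch_context(task))` and send the result to whichever model it uses.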
How semtree extends and improves on the original context-lens approach.
| Feature | semtree | context-lens |
|---|---|---|
| Multi-language docstrings (Python, JS/TS, Go, Rust) | Yes | Python only |
| MCP auto-config (.claude/mcp.json) | Yes | Manual |
| Hook debounce (2s cooldown) | Yes | No |
| Git temporal context (author, date) | Yes | No |
| Intent detection | Weighted scoring | Regex 30% |
| Store return types | Dataclasses | Raw sqlite3.Row |
| CLI structure | Click groups | 1000-line monolith |
| Concurrent-safe indexing | Lock file | No protection |
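The "Store return types" row is easiest to see in code. The sketch below converts `sqlite3.Row` results into a typed dataclass at the storage boundary; the `symbols` table, its columns, `SymbolRecord`, and `fetch_symbols` are all hypothetical and do not reflect semtree's actual schema.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolRecord:
    name: str
    kind: str
    file: str
    line: int


def fetch_symbols(conn: sqlite3.Connection, kind: str) -> list[SymbolRecord]:
    """Return typed records instead of leaking sqlite3.Row objects to callers."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    rows = conn.execute(
        "SELECT name, kind, file, line FROM symbols WHERE kind = ? ORDER BY name",
        (kind,),
    )
    return [SymbolRecord(**dict(row)) for row in rows]
```

Callers then get attribute access, type hints, and equality comparison, and the rest of the code no longer depends on the database driver.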
Stop pasting entire directories. Give your AI exactly what it needs to understand your codebase.