Open source - MIT license

Semantic code trees for AI assistants

Index your codebase with tree-sitter. Give Claude Code, Cursor, Copilot, and Codex the exact context they need - not entire directories.

$ pip install "semtree[all]"
$ semtree index
$ semtree setup --target all

87% token savings: 45K tokens -> 6K tokens
0.2ms symbol queries: SQLite FTS5 full-text search
8+ languages supported: Python, JS/TS, Go, Rust, Java, C/C++

Everything your AI assistant needs

Built on tree-sitter. Designed for MCP. Zero cloud dependencies.

Multi-language docstrings

tree-sitter extracts symbols, docstrings, and signatures from Python, JavaScript, TypeScript, Go, Rust, Java, C, and C++, covering every function, class, method, and type definition.
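
To make the idea of symbol extraction concrete, here is a minimal sketch for Python source only. It deliberately swaps in the standard-library `ast` module as a stand-in for tree-sitter so that it runs without extra dependencies; the `Symbol` dataclass and `extract_symbols` function are hypothetical names, not semtree's API.

```python
import ast
from dataclasses import dataclass


@dataclass
class Symbol:
    """Hypothetical record for one extracted definition."""
    name: str
    kind: str        # "class", "method", or "function"
    signature: str
    docstring: str
    line: int


def extract_symbols(source: str) -> list[Symbol]:
    """Collect classes, methods, and functions from Python source text."""
    symbols: list[Symbol] = []

    def visit(node: ast.AST, inside_class: bool) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Positional parameters only; enough for an illustration.
                params = ", ".join(a.arg for a in child.args.args)
                kind = "method" if inside_class else "function"
                symbols.append(Symbol(child.name, kind, f"{child.name}({params})",
                                      ast.get_docstring(child) or "", child.lineno))
                visit(child, inside_class=False)
            elif isinstance(child, ast.ClassDef):
                symbols.append(Symbol(child.name, "class", child.name,
                                      ast.get_docstring(child) or "", child.lineno))
                visit(child, inside_class=True)
            else:
                visit(child, inside_class)

    visit(ast.parse(source), inside_class=False)
    return symbols


SAMPLE = '''
class RateLimiter:
    """Token bucket limiter."""

    def allow(self, key):
        """Return True if the call is permitted."""
        return True


def main():
    pass
'''

for sym in extract_symbols(SAMPLE):
    print(sym.kind, sym.signature, sym.line)
```

A tree-sitter based extractor would follow the same walk-and-collect shape, but over a language-specific grammar instead of Python's own AST.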

MCP auto-configuration

One command writes .claude/mcp.json, .cursor/mcp.json, and .vscode/settings.json automatically. Three MCP tools ready immediately.
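
As a rough illustration of what writing an MCP config file involves, the sketch below merges one server entry into a JSON file without clobbering existing entries. The `mcpServers` layout is the shape commonly used by MCP clients; whether semtree writes exactly this structure is an assumption, and `merge_mcp_server` plus the command and argument values are placeholders, not semtree's real launch command.

```python
import json
from pathlib import Path


def merge_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Add or update one server entry, keeping any other entries in the file."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})  # assumed top-level key
    servers[name] = {"command": command, "args": args}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


if __name__ == "__main__":
    # Placeholder values: substitute whatever command actually starts the server.
    merge_mcp_server(Path(".claude/mcp.json"), "semtree",
                     "placeholder-command", ["placeholder-arg"])
```

Merging rather than overwriting matters when a project already has other MCP servers configured.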

Smart intent detection

Weighted scoring detects whether you want to implement, debug, refactor, test, or explain - and picks the optimal retrieval strategy. Not a simple regex match.
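
Weighted scoring can be sketched in a few lines. The keyword table, the weights, and the `classify_intent` function below are invented for illustration and are not semtree's actual model; the point is only that each intent accumulates a score and the winner is reported with a normalized confidence.

```python
import re

# Hypothetical keyword weights per intent; real weights would be tuned.
INTENT_WEIGHTS = {
    "implement": {"add": 2.0, "create": 2.0, "implement": 3.0, "build": 1.5},
    "debug": {"fix": 3.0, "bug": 3.0, "error": 2.0, "crash": 2.5},
    "refactor": {"refactor": 3.0, "rename": 2.0, "simplify": 2.0},
    "test": {"test": 3.0, "coverage": 2.0, "mock": 2.0},
    "explain": {"explain": 3.0, "how": 1.5, "why": 1.5, "understand": 2.0},
}


def classify_intent(query: str) -> tuple[str, float]:
    """Return (intent, confidence), where confidence is the winner's share of the total score."""
    words = re.findall(r"[a-z]+", query.lower())
    scores = {
        intent: sum(weights.get(word, 0.0) for word in words)
        for intent, weights in INTENT_WEIGHTS.items()
    }
    total = sum(scores.values())
    if total == 0:
        # Assumption: fall back to "explain" with zero confidence when nothing matches.
        return "explain", 0.0
    best = max(scores, key=scores.get)
    return best, scores[best] / total


print(classify_intent("add rate limiting"))
print(classify_intent("fix the crash in the test"))
```

Unlike a first-match regex, a query that mentions both "fix" and "test" still resolves to a single intent, and the confidence shows how contested the decision was.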

Git temporal context

Every symbol is annotated with the git author and date from git blame. Your AI assistant knows who last touched a function and when.
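
One way to obtain per-line author and date is to parse the output of `git blame --line-porcelain <file>`. The sketch below is a hedged illustration of that parsing step only; `parse_line_porcelain` is a hypothetical helper, and how semtree actually maps blame data onto symbols is not specified here.

```python
import re
from datetime import datetime, timezone

# Header line: <40-hex commit> <original line> <final line> [<group size>]
HEADER = re.compile(r"^[0-9a-f]{40} \d+ (\d+)")


def parse_line_porcelain(text: str) -> dict[int, tuple[str, str]]:
    """Map each final line number to (author, ISO date in UTC)."""
    result: dict[int, tuple[str, str]] = {}
    line_no, author, when = None, "", ""
    for raw in text.split("\n"):
        if raw.startswith("\t"):
            # A tab-prefixed line is the source line itself and closes the record.
            if line_no is not None:
                result[line_no] = (author, when)
            continue
        match = HEADER.match(raw)
        if match:
            line_no = int(match.group(1))
        elif raw.startswith("author "):
            author = raw[len("author "):]
        elif raw.startswith("author-time "):
            stamp = int(raw.split()[1])
            when = datetime.fromtimestamp(stamp, tz=timezone.utc).date().isoformat()
    return result


# Hand-written sample in the porcelain layout (not real repository data).
SAMPLE = "\n".join([
    "a" * 40 + " 1 1 1",
    "author Ada Lovelace",
    "author-mail <ada@example.com>",
    "author-time 1700000000",
    "author-tz +0000",
    "summary Add limiter",
    "filename limiter.py",
    "\tdef allow(self, key):",
])

print(parse_line_porcelain(SAMPLE))
```

A symbol's annotation could then be taken from the line where its definition starts, or from the most recent date across its line range.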

Debounced hooks

The file-watcher applies a 2-second cooldown so rapid consecutive saves do not trigger redundant re-indexing. Incremental SHA-1 hashing skips unchanged files.
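
The two mechanisms described here, a cooldown and hash-based change detection, can be sketched as follows. `Debouncer` and `changed_files` are hypothetical names, and the injectable clock exists only so the behaviour is easy to test; this is not semtree's code.

```python
import hashlib
import time
from typing import Callable, Optional


class Debouncer:
    """Allow an action at most once per cooldown window."""

    def __init__(self, cooldown: float = 2.0, clock: Callable[[], float] = time.monotonic):
        self.cooldown = cooldown
        self.clock = clock
        self._last_run: Optional[float] = None

    def should_run(self) -> bool:
        now = self.clock()
        if self._last_run is not None and now - self._last_run < self.cooldown:
            return False  # still inside the cooldown, skip this event
        self._last_run = now
        return True


def changed_files(contents: dict[str, bytes], known_hashes: dict[str, str]) -> list[str]:
    """Return paths whose SHA-1 differs from the stored digest, updating the store."""
    changed = []
    for path, data in contents.items():
        digest = hashlib.sha1(data).hexdigest()
        if known_hashes.get(path) != digest:
            known_hashes[path] = digest
            changed.append(path)
    return changed


hashes: dict[str, str] = {}
print(changed_files({"a.py": b"x = 1\n"}, hashes))  # first sighting counts as changed
print(changed_files({"a.py": b"x = 1\n"}, hashes))  # same bytes, nothing to re-parse
```

SHA-1 is used here as a fast content fingerprint, not as a security measure, so its known collision weaknesses are not a practical concern for this purpose.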

Concurrent-safe indexing

A lock file prevents two concurrent processes from corrupting the SQLite database. Safe to run from file-watchers, pre-commit hooks, and CI pipelines simultaneously.
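
A lock file of this kind is typically built on an atomic "create only if absent" call. The sketch below shows that pattern with `os.open` and `O_EXCL`; the `index_lock` context manager, the `IndexLocked` exception, and the lock path are assumptions for illustration, and stale-lock recovery after a crash is left out.

```python
import os
from contextlib import contextmanager


class IndexLocked(RuntimeError):
    """Raised when another process already holds the lock."""


@contextmanager
def index_lock(path: str):
    try:
        # O_EXCL makes creation fail atomically if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise IndexLocked(f"another process holds {path}") from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    with index_lock("index.lock"):  # hypothetical lock location
        print("indexing with exclusive access")
```

A second process that hits `IndexLocked` can either exit quietly, which suits a file-watcher, or retry after a short delay, which suits CI.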

Index once, query forever

A simple three-step pipeline from your codebase to AI assistant context.

01 - Index

Parse your codebase

tree-sitter walks your project files, extracts symbols with git blame annotations, and stores everything in a local SQLite database.

$ semtree index
Indexed 847 files, 12,304 symbols
Done in 1.2s

02 - Query

Describe your task

The intent classifier scores your query, selects the best retrieval strategy, and builds a token-budgeted context string with only the relevant symbols.

$ semtree context "add rate limiting"
Intent: implement (0.87)
Symbols: 23 - Budget: 6K tokens
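
Token budgeting can be approximated with a greedy pass over ranked candidates. In the sketch below, `Candidate`, `estimate_tokens`, and `build_context` are hypothetical, and the four-characters-per-token estimate is a crude stand-in for a real tokenizer such as tiktoken.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    text: str
    score: float  # relevance to the query, higher is better


def estimate_tokens(text: str) -> int:
    """Rough estimate: about four characters per token."""
    return max(1, len(text) // 4)


def build_context(candidates: list[Candidate], budget: int) -> tuple[str, int]:
    """Pick the highest-scoring candidates that fit, skipping any that would overflow."""
    chosen: list[Candidate] = []
    used = 0
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        cost = estimate_tokens(cand.text)
        if used + cost > budget:
            continue  # too large for the remaining budget, try smaller ones
        chosen.append(cand)
        used += cost
    # Separators between snippets are not counted in this simplified estimate.
    return "\n\n".join(c.text for c in chosen), used


context, used = build_context(
    [
        Candidate("RateLimiter", "x" * 40, 0.9),
        Candidate("Middleware", "y" * 80, 0.8),
        Candidate("Config", "z" * 20, 0.5),
    ],
    budget=16,
)
print(used)
```

Skipping an oversized candidate instead of stopping lets smaller, lower-ranked symbols fill the remaining space.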

03 - Context

Feed your AI assistant

The MCP server exposes three tools that Claude Code, Cursor, and Codex call automatically. Or pipe the output directly into any prompt.

MCP tool: get_context
Returned: 6,012 tokens
Saved: 38,988 tokens (87%)
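
The figures above are consistent with a 45,000-token baseline (6,012 returned plus 38,988 saved). A short check of the arithmetic, using a hypothetical `token_savings` helper:

```python
def token_savings(baseline: int, returned: int) -> tuple[int, float]:
    """Return (tokens saved, percentage saved) relative to the baseline."""
    saved = baseline - returned
    return saved, saved / baseline * 100


saved, pct = token_savings(45_000, 6_012)
print(f"Saved: {saved:,} tokens ({pct:.0f}%)")  # Saved: 38,988 tokens (87%)
```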

Up and running in 60 seconds

Install, index, and connect to your AI assistant with three commands.

$ pip install "semtree[all]"

# Index your project
$ cd your-project/
$ semtree index

# Configure all AI assistants
$ semtree setup --target all

# Query context from CLI
$ semtree context "add rate limiting"

# Search symbols
$ semtree search "RateLimiter" -k class

# Store project rules
$ semtree memory add rule style "Use dataclasses"

1 - Install with all extras

The [all] extra includes tree-sitter parsers, tiktoken for accurate token counting, and the MCP server. Requires Python 3.11+.

2 - Index your project

Incremental by default - only changed files are re-parsed. Use --force to rebuild the full index. A .ctx/ directory is created locally.

3 - Connect your AI assistant

semtree setup --target claude writes .claude/mcp.json. Restart Claude Code and the index_project, get_context, and search_symbols tools appear automatically.

4 - Query from the terminal

Use semtree context "your task" to get a token-budgeted context string you can paste directly into any prompt or pipe into a script.
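
Piping the CLI output into a script might look like the sketch below. It calls the `semtree context` command shown above through `subprocess`; the helper names and the prompt layout are hypothetical, and it assumes the command prints the context string to standard output.

```python
import subprocess


def fetch_context(task: str) -> str:
    """Run `semtree context "<task>"` and return whatever it prints to stdout."""
    result = subprocess.run(
        ["semtree", "context", task],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def build_prompt(context: str, task: str) -> str:
    """Combine retrieved context and the task into a single prompt string."""
    return f"Relevant code context:\n{context.strip()}\n\nTask: {task}\n"


if __name__ == "__main__":
    task = "add rate limiting"
    print(build_prompt(fetch_context(task), task))
```

Keeping retrieval and prompt assembly in separate functions makes it easy to swap the prompt template without touching the CLI call.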

semtree vs context-lens

How semtree extends and improves on the original context-lens approach.

Feature                                             | semtree          | context-lens
Multi-language docstrings (Python, JS/TS, Go, Rust) | Yes              | Python only
MCP auto-config (.claude/mcp.json)                  | Yes              | Manual
Hook debounce (2s cooldown)                         | Yes              | No
Git temporal context (author, date)                 | Yes              | No
Intent detection                                    | Weighted scoring | Regex 30%
Store return types                                  | Dataclasses      | Raw sqlite3.Row
CLI structure                                       | Click groups     | 1000-line monolith
Concurrent-safe indexing                            | Lock file        | No protection

Feed smart context to your AI assistant

Stop pasting entire directories. Give your AI exactly what it needs to understand your codebase.