Tree-sitter-language-pack
A Rust core that bundles 306 tree-sitter grammars behind one parsing and code-intelligence API. Parsers download on demand and cache locally, so the install footprint stays small. Native bindings ship for Python, TypeScript / Node.js, Rust, Go, Java, Kotlin (Android), C#, Ruby, PHP, Elixir, Dart, Swift, Zig, and WebAssembly, plus the standalone ts-pack CLI.
Why tree-sitter-language-pack
- 306 Languages Available by v1.9
One pack covers every mainstream language and most niche ones — Python, Rust, Go, Java, TypeScript, C++, Kotlin, Swift, Zig, Elixir, Haskell, Julia, R, and 290+ more.
- Native-speed Parsing
Tree-sitter parsers are C code, called directly from a Rust core. No interpreter overhead, no per-file process spawn.
- On-demand Download
Parsers are fetched and cached on first use. The base install stays small; you only pay for the languages you actually parse.
- Code Intelligence
Beyond raw syntax trees: functions, classes, imports, exports, symbols, comments, and docstrings — extracted with one call.
- LLM-aware Chunking
Split source at natural boundaries (functions, classes, blocks) so chunks stay semantically intact for embeddings and prompt windows.
- 15 Language Surfaces + CLI
The same Rust core ships as a PyPI wheel, an npm module, a crate, a Go module, a Maven JAR, an Android AAR (Maven), a NuGet package, a gem, a Composer package, a Hex package, a pub.dev package, a SwiftPM package, a Zig tarball, a C FFI library, a WASM module, and a static-binary CLI.
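The on-demand download behaviour described above follows a familiar fetch-once, cache-locally pattern. The sketch below illustrates that pattern only. It is not the pack's implementation: `ParserCache`, `fake_fetch`, and the `<language>.bin` file layout are hypothetical names invented for this example, and the real cache location and file format may differ.

```python
import tempfile
from pathlib import Path
from typing import Callable


class ParserCache:
    """Fetch an artifact the first time it is requested, then reuse the local copy."""

    def __init__(self, cache_dir: Path, fetch: Callable[[str], bytes]) -> None:
        self.cache_dir = cache_dir
        self.fetch = fetch  # hypothetical downloader: language name -> parser bytes
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def get(self, language: str) -> Path:
        # Hypothetical layout: one file per language inside the cache directory.
        path = self.cache_dir / f"{language}.bin"
        if not path.exists():
            # Cache miss: download once and write the result to disk.
            path.write_bytes(self.fetch(language))
        return path


# Stand-in downloader that records each call instead of touching the network.
calls: list[str] = []


def fake_fetch(language: str) -> bytes:
    calls.append(language)
    return f"parser for {language}".encode()


with tempfile.TemporaryDirectory() as tmp:
    cache = ParserCache(Path(tmp), fake_fetch)
    first = cache.get("python")   # triggers a fetch
    second = cache.get("python")  # served from disk, no second fetch
    cache.get("rust")             # a different language triggers its own fetch
    fetched = list(calls)
    same_path = first == second
```

Only languages that are actually requested hit the downloader, which is why the base install can stay small.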
Language Support
| Language | Install | API Reference |
|---|---|---|
| Python | `pip install tree-sitter-language-pack` | API Reference |
| TypeScript / Node.js | `npm install @kreuzberg/tree-sitter-language-pack` | API Reference |
| Rust | `cargo add tree-sitter-language-pack` | API Reference |
| Go | `go get github.com/kreuzberg-dev/tree-sitter-language-pack/packages/go` | API Reference |
| Java | Maven Central: `dev.kreuzberg.treesitterlanguagepack:tree-sitter-language-pack` | API Reference |
| C# | `dotnet add package TreeSitterLanguagePack` | API Reference |
| Ruby | `gem install tree_sitter_language_pack` | API Reference |
| PHP | `composer require kreuzberg-dev/tree-sitter-language-pack` | API Reference |
| Elixir | `{:tree_sitter_language_pack, "~> 1.9"}` | API Reference |
| Dart / Flutter | `dart pub add tree_sitter_language_pack` | API Reference |
| Kotlin (Android) | `implementation("dev.kreuzberg.tslp.android:tree-sitter-language-pack-android:1.9.0-rc.49")` | API Reference |
| Swift | `.package(url: "https://github.com/kreuzberg-dev/tree-sitter-language-pack", exact: "1.9.0-rc.49")` | API Reference |
| Zig | `zig fetch --save <release tarball url>` | API Reference |
| WebAssembly | `npm install @kreuzberg/tree-sitter-language-pack-wasm` | API Reference |
| C (FFI) | Shared library + header | API Reference |
| CLI | `curl -fsSL https://raw.githubusercontent.com/kreuzberg-dev/tree-sitter-language-pack/main/install.sh \| bash` | CLI Guide |
→ See all 306 supported languages
Quick Example
```python
import tree_sitter_language_pack as tslp

# Parsers download automatically on first use
result = tslp.process(
    "def hello():\n print('world')\n",
    tslp.ProcessConfig(language="python", structure=True, imports=True),
)
print(f"Language: {result.language}")
print(f"Functions: {len(result.structure)}")
```

```typescript
import { process } from "@kreuzberg/tree-sitter-language-pack";

const result = process("function hello() { console.log('world'); }", {
  language: "javascript",
  structure: true,
  imports: true,
});
console.log(`Language: ${result.language}`);
console.log(`Functions: ${result.structure?.length ?? 0}`);
```

```rust
use tree_sitter_language_pack::{ProcessConfig, process};

fn main() -> anyhow::Result<()> {
    let config = ProcessConfig::new("rust").all();
    let result = process("fn main() { println!(\"hello\"); }", &config)?;
    println!("Language: {}", result.language);
    println!("Functions: {}", result.structure.len());
    Ok(())
}
```
Part of Kreuzberg.dev
Tree-sitter-language-pack is built by the kreuzberg.dev team, the same people behind a family of Rust-core, polyglot-bindings libraries.
- Document intelligence for 90+ formats (PDF, Office, images, HTML, email), with optional OCR.
- Managed document extraction API. Same engine as the open-source library, hosted.
- Fast HTML to Markdown conversion with the same Rust-core, polyglot-bindings shape.
- Polite, resumable web crawler that hands pages to html-to-markdown or Kreuzberg for extraction.
- Universal LLM API client: one surface across many providers, proxy and MCP servers included.
Join the community for questions, design discussions, and announcements across all kreuzberg.dev projects.
Explore the Docs
- Getting Started
Install for your language, download parsers, and parse your first file in minutes.
- Parsing
Build syntax trees, choose a language, walk nodes, handle parse errors.
- Code Intelligence
Structure, imports, exports, symbols, comments, and docstrings — not just raw nodes.
- Chunking for LLMs
Split source at natural boundaries so chunks stay semantically intact.
- Concepts
Architecture, download model, and the code-intelligence pipeline.
- API Reference
Complete reference for every binding: Python, TypeScript, Rust, Go, Java, Kotlin (Android), C#, Ruby, PHP, Elixir, Dart, Swift, Zig, WASM, and C FFI.
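To make "split source at natural boundaries" concrete, here is a minimal sketch of the idea. It uses Python's standard-library `ast` module as a stand-in for a tree-sitter syntax tree, so it only handles Python input and is not the pack's chunking API. The function name `chunk_top_level` and the `(kind, text)` chunk shape are assumptions made for this illustration.

```python
import ast


def chunk_top_level(source: str) -> list[tuple[str, str]]:
    """Split Python source into (kind, text) chunks at top-level statements."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            kind = "other"
        # lineno is 1-based and end_lineno is inclusive (Python 3.8+).
        start = node.lineno
        decorators = getattr(node, "decorator_list", None)
        if decorators:
            # Keep decorators attached to the definition they belong to.
            start = min(d.lineno for d in decorators)
        text = "\n".join(lines[start - 1 : node.end_lineno])
        chunks.append((kind, text))
    return chunks


SOURCE = '''import os

def hello():
    print("world")

class Greeter:
    def greet(self):
        return "hi"
'''

chunks = chunk_top_level(SOURCE)
kinds = [kind for kind, _ in chunks]
```

Each chunk starts and ends on a syntactic boundary, so a function or class is never cut in half the way a fixed-size character window might cut it.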
Getting Help
- Bugs and feature requests — Open an issue on GitHub
- Community chat — Join the Discord
- Contributing — Read the contributor guide