# Ruby API Reference

## Installation

Add to Gemfile:

```ruby
gem "tree_sitter_language_pack"
```

Then run:

```bash
bundle install
```

Or install directly:

```bash
gem install tree_sitter_language_pack
```

## Quick Start

```ruby
require "tree_sitter_language_pack"

# Pre-download languages
TreeSitterLanguagePack.init(["python", "rust"])

# Get a language
language = TreeSitterLanguagePack.get_language("python")

# Get a pre-configured parser
parser = TreeSitterLanguagePack.get_parser("python")
tree = parser.parse("def hello(): pass")
puts tree.root_node.sexp

# Extract code intelligence
config = TreeSitterLanguagePack::ProcessConfig.new("python").all
result = TreeSitterLanguagePack.process("def hello(): pass", config)
puts "Functions: #{result["structure"].length}"
```

## Download Management

### `TreeSitterLanguagePack.init(languages = nil, groups = nil, cache_dir = nil)`

Initialize the language pack with optional pre-downloads.

**Parameters:**

- `languages` (Array<String> | nil): Languages to download
- `groups` (Array<String> | nil): Language groups to download
- `cache_dir` (String | nil): Custom cache directory

**Returns:** nil

**Raises:**

- `DownloadError`: If a download fails or the network is unavailable

**Example:**

```ruby
# Pre-download specific languages
TreeSitterLanguagePack.init(["python", "javascript", "rust"])

# Or download language groups
TreeSitterLanguagePack.init(groups: ["web", "data"])

# With custom cache directory
TreeSitterLanguagePack.init(
  languages: ["python"],
  cache_dir: "/opt/ts-pack"
)
```

### `TreeSitterLanguagePack.configure(cache_dir = nil)`

Apply configuration without downloading.

Use this to set a custom cache directory before the first `get_language` call.

**Parameters:**

- `cache_dir` (String | nil): Custom cache directory

**Returns:** nil

**Raises:**

- `DownloadError`: If lock cannot be acquired

**Example:**

```ruby
TreeSitterLanguagePack.configure(cache_dir: "/data/ts-pack")

language = TreeSitterLanguagePack.get_language("python")
```

### `TreeSitterLanguagePack.download(names)`

Download specific languages to cache.

**Parameters:**

- `names` (Array<String>): Language names to download

**Returns:** Integer - Count of newly downloaded languages

**Raises:**

- `DownloadError`: If a download fails
- `LanguageNotFoundError`: If a requested language is not in the manifest

**Example:**

```ruby
count = TreeSitterLanguagePack.download(
  ["python", "rust", "typescript"]
)
puts "Downloaded #{count} new languages"
```

### `TreeSitterLanguagePack.download_all`

Download all available languages (170+).

**Returns:** Integer - Count of newly downloaded languages

**Raises:**

- `DownloadError`: If manifest fetch fails

**Example:**

```ruby
count = TreeSitterLanguagePack.download_all
puts "Downloaded #{count} languages total"
```

### `TreeSitterLanguagePack.manifest_languages`

Get all available languages from remote manifest.

Fetches and caches the manifest.

**Returns:** Array<String> - Sorted language names

**Raises:**

- `DownloadError`: If manifest fetch fails

**Example:**

```ruby
languages = TreeSitterLanguagePack.manifest_languages
puts "Available: #{languages.length} languages"
```

### `TreeSitterLanguagePack.downloaded_languages`

Get languages already cached locally.

Does not perform network requests.

**Returns:** Array<String> - Cached language names

**Example:**

```ruby
cached = TreeSitterLanguagePack.downloaded_languages
cached.each { |lang| puts lang }
```
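
Because `downloaded_languages` makes no network requests, it is a cheap way to work out what still needs fetching before calling `download`. The following is a minimal sketch that assumes only the documented return type; `missing_languages` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper (not part of the gem): returns the wanted languages
# that are not in the cached list. Both arguments are arrays of Strings.
def missing_languages(wanted, cached)
  wanted.uniq - cached
end

# Intended use once the gem is loaded:
#   cached  = TreeSitterLanguagePack.downloaded_languages
#   missing = missing_languages(%w[python rust], cached)
#   TreeSitterLanguagePack.download(missing) unless missing.empty?
```

The comparison is literal, so an alias such as `"shell"` may be reported as missing even when `"bash"` is already cached.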

### `TreeSitterLanguagePack.clean_cache`

Delete all cached parser shared libraries.

**Returns:** nil

**Raises:**

- `DownloadError`: If cache cannot be removed

**Example:**

```ruby
TreeSitterLanguagePack.clean_cache
puts "Cache cleaned"
```

### `TreeSitterLanguagePack.cache_dir`

Get the current cache directory path.

**Returns:** String - Absolute cache directory path

**Example:**

```ruby
dir = TreeSitterLanguagePack.cache_dir
puts "Cache at: #{dir}"
```

## Language Discovery

### `TreeSitterLanguagePack.get_language(name)`

Get a tree-sitter Language by name.

Resolves aliases (e.g., `"shell"` → `"bash"`). Auto-downloads if needed.

**Parameters:**

- `name` (String): Language name or alias

**Returns:** Language - tree-sitter Language object

**Raises:**

- `LanguageNotFoundError`: If language not recognized
- `DownloadError`: If auto-download fails

**Example:**

```ruby
language = TreeSitterLanguagePack.get_language("python")

parser = TreeSitter::Parser.new
parser.set_language(language)
tree = parser.parse("x = 1")
puts tree.root_node.type # "module"
```

### `TreeSitterLanguagePack.get_parser(name)`

Get a pre-configured Parser for a language.

**Parameters:**

- `name` (String): Language name or alias

**Returns:** Parser - Pre-configured tree-sitter Parser

**Raises:**

- `LanguageNotFoundError`: If language not recognized
- `DownloadError`: If auto-download fails
- `ParserError`: If parser setup fails

**Example:**

```ruby
parser = TreeSitterLanguagePack.get_parser("rust")
tree = parser.parse("fn main() {}")
puts tree.root_node.has_error? # false
```

### `TreeSitterLanguagePack.available_languages`

List all available language names.

**Returns:** Array<String> - Sorted language names

**Example:**

```ruby
langs = TreeSitterLanguagePack.available_languages
langs.each { |lang| puts lang }
```

### `TreeSitterLanguagePack.has_language?(name)`

Check if a language is available.

**Parameters:**

- `name` (String): Language name or alias

**Returns:** Boolean - True if available

**Example:**

```ruby
if TreeSitterLanguagePack.has_language?("python")
  puts "Python available"
end

raise "Shell not available" unless TreeSitterLanguagePack.has_language?("shell")
```

### `TreeSitterLanguagePack.language_count`

Get total number of available languages.

**Returns:** Integer - Language count

**Example:**

```ruby
count = TreeSitterLanguagePack.language_count
puts "#{count} languages available"
```

## Parsing

### `TreeSitterLanguagePack.parse_string(source, language)`

Parse source code into a syntax tree.

**Parameters:**

- `source` (String): Source code
- `language` (String): Language name

**Returns:** Tree - Parsed syntax tree

**Raises:**

- `LanguageNotFoundError`: If language not found
- `ParseError`: If parsing fails
- `DownloadError`: If auto-download fails

**Example:**

```ruby
tree = TreeSitterLanguagePack.parse_string(
  "def foo(): pass",
  "python"
)
puts tree.root_node.sexp
```

## Code Intelligence

### `TreeSitterLanguagePack.process(source, config)`

Extract code intelligence from source code.

**Parameters:**

- `source` (String): Source code
- `config` (ProcessConfig): Configuration

**Returns:** Hash - Result with structure, imports, exports, etc.

**Raises:**

- `LanguageNotFoundError`: If language not found
- `ParseError`: If parsing fails
- `ProcessError`: If analysis fails

**Example:**

```ruby
# Enable metrics as well, since the example reads result["metrics"] below.
config = TreeSitterLanguagePack::ProcessConfig.new("python")
  .structure
  .import_exports
  .metrics
  .with_chunks(2000, 400)

result = TreeSitterLanguagePack.process(
  "def hello(): pass",
  config
)

puts "Functions: #{result["structure"].length}"
puts "Lines: #{result["metrics"]["total_lines"]}"
```

## Types

### `ProcessConfig`

Configuration for code intelligence analysis.

**Constructor:**

```ruby
config = TreeSitterLanguagePack::ProcessConfig.new("python")
```

**Methods:**

#### `#structure`

Enable structure extraction.

#### `#import_exports`

Enable imports/exports extraction.

#### `#comments`

Enable comment extraction.

#### `#docstrings`

Enable docstring extraction.

#### `#symbols`

Enable symbol extraction.

#### `#metrics`

Enable metric extraction.

#### `#diagnostics`

Enable diagnostic extraction.

#### `#with_chunks(max_size, overlap)`

Configure code chunking.

#### `#all`

Enable all features.

**Example:**

```ruby
config = TreeSitterLanguagePack::ProcessConfig.new("python")
  .structure
  .import_exports
  .comments
  .with_chunks(2000, 400)
```
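
When the set of features comes from user input or a settings file, the chain can be built from a list of names. This is a sketch under the assumption, suggested by the chained calls above, that each feature method returns a chainable config; `enable_features` and `ALLOWED_FEATURES` are hypothetical names, not part of the gem.

```ruby
# Feature methods documented for ProcessConfig (with_chunks and all excluded).
ALLOWED_FEATURES = %i[
  structure import_exports comments docstrings symbols metrics diagnostics
].freeze

# Hypothetical helper: calls each named feature method in turn and returns
# the final config. Assumes every feature method returns a chainable config.
def enable_features(config, features)
  features.reduce(config) do |cfg, feature|
    unless ALLOWED_FEATURES.include?(feature)
      raise ArgumentError, "unknown feature: #{feature}"
    end

    cfg.public_send(feature)
  end
end

# Intended use once the gem is loaded:
#   base   = TreeSitterLanguagePack::ProcessConfig.new("python")
#   config = enable_features(base, %i[structure comments])
```

Checking names against an allow-list keeps `public_send` from invoking arbitrary methods when the list is not trusted.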

### Result Hash

Result from `process` method.

**Keys:**

- `"language"` (String) - Language name
- `"metrics"` (Hash) - File metrics
    - `"total_lines"` (Integer)
    - `"code_lines"` (Integer)
    - `"comment_lines"` (Integer)
    - `"blank_lines"` (Integer)
- `"structure"` (Array) - Code structure items
    - Each item has `"kind"`, `"name"`, `"line"`, `"column"`, etc.
- `"imports"` (Array) - Import statements
- `"exports"` (Array) - Export statements
- `"comments"` (Array) - Comments
- `"docstrings"` (Array) - Docstrings
- `"symbols"` (Array) - Symbols
- `"diagnostics"` (Array) - Diagnostics
- `"chunks"` (Array) - Code chunks
- `"parse_errors"` (Integer) - Number of parse errors

**Example:**

```ruby
result = TreeSitterLanguagePack.process(source, config)

puts result["language"]
result["structure"].each do |item|
  puts "  #{item["kind"]}: #{item["name"]}"
end
```
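
Which keys are populated may depend on the features enabled in the config, so code that reads the result is safer when it tolerates missing keys. A minimal sketch, assuming only the key names listed above; `summarize` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: builds a one-line summary from the documented keys.
# Missing or nil values fall back to zero or an empty list instead of raising.
def summarize(result)
  total  = result.dig("metrics", "total_lines") || 0
  code   = result.dig("metrics", "code_lines") || 0
  items  = Array(result["structure"]).length
  errors = result["parse_errors"] || 0

  "#{result["language"]}: #{total} lines (#{code} code), " \
    "#{items} items, #{errors} parse errors"
end

# Intended use once the gem is loaded:
#   puts summarize(TreeSitterLanguagePack.process(source, config))
```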

## Exception Handling

```ruby
require "tree_sitter_language_pack"

begin
  language = TreeSitterLanguagePack.get_language("python")
  parser = TreeSitter::Parser.new
  parser.set_language(language)
  tree = parser.parse("x = 1")
rescue TreeSitterLanguagePack::LanguageNotFoundError => e
  puts "Language not found: #{e.message}"
rescue TreeSitterLanguagePack::DownloadError => e
  puts "Download failed: #{e.message}"
rescue TreeSitterLanguagePack::ParseError => e
  puts "Parse error: #{e.message}"
rescue => e
  puts "Unexpected error: #{e.message}"
end
```
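
Download failures are often transient, so a caller may prefer to retry before giving up. The sketch below is a generic retry wrapper with exponential backoff; `with_retries` is a hypothetical helper, not part of the gem, and takes the error class as an argument so it stays independent of the library.

```ruby
# Hypothetical helper: runs the block, retrying when `error_class` is raised.
# Sleeps base_delay * 2**(attempt - 1) seconds between attempts and re-raises
# the error once `attempts` tries have failed.
def with_retries(error_class, attempts: 3, base_delay: 0.5)
  attempt = 0
  begin
    attempt += 1
    yield
  rescue error_class
    raise if attempt >= attempts

    sleep(base_delay * 2**(attempt - 1))
    retry
  end
end

# Intended use once the gem is loaded:
#   with_retries(TreeSitterLanguagePack::DownloadError) do
#     TreeSitterLanguagePack.download(["python"])
#   end
```

Only the named error class is retried; `LanguageNotFoundError` and other errors propagate immediately, since retrying them is unlikely to help.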

## Usage Patterns

### Pre-download Languages

```ruby
# config/initializers/tree_sitter.rb
TreeSitterLanguagePack.init(
  languages: %w[python rust typescript javascript]
)
```

Then use in your application:

```ruby
require "tree_sitter_language_pack"

# Fast, no network required
parser = TreeSitterLanguagePack.get_parser("python")
```

### Custom Cache Directory

```ruby
TreeSitterLanguagePack.configure(
  cache_dir: "/data/ts-pack-cache"
)

language = TreeSitterLanguagePack.get_language("python")
```

### Batch Processing

```ruby
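# `extension` selects files on disk; `language` is the name passed to
# ProcessConfig. They differ for most languages ("py" vs "python").
def analyze_files(dir, extension, language)
  Dir.glob(File.join(dir, "**", "*.#{extension}")).each do |file|
    begin
      source = File.read(file)
      config = TreeSitterLanguagePack::ProcessConfig.new(language).all
      result = TreeSitterLanguagePack.process(source, config)

      puts "#{file}: #{result["structure"].length} items"
    rescue => e
      # Report the failing file and continue with the rest.
      puts "Error in #{file}: #{e.message}"
    end
  end
end

analyze_files("./src", "py", "python")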
```
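
A file extension and a language name usually differ (`.py` versus `"python"`), so batch jobs over mixed directories need a mapping between the two. A minimal sketch; `EXTENSION_LANGUAGES` and `language_for` are hypothetical names, and the table is illustrative rather than something the gem provides.

```ruby
# Hypothetical extension-to-language table; extend it for your project.
# The language names are the ones used elsewhere in this reference.
EXTENSION_LANGUAGES = {
  ".py" => "python",
  ".rs" => "rust",
  ".js" => "javascript",
  ".ts" => "typescript"
}.freeze

# Returns the language name for a path, or nil when the extension is unknown.
def language_for(path)
  EXTENSION_LANGUAGES[File.extname(path).downcase]
end

# Intended use once the gem is loaded:
#   Dir.glob("./src/**/*").each do |file|
#     language = language_for(file) or next
#     config = TreeSitterLanguagePack::ProcessConfig.new(language).all
#     result = TreeSitterLanguagePack.process(File.read(file), config)
#   end
```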

### Parse and Walk Tree

```ruby
parser = TreeSitterLanguagePack.get_parser("python")
tree = parser.parse("def hello(): pass")

def walk_tree(node, depth = 0)
  indent = "  " * depth
  puts "#{indent}#{node.type}"

  node.children.each { |child| walk_tree(child, depth + 1) }
end

walk_tree(tree.root_node)
```
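
Recursive traversal can exhaust the Ruby call stack on very deeply nested trees. An explicit stack avoids that. The sketch below assumes only that nodes respond to `children`, as in the example above; `each_node` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: yields [node, depth] for every node in pre-order
# without recursion. Works with any object that responds to `children`.
def each_node(root)
  stack = [[root, 0]]
  until stack.empty?
    node, depth = stack.pop
    yield node, depth
    # Push children in reverse so the leftmost child is visited first.
    node.children.reverse_each { |child| stack.push([child, depth + 1]) }
  end
end

# Intended use once the gem is loaded:
#   tree = TreeSitterLanguagePack.get_parser("python").parse("def hello(): pass")
#   each_node(tree.root_node) { |node, depth| puts "#{"  " * depth}#{node.type}" }
```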

### Extract Specific Patterns

```ruby
config = TreeSitterLanguagePack::ProcessConfig.new("python")
  .structure

result = TreeSitterLanguagePack.process(File.read("code.py"), config)

# Find all functions
functions = result["structure"].select { |item| item["kind"] == "function" }
functions.each do |func|
  puts func["name"]
end
```
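
To get an overview of every kind at once rather than filtering for one, the structure items can be grouped by their `"kind"` key. A minimal sketch that relies only on the documented `"kind"` and `"name"` keys; `names_by_kind` is a hypothetical helper, and kind values other than `"function"` are illustrative.

```ruby
# Hypothetical helper: maps each "kind" to the names of its items,
# e.g. { "function" => ["hello", "main"] }.
def names_by_kind(structure)
  structure
    .group_by { |item| item["kind"] }
    .transform_values { |items| items.map { |item| item["name"] } }
end

# Intended use once the gem is loaded:
#   names_by_kind(result["structure"]).each do |kind, names|
#     puts "#{kind}: #{names.join(", ")}"
#   end
```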

### Concurrent Processing (with Mutex)

```ruby
require "tree_sitter_language_pack"

# Parsers are stateful, so this sketch gives each thread its own parser
# rather than sharing one. The Mutex serializes parser creation, which may
# trigger a download on first use.
PARSER_LOCK = Mutex.new

def build_parser(language)
  PARSER_LOCK.synchronize { TreeSitterLanguagePack.get_parser(language) }
end

threads = (1..10).map do |i|
  Thread.new do
    parser = build_parser("python")
    tree = parser.parse(File.read("file#{i}.py"))
    puts "Parsed file #{i}: #{tree.root_node.type}"
  end
end

threads.each(&:join)
```