scribe-cli is both an executable binary that can be run, and a library that can be used in Rust programs.

Installing `scribe` `test_budget_utilization` executables

Assuming you have Rust/Cargo installed, run this command in a terminal:

cargo install scribe-cli

This makes the `scribe` and `test_budget_utilization` commands available on your PATH, provided you allowed the Rust installer to modify PATH. To remove them, run `cargo uninstall scribe-cli`.

Adding `scribe` library as a dependency

Run this command in a terminal, in your project's directory:

cargo add scribe-cli

To add it manually, edit your project's Cargo.toml file and add to the [dependencies] section:

scribe-cli = "0.5.1"

The scribe library will be automatically available globally. Read the scribe library documentation.

Readme

Scribe - Advanced Code Analysis Library

Scribe is a comprehensive Rust library for code analysis, repository exploration, and intelligent file processing. It provides powerful tools for understanding codebases through heuristic scoring, graph analysis, and AI-powered insights.

🚀 Features

🔍 Intelligent File Analysis: Multi-dimensional heuristic scoring system for identifying important files
📊 Dependency Graph Analysis: PageRank centrality computation for understanding code relationships
⚡ High-Performance Scanning: Parallel file system traversal with git integration
🎯 Advanced Pattern Matching: Flexible glob and gitignore pattern support with preset configurations
🧠 Smart Code Selection: Context-aware code bundling and relevance scoring
🛠️ Extensible Architecture: Plugin system for custom analyzers and scorers
⚙️ Modular Design: Use only the features you need with optional components

📦 Installation

Add this to your Cargo.toml:

[dependencies]
scribe = "0.1.0"

Feature Flags

Scribe uses feature flags to allow selective compilation:

# Full installation (default)
scribe = "0.1.0"

# Minimal installation
scribe = { version = "0.1.0", default-features = false, features = ["core"] }

# Fast file operations only
scribe = { version = "0.1.0", default-features = false, features = ["fast"] }

# Analysis without graph features
scribe = { version = "0.1.0", default-features = false, features = ["core", "analysis", "scanner"] }

Available Features

Feature	Description	Dependencies
`default`	All features enabled	`core`, `analysis`, `graph`, `scanner`, `patterns`, `selection`
`core`	Essential types and utilities	None
`analysis`	Heuristic scoring and metrics	`core`
`graph`	PageRank centrality analysis	`core`, `analysis`
`scanner`	File system scanning	`core`
`patterns`	Pattern matching (glob, gitignore)	`core`
`selection`	Code selection and bundling	`core`, `analysis`, `graph`
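
The Dependencies column means that enabling one feature switches on others transitively. Cargo resolves this itself; the sketch below is only an illustration that transcribes the table and computes the full set a requested feature enables. The `direct_deps` and `resolve` functions are hypothetical helpers, not part of Scribe.

```rust
use std::collections::BTreeSet;

/// Direct dependencies of each feature, transcribed from the table above.
fn direct_deps(feature: &str) -> &'static [&'static str] {
    match feature {
        "default" => &["core", "analysis", "graph", "scanner", "patterns", "selection"],
        "analysis" | "scanner" | "patterns" => &["core"],
        "graph" => &["core", "analysis"],
        "selection" => &["core", "analysis", "graph"],
        _ => &[], // `core` (and any unknown name) has no dependencies
    }
}

/// Hypothetical helper: every feature switched on when `feature` is requested.
fn resolve(feature: &'static str) -> BTreeSet<&'static str> {
    let mut enabled = BTreeSet::new();
    let mut pending = vec![feature];
    while let Some(name) = pending.pop() {
        // Only expand a feature the first time it is seen.
        if enabled.insert(name) {
            pending.extend(direct_deps(name).iter().copied());
        }
    }
    enabled
}

fn main() {
    for feature in ["core", "graph", "selection"] {
        let enabled: Vec<_> = resolve(feature).into_iter().collect();
        println!("{feature} -> {}", enabled.join(", "));
    }
}
```

Requesting `selection` alone, for example, also enables `core`, `analysis`, and `graph`, which is why the "analysis without graph features" configuration above leaves `selection` out.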

Feature Groups

Group	Features	Use Case
`minimal`	`core`	Basic types and utilities only
`fast`	`core`, `scanner`, `patterns`	Quick file operations
`comprehensive`	All features	Complete analysis capabilities

🏃 Quick Start

Basic Repository Analysis

use scribe::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    // Analyze a repository with default settings
    let config = Config::default();
    let analysis = analyze_repository(".", &config).await?;
    
    // Get the most important files
    println!("Top 10 most important files:");
    for (file, score) in analysis.top_files(10) {
        println!("  {}: {:.3}", file, score);
    }
    
    // Display summary
    println!("\n{}", analysis.summary());
    
    Ok(())
}

Selective Feature Usage

// Using only core and scanner features
use scribe::core::Result;
use scribe::scanner::{Scanner, ScanOptions};

#[tokio::main]
async fn main() -> Result<()> {
    let scanner = Scanner::new();
    let options = ScanOptions::default()
        .with_git_integration(true)
        .with_parallel_processing(true);
    
    let files = scanner.scan(".", options).await?;
    println!("Found {} files", files.len());
    
    Ok(())
}

Pattern Matching

use scribe::patterns::presets;

#[tokio::main]
async fn main() -> scribe::Result<()> {
    // Use preset patterns for common file types
    let mut source_matcher = presets::source_code()?;
    let mut doc_matcher = presets::documentation()?;
    
    if source_matcher.should_process("src/main.rs")? {
        println!("Found source file!");
    }
    
    if doc_matcher.should_process("README.md")? {
        println!("Found documentation!");
    }
    
    Ok(())
}
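
For readers unfamiliar with glob matching, the sketch below shows roughly what a single glob test involves. It is not the scribe-patterns implementation: it supports only `*` and `?`, lets `*` match across `/`, and ignores gitignore rules, `**`, and character classes. The `glob_match` function is a hypothetical name.

```rust
/// Returns true if `text` matches `pattern`, where `*` matches any run of
/// characters (including none) and `?` matches exactly one character.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Pattern index of the last `*` and the text index it currently covers up to.
    let mut backtrack: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Record the star and first try matching it against nothing.
            backtrack = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = backtrack {
            // Mismatch: let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            ti = star_ti + 1;
            backtrack = Some((star_pi, star_ti + 1));
        } else {
            return false;
        }
    }
    // Text is exhausted; only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

fn main() {
    for (pattern, path) in [("*.rs", "src/main.rs"), ("*.rs", "README.md")] {
        println!("{pattern} vs {path}: {}", glob_match(pattern, path));
    }
}
```

Real gitignore matching adds negation, directory-only patterns, and anchoring, so a production matcher is considerably more involved than this.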

Graph Analysis

use scribe::graph::PageRankAnalysis;

#[tokio::main]
async fn main() -> scribe::Result<()> {
    let analysis = PageRankAnalysis::for_code_analysis()?;
    
    // Compute centrality for scan results
    // let centrality_results = analysis.compute_centrality(&scan_results)?;
    // let top_files = centrality_results.top_files_by_centrality(10);
    
    Ok(())
}
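
Because the centrality calls above are commented out, here is a rough, standard-library-only sketch of what PageRank over a file dependency graph computes. It is the textbook power iteration with a convergence check, not the scribe-graph implementation; the `pagerank` function, the sample graph, the damping factor 0.85, and the tolerance are all illustrative assumptions.

```rust
/// Textbook PageRank by power iteration.
/// `edges[i]` lists the nodes that node `i` links to (e.g. files it imports).
/// Every target index is assumed to be smaller than `edges.len()`.
fn pagerank(edges: &[Vec<usize>], damping: f64, tolerance: f64, max_iters: usize) -> Vec<f64> {
    let n = edges.len();
    if n == 0 {
        return Vec::new();
    }
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..max_iters {
        // Rank held by nodes without outgoing edges is spread over all nodes.
        let dangling: f64 = (0..n).filter(|&i| edges[i].is_empty()).map(|i| rank[i]).sum();
        let base = (1.0 - damping) / n as f64 + damping * dangling / n as f64;
        let mut next = vec![base; n];
        for (i, targets) in edges.iter().enumerate() {
            for &t in targets {
                next[t] += damping * rank[i] / targets.len() as f64;
            }
        }
        // Convergence detection: stop once the L1 change drops below tolerance.
        let delta: f64 = next.iter().zip(&rank).map(|(a, b)| (a - b).abs()).sum();
        rank = next;
        if delta < tolerance {
            break;
        }
    }
    rank
}

fn main() {
    // Hypothetical graph: main.rs -> lib.rs, main.rs -> util.rs, lib.rs -> util.rs.
    let names = ["main.rs", "lib.rs", "util.rs"];
    let edges = vec![vec![1, 2], vec![2], vec![]];
    let rank = pagerank(&edges, 0.85, 1e-9, 100);
    for (name, score) in names.iter().zip(&rank) {
        println!("{name}: {score:.3}");
    }
}
```

In this toy graph the file that most others depend on (util.rs) ends up with the highest score, which is the intuition behind using centrality to rank files.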

CLI Covering Sets

Scribe’s CLI can compute minimal covering sets:

--covering-set <name>: target a function/class/module by name.
--covering-set-diff: build a covering set for the current git diff (uses the dependency graph to include touched files plus related dependents/dependencies).
--diff-against <ref>: diff against a specific ref (defaults to HEAD).
Shared filters: --include-dependents, --max-depth, --max-files.
Output helper: add --line-numbers to prefix every line in the bundled files, making it easy for review agents to comment by line number.

Example:

cargo run --bin scribe -- --covering-set-diff --include-dependents --max-depth 2
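
To make the flags above more concrete, the sketch below shows one plausible way a diff-based covering set could be assembled: start from the changed files, follow dependency edges breadth-first up to a depth limit, optionally follow reverse edges (dependents), and stop at a file cap. This is an assumption about the general approach, not Scribe's actual algorithm. The `covering_set` function, its parameters (stand-ins for --include-dependents, --max-depth, and --max-files), and the sample graph are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical covering-set builder. `deps` maps a file to the files it depends on.
fn covering_set<'a>(
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
    changed: &[&'a str],
    include_dependents: bool,
    max_depth: usize,
    max_files: usize,
) -> Vec<String> {
    let mut seen: BTreeSet<&'a str> = BTreeSet::new();
    let mut order = Vec::new();
    let mut queue: VecDeque<(&'a str, usize)> = changed.iter().map(|&f| (f, 0)).collect();

    while let Some((file, depth)) = queue.pop_front() {
        if order.len() >= max_files {
            break; // file cap reached
        }
        if !seen.insert(file) {
            continue; // already included at a shallower or equal depth
        }
        order.push(file.to_string());
        if depth == max_depth {
            continue; // do not expand beyond the depth limit
        }
        // Forward edges: files this one depends on.
        for &dep in deps.get(file).into_iter().flatten() {
            queue.push_back((dep, depth + 1));
        }
        // Reverse edges: files that depend on this one.
        if include_dependents {
            for (&other, targets) in deps {
                if targets.contains(&file) {
                    queue.push_back((other, depth + 1));
                }
            }
        }
    }
    order
}

fn main() {
    // Hypothetical graph: cli.rs -> api.rs -> core.rs, and tests.rs -> api.rs.
    let mut deps = BTreeMap::new();
    deps.insert("cli.rs", vec!["api.rs"]);
    deps.insert("api.rs", vec!["core.rs"]);
    deps.insert("tests.rs", vec!["api.rs"]);

    let set = covering_set(&deps, &["api.rs"], true, 1, 10);
    println!("{set:?}");
}
```

With dependents included, a change to api.rs pulls in both what it uses (core.rs) and what uses it (cli.rs, tests.rs); without them, only the forward dependency is added.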

🏗️ Architecture

Scribe is built with a modular architecture where each crate provides specific functionality:

┌───────────────────────────────────────────────────────────────────┐
│                              scribe                               │
│ ┌─────────────────┐ ┌────────────────┐ ┌────────────────────────┐ │
│ │ scribe-core     │ │ scribe-scanner │ │ scribe-patterns        │ │
│ │ (types,         │ │ (file system   │ │ (glob, gitignore,      │ │
│ │ traits,         │ │ traversal,     │ │ pattern matching)      │ │
│ │ utilities)      │ │ git support)   │ │                        │ │
│ └─────────────────┘ └────────────────┘ └────────────────────────┘ │
│ ┌─────────────────┐ ┌────────────────┐ ┌────────────────────────┐ │
│ │ scribe-analysis │ │ scribe-graph   │ │ scribe-selection       │ │
│ │ (heuristic      │ │ (PageRank      │ │ (intelligent bundling, │ │
│ │ scoring,        │ │ centrality,    │ │ context extraction,    │ │
│ │ code metrics)   │ │ dependency     │ │ relevance scoring)     │ │
│ │                 │ │ analysis)      │ │                        │ │
│ └─────────────────┘ └────────────────┘ └────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘

Component Overview

scribe-core: Foundation types, traits, configuration, and utilities
scribe-scanner: High-performance file system traversal with git integration
scribe-patterns: Flexible pattern matching with glob and gitignore support
scribe-analysis: Heuristic scoring algorithms and code metrics
scribe-graph: PageRank centrality and dependency graph analysis
scribe-selection: Intelligent code selection and context extraction

📖 Examples

The repository includes two examples demonstrating different usage patterns.

Run Examples

# Full analysis example
cargo run --example basic_usage -- /path/to/repository

# Minimal features example  
cargo run --example selective_features --no-default-features --features="core,scanner" -- /path/to/directory

Available Examples

basic_usage.rs: Complete repository analysis with all features
selective_features.rs: Minimal usage with core and scanner only

🔧 Performance

Scribe is designed for high performance:

Memory Efficient: Streaming file processing with configurable memory limits
Parallel Processing: Multi-threaded scanning and analysis using Rayon
Git Integration: Fast file discovery using git ls-files when available
Optimized Algorithms: Research-grade PageRank implementation with convergence detection
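
As an illustration of the git-integration point, file discovery via git ls-files with a fallback might look roughly like the sketch below. It uses only the standard library; `parse_ls_files`, `walk`, and `list_files` are hypothetical helpers, not Scribe's scanner API, and the fallback is a plain recursive directory walk that does not honor .gitignore.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Turn `git ls-files` output into paths under `root`, one per non-empty line.
fn parse_ls_files(root: &Path, stdout: &str) -> Vec<PathBuf> {
    stdout
        .lines()
        .filter(|line| !line.is_empty())
        .map(|line| root.join(line))
        .collect()
}

/// Plain recursive walk, used when git is unavailable or `root` is not a repository.
fn walk(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            walk(&path, out)?;
        } else {
            out.push(path);
        }
    }
    Ok(())
}

/// Try `git ls-files` first; fall back to walking the directory tree.
fn list_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let git = Command::new("git").arg("ls-files").current_dir(root).output();
    if let Ok(output) = git {
        if output.status.success() {
            let stdout = String::from_utf8_lossy(&output.stdout);
            return Ok(parse_ls_files(root, &stdout));
        }
    }
    let mut files = Vec::new();
    walk(root, &mut files)?;
    Ok(files)
}

fn main() -> io::Result<()> {
    let files = list_files(Path::new("."))?;
    println!("Found {} files", files.len());
    Ok(())
}
```

Asking git for the tracked file list avoids touching ignored build output and dependency directories at all, which is typically where a naive walk spends most of its time.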

Benchmarks

Run benchmarks to see performance characteristics:

cargo bench

Performance characteristics on typical repositories:

Small repos (< 1k files): ~10-50ms analysis time
Medium repos (1k-10k files): ~100ms-1s analysis time
Large repos (> 10k files): ~1-10s analysis time
Memory usage: ~2MB per 1000 files for basic analysis

🛠️ Development

Building

# Build all features
cargo build

# Build with specific features
cargo build --no-default-features --features="core,scanner"

# Build for release
cargo build --release

Testing

# Run all tests
cargo test

# Test specific features
cargo test --no-default-features --features="core,analysis"

# Run tests with output
cargo test -- --nocapture

Documentation

# Generate documentation
cargo doc --open

# Generate documentation for all features
cargo doc --all-features --open

[scribe-cli]: Command-line interface for Scribe
[scribe-vscode]: Visual Studio Code extension
[scribe-jupyter]: Jupyter notebook integration

📄 License

This project is licensed under either of

Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Contribution Guidelines

Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Ensure all tests pass
Submit a pull request

📞 Support

📖 Documentation: docs.rs/scribe
🐛 Issues: GitHub Issues
💬 Discussions: GitHub Discussions

🙏 Acknowledgments

Built with Rust 🦀
Uses tree-sitter for parsing
Inspired by research in code analysis and repository mining
Community feedback and contributions
