# Performance

This page documents Forge's performance characteristics, benchmarks, and optimization guidelines.

## Table of Contents

- [Overview](#overview)
- [Benchmark Infrastructure](#benchmark-infrastructure)
- [Git Operations Benchmarks](#git-operations-benchmarks)
- [Data Operations Benchmarks](#data-operations-benchmarks)
- [Performance Summary](#performance-summary)
- [Complexity Analysis](#complexity-analysis)
- [Optimization Guidelines](#optimization-guidelines)

---

## Overview

Forge's performance profile is dominated by Git I/O rather than computational complexity. The benchmark suite quantifies all core operations to enable informed optimization decisions and regression testing.

**Key Findings:**

- **I/O is the bottleneck**: Repository discovery (~10 ms) dominates all other operations by 10-100×
- **Git operations are fast**: Most git2 operations complete in microseconds
- **Data operations are negligible**: Developer/module management operates in nanoseconds
- **Scaling is predictable**: Linear scaling observed where expected (file count, commit count)
- **No runaway costs**: Built-in limits prevent performance degradation on large repositories

---

## Benchmark Infrastructure

Forge uses [Criterion.rs](https://github.com/bheisler/criterion.rs) for statistical benchmarking with HTML reports.

### Running Benchmarks

```bash
# Run all benchmarks
cargo bench

# Run a specific benchmark suite
cargo bench --bench git_operations
cargo bench --bench data_operations

# Run a specific benchmark
cargo bench -- discover_repo

# View HTML reports
open target/criterion/report/index.html
```

### Benchmark Suites

1. **`benches/git_operations.rs`** — 8 benchmarks covering repository discovery, status, staging, history, and branches
2. **`benches/data_operations.rs`** — 4 benchmarks covering module/developer management and auto-population

### Dependencies

```toml
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
tempfile = "3.8.1"
```

---

## Git Operations Benchmarks

### 1. Repository Discovery

**Benchmark**: `discover_repo`

- **Baseline**: 9.65 ms average
- **Profile**: I/O-bound, O(1) practical complexity
- **Description**: Walks the filesystem to find the `.git` directory
- **Insight**: Slowest operation; dominated by filesystem I/O

**Code Tested:**

```rust
GitClient::discover(&repo_path)
```

---

### 2. HEAD Branch Retrieval

**Benchmark**: `head_branch`

- **Baseline**: 47.18 µs average
- **Profile**: O(1), in-memory HEAD reference lookup
- **Description**: Retrieves the current branch from HEAD
- **Insight**: Extremely fast; minimal overhead

**Code Tested:**

```rust
git_client.head_branch()
```

---

### 3. List File Changes

**Benchmark**: `list_changes`

| File Count | Time      | Scaling |
| ---------- | --------- | ------- |
| 10 files   | 548.96 µs | —       |
| 50 files   | 2.79 ms   | 5.08×   |

- **Profile**: O(n), linear in file count
- **Description**: Lists all file status changes via `git status`
- **Insight**: Expected scaling; realistic for typical projects

**Code Tested:**

```rust
git_client.list_changes()
```

---

### 4. Get Commit History

**Benchmark**: `get_commit_history`

| Commit Count | Time    | Scaling |
| ------------ | ------- | ------- |
| 10 commits   | 333 µs  | —       |
| 50 commits   | 1.70 ms | 5.10×   |
| 100 commits  | 1.68 ms | 0.99×   |

- **Profile**: O(n) up to the 50-commit limit, then constant
- **Description**: Retrieves the commit log with a built-in 50-commit limit
- **Insight**: The built-in limit prevents runaway cost on large histories

**Code Tested:**

```rust
git_client.get_commit_history()
```

---

### 5. List Local Branches

**Benchmark**: `list_branches_local`

- **Baseline**: 11.48 µs
- **Profile**: O(1) practical complexity (branch counts are typically small)
- **Description**: Lists all local branches
- **Insight**: Very fast; negligible overhead

**Code Tested:**

```rust
git_client.list_branches(BranchType::Local)
```

---

### 6. List Remote Branches

**Benchmark**: `list_branches_remote`

- **Baseline**: 11.43 µs (equivalent to local)
- **Profile**: O(1) practical complexity
- **Description**: Lists all remote branches
- **Insight**: Just as fast as local; remote-tracking refs are stored locally, so there is no network latency

**Code Tested:**

```rust
git_client.list_branches(BranchType::Remote)
```

---

### 7. Stage File

**Benchmark**: `stage_file`

- **Baseline**: 26.39 µs
- **Profile**: O(1) operation
- **Description**: Stages a file for commit via `git add`
- **Insight**: Very fast; single index update

**Code Tested:**

```rust
git_client.stage_file("file.txt")
```

---

### 8. Unstage File

**Benchmark**: `unstage_file`

- **Baseline**: 10.62 µs (fastest Git operation)
- **Profile**: O(1) operation
- **Description**: Unstages a file from the staging area
- **Insight**: Extremely fast; minimal overhead

**Code Tested:**

```rust
git_client.unstage_file("file.txt")
```

---

## Data Operations Benchmarks

### 1. Bump Module Progress

**Benchmark**: `bump_progress`

- **Baseline**: 120-200 µs (varies with module count)
- **Profile**: O(n) iteration over the selected project's modules
- **Description**: Increments module progress on commit
- **Insight**: Fast; module counts are typically small (< 50 modules)

**Code Tested:**

```rust
store.bump_progress_on_commit(&project_id, &module_id)
```

---

### 2. Add Developer

**Benchmark**: `add_developer`

| Developer Count | Time   | Scaling |
| --------------- | ------ | ------- |
| 10 developers   | 204 ns | —       |
| 100 developers  | 188 ns | 0.92×   |
| 1000 developers | 149 ns | 0.79×   |

- **Profile**: O(1) operation (scale-independent)
- **Description**: Adds a new developer to a project
- **Insight**: Extremely fast; a single vector append

**Code Tested:**

```rust
store.add_developer(&project_id, developer)
```

---

### 3. Delete Developer

**Benchmark**: `delete_developer`

- **Baseline**: 91 ns (fastest operation overall)
- **Profile**: O(1) operation (retain filter on a small vector)
- **Description**: Removes a developer by ID
- **Insight**: Extremely fast; vector filtering is negligible

**Code Tested:**

```rust
store.delete_developer(&project_id, &developer_id)
```

---

### 4. Auto-Populate Developers

**Benchmark**: `auto_populate_developers`

| Committer Count | Time    | Scaling |
| --------------- | ------- | ------- |
| 10 committers   | 233 ns  | —       |
| 100 committers  | 12.6 µs | 54.2×   |
| 1000 committers | 840 µs  | 66.7×   |

- **Profile**: O(n) with duplicate checking (HashSet insertion/lookup)
- **Description**: Extracts unique developers from the git committer list
- **Insight**: Acceptable for typical git histories; the measured scaling is superlinear, so cost becomes noticeable at 1000+ committers

**Code Tested:**

```rust
store.auto_populate_developers(&project_id, &git_client)
```

---

## Performance Summary

### All Operations Ranked by Speed

| Operation               | Time       | Scaling | Category  |
| ----------------------- | ---------- | ------- | --------- |
| delete_developer        | 91 ns      | O(1)    | Data      |
| add_developer           | 149-204 ns | O(1)    | Data      |
| auto_populate (10)      | 233 ns     | O(n)    | Data      |
| unstage_file            | 10.62 µs   | O(1)    | Git       |
| list_branches_remote    | 11.43 µs   | O(1)    | Git       |
| list_branches_local     | 11.48 µs   | O(1)    | Git       |
| auto_populate (100)     | 12.6 µs    | O(n)    | Data      |
| stage_file              | 26.39 µs   | O(1)    | Git       |
| head_branch             | 47.18 µs   | O(1)    | Git       |
| bump_progress           | 120-200 µs | O(n)    | Data      |
| get_commit_history (10) | 333 µs     | O(n)    | Git       |
| list_changes (10)       | 549 µs     | O(n)    | Git       |
| auto_populate (1000)    | 840 µs     | O(n)    | Data      |
| get_commit_history (50) | 1.70 ms    | O(n)    | Git       |
| list_changes (50)       | 2.79 ms    | O(n)    | Git       |
| discover_repo           | 9.65 ms    | O(1)\*  | Git (I/O) |

\*Repository discovery appears O(1) in practice but scales with directory depth; dominated by filesystem I/O.

### Typical Workflow Performance

**Full Workflow**: Discover → List Changes → Stage File → Commit

- **Total Time**: ~13-15 ms
- **Dominated by**: Initial repository discovery (9.65 ms)
- **Interactive Performance**: Keystroke-to-response is dominated by Git I/O, not computation

---

## Complexity Analysis

### Repository Discovery

- **Complexity**: O(1) practical complexity
- **Time**: < 10 ms for most repositories
- **Description**: Uses a filesystem walk to find the `.git` folder
- **Bottleneck**: Filesystem I/O

### File Status Retrieval

- **Complexity**: O(n), where n = number of files in the working directory
- **Time**: 50-500 ms on large repositories, depending on file count and filesystem speed (under 3 ms in the benchmark repositories)
- **Optimization**: Git status is cached until the next refresh

### Commit History

- **Complexity**: O(n) up to the built-in limit
- **Limit**: 50 most recent commits (hardcoded)
- **Time**: 100-200 ms for large repositories
- **Trade-off**: Faster rendering vs. comprehensive history

### Diff Preview Generation

- **Complexity**: O(file_size), proportional to the changed file's size
- **Strategy**: Lazy generation (only when a file is selected)
- **Caching**: Cached for the selected file; cleared on selection change

### Merge Visualizer

- **Parsing**: Minimal overhead (just tracking the file list)
- **Diff Generation**: Same as the Changes view (on-demand)
- **Resolution Tracking**: HashMap lookup, O(1)

---

## Optimization Guidelines

### Current Bottlenecks

1. **Repository Discovery** (9.65 ms)
   - Dominated by filesystem I/O
   - One-time cost on startup
   - Not a concern for interactive performance after the initial load
2. **File Status Retrieval** (2.79 ms for 50 files)
   - Linear scaling with file count
   - Expected and acceptable
   - No optimization needed for typical projects

### Optimization Opportunities

1. **Commit History Caching**
   - Currently regenerated on each view switch
   - Consider caching with invalidation on new commits
2. **Diff Preview Caching**
   - Currently cached per file
   - Consider expanding the cache to recently viewed files
3. **Background Refresh**
   - Git status could be refreshed in the background
   - Display stale data while fetching the new status

### Performance Anti-Patterns to Avoid

1. **Unbounded Operations**
   - Always limit commit history fetches
   - Always limit file listings for large repositories
2. **Synchronous I/O in the Render Loop**
   - Move Git operations outside the render path
   - Cache results and refresh asynchronously
3. **Repeated Allocations**
   - Reuse buffers where possible
   - Use `Vec::with_capacity()` when the size is known

### Benchmarking Best Practices

1. **Run benchmarks before optimizing**
   - Quantify the problem before solving it
   - Avoid premature optimization
2. **Use Criterion for statistical analysis**
   - Multiple iterations eliminate noise
   - Confidence intervals show significance
3. **Test with realistic data**
   - Use typical repository sizes
   - Test edge cases (1 file, 1000 files)
4. **Profile before micro-optimizing**
   - Use `perf` or `flamegraph` to find hotspots
   - Focus on high-impact areas

---

## See Also

- **[Architecture](Architecture.md)** — System design and data flow
- **[Development](Development.md)** — Contributing and testing guide
- **[Features](Features.md)** — Implemented features
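---

The commit-history caching opportunity listed under Optimization Guidelines can be sketched with a cache keyed by the current HEAD commit id: any new commit moves HEAD, so the next lookup misses and refetches, which gives "invalidation on new commits" for free. Everything here is illustrative, not Forge's actual API — `HistoryCache`, `get_or_fill`, and the string-based commit list are stand-ins:

```rust
use std::collections::HashMap;

/// Hypothetical cache: commit lists keyed by the HEAD commit id.
/// Entries are reused until HEAD changes (i.e., a new commit lands).
struct HistoryCache {
    entries: HashMap<String, Vec<String>>,
}

impl HistoryCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Return the cached history for `head`, running the (expensive)
    /// `fetch` closure at most once per HEAD id.
    fn get_or_fill<F>(&mut self, head: &str, fetch: F) -> &[String]
    where
        F: FnOnce() -> Vec<String>,
    {
        self.entries.entry(head.to_string()).or_insert_with(fetch)
    }
}

fn main() {
    let mut cache = HistoryCache::new();
    // First access runs the fetch closure (stands in for a git log walk)...
    let first = cache
        .get_or_fill("abc123", || {
            vec!["commit 2".to_string(), "commit 1".to_string()]
        })
        .len();
    // ...a second access for the same HEAD id hits the cache instead.
    let second = cache.get_or_fill("abc123", || unreachable!("cache hit expected")).len();
    assert_eq!(first, second);
    println!("cached {} commits", first); // prints "cached 2 commits"
}
```

A real implementation would bound the map (one entry is enough for the current view) and store the parsed commit structs rather than strings, but the invalidation principle is the same.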