Skip to content

perf: improve git I/O operation management#1242

Merged
scolladon merged 31 commits intomainfrom
feat/performance-optimization
Mar 7, 2026
Merged

perf: improve git I/O operation management#1242
scolladon merged 31 commits intomainfrom
feat/performance-optimization

Conversation

@scolladon
Copy link
Owner

@scolladon scolladon commented Mar 6, 2026

Explain your changes


Performance optimization of git I/O operations to reduce heap pressure and subprocess overhead.

Key changes:

  • Batch cat-file process: Replace individual git show subprocess spawns with a single long-lived git cat-file --batch child process per adapter instance, using a FIFO queue for stdin/stdout coordination
  • Scoped tree index: Replace full git ls-tree -r with scoped git ls-tree -r <rev> -- <path1> <path2> targeting only metadata directories that handlers actually need. Scope is computed from diff lines + metadata registry analysis
  • Remove buildTreeIndex fallback: The tree index is now populated exclusively by preBuildTreeIndex — no lazy fallback via unscoped ls-tree. pathExistsImpl, getFilesPathCached, and listDirAtRevision return empty results if no index was pre-built
  • Parallel tree index builds: preBuildTreeIndex for config.to and config.from run concurrently via Promise.all
  • Simplified scope resolution: Deduplicated preBuildTreeIndex call pattern in main.ts by defaulting scope to config.source and overriding when no include flags are set
  • Direct file copy (GitCopy): Single-file copy operations read directly via git cat-file without requiring the tree index
  • GitDirCopy: Directory copy operations enumerate files via getFilesPath (uses tree index) then copy each file via cat-file
  • Parallel git diff: Run git diff for A/M/D filters concurrently instead of sequentially
  • Parallel config validation: Run independent validation phases concurrently
  • FlowTranslationProcessor: Use git grep with pathspec globs instead of tree index to find translation files
  • IOExecutor moved to adapter layer: Relocated from src/service/ to src/adapter/ as it is an I/O adapter, not a service

Safety:

  • Orphaned git cat-file processes are prevented via GitAdapter.closeAll() in finally block and close event handlers that reject pending promises

  • Concurrency bounded by getConcurrencyThreshold() throughout

  • Jest tests added to cover the fix.

  • NUT tests added to cover the fix.

  • E2E tests added to cover the fix.

Any particular element that can be tested locally


No CLI-facing changes. Internal optimization only. Run sf sgd source delta as usual — behavior is identical, execution should be faster on large repos with reduced heap allocation.

Any other comments


  • 100% code coverage maintained (lines, statements, branches)
  • E2E non-regression tests pass without modification to expected output
  • DESIGN.md updated to reflect all architectural changes

…anup

- Add ROOT_PATHS handling in pathExistsImpl and getFilesPathCached to
  correctly resolve '', '.', './' paths against the tree index
- Add GitAdapter.closeAll() to ensure all batch cat-file processes are
  cleaned up, including those from alternate-revision instances
- Move batch process cleanup from IOExecutor to main.ts for centralized
  lifecycle management
- Wrap main.ts processing in try/finally to ensure GitAdapter.closeAll()
  runs on error paths, preventing orphaned git cat-file processes
- Add NaN guard in GitBatchCatFile._processBuffer to reject malformed
  headers instead of silently corrupting data
- Clear static instances map in closeAll() to release tree index memory
- Add closeAll() and malformed header tests for 100% coverage
- Update DESIGN.md to reflect tree index and batch cat-file architecture
@scolladon scolladon requested a review from mehdicherf as a code owner March 6, 2026 11:17
@codecov
Copy link

codecov bot commented Mar 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (2168ad0) to head (8e3ccad).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##              main     #1242    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files           58        60     +2     
  Lines         1603      1731   +128     
  Branches       213       249    +36     
==========================================
+ Hits          1603      1731   +128     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

- Raise jest coverage thresholds to 100% for all metrics
- Add tests for cached batch cat-file reuse, empty dir listing,
  and closeAll with active instances
- Remove unreachable defensive guard in listDirAtRevision
…edly

Add close event handler to GitBatchCatFile that rejects all queued
promises if the process exits with non-zero code, preventing indefinite
hangs. Document unbounded memory characteristic in DESIGN.md.
@scolladon scolladon changed the title perf: reduce git subprocess spawns via tree index and batch cat-file perf: improve git I/O operation management Mar 6, 2026
scolladon added 16 commits March 6, 2026 15:02
Accept string | string[] for the path parameter in gitGrep, spreading
the array as multiple pathspec arguments to git grep. Existing callers
passing a single string are unaffected.
Mirror the GitAdapter.gitGrep signature change, allowing callers to
pass multiple pathspecs through the fsHelper ACL.
Replace readDirs(config.source) + extension filter with grepContent()
using pathspec globs. This finds only translation files containing
flowDefinitions without building or querying the tree index, removing
FlowTranslationProcessor's dependency on the full source file listing.
Introduce GitDirCopy kind for directory copies (used by
ContainedDecomposedHandler). IOExecutor now handles GitCopy as direct
single-file cat-file without querying the tree index, and GitDirCopy
as directory enumeration via getFilesPath.

This decouples standard metadata types from the tree index entirely.
Add public method to pre-populate the tree index cache with a scoped
git ls-tree call. When called before handler processing, all subsequent
buildTreeIndex calls for the same revision hit the cache, avoiding a
full-repo ls-tree.
Analyze diff lines against the metadata registry to determine which
type directories need tree index access. Standard types are excluded,
bundle types scope to component directories, and all others scope to
type directories.
After computing diff lines, analyze which metadata type directories
need tree index access and pre-build a scoped git ls-tree covering
only those directories. When --include is set, falls back to scoping
by config.source. When generateDelta is false, no tree index is built.
- Add per-file ignore check in _executeGitDirCopy for child files
- Add GitAdapter.closeAll() in test beforeEach to prevent singleton leak
@github-actions
Copy link

github-actions bot commented Mar 7, 2026

Published under dev-1242 npm channel.

$ sf plugins install sfdx-git-delta@dev-1242

@scolladon scolladon merged commit fe2c967 into main Mar 7, 2026
23 of 24 checks passed
@scolladon scolladon deleted the feat/performance-optimization branch March 7, 2026 13:30
@github-actions
Copy link

github-actions bot commented Mar 7, 2026

Shipped in release v6.33.1.
Version v6.33.1 will be assigned to the latest npm channel soon
Install it using either v6.33.1 or the latest-rc npm channel

$ sf plugins install sfdx-git-delta@latest-rc
# Or
$ sf plugins install sfdx-git-delta@v6.33.1

💡 Enjoying sfdx-git-delta?
Your contribution helps us provide fast support 🚀 and high quality features 🔥
Become a sponsor 💙
Happy incremental deployment!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant