Skip to content

Perf: Bundle high-impact optimizations#606

Open
svdrecbd wants to merge 4 commits intopinellolab:masterfrom
svdrecbd:perf/optimization-bundle
Open

Perf: Bundle high-impact optimizations#606
svdrecbd wants to merge 4 commits intopinellolab:masterfrom
svdrecbd:perf/optimization-bundle

Conversation

@svdrecbd
Copy link
Copy Markdown

  • Speed up reverse_complement using str.translate (~3.4x faster).
  • Optimize find_indels_substitutions using set lookups.
  • Implement Banded Needleman-Wunsch alignment in CRISPResso2Align.pyx (~2.4x alignment speedup).
  • Add --needleman_wunsch_bandwidth argument (defaults to -1/off for parity).
  • Optimize multiprocessing by pre-building key lists.
  • Add integration tests for bit-exact output verification.
  • Add performance benchmark script.

Question on Parity & Future Work:

For this PR, I prioritized optimizations that preserve 100% bit-exact consistency with the existing codebase. However, I identified other potential strategies, such as K-mer based pre-filtering for references or swapping the alignment backend, that could yield even larger performance gains. These might introduce negligible differences in edge cases (e.g., handling of extremely poor alignments).

Is strict bit-exact parity a hard requirement for the project, or would you (or whatever entity is responsible for this repo) be open to exploring these kinds of trade-offs in future PRs?

Note on Diff Size: This PR includes updated C extension files (*.c) generated by Cython (CRISPResso2Align.c, etc.). These are tracked in the repo to simplify installation, but they account for the majority of the +16k line diff. The actual logic changes are in the .pyx and .py files.

- Speed up reverse_complement using str.translate (~3.4x faster).
- Optimize find_indels_substitutions using set lookups.
- Implement Banded Needleman-Wunsch alignment in CRISPResso2Align.pyx (~2.4x alignment speedup).
- Add --needleman_wunsch_bandwidth argument (defaults to -1/off for parity).
- Optimize multiprocessing by pre-building key lists.
- Add integration tests for bit-exact output verification.
- Add performance benchmark script.

Note: This commit includes updated C extension files (*.c) generated by Cython, accounting for the large diff size.
Comment thread CRISPResso2/CRISPRessoCORE.py Outdated
return

variant_file_path = os.path.join(variants_dir, f"variants_{process_id}.pkl")
with open(variant_file_path, 'wb') as file:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right. I updated that block to use pickle as a fallback when I refactored the multiprocessing logic, but since result_queue is now always provided, this file-writing path is indeed dead code. I've removed it along with the unused pickle import and updated the docstrings to reflect the queue-based architecture.

Thank you for catching this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants