Dear CrisprLungo authors,
First of all, thank you very much for developing and maintaining CrisprLungo. It is a very useful and well-designed tool for long-read–based CRISPR editing analysis.
I have two questions regarding the output and recommended preprocessing for ONT data:
- Consistency of ref_seq in Allele_table.txt
In the output file Allele_table.txt (partial output from the demo data shown below), the first column (ref_seq) appears to differ between rows.
$cat Allele_table.txt|c1
1 2 3 4 5 6
ref_seq mut_seq Raw_count % CIGAR Mutation_info
CACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACT CACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACT 147 33.72 40M None
GATCCCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGGCTGGCGGCCA GATCCCACAGGCGCCCTGGC-------------------ACAACTGGGCTGGCGGCCA 84 19.27 20M19D20M 2364_2382:Del_19
CACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGG CACAGGCGCCCTGGCCAGTC----GGGCGGTGCTACAACTGGG 61 13.99 20M4D20M 2369_2372:Del_4
AGATCCCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGGCTGGCGGCC AGATCCCACAGGCGCCCTGG-------------------TACAACTGGGCTGGCGGCC 58 13.3 20M19D20M 2363_2381:Del_19
CCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGGCTG CCACAGGCGCCCTGGCCAGT--------CGGTGCTACAACTGGGCTG 57 13.07 20M8D20M 2368_2375:Del_8
CATGCAGATCCCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTG CATGCAGATCCCACAGGCGC-------------CTGGGCGGTGCTACAACTG 4 0.92 20M13D20M 2358_2370:Del_13
My understanding is that, similar to what is shown in allele_plot.png, the reference sequence (ignoring indels) should remain identical across all alleles.
Could you please clarify:
Is it expected behavior that ref_seq differs between rows in Allele_table.txt?
Or should the reference sequence be fixed across all rows (aside from alignment gaps introduced by indels)?
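A minimal sketch of one way to test this, assuming the full reference amplicon is available as a plain string: if every ref_seq (with any gap characters removed) occurs somewhere in the amplicon, the rows would simply be different windows of the same reference. The function names and the toy sequences below are hypothetical and are not taken from CrisprLungo.

```python
def locate_windows(reference, ref_seqs):
    """Return the 0-based offset of each ref_seq within reference, or -1 if absent.

    Gap characters ('-') are stripped first, on the assumption that a ref_seq
    may carry alignment gaps where a read has an insertion.
    """
    return [reference.find(seq.replace("-", "")) for seq in ref_seqs]


def all_from_same_reference(reference, ref_seqs):
    """True if every ref_seq is a window of the same reference string."""
    return all(offset >= 0 for offset in locate_windows(reference, ref_seqs))


# Toy data only; replace with the real amplicon and the ref_seq column.
toy_reference = "AAACCCGGGTTTACGT"
toy_ref_seqs = ["CCCGGG", "AACCCGGGTT"]

print(locate_windows(toy_reference, toy_ref_seqs))
print(all_from_same_reference(toy_reference, toy_ref_seqs))
```

If all offsets are non-negative, the differing ref_seq values only reflect window placement around each edit rather than different references.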
- Recommended QC / filtering for ONT reads before CrisprLungo
For Oxford Nanopore (ONT) reads, before running CrisprLungo, do you recommend any specific:
quality filtering thresholds (e.g. minimum Q-score), and
preprocessing tools (e.g. NanoFilt, Filtlong, etc.)?
For example, would a filter such as: NanoFilt -q 12
be reasonable, or do you suggest different thresholds or strategies for CRISPR editing analysis with long reads?
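For reference, a minimal sketch of what a mean-quality threshold of 12 means per read, assuming Phred+33 FASTQ qualities and assuming the mean is taken over error probabilities rather than over raw Q values (which is my understanding of how NanoFilt's -q option behaves). The function names are illustrative only.

```python
import math


def mean_qscore(quality_string, offset=33):
    """Mean read quality, averaged in error-probability space.

    Each Phred score Q is converted to an error probability 10**(-Q/10),
    the probabilities are averaged, and the mean is converted back to Q.
    """
    if not quality_string:
        raise ValueError("empty quality string")
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def passes_filter(quality_string, min_q=12.0):
    """True if the read's mean quality is at least min_q."""
    return mean_qscore(quality_string) >= min_q


# '5' encodes Q20 and '+' encodes Q10 in Phred+33.
print(passes_filter("5555"))  # uniform Q20 read
print(passes_filter("++++"))  # uniform Q10 read
```

Because the average is taken over error probabilities, a few low-quality bases pull the mean down more than a simple average of Q values would suggest.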
Thank you again for making CrisprLungo available to the community. I really appreciate your work and look forward to your guidance.
Best regards,
Si