Question about Allele_table.txt ref_seq consistency and ONT read QC recommendations

Dear CrisprLungo authors,

First of all, thank you very much for developing and maintaining CrisprLungo. It is a very useful and well-designed tool for long-read–based CRISPR editing analysis.

I have two questions regarding the output and recommended preprocessing for ONT data:

1. Consistency of ref_seq in Allele_table.txt

In the output file Allele_table.txt (partial output from the demo data, shown below), the first column (ref_seq) differs between rows:
```
$cat Allele_table.txt|c1
1                                                             2                                                             3          4      5          6
ref_seq                                                       mut_seq                                                       Raw_count  %      CIGAR      Mutation_info
CACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACT                      CACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACT                      147        33.72  40M        None
GATCCCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGGCTGGCGGCCA    GATCCCACAGGCGCCCTGGC-------------------ACAACTGGGCTGGCGGCCA    84         19.27  20M19D20M  2364_2382:Del_19
CACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGG                   CACAGGCGCCCTGGCCAGTC----GGGCGGTGCTACAACTGGG                   61         13.99  20M4D20M   2369_2372:Del_4
AGATCCCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGGCTGGCGGCC    AGATCCCACAGGCGCCCTGG-------------------TACAACTGGGCTGGCGGCC    58         13.3   20M19D20M  2363_2381:Del_19
CCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGGCTG               CCACAGGCGCCCTGGCCAGT--------CGGTGCTACAACTGGGCTG               57         13.07  20M8D20M   2368_2375:Del_8
CATGCAGATCCCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTG          CATGCAGATCCCACAGGCGC-------------CTGGGCGGTGCTACAACTG          4          0.92   20M13D20M  2358_2370:Del_13

```

My understanding is that, similar to what is shown in ```allele_plot.png```, the reference sequence (ignoring indels) should remain identical across all alleles.

Could you please clarify:

- Is it expected behavior that ```ref_seq``` differs between rows in ```Allele_table.txt```?
- Or should the reference sequence be fixed across all rows (aside from alignment gaps introduced by indels)?
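For context, here is a minimal sketch of the check I ran on my side. It only tests whether the shortest ungapped ```ref_seq``` occurs inside every other one, which would be consistent with each row showing a different window of the same reference. The helper names (`ungapped`, `share_common_window`) are my own and are not part of CrisprLungo, and the heuristic is necessary rather than sufficient.

```python
def ungapped(seq: str) -> str:
    """Drop alignment gap characters so that only the bases remain."""
    return seq.replace("-", "").upper()


def share_common_window(ref_seqs):
    """Return True if the shortest ungapped ref_seq occurs in every other one.

    This is only a heuristic: it is consistent with all rows being windows
    of one reference, but it does not prove it.
    """
    cleaned = [ungapped(s) for s in ref_seqs]
    if not cleaned:
        raise ValueError("no ref_seq values given")
    core = min(cleaned, key=len)
    return all(core in s for s in cleaned)


# ref_seq column copied from the Allele_table.txt excerpt above
ref_seqs = [
    "CACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACT",
    "GATCCCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGGCTGGCGGCCA",
    "CACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGG",
    "AGATCCCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGGCTGGCGGCC",
    "CCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTGGGCTG",
    "CATGCAGATCCCACAGGCGCCCTGGCCAGTCGTCTGGGCGGTGCTACAACTG",
]

print(share_common_window(ref_seqs))
```

If this prints `True`, the rows would look like differently sized windows around the edit site rather than different references, but I would appreciate confirmation that this is the intended behavior.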

2. Recommended QC / filtering for ONT reads before ```CrisprLungo```

For Oxford Nanopore (ONT) reads, before running CrisprLungo, do you recommend any specific:

- quality filtering thresholds (e.g. minimum Q-score), and
- preprocessing tools (e.g. NanoFilt, Filtlong, etc.)?

For example, would a filter such as ```NanoFilt -q 12``` be reasonable, or do you suggest different thresholds or strategies for CRISPR editing analysis with long reads?
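To make the threshold concrete, this is a rough sketch of what I understand a mean-quality cutoff of 12 to mean for a single read. It assumes Phred+33 quality strings and assumes the mean is taken over error probabilities rather than over the raw Q values; I have not confirmed that this matches NanoFilt's exact computation. The function names are hypothetical.

```python
import math


def mean_qscore(quality: str, offset: int = 33) -> float:
    """Mean read quality, averaged over per-base error probabilities.

    Assumes a Phred+33 encoded FASTQ quality string.
    """
    if not quality:
        raise ValueError("empty quality string")
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def passes_filter(quality: str, min_q: float = 12.0) -> bool:
    """True if the read's mean quality reaches the cutoff (12 as in my example)."""
    return mean_qscore(quality) >= min_q


# A read with uniform Q20 bases ('5' encodes Q20 in Phred+33)
print(passes_filter("5" * 50))
# A read mixing Q0 ('!') and Q40 ('I') bases: a few very poor bases
# dominate the mean when it is taken over error probabilities
print(passes_filter("!I" * 25))
```

Under this definition a handful of very low-quality bases can fail a read whose other bases are good, which is part of why I am unsure whether Q12 is too strict or too lenient for indel calling around the cut site.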

Thank you again for making CrisprLungo available to the community. I really appreciate your work and look forward to your guidance.

Best regards,
Si
