
Add READSUBMIT workflow#58

Draft
ochkalova wants to merge 17 commits into dev from feat/readsubmit

@ochkalova
Contributor

@ochkalova ochkalova commented Apr 29, 2026

Resolves #28

I was able to submit reads with this branch, but it still needs some work:

  • The CREATE_READS_MANIFEST module is very basic: it doesn't do any validation and doesn't support test mode (with a timestamp appended to the alias).
  • I used custom test data for the test, because the nf-core reads at https://github.com/nf-core/test-datasets/raw/modules/data/genomics/prokaryotes/bacteroides_fragilis/illumina/fastq/ don't pass webin-cli validation. We still need to find suitable reads in nf-datasets and generate snapshots.
  • Before merging this, EBI-Metagenomics/mgnify-pipelines-toolkit#155 (Add reads submission support to webin cli wrapper) must be merged and the toolkit updated in the webin-cli module.
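
For illustration, the missing test mode could look roughly like the sketch below. It is not the code of the CREATE_READS_MANIFEST module: the function name `build_reads_manifest`, the `test_mode` and `now` parameters, and the choice of which fields to treat as required are all assumptions made for the example. It writes one `FIELD<TAB>value` line per manifest field and, in test mode, appends a UTC timestamp to NAME so repeated test submissions do not reuse the same alias.

```python
from datetime import datetime, timezone


def build_reads_manifest(fields, fastq_files, test_mode=False, now=None):
    """Return the text of a reads manifest as FIELD<TAB>value lines.

    fields: dict of manifest field name -> value (e.g. STUDY, SAMPLE, NAME).
    fastq_files: list of FASTQ file names; each becomes one FASTQ line.
    test_mode: when True, append a UTC timestamp to NAME (hypothetical option).
    now: optional datetime, injectable so the timestamp is testable.
    """
    # Illustrative subset only; the real manifest has more mandatory fields.
    required = ("STUDY", "SAMPLE", "NAME")
    missing = [key for key in required if not fields.get(key)]
    if missing:
        raise ValueError(f"missing manifest fields: {', '.join(missing)}")
    if not fastq_files:
        raise ValueError("at least one FASTQ file is required")

    values = dict(fields)  # copy so the caller's dict is not modified
    if test_mode:
        stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%d%H%M%S")
        values["NAME"] = f"{values['NAME']}_{stamp}"

    lines = [f"{key}\t{value}" for key, value in values.items()]
    lines += [f"FASTQ\t{name}" for name in fastq_files]
    return "\n".join(lines) + "\n"
```

Passing `now` explicitly keeps the output deterministic, which would make it easier to snapshot in tests.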

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/seqsubmit branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@nf-core-bot
Member

nf-core-bot commented Apr 29, 2026

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.5.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the Synchronisation documentation.

@github-actions

github-actions Bot commented Apr 29, 2026

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 5bde07a

+| ✅ 252 tests passed       |+
#| ❔   8 tests were ignored |#
#| ❔   1 tests had warnings |#
!| ❗  14 tests had warnings |!
Details

❗ Test warnings:

  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in nextflow.config: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs
  • pipeline_todos - TODO string in nextflow.config: Update the field with the details of the contributors to your pipeline. New with Nextflow version 24.10.0
  • pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
  • pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
  • pipeline_todos - TODO string in CONTRIBUTING.md: Add any pipeline specific contribution guidelines here, such as coding styles, procedures, checklists etc.
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in nextflow.config: Specify any additional parameters here
  • local_component_structure - fasta_validation.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - genome_evaluation.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - rna_detection.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure

❔ Tests ignored:

  • files_exist - File is ignored: conf/igenomes.config
  • files_exist - File is ignored: conf/igenomes_ignored.config
  • nextflow_config - Config variable ignored: params.input
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/nf-core-seqsubmit_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-seqsubmit_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-seqsubmit_logo_dark.png
  • container_configs - container_configs

❔ Tests fixed:

✅ Tests passed:

Run details

  • nf-core/tools version 4.0.2
  • Run at 2026-05-19 07:43:20

Comment thread docs/usage.md Outdated
--input samplesheet_reads.csv \
--submission_study <your_study> \
--webincli_mode submit \
--test_upload true \
Collaborator

This can be just a flag; it doesn't need a value (the presence of --test_upload with no value should be enough).

Contributor Author

Yes, but here it's just for transparency

Comment thread modules/local/create_reads_manifest/main.nf
Comment thread workflows/readsubmit.nf
Comment thread workflows/readsubmit.nf Outdated
Collaborator

@mberacochea mberacochea left a comment

I know this is not ready, but I left some notes.

@mberacochea
Collaborator

mberacochea commented May 12, 2026 via email

@KateSakharova
Contributor

@mbc

Ideally, you would then need to check access for all the provided studies (do they all belong to one account?), because if they don't, the pipeline will crash at the last step.

We also have a step for study registration. I would not expect people to have a study already registered, so I expect the study argument/column to be empty for the majority of submissions (if we are talking about external users).
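
A minimal sketch of such an up-front check, assuming a samplesheet with `sample` and `study` columns (hypothetical names). The actual access check is left as a caller-supplied `has_access` callable, because how the pipeline would query account ownership is not specified here. Rows with an empty study are collected separately so they can go through study registration.

```python
from collections import defaultdict


def precheck_studies(rows, has_access):
    """Group samplesheet rows by study and check access before any upload.

    rows: list of dicts with a 'sample' key and an optional 'study' key.
    has_access: callable(study_accession) -> bool, supplied by the caller
                (hypothetical; no real ENA call is made in this sketch).
    Returns (samples_needing_registration, samples_by_study).
    Raises PermissionError naming every study the account cannot access.
    """
    by_study = defaultdict(list)
    needs_registration = []
    for row in rows:
        study = (row.get("study") or "").strip()
        if study:
            by_study[study].append(row["sample"])
        else:
            # Empty study: expected to be the common case for external users.
            needs_registration.append(row["sample"])

    denied = sorted(study for study in by_study if not has_access(study))
    if denied:
        raise PermissionError("no access to studies: " + ", ".join(denied))
    return needs_registration, dict(by_study)
```

Running this before the upload step would surface every inaccessible study at once, instead of failing on the first one at the end of the run.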

@riceroni18

riceroni18 commented May 15, 2026

I’ve been working on making the submit_study.py script more robust, focusing specifically on early header validation and structured logging. My goal was to make sure that, if the input CSV/TSV is malformed, the pipeline fails early with a clear error message rather than crashing downstream.

I've verified these changes locally using the mag_no_coverage_paired_reads.nf.test suite, and the test run completed successfully in ~79s on my WSL2 environment.

I am still early in my bioinformatics journey, so I would appreciate feedback on the Python code to make sure it aligns with nf-core's best practices. The current implementation is functional and passes all local tests.

I couldn't push directly to the branch, but you can see the changes here: https://github.com/riceroni18/seqsubmit/blob/dev/bin/submit_study.py
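
As a rough illustration of what early header validation can look like (this is not the code in the linked fork, and the column names used in the usage note are hypothetical), the sketch below reads only the first line of a CSV/TSV string and raises a specific `ValueError` for an empty header, duplicate columns, or missing required columns.

```python
import csv
import io


def validate_header(text, required_columns):
    """Validate the header row of CSV/TSV text and return its column names.

    The delimiter is guessed from the header line: tab if one is present,
    otherwise comma. Raises ValueError with a specific message on problems.
    """
    lines = text.splitlines()
    if not lines or not lines[0].strip():
        raise ValueError("header row is missing or empty")

    delimiter = "\t" if "\t" in lines[0] else ","
    header = next(csv.reader(io.StringIO(lines[0]), delimiter=delimiter))
    columns = [name.strip() for name in header]

    duplicates = sorted({name for name in columns if columns.count(name) > 1})
    if duplicates:
        raise ValueError(f"duplicate columns in header: {', '.join(duplicates)}")

    missing = [name for name in required_columns if name not in columns]
    if missing:
        raise ValueError(f"missing required columns: {', '.join(missing)}")
    return columns
```

For example, `validate_header("sample,reads\n", ["sample", "fastq_1"])` raises an error that names `fastq_1`, so the user sees which column is absent before any downstream step runs.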

@KateSakharova
Contributor

Hi @riceroni18, thank you for your message! If you want feedback from us, please create a separate PR from your fork :)

@timrozday-mgnify
Contributor

I've added some extra samplesheet validation rules based on the ENA docs. I'm not sure how much we want to support this or how ENA-specific we want to make it, but since we're using webin-cli it seems to be ENA-only for now.

The rules are all from https://ena-docs.readthedocs.io/en/latest/submit/reads/webin-cli.html

… This will not affect nextflow behavior but does permit some standalone use of the script
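
To give a sense of what row-level rules of this kind might look like, here is a small sketch. It does not reproduce the rules added in this PR: the column names (`fastq_1`, `fastq_2`, `insert_size`) and the three checks are assumptions chosen for illustration, and the ENA page linked above remains the authority on what webin-cli actually accepts.

```python
def check_reads_row(row):
    """Return a list of human-readable problems for one samplesheet row.

    row: dict of column name -> string value. The columns used here
    (fastq_1, fastq_2, insert_size) are hypothetical names for this sketch.
    """
    problems = []

    fastqs = [(row.get(key) or "").strip() for key in ("fastq_1", "fastq_2")]
    fastqs = [name for name in fastqs if name]
    if not fastqs:
        problems.append("no FASTQ file given")

    # Assumed rule: FASTQ files are expected to be gzip or bzip2 compressed.
    for name in fastqs:
        if not name.endswith((".gz", ".bz2")):
            problems.append(f"{name}: expected a .gz or .bz2 compressed file")

    if len(fastqs) == 2 and fastqs[0] == fastqs[1]:
        problems.append("fastq_1 and fastq_2 point to the same file")

    # Assumed rule: insert_size is optional, but must be a positive integer.
    insert_size = (row.get("insert_size") or "").strip()
    if insert_size and not (insert_size.isdigit() and int(insert_size) > 0):
        problems.append(
            f"insert_size must be a positive integer, got {insert_size!r}"
        )
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue in the samplesheet in one pass, both inside the pipeline and when the script is used standalone.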