8 changes: 8 additions & 0 deletions .idea/.gitignore

Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
version: 1.2
workflows:
  - name: main
    subclass: Galaxy
    publish: true
    primaryDescriptorPath: /metadata-and-sequences-from-BioProjectIDs.ga
    testParameterFiles:
      - /metadata-and-sequences-from-BioProjectIDs-tests.yml
Copilot AI Mar 27, 2026

The referenced workflow/test filenames in primaryDescriptorPath / testParameterFiles include uppercase letters and “BioProjectIDs” without a space. For new IWC workflow additions, filenames/folder names are expected to be lowercase with dashes and human-readable wording (e.g., ...-bioproject-ids...). Consider renaming the workflow and test files (and updating these paths) to match that convention.

Copilot generated this review using guidance from repository custom instructions.
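
A minimal sketch of how the naming convention described in this comment could be checked. The regular expression is an assumption about what "lowercase with dashes" means (lowercase letters or digits in dash-separated words); it is not taken from the IWC tooling, and `follows_naming_convention` is a hypothetical helper name.

```python
import re

# Assumed pattern: lowercase letters/digits grouped into dash-separated words.
LOWERCASE_DASHED = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def follows_naming_convention(stem: str) -> bool:
    """Return True if a file or folder name (without extension) is lowercase-with-dashes."""
    return LOWERCASE_DASHED.fullmatch(stem) is not None


# The name used in this PR contains uppercase letters, so it fails the check.
print(follows_naming_convention("metadata-and-sequences-from-BioProjectIDs"))
# A lowercase, dash-separated variant passes.
print(follows_naming_convention("metadata-and-sequences-from-bioproject-ids"))
```

The check only looks at character classes; it cannot judge whether the wording is human-readable, which still needs a reviewer.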
    authors:
      - name: Giuseppe Defazio
        orcid: 0000-0002-9356-5224
@@ -0,0 +1,5 @@
version: '0.1'
registries:
- url: https://workflowhub.eu
project: iwc
workflow: metadata-and-sequences-from-BioProjectsIDs/main
Copilot AI Mar 27, 2026

The WorkflowHub workflow: path has a typo/inconsistency (metadata-and-sequences-from-BioProjectsIDs/main): it doesn’t match this directory name (metadata-and-sequences-from-BioProjectIDs) and will likely break publication/registration. Please update it to the correct (and ideally lowercase) workflow slug.

Suggested change
- workflow: metadata-and-sequences-from-BioProjectsIDs/main
+ workflow: metadata-and-sequences-from-bioprojectids/main

Copilot uses AI. Check for mistakes.
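
The mismatch flagged in this comment can be illustrated with a plain string comparison. This is only a sketch: `registry_slug_matches` is a hypothetical helper, and treating the text before the first `/` as the slug is an assumption based on the `<name>/main` shape of the value in this diff, not on WorkflowHub documentation.

```python
def registry_slug_matches(workflow_field: str, directory_name: str) -> bool:
    """Compare the part of the `workflow:` value before the first '/' with the directory name."""
    slug = workflow_field.split("/", 1)[0]
    return slug == directory_name


directory = "metadata-and-sequences-from-BioProjectIDs"

# Value currently in the diff: the extra "s" in "BioProjectsIDs" breaks the match.
print(registry_slug_matches("metadata-and-sequences-from-BioProjectsIDs/main", directory))
# Value with the typo removed matches the current directory name.
print(registry_slug_matches("metadata-and-sequences-from-BioProjectIDs/main", directory))
```

Under the same comparison, the lowercase slug in the suggested change would only match once the directory itself is renamed to the lowercase form.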
@@ -0,0 +1,5 @@
# Changelog

## [0.1] - 2026-03-23

- Added workflow
@@ -0,0 +1,17 @@
# Metadata and Sequences from BioProjectIDs

This workflow takes BioProject IDs as input and retrieves the SRA tables and FASTQ files for those IDs using pysradb and SRA fetching.
The workflow is useful for meta-analysis and reanalysis scenarios, because it makes it possible to collect metadata and data from the BioProject IDs of studies that share the same design.

## Input

The workflow needs a single tabular input dataset (it can also be uploaded as a txt file), without a header, whose first column lists one or more BioProject IDs.
Copilot AI Mar 27, 2026

This README still uses “BioProjectIDs” (no space) in the title; please use consistent, human-readable wording (e.g., “BioProject IDs”) across README/workflow metadata, and consider clarifying the exact expected input datatype/format (plain text list vs tabular).
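
To make the "plain text list vs tabular" question concrete, here is a small sketch of how a header-less file with IDs in the first column could be read. It is an illustration only and not how the workflow itself parses its input; `read_bioproject_ids` is a hypothetical helper and the IDs in the sample are placeholders, not accessions from this PR.

```python
import csv
import io


def read_bioproject_ids(text: str) -> list[str]:
    """Return the first tab-separated field of every non-empty line (no header assumed)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row[0].strip() for row in reader if row and row[0].strip()]


# Placeholder IDs for illustration only.
one_column = "PRJNA000001\nPRJNA000002\n"
with_extra_column = "PRJNA000001\tnote\nPRJNA000002\tnote\n"

print(read_bioproject_ids(one_column))
print(read_bioproject_ids(with_extra_column))
```

For a single-column list the two readings coincide, which is why stating one expected datatype in the README would remove the ambiguity.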

Contributor

Would you mind describing here the parameters you expose, to guide users?


## Outputs

There are 3 main outputs:

- Data collection for SRA manifest of input BioProject ID(s)
- Data collection for Paired End FASTQ files
- Data collection for Single End FASTQ files
@@ -0,0 +1,95 @@
- doc: Test 1 for Metadata-and-Sequences-from-BioProjectIDs
  job:
    BioProject IDs:
      class: File
      path: test-data/test1_single_prj_pe.txt
      filetype: tabular
Copilot AI Mar 27, 2026

Test 1 declares the BioProject ID list input as filetype: tabular, while Test 2 uses filetype: txt. Given this input is a plain list of IDs (no header), consider using txt consistently across tests (and align with the workflow input datatype).

Suggested change
- filetype: tabular
+ filetype: txt
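
A rough way to spot the inconsistency described in this comment is to collect every `filetype:` value from the test file text. The sketch below scans lines with a regular expression rather than parsing YAML, which is a simplification; `declared_filetypes` is a hypothetical helper and the embedded snippet is a trimmed stand-in for the real test file.

```python
import re


def declared_filetypes(test_yaml_text: str) -> set[str]:
    """Collect every value assigned to a `filetype:` key in the given text."""
    pattern = r"^\s*filetype:\s*(\S+)\s*$"
    return set(re.findall(pattern, test_yaml_text, flags=re.MULTILINE))


# Trimmed stand-in for the two test entries in this PR.
tests_text = """\
- doc: Test 1
  job:
    BioProject IDs:
      filetype: tabular
- doc: Test 2
  job:
    BioProject IDs:
      filetype: txt
"""

found = declared_filetypes(tests_text)
# More than one distinct value means the tests disagree on the input datatype.
print(sorted(found), "consistent" if len(found) <= 1 else "inconsistent")
```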

    --assay (metadata download): false
    --desc (metadata download): false
    --detailed (metadata download): true
    --expand (metadata download): false
    Group by Experiments (fastq download): false
    Group by Sample (fastq download): false
  outputs:
    metadata_file:
      element_tests:
        split_file_000000.txt:
          path: test-data/test1_metadata_file_split_file_000000.txt.tsv
    paired_end_collection:
      element_tests:
        split_file_000000.txt:
          elements:
            SRR37073390:
              elements:
                forward:
                  path: test-data/test1_paired_end_collection_forward.fastq
                  decompress: true
                  compare: contains
                reverse:
                  path: test-data/test1_paired_end_collection_reverse.fastq
                  decompress: true
                  compare: contains
    single_end_collection:
      element_tests:
        split_file_000000.txt:
          elements: {}

- doc: Test 2 for Metadata-and-Sequences-from-BioProjectIDs
  job:
    BioProject IDs:
      class: File
      path: test-data/test2_multiple_prj_mixed.txt
      filetype: txt
    --assay (metadata download): false
    --desc (metadata download): false
    --detailed (metadata download): true
    --expand (metadata download): false
    Group by Experiments (fastq download): false
    Group by Sample (fastq download): false
  outputs:
    metadata_file:
      element_tests:
        split_file_000000.txt:
          path: test-data/test2_metadata_file_split_file_000000.txt.tsv
          compare: contains
        split_file_000001.txt:
          path: test-data/test2_metadata_file_split_file_000001.txt.tsv
          compare: contains
    paired_end_collection:
      element_tests:
        split_file_000000.txt:
          elements:
            SRR37273407:
              elements:
                forward:
                  path: test-data/test2_SRR37273407_forward.fastq
                  decompress: true
                  compare: contains
            SRR37273408:
              elements:
                forward:
                  path: test-data/test2_SRR37273408_forward.fastq
                  decompress: true
                  compare: contains
                reverse:
                  path: test-data/test2_SRR37273408_reverse.fastq
                  decompress: true
                  compare: contains
        split_file_000001.txt:
          elements:
            SRR37073390:
              elements:
                forward:
                  path: test-data/test2_SRR37073390_forward.fastq
                  decompress: true
                  compare: contains
                reverse:
                  path: test-data/test2_SRR37073390_reverse.fastq
                  decompress: true
                  compare: contains
    single_end_collection:
      element_tests:
        split_file_000000.txt:
          elements: {}
        split_file_000001.txt:
          elements: {}