Add workflow Metadata and Sequences from BioProjectIDs #1177
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| version: 1.2 | ||
| workflows: | ||
| - name: main | ||
| subclass: Galaxy | ||
| publish: true | ||
| primaryDescriptorPath: /metadata-and-sequences-from-bioproject-ids.ga | ||
| testParameterFiles: | ||
| - /metadata-and-sequences-from-bioproject-ids-tests.yml | ||
| authors: | ||
| - name: Giuseppe Defazio | ||
| orcid: 0000-0002-9356-5224 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| version: '0.1' | ||
| registries: | ||
| - url: https://workflowhub.eu | ||
| project: iwc | ||
| workflow: metadata-and-sequences-from-bioprojects-ids/main |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| # Changelog | ||
| | ||
| ## [0.1] - 2026-03-23 | ||
| | ||
| - Added workflow |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| # Metadata and Sequences from BioProject IDs | ||
| | ||
| This workflow takes BioProject IDs as input and retrieves the corresponding SRA tables and FASTQ files using pysradb and SRA fetching. | ||
| It can be useful in meta-analysis and reanalysis scenarios, where metadata and data must be collected from the BioProject IDs of studies with the same design. | ||
| | ||
| ## Input | ||
| | ||
| The workflow needs a single headerless txt file whose first column lists one or more BioProject IDs, as follows: | ||
| | ||
| ```` | ||
| PRJNA1425250 | ||
| PRJNA1417619 | ||
| PRJNA1425251 | ||
| PRJNA1417617 | ||
| PRJNA1425252 | ||
| PRJEB1417616 | ||
| ```` | ||
| | ||
| | ||
| ## Outputs | ||
| | ||
| There are 3 main outputs: | ||
| | ||
| - Data collection for SRA manifest of input BioProject ID(s) | ||
| - Data collection for Paired End FASTQ files | ||
| - Data collection for Single End FASTQ files | ||
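The input format described in this README (headerless text, BioProject IDs in the first column) can be checked before a run. The following is a minimal sketch, not part of the PR: the function name `read_bioproject_ids` is hypothetical, and the accession pattern assumes the usual `PRJNA`, `PRJEB` and `PRJDB` prefixes followed by digits.

```python
import re

# Assumption: BioProject accessions look like PRJNA, PRJEB or PRJDB plus digits.
BIOPROJECT_RE = re.compile(r"^PRJ(NA|EB|DB)\d+$")


def read_bioproject_ids(text):
    """Return the first-column IDs from headerless text, one ID per line.

    Hypothetical helper for illustration; the workflow itself does not use it.
    """
    ids = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        first_column = line.split()[0]  # any further columns are ignored
        if not BIOPROJECT_RE.match(first_column):
            raise ValueError(
                f"line {line_number}: not a BioProject ID: {first_column!r}"
            )
        ids.append(first_column)
    return ids


if __name__ == "__main__":
    example = "PRJNA1425250\nPRJNA1417619\nPRJEB1417616\n"
    print(read_bioproject_ids(example))
```

A run identifier such as `SRR37073390` in the first column would raise `ValueError`, which surfaces a malformed input file before any download starts.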
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,76 @@ | ||
| - doc: Test 1 for Metadata-and-Sequences-from-BioProjectIDs | ||
| job: | ||
| BioProject IDs: | ||
| class: File | ||
| path: test-data/test1_single_prj_pe.txt | ||
| filetype: txt | ||
| assay (metadata download): false | ||
| desc (metadata download): false | ||
| detailed (metadata download): true | ||
| expand (metadata download): false | ||
| Group by Experiments (fastq download): false | ||
| Group by Sample (fastq download): false | ||
| Comment on lines +11 to +12 (Contributor): I think these parameters disappeared from the workflow. | ||
| outputs: | ||
| Metadata file ( SRA table ): | ||
| element_tests: | ||
| PRJNA1417618: | ||
| path: test-data/test1_metadata_file_split_file_000000.txt.tsv | ||
| Paired End Reads: | ||
| element_tests: | ||
| SRR37073390: | ||
| forward: | ||
| path: test-data/test1_paired_end_collection_forward.fastq | ||
| decompress: true | ||
| compare: contains | ||
| reverse: | ||
| path: test-data/test1_paired_end_collection_reverse.fastq | ||
| decompress: true | ||
| compare: contains | ||
| | ||
| - doc: Test 2 for Metadata-and-Sequences-from-BioProjectIDs | ||
| job: | ||
| BioProject IDs: | ||
| class: File | ||
| path: test-data/test2_multiple_prj_mixed.txt | ||
| filetype: txt | ||
| assay (metadata download): false | ||
| desc (metadata download): false | ||
| detailed (metadata download): true | ||
| expand (metadata download): false | ||
| Comment on lines +36 to +39 (Contributor): I would use a different combination from the first test, to increase the test coverage. | ||
| Group by Experiments (fastq download): false | ||
| Group by Sample (fastq download): false | ||
| Comment on lines +40 to +41 (Contributor): I think these parameters disappeared from the workflow. | ||
| outputs: | ||
| Metadata file ( SRA table ): | ||
| element_tests: | ||
| PRJNA1425250: | ||
| path: test-data/test2_metadata_file_split_file_000000.txt.tsv | ||
| compare: contains | ||
| PRJNA1417618: | ||
| path: test-data/test2_metadata_file_split_file_000001.txt.tsv | ||
| compare: contains | ||
| Paired End Reads: | ||
| element_tests: | ||
| SRR37273408: | ||
| forward: | ||
| path: test-data/test2_SRR37273408_forward.fastq | ||
| decompress: true | ||
| compare: contains | ||
| reverse: | ||
| path: test-data/test2_SRR37273408_reverse.fastq | ||
| decompress: true | ||
| compare: contains | ||
| SRR37073390: | ||
| forward: | ||
| path: test-data/test2_SRR37073390_forward.fastq | ||
| decompress: true | ||
| compare: contains | ||
| reverse: | ||
| path: test-data/test2_SRR37073390_reverse.fastq | ||
| decompress: true | ||
| compare: contains | ||
| Single End Reads: | ||
| element_tests: | ||
| SRR37273407: | ||
| path: test-data/test2_SRR37273407_forward.fastq | ||
| decompress: true | ||
| compare: contains | ||
Review comment: Would you mind describing here the parameters you expose, to guide the users?
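The review comment on the second test asks for a flag combination that differs from the first test. As a rough aid for picking one, the sketch below lists every True/False assignment of the four metadata flags and drops the one both tests currently use. The flag labels are copied from the test file in this PR; the helper name `flag_combinations` is hypothetical, and the sketch does not check whether every combination is meaningful for the underlying tool.

```python
from itertools import product

# Parameter labels copied from the test file in this PR.
FLAGS = [
    "assay (metadata download)",
    "desc (metadata download)",
    "detailed (metadata download)",
    "expand (metadata download)",
]


def flag_combinations(flags=FLAGS):
    """Yield one dict per True/False assignment of the given flags."""
    for values in product([False, True], repeat=len(flags)):
        yield dict(zip(flags, values))


# The combination used by both tests in this PR.
already_tested = {
    "assay (metadata download)": False,
    "desc (metadata download)": False,
    "detailed (metadata download)": True,
    "expand (metadata download)": False,
}

combos = list(flag_combinations())
untested = [combo for combo in combos if combo != already_tested]

if __name__ == "__main__":
    for combo in untested:
        enabled = [name for name, value in combo.items() if value]
        print(enabled or "all flags off")
```

Any entry of `untested` could replace the job parameters of the second test, subject to the expected output files being regenerated for that combination.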