Add workflow Metadata and Sequences from BioProjectIDs by gdefazio · Pull Request #1177 · galaxyproject/iwc

gdefazio · 2026-03-23T17:54:57Z

FOR CONTRIBUTOR:

- I have read the Adding workflows guidelines
- License permits unrestricted use (educational + commercial)
- Please also take note of the reviewer guidelines below to facilitate a smooth review process.

FOR REVIEWERS:

- .dockstore.yml: file is present and aligned with creator metadata in workflow. ORCID identifiers are strongly encouraged in creator metadata. The .dockstore.yml file is required to run tests
- Workflow is sufficiently generic to be used with lab data, does not hardcode sample names or reference data, and can be run without reading an accompanying tutorial.
- In workflow: annotation field contains short description of what the workflow does. Should start with "This workflow does/runs/performs … xyz … to generate/analyze/etc …"
- In workflow: workflow inputs and outputs have human readable names (spaces are fine, no underscore, dash only where spelling dictates it), no abbreviation unless it is generally understood. Altering input or output labels requires adjusting these labels in the workflow-tests.yml file as well
- In workflow: name field should be human readable (spaces are fine, no underscore, dash only where spelling dictates it), no abbreviation unless generally understood
- Workflow folder: prefer dash (-) over underscore (_), prefer all lowercase. Folder becomes repository in iwc-workflows organization and is included in TRS id
- Readme explains what workflow does, what are valid inputs and what outputs users can expect. If a tutorial or other resources exist they can be linked. If a similar workflow exists in IWC, readme should explain differences with existing workflow and when one might prefer one workflow over another
- Changelog contains appropriate entries
- Large files (> 100 KB) are uploaded to Zenodo and location urls are used in test file

github-actions · 2026-03-24T11:34:32Z

Test Results (powered by Planemo)

Test Summary

Test State	Count
Total	2
Passed	1
Error	0
Failure	1
Skipped	0

Failed Tests

❌ metadata-and-sequences-from-BioProjectIDs.ga_1

Problems:

Output with path /tmp/tmptjnsdwb0/SRR37273408reverse__6f94210b-dd5f-4e87-a3da-fa5ed472355d.fastqsanger.gz different than expected, difference (using contains):
( /home/runner/work/iwc/iwc/workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_SRR37273408_forward.fastq v. /tmp/tmpbnje941rtest2_SRR37273408_forward.fastq )
Failed to find '@SRR37273408.1 M02133:60:000000000-BC45T:1:1101:15828:1331/1' in history data. (lines_diff=0).
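
The message above reports a `contains` comparison, which, going by its wording, looks for each line of the expected file inside the produced output. A minimal sketch of that kind of check follows. It is not Planemo's or Galaxy's actual code; `missing_lines` and the two short read records are hypothetical and only illustrate the mechanism.

```python
def missing_lines(expected_text, history_text, lines_diff=0):
    """Return the expected lines that are absent from history_text.

    An empty list is returned when the number of missing lines stays
    within the lines_diff tolerance (0 in the report above).
    """
    missing = [
        line
        for line in expected_text.splitlines()
        if line and line not in history_text
    ]
    return missing if len(missing) > lines_diff else []


# Hypothetical records: the expected file holds a "/1" read,
# the produced output holds the matching "/2" read.
expected = "@read.1/1\nACGT\n"
history = "@read.1/2\nTTGA\n"
print(missing_lines(expected, history))
```

In the report, the output path contains `reverse` while the expected file is `test2_SRR37273408_forward.fastq`. If the reverse reads carry `/2` headers, a `/1` header from the forward file cannot be found, so the pairing of output element and expected file in the test definition may be worth checking.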

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: --assay (metadata download):
- step_state: scheduled
Step 3: --desc (metadata download):
- step_state: scheduled
Step 4: --detailed (metadata download):
- step_state: scheduled
Step 5: --expand (metadata download):
- step_state: scheduled
Step 6: Group by Experiments (fastq download):
- step_state: scheduled
Step 7: Group by Sample (fastq download):
- step_state: scheduled

Step 8: Separe BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmpkm4a4wrk/files/d/9/f/dataset_d9fef476-ab20-46d4-bae1-f379df12a8ec.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"txt"`
__workflow_invocation_uuid__	`"08e3b86c277011f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 9, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`
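
The parameters above select chunk mode with `chunksize` 1 and the `split_file` name prefix. A rough sketch of what such a split does, assuming one input line ends up in each output file and that outputs are numbered with six digits (as in `split_file_000000.txt` elsewhere in this report). This is not the code of `split_file_to_collection.py`; `split_by_chunk` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def split_by_chunk(src, out_dir, prefix="split_file", ext="txt", chunksize=1):
    """Write every `chunksize` lines of `src` to its own numbered file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = src.read_text().splitlines()
    written = []
    for start in range(0, len(lines), chunksize):
        target = out_dir / f"{prefix}_{start // chunksize:06d}.{ext}"
        target.write_text("\n".join(lines[start:start + chunksize]) + "\n")
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    ids = Path(tmp) / "ids.txt"
    # The two BioProject IDs used in this test run.
    ids.write_text("PRJNA1417618\nPRJNA1425250\n")
    names = [p.name for p in split_by_chunk(ids, Path(tmp) / "out")]
    print(names)
```

With two IDs in the input, two files are produced, which matches the two jobs that each downstream step runs in this invocation.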

Step 9: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"08e3b86c277011f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Job 2:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"08e3b86c277011f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 10: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"08e3b86c277011f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1425250' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"08e3b86c277011f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1425250", "selector": "metadata"}`
dbkey	`"?"`

Step 11: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpkm4a4wrk/job_working_directory/000/13/configs/tmpf6vi7cpx' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"08e3b86c277011f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 16, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`
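
The `singtabop` settings above select column 1 from a table that has column names. Assuming that column is `run_accession` (the first header in the pysradb metadata shown later in this thread) and that the header is not written to the output, the step can be sketched as below. `first_column` is a hypothetical helper, not part of `table_compute`.

```python
import csv
import io


def first_column(tsv_text, has_header=True):
    """Return the values of the first column of a tab-separated table."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    if has_header:
        rows = rows[1:]  # drop the column names
    return [row[0] for row in rows if row]


# Two columns taken from the metadata table in this report.
sample = (
    "run_accession\tstudy_accession\n"
    "SRR37273407\tSRP677941\n"
    "SRR37273408\tSRP677941\n"
)
print(first_column(sample))
```

The resulting list of run accessions is what the Sequences Download step reads line by line.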

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpkm4a4wrk/job_working_directory/000/14/configs/tmpvjzvyo_g' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"08e3b86c277011f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 17, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 12: Sequences Download (toolshed.g2.bx.psu.edu/repos/iuc/fastq_dl/fastq_dl/3.0.1+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/fastq-dl:3.0.1--pyhdfd78af_0

Command Line:

mkdir -p single-end paired-end logs && mapfile -t accessionsarr < "/tmp/tmpkm4a4wrk/files/5/b/6/dataset_5b6499bb-836a-48c6-98c5-d8096a3a0c4c.dat" &&  for accessionid in "${accessionsarr[@]}"; do fastq-dl --accession "$accessionid" --provider ena --only-provider   ; exit_code=$? ; if [ $exit_code -ne 0 ]; then echo "fastq-dl failed for accession: ${accessionid}" >&2 ; exit $exit_code ; break ; else mv fastq-run-info.tsv logs/"$accessionid"-fastq-run-info.tsv > /dev/null 2>&1 || true; fi ; done  && find . -maxdepth 1 -name "*_1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_2/_reverse/")"' {} \; && find . -maxdepth 1 -name "*_R1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_R2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R2/_reverse/")"' {} \; && mv *.gz single-end > /dev/null 2>&1 || true
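
The `find … sed … mv` chain in the command above sorts downloaded files into `paired-end` and `single-end` and renames the mate suffixes. A small sketch of the same naming rule, written only to make the mapping easier to read; `destination` is a hypothetical function and `sample.fastq.gz` is a made-up file name.

```python
from fnmatch import fnmatchcase

# (glob pattern, text replaced once, replacement), in the order of the command.
RULES = [
    ("*_1.fastq.gz", "_1", "_forward"),
    ("*_2.fastq.gz", "_2", "_reverse"),
    ("*_R1.fastq.gz", "_R1", "_forward"),
    ("*_R2.fastq.gz", "_R2", "_reverse"),
]


def destination(name):
    """Return the relative path a downloaded file would be moved to."""
    for pattern, old, new in RULES:
        if fnmatchcase(name, pattern):
            # Like sed "s/old/new/": only the first occurrence is replaced.
            return "paired-end/" + name.replace(old, new, 1)
    if name.endswith(".gz"):
        return "single-end/" + name  # the final `mv *.gz single-end`
    return None  # other files are left where they are


for name in ("SRR37273408_1.fastq.gz", "SRR37273408_2.fastq.gz", "sample.fastq.gz"):
    print(name, "->", destination(name))
```

Because only the first occurrence of the suffix text is replaced, a name that contains `_1` or `_2` earlier in the string would be renamed at that earlier position. Plain run accessions such as `SRR37273408` are not affected.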

Exit Code:

```
0
```

Standard Error:

2026-03-24 10:57:06:root:INFO - Query: SRR37273407 (download.py:189)
2026-03-24 10:57:06:root:INFO - Archive: ena (download.py:190)
2026-03-24 10:57:06:root:INFO - Total Runs To Download: 1 (download.py:195)
2026-03-24 10:57:06:root:INFO - Working on run SRR37273407... (download.py:214)
2026-03-24 10:57:06:root:INFO - /tmp/tmpkm4a4wrk/job_working_directory/000/15/working/SRR37273407_1.fastq.gz FTP download attempt 1 (ena.py:167)
2026-03-24 10:57:21:root:INFO - Successfully downloaded /tmp/tmpkm4a4wrk/job_working_directory/000/15/working/SRR37273407_1.fastq.gz (ena.py:195)
2026-03-24 10:57:21:root:INFO - Writing metadata to /tmp/tmpkm4a4wrk/job_working_directory/000/15/working/fastq-run-info.tsv (download.py:311)
2026-03-24 10:57:24:root:INFO - Query: SRR37273408 (download.py:189)
2026-03-24 10:57:24:root:INFO - Archive: ena (download.py:190)
2026-03-24 10:57:24:root:INFO - Total Runs To Download: 1 (download.py:195)
2026-03-24 10:57:24:root:INFO - Working on run SRR37273408... (download.py:214)
2026-03-24 10:57:24:root:INFO - /tmp/tmpkm4a4wrk/job_working_directory/000/15/working/SRR37273408_1.fastq.gz FTP download attempt 1 (ena.py:167)
2026-03-24 10:58:32:root:INFO - Successfully downloaded /tmp/tmpkm4a4wrk/job_working_directory/000/15/working/SRR37273408_1.fastq.gz (ena.py:195)
2026-03-24 10:58:32:root:INFO - /tmp/tmpkm4a4wrk/job_working_directory/000/15/working/SRR37273408_2.fastq.gz FTP download attempt 1 (ena.py:167)
2026-03-24 10:59:04:root:INFO - Successfully downloaded /tmp/tmpkm4a4wrk/job_working_directory/000/15/working/SRR37273408_2.fastq.gz (ena.py:195)
2026-03-24 10:59:04:root:INFO - Writing metadata to /tmp/tmpkm4a4wrk/job_working_directory/000/15/working/fastq-run-info.tsv (download.py:311)

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"08e3b86c277011f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
group_by_experiment	`false`
group_by_sample	`false`
input_type	`{"__current_case__": 1, "accessions_file": {"values": [{"id": 18, "src": "dce"}]}, "select_input_type": "accessions_list"}`
only_download_metadata	`false`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/fastq-dl:3.0.1--pyhdfd78af_0

Command Line:

mkdir -p single-end paired-end logs && mapfile -t accessionsarr < "/tmp/tmpkm4a4wrk/files/a/0/7/dataset_a0732496-dd24-49dc-a974-786aadbd7379.dat" &&  for accessionid in "${accessionsarr[@]}"; do fastq-dl --accession "$accessionid" --provider ena --only-provider   ; exit_code=$? ; if [ $exit_code -ne 0 ]; then echo "fastq-dl failed for accession: ${accessionid}" >&2 ; exit $exit_code ; break ; else mv fastq-run-info.tsv logs/"$accessionid"-fastq-run-info.tsv > /dev/null 2>&1 || true; fi ; done  && find . -maxdepth 1 -name "*_1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_2/_reverse/")"' {} \; && find . -maxdepth 1 -name "*_R1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_R2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R2/_reverse/")"' {} \; && mv *.gz single-end > /dev/null 2>&1 || true

Exit Code:

```
0
```

Standard Error:

2026-03-24 10:59:17:root:INFO - Query: SRR37073390 (download.py:189)
2026-03-24 10:59:17:root:INFO - Archive: ena (download.py:190)
2026-03-24 10:59:17:root:INFO - Total Runs To Download: 1 (download.py:195)
2026-03-24 10:59:17:root:INFO - Working on run SRR37073390... (download.py:214)
2026-03-24 10:59:17:root:INFO - /tmp/tmpkm4a4wrk/job_working_directory/000/16/working/SRR37073390_1.fastq.gz FTP download attempt 1 (ena.py:167)
2026-03-24 11:01:29:root:INFO - Successfully downloaded /tmp/tmpkm4a4wrk/job_working_directory/000/16/working/SRR37073390_1.fastq.gz (ena.py:195)
2026-03-24 11:01:29:root:INFO - /tmp/tmpkm4a4wrk/job_working_directory/000/16/working/SRR37073390_2.fastq.gz FTP download attempt 1 (ena.py:167)
2026-03-24 11:02:22:root:INFO - Successfully downloaded /tmp/tmpkm4a4wrk/job_working_directory/000/16/working/SRR37073390_2.fastq.gz (ena.py:195)
2026-03-24 11:02:22:root:INFO - Writing metadata to /tmp/tmpkm4a4wrk/job_working_directory/000/16/working/fastq-run-info.tsv (download.py:311)

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"08e3b86c277011f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
group_by_experiment	`false`
group_by_sample	`false`
input_type	`{"__current_case__": 1, "accessions_file": {"values": [{"id": 19, "src": "dce"}]}, "select_input_type": "accessions_list"}`
only_download_metadata	`false`

Other invocation details
- history_id
  - 727741a4ae178bc3
- history_state
  - ok
- invocation_id
  - 727741a4ae178bc3
- invocation_state
  - scheduled
- workflow_id
  - 727741a4ae178bc3

Passed Tests

✅ metadata-and-sequences-from-BioProjectIDs.ga_0

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: --assay (metadata download):
- step_state: scheduled
Step 3: --desc (metadata download):
- step_state: scheduled
Step 4: --detailed (metadata download):
- step_state: scheduled
Step 5: --expand (metadata download):
- step_state: scheduled
Step 6: Group by Experiments (fastq download):
- step_state: scheduled
Step 7: Group by Sample (fastq download):
- step_state: scheduled

Step 8: Separe BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmpkm4a4wrk/files/9/0/8/dataset_908eb407-3c6f-4e71-aaf1-a0ba003b29ad.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tabular"`
__workflow_invocation_uuid__	`"0de38f14276f11f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 1, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`

Step 9: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"0de38f14276f11f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 10: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"0de38f14276f11f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`

Step 11: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpkm4a4wrk/job_working_directory/000/5/configs/tmppvgqdg_g' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"0de38f14276f11f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 3, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 12: Sequences Download (toolshed.g2.bx.psu.edu/repos/iuc/fastq_dl/fastq_dl/3.0.1+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/fastq-dl:3.0.1--pyhdfd78af_0

Command Line:

mkdir -p single-end paired-end logs && mapfile -t accessionsarr < "/tmp/tmpkm4a4wrk/files/0/5/4/dataset_05477c4d-3056-43a0-8a86-f54bb682d718.dat" &&  for accessionid in "${accessionsarr[@]}"; do fastq-dl --accession "$accessionid" --provider ena --only-provider   ; exit_code=$? ; if [ $exit_code -ne 0 ]; then echo "fastq-dl failed for accession: ${accessionid}" >&2 ; exit $exit_code ; break ; else mv fastq-run-info.tsv logs/"$accessionid"-fastq-run-info.tsv > /dev/null 2>&1 || true; fi ; done  && find . -maxdepth 1 -name "*_1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_2/_reverse/")"' {} \; && find . -maxdepth 1 -name "*_R1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_R2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R2/_reverse/")"' {} \; && mv *.gz single-end > /dev/null 2>&1 || true

Exit Code:

```
0
```

Standard Error:

2026-03-24 10:51:52:root:INFO - Query: SRR37073390 (download.py:189)
2026-03-24 10:51:52:root:INFO - Archive: ena (download.py:190)
2026-03-24 10:51:52:root:INFO - Total Runs To Download: 1 (download.py:195)
2026-03-24 10:51:52:root:INFO - Working on run SRR37073390... (download.py:214)
2026-03-24 10:51:52:root:INFO - /tmp/tmpkm4a4wrk/job_working_directory/000/6/working/SRR37073390_1.fastq.gz FTP download attempt 1 (ena.py:167)
2026-03-24 10:53:43:root:INFO - Successfully downloaded /tmp/tmpkm4a4wrk/job_working_directory/000/6/working/SRR37073390_1.fastq.gz (ena.py:195)
2026-03-24 10:53:43:root:INFO - /tmp/tmpkm4a4wrk/job_working_directory/000/6/working/SRR37073390_2.fastq.gz FTP download attempt 1 (ena.py:167)
2026-03-24 10:54:51:root:INFO - Successfully downloaded /tmp/tmpkm4a4wrk/job_working_directory/000/6/working/SRR37073390_2.fastq.gz (ena.py:195)
2026-03-24 10:54:51:root:INFO - Writing metadata to /tmp/tmpkm4a4wrk/job_working_directory/000/6/working/fastq-run-info.tsv (download.py:311)

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"0de38f14276f11f1a1ae000d3a3ac48c"`
chromInfo	`"/tmp/tmpkm4a4wrk/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
group_by_experiment	`false`
group_by_sample	`false`
input_type	`{"__current_case__": 1, "accessions_file": {"values": [{"id": 4, "src": "dce"}]}, "select_input_type": "accessions_list"}`
only_download_metadata	`false`

Other invocation details
- history_id
  - 12605b074346335e
- history_state
  - ok
- invocation_id
  - 12605b074346335e
- invocation_state
  - scheduled
- workflow_id
  - 727741a4ae178bc3

github-actions · 2026-03-24T13:29:08Z

Test Results (powered by Planemo)

Test Summary

Test State	Count
Total	2
Passed	1
Error	0
Failure	1
Skipped	0

Failed Tests

❌ metadata-and-sequences-from-BioProjectIDs.ga_1

Problems:

Output with path /tmp/tmpy64g7xaq/pysradb search metadata on  table__d3d0e3cf-9b64-439d-aad9-84372f9be6c6.tsv different than expected, difference (using diff):
( /home/runner/work/iwc/iwc/workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_metadata_file_split_file_000000.txt.tsv v. /tmp/tmps3avenf_test2_metadata_file_split_file_000000.txt.tsv )
--- local_file
+++ history_data
@@ -1,3 +1,3 @@
-run_accession	study_accession	study_title	experiment_accession	experiment_title	experiment_desc	organism_taxid	organism_name	library_name	library_strategy	library_source	library_selection	library_layout	sample_accession	sample_title	biosample	bioproject	instrument	instrument_model	instrument_model_desc	total_spots	total_size	run_total_spots	run_total_bases	run_alias	public_filename	public_size	public_date	public_md5	public_version	public_semantic_name	public_supertype	public_sratoolkit	aws_url	aws_free_egress	aws_access_type	public_url	ncbi_url	ncbi_free_egress	ncbi_access_type	gcp_url	gcp_free_egress	gcp_access_type	experiment_alias	strain	isolation_source	collection_date	geo_loc_name	sample_type	biomaterial_provider	biosamplemodel	ena_fastq_http	ena_fastq_http_1	ena_fastq_http_2	ena_fastq_ftp	ena_fastq_ftp_1	ena_fastq_ftp_2
-SRR37273407	SRP677941	Complete genome assembly of the first Streptomyces acidocola isolate from Malaysia using Nanopore long-read sequencing and Illumina polishing	SRX32206378	Nanopore DNA-Seq of Streptomyces acidicola	Nanopore DNA-Seq of Streptomyces acidicola	2596892	Streptomyces acidicola	TPS3_Nanopore	WGS	GENOMIC	RANDOM	SINGLE	SRS28131937		SAMN55411282	PRJNA1425250	MinION	MinION	OXFORD_NANOPORE	45419	184487295	45419	203067398	TPS3_nanopore_hac.fastq.gz	SRR37273407	184489529	2026-02-18 05:58:18	24eda2099b77249881e6957ff82b8498	1	SRA Normalized	Primary ETL	1	https://sra-pub-run-odp.s3.amazonaws.com/sra/SRR37273407/SRR37273407	worldwide	anonymous	https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos10/sra-pub-run-1001/SRR037/37273/SRR37273407/SRR37273407.1	https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos10/sra-pub-run-1001/SRR037/37273/SRR37273407/SRR37273407.1	worldwide	anonymous	gs://sra-pub-run-111/SRR37273407/SRR37273407.1	gs.us-east1	gcp identity		TPS3	marine sediment	2013-03	Malaysia: Tioman Island, Pahang	cell culture	Prof Annie Tan, Universiti Malaya	Microbe, viral or environmental		http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR372/007/SRR37273407/SRR37273407_1.fastq.gz			era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR372/007/SRR37273407/SRR37273407_1.fastq.gz	
-SRR37273408	SRP677941	Complete genome assembly of the first Streptomyces acidocola isolate from Malaysia using Nanopore long-read sequencing and Illumina polishing	SRX32206377	Illumina DNA-Seq of Streptomyces acidicola	Illumina DNA-Seq of Streptomyces acidicola	2596892	Streptomyces acidicola	TPS3_Illumina	WGS	GENOMIC	RANDOM	PAIRED	SRS28131937		SAMN55411282	PRJNA1425250	Illumina MiSeq	Illumina MiSeq	ILLUMINA	2935209	1013212413	2935209	1472053376	TPS3_S1_L001_R1_001.fastq.gz	SRR37273408.lite	376905109	2026-02-18 07:15:02	de5fc51b2d7f6f455ec7659b0f3467fc	1	SRA Lite	Primary ETL	1	s3://sra-pub-zq-5/SRR37273408/SRR37273408.lite.1	s3.us-east-1	aws identity	https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos10/sra-pub-zq-1002/SRR037/37273/SRR37273408/SRR37273408.lite.1	https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos10/sra-pub-zq-1002/SRR037/37273/SRR37273408/SRR37273408.lite.1	worldwide	anonymous	gs://sra-pub-zq-109/SRR37273408/SRR37273408.lite.1	gs.us-east1	gcp identity		TPS3	marine sediment	2013-03	Malaysia: Tioman Island, Pahang	cell culture	Prof Annie Tan, Universiti Malaya	Microbe, viral or environmental		http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR372/008/SRR37273408/SRR37273408_1.fastq.gz	http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR372/008/SRR37273408/SRR37273408_2.fastq.gz		era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR372/008/SRR37273408/SRR37273408_1.fastq.gz	era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR372/008/SRR37273408/SRR37273408_2.fastq.gz
+run_accession	study_accession	study_title	experiment_accession	experiment_title	experiment_desc	organism_taxid	organism_name	library_name	library_strategy	library_source	library_selection	library_layout	sample_accession	sample_title	biosample	bioproject	instrument	instrument_model	instrument_model_desc	total_spots	total_size	run_total_spots	run_total_bases	run_alias	public_filename	public_url	public_size	public_date	public_md5	public_version	public_semantic_name	public_supertype	public_sratoolkit	aws_url	aws_free_egress	aws_access_type	ncbi_url	ncbi_free_egress	ncbi_access_type	gcp_url	gcp_free_egress	gcp_access_type	experiment_alias	strain	isolation_source	collection_date	geo_loc_name	sample_type	biomaterial_provider	biosamplemodel	ena_fastq_http	ena_fastq_http_1	ena_fastq_http_2	ena_fastq_ftp	ena_fastq_ftp_1	ena_fastq_ftp_2
+SRR37273407	SRP677941	Complete genome assembly of the first Streptomyces acidocola isolate from Malaysia using Nanopore long-read sequencing and Illumina polishing	SRX32206378	Nanopore DNA-Seq of Streptomyces acidicola	Nanopore DNA-Seq of Streptomyces acidicola	2596892	Streptomyces acidicola	TPS3_Nanopore	WGS	GENOMIC	RANDOM	SINGLE	SRS28131937		SAMN55411282	PRJNA1425250	MinION	MinION	OXFORD_NANOPORE	45419	184487295	45419	203067398	TPS3_nanopore_hac.fastq.gz	TPS3_nanopore_hac.fastq.gz	https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos10/sra-pub-run-1001/SRR037/37273/SRR37273407/SRR37273407.1	210524830	2026-02-18 05:57:53	ff8a0c7837c47dc8a0331639c25ebda9	1	fastq	Original	0	s3://sra-pub-src-18/SRR37273407/TPS3_nanopore_hac.fastq.gz.1	-	Use Cloud Data Delivery	https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos10/sra-pub-run-1001/SRR037/37273/SRR37273407/SRR37273407.1	worldwide	anonymous	gs://sra-pub-run-111/SRR37273407/SRR37273407.1	gs.us-east1	gcp identity		TPS3	marine sediment	2013-03	Malaysia: Tioman Island, Pahang	cell culture	Prof Annie Tan, Universiti Malaya	Microbe, viral or environmental		http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR372/007/SRR37273407/SRR37273407_1.fastq.gz			era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR372/007/SRR37273407/SRR37273407_1.fastq.gz	
+SRR37273408	SRP677941	Complete genome assembly of the first Streptomyces acidocola isolate from Malaysia using Nanopore long-read sequencing and Illumina polishing	SRX32206377	Illumina DNA-Seq of Streptomyces acidicola	Illumina DNA-Seq of Streptomyces acidicola	2596892	Streptomyces acidicola	TPS3_Illumina	WGS	GENOMIC	RANDOM	PAIRED	SRS28131937		SAMN55411282	PRJNA1425250	Illumina MiSeq	Illumina MiSeq	ILLUMINA	2935209	1013212413	2935209	1472053376	TPS3_S1_L001_R1_001.fastq.gz	SRR37273408.lite	https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos10/sra-pub-zq-1002/SRR037/37273/SRR37273408/SRR37273408.lite.1	376905109	2026-02-18 07:15:02	de5fc51b2d7f6f455ec7659b0f3467fc	1	SRA Lite	Primary ETL	1	s3://sra-pub-zq-5/SRR37273408/SRR37273408.lite.1	s3.us-east-1	aws identity	https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos10/sra-pub-zq-1002/SRR037/37273/SRR37273408/SRR37273408.lite.1	worldwide	anonymous	gs://sra-pub-zq-109/SRR37273408/SRR37273408.lite.1	gs.us-east1	gcp identity		TPS3	marine sediment	2013-03	Malaysia: Tioman Island, Pahang	cell culture	Prof Annie Tan, Universiti Malaya	Microbe, viral or environmental		http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR372/008/SRR37273408/SRR37273408_1.fastq.gz	http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR372/008/SRR37273408/SRR37273408_2.fastq.gz		era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR372/008/SRR37273408/SRR37273408_1.fastq.gz	era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR372/008/SRR37273408/SRR37273408_2.fastq.gz

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: --assay (metadata download):
- step_state: scheduled
Step 3: --desc (metadata download):
- step_state: scheduled
Step 4: --detailed (metadata download):
- step_state: scheduled
Step 5: --expand (metadata download):
- step_state: scheduled
Step 6: Group by Experiments (fastq download):
- step_state: scheduled
Step 7: Group by Sample (fastq download):
- step_state: scheduled

Step 8: Separe BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmpabiru3i0/files/2/4/0/dataset_24064dad-8e03-4536-a14a-fb53fe6a37c3.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"txt"`
__workflow_invocation_uuid__	`"ff55b0ee278211f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 9, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`

Step 9: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"ff55b0ee278211f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Job 2:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"ff55b0ee278211f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 10: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1425250' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"ff55b0ee278211f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1425250", "selector": "metadata"}`
dbkey	`"?"`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"ff55b0ee278211f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`

Step 11: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpabiru3i0/job_working_directory/000/13/configs/tmpqj0rvf_x' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"ff55b0ee278211f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 16, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpabiru3i0/job_working_directory/000/14/configs/tmp01m_cuuh' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"ff55b0ee278211f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 17, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 12: Sequences Download (toolshed.g2.bx.psu.edu/repos/iuc/fastq_dl/fastq_dl/3.0.1+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/fastq-dl:3.0.1--pyhdfd78af_0

Command Line:

mkdir -p single-end paired-end logs && mapfile -t accessionsarr < "/tmp/tmpabiru3i0/files/f/3/3/dataset_f3357cc6-2d85-475d-ae3a-a7ee187244e8.dat" &&  for accessionid in "${accessionsarr[@]}"; do fastq-dl --accession "$accessionid" --provider ena --only-provider   ; exit_code=$? ; if [ $exit_code -ne 0 ]; then echo "fastq-dl failed for accession: ${accessionid}" >&2 ; exit $exit_code ; break ; else mv fastq-run-info.tsv logs/"$accessionid"-fastq-run-info.tsv > /dev/null 2>&1 || true; fi ; done  && find . -maxdepth 1 -name "*_1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_2/_reverse/")"' {} \; && find . -maxdepth 1 -name "*_R1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_R2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R2/_reverse/")"' {} \; && mv *.gz single-end > /dev/null 2>&1 || true
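The sorting logic at the end of this command line can be read as the following Python sketch. It only mirrors the `find`/`sed`/`mv` rules visible above; the function name `classify` is a hypothetical helper and is not part of the fastq_dl tool wrapper.

```python
from typing import Tuple

# Suffix rules in the order the command line applies them:
# (tag before ".fastq.gz", replacement used by the sed expression)
RULES = [("_1", "_forward"), ("_2", "_reverse"),
         ("_R1", "_forward"), ("_R2", "_reverse")]


def classify(filename: str) -> Tuple[str, str]:
    """Return (target directory, new file name) for one downloaded file."""
    for tag, replacement in RULES:
        if filename.endswith(tag + ".fastq.gz"):
            # sed "s/_1/_forward/" replaces only the first occurrence
            return "paired-end", filename.replace(tag, replacement, 1)
    if filename.endswith(".gz"):
        # remaining *.gz files are moved unchanged to single-end
        return "single-end", filename
    return ".", filename


print(classify("SRR37273408_1.fastq.gz"))
print(classify("SRR37273408_2.fastq.gz"))
```

Under these rules any file ending in `_1.fastq.gz` is placed in `paired-end`, whether or not a matching `_2` file exists.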

Exit Code:

```
0
```

Standard Error:

2026-03-24 13:12:50 INFO     2026-03-24 13:12:50:root:INFO -     download.py:189
                             Query: SRR37273407                                 
                    INFO     2026-03-24 13:12:50:root:INFO -     download.py:190
                             Archive: ena                                       
                    INFO     2026-03-24 13:12:50:root:INFO -     download.py:195
                             Total Runs To Download: 1                          
                    INFO     2026-03-24 13:12:50:root:INFO -     download.py:214
                             Working on run SRR37273407...                      
                    INFO     2026-03-24 13:12:50:root:INFO -          ena.py:167
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/15/working/SRR37273407_1.fastq.gz FTP           
                             download attempt 1                                 
2026-03-24 13:13:18 INFO     2026-03-24 13:13:18:root:INFO -          ena.py:195
                             Successfully downloaded                            
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/15/working/SRR37273407_1.fastq.gz               
                    INFO     2026-03-24 13:13:18:root:INFO -     download.py:311
                             Writing metadata to                                
                             /tmp/tmpabiru3i0/job_working_direct                
                             ory/000/15/working/fastq-run-info.t                
                             sv                                                 
2026-03-24 13:13:21 INFO     2026-03-24 13:13:21:root:INFO -     download.py:189
                             Query: SRR37273408                                 
                    INFO     2026-03-24 13:13:21:root:INFO -     download.py:190
                             Archive: ena                                       
                    INFO     2026-03-24 13:13:21:root:INFO -     download.py:195
                             Total Runs To Download: 1                          
                    INFO     2026-03-24 13:13:21:root:INFO -     download.py:214
                             Working on run SRR37273408...                      
                    INFO     2026-03-24 13:13:21:root:INFO -          ena.py:167
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/15/working/SRR37273408_1.fastq.gz FTP           
                             download attempt 1                                 
2026-03-24 13:15:40 INFO     2026-03-24 13:15:40:root:INFO -          ena.py:195
                             Successfully downloaded                            
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/15/working/SRR37273408_1.fastq.gz               
                    INFO     2026-03-24 13:15:40:root:INFO -          ena.py:167
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/15/working/SRR37273408_2.fastq.gz FTP           
                             download attempt 1                                 
2026-03-24 13:18:18 INFO     2026-03-24 13:18:18:root:INFO -          ena.py:195
                             Successfully downloaded                            
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/15/working/SRR37273408_2.fastq.gz               
                    INFO     2026-03-24 13:18:18:root:INFO -     download.py:311
                             Writing metadata to                                
                             /tmp/tmpabiru3i0/job_working_direct                
                             ory/000/15/working/fastq-run-info.t                
                             sv

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"ff55b0ee278211f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
group_by_experiment	`false`
group_by_sample	`false`
input_type	`{"__current_case__": 1, "accessions_file": {"values": [{"id": 18, "src": "dce"}]}, "select_input_type": "accessions_list"}`
only_download_metadata	`false`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/fastq-dl:3.0.1--pyhdfd78af_0

Command Line:

mkdir -p single-end paired-end logs && mapfile -t accessionsarr < "/tmp/tmpabiru3i0/files/a/c/5/dataset_ac5f9edf-a538-4667-af4e-85c0f93dac92.dat" &&  for accessionid in "${accessionsarr[@]}"; do fastq-dl --accession "$accessionid" --provider ena --only-provider   ; exit_code=$? ; if [ $exit_code -ne 0 ]; then echo "fastq-dl failed for accession: ${accessionid}" >&2 ; exit $exit_code ; break ; else mv fastq-run-info.tsv logs/"$accessionid"-fastq-run-info.tsv > /dev/null 2>&1 || true; fi ; done  && find . -maxdepth 1 -name "*_1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_2/_reverse/")"' {} \; && find . -maxdepth 1 -name "*_R1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_R2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R2/_reverse/")"' {} \; && mv *.gz single-end > /dev/null 2>&1 || true

Exit Code:

```
0
```

Standard Error:

2026-03-24 13:18:30 INFO     2026-03-24 13:18:30:root:INFO -     download.py:189
                             Query: SRR37073390                                 
                    INFO     2026-03-24 13:18:30:root:INFO -     download.py:190
                             Archive: ena                                       
                    INFO     2026-03-24 13:18:30:root:INFO -     download.py:195
                             Total Runs To Download: 1                          
                    INFO     2026-03-24 13:18:30:root:INFO -     download.py:214
                             Working on run SRR37073390...                      
                    INFO     2026-03-24 13:18:30:root:INFO -          ena.py:167
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/16/working/SRR37073390_1.fastq.gz FTP           
                             download attempt 1                                 
2026-03-24 13:22:04 INFO     2026-03-24 13:22:04:root:INFO -          ena.py:195
                             Successfully downloaded                            
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/16/working/SRR37073390_1.fastq.gz               
                    INFO     2026-03-24 13:22:04:root:INFO -          ena.py:167
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/16/working/SRR37073390_2.fastq.gz FTP           
                             download attempt 1                                 
2026-03-24 13:24:48 INFO     2026-03-24 13:24:48:root:INFO -          ena.py:195
                             Successfully downloaded                            
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/16/working/SRR37073390_2.fastq.gz               
                    INFO     2026-03-24 13:24:48:root:INFO -     download.py:311
                             Writing metadata to                                
                             /tmp/tmpabiru3i0/job_working_direct                
                             ory/000/16/working/fastq-run-info.t                
                             sv

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"ff55b0ee278211f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
group_by_experiment	`false`
group_by_sample	`false`
input_type	`{"__current_case__": 1, "accessions_file": {"values": [{"id": 19, "src": "dce"}]}, "select_input_type": "accessions_list"}`
only_download_metadata	`false`

Other invocation details
- history_id
  - 5a6a5751b6a153f4
- history_state
  - ok
- invocation_id
  - 5a6a5751b6a153f4
- invocation_state
  - scheduled
- workflow_id
  - 5a6a5751b6a153f4

Passed Tests

✅ metadata-and-sequences-from-BioProjectIDs.ga_0

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: --assay (metadata download):
- step_state: scheduled
Step 3: --desc (metadata download):
- step_state: scheduled
Step 4: --detailed (metadata download):
- step_state: scheduled
Step 5: --expand (metadata download):
- step_state: scheduled
Step 6: Group by Experiments (fastq download):
- step_state: scheduled
Step 7: Group by Sample (fastq download):
- step_state: scheduled

Step 8: Separe BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmpabiru3i0/files/6/5/0/dataset_650eb191-696e-426f-8905-82b86ce254eb.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tabular"`
__workflow_invocation_uuid__	`"7108edcc277f11f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 1, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`

Step 9: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"7108edcc277f11f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 10: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"7108edcc277f11f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`

Step 11: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpabiru3i0/job_working_directory/000/5/configs/tmpbv_6j9a9' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"7108edcc277f11f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 3, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 12: Sequences Download (toolshed.g2.bx.psu.edu/repos/iuc/fastq_dl/fastq_dl/3.0.1+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/fastq-dl:3.0.1--pyhdfd78af_0

Command Line:

mkdir -p single-end paired-end logs && mapfile -t accessionsarr < "/tmp/tmpabiru3i0/files/8/e/f/dataset_8ef8b367-6528-49e8-ae21-816f4d9f5d98.dat" &&  for accessionid in "${accessionsarr[@]}"; do fastq-dl --accession "$accessionid" --provider ena --only-provider   ; exit_code=$? ; if [ $exit_code -ne 0 ]; then echo "fastq-dl failed for accession: ${accessionid}" >&2 ; exit $exit_code ; break ; else mv fastq-run-info.tsv logs/"$accessionid"-fastq-run-info.tsv > /dev/null 2>&1 || true; fi ; done  && find . -maxdepth 1 -name "*_1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_2/_reverse/")"' {} \; && find . -maxdepth 1 -name "*_R1.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R1/_forward/")"' {} \; && find . -maxdepth 1 -name "*_R2.fastq.gz" -exec bash -c 'mv "$0" "paired-end/$(basename "$0" | sed "s/_R2/_reverse/")"' {} \; && mv *.gz single-end > /dev/null 2>&1 || true

Exit Code:

```
0
```

Standard Error:

2026-03-24 12:48:32 INFO     2026-03-24 12:48:32:root:INFO -     download.py:189
                             Query: SRR37073390                                 
                    INFO     2026-03-24 12:48:32:root:INFO -     download.py:190
                             Archive: ena                                       
                    INFO     2026-03-24 12:48:32:root:INFO -     download.py:195
                             Total Runs To Download: 1                          
                    INFO     2026-03-24 12:48:32:root:INFO -     download.py:214
                             Working on run SRR37073390...                      
                    INFO     2026-03-24 12:48:32:root:INFO -          ena.py:167
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/6/working/SRR37073390_1.fastq.gz FTP            
                             download attempt 1                                 
2026-03-24 12:50:18 INFO     2026-03-24 12:50:18:root:INFO -          ena.py:195
                             Successfully downloaded                            
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/6/working/SRR37073390_1.fastq.gz                
                    INFO     2026-03-24 12:50:18:root:INFO -          ena.py:167
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/6/working/SRR37073390_2.fastq.gz FTP            
                             download attempt 1                                 
2026-03-24 13:10:37 INFO     2026-03-24 13:10:37:root:INFO -          ena.py:195
                             Successfully downloaded                            
                             /tmp/tmpabiru3i0/job_working_directory/0           
                             00/6/working/SRR37073390_2.fastq.gz                
                    INFO     2026-03-24 13:10:37:root:INFO -     download.py:311
                             Writing metadata to                                
                             /tmp/tmpabiru3i0/job_working_direct                
                             ory/000/6/working/fastq-run-info.ts                
                             v

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"7108edcc277f11f1a1ae70a8a56e7439"`
chromInfo	`"/tmp/tmpabiru3i0/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
group_by_experiment	`false`
group_by_sample	`false`
input_type	`{"__current_case__": 1, "accessions_file": {"values": [{"id": 4, "src": "dce"}]}, "select_input_type": "accessions_list"}`
only_download_metadata	`false`

Other invocation details
- history_id
  - 9a7ddd52088ca260
- history_state
  - ok
- invocation_id
  - 9a7ddd52088ca260
- invocation_state
  - scheduled
- workflow_id
  - 5a6a5751b6a153f4

Copilot

Pull request overview

Adds a new Galaxy workflow that downloads SRA metadata tables and FASTQ collections starting from a list of BioProject IDs, along with packaging/registry metadata and workflow tests.

Changes:

Added the metadata-and-sequences-from-BioProjectIDs Galaxy workflow using pysradb search + fastq-dl.
Added Dockstore/WorkflowHub configuration plus README and changelog for the new workflow.
Added workflow tests and associated test-data fixtures (BioProject ID lists, expected metadata TSVs, and FASTQ snippets).
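
As a rough sketch of the hand-off between the metadata step and the download step, the snippet below pulls the `run_accession` column out of a pysradb-style TSV. The function name `run_accessions` and the inline sample are illustrative assumptions; the workflow itself performs this selection with the table_compute tool rather than with this code.

```python
import csv
import io
from typing import List


def run_accessions(tsv_text: str) -> List[str]:
    """Collect the run_accession values from a tab-separated metadata table."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    column = header.index("run_accession")
    # Skip empty rows, such as a trailing blank line
    return [row[column] for row in reader if row]


# Two-column excerpt of the detailed metadata table (assumed sample input)
sample = (
    "run_accession\tstudy_accession\n"
    "SRR37273407\tSRP677941\n"
    "SRR37273408\tSRP677941\n"
)
print(run_accessions(sample))
```

Each accession returned this way corresponds to one `fastq-dl --accession` call in the download step.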

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 16 comments.

Show a summary per file

File	Description
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/metadata-and-sequences-from-BioProjectIDs.ga	New Galaxy workflow definition (inputs/steps/outputs).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/metadata-and-sequences-from-BioProjectIDs-tests.yml	New workflow test cases validating metadata + FASTQ outputs.
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/README.md	Workflow documentation (purpose, inputs, outputs).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/CHANGELOG.md	Initial changelog entry for the workflow.
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/.workflowhub.yml	WorkflowHub publishing configuration for the workflow.
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/.dockstore.yml	Dockstore descriptor pointing to the workflow and test definitions.
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test1_single_prj_pe.txt	Test input BioProject ID list (single project).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test1_metadata_file_split_file_000000.txt.tsv	Expected metadata TSV for test 1.
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test1_paired_end_collection_forward.fastq	Expected FASTQ snippet for test 1 (forward).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test1_paired_end_collection_reverse.fastq	Expected FASTQ snippet for test 1 (reverse).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_multiple_prj_mixed.txt	Test input BioProject ID list (multiple projects).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_metadata_file_split_file_000000.txt.tsv	Expected metadata TSV for test 2 (project 1).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_metadata_file_split_file_000001.txt.tsv	Expected metadata TSV for test 2 (project 2).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_SRR37273407_forward.fastq	Expected FASTQ snippet for test 2 (single-end/forward-only run).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_SRR37273408_forward.fastq	Expected FASTQ snippet for test 2 (paired-end forward).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_SRR37273408_reverse.fastq	Expected FASTQ snippet for test 2 (paired-end reverse).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_SRR37073390_forward.fastq	Expected FASTQ snippet for test 2 (paired-end forward).
workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_SRR37073390_reverse.fastq	Expected FASTQ snippet for test 2 (paired-end reverse).
.idea/.gitignore	Adds JetBrains IDE ignore rules within a committed `.idea/` directory.

Files not reviewed (1)

.idea/.gitignore: Language not supported

Copilot · 2026-03-27T14:33:56Z

+  primaryDescriptorPath: /metadata-and-sequences-from-BioProjectIDs.ga
+  testParameterFiles:
+  - /metadata-and-sequences-from-BioProjectIDs-tests.yml


The referenced workflow/test filenames in primaryDescriptorPath / testParameterFiles include uppercase letters and “BioProjectIDs” without a space. For new IWC workflow additions, filenames/folder names are expected to be lowercase with dashes and human-readable wording (e.g., ...-bioproject-ids...). Consider renaming the workflow and test files (and updating these paths) to match that convention.
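
The naming convention described in this comment can be checked mechanically. The following is a minimal sketch, assuming the convention reduces to "lowercase letters and digits, with words joined by single dashes". The helper name `is_iwc_style_name` is hypothetical and is not part of any IWC tooling.

```python
import re

# Assumed reading of the convention: lowercase alphanumeric words joined by single dashes.
_IWC_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_iwc_style_name(name: str) -> bool:
    """Return True if `name` is all lowercase with dash-separated words."""
    return _IWC_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in (
        "metadata-and-sequences-from-BioProjectIDs",
        "metadata-and-sequences-from-bioproject-ids",
    ):
        print(candidate, is_iwc_style_name(candidate))
```

A check like this only covers casing and separators; whether the wording is human-readable still needs a reviewer.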

Copilot · 2026-03-27T14:33:57Z

@@ -0,0 +1,517 @@
+{
+    "a_galaxy_workflow": "true",
+    "annotation": "This workflow takes BioProject IDs as input and is able to retrieve SRA tables and FASTQ files from them.",


The workflow annotation does not follow the repository’s required phrasing (it should start with “This workflow does/runs/performs …” and ideally mention the main outputs). Please reword this annotation to match the expected format.

Suggested change

"annotation": "This workflow takes BioProject IDs as input and is able to retrieve SRA tables and FASTQ files from them.",

"annotation": "This workflow performs retrieval of SRA metadata tables and FASTQ sequence files from input BioProject IDs.",

Copilot · 2026-03-27T14:33:57Z

+                {
+                    "description": "Enable it to display detailed metadata table",
+                    "name": "--detailed (metadata download)"
+                }
+            ],
+            "label": "--detailed (metadata download)",
+            "name": "Input parameter",


This workflow input label/name (--detailed (metadata download)) is CLI-flag styled and not human-readable. Please rename it to a descriptive label and update the corresponding key in *-tests.yml accordingly.

Copilot · 2026-03-27T14:33:57Z

+    BioProject IDs:
+      class: File
+      path: test-data/test1_single_prj_pe.txt
+      filetype: tabular


Test 1 declares the BioProject ID list input as filetype: tabular, while Test 2 uses filetype: txt. Given this input is a plain list of IDs (no header), consider using txt consistently across tests (and align with the workflow input datatype).

Suggested change

filetype: tabular

filetype: txt

Copilot · 2026-03-27T14:33:57Z

+registries:
+- url: https://workflowhub.eu
+  project: iwc
+  workflow: metadata-and-sequences-from-BioProjectsIDs/main


The WorkflowHub workflow: path has a typo/inconsistency (metadata-and-sequences-from-BioProjectsIDs/main): it doesn’t match this directory name (metadata-and-sequences-from-BioProjectIDs) and will likely break publication/registration. Please update it to the correct (and ideally lowercase) workflow slug.
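
The mismatch flagged here can be caught before review with a small comparison. This is a sketch only, assuming (as the comment implies) that the part of `workflow:` before the slash should equal the workflow directory name. The function name `workflowhub_slug_matches` is hypothetical.

```python
from pathlib import PurePosixPath


def workflowhub_slug_matches(config_path: str, workflow_field: str) -> bool:
    """Compare the slug in `workflow:` with the directory holding .workflowhub.yml."""
    # Directory that contains the .workflowhub.yml file.
    directory = PurePosixPath(config_path).parent.name
    # Assumed layout of the field: "<slug>/<version-or-branch>".
    slug = workflow_field.split("/", 1)[0]
    return slug == directory


if __name__ == "__main__":
    path = "workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/.workflowhub.yml"
    print(workflowhub_slug_matches(path, "metadata-and-sequences-from-BioProjectsIDs/main"))
```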

Suggested change

workflow: metadata-and-sequences-from-BioProjectsIDs/main

workflow: metadata-and-sequences-from-bioprojectids/main

Copilot · 2026-03-27T14:34:00Z

+# Default ignored files
+/shelf/
+/workspace.xml
+# Editor-based HTTP Client requests
+/httpRequests/
+# Datasource local storage ignored files
+/dataSources/
+/dataSources.local.xml


The repository-level .gitignore does not currently ignore JetBrains .idea/, and this PR adds .idea/.gitignore. Project-specific IDE metadata generally shouldn’t be committed; instead, remove the .idea/ directory from the repo and add .idea/ to the top-level .gitignore if needed.

Suggested change

# Default ignored files
/shelf/
/workspace.xml
# Editor-based HTTP Client requests
/httpRequests/
# Datasource local storage ignored files
/dataSources/
/dataSources.local.xml
# Ignore all JetBrains IDE project files in this directory
/*
!.gitignore

Copilot · 2026-03-27T14:34:00Z

+    "name": "Metadata and Sequences from BioProjectIDs",
+    "readme": "# Metadata and Sequences from BioProjectIDs\n\n## Rationale\nIn order to promote re-analysis of publicly available sequences from INSDC databases, we propose *Metadata and Sequences from BioProjectIDs* a Galaxy workflow that starting by a list of valid BioProject IDs (e.g. PRJNA....., PRJEB.....) is able to manage data and metadata download.\n\n## Usage\nUpload a text file in which there is a BioProject ID for each row and run the workflow.\n\nThere is also the possibility to set optional options to regulate behaviour of metadata and data download.\n",


The workflow name uses “BioProjectIDs” without a space, which is not human-readable and is inconsistent with the “BioProject IDs” wording used elsewhere. Consider renaming to use spaces (e.g., “BioProject IDs”) and ensure related filenames/folder name follow the same convention.

Copilot · 2026-03-27T14:34:00Z

+                {
+                    "description": "Enable it to include assay type in output",
+                    "name": "--assay (metadata download)"
+                }
+            ],
+            "label": "--assay (metadata download)",
+            "name": "Input parameter",


This workflow input label/name (--assay (metadata download)) is CLI-flag styled and not human-readable. Please rename it to a descriptive label (e.g., “Include assay type in metadata”) and update the corresponding key in *-tests.yml accordingly.

Copilot · 2026-03-27T14:34:00Z

+            "workflow_outputs": [
+                {
+                    "label": "metadata_file",
+                    "output_name": "metadata_file",
+                    "uuid": "936ed258-6b4a-412c-9c61-b475d5da1251"
+                }


The workflow output label metadata_file is not human-readable (underscores) and won’t render nicely in Galaxy or in test definitions. Please change workflow_outputs[].label to a human-readable phrase and then update the matching output name used in metadata-and-sequences-from-BioProjectIDs-tests.yml.

Copilot · 2026-03-27T14:34:01Z

+            "workflow_outputs": [
+                {
+                    "label": "paired_end_collection",
+                    "output_name": "paired_end_collection",
+                    "uuid": "d08a8412-887b-4800-a200-916745f7c65e"
+                },
+                {
+                    "label": "single_end_collection",
+                    "output_name": "single_end_collection",
+                    "uuid": "9deedcc0-92ff-4845-8cfc-a9a1e70d9375"
+                }


The workflow output labels paired_end_collection / single_end_collection are not human-readable (underscores). Please rename these workflow_outputs[].label values to human-readable phrases and update the corresponding output keys in metadata-and-sequences-from-BioProjectIDs-tests.yml to match exactly.
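
The label issues raised in these comments (underscores, CLI-flag styling) follow two simple patterns. A minimal sketch of a label check, assuming only those two rules; `label_problems` is a hypothetical helper and does not reflect how IWC linting is actually implemented.

```python
from typing import List


def label_problems(label: str) -> List[str]:
    """List the readability problems found in a workflow input or output label."""
    problems = []
    if "_" in label:
        problems.append("contains underscore")
    if label.lstrip().startswith("-"):
        problems.append("starts like a CLI flag")
    return problems


if __name__ == "__main__":
    for label in ("metadata_file", "--assay (metadata download)", "Include assay type in metadata"):
        print(repr(label), label_problems(label))
```

Any label renamed in the workflow must be renamed identically in the tests file, because the test keys are matched against the labels exactly.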

mvdbeek · 2026-03-27T14:58:30Z

Thanks @gdefazio, this seems quite useful. I note that there is some overlap with https://iwc.galaxyproject.org/workflow/sra-manifest-to-concatenated-fastqs-main/ and https://iwc.galaxyproject.org/workflow/parallel-accession-download-main/. Could you use either of these workflows as a subworkflow to handle the downloads based on the manifest you're generating ?

gdefazio · 2026-03-27T18:07:53Z

> Thanks @gdefazio, this seems quite useful. I note that there is some overlap with https://iwc.galaxyproject.org/workflow/sra-manifest-to-concatenated-fastqs-main/ and https://iwc.galaxyproject.org/workflow/parallel-accession-download-main/. Could you use either of these workflows as a subworkflow to handle the downloads based on the manifest you're generating ?

Hi @mvdbeek, thank you for your review and comment. I'm new to implementing Galaxy workflows, so I need a few days to understand how your suggestions could apply to this workflow. Parallelization would probably be useful, but please let me take an in-depth look.
Thanks a lot.

gdefazio · 2026-04-08T12:46:51Z

Hi @mvdbeek, I tried to integrate parallel-accession-download-main, but I had a lot of problems with job scheduling for the Apply Rules tool of that workflow because of its overly nested structure. I then integrated fasterq-dump into my workflow instead, and I think that, thanks to your suggestion, the workflow is now better. I have been waiting for CI to start since yesterday, but it seems stuck.

github-actions · 2026-04-08T20:59:26Z

Test Results (powered by Planemo)

Test Summary

Test State	Count
Total	2
Passed	0
Error	0
Failure	2
Skipped	0

Failed Tests

❌ metadata-and-sequences-from-bioproject-ids.ga_0

Problems:

Output collection 'PE output': failed to find identifier 'split_file_000000.txt' in the tool generated elements ['SRR37073390']

Output collection 'SE output': failed to find identifier 'split_file_000000.txt' in the tool generated elements []
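
Both messages report that an element identifier expected by the test is absent from the elements the workflow produced. The comparison can be sketched as below; this illustrates the kind of check behind the message and is not Planemo's implementation. `missing_identifiers` is a hypothetical name.

```python
from typing import Iterable, List


def missing_identifiers(expected: Iterable[str], generated: Iterable[str]) -> List[str]:
    """Return the expected element identifiers that the output collection lacks."""
    present = set(generated)
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # Values copied from the two problem messages above.
    print(missing_identifiers(["split_file_000000.txt"], ["SRR37073390"]))
    print(missing_identifiers(["split_file_000000.txt"], []))
```

In this run the produced elements are keyed by run accession (`SRR37073390`), so a test expectation keyed by the split-file name cannot match.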

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: assay (metadata download):
- step_state: scheduled
Step 3: desc (metadata download):
- step_state: scheduled
Step 4: detailed (metadata download):
- step_state: scheduled
Step 5: expand (metadata download):
- step_state: scheduled

Step 6: Separate BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmpz_szomsr/files/4/a/8/dataset_4a855306-4b4e-465e-80f5-1ee96c06d496.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"txt"`
__workflow_invocation_uuid__	`"069c75a8338711f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 1, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`
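
With `--chunksize 1`, this step turns each row of the input list into its own collection element. A rough Python sketch of that behaviour follows. The element naming pattern (`split_file_000000.txt`) is inferred from the identifier in the error message above, and skipping blank lines is an assumption; `split_rows` is a hypothetical helper, not the tool's code.

```python
from typing import Dict, Iterable


def split_rows(lines: Iterable[str], prefix: str = "split_file", ext: str = "txt") -> Dict[str, str]:
    """Map one element name to each non-empty row, numbering from zero."""
    rows = [line.strip() for line in lines if line.strip()]
    # Six-digit zero padding is inferred from the observed name split_file_000000.txt.
    return {f"{prefix}_{i:06d}.{ext}": row for i, row in enumerate(rows)}


if __name__ == "__main__":
    for name, project in split_rows(["PRJNA1425250\n", "PRJNA1417618\n"]).items():
        print(name, project)
```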

Step 7: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"069c75a8338711f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 8: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"069c75a8338711f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`
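
The `conditional_subcommand` parameters above correspond to the `pysradb metadata` command line shown for this job: only the boolean set to `true` (`detailed`) appears as a flag. A minimal sketch of that mapping, assuming the flags are emitted in the fixed order used here; `pysradb_metadata_args` is a hypothetical helper and not the Galaxy tool wrapper itself.

```python
from typing import Dict, List


def pysradb_metadata_args(params: Dict[str, object]) -> List[str]:
    """Build an argument list for `pysradb metadata` from the job parameters."""
    args = ["pysradb", "metadata", str(params["prj_id"]), "--saveto", "metadata_output.tsv"]
    # Each boolean parameter becomes a flag only when it is true.
    for flag in ("assay", "desc", "detailed", "expand"):
        if params.get(flag):
            args.append(f"--{flag}")
    return args


if __name__ == "__main__":
    job = {"assay": False, "desc": False, "detailed": True, "expand": False, "prj_id": "PRJNA1417618"}
    print(" ".join(pysradb_metadata_args(job)))
```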

Step 9: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpz_szomsr/job_working_directory/000/5/configs/tmpyet33220' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"069c75a8338711f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 3, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 10: fasterq-dump (toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fasterq_dump/3.1.1+galaxy1):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-2b04072095278721dc9a5772e61e406f399b6030:a95f0e0ff448eede323315668bfa8ee64c918ebb-0

Command Line:

set -o | grep -q pipefail && set -o pipefail;  mkdir -p ~/.ncbi && cp '/tmp/tmpz_szomsr/job_working_directory/000/6/configs/tmpc6vs39sa' ~/.ncbi/user-settings.mkfg &&   export SRA_PREFETCH_RETRIES=3 && export SRA_PREFETCH_ATTEMPT=1 &&    grep '^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$' '/tmp/tmpz_szomsr/files/1/f/7/dataset_1f7a6a71-9a74-46e6-b18f-965339b35316.dat' > accessions && for acc in $(cat ./accessions); do ( echo "Downloading accession: $acc..." &&  while [ $SRA_PREFETCH_ATTEMPT -le $SRA_PREFETCH_RETRIES ] ; do fasterq-dump "$acc" -e ${GALAXY_SLOTS:-1} -t ${TMPDIR} --seq-defline '@$ac.$sn/$ri' --qual-defline '+' --split-3 --skip-technical 2>&1 | tee -a '/tmp/tmpz_szomsr/job_working_directory/000/6/outputs/dataset_a776ab45-57cf-4112-acfd-bbc438d51eb3.dat'; if [ $? == 0 ] && [ $(ls *.fastq | wc -l) -ge 1 ]; then break ; else echo "Prefetch attempt $SRA_PREFETCH_ATTEMPT of $SRA_PREFETCH_RETRIES exited with code $?" ; SRA_PREFETCH_ATTEMPT=`expr $SRA_PREFETCH_ATTEMPT + 1` ; sleep 1 ; fi ; done && mkdir -p output && mkdir -p outputOther && count="$(ls *.fastq | wc -l)" && echo "There are $count fastq files" && data=($(ls *.fastq)) && if [ "$count" -eq 1 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"__single.fastqsanger.gz && rm "${data[0]}"; elif [ "--split-3" = "--split-3" ]; then if [ -e "${acc}".fastq ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${acc}".fastq > outputOther/"${acc}"__single.fastqsanger.gz; fi && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_1.fastq > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_2.fastq > output/"${acc}"_reverse.fastqsanger.gz && rm "${acc}"*.fastq; elif [ "$count" -eq 2 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${data[1]}" > output/"${acc}"_reverse.fastqsanger.gz && rm "${data[0]}" && rm "${data[1]}"; else for file in ${data[*]}; do pigz -cqp ${GALAXY_SLOTS:-1} "$file" > outputOther/"$file"sanger.gz && 
rm "$file"; done; fi;  ); done; echo "Done with all accessions."

Exit Code:

```
0
```

Standard Output:

Downloading accession: SRR37073390...
spots read      : 13,266,400
reads read      : 26,532,800
reads written   : 26,532,800
There are 2 fastq files
Done with all accessions.

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"069c75a8338711f19acb7c1e524174c7"`
adv	`{"minlen": null, "seq_defline": "@$ac.$sn/$ri", "skip_technical": true, "split": "--split-3"}`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
input	`{"__current_case__": 2, "file_list": {"values": [{"id": 4, "src": "dce"}]}, "input_select": "file_list"}`
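
The command line for this step first filters the input file with `grep`, keeping only lines that look like run accessions. A Python sketch of an equivalent filter is given below, translating `[[:space:]]` to `\s` and `\{1,\}` to `+`. The header value `run_accession` in the usage example is hypothetical, and `filter_accessions` is not part of the tool.

```python
import re
from typing import Iterable, List

# Same character class as the grep pattern; inside brackets "|" is a literal
# character, so "[E|S|D]" also accepts a leading "|".
ACCESSION_LINE = re.compile(r"^\s*[E|S|D]RR[0-9]+\s*$")


def filter_accessions(lines: Iterable[str]) -> List[str]:
    """Keep lines consisting solely of an ERR/SRR/DRR-style accession."""
    return [line.strip() for line in lines if ACCESSION_LINE.match(line)]


if __name__ == "__main__":
    sample = ["run_accession\n", "SRR37073390\n", "  ERR000001  \n", "PRJNA1417618\n"]
    print(filter_accessions(sample))
```

Because the filter drops every non-matching line, a header row or a BioProject ID in the accession list is silently ignored rather than reported.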

Step 11: PE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"069c75a8338711f19acb7c1e524174c7"`
input	`{"values": [{"id": 5, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}, {"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [2], "connectable": true, "is_workflow": false, "type": "paired_identifier"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier2", "warn": null}]}
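
The rules above add three identifier columns and then map column 1 to the list identifier and column 2 to the paired identifier, which drops the outermost level. A sketch of the resulting regrouping follows, assuming the `columns` indices are zero-based and that the input is nested as split file, then run accession, then forward/reverse. `regroup` is a hypothetical illustration, not Galaxy's Apply Rules code.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def regroup(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Group rows of (identifier0, identifier1, identifier2) by identifier1."""
    grouped = defaultdict(list)
    for _identifier0, identifier1, identifier2 in rows:
        # identifier0 (the split-file name) is not mapped, so it is discarded.
        grouped[identifier1].append(identifier2)
    return dict(grouped)


if __name__ == "__main__":
    rows = [
        ("split_file_000000.txt", "SRR37073390", "forward"),
        ("split_file_000000.txt", "SRR37073390", "reverse"),
    ]
    print(regroup(rows))
```

Under these assumptions the output collection is keyed by run accession, which is consistent with the generated elements `['SRR37073390']` reported in the test failure.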

Step 12: SE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"069c75a8338711f19acb7c1e524174c7"`
input	`{"values": [{"id": 6, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}]}

Other invocation details
- history_id
  - 86b054d6a33410bd
- history_state
  - ok
- invocation_id
  - 86b054d6a33410bd
- invocation_state
  - scheduled
- workflow_id
  - ce57138c0fb39786

❌ metadata-and-sequences-from-bioproject-ids.ga_1

Problems:

Output collection 'PE output': failed to find identifier 'elements' in the tool generated elements ['SRR37273408', 'SRR37073390']

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: assay (metadata download):
- step_state: scheduled
Step 3: desc (metadata download):
- step_state: scheduled
Step 4: detailed (metadata download):
- step_state: scheduled
Step 5: expand (metadata download):
- step_state: scheduled

Step 6: Separate BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmpz_szomsr/files/4/d/6/dataset_4d62222c-7cc4-4bf7-956b-cd0f70bed67b.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"txt"`
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 11, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`

Step 7: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Job 2:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 8: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1425250' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1425250", "selector": "metadata"}`
dbkey	`"?"`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`

Step 9: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpz_szomsr/job_working_directory/000/15/configs/tmptv3bk39e' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 19, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpz_szomsr/job_working_directory/000/16/configs/tmpc9wqcwhj' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 20, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 10: fasterq-dump (toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fasterq_dump/3.1.1+galaxy1):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-2b04072095278721dc9a5772e61e406f399b6030:a95f0e0ff448eede323315668bfa8ee64c918ebb-0

Command Line:

set -o | grep -q pipefail && set -o pipefail;  mkdir -p ~/.ncbi && cp '/tmp/tmpz_szomsr/job_working_directory/000/17/configs/tmpajb5gv5p' ~/.ncbi/user-settings.mkfg &&   export SRA_PREFETCH_RETRIES=3 && export SRA_PREFETCH_ATTEMPT=1 &&    grep '^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$' '/tmp/tmpz_szomsr/files/1/0/e/dataset_10e4657b-80dd-4e17-b24d-db685187c653.dat' > accessions && for acc in $(cat ./accessions); do ( echo "Downloading accession: $acc..." &&  while [ $SRA_PREFETCH_ATTEMPT -le $SRA_PREFETCH_RETRIES ] ; do fasterq-dump "$acc" -e ${GALAXY_SLOTS:-1} -t ${TMPDIR} --seq-defline '@$ac.$sn/$ri' --qual-defline '+' --split-3 --skip-technical 2>&1 | tee -a '/tmp/tmpz_szomsr/job_working_directory/000/17/outputs/dataset_45b08ec8-5468-4b6b-88a0-da5186449050.dat'; if [ $? == 0 ] && [ $(ls *.fastq | wc -l) -ge 1 ]; then break ; else echo "Prefetch attempt $SRA_PREFETCH_ATTEMPT of $SRA_PREFETCH_RETRIES exited with code $?" ; SRA_PREFETCH_ATTEMPT=`expr $SRA_PREFETCH_ATTEMPT + 1` ; sleep 1 ; fi ; done && mkdir -p output && mkdir -p outputOther && count="$(ls *.fastq | wc -l)" && echo "There are $count fastq files" && data=($(ls *.fastq)) && if [ "$count" -eq 1 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"__single.fastqsanger.gz && rm "${data[0]}"; elif [ "--split-3" = "--split-3" ]; then if [ -e "${acc}".fastq ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${acc}".fastq > outputOther/"${acc}"__single.fastqsanger.gz; fi && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_1.fastq > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_2.fastq > output/"${acc}"_reverse.fastqsanger.gz && rm "${acc}"*.fastq; elif [ "$count" -eq 2 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${data[1]}" > output/"${acc}"_reverse.fastqsanger.gz && rm "${data[0]}" && rm "${data[1]}"; else for file in ${data[*]}; do pigz -cqp ${GALAXY_SLOTS:-1} "$file" > outputOther/"$file"sanger.gz 
&& rm "$file"; done; fi;  ); done; echo "Done with all accessions."

Exit Code:

```
0
```

Standard Output:

Downloading accession: SRR37273407...
spots read      : 45,419
reads read      : 45,419
reads written   : 45,419
There are 1 fastq files
Downloading accession: SRR37273408...
spots read      : 2,935,209
reads read      : 5,870,418
reads written   : 5,870,418
There are 2 fastq files
Done with all accessions.

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
adv	`{"minlen": null, "seq_defline": "@$ac.$sn/$ri", "skip_technical": true, "split": "--split-3"}`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
input	`{"__current_case__": 2, "file_list": {"values": [{"id": 21, "src": "dce"}]}, "input_select": "file_list"}`
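
The standard output above shows one FASTQ file for SRR37273407 and two for SRR37273408. The shell command routes them differently depending on that count. The sketch below mirrors only the `--split-3` branch of the command line, returning a plan instead of compressing files; `plan_outputs` is a hypothetical helper.

```python
from typing import Dict, List


def plan_outputs(acc: str, fastq_files: List[str]) -> Dict[str, str]:
    """Map each destination path to the FASTQ file that would be compressed into it."""
    if len(fastq_files) == 1:
        # Exactly one file: treated as a single-end run.
        return {f"output/{acc}__single.fastqsanger.gz": fastq_files[0]}
    plan = {}
    if f"{acc}.fastq" in fastq_files:
        # Unpaired reads left over by --split-3 go to the secondary output.
        plan[f"outputOther/{acc}__single.fastqsanger.gz"] = f"{acc}.fastq"
    plan[f"output/{acc}_forward.fastqsanger.gz"] = f"{acc}_1.fastq"
    plan[f"output/{acc}_reverse.fastqsanger.gz"] = f"{acc}_2.fastq"
    return plan


if __name__ == "__main__":
    print(plan_outputs("SRR37273407", ["SRR37273407.fastq"]))
    print(plan_outputs("SRR37273408", ["SRR37273408_1.fastq", "SRR37273408_2.fastq"]))
```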

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-2b04072095278721dc9a5772e61e406f399b6030:a95f0e0ff448eede323315668bfa8ee64c918ebb-0

Command Line:

set -o | grep -q pipefail && set -o pipefail;  mkdir -p ~/.ncbi && cp '/tmp/tmpz_szomsr/job_working_directory/000/18/configs/tmpxy4nxptx' ~/.ncbi/user-settings.mkfg &&   export SRA_PREFETCH_RETRIES=3 && export SRA_PREFETCH_ATTEMPT=1 &&    grep '^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$' '/tmp/tmpz_szomsr/files/1/4/d/dataset_14daf8c3-ef50-4a7f-b704-3c283cb14b3b.dat' > accessions && for acc in $(cat ./accessions); do ( echo "Downloading accession: $acc..." &&  while [ $SRA_PREFETCH_ATTEMPT -le $SRA_PREFETCH_RETRIES ] ; do fasterq-dump "$acc" -e ${GALAXY_SLOTS:-1} -t ${TMPDIR} --seq-defline '@$ac.$sn/$ri' --qual-defline '+' --split-3 --skip-technical 2>&1 | tee -a '/tmp/tmpz_szomsr/job_working_directory/000/18/outputs/dataset_7f8aa58d-6820-4082-a715-1a1c87b15752.dat'; if [ $? == 0 ] && [ $(ls *.fastq | wc -l) -ge 1 ]; then break ; else echo "Prefetch attempt $SRA_PREFETCH_ATTEMPT of $SRA_PREFETCH_RETRIES exited with code $?" ; SRA_PREFETCH_ATTEMPT=`expr $SRA_PREFETCH_ATTEMPT + 1` ; sleep 1 ; fi ; done && mkdir -p output && mkdir -p outputOther && count="$(ls *.fastq | wc -l)" && echo "There are $count fastq files" && data=($(ls *.fastq)) && if [ "$count" -eq 1 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"__single.fastqsanger.gz && rm "${data[0]}"; elif [ "--split-3" = "--split-3" ]; then if [ -e "${acc}".fastq ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${acc}".fastq > outputOther/"${acc}"__single.fastqsanger.gz; fi && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_1.fastq > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_2.fastq > output/"${acc}"_reverse.fastqsanger.gz && rm "${acc}"*.fastq; elif [ "$count" -eq 2 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${data[1]}" > output/"${acc}"_reverse.fastqsanger.gz && rm "${data[0]}" && rm "${data[1]}"; else for file in ${data[*]}; do pigz -cqp ${GALAXY_SLOTS:-1} "$file" > outputOther/"$file"sanger.gz 
&& rm "$file"; done; fi;  ); done; echo "Done with all accessions."

Exit Code:

```
0
```

Standard Output:

Downloading accession: SRR37073390...
spots read      : 13,266,400
reads read      : 26,532,800
reads written   : 26,532,800
There are 2 fastq files
Done with all accessions.

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
adv	`{"minlen": null, "seq_defline": "@$ac.$sn/$ri", "skip_technical": true, "split": "--split-3"}`
chromInfo	`"/tmp/tmpz_szomsr/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
input	`{"__current_case__": 2, "file_list": {"values": [{"id": 22, "src": "dce"}]}, "input_select": "file_list"}`
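
The download command wraps `fasterq-dump` in a retry loop governed by `SRA_PREFETCH_RETRIES=3`. The control flow can be sketched as follows. This is a simplified illustration under the assumption that success is a single boolean; the shell version additionally checks that at least one `.fastq` file exists. `run_with_retries` is a hypothetical name.

```python
import time
from typing import Callable


def run_with_retries(action: Callable[[], bool], retries: int = 3, delay: float = 1.0) -> bool:
    """Call `action` until it returns True or `retries` attempts are used up."""
    attempt = 1
    while attempt <= retries:
        if action():
            return True
        print(f"Prefetch attempt {attempt} of {retries} failed")
        attempt += 1
        time.sleep(delay)
    return False


if __name__ == "__main__":
    outcomes = iter([False, False, True])
    print(run_with_retries(lambda: next(outcomes), delay=0))
```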

Step 11: PE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
input	`{"values": [{"id": 15, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}, {"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [2], "connectable": true, "is_workflow": false, "type": "paired_identifier"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier2", "warn": null}]}

Step 12: SE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"bd64d292338911f19acb7c1e524174c7"`
input	`{"values": [{"id": 16, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}]}

Other invocation details
- history_id
  - ce57138c0fb39786
- history_state
  - ok
- invocation_id
  - ce57138c0fb39786
- invocation_state
  - scheduled
- workflow_id
  - ce57138c0fb39786

github-actions · 2026-04-09T09:35:36Z

Test Results (powered by Planemo)

Test Summary

Test State	Count
Total	2
Passed	0
Error	0
Failure	2
Skipped	0

Failed Tests

❌ metadata-and-sequences-from-bioproject-ids.ga_0

Problems:

Output collection 'SE output': failed to find identifier 'elements' in the tool generated elements []
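
The message above says the check looked for an element identifier named `elements`, and the `SE output` collection was empty (`[]`). The only run in this test, SRR37073390, produced two FASTQ files, so an empty single-end collection is plausible. An identifier literally called `elements` may also mean the expectation in the test file is nested differently than the checker expects; that is a guess from the message alone. A minimal sketch of this kind of lookup, assuming expectations are a mapping from identifier to assertions (`missing_identifiers` is a hypothetical helper, not Planemo's implementation):

```python
def missing_identifiers(expected, generated):
    """Return expected element identifiers absent from the generated collection."""
    generated_ids = {element["element_identifier"] for element in generated}
    return [identifier for identifier in expected if identifier not in generated_ids]


# Mirrors the situation in the report: one expected key, no generated elements.
expected = {"elements": {}}
generated = []
print(missing_identifiers(expected, generated))
```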

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: assay (metadata download):
- step_state: scheduled
Step 3: desc (metadata download):
- step_state: scheduled
Step 4: detailed (metadata download):
- step_state: scheduled
Step 5: expand (metadata download):
- step_state: scheduled

Step 6: Separate BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmpp9bzlr6o/files/b/7/c/dataset_b7c4279e-a912-4d27-a28e-d90521fdc0e6.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"txt"`
__workflow_invocation_uuid__	`"2453dbfe33f011f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 1, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`
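
The parameters above split the input row by row with a chunk size of 1, so each BioProject ID ends up in its own collection element. A rough Python sketch of that behaviour, assuming one ID per line and using the two BioProject IDs that appear in these reports (`split_rows` is an illustrative name, not the tool's code):

```python
def split_rows(lines, chunksize=1):
    """Group consecutive lines into chunks of at most `chunksize` rows."""
    if chunksize < 1:
        raise ValueError("chunksize must be at least 1")
    return [lines[i:i + chunksize] for i in range(0, len(lines), chunksize)]


ids = ["PRJNA1425250", "PRJNA1417618"]
for index, chunk in enumerate(split_rows(ids)):
    print(index, chunk)
```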

Step 7: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"2453dbfe33f011f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 8: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"2453dbfe33f011f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`

Step 9: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpp9bzlr6o/job_working_directory/000/5/configs/tmps1qnsbmd' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"2453dbfe33f011f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 3, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 10: fasterq-dump (toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fasterq_dump/3.1.1+galaxy1):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-2b04072095278721dc9a5772e61e406f399b6030:a95f0e0ff448eede323315668bfa8ee64c918ebb-0

Command Line:

set -o | grep -q pipefail && set -o pipefail;  mkdir -p ~/.ncbi && cp '/tmp/tmpp9bzlr6o/job_working_directory/000/6/configs/tmpitbe8dqy' ~/.ncbi/user-settings.mkfg &&   export SRA_PREFETCH_RETRIES=3 && export SRA_PREFETCH_ATTEMPT=1 &&    grep '^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$' '/tmp/tmpp9bzlr6o/files/1/6/b/dataset_16b63cc2-f1ee-4ae2-a107-23d2808d9806.dat' > accessions && for acc in $(cat ./accessions); do ( echo "Downloading accession: $acc..." &&  while [ $SRA_PREFETCH_ATTEMPT -le $SRA_PREFETCH_RETRIES ] ; do fasterq-dump "$acc" -e ${GALAXY_SLOTS:-1} -t ${TMPDIR} --seq-defline '@$ac.$sn/$ri' --qual-defline '+' --split-3 --skip-technical 2>&1 | tee -a '/tmp/tmpp9bzlr6o/job_working_directory/000/6/outputs/dataset_d0db0a2f-b555-4e37-a613-107a708d5d42.dat'; if [ $? == 0 ] && [ $(ls *.fastq | wc -l) -ge 1 ]; then break ; else echo "Prefetch attempt $SRA_PREFETCH_ATTEMPT of $SRA_PREFETCH_RETRIES exited with code $?" ; SRA_PREFETCH_ATTEMPT=`expr $SRA_PREFETCH_ATTEMPT + 1` ; sleep 1 ; fi ; done && mkdir -p output && mkdir -p outputOther && count="$(ls *.fastq | wc -l)" && echo "There are $count fastq files" && data=($(ls *.fastq)) && if [ "$count" -eq 1 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"__single.fastqsanger.gz && rm "${data[0]}"; elif [ "--split-3" = "--split-3" ]; then if [ -e "${acc}".fastq ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${acc}".fastq > outputOther/"${acc}"__single.fastqsanger.gz; fi && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_1.fastq > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_2.fastq > output/"${acc}"_reverse.fastqsanger.gz && rm "${acc}"*.fastq; elif [ "$count" -eq 2 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${data[1]}" > output/"${acc}"_reverse.fastqsanger.gz && rm "${data[0]}" && rm "${data[1]}"; else for file in ${data[*]}; do pigz -cqp ${GALAXY_SLOTS:-1} "$file" > outputOther/"$file"sanger.gz && 
rm "$file"; done; fi;  ); done; echo "Done with all accessions."

Exit Code:

```
0
```

Standard Output:

Downloading accession: SRR37073390...
spots read      : 13,266,400
reads read      : 26,532,800
reads written   : 26,532,800
There are 2 fastq files
Done with all accessions.

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"2453dbfe33f011f1b6277c1e52dd0599"`
adv	`{"minlen": null, "seq_defline": "@$ac.$sn/$ri", "skip_technical": true, "split": "--split-3"}`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
input	`{"__current_case__": 2, "file_list": {"values": [{"id": 4, "src": "dce"}]}, "input_select": "file_list"}`
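
The command line above filters the input file with `grep` so that only lines consisting of a run accession are downloaded. Inside a bracket expression `|` is a literal character, so `[E|S|D]` also accepts a leading `|`; `[ESD]` would be the stricter spelling. A small Python sketch of the stricter filter (`RUN_ACCESSION` and `filter_accessions` are hypothetical names, and every sample line except `SRR37073390` is made up):

```python
import re

# Stricter equivalent of the logged grep pattern: E, S or D followed by "RR" and digits.
RUN_ACCESSION = re.compile(r"^\s*[ESD]RR[0-9]+\s*$")


def filter_accessions(lines):
    """Keep only lines that consist of a single run accession."""
    return [line.strip() for line in lines if RUN_ACCESSION.match(line)]


lines = ["run_accession", "SRR37073390", "  ERR123  ", "|RR1", "SRR12 extra"]
print(filter_accessions(lines))
```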

Step 11: PE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"2453dbfe33f011f1b6277c1e52dd0599"`
input	`{"values": [{"id": 5, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}, {"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [2], "connectable": true, "is_workflow": false, "type": "paired_identifier"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier2", "warn": null}]}

Step 12: SE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"2453dbfe33f011f1b6277c1e52dd0599"`
input	`{"values": [{"id": 6, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}]}
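
The rules in Steps 11 and 12 above add one column per identifier level (`identifier0`, `identifier1`, and for the paired case `identifier2`), then map only the later columns back to collection identifiers, which effectively drops the outermost level. A sketch of the paired-end case in plain Python, assuming 0-based column indices and a made-up outer identifier; it imitates the idea and is not Galaxy's rule engine:

```python
def regularize_paired(rows, list_col=1, paired_col=2):
    """Rebuild a list:paired mapping from (identifier0, identifier1, identifier2, dataset) rows."""
    collection = {}
    for row in rows:
        dataset = row[-1]
        collection.setdefault(row[list_col], {})[row[paired_col]] = dataset
    return collection


# "split_file_000000.txt" is a hypothetical outer identifier used only for illustration.
rows = [
    ("split_file_000000.txt", "SRR37073390", "forward", "SRR37073390_forward.fastqsanger.gz"),
    ("split_file_000000.txt", "SRR37073390", "reverse", "SRR37073390_reverse.fastqsanger.gz"),
]
print(regularize_paired(rows))
```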

Other invocation details
- history_id
  - 012fa4c3251c84b5
- history_state
  - ok
- invocation_id
  - 012fa4c3251c84b5
- invocation_state
  - scheduled
- workflow_id
  - 77a94017f017a81d

❌ metadata-and-sequences-from-bioproject-ids.ga_1

Problems:

Output with path /tmp/tmplbcb1r0_/SRR37273407__260c9b2a-f2f3-4886-bc47-4776a8977a46.fastqsanger.gz different than expected, difference (using contains):
( /home/runner/work/iwc/iwc/workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_SRR37273407_forward.fastq v. /tmp/tmpm678dk3jtest2_SRR37273407_forward.fastq )
Failed to find '@SRR37273407.1 81826be3-9349-4299-8ccd-e1900043df2e/1' in history data. (lines_diff=0).
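
The failing output is a gzip-compressed file (`.fastqsanger.gz`), while the expected file is plain-text FASTQ. A `contains` comparison against raw compressed bytes would not be expected to find a text header, so it is worth ruling out whether the output is decompressed before the comparison. A hedged sketch for checking a header locally (`gz_contains_line` is a hypothetical helper, and the demo record is made up rather than real SRA data):

```python
import gzip
import os
import tempfile


def gz_contains_line(path, wanted):
    """Return True if any decompressed line of the gzip file equals `wanted`."""
    with gzip.open(path, "rt") as handle:
        return any(line.rstrip("\n") == wanted for line in handle)


header = "@SRR37273407.1 81826be3-9349-4299-8ccd-e1900043df2e/1"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.fastqsanger.gz")
    with gzip.open(path, "wt") as handle:
        handle.write(header + "\nACGT\n+\nIIII\n")
    with open(path, "rb") as raw:
        # Search in the compressed bytes, as a naive comparison would.
        print("raw bytes:", header.encode() in raw.read())
    # Search in the decompressed text.
    print("decompressed:", gz_contains_line(path, header))
```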

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: assay (metadata download):
- step_state: scheduled
Step 3: desc (metadata download):
- step_state: scheduled
Step 4: detailed (metadata download):
- step_state: scheduled
Step 5: expand (metadata download):
- step_state: scheduled

Step 6: Separate BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmpp9bzlr6o/files/d/1/a/dataset_d1a032f9-a04d-44eb-9f2b-7e6d4f7123c5.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"txt"`
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 11, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`

Step 7: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Job 2:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 8: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1425250' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1425250", "selector": "metadata"}`
dbkey	`"?"`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`

Step 9: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpp9bzlr6o/job_working_directory/000/15/configs/tmp4r_at51r' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 19, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmpp9bzlr6o/job_working_directory/000/16/configs/tmpmirdlkji' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 20, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 10: fasterq-dump (toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fasterq_dump/3.1.1+galaxy1):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-2b04072095278721dc9a5772e61e406f399b6030:a95f0e0ff448eede323315668bfa8ee64c918ebb-0

Command Line:

set -o | grep -q pipefail && set -o pipefail;  mkdir -p ~/.ncbi && cp '/tmp/tmpp9bzlr6o/job_working_directory/000/17/configs/tmpl93wlxn3' ~/.ncbi/user-settings.mkfg &&   export SRA_PREFETCH_RETRIES=3 && export SRA_PREFETCH_ATTEMPT=1 &&    grep '^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$' '/tmp/tmpp9bzlr6o/files/f/3/1/dataset_f3186fe3-7d43-4b89-ac99-42003aa06e60.dat' > accessions && for acc in $(cat ./accessions); do ( echo "Downloading accession: $acc..." &&  while [ $SRA_PREFETCH_ATTEMPT -le $SRA_PREFETCH_RETRIES ] ; do fasterq-dump "$acc" -e ${GALAXY_SLOTS:-1} -t ${TMPDIR} --seq-defline '@$ac.$sn/$ri' --qual-defline '+' --split-3 --skip-technical 2>&1 | tee -a '/tmp/tmpp9bzlr6o/job_working_directory/000/17/outputs/dataset_208c73ac-f0ff-4ec6-9607-c662a8ccff79.dat'; if [ $? == 0 ] && [ $(ls *.fastq | wc -l) -ge 1 ]; then break ; else echo "Prefetch attempt $SRA_PREFETCH_ATTEMPT of $SRA_PREFETCH_RETRIES exited with code $?" ; SRA_PREFETCH_ATTEMPT=`expr $SRA_PREFETCH_ATTEMPT + 1` ; sleep 1 ; fi ; done && mkdir -p output && mkdir -p outputOther && count="$(ls *.fastq | wc -l)" && echo "There are $count fastq files" && data=($(ls *.fastq)) && if [ "$count" -eq 1 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"__single.fastqsanger.gz && rm "${data[0]}"; elif [ "--split-3" = "--split-3" ]; then if [ -e "${acc}".fastq ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${acc}".fastq > outputOther/"${acc}"__single.fastqsanger.gz; fi && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_1.fastq > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_2.fastq > output/"${acc}"_reverse.fastqsanger.gz && rm "${acc}"*.fastq; elif [ "$count" -eq 2 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${data[1]}" > output/"${acc}"_reverse.fastqsanger.gz && rm "${data[0]}" && rm "${data[1]}"; else for file in ${data[*]}; do pigz -cqp ${GALAXY_SLOTS:-1} "$file" > outputOther/"$file"sanger.gz 
&& rm "$file"; done; fi;  ); done; echo "Done with all accessions."

Exit Code:

```
0
```

Standard Output:

Downloading accession: SRR37273407...
spots read      : 45,419
reads read      : 45,419
reads written   : 45,419
There are 1 fastq files
Downloading accession: SRR37273408...
spots read      : 2,935,209
reads read      : 5,870,418
reads written   : 5,870,418
There are 2 fastq files
Done with all accessions.
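
The two accessions above take different branches of the command: SRR37273407 yields one FASTQ file and SRR37273408 yields two. A Python sketch of the routing visible in the logged command for the `--split-3` case follows. `route_outputs` is an illustrative name, the input file names are assumed from the `_1`/`_2` names the command itself refers to, and the final fallback branch for other layouts is omitted:

```python
def route_outputs(acc, fastq_files):
    """Map the FASTQ files written for one accession to compressed output paths."""
    if len(fastq_files) == 1:
        # A single file is treated as single-end data.
        return {fastq_files[0]: f"output/{acc}__single.fastqsanger.gz"}
    routes = {}
    if f"{acc}.fastq" in fastq_files:
        # Unpaired leftovers from --split-3 go to the secondary output directory.
        routes[f"{acc}.fastq"] = f"outputOther/{acc}__single.fastqsanger.gz"
    routes[f"{acc}_1.fastq"] = f"output/{acc}_forward.fastqsanger.gz"
    routes[f"{acc}_2.fastq"] = f"output/{acc}_reverse.fastqsanger.gz"
    return routes


print(route_outputs("SRR37273407", ["SRR37273407.fastq"]))
print(route_outputs("SRR37273408", ["SRR37273408_1.fastq", "SRR37273408_2.fastq"]))
```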

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
adv	`{"minlen": null, "seq_defline": "@$ac.$sn/$ri", "skip_technical": true, "split": "--split-3"}`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
input	`{"__current_case__": 2, "file_list": {"values": [{"id": 21, "src": "dce"}]}, "input_select": "file_list"}`
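
The `seq_defline` parameter above is `@$ac.$sn/$ri`: accession, spot name, read id. The header the test expects, `@SRR37273407.1 81826be3-9349-4299-8ccd-e1900043df2e/1`, has a number and a space after the accession, which this template would only produce if the spot name itself contained them. The sketch below renders both forms with Python's `string.Template`, assuming the spot name is the UUID-like string from the expected header and the spot id is 1. It approximates the substitution and is not fasterq-dump's code:

```python
from string import Template


def render_defline(template, **fields):
    """Fill a fasterq-dump style defline template with the given field values."""
    return Template(template).substitute(**fields)


# Field values are assumptions taken from the expected header in the failure message.
fields = {
    "ac": "SRR37273407",
    "sn": "81826be3-9349-4299-8ccd-e1900043df2e",
    "si": "1",
    "ri": "1",
}
print(render_defline("@$ac.$sn/$ri", **fields))      # template used by the workflow
print(render_defline("@$ac.$si $sn/$ri", **fields))  # template that would yield the expected header
```

If the real output headers look like the first form, the expected test file may have been generated with a different defline than the one the workflow sets.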

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-2b04072095278721dc9a5772e61e406f399b6030:a95f0e0ff448eede323315668bfa8ee64c918ebb-0

Command Line:

set -o | grep -q pipefail && set -o pipefail;  mkdir -p ~/.ncbi && cp '/tmp/tmpp9bzlr6o/job_working_directory/000/18/configs/tmpfp5je7d0' ~/.ncbi/user-settings.mkfg &&   export SRA_PREFETCH_RETRIES=3 && export SRA_PREFETCH_ATTEMPT=1 &&    grep '^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$' '/tmp/tmpp9bzlr6o/files/e/6/a/dataset_e6af459c-742f-4755-8ab7-b0364b8bc775.dat' > accessions && for acc in $(cat ./accessions); do ( echo "Downloading accession: $acc..." &&  while [ $SRA_PREFETCH_ATTEMPT -le $SRA_PREFETCH_RETRIES ] ; do fasterq-dump "$acc" -e ${GALAXY_SLOTS:-1} -t ${TMPDIR} --seq-defline '@$ac.$sn/$ri' --qual-defline '+' --split-3 --skip-technical 2>&1 | tee -a '/tmp/tmpp9bzlr6o/job_working_directory/000/18/outputs/dataset_fdd9e3f8-3b05-4c08-be9e-8b81809ddc7e.dat'; if [ $? == 0 ] && [ $(ls *.fastq | wc -l) -ge 1 ]; then break ; else echo "Prefetch attempt $SRA_PREFETCH_ATTEMPT of $SRA_PREFETCH_RETRIES exited with code $?" ; SRA_PREFETCH_ATTEMPT=`expr $SRA_PREFETCH_ATTEMPT + 1` ; sleep 1 ; fi ; done && mkdir -p output && mkdir -p outputOther && count="$(ls *.fastq | wc -l)" && echo "There are $count fastq files" && data=($(ls *.fastq)) && if [ "$count" -eq 1 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"__single.fastqsanger.gz && rm "${data[0]}"; elif [ "--split-3" = "--split-3" ]; then if [ -e "${acc}".fastq ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${acc}".fastq > outputOther/"${acc}"__single.fastqsanger.gz; fi && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_1.fastq > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_2.fastq > output/"${acc}"_reverse.fastqsanger.gz && rm "${acc}"*.fastq; elif [ "$count" -eq 2 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${data[1]}" > output/"${acc}"_reverse.fastqsanger.gz && rm "${data[0]}" && rm "${data[1]}"; else for file in ${data[*]}; do pigz -cqp ${GALAXY_SLOTS:-1} "$file" > outputOther/"$file"sanger.gz 
&& rm "$file"; done; fi;  ); done; echo "Done with all accessions."

Exit Code:

```
0
```

Standard Output:

Downloading accession: SRR37073390...
spots read      : 13,266,400
reads read      : 26,532,800
reads written   : 26,532,800
There are 2 fastq files
Done with all accessions.

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
adv	`{"minlen": null, "seq_defline": "@$ac.$sn/$ri", "skip_technical": true, "split": "--split-3"}`
chromInfo	`"/tmp/tmpp9bzlr6o/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
input	`{"__current_case__": 2, "file_list": {"values": [{"id": 22, "src": "dce"}]}, "input_select": "file_list"}`
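
The download loop in the command above retries up to `SRA_PREFETCH_RETRIES` times. Two shell details are easy to miss when reading it: the `$?` printed in the retry message is evaluated after the `[ ... ]` tests, so it reports their status and not fasterq-dump's; and the counter is incremented inside a `( ... )` subshell, so each accession starts again at attempt 1. A minimal Python sketch of the same bounded retry that keeps the real exit code (`run_with_retries` and the fake `download` callable are hypothetical):

```python
import time


def run_with_retries(download, retries=3, delay=0.0):
    """Call `download` until it returns exit code 0 or `retries` attempts are used."""
    codes = []
    for attempt in range(1, retries + 1):
        code = download()
        codes.append(code)
        if code == 0:
            break
        # Report the exit code captured from the download itself.
        print(f"Attempt {attempt} of {retries} exited with code {code}")
        time.sleep(delay)
    return codes


# Fake download that fails twice and then succeeds.
outcomes = iter([1, 1, 0])
print(run_with_retries(lambda: next(outcomes)))
```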

Step 11: PE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
input	`{"values": [{"id": 15, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}, {"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [2], "connectable": true, "is_workflow": false, "type": "paired_identifier"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier2", "warn": null}]}

Step 12: SE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"131fc0e833f311f1b6277c1e52dd0599"`
input	`{"values": [{"id": 16, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}]}

Other invocation details
- history_id
  - 77a94017f017a81d
- history_state
  - ok
- invocation_id
  - 77a94017f017a81d
- invocation_state
  - scheduled
- workflow_id
  - 77a94017f017a81d

github-actions · 2026-04-09T19:41:57Z

Test Results (powered by Planemo)

Test Summary

Test State	Count
Total	2
Passed	1
Error	0
Failure	1
Skipped	0

Failed Tests

❌ metadata-and-sequences-from-bioproject-ids.ga_1

Problems:

Output with path /tmp/tmpiyx5crd0/SRR37273407__07f40436-1128-4f3e-96cc-c31e640257d1.fastqsanger.gz different than expected, difference (using contains):
( /home/runner/work/iwc/iwc/workflows/data-fetching/metadata-and-sequences-from-BioProjectIDs/test-data/test2_SRR37273407_forward.fastq v. /tmp/tmpslh8o9r8test2_SRR37273407_forward.fastq )
Failed to find '@SRR37273407.1 81826be3-9349-4299-8ccd-e1900043df2e/1' in history data. (lines_diff=0).

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: assay (metadata download):
- step_state: scheduled
Step 3: desc (metadata download):
- step_state: scheduled
Step 4: detailed (metadata download):
- step_state: scheduled
Step 5: expand (metadata download):
- step_state: scheduled

Step 6: Separate BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmprggu1by5/files/c/7/2/dataset_c72088f6-4594-49e9-8503-8c2cbf9dde4d.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"txt"`
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 11, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`

Step 7: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Job 2:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 8: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1425250' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1425250", "selector": "metadata"}`
dbkey	`"?"`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`

Step 9: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmprggu1by5/job_working_directory/000/15/configs/tmp3fq4fogu' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 19, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmprggu1by5/job_working_directory/000/16/configs/tmpl7xngk0_' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 20, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 10: fasterq-dump (toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fasterq_dump/3.1.1+galaxy1):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-2b04072095278721dc9a5772e61e406f399b6030:a95f0e0ff448eede323315668bfa8ee64c918ebb-0

Command Line:

set -o | grep -q pipefail && set -o pipefail;  mkdir -p ~/.ncbi && cp '/tmp/tmprggu1by5/job_working_directory/000/17/configs/tmpivqxcgl9' ~/.ncbi/user-settings.mkfg &&   export SRA_PREFETCH_RETRIES=3 && export SRA_PREFETCH_ATTEMPT=1 &&    grep '^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$' '/tmp/tmprggu1by5/files/c/c/3/dataset_cc32357c-eff0-499f-9d16-850a744e46c2.dat' > accessions && for acc in $(cat ./accessions); do ( echo "Downloading accession: $acc..." &&  while [ $SRA_PREFETCH_ATTEMPT -le $SRA_PREFETCH_RETRIES ] ; do fasterq-dump "$acc" -e ${GALAXY_SLOTS:-1} -t ${TMPDIR} --seq-defline '@$ac.$sn/$ri' --qual-defline '+' --split-3 --skip-technical 2>&1 | tee -a '/tmp/tmprggu1by5/job_working_directory/000/17/outputs/dataset_954b7893-6cde-4954-b302-dc6a15e3c621.dat'; if [ $? == 0 ] && [ $(ls *.fastq | wc -l) -ge 1 ]; then break ; else echo "Prefetch attempt $SRA_PREFETCH_ATTEMPT of $SRA_PREFETCH_RETRIES exited with code $?" ; SRA_PREFETCH_ATTEMPT=`expr $SRA_PREFETCH_ATTEMPT + 1` ; sleep 1 ; fi ; done && mkdir -p output && mkdir -p outputOther && count="$(ls *.fastq | wc -l)" && echo "There are $count fastq files" && data=($(ls *.fastq)) && if [ "$count" -eq 1 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"__single.fastqsanger.gz && rm "${data[0]}"; elif [ "--split-3" = "--split-3" ]; then if [ -e "${acc}".fastq ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${acc}".fastq > outputOther/"${acc}"__single.fastqsanger.gz; fi && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_1.fastq > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_2.fastq > output/"${acc}"_reverse.fastqsanger.gz && rm "${acc}"*.fastq; elif [ "$count" -eq 2 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${data[1]}" > output/"${acc}"_reverse.fastqsanger.gz && rm "${data[0]}" && rm "${data[1]}"; else for file in ${data[*]}; do pigz -cqp ${GALAXY_SLOTS:-1} "$file" > outputOther/"$file"sanger.gz 
&& rm "$file"; done; fi;  ); done; echo "Done with all accessions."
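The accession filter at the start of this command line can be sketched in Python as follows. This is only an illustration of what the `grep` pattern accepts, not the tool's implementation; `ACCESSION_LINE` and `filter_accessions` are hypothetical names, and the sample input is made up. One detail worth noting: inside the bracket expression `[E|S|D]` the `|` is a literal character, so a line such as `|RR5` would also pass the filter.

```python
import re

# Python translation of the POSIX pattern passed to grep in the command line:
#   ^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$
# Inside a bracket expression "|" is a literal character, not alternation.
ACCESSION_LINE = re.compile(r"^\s*[E|S|D]RR[0-9]+\s*$")


def filter_accessions(lines):
    """Keep lines that consist of a single run accession, whitespace trimmed."""
    return [line.strip() for line in lines if ACCESSION_LINE.match(line)]


if __name__ == "__main__":
    # Illustrative input: a header line, two run accessions, and a project ID.
    sample = ["run_accession", "SRR37273407", "  SRR37273408  ", "PRJNA1417618"]
    print(filter_accessions(sample))
```

Lines that are not a bare run accession, such as a column header or a BioProject ID, are dropped before the download loop starts.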

Exit Code:

```
0
```

Standard Output:

Downloading accession: SRR37273407...
spots read      : 45,419
reads read      : 45,419
reads written   : 45,419
There are 1 fastq files
Downloading accession: SRR37273408...
spots read      : 2,935,209
reads read      : 5,870,418
reads written   : 5,870,418
There are 2 fastq files
Done with all accessions.

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
adv	`{"minlen": null, "seq_defline": "@$ac.$sn/$ri", "skip_technical": true, "split": "--split-3"}`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
input	`{"__current_case__": 2, "file_list": {"values": [{"id": 21, "src": "dce"}]}, "input_select": "file_list"}`

Job 2:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-2b04072095278721dc9a5772e61e406f399b6030:a95f0e0ff448eede323315668bfa8ee64c918ebb-0

Command Line:

set -o | grep -q pipefail && set -o pipefail;  mkdir -p ~/.ncbi && cp '/tmp/tmprggu1by5/job_working_directory/000/18/configs/tmpvvedcy4g' ~/.ncbi/user-settings.mkfg &&   export SRA_PREFETCH_RETRIES=3 && export SRA_PREFETCH_ATTEMPT=1 &&    grep '^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$' '/tmp/tmprggu1by5/files/2/3/7/dataset_237d78a3-05c3-4f19-b85c-9858f06f9929.dat' > accessions && for acc in $(cat ./accessions); do ( echo "Downloading accession: $acc..." &&  while [ $SRA_PREFETCH_ATTEMPT -le $SRA_PREFETCH_RETRIES ] ; do fasterq-dump "$acc" -e ${GALAXY_SLOTS:-1} -t ${TMPDIR} --seq-defline '@$ac.$sn/$ri' --qual-defline '+' --split-3 --skip-technical 2>&1 | tee -a '/tmp/tmprggu1by5/job_working_directory/000/18/outputs/dataset_38993ca6-5d33-4ee9-aead-089e45729e60.dat'; if [ $? == 0 ] && [ $(ls *.fastq | wc -l) -ge 1 ]; then break ; else echo "Prefetch attempt $SRA_PREFETCH_ATTEMPT of $SRA_PREFETCH_RETRIES exited with code $?" ; SRA_PREFETCH_ATTEMPT=`expr $SRA_PREFETCH_ATTEMPT + 1` ; sleep 1 ; fi ; done && mkdir -p output && mkdir -p outputOther && count="$(ls *.fastq | wc -l)" && echo "There are $count fastq files" && data=($(ls *.fastq)) && if [ "$count" -eq 1 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"__single.fastqsanger.gz && rm "${data[0]}"; elif [ "--split-3" = "--split-3" ]; then if [ -e "${acc}".fastq ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${acc}".fastq > outputOther/"${acc}"__single.fastqsanger.gz; fi && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_1.fastq > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_2.fastq > output/"${acc}"_reverse.fastqsanger.gz && rm "${acc}"*.fastq; elif [ "$count" -eq 2 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${data[1]}" > output/"${acc}"_reverse.fastqsanger.gz && rm "${data[0]}" && rm "${data[1]}"; else for file in ${data[*]}; do pigz -cqp ${GALAXY_SLOTS:-1} "$file" > outputOther/"$file"sanger.gz 
&& rm "$file"; done; fi;  ); done; echo "Done with all accessions."

Exit Code:

```
0
```

Standard Output:

Downloading accession: SRR37073390...
spots read      : 13,266,400
reads read      : 26,532,800
reads written   : 26,532,800
There are 2 fastq files
Done with all accessions.

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
adv	`{"minlen": null, "seq_defline": "@$ac.$sn/$ri", "skip_technical": true, "split": "--split-3"}`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
input	`{"__current_case__": 2, "file_list": {"values": [{"id": 22, "src": "dce"}]}, "input_select": "file_list"}`

Step 11: PE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
input	`{"values": [{"id": 15, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}, {"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [2], "connectable": true, "is_workflow": false, "type": "paired_identifier"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier2", "warn": null}]}

Step 12: SE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"9c75356c343911f19acb7c1e5239ee4f"`
input	`{"values": [{"id": 16, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}]}

Other invocation details
- history_id
  - 44565cbdc794da37
- history_state
  - ok
- invocation_id
  - 44565cbdc794da37
- invocation_state
  - scheduled
- workflow_id
  - 44565cbdc794da37

Passed Tests

✅ metadata-and-sequences-from-bioproject-ids.ga_0

Workflow invocation details

Invocation Messages

Steps

Step 1: BioProject IDs:
- step_state: scheduled
Step 2: assay (metadata download):
- step_state: scheduled
Step 3: desc (metadata download):
- step_state: scheduled
Step 4: detailed (metadata download):
- step_state: scheduled
Step 5: expand (metadata download):
- step_state: scheduled

Step 6: Separate BioProject IDs (toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/split_file_to_collection/0.5.2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

```
quay.io/biocontainers/python:3.5--2
```

Command Line:

mkdir ./out && python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/bgruening/split_file_to_collection/2dae863c8f42/split_file_to_collection/split_file_to_collection.py' --out ./out --in '/tmp/tmprggu1by5/files/3/0/2/dataset_302fa693-3d4a-464c-a42f-c4bd80056417.dat' --ftype 'txt' --chunksize 1 --file_names 'split_file' --file_ext 'txt'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"txt"`
__workflow_invocation_uuid__	`"8459a7ae343611f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
split_parms	`{"__current_case__": 5, "input": {"values": [{"id": 1, "src": "hda"}]}, "newfilenames": "split_file", "select_allocate": {"__current_case__": 2, "allocate": "byrow"}, "select_ftype": "txt", "select_mode": {"__current_case__": 0, "chunksize": "1", "mode": "chunk"}}`

Step 7: Add BioProject IDs as parameters (param_value_from_file):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"8459a7ae343611f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 8: Metadata From BioProject IDs (toolshed.g2.bx.psu.edu/repos/iuc/pysradb_search/pysradb_search/2.5.1+galaxy0):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-e62c45964731bf241efeedb78776ebc093302f62:3c386467fc54c7b7a8da30b0705408fd927d49c0-0

Command Line:

pysradb metadata 'PRJNA1417618' --saveto metadata_output.tsv   --detailed   && pysradb --version

Exit Code:

```
0
```

Standard Output:

```
pysradb 2.5.1
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"8459a7ae343611f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
conditional_subcommand	`{"__current_case__": 1, "assay": false, "desc": false, "detailed": true, "expand": false, "prj_id": "PRJNA1417618", "selector": "metadata"}`
dbkey	`"?"`

Step 9: Run IDs extract (toolshed.g2.bx.psu.edu/repos/iuc/table_compute/table_compute/1.2.4+galaxy2):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-344874846f44224e5f0b7b741eacdddffe895d1e:d3fff24ee1297b4c3bcef48354c2a30f0c82007a-2

Command Line:

cp '/tmp/tmprggu1by5/job_working_directory/000/5/configs/tmpv87ol3h0' ./userconfig.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/safety.py' ./safety.py && cp '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/table_compute/cd36d6e45e29/table_compute/scripts/table_compute.py' ./table_compute.py && python ./table_compute.py

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tsv"`
__workflow_invocation_uuid__	`"8459a7ae343611f19acb7c1e5239ee4f"`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
out_opts	`None`
precision	`"6"`
singtabop	`{"__current_case__": 0, "adv": {"header": null, "nrows": null, "skip_blank_lines": true, "skipfooter": null}, "col_row_names": ["has_col_names"], "input": {"values": [{"id": 3, "src": "dce"}]}, "use_type": "single", "user": {"__current_case__": 1, "mode": "select", "select_cols_wanted": "1", "select_keepdupe": null, "select_rows_wanted": null}}`

Step 10: fasterq-dump (toolshed.g2.bx.psu.edu/repos/iuc/sra_tools/fasterq_dump/3.1.1+galaxy1):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Container:

quay.io/biocontainers/mulled-v2-2b04072095278721dc9a5772e61e406f399b6030:a95f0e0ff448eede323315668bfa8ee64c918ebb-0

Command Line:

set -o | grep -q pipefail && set -o pipefail;  mkdir -p ~/.ncbi && cp '/tmp/tmprggu1by5/job_working_directory/000/6/configs/tmpw8qwyi1e' ~/.ncbi/user-settings.mkfg &&   export SRA_PREFETCH_RETRIES=3 && export SRA_PREFETCH_ATTEMPT=1 &&    grep '^[[:space:]]*[E|S|D]RR[0-9]\{1,\}[[:space:]]*$' '/tmp/tmprggu1by5/files/a/6/c/dataset_a6c5b46a-3848-4eb5-b8a7-499e424a4b18.dat' > accessions && for acc in $(cat ./accessions); do ( echo "Downloading accession: $acc..." &&  while [ $SRA_PREFETCH_ATTEMPT -le $SRA_PREFETCH_RETRIES ] ; do fasterq-dump "$acc" -e ${GALAXY_SLOTS:-1} -t ${TMPDIR} --seq-defline '@$ac.$sn/$ri' --qual-defline '+' --split-3 --skip-technical 2>&1 | tee -a '/tmp/tmprggu1by5/job_working_directory/000/6/outputs/dataset_e1815531-30b4-4565-91b8-33fe0d624c8f.dat'; if [ $? == 0 ] && [ $(ls *.fastq | wc -l) -ge 1 ]; then break ; else echo "Prefetch attempt $SRA_PREFETCH_ATTEMPT of $SRA_PREFETCH_RETRIES exited with code $?" ; SRA_PREFETCH_ATTEMPT=`expr $SRA_PREFETCH_ATTEMPT + 1` ; sleep 1 ; fi ; done && mkdir -p output && mkdir -p outputOther && count="$(ls *.fastq | wc -l)" && echo "There are $count fastq files" && data=($(ls *.fastq)) && if [ "$count" -eq 1 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"__single.fastqsanger.gz && rm "${data[0]}"; elif [ "--split-3" = "--split-3" ]; then if [ -e "${acc}".fastq ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${acc}".fastq > outputOther/"${acc}"__single.fastqsanger.gz; fi && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_1.fastq > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${acc}"_2.fastq > output/"${acc}"_reverse.fastqsanger.gz && rm "${acc}"*.fastq; elif [ "$count" -eq 2 ]; then pigz -cqp ${GALAXY_SLOTS:-1} "${data[0]}" > output/"${acc}"_forward.fastqsanger.gz && pigz -cqp ${GALAXY_SLOTS:-1} "${data[1]}" > output/"${acc}"_reverse.fastqsanger.gz && rm "${data[0]}" && rm "${data[1]}"; else for file in ${data[*]}; do pigz -cqp ${GALAXY_SLOTS:-1} "$file" > outputOther/"$file"sanger.gz && 
rm "$file"; done; fi;  ); done; echo "Done with all accessions."

Exit Code:

```
0
```

Standard Output:

Downloading accession: SRR37073390...
spots read      : 13,266,400
reads read      : 26,532,800
reads written   : 26,532,800
There are 2 fastq files
Done with all accessions.

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"8459a7ae343611f19acb7c1e5239ee4f"`
adv	`{"minlen": null, "seq_defline": "@$ac.$sn/$ri", "skip_technical": true, "split": "--split-3"}`
chromInfo	`"/tmp/tmprggu1by5/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
input	`{"__current_case__": 2, "file_list": {"values": [{"id": 4, "src": "dce"}]}, "input_select": "file_list"}`

Step 11: PE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"8459a7ae343611f19acb7c1e5239ee4f"`
input	`{"values": [{"id": 5, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}, {"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [2], "connectable": true, "is_workflow": false, "type": "paired_identifier"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier2", "warn": null}]}

Step 12: SE collection regularization (__APPLY_RULES__):

step_state: scheduled

Jobs

Job 1:

Job state is ok

Traceback:

Job Parameters:

Job parameter	Parameter value
__workflow_invocation_uuid__	`"8459a7ae343611f19acb7c1e5239ee4f"`
input	`{"values": [{"id": 6, "src": "hdca"}]}`
rules	{"mapping": [{"collapsible_value": {"__class__": "RuntimeValue"}, "columns": [1], "connectable": true, "editing": false, "is_workflow": false, "type": "list_identifiers"}], "rules": [{"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier0", "warn": null}, {"collapsible_value": {"__class__": "RuntimeValue"}, "connectable": true, "error": null, "is_workflow": false, "type": "add_column_metadata", "value": "identifier1", "warn": null}]}

Other invocation details
- history_id
  - 2d78cb836665f42d
- history_state
  - ok
- invocation_id
  - 2d78cb836665f42d
- invocation_state
  - scheduled
- workflow_id
  - 44565cbdc794da37

lldelisle · 2026-04-10T17:21:38Z

Hi @gdefazio,
I am currently running the workflow to understand exactly what it does. I will post my comments within a week.

lldelisle · 2026-04-13T05:10:41Z

Hi @gdefazio,
The workflow takes a list of project IDs as input and generates one file per project ID containing all of its SRA run accessions. Each file is passed to fasterq-dump, and the resulting collections are then flattened into a list collection for single-end reads and a list:paired collection for paired-end reads.
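As a rough illustration of that last step, the sketch below sorts output file names into a single-end group (one file per run) and a paired-end group (forward and reverse file per run). It assumes the `__single`, `_forward` and `_reverse` naming that appears in the fasterq-dump command line of the test report. The function name `group_reads` is hypothetical, and the actual workflow does this with Galaxy collection operations rather than Python.

```python
SUFFIX = ".fastqsanger.gz"

# Name tags taken from the fasterq-dump command line in the test report.
TAGS = (("__single", None), ("_forward", "forward"), ("_reverse", "reverse"))


def group_reads(filenames):
    """Return (single, paired) dictionaries keyed by run accession."""
    single = {}
    paired = {}
    for name in filenames:
        if not name.endswith(SUFFIX):
            continue  # ignore anything that is not a compressed FASTQ output
        stem = name[: -len(SUFFIX)]
        for tag, role in TAGS:
            if stem.endswith(tag):
                accession = stem[: -len(tag)]
                if role is None:
                    single[accession] = name
                else:
                    paired.setdefault(accession, {})[role] = name
                break
    return single, paired


if __name__ == "__main__":
    names = [
        "SRR37273407__single.fastqsanger.gz",
        "SRR37273408_forward.fastqsanger.gz",
        "SRR37273408_reverse.fastqsanger.gz",
    ]
    single, paired = group_reads(names)
    print(single)
    print(paired)
```

The two dictionaries correspond loosely to the flat list (single-end) and list:paired (paired-end) shapes described above.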

I think your workflow is a great idea, but I see a few problems with the current version:

- The number of SRA runs per project can be large, so the fasterq-dump step can take a very long time. I agree with @mvdbeek that it would be better to use the "Parallel accession download" subworkflow.
- To get the SRA run accessions, the workflow takes the first column. However, it seems that if detailed (metadata download) is not set to true, run_accession is still present in the file but is no longer the first column.

I also noticed something that could be improved:

- The identifiers of the metadata tables should be the project IDs.

So I propose a new version here: https://usegalaxy.eu/u/delislel/w/metadata-and-sequences-from-bioproject-ids-final-ld

What I changed compared to your workflow:

- I used Parallel accession download as a subworkflow, and as its input I simply concatenated all the files of SRA run accessions into one (you lose the information about which project each run comes from, but since your version flattens the collections at the end, I would say the result is the same).
- I used an awk step to find the index of the 'run_accession' column and output the SRA list, in case the user has not set detailed (metadata download) to true.
- I inserted a step to relabel the metadata tables with the project ID.
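The column lookup and concatenation described in this list can be sketched as follows. This is a minimal Python illustration of the idea, not the awk step used in the proposed workflow; the function names are hypothetical, and the two example tables use made-up column layouts rather than real pysradb output.

```python
import csv
import io


def run_accessions(tsv_text, column="run_accession"):
    """Return the values of `column`, located by header name, not by position."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    index = header.index(column)  # raises ValueError if the column is missing
    return [row[index] for row in reader if len(row) > index and row[index]]


def concatenate(tables):
    """Merge the accessions of several tables into one de-duplicated list."""
    merged = []
    for text in tables:
        merged.extend(run_accessions(text))
    return list(dict.fromkeys(merged))  # keeps first-seen order


if __name__ == "__main__":
    # Illustrative layouts only: the target column is first in one table
    # and second in the other.
    table_a = "run_accession\tstudy\nSRR37273407\tA\nSRR37273408\tA\n"
    table_b = "study\trun_accession\nB\tSRR37073390\n"
    print(concatenate([table_a, table_b]))
```

Looking the column up by name gives the same list whether or not the table was produced with the detailed option, which is the situation the awk step is meant to handle.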

Please try it and tell me what you think.

@mvdbeek is there a policy for a workflow that uses another workflow as a subworkflow? Do they need to be in the same directory?

Don't hesitate to ask if you have questions or remarks; I would be happy to answer.

mvdbeek · 2026-04-13T06:00:51Z

Subworkflows are just embedded inside the parent. We're working on making those references too (galaxyproject/galaxy#21887) and on using symlinks in the IWC, but for now this is fine as is and will just work.

gdefazio · 2026-04-13T11:21:38Z

Hi @lldelisle, and thank you for your valuable work on this workflow and for the positive feedback.
Let me answer point by point.

> The number of SRA runs per project can be large, so the fasterq-dump step can take a very long time. I agree with @mvdbeek that it would be better to use the "Parallel accession download" subworkflow.

After @mvdbeek's comment I tried to integrate "Parallel accession download" into this workflow, but I ran into problems with the "Apply rules" step because the nested structure did not match what was expected. Thank you for solving that.

> To get the SRA run accessions, the workflow takes the first column. However, it seems that if detailed (metadata download) is not set to true, run_accession is still present in the file but is no longer the first column.

I was not aware of that; thanks for solving it.

> I also noticed something that could be improved:
>
> The identifiers of the metadata tables should be the project IDs.
>
> So I propose a new version here: https://usegalaxy.eu/u/delislel/w/metadata-and-sequences-from-bioproject-ids-final-ld
>
> What I changed compared to your workflow:
>
> I used Parallel accession download as a subworkflow, and as its input I simply concatenated all the files of SRA run accessions into one (you lose the information about which project each run comes from, but since your version flattens the collections at the end, I would say the result is the same).

Do you suggest having one FASTQ collection per BioProject ID?

> I used an awk step to find the index of the 'run_accession' column and output the SRA list, in case the user has not set detailed (metadata download) to true.
>
> I inserted a step to relabel the metadata tables with the project ID.
>
> Please try it and tell me what you think.

I tried it and I think it is much better than my version. Thank you.

> @mvdbeek is there a policy for a workflow that uses another workflow as a subworkflow? Do they need to be in the same directory?
>
> Don't hesitate to ask if you have questions or remarks; I would be happy to answer.

Can I add you as a workflow author?

Thanks again for your effort.

gdefazio · 2026-04-16T08:47:28Z

Hi @lldelisle, please give me some feedback on my previous message when you have time. Thanks in advance.

lldelisle · 2026-04-16T19:52:02Z

Hi,
Yes, I would be happy to be an author.
I don't think it is necessary to return one collection per project.

gdefazio added 4 commits March 17, 2026 16:56

first commit for this wf

edb846b

first tests setting

736f719

test txt files

10ad964

commit before PR

dabd3ac

gdefazio added 3 commits March 23, 2026 23:11

add release

4713a2b

rename test

5b49bd5

rename test

7b46681

test fix

59ce11a

gdefazio added 2 commits March 26, 2026 17:17

contains on metadata

e9a5cfc

README completion

72c9b2d

mvdbeek requested a review from Copilot March 27, 2026 14:28

Copilot started reviewing on behalf of mvdbeek March 27, 2026 14:29

Copilot AI reviewed Mar 27, 2026

Copilot agent suggestions commit

7e23933

gdefazio added 2 commits April 7, 2026 12:04

fasterq-dump integration

dc88691

fasterq-dump integration fix

92ac9ee

errors fix

79bb130

tests errors fix

67ed4a4

test1 "SE output" delete

bcd98ac

fastq update

89ca531

mvdbeek requested review from lldelisle and wm75 April 10, 2026 15:03

gdefazio added 2 commits April 20, 2026 17:10

lldelisle version commit

747ad3d

lldelisle version commit - errors fix

05695f2

	"annotation": "This workflow takes BioProject IDs as input and is able to retrieve SRA tables and FASTQ files from them.",
	"annotation": "This workflow performs retrieval of SRA metadata tables and FASTQ sequence files from input BioProject IDs.",

	workflow: metadata-and-sequences-from-BioProjectsIDs/main
	workflow: metadata-and-sequences-from-bioprojectids/main

		"name": "Metadata and Sequences from BioProjectIDs",
		"readme": "# Metadata and Sequences from BioProjectIDs\n\n## Rationale\nIn order to promote re-analysis of publicly available sequences from INSDC databases, we propose Metadata and Sequences from BioProjectIDs a Galaxy workflow that starting by a list of valid BioProject IDs (e.g. PRJNA....., PRJEB.....) is able to manage data and metadata download.\n\n## Usage\nUpload a text file in which there is a BioProject ID for each row and run the workflow.\n\nThere is also the possibility to set optional options to regulate behaviour of metadata and data download.\n",
