Fix: kraken2 builder: update URLs #7883

Open
jakobnissen wants to merge 2 commits into galaxyproject:main from jakobnissen:kraken_options

@jakobnissen jakobnissen commented Apr 13, 2026

This PR fixes the kraken2 builder in two ways:

Fix generated URLs

The generated URLs do not correspond to the actual URLs where the data is stored. This makes building about 21 of the databases fail, e.g. the standard 8 GB database from 2025-10-15.

In this fix, I used the webpage https://benlangmead.github.io/aws-indexes/k2 to look up the actual URLs, and verified that each of them responds. I did not download every database to check that it actually contains the correct file.
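The "responds" check described above could be scripted roughly as follows. This is a minimal sketch, not the procedure actually used for this PR: the helper name `url_responds` and the injectable `opener` parameter are hypothetical, and a successful HEAD request only shows that the URL resolves, not that the archive is the right one.

```python
import urllib.request


def url_responds(url, opener=urllib.request.urlopen, timeout=30):
    """Return True if a HEAD request to `url` succeeds with a 2xx status.

    `opener` is injectable so the check can be exercised without network
    access (hypothetical design choice for this sketch).
    """
    # HEAD asks only for headers, so no database archive is downloaded.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # urllib's URLError and HTTPError are both subclasses of OSError,
        # so this covers DNS failures, timeouts, and 4xx/5xx responses.
        return False
```

Calling `url_responds(u)` for each generated URL and collecting the ones that return `False` would list the broken entries.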

Fix Python compat

Commit 797b5fc changed
data_managers/data_manager_build_kraken2_database/data_manager/kraken2_build_database.xml#59
from the deprecated datetime.utcnow() to datetime.now(datetime.UTC). However,
datetime.UTC was introduced in Python 3.11, making this tool break on systems
with earlier Python versions.

This fix should behave identically and works back to Python 3.2, well below
Galaxy's minimum Python version of 3.9.
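For illustration, a portable spelling looks like the sketch below. It assumes the replacement is `datetime.timezone.utc`, which has existed since Python 3.2 and of which `datetime.UTC` is an alias; the diff itself is not shown here, and the helper name `utc_now` is hypothetical.

```python
from datetime import datetime, timezone


def utc_now():
    """Return the current time as a timezone-aware datetime in UTC.

    timezone.utc is available on every Python 3 version Galaxy supports,
    unlike the datetime.UTC alias added in Python 3.11.
    """
    return datetime.now(timezone.utc)


# Example: a date stamp such as a data manager might embed in a name.
print(utc_now().strftime("%Y-%m-%d"))
```

Like `datetime.now(datetime.UTC)`, this returns an aware datetime, whereas the deprecated `datetime.utcnow()` returned a naive one.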

Note that local tests fail for what I believe are unrelated reasons: flaky networking that does not involve the URLs updated in this PR.

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

Fixes broken URL for existing tool

@bernt-matthias (Contributor) commented

Thanks for the contribution.

I was wondering whether we should add a hidden test parameter that triggers the use of --spider in wget (for the large datasets). This would verify that the URL is correct. In addition, instead of unzipping, the test could write a small test file to the output folder.
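A rough sketch of how such a switch might look in the download step is given here. The function name `build_wget_command` and the `check_only` flag are hypothetical and do not exist in the tool; only the wget options `--spider` and `-O` are real.

```python
def build_wget_command(url, output_path, check_only=False):
    """Build the wget argument list for a database download.

    With check_only=True (the hypothetical hidden test parameter), wget is
    run with --spider, which checks that the URL exists without fetching
    the archive.
    """
    if check_only:
        return ["wget", "--spider", url]
    # Normal mode: download the archive to output_path.
    return ["wget", "-O", output_path, url]
```

In test mode the caller would then skip the unpacking step and write a small placeholder file to the output folder instead.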
