Issue: Deltares/hydromt_wflow#741
Precipitation forcing built on the h7 cluster has broken years — a single spatial pattern repeated for all 365 days. Windows/WSL builds are fine.
Running setup_precip_forcing with MSWEP V316 daily data on h7 produces years where every timestep contains the same spatial field. Broken years in the reporter's model: 1991, 2000, 2009, 2013, 2015, 2018.
Each broken year's constant pattern comes from day 0 of an adjacent year:
| Broken year | Source of repeated pattern |
|---|---|
| 1991 | 1990 day 0 |
| 2000 | 2001 day 0 |
| 2009 | 2010 day 0 |
| 2013 | 2014 day 0 |
| 2015 | 2016 day 0 |
| 2018 | 2017 day 0 |
The direction isn't consistent (+1 or -1), which points to chunk data being assigned to the wrong position rather than an off-by-one.
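A check like the one behind the table above can be sketched with plain NumPy. This is only an illustration on synthetic data, not the script used for this issue: the function names `find_constant_years` and `match_source_year` are hypothetical, and the forcing is assumed to be available as a `(time, y, x)` array together with the calendar year of each timestep.

```python
import numpy as np


def find_constant_years(data, years):
    """Return the years whose fields are all identical to that year's first day.

    data  : array of shape (time, y, x)
    years : integer array of shape (time,), calendar year of each timestep
    """
    broken = []
    for year in np.unique(years):
        block = data[years == year]
        # A healthy year varies from day to day; a broken one repeats day 0.
        # Note: NaN cells compare unequal, so mask them first on real data.
        if len(block) > 1 and np.all(block == block[0]):
            broken.append(int(year))
    return broken


def match_source_year(data, years, broken_year):
    """Return the adjacent year whose day-0 field equals the repeated pattern."""
    pattern = data[years == broken_year][0]
    for candidate in (broken_year - 1, broken_year + 1):
        block = data[years == candidate]
        if len(block) and np.array_equal(block[0], pattern):
            return candidate
    return None


# Synthetic demo: three 5-day "years" on a 2x2 grid. Year 2001 is overwritten
# with day 0 of 2002 to mimic the failure mode described above.
rng = np.random.default_rng(0)
years = np.repeat([2000, 2001, 2002], 5)
data = rng.random((15, 2, 2))
data[years == 2001] = data[years == 2002][0]

broken = find_constant_years(data, years)
source = match_source_year(data, years, 2001)
print(broken, source)
```

On a real build, `data` and `years` would come from the forcing file (for example the precipitation variable and its time coordinate), which is left out here to keep the sketch independent of the model layout.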
Generated with:
pixi run python scripts/plot_issue_comparison.py --start 2010 --end 2020
WSL (orange) has normal daily variation. The cluster build (green) flatlines during 2013, 2015, and 2018.
WSL vs raw MSWEP: nearly identical. The difference is ±0.005 mm/day, from nearest-neighbor reprojection.
Cluster vs raw MSWEP: differences reach -30 to -50 mm/day during broken years. Real rainfall is completely lost.
| Package | Cluster (broken) | WSL/pixi (works) |
|---|---|---|
| Python | 3.14.3 | 3.13.13 |
| hydromt | 1.3.0 | 1.4.0.dev0 |
| hydromt_wflow | 1.0.1 | 1.0.3.dev0 |
| xarray | 2026.2.0 | 2026.4.0 |
| dask | 2026.1.2 | 2026.3.0 |
| distributed | 2026.1.2 | not installed |
| numpy | 2.3.5 | 2.4.3 |
| pandas | 2.3.3 | 3.0.2 |
| rioxarray | 0.21.0 | 0.22.0 |
| rasterio | 1.5.0 | 1.5.0 |
| netCDF4 | 1.7.4 | 1.7.4 |
Full env listings: conda_list.txt / pip_list.txt (reporter), conda_list_h7.txt / pip_list_h7.txt (older env, same cluster).
convention_resolver.py resolves wildcard URIs into a set(), giving non-deterministic order. But open_mfdataset uses combine='by_coords' so files get sorted by time coordinate regardless. Not the cause.
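The reasoning above can be illustrated with a small standard-library sketch. It only mimics the idea of ordering by time coordinate and is not xarray's implementation of `combine='by_coords'`; the file names and the helper `order_by_time` are hypothetical.

```python
from datetime import date

# Hypothetical (path, first timestamp) pairs standing in for yearly files.
# A set has no guaranteed iteration order, like the resolver output.
files = {
    (f"precip_{year}.nc", date(year, 1, 1))
    for year in (1991, 1979, 2000, 1990)
}


def order_by_time(file_set):
    """Sort on the time coordinate, ignoring whatever order the set yields."""
    return [path for path, start in sorted(file_set, key=lambda item: item[1])]


ordered = order_by_time(files)
print(ordered)
```

Whatever order the set happens to iterate in, the result is chronological, which is why the unordered URI set by itself should not be able to shuffle years.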
All 47 source files (1979–2025) have proper datetime64 time coords. Loading and reprojecting any single file gives correct temporal variation.
resample_time in meteo.py uses YE grouping. With daily-in/daily-out the frequency ratio is ~1 so no resampling happens.
One year → clip → reproject_like(method='nearest_index') → days differ correctly.
Three years loaded via open_mfdataset(combine='by_coords', parallel=True, chunks={time:1}) → clip → reproject → all correct. Bug does not appear without distributed.
Complete model (1988–2021) built on WSL produces correct forcing for all years including 2013, 2015, 2018.
Working hypothesis: dask.distributed misroutes chunk data during map_blocks in reindex2d() (called from reproject_like(method='nearest_index')).
Why this fits:
- Time coordinates are always correct, only data values are wrong
- Affected years are non-deterministic
- Does not reproduce without distributed
- reindex2d creates a task per time chunk: with chunks={time: 1} over 47 years that's 17k+ tasks
- Distributed scheduling is non-deterministic
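The task count in the list above can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: one source file per calendar year (47 files, 1979–2025), every year complete, the time chunk size applied per file, and one task per time chunk. The helper names are hypothetical.

```python
import calendar
import math


def chunks_per_file(year, time_chunk):
    """Number of time chunks in one yearly file of daily data."""
    days = 366 if calendar.isleap(year) else 365
    return math.ceil(days / time_chunk)


def total_time_chunks(first_year, last_year, time_chunk):
    """Sum of time chunks over all yearly files, first to last inclusive."""
    return sum(
        chunks_per_file(year, time_chunk)
        for year in range(first_year, last_year + 1)
    )


per_day = total_time_chunks(1979, 2025, 1)    # chunks={time: 1}
per_365 = total_time_chunks(1979, 2025, 365)  # chunks={time: 365}
per_366 = total_time_chunks(1979, 2025, 366)  # chunks={time: 366}
print(per_day, per_365, per_366)  # 17167 59 47
```

Under these assumptions a chunk size of 365 splits each of the 12 leap years into two chunks (365 + 1 days), so a size of 366 or larger is what gives exactly one chunk per yearly file.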
Other possible factors:
- Bug in dask 2026.1.2 fixed by 2026.3.0
- Bug in xarray 2026.2.0 map_blocks fixed by 2026.4.0
- Python 3.14 GIL/threading changes
- pixi (for default dev env)
- conda/mamba (for reproduce env)
- Access to /p/wflow_global/
- hydromt and hydromt_wflow repos cloned as siblings (dev env only)
pixi run full-reproduce
pixi run python scripts/plot_issue_comparison.py --no-raw --start 2010 --end 2020
environment_reproduce.yml pins the versions from conda_list_h7.txt:
mamba env create -f environment_reproduce.yml
conda activate hydromt-issue-741-reproduce
hydromt update wflow_sbm ./.updated -i ./config/wflow_update_forcing.yml -d ./config/deltares_data.yml -vvv
Versions: hydromt 1.3.1, hydromt_wflow 1.0.2, dask+distributed 2024.11.2, xarray 2024.11.0, Python 3.12.
# Without raw MSWEP (fast)
pixi run python scripts/plot_issue_comparison.py --no-raw --start 2010 --end 2020
# With raw MSWEP (needs /p/ mount)
pixi run python scripts/plot_issue_comparison.py --start 2010 --end 2020
# Custom point / range
pixi run python scripts/plot_issue_comparison.py --lat 45.85 --lon 2.69 --start 1988 --end 2021 -o my_plot.png
Output files:
- *_wsl_vs_cluster.png
- *_wsl_vs_raw.png (only when raw is loaded)
- *_cluster_vs_raw.png (only when raw is loaded)
- Run hydromt update in the reproduce env locally to see whether distributed alone triggers it
- Try with an explicit dask.distributed.Client() to force the distributed scheduler
- Check dask/xarray changelogs (2024.11 → 2026.3) for map_blocks fixes
- Disable parallel: true in the data catalog
- Use larger time chunks ({time: 365} instead of {time: 1}) to reduce the task graph from 17k+ to about 47 tasks


