Releases: labomics/midas
v0.3.0 (2026-05-09)
Major refresh of the user-facing API around a single `MuData`. The new entry points (`setup_mudata`, `MIDAS(mdata)`, `get_latent_representation`, `get_imputed_values`, `save` / `load`) compose directly with `mdata.obsm`, `sc.pp.neighbors(use_rep=...)`, and the rest of the standard single-cell stack. A new plotting namespace `scmidas.pl` and a data-prep tutorial round out the package for users coming straight from raw 10x output.
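As a rough orientation, the entry points named in this release might be chained as in the sketch below. It is not taken from the package documentation: the import path `scmidas.model.MIDAS`, the `batch_key` value, and an argument-free `train()` call are assumptions, and the function is only defined here, not executed.

```python
def integrate_with_midas(mdata, batch_key="batch"):
    """Hedged sketch of the v0.3.0 workflow on an already-built MuData.

    Assumptions (not confirmed by these release notes): the import path
    scmidas.model.MIDAS, the batch_key value, and that train() can be
    called without arguments.
    """
    import scanpy as sc                # standard single-cell stack
    from scmidas.model import MIDAS    # ASSUMED import path

    # Register the MuData; per the notes this writes mdata.uns['_scmidas'].
    MIDAS.setup_mudata(mdata, batch_key=batch_key)

    model = MIDAS(mdata)
    model.train()                      # ASSUMED training call and defaults

    # The latent is aligned to mdata.obs_names, so it can be stored directly.
    mdata.obsm["X_midas"] = model.get_latent_representation(kind="joint")

    # Downstream tools can then consume the stored representation.
    sc.pp.neighbors(mdata, use_rep="X_midas")
    return model
```

Keeping the imports inside the function means the sketch can be pasted into a module without requiring scmidas or scanpy at import time.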
- 🚀 New — `MIDAS` entry points centred on `MuData`
  - `MIDAS.setup_mudata(mdata, batch_key=...)` — register a MuData (writes config to `mdata.uns['_scmidas']`).
  - `MIDAS(mdata, ...)` — construct directly from a registered MuData; instance state instead of class-level state (fixes a multi-instance interference bug).
  - `model.get_latent_representation(kind='c'|'u'|'joint')` — returns the joint latent aligned to `mdata.obs_names`. Drop straight into `mdata.obsm['X_midas']`.
  - `model.get_imputed_values(modality='rna')` — returns imputed counts aligned to `mdata.obs_names`.
  - `model.save(dir)` / `MIDAS.load(dir, mdata)` — symmetric save/load (writes `model.pt` + `setup.json`).
  - `MIDAS(mdata)` now defaults to `transform={'atac': 'binarize'}` whenever `'atac'` is among the modalities (override by passing your own `transform` dict).
- 🚀 New — `scmidas.pl` plotting namespace
  - `scmidas.pl.umap(mdata, basis='X_midas', color=[...])` — one-line UMAP that works around the current scanpy + MuData plotting limitations via a thin AnnData wrapper.
  - `scmidas.pl.modality_grid(model, mdata, label_key=...)` — collapses the per-modality vs per-batch grid (~22 lines in the previous demos) into one call. Modality columns are ordered ATAC, RNA, ADT, Joint when present.
- 🚀 New — `scmidas.datasets.from_dir`
  - Loads the directory-format datasets (`mat/<m>.mtx`, `mask/<m>.csv`, `feat/feat_dims.toml`) into a `MuData`, including masks, labels, and ATAC chunk dims.
- 📚 New tutorial — Preparing your data
  - `docs/source/tutorials/basics/preparing_your_data.ipynb` walks from a public 10x Genomics 5k PBMC CITE-seq sample through QC, HVG selection, MuData wrap, MIDAS integration, Leiden clustering, and a synthetic mosaic example.
- 📚 Docs cleanup
  - `inputs.rst` + `outputs.rst` merged into `data_layout.rst` — a single page describing the MuData input/output contract. The directory format is moved to an "advanced" section.
  - All three demos (`demo1`, `demo2`, `demo3`) rewritten to use the new API: `from_dir` → `setup_mudata` → `MIDAS(mdata)` → `get_latent_representation`. The 22-line per-modality grid block became `scmidas.pl.modality_grid(model, mdata)`. Each demo gained a 6.4 "After integration" section (Leiden + UMAP).
  - README adds a "Bring your own data" section linking the new tutorial and the data-layout reference.
- 🛠 Backwards compatibility
  - `MIDAS.configure_data_from_mdata` and `MIDAS.configure_data_from_dir` still work — they emit a `DeprecationWarning` and will be removed in 0.4.0.
  - `save_checkpoint` / `load_checkpoint` still work; new code should use `save` / `load`.
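The deprecation behaviour described above follows a common Python pattern: the old entry point keeps working but warns and forwards to the new one. The sketch below uses a hypothetical `Model` class as a stand-in; it is not the scmidas implementation, and the mapping from the old method to `setup_mudata` is an assumption made for illustration.

```python
import warnings


class Model:
    """Hypothetical stand-in for illustration; not the scmidas MIDAS class."""

    @classmethod
    def setup_mudata(cls, mdata, batch_key=None):
        # New-style entry point: returns a small config for the demo.
        return {"batch_key": batch_key, "n_obs": len(mdata)}

    @classmethod
    def configure_data_from_mdata(cls, mdata, batch_key=None):
        # Old-style entry point: still works, but warns and forwards.
        warnings.warn(
            "configure_data_from_mdata is deprecated and will be removed "
            "in 0.4.0; use setup_mudata instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this method
        )
        return cls.setup_mudata(mdata, batch_key=batch_key)


# Capture warnings so the behaviour can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cfg = Model.configure_data_from_mdata(["cell_1", "cell_2"], batch_key="batch")
```

Python hides `DeprecationWarning` by default outside `__main__` and test runners, so library users may need `-W default` or a warnings filter to see it.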
- 🐛 Fixes
  - `predict(joint_latent=False)` no longer raises `KeyError: 'z_c'`.
  - Multiple `MIDAS()` instances in one process now have independent state (was previously class-level — a second instance would clobber the first).
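The multi-instance fix comes down to the difference between class attributes and instance attributes in Python. The following minimal sketch reproduces the general failure mode with two hypothetical classes; it does not mirror the actual `MIDAS` attributes.

```python
class SharedState:
    """Class-level state: one dict shared by every instance (the bug pattern)."""

    config = {}

    def __init__(self, name):
        self.config["name"] = name  # mutates the dict shared by all instances


class InstanceState:
    """Instance-level state: each object owns its own dict (the fix pattern)."""

    def __init__(self, name):
        self.config = {"name": name}


a = SharedState("first")
b = SharedState("second")   # silently overwrites what `a` sees

c = InstanceState("first")
d = InstanceState("second")  # `c` is unaffected
```

After this runs, `a.config` and `b.config` are the same object, while `c.config` and `d.config` are independent.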
v0.2.0 (2026-05-03)
- 🚀 New — `scmidas.integrate(mdata)` one-line entry point
  - A thin top-level wrapper around `MIDAS.configure_data_from_mdata`
    and `train()` with toy-tuned defaults (`batch_size=128`,
    `max_epochs=65`, `lr=3e-4`) so that the bundled quickstart
    dataset converges in roughly one minute on a single mid-range
    GPU. The full `MIDAS` class API is unchanged for users who
    need control.
  - ⚠️ The defaults are tuned for the toy quickstart only. For
    real datasets, override `max_epochs` (1000-2000) and consider
    `batch_size=256`.
- 🚀 New — bundled quickstart dataset
  - `scmidas.datasets.quickstart()` returns a 1600-cell PBMC RNA+ADT
    mosaic MuData (4 batches, full mosaic structure: one RNA-only,
    one ADT-only, two paired). 500 RNA HVGs + 224 ADT features,
    2.66 MB shipped inside the wheel.
  - Source: hand-tuned subset of `wnn_mosaic_8batch_mtx`. Build
    script: `scripts/build_quickstart_demo.py`.
- 📚 Documentation
  - New `examples/quickstart.ipynb` — pre-rendered notebook that
    users can open in Colab via the new badge in the README, no
    local install required.
  - README quickstart rewritten: replaces the previous `...` API
    sketch with a runnable five-line snippet using
    `scmidas.datasets.quickstart()` + `scmidas.integrate()`,
    followed by the rendered UMAP image.
- ⚙️ Packaging
  - `pyproject.toml` ships `data/*.h5mu` as package data so the
    quickstart dataset travels with the wheel.
  - Module-level `logging.basicConfig(level=INFO)` removed from
    five files (`config`, `data`, `model`, `nn`, `utils`); each
    now does the canonical `logger = logging.getLogger(__name__)`
    instead. Demo notebooks call `logging.basicConfig` themselves
    so visible output is unchanged. Libraries should not call
    `basicConfig`: it configures the root logger as an import side
    effect and can pre-empt the application's own logging config.
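The logging change follows the usual split between library and application code. In the sketch below, the logger name `demo_lib.model` stands in for `__name__` inside a library module, and the application attaches its own handler; both names are hypothetical. An explicit handler is used instead of `logging.basicConfig` only so the output can be captured.

```python
import io
import logging

# --- Library side: obtain a named logger, configure nothing. ---
logger = logging.getLogger("demo_lib.model")  # in a real module: __name__


def do_work():
    logger.info("work done")
    return True


# --- Application side: decides where and how log records are shown. ---
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s:%(levelname)s:%(message)s"))

app_logger = logging.getLogger("demo_lib")  # parent of the library logger
app_logger.addHandler(handler)
app_logger.setLevel(logging.INFO)

do_work()
output = stream.getvalue()
```

Because the library logger inherits its level and handlers from `demo_lib`, the application controls verbosity without the library touching the root logger.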
Version 0.1.x
v0.1.19 (2026-05-03)
- 📦 Packaging — narrow torch upper bound to `<2.11`
  - torch 2.11 dropped Volta (V100, CC 7.0) and Pascal (P100, GTX
    10xx, CC 6.x) from its default `cu128` / `cu129` wheels (to
    ship cuDNN 9.15.1, which is incompatible with those archs). On
    those GPUs `pip install scmidas==0.1.18` would silently install
    a torch that fails at the first CUDA op with
    `no kernel image is available for execution on the device`.
  - The pin now reads `torch>=2.5,<2.11` (with matching
    `torchvision<0.26` / `torchaudio<2.11`). Users on
    Ampere/Hopper/Ada/Blackwell GPUs can manually upgrade past the
    cap; users on Volta/Pascal stay on a working default install.
  - No source-code change — same scmidas as 0.1.18.
- ✨ Enhancements
  - `import scmidas` now runs a one-time GPU self-check: if the
    local torch wheel has no kernels for the local GPU, scmidas
    emits a `UserWarning` with actionable guidance (downgrade torch
    or use the cu126 wheel) instead of the user later seeing a raw
    `no kernel image is available` error from somewhere deep in
    their training loop. The check no-ops on CPU-only environments
    and on working GPU setups.
- ⚙️ CI
  - Test matrix gained a `torch 2.10` job (the new upper bound) and
    dropped the previous experimental `torch latest` job. Lower
    bound remains `torch 2.5.1` across Python 3.10 / 3.11 / 3.12.
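The import-time GPU self-check listed under Enhancements could be approximated as below. This is a simplified sketch, not the scmidas implementation: the helper names are hypothetical, and the compatibility rule is an approximation (a device runs `sm_XY` binaries built for its own major version at an equal or lower minor, or PTX from a `compute_XY` entry at or below its capability).

```python
import warnings


def kernels_available(arch_list, major, minor):
    """Approximate check: can a device of capability major.minor run this wheel?"""
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if not digits.isdigit():      # skip variants such as 'sm_90a'
            continue
        a_major, a_minor = divmod(int(digits), 10)
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True               # compatible prebuilt binary
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True               # PTX that can be JIT-compiled
    return False


def gpu_self_check():
    """Return False and warn when the torch wheel lacks kernels for the GPU."""
    try:
        import torch
    except ImportError:
        return True                   # nothing to check
    if not torch.cuda.is_available():
        return True                   # CPU-only environment: no-op
    major, minor = torch.cuda.get_device_capability()
    if kernels_available(torch.cuda.get_arch_list(), major, minor):
        return True
    warnings.warn(
        f"torch {torch.__version__} has no kernels for compute capability "
        f"{major}.{minor}; consider an older torch or a different CUDA wheel.",
        UserWarning,
        stacklevel=2,
    )
    return False
```

Splitting the pure comparison from the torch queries keeps the rule testable on machines without a GPU.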
v0.1.18 (2026-05-02)
- 🐛 Bug Fixes (DDP + mosaic data)
  - Default `sampler_type='auto'` now picks the DDP sampler when a
    process group is initialized. Previously `'auto'` silently fell
    back to `MultiBatchSampler` (a rank-agnostic sampler), so DDP
    runs computed each batch on every rank in parallel — correct
    but with no throughput gain over single-GPU. Users who already
    passed `sampler_type='ddp'` explicitly are unaffected.
  - `MyDistributedSampler` now derives its shuffle order from a
    seeded `random.Random` instance (cross-rank-consistent for the
    dataset visit order, rank-specific for the within-dataset
    shuffle), and properly initialises the base
    `DistributedSampler`. Previously it used the global Python
    `random` module, so each DDP rank sampled a different sub-batch
    at the same step. With non-uniform per-sub-batch modality
    combinations (mosaic data), this produced different encoder
    graphs per rank and caused NCCL all-reduce to hang under
    `find_unused_parameters=False` (Lightning default), eventually
    triggering a watchdog timeout.
  - Heads-up — DDP reproducibility: the DDP sampling order has
    changed as a side-effect of the fix. Existing seeded DDP runs
    will produce different numerics; checkpoints from prior
    versions still load and continue training, but the post-fix
    sampling sequence is not bit-equivalent to the pre-fix one.
    Single-GPU users (using `MultiBatchSampler`) are unaffected.
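One way to get a sampling order that is consistent across ranks for the dataset visit order, yet rank-specific within each dataset, is sketched below in plain Python. It is an illustration of the idea only: the function name, the seeding scheme, and the strided sharding are assumptions, not the actual `MyDistributedSampler` code.

```python
import random


def rank_epoch_plan(dataset_sizes, rank, world_size, seed, epoch):
    """Return [(dataset_id, indices_for_this_rank), ...] for one epoch."""
    # Shared RNG: seeded identically on every rank, so all ranks agree on
    # the dataset visit order and on each dataset's base permutation.
    shared = random.Random(f"{seed}-{epoch}")
    visit_order = list(range(len(dataset_sizes)))
    shared.shuffle(visit_order)

    plan = []
    for ds in visit_order:
        perm = list(range(dataset_sizes[ds]))
        shared.shuffle(perm)
        shard = perm[rank::world_size]   # disjoint slice per rank
        # Rank-specific RNG: reorders only this rank's own shard.
        local = random.Random(f"{seed}-{epoch}-{rank}-{ds}")
        local.shuffle(shard)
        plan.append((ds, shard))
    return plan


sizes = [8, 6, 10]
plan0 = rank_epoch_plan(sizes, rank=0, world_size=2, seed=42, epoch=0)
plan1 = rank_epoch_plan(sizes, rank=1, world_size=2, seed=42, epoch=0)
```

Because every rank visits the same dataset at the same step, each step sees the same modality combination on all ranks, which is the property the fix above relies on. Including `epoch` in the seed plays the role of `set_epoch`.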
- 🐛 Bug Fixes (API hardening)
  - `MIDAS.configure_optimizers` no longer raises `AttributeError`
    when entered through the simpler `configure_data` path
    (`load_optimizer_state` was only set by
    `configure_data_from_dir` / `configure_data_from_mdata` /
    `load_checkpoint`).
  - `MIDAS.configure_data` default `batch_names` now use f-string
    formatting (`f'batch_{i}'`) instead of the literal string
    `'batch_%d'` repeated `len(datalist)` times.
  - Bad ATAC configuration in `configure_data` now raises
    `ValueError` instead of calling `exit()` (which killed the
    Jupyter kernel without a traceback).
  - `download_file` now accepts both `str` and `pathlib.Path` for
    `dest_path`. The signature was annotated `str` but the body
    called `.name`.
  - `Encoder.forward` no longer mutates the caller's batch dict.
    The mask multiply is now out-of-place; the previous in-place
    `data[m] *= mask` corrupted upstream tensors for any modality
    without a `trsf_before_enc_*` transform. Mathematically
    equivalent (the mask is a 0/1 modality-presence indicator, and
    `calc_recon_loss` already multiplies the loss by the same
    mask), but makes the encoder safe to re-call on the same
    batch (e.g. `predict`'s `mod_latent` / `translate` paths).
  - `VAE.forward` no longer wraps the PoE call in a bare
    `try/except` that swallowed real errors with a malformed
    `logging.debug` call.
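Three of the hardening fixes above are small, general Python patterns. The sketch below illustrates them with hypothetical helpers; the function names, the reconstructed buggy expression, and the specific ATAC check (chunk sizes summing to the feature count) are assumptions for illustration, not the scmidas code.

```python
from pathlib import Path


def buggy_batch_names(datalist):
    # Assumed shape of the old bug: the format string is never filled in.
    return ["batch_%d"] * len(datalist)


def default_batch_names(datalist):
    # Fixed version: one distinct name per dataset.
    return [f"batch_{i}" for i in range(len(datalist))]


def validate_atac_chunks(chunk_dims, n_features):
    # Raise instead of calling exit(), so notebooks get a traceback
    # and the kernel stays alive. The check itself is hypothetical.
    if sum(chunk_dims) != n_features:
        raise ValueError(
            f"ATAC chunk dims sum to {sum(chunk_dims)}, expected {n_features}"
        )
    return True


def dest_filename(dest_path):
    # Normalise first, so both str and pathlib.Path inputs support .name.
    return Path(dest_path).name


names = default_batch_names([object(), object(), object()])
```

With three datasets, `names` holds `batch_0`, `batch_1`, `batch_2`, whereas the buggy variant yields the unformatted string three times.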
- ✅ Tests
  - Added `tests/test_invariants.py` pinning down the bugs above
    plus the DDP sampler determinism fix (cross-rank disjoint
    indices, `set_epoch` actually changes ordering).
- 📚 Documentation
  - Each basics demo now exposes a single `# === GPU configuration ===`
    block (`GPUS` + `STRATEGY`) at the top so switching from
    single-GPU to multi-GPU only requires editing two values.
  - Removed the redundant standalone `advanced/multi_gpu.rst`
    tutorial — its contents now live inline in the basics demos
    where the failure modes would actually be encountered.
  - README: removed the duplicated MuData section (the `from_mdata`
    path is one link away in the docs), corrected the Quick
    Example comment about input format, and fixed the License
    badge link.
- ⚙️ Packaging
  - Version is now single-sourced from `pyproject.toml`;
    `scmidas.__version__` and the Sphinx `release` both read it via
    `importlib.metadata.version("scmidas")` instead of duplicating
    the literal in three files.
  - Relax the `torch` pin from `>=2.5,<2.6` to `>=2.5,<3` (and the
    matching `torchvision` / `torchaudio` companions). The previous
    `<2.6` cap was a workaround for a suspected Lightning-DDP
    incompatibility; torch 2.8 has now been verified end-to-end in
    the mosaic DDP path (1000-epoch run with UMAP and numerics
    consistent with the single-GPU baseline), so users on torch 2.6
    / 2.7 / 2.8 no longer have to manually override the pin.