Releases · labomics/midas

10 May 03:55

zhen-he

v0.3.0

95651c0

v0.3.0 Latest

v0.3.0 (2026-05-09)

Major refresh of the user-facing API around a single MuData object. The new entry points (setup_mudata, MIDAS(mdata), get_latent_representation, get_imputed_values, save / load) compose directly with mdata.obsm, sc.pp.neighbors(use_rep=...), and the rest of the standard single-cell stack. A new plotting namespace scmidas.pl and a data-prep tutorial round out the package for users coming straight from raw 10x output.

🚀 New — MIDAS entry points centred on MuData
- MIDAS.setup_mudata(mdata, batch_key=...) — register a MuData (writes config to mdata.uns['_scmidas']).
- MIDAS(mdata, ...) — construct directly from a registered MuData; instance state instead of class-level state (fixes a multi-instance interference bug).
- model.get_latent_representation(kind='c'|'u'|'joint') — returns the joint latent aligned to mdata.obs_names. Drop straight into mdata.obsm['X_midas'].
- model.get_imputed_values(modality='rna') — returns imputed counts aligned to mdata.obs_names.
- model.save(dir) / MIDAS.load(dir, mdata) — symmetric save/load (writes model.pt + setup.json).
- MIDAS(mdata) now defaults to transform={'atac': 'binarize'} whenever 'atac' is among the modalities (override by passing your own transform dict).
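
The entry points listed above might be chained roughly as follows. This is a minimal sketch, not the documented workflow: the helper name run_midas, the import path scmidas.model, and the argument-free train() call are assumptions, and only the method names and keyword arguments are taken from the notes above.

```python
def run_midas(mdata, batch_key, out_dir):
    """Hypothetical helper chaining the v0.3.0 entry points.

    `mdata` is a MuData object, `batch_key` names the batch column to
    register, and `out_dir` is where the model is saved.
    """
    # Assumption: MIDAS is importable from scmidas.model.
    # The import is done lazily so this function can be defined without scmidas.
    from scmidas.model import MIDAS

    # Register the MuData (per the notes, config goes to mdata.uns['_scmidas']).
    MIDAS.setup_mudata(mdata, batch_key=batch_key)

    # Construct directly from the registered MuData.
    model = MIDAS(mdata)

    # Assumption: train() runs with library defaults when called without arguments.
    model.train()

    # Store the joint latent next to the data so downstream tools can find it.
    mdata.obsm["X_midas"] = model.get_latent_representation(kind="joint")

    # Imputed RNA counts, aligned to mdata.obs_names per the notes.
    imputed_rna = model.get_imputed_values(modality="rna")

    model.save(out_dir)
    return model, imputed_rna
```

A model saved this way would presumably be restored with MIDAS.load(out_dir, mdata), the counterpart named in the list above.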
🚀 New — scmidas.pl plotting namespace
- scmidas.pl.umap(mdata, basis='X_midas', color=[...]) — one-line UMAP that works around the current scanpy + MuData plotting limitations via a thin AnnData wrapper.
- scmidas.pl.modality_grid(model, mdata, label_key=...) — collapses the per-modality vs per-batch grid (~22 lines in the previous demos) into one call. Modality columns are ordered ATAC, RNA, ADT, Joint when present.
🚀 New — scmidas.datasets.from_dir
- Loads the directory-format datasets (mat/<m>.mtx, mask/<m>.csv, feat/feat_dims.toml) into a MuData, including masks, labels, and ATAC chunk dims.
📚 New tutorial — Preparing your data
- docs/source/tutorials/basics/preparing_your_data.ipynb walks from a public 10x Genomics 5k PBMC CITE-seq sample through QC, HVG selection, MuData wrap, MIDAS integration, Leiden clustering, and a synthetic mosaic example.
📚 Docs cleanup
- inputs.rst + outputs.rst merged into data_layout.rst — a single page describing the MuData input/output contract. The directory format is moved to an "advanced" section.
- All three demos (demo1, demo2, demo3) rewritten to use the new API: from_dir → setup_mudata → MIDAS(mdata) → get_latent_representation. The 22-line per-modality grid block became scmidas.pl.modality_grid(model, mdata). Each demo gained a 6.4 "After integration" section (Leiden + UMAP).
- README adds a "Bring your own data" section linking the new tutorial and the data-layout reference.
🛠 Backwards compatibility
- MIDAS.configure_data_from_mdata and MIDAS.configure_data_from_dir still work — they emit a DeprecationWarning and will be removed in 0.4.0.
- save_checkpoint / load_checkpoint still work; new code should use save / load.
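
For readers unfamiliar with this kind of deprecation shim, the sketch below shows the general pattern and one way to locate remaining legacy calls before 0.4.0. The Model class and its method names are hypothetical stand-ins, not scmidas code.

```python
import warnings


class Model:
    """Hypothetical stand-in, not the real MIDAS class."""

    def setup(self, source):
        # New-style entry point.
        return f"configured from {source}"

    def configure_legacy(self, source):
        # Old entry point: warn, then delegate so behaviour stays identical.
        warnings.warn(
            "configure_legacy is deprecated; use setup instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.setup(source)


# Escalate deprecation warnings to errors to find remaining legacy calls.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        Model().configure_legacy("mdata")
    except DeprecationWarning as exc:
        print(f"legacy call detected: {exc}")
```

The same escalation can be applied to a whole test run with `python -W error::DeprecationWarning`.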
🐛 Fixes
- predict(joint_latent=False) no longer raises KeyError: 'z_c'.
- Multiple MIDAS() instances in one process now have independent state (was previously class-level — a second instance would clobber the first).
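
The multi-instance fix corresponds to a common Python pitfall. The sketch below reproduces the pattern with two hypothetical classes (neither is scmidas code): mutable state stored on the class is shared by every instance, while state created in __init__ is not.

```python
class SharedState:
    """Buggy pattern: the dict lives on the class, so all instances share it."""

    registry = {}

    def __init__(self, name):
        self.registry["name"] = name  # mutates the single class-level dict


class InstanceState:
    """Fixed pattern: each instance creates its own dict."""

    def __init__(self, name):
        self.registry = {"name": name}


a, b = SharedState("first"), SharedState("second")
print(a.registry["name"])  # second: creating b clobbered a's state

c, d = InstanceState("first"), InstanceState("second")
print(c.registry["name"])  # first: c is unaffected by d
```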


03 May 06:44

zhen-he

v0.2.0

37a5688

v0.2.0 (2026-05-03)

🚀 New — scmidas.integrate(mdata) one-line entry point
- A thin top-level wrapper around MIDAS.configure_data_from_mdata +
  train() with toy-tuned defaults (batch_size=128,
  max_epochs=65, lr=3e-4) so that the bundled quickstart
  dataset converges in roughly one minute on a single mid-range
  GPU. The full MIDAS class API is unchanged for users who
  need control.
- ⚠️ The defaults are tuned for the toy quickstart only. For
  real datasets, override max_epochs (1000-2000) and consider
  batch_size=256.
🚀 New — bundled quickstart dataset
- scmidas.datasets.quickstart() returns a 1600-cell PBMC RNA+ADT
  mosaic MuData (4 batches, full mosaic structure: one RNA-only,
  one ADT-only, two paired). 500 RNA HVGs + 224 ADT features,
  2.66 MB shipped inside the wheel.
- Source: hand-tuned subset of wnn_mosaic_8batch_mtx. Build
  script: scripts/build_quickstart_demo.py.
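
A hedged sketch of how the one-line entry point might be called with and without overrides. It assumes that scmidas.integrate accepts max_epochs and batch_size as keyword arguments (the notes say these defaults can be overridden but do not show the signature); the helper name integrate_with_overrides is hypothetical.

```python
def integrate_with_overrides(mdata, real_data=True):
    """Hypothetical wrapper around scmidas.integrate.

    Assumption: integrate() takes max_epochs and batch_size keywords.
    """
    import scmidas  # lazy import so the function can be defined without scmidas

    if real_data:
        # Values from the warning above: 1000-2000 epochs, batch_size=256.
        return scmidas.integrate(mdata, max_epochs=1000, batch_size=256)

    # Toy quickstart data: keep the bundled defaults.
    return scmidas.integrate(mdata)


# Possible usage with the bundled dataset (requires scmidas to be installed):
#   import scmidas
#   mdata = scmidas.datasets.quickstart()
#   integrate_with_overrides(mdata, real_data=False)
```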
📚 Documentation
- New examples/quickstart.ipynb — pre-rendered notebook that
  users can open in Colab via the new badge in the README, no
  local install required.
- README quickstart rewritten: replaces the previous ... API
  sketch with a runnable five-line snippet using
  scmidas.datasets.quickstart() + scmidas.integrate(),
  followed by the rendered UMAP image.
⚙️ Packaging
- pyproject.toml ships data/*.h5mu as package data so the
  quickstart dataset travels with the wheel.
- Module-level logging.basicConfig(level=INFO) removed from
  five files (config, data, model, nn, utils); each
  now does the canonical logger = logging.getLogger(__name__)
  instead. Demo notebooks call logging.basicConfig themselves
  so visible output is unchanged. Libraries should not call
  basicConfig — it overrides the user's own logging config.
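
The logging convention described in the last item can be sketched as follows. The logger name mylib.core and the ListHandler class are hypothetical and used only for illustration: the library side creates a named logger and attaches nothing, and the application decides where records go.

```python
import logging

# Library side: a named logger, no handlers, no basicConfig.
# In a real module this would be logging.getLogger(__name__).
logger = logging.getLogger("mylib.core")


def load_data():
    logger.info("loading data")
    return 42


# Application side: the user chooses handlers and levels.
class ListHandler(logging.Handler):
    """Collects formatted messages in memory (illustration only)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
app_logger = logging.getLogger("mylib")  # parent of "mylib.core"
app_logger.addHandler(handler)
app_logger.setLevel(logging.INFO)

load_data()
print(handler.messages)  # ['loading data']
```

Because the library logger has no handlers of its own, records propagate to whatever the application configured on a parent logger.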

Version 0.1.x


03 May 01:55

zhen-he

v0.1.19

2f8b024

v0.1.19 (2026-05-03)

📦 Packaging — narrow torch upper bound to <2.11
- torch 2.11 dropped Volta (V100, CC 7.0) and Pascal (P100, GTX
  10xx, CC 6.x) from its default cu128 / cu129 wheels (to
  ship cuDNN 9.15.1, which is incompatible with those archs). On
  those GPUs, pip install scmidas==0.1.18 would silently install
  a torch that fails at the first CUDA op with the error
  "no kernel image is available for execution on the device".
- The pin now reads torch>=2.5,<2.11 (with matching
  torchvision<0.26 / torchaudio<2.11). Users on
  Ampere/Hopper/Ada/Blackwell GPUs can manually upgrade past the
  cap; users on Volta/Pascal stay on a working default install.
- No source-code change — same scmidas as 0.1.18.
✨ Enhancements
- import scmidas now runs a one-time GPU self-check: if the
  local torch wheel has no kernels for the local GPU, scmidas
  emits a UserWarning with actionable guidance (downgrade torch
  or use the cu126 wheel) instead of the user later seeing a raw
  "no kernel image is available" error from somewhere deep in
  their training loop. The check no-ops on CPU-only environments
  and on working GPU setups.
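
The idea behind such a self-check can be sketched in pure Python. This is not the scmidas implementation: the function names are hypothetical and the compatibility rules are a simplification of CUDA's model (an sm_XY binary runs on devices with the same major version and an equal or higher minor version; compute_XY PTX can be JIT-compiled for any equal or newer device; suffixed architectures such as sm_90a are treated as exact-match only).

```python
import re
import warnings


def wheel_supports_device(capability, arch_list):
    """Return True if any entry in arch_list can run on the given device.

    capability: (major, minor) tuple, e.g. (7, 0) for V100.
    arch_list:  strings such as "sm_80" or "compute_90".
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"(sm|compute)_(\d+)(\d)([a-z]?)", arch)
        if match is None:
            continue  # ignore entries this sketch does not understand
        kind, suffix = match.group(1), match.group(4)
        arch_cc = (int(match.group(2)), int(match.group(3)))
        if suffix:
            # Architecture-specific builds: exact match only.
            if arch_cc == (major, minor):
                return True
            continue
        if kind == "sm" and arch_cc[0] == major and arch_cc[1] <= minor:
            return True
        if kind == "compute" and arch_cc <= (major, minor):
            return True
    return False


def warn_if_unsupported(capability, arch_list):
    """Emit a UserWarning when no kernel matches; return True if it warned."""
    if wheel_supports_device(capability, arch_list):
        return False
    warnings.warn(
        f"installed wheel has no kernels for compute capability "
        f"{capability[0]}.{capability[1]}; consider a different torch build",
        UserWarning,
        stacklevel=2,
    )
    return True


# Hypothetical arch list; with torch installed the inputs could come from
# torch.cuda.get_device_capability() and torch.cuda.get_arch_list().
example_archs = ["sm_75", "sm_80", "sm_86", "sm_90"]
print(wheel_supports_device((7, 0), example_archs))  # False
print(wheel_supports_device((8, 6), example_archs))  # True
```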
⚙️ CI
- Test matrix gained a torch 2.10 job (the new upper bound) and
  dropped the previous experimental torch latest job. Lower
  bound remains torch 2.5.1 across Python 3.10 / 3.11 / 3.12.


02 May 18:41

zhen-he

v0.1.18

c6d28b5

v0.1.18 (2026-05-02)

🐛 Bug Fixes (DDP + mosaic data)
- Default sampler_type='auto' now picks the DDP sampler when a
  process group is initialized. Previously 'auto' silently fell
  back to MultiBatchSampler (a rank-agnostic sampler), so DDP
  runs computed each batch on every rank in parallel — correct
  but with no throughput gain over single-GPU. Users who already
  passed sampler_type='ddp' explicitly are unaffected.
- MyDistributedSampler now derives its shuffle order from a
  seeded random.Random instance (cross-rank-consistent for the
  dataset visit order, rank-specific for the within-dataset
  shuffle), and properly initialises the base
  DistributedSampler. Previously it used the global Python
  random module, so each DDP rank sampled a different sub-batch
  at the same step. With non-uniform per-sub-batch modality
  combinations (mosaic data), this produced different encoder
  graphs per rank and caused NCCL all-reduce to hang under
  find_unused_parameters=False (Lightning default), eventually
  triggering a watchdog timeout.
- Heads-up — DDP reproducibility: the DDP sampling order has
  changed as a side-effect of the fix. Existing seeded DDP runs
  will produce different numerics; checkpoints from prior
  versions still load and continue training, but the post-fix
  sampling sequence is not bit-equivalent to the pre-fix one.
  Single-GPU users (using MultiBatchSampler) are unaffected.
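
The determinism property described above can be illustrated with a much-simplified sketch. It is not MyDistributedSampler: the function shard_indices is hypothetical and covers only the cross-rank-consistent part (every rank derives the same permutation from a private, seeded random.Random and then takes a disjoint slice), leaving out the per-dataset and rank-specific shuffling.

```python
import random


def shard_indices(n_items, rank, world_size, seed=0, epoch=0):
    """Return the indices that `rank` visits in `epoch`.

    A private random.Random seeded identically on every rank yields the same
    permutation everywhere, unlike the global `random` module, whose state
    differs between processes.
    """
    rng = random.Random(seed + epoch)
    order = list(range(n_items))
    rng.shuffle(order)
    # Strided slicing gives each rank a disjoint share of the permutation.
    return order[rank::world_size]


rank0 = shard_indices(100, rank=0, world_size=2, seed=7)
rank1 = shard_indices(100, rank=1, world_size=2, seed=7)
print(set(rank0).isdisjoint(rank1))        # True
print(len(rank0) + len(rank1))             # 100
print(rank0 == shard_indices(100, 0, 2, seed=7))  # True: reproducible
```

Passing a different epoch changes the seed and therefore the order, which is the role set_epoch plays in torch's DistributedSampler.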
🐛 Bug Fixes (API hardening)
- MIDAS.configure_optimizers no longer raises AttributeError
  when entered through the simpler configure_data path
  (load_optimizer_state was only set by
  configure_data_from_dir / configure_data_from_mdata /
  load_checkpoint).
- MIDAS.configure_data default batch_names now use f-string
  formatting (f'batch_{i}') instead of the literal string
  'batch_%d' repeated len(datalist) times.
- Bad ATAC configuration in configure_data now raises
  ValueError instead of calling exit() (which killed the
  Jupyter kernel without a traceback).
- download_file now accepts both str and pathlib.Path for
  dest_path. The signature was annotated str but the body
  called .name.
- Encoder.forward no longer mutates the caller's batch dict.
  The mask multiply is now out-of-place; the previous in-place
  data[m] *= mask corrupted upstream tensors for any modality
  without a trsf_before_enc_* transform. Mathematically
  equivalent (the mask is a 0/1 modality-presence indicator, and
  calc_recon_loss already multiplies the loss by the same
  mask), but makes the encoder safe to re-call on the same
  batch (e.g. predict's mod_latent / translate paths).
- VAE.forward no longer wraps the PoE call in a bare
  try/except that swallowed real errors with a malformed
  logging.debug call.
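
Two of the fixes above lend themselves to a small illustration. The sketch below uses plain Python lists in place of tensors and hypothetical names throughout; it shows an out-of-place mask multiply that leaves the caller's batch dict intact, and batch names built with an f-string per index.

```python
def apply_masks(batch, masks):
    """Return masked copies; the caller's `batch` is left untouched."""
    return {
        mod: [value * keep for value, keep in zip(values, masks[mod])]
        for mod, values in batch.items()
    }


batch = {"rna": [3.0, 5.0, 2.0]}
masks = {"rna": [1, 0, 1]}  # 0/1 presence indicator

masked = apply_masks(batch, masks)
print(masked["rna"])  # [3.0, 0.0, 2.0]
print(batch["rna"])   # [3.0, 5.0, 2.0], unchanged, so it is safe to reuse

# Default batch names: one formatted name per dataset.
datalist = ["d0", "d1", "d2"]
batch_names = [f"batch_{i}" for i in range(len(datalist))]
print(batch_names)  # ['batch_0', 'batch_1', 'batch_2']
```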
✅ Tests
- Added tests/test_invariants.py pinning down the bugs above
  plus the DDP sampler determinism fix (cross-rank disjoint
  indices, set_epoch actually changes ordering).
📚 Documentation
- Each basics demo now exposes a single # === GPU configuration ===
  block (GPUS + STRATEGY) at the top so switching from
  single-GPU to multi-GPU only requires editing two values.
- Removed the redundant standalone advanced/multi_gpu.rst
  tutorial — its contents now live inline in the basics demos
  where the failure modes would actually be encountered.
- README: removed the duplicated MuData section (the from_mdata
  path is one link away in the docs), corrected the Quick
  Example comment about input format, and fixed the License
  badge link.
⚙️ Packaging
- Version is now single-sourced from pyproject.toml;
  scmidas.__version__ and the Sphinx release both read it via
  importlib.metadata.version("scmidas") instead of duplicating
  the literal in three files.
- Relaxed the torch pin from >=2.5,<2.6 to >=2.5,<3 (and the
  matching torchvision / torchaudio companions). The previous
  <2.6 cap was a workaround for a suspected Lightning-DDP
  incompatibility; torch 2.8 has now been verified end-to-end in
  the mosaic DDP path (1000-epoch run with UMAP and numerics
  consistent with the single-GPU baseline), so users on torch 2.6
  / 2.7 / 2.8 no longer have to manually override the pin.
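
Single-sourcing a version through importlib.metadata typically looks like the sketch below. The helper get_version and its fallback string are hypothetical additions for the case where the distribution is not installed (for example, when running from a source checkout); the notes do not say how scmidas handles that case.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name, fallback="0+unknown"):
    """Read the installed distribution's version, or return a fallback."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return fallback


# In a package's __init__.py this might read:
#   __version__ = get_version("scmidas")
print(get_version("a-distribution-that-is-not-installed"))  # 0+unknown
```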
