`make_arviz` relies on Tasccoda instance state, causing errors when processing multiple models

Hi,

First of all, thank you for this fantastic tool. I've been using the `Tasccoda` module for differential abundance analysis across multiple datasets (different regions) in a loop. I store the resulting `MuData` objects, which contain the MCMC results, in a dictionary for later analysis and plotting.

**The Problem**

I've encountered an issue when trying to generate ArviZ plots in a separate step after the modeling is complete. The `coda.make_arviz(data)` function fails with a matrix dimension mismatch error (`TypeError: dot_general requires contracting dimensions...`).

After investigating, I realized that `make_arviz` does not solely rely on the `MuData` object passed to it. It also uses the internal state of the `Tasccoda` instance (e.g., `self.mcmc`) which was configured by the last call to `coda.prepare()`.

This leads to a "desynchronization" problem: if I try to extract `arviz` data from a `MuData` object from a previous iteration in the loop, `make_arviz` will use the model configuration from the *last* iteration, causing a shape mismatch if the number of cell types (which usually happens as different regions have different cell populations).

**Example workflow to illustrate:**

```python
coda = pt.tl.Tasccoda()
processed_models = {}
datasets = {"region_A": data_A, "region_B": data_B} # Example datasets

for name, mdata in datasets.items():
    coda.prepare(mdata, ...)
    coda.run_nuts(mdata, ...)
    processed_models[name] = mdata

# This call fails if region_A has a different number of cell types than region_B
arviz_data = coda.make_arviz(data=processed_models["region_A"]) 
```

**Workaround**

The current workaround is to "resynchronize" the `Tasccoda` instance by calling `coda.prepare()` again with the specific `MuData` object right before calling `make_arviz`.

```python
# Workaround
coda.prepare(processed_models["region_A"], ...) # Re-run prepare
arviz_data = coda.make_arviz(data=processed_models["region_A"]) # Now it works
```

While this works, it's not very intuitive and makes the code more error-prone.

**Suggestion for Improvement**

Would it be technically feasible to make `make_arviz` a stateless function, or at least have it derive all necessary model information directly from the provided `MuData` object? The `MuData` object already stores the MCMC results and the parameters in `uns['scCODA_params']`. If the model structure could also be inferred or retrieved from the `MuData` object, it would make the function much more robust and self-contained.

This would align with the expectation that the `MuData` object is the single source of truth for a given analysis, making batch processing and later analysis much more straightforward.

Thank you for considering this suggestion!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`make_arviz` relies on Tasccoda instance state, causing errors when processing multiple models #812

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

make_arviz relies on Tasccoda instance state, causing errors when processing multiple models #812

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`make_arviz` relies on Tasccoda instance state, causing errors when processing multiple models #812