Add Zarr support #190
Open: keller-mark wants to merge 155 commits into scverse:devel from keller-mark:keller-mark/zarr
Changes from 108 commits
Commits
19698c3 WIP (keller-mark)
5100a40 More tests passing (keller-mark)
4c9d01f Fix df read bug (keller-mark)
eb8b69c More tests passing after fixing zero-dimensional get bug in pizzarr (keller-mark)
58f5ce4 WIP: writing (keller-mark)
a15513a Fix more tests (keller-mark)
1ef1be6 Zarr df writing (keller-mark)
77a7ee2 WIP: ZarrAnnData class (keller-mark)
27ce6c1 Tests passing (keller-mark)
8790715 Tests that compare h5ad to zarr (keller-mark)
bb0c6c7 Use Rarr to read full numeric arrays (keller-mark)
0456ecd Fix bugs. Add test for from_SingleCellExperiment with Zarr (keller-mark)
acac772 Add a to_dense param to ZarrAnnData constructor. Add overwrite params… (keller-mark)
30316ed Update (keller-mark)
c6b4d89 Backwards dense/sparse (keller-mark)
9cadc26 Merge branch 'keller-mark/zarr' of https://github.com/keller-mark/ann… (Artur-man)
a40a618 Merge branch 'keller-mark/zarr' into zarr (Artur-man)
1afd6eb Simplify how obs and var names handled in ZarrAnnData (similar to #171) (Artur-man)
7f53049 update extdata and documentation (Artur-man)
3800f38 fix set/get zarr _index, update text example.zarr and update tests si… (Artur-man)
4a7d4c2 Merge pull request #5 from Artur-man/zarr (keller-mark)
c881bb9 Merge (keller-mark)
4a1bbde Fix test (keller-mark)
15dfbde Revert unnecessary changes (keller-mark)
438809a Formatting (keller-mark)
2215402 Merge pull request #6 from keller-mark/keller-mark/zarr-2 (keller-mark)
087ffb7 Add comments (keller-mark)
37d1ae5 Merge pull request #7 from keller-mark/keller-mark/comments (keller-mark)
357a8d7 remove unnecessary example zarr store (Artur-man)
d192e68 lintr and R check for zarr related utilities and functions, updated s… (Artur-man)
1e0e868 add pizzarr to Suggests and README (Artur-man)
fe07028 proj (Artur-man)
7ef94f8 Merge branch 'main' into keller-mark/zarr (Artur-man)
bf8e797 add keller-mark/pizzarr to Remotes (Artur-man)
5abcc75 zip example.zarr (Artur-man)
c5ec1c0 Merge branch 'main' into keller-mark/zarr (Artur-man)
84ad61f Merge branch 'main' into keller-mark/zarr (Artur-man)
c3cb8aa adapt read_zarr to Rarr (Artur-man)
63f102c adapt write_zarr to Rarr (Artur-man)
a31ff9b update to most recent anndataR (Artur-man)
e98f877 remove old scripts (Artur-man)
b0bfad4 update write_zarr (Artur-man)
7ebe151 initial update to ZarrAnnData (Artur-man)
96c5824 update ZarrAnnData, documentation, and implement read_zarr_rec_array (Artur-man)
c41a042 review read zarr helpers, and update tests (Artur-man)
370ac17 update read_zarr, read tests pass (Artur-man)
ddb5271 some updates for writing zarr (Artur-man)
755904d update write_empty_zarr (Artur-man)
4290aed remove pizzarr, update documentation (Artur-man)
e43c819 remove pizzarr from tests (Artur-man)
a98b58f fix test-ZarrAnnData (Artur-man)
2d551d8 update ZarrAnnData to imitate HDF5AnnData (Artur-man)
42fcbb1 check redundant files, correct lines (Artur-man)
205dee4 update example_h5ad.py, add zarr and change to example_files.py (Artur-man)
f7638eb add new test example (Artur-man)
acede3c some linting changes (Artur-man)
07c92f7 remove read/write_zattrs since implemented in Rarr (Artur-man)
2b672ab access read/write_zarr_attr (Artur-man)
555a634 Merge branch 'main' into keller-mark/zarr (Artur-man)
dcaf157 add some missing tests (Artur-man)
b10faa5 Merge branch 'main' into keller-mark/zarr (Artur-man)
e3d08f8 update readers, update tests (Artur-man)
1e2addc correct nullable string zarr array write/read, introduce ordering in … (Artur-man)
570325b do some linting, fix commented out code (Artur-man)
0fac149 update some zarr writers and classes (Artur-man)
79023b4 fix documentation (Artur-man)
bece447 fix compression interface for zarr (Artur-man)
a46c9e1 full lint check (Artur-man)
f90d70a fix examples (Artur-man)
73934a7 check, biocheck and lintr (Artur-man)
f42a6df fix development status (Artur-man)
a373973 air format (Artur-man)
2f73501 air format test (Artur-man)
540852d update example.zarr.zip, skip some test (waiting for Rarr) (Artur-man)
a22d007 update example.zarr, fix some read_zarr_ (Artur-man)
1cc5ff6 fix examples (Artur-man)
2499d5c remove overwrite (Artur-man)
bd6238a R code styling (Artur-man)
aaf9801 fixes from @lazappi (Artur-man)
43d4f1f Merge branch 'main' into keller-mark/zarr (Artur-man)
73ee0e3 air format (Artur-man)
d70a011 update some documentation (Artur-man)
ccb0cdf fix some tests (Artur-man)
fe8f196 more fixes on anndata-zarr integration (Artur-man)
e6efbf2 update ZarrAnnData$initialize (Artur-man)
8b0d1b0 update zarr compression (Artur-man)
bafae8e fix column-order here, C based ordering for arrays (Artur-man)
eead040 implement roundtrip tests for anndata-zarr (Artur-man)
64e4289 add zarr to vignettes (Artur-man)
23f8ac5 update README and software_design.rmd (Artur-man)
0852908 update AnnData-usage (Artur-man)
8677fad update write_zarr documentation (Artur-man)
efa2ca0 update write_zarr_null (Artur-man)
5025035 fix rec_array, update tests and example datasets (Artur-man)
0c53019 fix duplicate chunks in Rmd (Artur-man)
5011c2c add write_zarr_null (Artur-man)
e90e7b9 update write string array (zarr), air and lint (Artur-man)
6b06099 implement writing empty zarr elements (Artur-man)
2ac2eba update tests for rec_array conformance of h5ad and zarr (Artur-man)
b85e706 update mapping conformance test for h5ad and zarr (Artur-man)
b26c9b5 implement H5_ITER like ordering and fix h5ad vs zarr testing (Artur-man)
2c7b5b3 air and lint (Artur-man)
978482f fix test bug (Artur-man)
36c2960 do not call expect_equal outside of test (Artur-man)
4b657d1 implement examples, test and datasets for zarr v3 (Artur-man)
20bb482 Merge branch 'main' into keller-mark/zarr (Artur-man)
ee635b1 fix issues, lint and update example datasets to new anndata version (Artur-man)
4d98248 Merge branch 'main' into keller-mark/zarr (Artur-man)
c6bb382 Merge branch 'main' into keller-mark/zarr (Artur-man)
3c5c22a lint and merge (Artur-man)
5bcaa5d revert some lines (Artur-man)
d6021e9 small changes (Artur-man)
493d707 revert small changes (Artur-man)
5698d7b air format some tests (Artur-man)
e483e8e Merge remote-tracking branch 'origin/devel' into keller-mark/zarr (lazappi)
b7080c3 Set v2 in write_zarr_* helpers (lazappi)
fb90100 Fix stop message in write_zarr_element() (lazappi)
417fd77 Fix roxygen comment in write_zarr_element() (lazappi)
5bcfed9 Expand compression list in as_ZarrAnnData() (lazappi)
8be4459 Fix H5_ITER_INC_ORDERING docs (lazappi)
9d3e8aa Fix as_ZarrAnnData() compression docs (lazappi)
2b50657 Fix Zarr varm roundtrip test (lazappi)
6d567f8 Review duplicate entry in README (lazappi)
bff6b83 Fix typo in AnnData-usage docs (lazappi)
2ee61eb Fix comma in software design vignette (lazappi)
c7ebd33 Adjust class descriptions in software design vignette (lazappi)
5263b6f Roxygenise (lazappi)
492cdb7 Minor text fixes (lazappi)
5b5aeed Document .get_compressor (lazappi)
08b48f0 Minor fixes to function docs (lazappi)
3a2410c Comment logic in create_zarr_group() (lazappi)
21c389e Fix indentation in read_zarr_sparse_array() (lazappi)
8084f44 Add construct sparse matrix helper (lazappi)
535deb4 Add ZARR_METADATA_FILES vector (lazappi)
beec0d2 Eval Zarr chunks in vignettes (lazappi)
b6977ff Merge test-Zarrv3-read.R into test-Zarr-read.R (lazappi)
7c29c7e Combine roundtrip tests (lazappi)
f438aba Add roundtrip test helpers (lazappi)
c67d6bf Refactor test-h5ad-zarr.R to use helper (lazappi)
6deaf01 Refactor example files script (lazappi)
6197b79 Remove H5_ITER_INC_ORDERING() (lazappi)
28f8a6a Remove Zarr compression comment (lazappi)
01768ec Fix factor creation in read_zarr_categorical() (lazappi)
11b6816 Pin Rarr version (lazappi)
6f17dfa Delete existing Zarr path before writing (lazappi)
bd2e1b4 Add helper functions for accessing Zarr keys (lazappi)
3a233e4 Update read_zarr_element() error message (lazappi)
8fc907b Add dimname warnings to ZarrAnnData (lazappi)
dc9a5b5 Add Zarr writeability checks/tests (lazappi)
a236045 Roxygenise, lint, style (lazappi)
6b6993d Use setup-bioc for all GHA (lazappi)
9ecfd0c Update WORDLIST (lazappi)
5213860 Add .venv to .Rbuildignore (lazappi)
8fbbd77 Clean up test output (lazappi)
528eff0 Style (lazappi)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -63,4 +63,4 @@ benchmarks/results_*.txt | |
| vignettes/data/*.h5ad | ||
| /doc/ | ||
| /Meta/ | ||
| /data/ | ||
| /data/ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -49,6 +49,7 @@ Suggests: | |
| knitr, | ||
| processx, | ||
| rhdf5 (>= 2.52.1), | ||
| Rarr, | ||
| rmarkdown, | ||
| S4Vectors, | ||
| Seurat, | ||
|
|
||
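Because this hunk adds `Rarr` under `Suggests` rather than `Imports`, code that depends on it generally needs to check for the package at run time. A minimal sketch of such a guard, using only base R; `check_optional_package()` is a hypothetical helper written for illustration and is not taken from this PR:

```r
# Hypothetical helper (not part of the PR): fail early with a clear
# message when an optional (Suggests) package is not installed.
check_optional_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' is required for this feature but is not installed.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Usage: "stats" ships with R, so this check passes silently.
check_optional_package("stats")
```

A Zarr reader or writer could call such a guard with `"Rarr"` before touching a store, so that users without the package get an actionable error instead of a failed namespace lookup.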
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,111 @@ | ||
| #' create_zarr_group | ||
| #' | ||
| #' Create a zarr group | ||
| #' | ||
| #' @param store the location of (zarr) store | ||
| #' @param name name of the group | ||
| #' @param version zarr version | ||
| #' | ||
| #' @return `NULL` | ||
| #' | ||
| #' @noRd | ||
| #' | ||
| #' @examples | ||
| #' store <- tempfile(fileext = ".zarr") | ||
| #' create_zarr(store) | ||
| #' create_zarr_group(store, "gp") | ||
| create_zarr_group <- function(store, name, version = "v2") { | ||
| split_name <- strsplit(name, split = "/", fixed = TRUE)[[1]] | ||
| if (length(split_name) > 1) { | ||
| split_name <- vapply( | ||
| seq_along(split_name), | ||
| function(x) paste(split_name[seq_len(x)], collapse = "/"), | ||
| FUN.VALUE = character(1) | ||
| ) | ||
| split_name <- rev(tail(split_name, 2)) | ||
| if (!dir.exists(file.path(store, split_name[2]))) { | ||
| create_zarr_group(store = store, name = split_name[2]) | ||
| } | ||
| } | ||
| dir.create(file.path(store, split_name[1]), showWarnings = FALSE) | ||
| switch( | ||
| version, | ||
| v2 = { | ||
| write( | ||
| "{\"zarr_format\":2}", | ||
| file = file.path(store, split_name[1], ".zgroup") | ||
| ) | ||
| }, | ||
| v3 = { | ||
| cli_abort("Currently only zarr v2 is supported!") | ||
| }, | ||
| cli_abort("Only zarr v2 is supported. Use version = 'v2'") | ||
| ) | ||
| } | ||
|
|
||
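The `vapply()` step in `create_zarr_group()` turns a nested group name into its cumulative path prefixes, and the function then keeps only the last two: the group to create and its immediate parent. The sketch below isolates that step in base R so the intermediate values can be inspected; `expand_group_prefixes()` is a hypothetical name used only for this illustration, and `"obsm/pca/loadings"` is an arbitrary example path.

```r
# Hypothetical helper (not part of the PR): expand "a/b/c" into the
# cumulative prefixes "a", "a/b", "a/b/c", mirroring the vapply() call
# in create_zarr_group().
expand_group_prefixes <- function(name) {
  parts <- strsplit(name, split = "/", fixed = TRUE)[[1]]
  vapply(
    seq_along(parts),
    function(i) paste(parts[seq_len(i)], collapse = "/"),
    FUN.VALUE = character(1)
  )
}

prefixes <- expand_group_prefixes("obsm/pca/loadings")

# Deepest prefix first: element 1 is the group to create,
# element 2 is the parent that must already exist.
target_and_parent <- rev(utils::tail(prefixes, 2))
print(target_and_parent)
```

When the parent directory is missing, the function in the diff recurses on the parent, so each level of the hierarchy ends up being created in turn.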
| #' create_zarr | ||
| #' | ||
| #' Create zarr store | ||
| #' | ||
| #' @param store the location of zarr store | ||
| #' @param version zarr version | ||
| #' | ||
| #' @return `NULL` | ||
| #' | ||
| #' @noRd | ||
| #' | ||
| #' @examples | ||
| #' store <- tempfile(fileext = ".zarr") | ||
| #' create_zarr(store) | ||
| create_zarr <- function(store, version = "v2") { | ||
| prefix <- basename(store) | ||
| dir <- gsub(paste0(prefix, "$"), "", store) | ||
| create_zarr_group(store = dir, name = prefix, version = version) | ||
| } | ||
|
|
||
| #' is_zarr_empty | ||
| #' | ||
| #' check if a zarr store is empty or not. | ||
| #' | ||
| #' @param store the location of zarr store | ||
| #' | ||
| #' @return returns TRUE if zarr store is empty | ||
| #' | ||
| #' @noRd | ||
| #' | ||
| #' @examples | ||
| #' store <- tempfile(fileext = ".zarr") | ||
| #' create_zarr(store) | ||
| #' is_zarr_empty(store) | ||
| is_zarr_empty <- function(store) { | ||
| files <- list.files(store, recursive = FALSE, full.names = FALSE) | ||
| all(files %in% c(".zarray", ".zattrs", ".zgroup")) | ||
| } | ||
|
|
||
| #' Zarr path exists | ||
| #' | ||
| #' Check that a path in Zarr exists | ||
| #' | ||
| #' @return Whether the `target_path` exists in `store` | ||
| #' @noRd | ||
| #' | ||
| #' @param store Path to a Zarr store | ||
| #' @param target_path The path within the store to test for | ||
| zarr_path_exists <- function(store, target_path) { | ||
| zarr <- file.path(store, target_path) | ||
| if (!dir.exists(zarr)) { | ||
| FALSE | ||
| } else { | ||
| list_files <- list.files( | ||
| path = zarr, | ||
| full.names = FALSE, | ||
| recursive = FALSE, | ||
| all.files = TRUE | ||
| ) | ||
| if (any(c(".zarray", ".zattrs", ".zgroup", "zarr.json") %in% list_files)) { | ||
| TRUE | ||
| } else { | ||
| FALSE | ||
| } | ||
| } | ||
| } | ||
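The helpers in this file rely on Zarr metadata files whose names start with a dot. The sketch below uses base R only, on a throwaway temporary store, to show how `list.files()` treats such files with and without `all.files = TRUE`, which is the one argument that differs between the listings in `is_zarr_empty()` and `zarr_path_exists()`. The variable names are illustrative and not part of the PR.

```r
# Build a minimal Zarr v2 group by hand: a directory plus a .zgroup file.
store <- tempfile(fileext = ".zarr")
dir.create(store)
writeLines("{\"zarr_format\":2}", file.path(store, ".zgroup"))

# Default listing: names starting with a dot are treated as hidden.
visible <- list.files(store, recursive = FALSE, full.names = FALSE)

# With all.files = TRUE the dotfiles (and "." / "..") are included.
everything <- list.files(
  store,
  recursive = FALSE,
  full.names = FALSE,
  all.files = TRUE
)

# Same marker set that zarr_path_exists() tests for.
markers <- c(".zarray", ".zattrs", ".zgroup", "zarr.json")
has_marker <- any(markers %in% everything)

print(visible)
print(has_marker)

# Clean up the temporary store.
unlink(store, recursive = TRUE)
```

On this reading, a store that holds only dot-prefixed metadata produces an empty default listing, which is consistent with `is_zarr_empty()` returning `TRUE` for a freshly created store.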