Skip to content
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
103 changes: 101 additions & 2 deletions src/tilefusion/core.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@

import gc
import json
import threading
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import cpu_count
from pathlib import Path
Expand Down Expand Up @@ -144,6 +145,10 @@ def __init__(
self._metadata = load_individual_tiffs_metadata(self.tiff_path)
else:
self._metadata = load_ome_tiff_metadata(self.tiff_path)
# Close the metadata handle immediately - we use thread-local handles
# for thread-safe concurrent reads instead of sharing this handle.
if "tiff_handle" in self._metadata:
self._metadata.pop("tiff_handle").close()
Comment on lines 146 to +152

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If load_ome_tiff_metadata raises an exception, the tiff_handle in the returned metadata dictionary will not be closed, leading to a resource leak. While load_ome_tiff_metadata has a try-except block to close the handle on error, if it returns successfully but an exception occurs before line 152 (e.g., during dictionary access on line 155), the handle will remain open. Consider wrapping this in a try-except block to ensure the handle is closed even if subsequent operations fail.

Copilot uses AI. Check for mistakes.

# Extract common properties
self.n_tiles = self._metadata["n_tiles"]
Expand Down Expand Up @@ -217,6 +222,96 @@ def __init__(
self.fused_ts = None
self.center = None

# Thread-local storage for TiffFile handles (thread-safe concurrent access)
self._thread_local = threading.local()
self._handles_lock = threading.Lock()
self._all_handles: List = [] # Track all handles for cleanup

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type annotation for _all_handles is incomplete. It should be List[tifffile.TiffFile] instead of just List to provide proper type hints. This would improve code clarity and enable better static type checking.

Copilot uses AI. Check for mistakes.

def close(self) -> None:
"""
Close any open file handles to release resources.

This should be called when finished using a TileFusion instance,
or use it as a context manager (``with TileFusion(...) as tf:``)
for automatic cleanup. Important for OME-TIFF inputs where file
handles are kept open for performance.
"""
# Close all thread-local handles
with self._handles_lock:
for handle in self._all_handles:
try:
handle.close()
except Exception:
pass # Best-effort cleanup

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The broad exception handler silently swallows all exceptions during cleanup. While this is acceptable for best-effort cleanup, it would be better to catch specific exceptions (e.g., AttributeError, RuntimeError) or at least log the suppressed exceptions for debugging purposes. Consider adding a debug log statement within the except block to aid troubleshooting.

Copilot uses AI. Check for mistakes.

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using bare except Exception in best-effort cleanup suppresses all errors, potentially hiding bugs. Consider catching more specific exceptions like OSError, ValueError, or AttributeError that might occur during file handle cleanup, or at minimum logging suppressed exceptions for debugging purposes.

Copilot uses AI. Check for mistakes.
self._all_handles.clear()

# Reset thread-local storage to prevent stale handle access.
# Setting individual attributes to None only affects the current thread;
# resetting the entire object ensures all threads get fresh handles.

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resetting thread-local storage by creating a new threading.local() object does not clear the attributes stored in other threads. Each thread maintains its own copy of thread-local data, and creating a new local() object in one thread only affects that thread. Other threads will still have references to closed handles in their thread-local storage. Consider documenting this behavior or implementing a more robust cleanup mechanism.

Suggested change
# Reset thread-local storage to prevent stale handle access.
# Setting individual attributes to None only affects the current thread;
# resetting the entire object ensures all threads get fresh handles.
# Reset thread-local storage for the *current* thread to prevent stale
# handle access in this thread after close(). Note that each thread
# has its own view of a threading.local() instance; this does *not*
# clear attributes in other threads. Safety therefore relies on the
# contract documented above: close() must only be called once all
# worker threads have finished using this TileFusion instance.

Copilot uses AI. Check for mistakes.
self._thread_local = threading.local()
Comment on lines +231 to +280

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After resetting _thread_local to a new threading.local() object in the close() method, any thread that previously had a handle and tries to access it will get a fresh threading.local() instance without the tiff_handle attribute. This is correct behavior - the thread will create a new handle on next access. However, this creates a subtle issue: if a thread is still actively reading when close() is called, that thread's handle gets closed while it may be in use, potentially causing errors. Consider adding a note in the docstring warning that close() should only be called when all operations are complete.

Copilot uses AI. Check for mistakes.

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims this "prevents stale handle access," but creating a new threading.local() object only affects the current thread's view. Other threads that previously accessed self._thread_local.tiff_handle still have their closed handles cached. When those threads call _get_thread_local_handle() later, the check hasattr(self._thread_local, "tiff_handle") will return True (since they still have the attribute in their thread-local namespace), but the handle is now closed. Consider either: 1) tracking thread IDs and clearing their attributes, 2) checking if handles are closed in _get_thread_local_handle(), or 3) removing this line since the warning already states operations during close() will fail.

Suggested change
# Reset thread-local storage to prevent stale handle access.
# Setting individual attributes to None only affects the current thread;
# resetting the entire object ensures all threads get fresh handles.
self._thread_local = threading.local()
# Clear this thread's cached handle; other threads must not use this
# TileFusion instance after close(), as documented in the warning above.
if hasattr(self._thread_local, "tiff_handle"):
del self._thread_local.tiff_handle

Copilot uses AI. Check for mistakes.

def __enter__(self) -> "TileFusion":
"""Enter the runtime context."""
return self

def __exit__(self, exc_type, exc_val, exc_tb) -> None:
"""Exit the runtime context and close file handles."""
self.close()

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context manager's exit method does not suppress exceptions (return value is None). This means that if an exception occurs within the context and close() also raises an exception, the original exception will be masked by the close() exception. Consider wrapping the close() call in a try-except block to ensure the original exception is preserved.

Suggested change
def __exit__(self, exc_type, exc_val, exc_tb) -> None:
"""Exit the runtime context and close file handles."""
self.close()
def __exit__(self, exc_type, exc_val, exc_tb) -> bool:
"""
Exit the runtime context and close file handles.
If an exception occurred inside the context, any exception raised by
close() is suppressed so that the original exception is not masked.
"""
if exc_type is not None:
# An exception is already in progress; do best-effort cleanup
# without masking the original error.
try:
self.close()
except Exception:
# Suppress cleanup errors to preserve the original exception.
return False
# Do not suppress the original exception.
return False
# No exception from the with-block; let close() errors propagate.
self.close()
# Always return False to avoid suppressing exceptions.
return False

Copilot uses AI. Check for mistakes.

def __del__(self):
"""
Destructor to ensure file handles are closed.

Note: This is a fallback safety net only. Python does not guarantee
when (or if) __del__ is called. Always prefer using the context
manager protocol (``with TileFusion(...) as tf:``) or explicitly
calling ``close()`` for reliable resource cleanup.
"""
try:
self.close()
except AttributeError:
pass # Object may be partially initialized

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The __del__ destructor catches AttributeError for partially initialized objects, but the close() method may raise other exceptions during cleanup (e.g., if threading.Lock() operations fail). Consider catching a broader exception type or being more specific about what can fail during partial initialization. The comment mentions "Object may be partially initialized" but doesn't explain which attributes might be missing.

Suggested change
self.close()
except AttributeError:
pass # Object may be partially initialized
# During partial initialization or interpreter shutdown, attributes
# such as ``_handles_lock``, ``_all_handles`` or ``_thread_local``
# may be missing or in an invalid state, and lock operations inside
# ``close()`` may fail. Never let such errors escape from __del__.
self.close()
except Exception:
# Best-effort cleanup; ignore all errors in the destructor.
pass

Copilot uses AI. Check for mistakes.

def _get_thread_local_handle(self):

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _get_thread_local_handle() method lacks a return type annotation. According to the docstring, it returns tifffile.TiffFile or None, which should be annotated as -> Optional[tifffile.TiffFile]. This would improve type checking and code clarity. Note that you'll need to import tifffile at the module level or use a string literal for forward reference.

Copilot uses AI. Check for mistakes.
"""
Get or create a thread-local TiffFile handle for the current thread.

Each thread gets its own file handle to ensure thread-safe concurrent
reads. This avoids race conditions that can occur when multiple threads
share a single file descriptor (seek + read is not atomic on Windows).

Returns
-------
tifffile.TiffFile or None
Thread-local handle for OME-TIFF files, None for other formats.
"""
# Only applies to OME-TIFF format (not zarr, individual tiffs, etc.)
if (
self._is_zarr_format
or self._is_individual_tiffs_format
or self._is_ome_tiff_tiles_format
):
return None

# Check if this thread already has a handle
if (
hasattr(self._thread_local, "tiff_handle")
and self._thread_local.tiff_handle is not None
):
return self._thread_local.tiff_handle

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After close() is called, threads may still have closed TiffFile handles in their thread-local storage. The check only verifies the handle is not None, but doesn't check if the handle is closed. If a thread calls _read_tile() after close() but before the next handle recreation, it will pass a closed handle to read_ome_tiff_tile(), causing a cryptic error. Consider adding a check like and not handle.filehandle.closed or catching and recreating handles that have been closed.

Suggested change
if (
hasattr(self._thread_local, "tiff_handle")
and self._thread_local.tiff_handle is not None
):
return self._thread_local.tiff_handle
if hasattr(self._thread_local, "tiff_handle"):
handle = self._thread_local.tiff_handle
if handle is not None:
# Avoid reusing a closed TiffFile handle (e.g., after close()).
filehandle = getattr(handle, "filehandle", None)
if filehandle is not None and not getattr(filehandle, "closed", False):
return handle

Copilot uses AI. Check for mistakes.

# Create a new handle for this thread
import tifffile

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Importing tifffile inside the _get_thread_local_handle() method is unnecessary since it's already used elsewhere in the module (e.g., in the I/O functions). The import statement is executed every time a new thread-local handle is created, adding minor overhead. Consider moving this import to the module level along with the other imports, or removing it if tifffile.TiffFile is already accessible through the imported functions.

Copilot uses AI. Check for mistakes.

handle = tifffile.TiffFile(self.tiff_path)
self._thread_local.tiff_handle = handle

# Track for cleanup
with self._handles_lock:
self._all_handles.append(handle)

return handle

# -------------------------------------------------------------------------
# Properties
# -------------------------------------------------------------------------
Expand Down Expand Up @@ -319,7 +414,9 @@ def _read_tile(self, tile_idx: int, z_level: int = None, time_idx: int = 0) -> n
time_idx=time_idx,
)
else:
return read_ome_tiff_tile(self.tiff_path, tile_idx)
# Use thread-local handle for thread-safe concurrent reads
handle = self._get_thread_local_handle()
return read_ome_tiff_tile(self.tiff_path, tile_idx, handle)

def _read_tile_region(
self,
Expand Down Expand Up @@ -364,7 +461,9 @@ def _read_tile_region(
time_idx=time_idx,
)
else:
return read_ome_tiff_region(self.tiff_path, tile_idx, y_slice, x_slice)
# Use thread-local handle for thread-safe concurrent reads
handle = self._get_thread_local_handle()
return read_ome_tiff_region(self.tiff_path, tile_idx, y_slice, x_slice, handle)

# -------------------------------------------------------------------------
# Registration
Expand Down
89 changes: 70 additions & 19 deletions src/tilefusion/io/ome_tiff.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
"""

from pathlib import Path
from typing import Dict, List, Tuple, Any
from typing import Any, Dict, Optional
import xml.etree.ElementTree as ET

import numpy as np
Expand All @@ -31,8 +31,24 @@ def load_ome_tiff_metadata(tiff_path: Path) -> Dict[str, Any]:
- channels: int
- pixel_size: (py, px)
- tile_positions: list of (y, x) tuples
- tiff_handle: tifffile.TiffFile
Open TIFF file handle kept for fast repeated access. The caller is
responsible for closing this handle by calling ``tiff_handle.close()``
when it is no longer needed to avoid resource leaks. The handle
remains valid until it is explicitly closed, or until a higher-level
context manager (if used) closes it on your behalf.

.. versionchanged:: 0.2.0
Breaking change: Now returns an open ``tiff_handle`` that requires
explicit cleanup. Previously this function returned only data without
keeping file handles open. Callers using this function directly must
now call ``metadata['tiff_handle'].close()`` when done, or use
``TileFusion`` which handles cleanup automatically.

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation refers to version 0.2.0, but the current version in pyproject.toml is 0.1.0. Either this version annotation should be removed until the actual 0.2.0 release, or the version should be updated to 0.2.0 if this PR is intended to be that release.

Suggested change
.. versionchanged:: 0.2.0
Breaking change: Now returns an open ``tiff_handle`` that requires
explicit cleanup. Previously this function returned only data without
keeping file handles open. Callers using this function directly must
now call ``metadata['tiff_handle'].close()`` when done, or use
``TileFusion`` which handles cleanup automatically.
Note
----
This function returns an open ``tiff_handle`` that requires explicit
cleanup. Previously this function returned only data without keeping
file handles open. Callers using this function directly must call
``metadata['tiff_handle'].close()`` when done, or use ``TileFusion``
which handles cleanup automatically.

Copilot uses AI. Check for mistakes.
"""
with tifffile.TiffFile(tiff_path) as tif:
# Keep file handle open for fast repeated access
tif = tifffile.TiffFile(tiff_path)
Comment on lines +49 to +50

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TiffFile handle is opened and stored in the returned metadata dictionary, but there's no corresponding mechanism to ensure it gets closed. This can lead to resource leaks if the caller forgets to close it or if an exception occurs before cleanup.

Consider implementing one of these solutions:

  1. Add a context manager protocol to the TileFusion class (implement enter and exit methods)
  2. Add a close() method to TileFusion that closes the tiff_handle if present
  3. Implement del in TileFusion to close the handle on garbage collection (though this is less reliable)

The documentation mentions that "the caller is responsible for closing this handle" but without a clear mechanism in the TileFusion class, this resource will likely leak in practice.

Copilot uses AI. Check for mistakes.

try:
if not tif.ome_metadata:
raise ValueError("TIFF file does not contain OME metadata")

Expand Down Expand Up @@ -66,19 +82,25 @@ def load_ome_tiff_metadata(tiff_path: Path) -> Dict[str, Any]:
else:
tile_positions.append((0.0, 0.0))

return {
"n_tiles": n_tiles,
"n_series": n_series,
"shape": (Y, X),
"channels": channels,
"time_dim": time_dim,
"position_dim": position_dim,
"pixel_size": pixel_size,
"tile_positions": tile_positions,
}


def read_ome_tiff_tile(tiff_path: Path, tile_idx: int) -> np.ndarray:
return {
"n_tiles": n_tiles,
"n_series": n_series,
"shape": (Y, X),
"channels": channels,
"time_dim": time_dim,
"position_dim": position_dim,
"pixel_size": pixel_size,
"tile_positions": tile_positions,
"tiff_handle": tif,
}
Comment on lines +86 to +96

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking API change for load_ome_tiff_metadata. The function now returns an open file handle that must be explicitly closed by the caller, changing the contract from a pure data function to one that manages resources. While the change is well-documented and the TileFusion class handles cleanup properly, external callers who import and use this function directly will now be responsible for closing the handle or risk resource leaks. Consider mentioning this breaking change in release notes or migration guides.

Copilot uses AI. Check for mistakes.
except Exception:

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception handler catches all exceptions, which may hide unrelated errors and make them appear as if they were caused by invalid OME metadata. Consider catching only the specific exceptions related to metadata parsing (e.g., ValueError, ET.ParseError, AttributeError, KeyError) to allow genuine errors to propagate with their original context.

Suggested change
except Exception:
except (ValueError, ET.ParseError, AttributeError, KeyError):

Copilot uses AI. Check for mistakes.
tif.close()
raise
Comment on lines +97 to +102

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception handler catches all exceptions too broadly. If an exception occurs after the tiff_handle has been stored in the return dictionary (line 87), the handle will be closed but the partially constructed dictionary with the now-closed handle will not be returned. However, if an exception occurs before line 78 (the return statement starts), the file is properly closed. Consider whether this broad exception handling is needed or if more specific exceptions should be caught.

Copilot uses AI. Check for mistakes.
Comment on lines +97 to +102

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception handling in load_ome_tiff_metadata properly closes the file handle on errors, but this behavior is not tested. Consider adding a test case that triggers an exception (e.g., invalid OME metadata) and verifies the file handle is properly closed, preventing resource leaks.

Copilot uses AI. Check for mistakes.


def read_ome_tiff_tile(
tiff_path: Path, tile_idx: int, tiff_handle: Optional[tifffile.TiffFile] = None
) -> np.ndarray:
Comment on lines +105 to +107

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function signature uses 'tiff_handle: tifffile.TiffFile = None' which requires the tifffile type to be in scope. Consider using 'Optional[tifffile.TiffFile]' from typing for better clarity and type checking, or document the type annotation convention being used. This makes the optional nature of the parameter more explicit in the function signature.

Copilot uses AI. Check for mistakes.
"""
Read a single tile from OME-TIFF (all channels).

Expand All @@ -88,14 +110,28 @@ def read_ome_tiff_tile(tiff_path: Path, tile_idx: int) -> np.ndarray:
Path to the OME-TIFF file.
tile_idx : int
Index of the tile to read.
tiff_handle : TiffFile, optional
Cached TiffFile handle for faster access. For repeated reads,
keep the handle open and pass it here, or use TileFusion which
manages this automatically.

Returns
-------
arr : ndarray of shape (C, Y, X)
Tile data as float32.

Warning
-------
A single TiffFile handle is NOT thread-safe for concurrent reads.
On Windows, seek+read operations are not atomic, leading to data
corruption. Use separate handles per thread (TileFusion handles
this automatically via thread-local storage).
Comment on lines +127 to +132

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring states "Warning: A single TiffFile handle is NOT thread-safe for concurrent reads." This is accurate and helpful. However, it would be clearer to also mention that this function creates thread-local handles specifically to address this limitation, so callers don't need to worry about it when using TileFusion.

Consider adding: "TileFusion addresses this by maintaining thread-local handles automatically, so concurrent reads through TileFusion methods are safe."

Copilot uses AI. Check for mistakes.
"""
with tifffile.TiffFile(tiff_path) as tif:
arr = tif.series[tile_idx].asarray()
if tiff_handle is not None:
arr = tiff_handle.series[tile_idx].asarray()
else:
with tifffile.TiffFile(tiff_path) as tif:
arr = tif.series[tile_idx].asarray()
Comment on lines +140 to +144

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cached file handle is accessed using dict.get() which returns None by default. While this is handled correctly by the conditional checks in the read functions, this pattern means users who call these functions directly without the cached handle won't benefit from the performance optimization. Consider adding a note in the function documentation about best practices for repeated tile reads.

Copilot uses AI. Check for mistakes.
Comment on lines +140 to +144

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a tiff_handle is provided, errors during asarray() do not clean up the handle. If the caller passes an open handle and an exception occurs, the caller may not know the state of the handle. Consider documenting that the caller is responsible for handle cleanup even if this function raises an exception.

Copilot uses AI. Check for mistakes.
if arr.ndim == 2:
arr = arr[np.newaxis, :, :]
# Flip along Y axis to correct orientation
Expand All @@ -108,6 +144,7 @@ def read_ome_tiff_region(
tile_idx: int,
y_slice: slice,
x_slice: slice,
tiff_handle: Optional[tifffile.TiffFile] = None,
) -> np.ndarray:
"""
Read a region of a tile from OME-TIFF.
Expand All @@ -120,14 +157,28 @@ def read_ome_tiff_region(
Index of the tile.
y_slice, x_slice : slice
Region to read.
tiff_handle : TiffFile, optional
Cached TiffFile handle for faster access. For repeated reads,
keep the handle open and pass it here, or use TileFusion which
manages this automatically.

Returns
-------
arr : ndarray of shape (C, h, w)
Tile region as float32.

Warning
-------
A single TiffFile handle is NOT thread-safe for concurrent reads.
On Windows, seek+read operations are not atomic, leading to data
corruption. Use separate handles per thread (TileFusion handles
this automatically via thread-local storage).
"""
with tifffile.TiffFile(tiff_path) as tif:
arr = tif.series[tile_idx].asarray()
if tiff_handle is not None:
arr = tiff_handle.series[tile_idx].asarray()
else:
with tifffile.TiffFile(tiff_path) as tif:
arr = tif.series[tile_idx].asarray()
Comment on lines +193 to +197

Copilot AI Dec 29, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a tiff_handle is provided, errors during asarray() do not clean up the handle. If the caller passes an open handle and an exception occurs, the caller may not know the state of the handle. Consider documenting that the caller is responsible for handle cleanup even if this function raises an exception.

Copilot uses AI. Check for mistakes.
if arr.ndim == 2:
arr = arr[np.newaxis, :, :]
# Flip along Y axis to correct orientation
Expand Down
Loading