
Complement-targeted regions fail on Windows + Python 3.11 (pyranges/sorted_nearest dtype bug) #290

@jeromekelleher

Description

On Windows with Python 3.11, using complement=True regions/targets raises:

ValueError: Buffer dtype mismatch, expected 'const long' but got 'long long'

The error is raised in sorted_nearest/src/clusters.pyx:12, reached via pyranges.subtract → pyranges.merge → find_clusters.

Root cause

On Windows (LLP64), the C long type is 32-bit while numpy's default integer is 64-bit. The sorted_nearest Cython extension used by pyranges declares its buffer arguments as const long, and pyranges passes both int32 and int64 arrays between its Python-level steps, so on Windows a 64-bit array reaches a buffer typed as 32-bit long and the dtype check fails. This is an unresolved upstream bug (pyranges#83, open since 2019). Casting the Start/End columns to int32 before building the PyRanges does not help, because pyranges' internal merge() step upcasts them back to int64.
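The width mismatch can be checked directly on any machine. The following is a minimal sketch using only the standard library; the helper names `c_long_bytes` and `int64_fits_const_long` are hypothetical and are not part of vcztools, pyranges or sorted_nearest. It only compares type widths, which approximates the buffer check that fails, and does not call the Cython extension itself.

```python
import ctypes

# An int64 array element is always 8 bytes, on every platform.
INT64_BYTES = 8


def c_long_bytes() -> int:
    """Return the size in bytes of the platform's C long."""
    return ctypes.sizeof(ctypes.c_long)


def int64_fits_const_long() -> bool:
    """True when C long has the same width as int64.

    Expected to be True on 64-bit Linux/macOS (LP64) and
    False on Windows (LLP64), where C long is 4 bytes.
    """
    return c_long_bytes() == INT64_BYTES


if __name__ == "__main__":
    print(f"C long: {c_long_bytes() * 8} bits")
    print(f"int64 matches 'const long': {int64_fits_const_long()}")
```

On a platform where the second line prints False, an int64 array has the width of C long long rather than C long, which is the mismatch named in the error message.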

Scope

This only affects the complement=True code path (which calls .subtract()). The non-complement .overlap() path is unaffected. It is only triggered on Python < 3.12: on Python ≥ 3.12 vcztools uses ruranges_py instead of pyranges, so the bug is not reached.
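The conditions above can be expressed as a small predicate, for example to skip or xfail the relevant tests. This is a sketch only: `complement_bug_applies` is a hypothetical helper, not an existing vcztools function, and it assumes the affected configuration is exactly "Windows and Python older than 3.12" as described in this issue.

```python
import sys


def complement_bug_applies(
    platform: str = sys.platform,
    version: tuple = tuple(sys.version_info[:2]),
) -> bool:
    """Return True if complement=True is expected to hit the dtype bug.

    Assumes the bug needs both conditions from this issue:
    Windows (sys.platform == "win32", where C long is 32-bit) and
    Python < 3.12 (where pyranges is used instead of ruranges_py).
    """
    return platform == "win32" and version < (3, 12)


if __name__ == "__main__":
    print("complement bug expected here:", complement_bug_applies())
```

Passing the platform and version explicitly, for example `complement_bug_applies("win32", (3, 11))`, makes it possible to exercise the predicate on any machine.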

Workaround / recommendation

Use Python ≥ 3.12 on Windows. On Python ≥ 3.12, vcztools depends on ruranges_py, a Rust implementation that does not exhibit this bug. Python 3.11 on Linux and macOS is unaffected because C long is 64-bit there.
