5 changes: 3 additions & 2 deletions c/src/neighbors/brute_force.cpp
@@ -1,6 +1,6 @@

/*
* SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

@@ -10,6 +10,7 @@

#include <raft/core/error.hpp>
#include <raft/core/mdspan_types.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/resources.hpp>
#include <raft/core/serialize.hpp>

@@ -240,7 +241,7 @@ extern "C" cuvsError_t cuvsBruteForceDeserialize(cuvsResources_t res,
if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
char dtype_string[4];
is.read(dtype_string, 4);
auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
Comment on lines 242 to +244

⚠️ Potential issue | 🟠 Major

Guard against truncated header reads before parsing dtype.

Line 243 reads 4 bytes but never checks success. On short/corrupt files, Line 244 may parse uninitialized bytes.

💡 Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4]{};
+    if (!is.read(dtype_string, sizeof(dtype_string))) {
+      RAFT_FAIL("Invalid or truncated index header in file %s", filename);
+    }
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/brute_force.cpp` around lines 242 - 244, The code reads 4
bytes into dtype_string using is.read and immediately calls
raft::numpy_serializer::parse_descr on those bytes; however it doesn't check
whether the read succeeded. Update the block around dtype_string/is.read to
verify the stream read (e.g., check is.gcount() == 4 or is.fail()/is.good())
before calling raft::numpy_serializer::parse_descr, and on failure handle the
truncated header by returning an error/throwing an exception or logging and
aborting the parse so parse_descr never receives uninitialized data.
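
The short-read behavior described above can be reproduced with the standard library alone. The following is a minimal sketch, not the cuVS or RAFT implementation: the helper name `read_dtype_prefix` is hypothetical, and it reports failure through `std::optional` instead of `RAFT_FAIL` so that it stays self-contained.

```cpp
#include <array>
#include <cstddef>
#include <istream>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper (not a RAFT/cuVS API): returns the 4-byte dtype prefix,
// or std::nullopt if the stream holds fewer than 4 bytes.
std::optional<std::string> read_dtype_prefix(std::istream& is)
{
  std::array<char, 4> buf{};  // value-initialized, so no byte is ever indeterminate
  is.read(buf.data(), static_cast<std::streamsize>(buf.size()));
  // gcount() reports how many bytes the last unformatted read actually extracted.
  if (is.gcount() != static_cast<std::streamsize>(buf.size())) { return std::nullopt; }
  return std::string(buf.data(), buf.size());
}
```

Feeding it a `std::istringstream` that holds only `"<f"` yields `std::nullopt`, while a stream starting with the four bytes `<`, `f`, `4`, `\0` yields those four bytes. In the real code the `nullopt` branch would correspond to the `RAFT_FAIL` / `RAFT_EXPECTS` path in the proposed fix.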


index->dtype.bits = dtype.itemsize * 8;
if (dtype.kind == 'f' && dtype.itemsize == 4) {
3 changes: 2 additions & 1 deletion c/src/neighbors/cagra.cpp
@@ -9,6 +9,7 @@

#include <raft/core/error.hpp>
#include <raft/core/mdspan_types.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/resources.hpp>
#include <raft/core/serialize.hpp>

@@ -875,7 +876,7 @@ extern "C" cuvsError_t cuvsCagraDeserialize(cuvsResources_t res,
if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
char dtype_string[4];
is.read(dtype_string, 4);
auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));

Comment on lines 877 to 880

⚠️ Potential issue | 🟠 Major

Guard the dtype prefix read before parse_descr.

Line 878 reads 4 bytes but does not validate read length before parsing on Line 879. Corrupt/truncated files can produce invalid dtype decoding.

Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/cagra.cpp` around lines 877 - 880, The code reads 4 bytes
into dtype_string then calls raft::numpy_serializer::parse_descr without
validating the read; first check the read succeeded (e.g., verify is.read(...)
and that is.gcount() == 4 or that the stream is in a good state) before
constructing std::string(dtype_string, 4) and calling
raft::numpy_serializer::parse_descr; if the read fails/returns fewer than 4
bytes, handle the error (throw, return an error code, or log and abort) instead
of passing potentially truncated data to parse_descr.

index->dtype.bits = dtype.itemsize * 8;
if (dtype.kind == 'f' && dtype.itemsize == 4) {
5 changes: 3 additions & 2 deletions c/src/neighbors/ivf_flat.cpp
@@ -1,6 +1,6 @@

/*
* SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

@@ -9,6 +9,7 @@

#include <raft/core/error.hpp>
#include <raft/core/mdspan_types.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/resources.hpp>
#include <raft/core/serialize.hpp>
#include <raft/util/cudart_utils.hpp>
@@ -301,7 +302,7 @@ extern "C" cuvsError_t cuvsIvfFlatDeserialize(cuvsResources_t res,
if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
char dtype_string[4];
is.read(dtype_string, 4);
auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));

Comment on lines 303 to 306

⚠️ Potential issue | 🟠 Major

Fail early on short dtype-header reads.

Line 304 reads the dtype prefix without checking byte count before parsing on Line 305. This should be validated to handle malformed files safely.

Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/ivf_flat.cpp` around lines 303 - 306, The code reads four
bytes into dtype_string then calls raft::numpy_serializer::parse_descr without
verifying the read succeeded; validate the read length immediately after
is.read(dtype_string, 4) (e.g., check is.gcount() == 4 or test stream state like
if (!is || is.gcount() != 4)) and on short reads/failure throw or return a clear
error (or set an error status) before calling
raft::numpy_serializer::parse_descr, so malformed or truncated files are handled
safely.

index->dtype.bits = dtype.itemsize * 8;
if (dtype.kind == 'f' && dtype.itemsize == 4) {
7 changes: 4 additions & 3 deletions c/src/neighbors/mg_cagra.cpp
@@ -1,5 +1,5 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

@@ -10,6 +10,7 @@
#include <cuvs/neighbors/common.hpp>
#include <dlpack/dlpack.h>
#include <raft/core/error.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/serialize.hpp>

#include "../core/exceptions.hpp"
@@ -401,7 +402,7 @@ extern "C" cuvsError_t cuvsMultiGpuCagraDeserialize(cuvsResources_t res,
if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
char dtype_string[4];
is.read(dtype_string, 4);
auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
is.close();
Comment on lines 403 to 406

⚠️ Potential issue | 🟠 Major

Validate dtype header reads before parsing.

On Line 404 and Line 435, the 4-byte read is not validated. Truncated files can pass partial/garbage bytes into parse_descr, causing invalid dtype dispatch.

Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
@@
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));

Also applies to: 434-437

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/mg_cagra.cpp` around lines 403 - 406, The 4-byte dtype header
read into dtype_string before calling raft::numpy_serializer::parse_descr is not
validated; check the istream read result (e.g., is.read(...) and then
is.gcount() == 4 or !is.fail()) and handle short/truncated reads by
logging/throwing/returning an error instead of calling parse_descr with partial
data; apply the same validation to the other occurrence that reads 4 bytes so
neither is.read -> parse_descr path can receive garbage.


index->dtype.bits = dtype.itemsize * 8;
@@ -432,7 +433,7 @@ extern "C" cuvsError_t cuvsMultiGpuCagraDistribute(cuvsResources_t res,
if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
char dtype_string[4];
is.read(dtype_string, 4);
auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
is.close();

index->dtype.bits = dtype.itemsize * 8;
7 changes: 4 additions & 3 deletions c/src/neighbors/mg_ivf_flat.cpp
@@ -1,5 +1,5 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

@@ -10,6 +10,7 @@
#include <cuvs/neighbors/ivf_flat.hpp>
#include <dlpack/dlpack.h>
#include <raft/core/error.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/serialize.hpp>

#include "../core/exceptions.hpp"
@@ -398,7 +399,7 @@ extern "C" cuvsError_t cuvsMultiGpuIvfFlatDeserialize(cuvsResources_t res,
if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
char dtype_string[4];
is.read(dtype_string, 4);
auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
is.close();
Comment on lines 400 to 403

⚠️ Potential issue | 🟠 Major

Validate header read length before calling parse_descr in both paths.

Both blocks parse the dtype descriptor immediately after read(...) without checking stream state. Truncated files can lead to undefined behavior due to partially initialized buffers.

💡 Proposed fix (apply in both deserialize and distribute)
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4]{};
+    if (!is.read(dtype_string, sizeof(dtype_string))) {
+      RAFT_FAIL("Invalid or truncated index header in file %s", filename);
+    }
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));

Also applies to: 431-434

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/mg_ivf_flat.cpp` around lines 400 - 403, The code reads 4
bytes into dtype_string and calls raft::numpy_serializer::parse_descr without
validating the read; update both the deserialize and distribute code paths to
check is.read(...) and the stream state (e.g., verify gcount() == 4 or
is.good()/is) before constructing std::string(dtype_string,4) and calling
parse_descr, and handle truncated reads by returning/throwing an error or
reporting via existing error path; reference the dtype_string buffer, the
is.read call, and the parse_descr invocation so you change both occurrences
(around the calls near deserialize and distribute).


index->dtype.bits = dtype.itemsize * 8;
@@ -429,7 +430,7 @@ extern "C" cuvsError_t cuvsMultiGpuIvfFlatDistribute(cuvsResources_t res,
if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
char dtype_string[4];
is.read(dtype_string, 4);
auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
is.close();

index->dtype.bits = dtype.itemsize * 8;
5 changes: 3 additions & 2 deletions c/src/neighbors/mg_ivf_pq.cpp
@@ -1,5 +1,5 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

@@ -10,6 +10,7 @@
#include <cuvs/neighbors/ivf_pq.hpp>
#include <dlpack/dlpack.h>
#include <raft/core/error.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/serialize.hpp>

#include "../core/exceptions.hpp"
@@ -390,7 +391,7 @@ extern "C" cuvsError_t cuvsMultiGpuIvfPqDeserialize(cuvsResources_t res,
if (!is) { RAFT_FAIL("Cannot open file %s", filename); }
char dtype_string[4];
is.read(dtype_string, 4);
auto dtype = raft::detail::numpy_serializer::parse_descr(std::string(dtype_string, 4));
auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
is.close();
Comment on lines 392 to 395

⚠️ Potential issue | 🟠 Major

Validate read() result before dtype parsing.

Line 393 does not verify that 4 bytes were actually read. A short read can propagate invalid bytes into parse_descr on Line 394.

Proposed fix
-    char dtype_string[4];
-    is.read(dtype_string, 4);
-    auto dtype = raft::numpy_serializer::parse_descr(std::string(dtype_string, 4));
+    char dtype_string[4] = {};
+    is.read(dtype_string, sizeof(dtype_string));
+    RAFT_EXPECTS(is.gcount() == static_cast<std::streamsize>(sizeof(dtype_string)),
+                 "Failed to read dtype header from %s",
+                 filename);
+    auto dtype =
+      raft::numpy_serializer::parse_descr(std::string(dtype_string, sizeof(dtype_string)));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/neighbors/mg_ivf_pq.cpp` around lines 392 - 395, The code calls is.read
into dtype_string and immediately hands those bytes to
raft::numpy_serializer::parse_descr without checking the read result; update the
logic around dtype_string/is.read to validate that 4 bytes were actually read
(e.g., check is.gcount() == 4 and/or is.good()/is.fail()) before calling
parse_descr, and handle the short-read error path (return an error, throw, or
log and exit) so parse_descr is never called with incomplete data; reference the
dtype_string buffer, the is.read call, and raft::numpy_serializer::parse_descr
when making the change and ensure is.close() still runs in all code paths.
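
On the "is.close() still runs in all code paths" point: if the short-read error is reported by throwing, a `std::ifstream` with automatic storage is closed by its destructor during stack unwinding, so no explicit `close()` is needed on the error path. A minimal sketch under that assumption follows; the helper name `read_dtype_prefix_from_file` is hypothetical, and `std::runtime_error` stands in for whatever exception the project's error macros raise.

```cpp
#include <array>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a RAFT/cuVS API): open `path`, read the 4-byte
// dtype prefix, and throw on any failure.
std::string read_dtype_prefix_from_file(const std::string& path)
{
  std::ifstream is(path, std::ios::in | std::ios::binary);
  if (!is) { throw std::runtime_error("Cannot open file " + path); }

  std::array<char, 4> buf{};
  // read() sets failbit when fewer than 4 bytes are available, making the stream falsy.
  if (!is.read(buf.data(), static_cast<std::streamsize>(buf.size()))) {
    throw std::runtime_error("Invalid or truncated index header in file " + path);
  }
  return std::string(buf.data(), buf.size());
}  // `is` is closed by its destructor on both the normal and the throwing path
```

Factoring the read into one helper like this would also let the deserialize and distribute entry points share a single validated path instead of repeating the three-line pattern.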


index->dtype.bits = dtype.itemsize * 8;
6 changes: 3 additions & 3 deletions cpp/cmake/thirdparty/get_raft.cmake
@@ -1,13 +1,13 @@
# =============================================================================
# cmake-format: off
# SPDX-FileCopyrightText: Copyright (c) 2023-2025, NVIDIA CORPORATION.
# SPDX-FileCopyrightText: Copyright (c) 2023-2026, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0
# cmake-format: on

# Use RAPIDS_VERSION_MAJOR_MINOR from rapids_config.cmake
set(RAFT_VERSION "${RAPIDS_VERSION_MAJOR_MINOR}")
set(RAFT_FORK "rapidsai")
set(RAFT_PINNED_TAG "${rapids-cmake-checkout-tag}")
set(RAFT_FORK "julianmi")
set(RAFT_PINNED_TAG "expose-public-npy-helpers")
Comment on lines +9 to +10

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Checking upstream dependency status (rapidsai/raft PR #3003)..."
curl -s https://api.github.com/repos/rapidsai/raft/pulls/3003 \
  | jq '{state, merged, merge_commit_sha, head_repo: .head.repo.full_name, head_ref: .head.ref, base_ref: .base.ref}'

echo
echo "Checking whether expose-public-npy-helpers exists in official and fork repos..."
git ls-remote --heads https://github.com/rapidsai/raft.git expose-public-npy-helpers || true
git ls-remote --tags  https://github.com/rapidsai/raft.git expose-public-npy-helpers || true
git ls-remote --heads https://github.com/julianmi/raft.git expose-public-npy-helpers || true
git ls-remote --tags  https://github.com/julianmi/raft.git expose-public-npy-helpers || true

Repository: rapidsai/cuvs

Length of output: 407


🏁 Script executed:

cat -n cpp/cmake/thirdparty/get_raft.cmake

Repository: rapidsai/cuvs

Length of output: 3417


🏁 Script executed:

# Search for any comments or related discussions about this fork usage
rg "expose-public-npy-helpers|RAFT_FORK|julianmi" --type cmake --type markdown -C 3

Repository: rapidsai/cuvs

Length of output: 9046


🏁 Script executed:

# Check git history for when this fork was introduced and any commit messages
git log --oneline -20 -- cpp/cmake/thirdparty/get_raft.cmake

Repository: rapidsai/cuvs

Length of output: 109


🏁 Script executed:

# Search for any open or recent PRs that might explain the fork usage
git log --all --oneline --grep="raft\|RAFT" | head -30

Repository: rapidsai/cuvs

Length of output: 1759


Avoid hard-pinning production builds to a personal RAFT fork/ref.

This introduces a supply-chain risk for CI and releases if that fork/ref is rewritten or removed. The expose-public-npy-helpers branch currently exists in julianmi/raft, but pinning to a personal fork without a documented integration path or timeline creates long-term dependency fragility. Prefer official rapidsai/raft by default, and gate fork/ref overrides behind explicit opt-in CMake cache vars for temporary cross-PR testing.

Suggested hardening
-set(RAFT_FORK "julianmi")
-set(RAFT_PINNED_TAG "expose-public-npy-helpers")
+set(RAFT_FORK "rapidsai")
+set(RAFT_PINNED_TAG "${rapids-cmake-checkout-tag}")
+
+set(CUVS_RAFT_FORK_OVERRIDE "" CACHE STRING "Optional RAFT fork override for temporary testing")
+set(CUVS_RAFT_TAG_OVERRIDE  "" CACHE STRING "Optional RAFT tag/branch override for temporary testing")
+
+if(CUVS_RAFT_FORK_OVERRIDE)
+  set(RAFT_FORK "${CUVS_RAFT_FORK_OVERRIDE}")
+endif()
+if(CUVS_RAFT_TAG_OVERRIDE)
+  set(RAFT_PINNED_TAG "${CUVS_RAFT_TAG_OVERRIDE}")
+endif()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/cmake/thirdparty/get_raft.cmake` around lines 9 - 10, The CMake variables
RAFT_FORK and RAFT_PINNED_TAG are hard-pinned to a personal fork/branch; restore
the upstream defaults (RAFT_FORK "rapidsai" and RAFT_PINNED_TAG
"${rapids-cmake-checkout-tag}") and allow overrides only through explicit
opt-in CMake CACHE variables set via -D on the command line (e.g.,
CUVS_RAFT_FORK_OVERRIDE and CUVS_RAFT_TAG_OVERRIDE), so CI/releases use the
official repo by default and personal-fork overrides are explicit, documented,
and temporary.


function(find_and_configure_raft)
set(oneValueArgs VERSION FORK PINNED_TAG BUILD_STATIC_DEPS ENABLE_NVTX ENABLE_MNMG_DEPENDENCIES CLONE_ON_PIN)
7 changes: 4 additions & 3 deletions cpp/include/cuvs/neighbors/cagra.hpp
@@ -16,6 +16,7 @@
#include <raft/core/host_mdspan.hpp>
#include <raft/core/mdspan.hpp>
#include <raft/core/mdspan_types.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/resource/stream_view.hpp>
#include <raft/core/serialize.hpp>

@@ -764,7 +765,7 @@ struct index : cuvs::neighbors::index {
if (lseek(fd.get(), 0, SEEK_SET) == -1) {
RAFT_FAIL("Failed to seek to beginning of dataset file");
}
auto header = raft::detail::numpy_serializer::read_header(stream);
auto header = raft::numpy_serializer::read_header(stream);
RAFT_EXPECTS(header.shape.size() == 2,
"Dataset file should be 2D, got %zu dimensions",
header.shape.size());
@@ -799,7 +800,7 @@ struct index : cuvs::neighbors::index {
if (lseek(fd.get(), 0, SEEK_SET) == -1) {
RAFT_FAIL("Failed to seek to beginning of graph file");
}
auto header = raft::detail::numpy_serializer::read_header(stream);
auto header = raft::numpy_serializer::read_header(stream);
RAFT_EXPECTS(
header.shape.size() == 2, "Graph file should be 2D, got %zu dimensions", header.shape.size());

@@ -840,7 +841,7 @@ struct index : cuvs::neighbors::index {
if (lseek(fd.get(), 0, SEEK_SET) == -1) {
RAFT_FAIL("Failed to seek to beginning of mapping file");
}
auto header = raft::detail::numpy_serializer::read_header(stream);
auto header = raft::numpy_serializer::read_header(stream);
RAFT_EXPECTS(header.shape.size() == 1,
"Mapping file should be 1D, got %zu dimensions",
header.shape.size());
9 changes: 5 additions & 4 deletions cpp/include/cuvs/util/file_io.hpp
@@ -5,6 +5,7 @@
#pragma once

#include <raft/core/error.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/serialize.hpp>

#include <algorithm>
@@ -187,12 +188,12 @@ std::pair<file_descriptor, size_t> create_numpy_file(const std::string& path,
file_descriptor fd(path, O_CREAT | O_RDWR | O_TRUNC, 0644);

// Build header
const auto dtype = raft::detail::numpy_serializer::get_numpy_dtype<T>();
const bool fortran_order = false;
const raft::detail::numpy_serializer::header_t header = {dtype, fortran_order, shape};
const auto dtype = raft::numpy_serializer::get_numpy_dtype<T>();
const bool fortran_order = false;
const raft::numpy_serializer::header_t header = {dtype, fortran_order, shape};

std::stringstream ss;
raft::detail::numpy_serializer::write_header(ss, header);
raft::numpy_serializer::write_header(ss, header);
std::string header_str = ss.str();
size_t header_size = header_str.size();

5 changes: 3 additions & 2 deletions cpp/src/neighbors/brute_force_serialize.cu
@@ -1,11 +1,12 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2024, NVIDIA CORPORATION.
* SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION.
* SPDX-License-Identifier: Apache-2.0
*/

#include <cuvs/neighbors/brute_force.hpp>
#include <raft/core/copy.cuh>
#include <raft/core/host_mdarray.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/resources.hpp>
#include <raft/core/serialize.hpp>

@@ -24,7 +25,7 @@ void serialize(raft::resources const& handle,
RAFT_LOG_DEBUG(
"Saving brute force index, size %zu, dim %u", static_cast<size_t>(index.size()), index.dim());

auto dtype_string = raft::detail::numpy_serializer::get_numpy_dtype<T>().to_string();
auto dtype_string = raft::numpy_serializer::get_numpy_dtype<T>().to_string();
dtype_string.resize(4);
os << dtype_string;
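
For context on why the C deserializers read exactly 4 bytes: the serializers above resize the dtype descriptor to 4 characters before writing it, so a 3-character descriptor is padded with one `'\0'`. The round trip can be sketched as follows. This is an illustration only: `dtype_info`, `write_dtype_prefix`, and `parse_prefix` are hypothetical stand-ins rather than the RAFT `numpy_serializer` API, and the sketch assumes a 32-bit float is described by the NumPy-style string `<f4`.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the parsed descriptor; field names mirror the diff.
struct dtype_info {
  char byteorder;
  char kind;
  unsigned itemsize;
};

// Write side: force the descriptor to exactly 4 bytes, as the serializers do.
void write_dtype_prefix(std::ostream& os, std::string descr)
{
  descr.resize(4);  // "<f4" becomes '<', 'f', '4', '\0'
  os << descr;      // operator<< writes all 4 characters, including the '\0'
}

// Read side: hypothetical parser for a NumPy-style "<byteorder><kind><itemsize>" string.
dtype_info parse_prefix(const std::string& prefix)
{
  if (prefix.size() != 4) { throw std::invalid_argument("dtype prefix must be 4 bytes"); }
  // Drop the '\0' padding added on the write side.
  const std::string descr = prefix.substr(0, prefix.find('\0'));
  if (descr.size() < 3) { throw std::invalid_argument("malformed dtype descriptor"); }
  dtype_info out{descr[0], descr[1], 0u};
  out.itemsize = static_cast<unsigned>(std::stoul(descr.substr(2)));
  return out;
}
```

With these definitions, writing `"<f4"` and parsing the resulting 4 bytes gives `kind == 'f'` and `itemsize == 4`, which matches the `dtype.bits = dtype.itemsize * 8` dispatch on the C side.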

5 changes: 3 additions & 2 deletions cpp/src/neighbors/detail/cagra/cagra_build.cuh
@@ -16,6 +16,7 @@
#include <raft/core/host_mdarray.hpp>
#include <raft/core/host_mdspan.hpp>
#include <raft/core/logger.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/resource/cuda_stream.hpp>
#include <raft/util/integer_utils.hpp>

@@ -726,14 +727,14 @@ void ace_load_partition_dataset_from_disk(
std::ifstream is(reordered_dataset_path, std::ios::in | std::ios::binary);
if (!is) { RAFT_FAIL("Cannot open file %s", reordered_dataset_path.c_str()); }
auto start_pos = is.tellg();
raft::detail::numpy_serializer::read_header(is);
raft::numpy_serializer::read_header(is);
core_header_size = static_cast<size_t>(is.tellg() - start_pos);
}
{
std::ifstream is(augmented_dataset_path, std::ios::in | std::ios::binary);
if (!is) { RAFT_FAIL("Cannot open file %s", augmented_dataset_path.c_str()); }
auto start_pos = is.tellg();
raft::detail::numpy_serializer::read_header(is);
raft::numpy_serializer::read_header(is);
augmented_header_size = static_cast<size_t>(is.tellg() - start_pos);
}
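
The two blocks above measure the header size as the stream position after `read_header` minus the starting position. For reference, the same offset can be derived from the fixed `.npy` preamble alone, following the published NPY format (6-byte magic, 2 version bytes, then a little-endian header length of 2 bytes in version 1.0 or 4 bytes in versions 2.0 and 3.0). The sketch below is an independent illustration with a hypothetical function name, not what `raft::numpy_serializer::read_header` does internally.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Sketch: return the byte offset at which array data starts in a .npy stream.
std::size_t npy_data_offset(std::istream& is)
{
  unsigned char pre[8];  // 6-byte magic + major version + minor version
  if (!is.read(reinterpret_cast<char*>(pre), sizeof(pre))) {
    throw std::runtime_error("truncated .npy preamble");
  }
  static const unsigned char magic[6] = {0x93, 'N', 'U', 'M', 'P', 'Y'};
  for (int i = 0; i < 6; ++i) {
    if (pre[i] != magic[i]) { throw std::runtime_error("bad .npy magic"); }
  }
  // Version 1.0 stores HEADER_LEN in 2 bytes; versions 2.0 and 3.0 use 4 bytes.
  const std::size_t len_bytes = (pre[6] == 1) ? 2 : 4;
  unsigned char len_buf[4] = {0, 0, 0, 0};
  if (!is.read(reinterpret_cast<char*>(len_buf), static_cast<std::streamsize>(len_bytes))) {
    throw std::runtime_error("truncated .npy header length");
  }
  std::size_t header_len = 0;
  for (std::size_t i = 0; i < len_bytes; ++i) {
    header_len |= static_cast<std::size_t>(len_buf[i]) << (8 * i);  // little-endian
  }
  return sizeof(pre) + len_bytes + header_len;
}
```

For a version 1.0 file whose header length field is 118, this returns 128. Using the library's `read_header` plus `tellg()`, as the diff does, has the advantage of also validating the header dictionary itself.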

3 changes: 2 additions & 1 deletion cpp/src/neighbors/detail/cagra/cagra_serialize.cuh
@@ -11,6 +11,7 @@
#include <raft/core/logger.hpp>
#include <raft/core/mdarray.hpp>
#include <raft/core/mdspan_types.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/resource/cuda_stream.hpp>
#include <raft/core/serialize.hpp>
#include <raft/util/cudart_utils.hpp>
@@ -54,7 +55,7 @@ void serialize(raft::resources const& res,
RAFT_LOG_DEBUG(
"Saving CAGRA index, size %zu, dim %u", static_cast<size_t>(index_.size()), index_.dim());

std::string dtype_string = raft::detail::numpy_serializer::get_numpy_dtype<T>().to_string();
std::string dtype_string = raft::numpy_serializer::get_numpy_dtype<T>().to_string();
dtype_string.resize(4);
os << dtype_string;

8 changes: 4 additions & 4 deletions cpp/src/neighbors/detail/hnsw.hpp
@@ -14,9 +14,9 @@
#include <cuvs/util/file_io.hpp>

#include <raft/core/copy.cuh>
#include <raft/core/detail/mdspan_numpy_serializer.hpp>
#include <raft/core/host_mdspan.hpp>
#include <raft/core/logger.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/pinned_mdarray.hpp>
#include <raft/util/cudart_utils.hpp>

@@ -399,7 +399,7 @@ void serialize_to_hnswlib_from_disk(raft::resources const& res,
std::ifstream graph_stream(graph_path, std::ios::binary);
RAFT_EXPECTS(graph_stream.good(), "Failed to open graph file: %s", graph_path.c_str());

auto header = raft::detail::numpy_serializer::read_header(graph_stream);
auto header = raft::numpy_serializer::read_header(graph_stream);
graph_header_size = static_cast<size_t>(graph_stream.tellg());
RAFT_EXPECTS(
header.shape.size() == 2, "Graph file should be 2D, got %zu dimensions", header.shape.size());
@@ -419,7 +419,7 @@ void serialize_to_hnswlib_from_disk(raft::resources const& res,
std::ifstream dataset_stream(dataset_path, std::ios::binary);
RAFT_EXPECTS(dataset_stream.good(), "Failed to open dataset file: %s", dataset_path.c_str());

auto header = raft::detail::numpy_serializer::read_header(dataset_stream);
auto header = raft::numpy_serializer::read_header(dataset_stream);
dataset_header_size = static_cast<size_t>(dataset_stream.tellg());
RAFT_EXPECTS(header.shape.size() == 2,
"Dataset file should be 2D, got %zu dimensions",
@@ -439,7 +439,7 @@ void serialize_to_hnswlib_from_disk(raft::resources const& res,
std::ifstream mapping_stream(mapping_path, std::ios::binary);
RAFT_EXPECTS(mapping_stream.good(), "Failed to open mapping file: %s", mapping_path.c_str());

auto header = raft::detail::numpy_serializer::read_header(mapping_stream);
auto header = raft::numpy_serializer::read_header(mapping_stream);
label_header_size = static_cast<size_t>(mapping_stream.tellg());
RAFT_EXPECTS(header.shape.size() == 1,
"Mapping file should be 1D, got %zu dimensions",
4 changes: 2 additions & 2 deletions cpp/src/neighbors/ivf_flat/ivf_flat_serialize.cuh
@@ -11,8 +11,8 @@
#include <cuvs/neighbors/ivf_flat.hpp>

#include <raft/core/copy.cuh>
#include <raft/core/detail/mdspan_numpy_serializer.hpp>
#include <raft/core/mdarray.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/resource/cuda_stream.hpp>
#include <raft/core/serialize.hpp>
#include <raft/util/pow2_utils.cuh>
@@ -44,7 +44,7 @@ void serialize(raft::resources const& handle, std::ostream& os, const index<T, I
RAFT_LOG_DEBUG(
"Saving IVF-Flat index, size %zu, dim %u", static_cast<size_t>(index_.size()), index_.dim());

std::string dtype_string = raft::detail::numpy_serializer::get_numpy_dtype<T>().to_string();
std::string dtype_string = raft::numpy_serializer::get_numpy_dtype<T>().to_string();
dtype_string.resize(4);
os << dtype_string;

3 changes: 2 additions & 1 deletion cpp/src/neighbors/mg/snmg.cuh
@@ -8,6 +8,7 @@
#include <raft/core/copy.cuh>
#include <raft/core/device_mdspan.hpp>
#include <raft/core/host_mdspan.hpp>
#include <raft/core/numpy_serializer.hpp>
#include <raft/core/resource/multi_gpu.hpp>
#include <raft/core/resource/nccl_comm.hpp>
#include <raft/core/serialize.hpp>
@@ -738,7 +739,7 @@ void serialize(const raft::resources& clique,
std::ofstream of(filename, std::ios::out | std::ios::binary);
if (!of) { RAFT_FAIL("Cannot open file %s", filename.c_str()); }

std::string dtype_string = raft::detail::numpy_serializer::get_numpy_dtype<T>().to_string();
std::string dtype_string = raft::numpy_serializer::get_numpy_dtype<T>().to_string();
dtype_string.resize(4);
of << dtype_string;
