Skip to content

Fix symbol export#2052

Open
vyasr wants to merge 12 commits intorapidsai:mainfrom
vyasr:fix/symbol_export
Open

Fix symbol export#2052
vyasr wants to merge 12 commits intorapidsai:mainfrom
vyasr:fix/symbol_export

Conversation

@vyasr
Copy link
Copy Markdown
Contributor

@vyasr vyasr commented May 1, 2026

This PR adds symbol visibility controls to raft to avoid exporting weak symbols that it shouldn't.

Contributes to https://github.com/rapidsai/build-infra/issues/53

@vyasr vyasr self-assigned this May 1, 2026
@vyasr vyasr added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels May 1, 2026
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 1, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@vyasr vyasr force-pushed the fix/symbol_export branch from bc22612 to a4938b7 Compare May 1, 2026 02:20
@rapidsai rapidsai deleted a comment from copy-pr-bot Bot May 1, 2026
@rapidsai rapidsai deleted a comment from copy-pr-bot Bot May 1, 2026
@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 1, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

3 similar comments
@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

vyasr and others added 7 commits May 1, 2026 11:06
Add CUVS_EXPORT/CUVS_HIDDEN macros and set CXX_VISIBILITY_PRESET hidden
on cuvs_objs and cuvs_c targets. Apply namespace CUVS_EXPORT cuvs to all
installed C++ headers and CUVS_EXPORT to C API function declarations.

Also fix pairwise_distance.cpp to include its own C API header so that
the CUVS_EXPORT annotation on the function declaration is visible to the
compiler when building with hidden default visibility.
The check-c-abi workflow only uses c/include as its header path.
export.hpp was only present in cpp/include, causing CUVS_EXPORT to be
undefined when the ABI checker parsed C headers, breaking all function
signature comparisons.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Under -fvisibility=hidden, explicit template instantiations in source
files require CUVS_EXPORT to be exported from libcuvs.so. Without this,
test binaries linking against libcuvs.so get undefined symbols (e.g.
cuvs::neighbors::ivf_flat::index<__half, long>::n_lists()).

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
NVCC does not support __attribute__((visibility("default"))) on
explicit class template instantiations (error rapidsai#1091-D). These symbols
are already exported via the namespace-level CUVS_EXPORT on the header
declarations. Only function template instantiations (template void)
need the per-instantiation CUVS_EXPORT in .cu files.
This function was the only C API declaration in pq.h missing the
CUVS_EXPORT annotation, causing an undefined symbol error when the
Python wheel tried to load pq.abi3.so against libcuvs.so built with
-fvisibility=hidden.
… Doxygen

- Rename c/include/cuvs/core/export.hpp -> export.h (pure C macros
  should use .h extension; fixes Java jextract build which only copies
  *.h files to its staging directory)
- Update all 23 C API headers to include export.h instead of export.hpp
- In Doxyfile: set MACRO_EXPANSION=YES, EXPAND_ONLY_PREDEF=YES, add
  CUVS_EXPORT= and CUVS_HIDDEN= to PREDEFINED so Doxygen strips the
  visibility attribute before generating XML, preventing breathe from
  choking on CUVS_EXPORT in C function signatures
@vyasr vyasr force-pushed the fix/symbol_export branch from 92cf4b4 to 2845c84 Compare May 1, 2026 18:06
Temporarily point RAFT_FORK and RAFT_PINNED_TAG at the raft symbol
visibility PR branch to verify cuVS works with hidden-default raft.
@vyasr vyasr marked this pull request as ready for review May 1, 2026 18:11
@vyasr vyasr requested review from a team as code owners May 1, 2026 18:11
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3524d07b-0984-4888-ad8d-62f3d382332d

📥 Commits

Reviewing files that changed from the base of the PR and between b3d67e9 and e11655d.

📒 Files selected for processing (1)
  • c/include/cuvs/neighbors/ivf_flat.h
✅ Files skipped from review due to trivial changes (1)
  • c/include/cuvs/neighbors/ivf_flat.h

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • Chores
    • Updated library symbol visibility configuration to optimize binary size and improve ABI compatibility across platforms.

Walkthrough

This PR introduces a symbol visibility control system for cuVS by creating export macros (CUVS_EXPORT, CUVS_HIDDEN), configuring CMake with hidden visibility defaults, and systematically annotating C API functions and C++ namespaces across ~70 headers and source files to explicitly mark public symbols for external linkage.

Changes

Symbol Export Visibility System

Layer / File(s) Summary
Export Macro Definitions
c/include/cuvs/core/export.h, cpp/include/cuvs/core/export.hpp
New headers define CUVS_EXPORT and CUVS_HIDDEN macros as GCC visibility attributes on GNU-like toolchains, or empty otherwise.
Build System Configuration
c/CMakeLists.txt, cpp/CMakeLists.txt
Library targets configured with CXX_VISIBILITY_PRESET hidden and VISIBILITY_INLINES_HIDDEN ON to default-hide all symbols unless explicitly exported.
C API Function Annotations
c/include/cuvs/core/c_api.h, c/include/cuvs/cluster/kmeans.h, c/include/cuvs/distance/pairwise_distance.h, c/include/cuvs/neighbors/all_neighbors.h, c/include/cuvs/neighbors/brute_force.h, c/include/cuvs/neighbors/cagra.h, c/include/cuvs/neighbors/hnsw.h, c/include/cuvs/neighbors/ivf_flat.h, c/include/cuvs/neighbors/ivf_pq.h, c/include/cuvs/neighbors/mg_cagra.h, c/include/cuvs/neighbors/mg_ivf_flat.h, c/include/cuvs/neighbors/mg_ivf_pq.h, c/include/cuvs/neighbors/nn_descent.h, c/include/cuvs/neighbors/refine.h, c/include/cuvs/neighbors/tiered_index.h, c/include/cuvs/neighbors/vamana.h, c/include/cuvs/preprocessing/pca.h, c/include/cuvs/preprocessing/quantize/binary.h, c/include/cuvs/preprocessing/quantize/pq.h, c/include/cuvs/preprocessing/quantize/scalar.h
Public C API function declarations are prefixed with CUVS_EXPORT and headers include the export definition. ~180+ function declarations across 20+ headers now explicitly marked for export.
C++ Namespace Annotations
cpp/include/cuvs/cluster/agglomerative.hpp, cpp/include/cuvs/cluster/kmeans.hpp, cpp/include/cuvs/cluster/spectral.hpp, cpp/include/cuvs/core/bitmap.hpp, cpp/include/cuvs/core/bitset.hpp, cpp/include/cuvs/distance/distance.hpp, cpp/include/cuvs/distance/grammian.hpp, cpp/include/cuvs/distance/kde.hpp, cpp/include/cuvs/neighbors/all_neighbors.hpp, cpp/include/cuvs/neighbors/ball_cover.hpp, cpp/include/cuvs/neighbors/brute_force.hpp, cpp/include/cuvs/neighbors/cagra.hpp, cpp/include/cuvs/neighbors/common.hpp, cpp/include/cuvs/neighbors/composite/index.hpp, cpp/include/cuvs/neighbors/dynamic_batching.hpp, cpp/include/cuvs/neighbors/epsilon_neighborhood.hpp, cpp/include/cuvs/neighbors/hnsw.hpp, cpp/include/cuvs/neighbors/ivf_flat.hpp, cpp/include/cuvs/neighbors/ivf_pq.hpp, cpp/include/cuvs/neighbors/knn_merge_parts.hpp, cpp/include/cuvs/neighbors/nn_descent.hpp, cpp/include/cuvs/neighbors/refine.hpp, cpp/include/cuvs/neighbors/scann.hpp, cpp/include/cuvs/neighbors/tiered_index.hpp, cpp/include/cuvs/neighbors/vamana.hpp, cpp/include/cuvs/preprocessing/pca.hpp, cpp/include/cuvs/preprocessing/quantize/binary.hpp, cpp/include/cuvs/preprocessing/quantize/pq.hpp, cpp/include/cuvs/preprocessing/quantize/scalar.hpp, cpp/include/cuvs/preprocessing/spectral_embedding.hpp, cpp/include/cuvs/selection/select_k.hpp, cpp/include/cuvs/stats/silhouette_score.hpp, cpp/include/cuvs/stats/trustworthiness_score.hpp, cpp/include/cuvs/util/cutlass_utils.hpp, cpp/include/cuvs/util/file_io.hpp, cpp/include/cuvs/util/host_memory.hpp
Namespaces restructured from inline form (e.g., namespace cuvs::neighbors::foo) to explicit nested form with CUVS_EXPORT on outer cuvs scope (e.g., namespace CUVS_EXPORT cuvs { namespace neighbors { namespace foo { ... }}}) to enable symbol export control. ~35 headers affected.
Header Includes & Support
c/include/cuvs/core/all.h, c/include/cuvs/distance/distance.h, c/include/cuvs/neighbors/common.h, c/include/cuvs/neighbors/mg_common.h
Headers updated to include cuvs/core/export.h or .hpp as needed; copyright years extended to 2024-2026 or 2025-2026.
Documentation & Build Config
cpp/doxygen/Doxyfile
Doxygen config updated to enable macro expansion and define CUVS_EXPORT and CUVS_HIDDEN as empty for documentation processing; export.h and export.hpp excluded from input.
Explicit Template Instantiations
cpp/src/distance/kde.cu, cpp/src/neighbors/epsilon_neighborhood.cu, cpp/src/neighbors/ivf_flat_index.cpp
Explicit template instantiations for float/double and index specializations are decorated with template CUVS_EXPORT to mark them as exported symbols in the shared library.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Fix symbol export' is directly related to the main change: the PR implements symbol visibility controls across the codebase to avoid exporting weak symbols incorrectly.
Description check ✅ Passed The description clearly explains the purpose of the PR—adding symbol visibility controls to avoid exporting weak symbols—and links to the related issue being addressed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Review rate limit: 9/10 reviews remaining, refill in 6 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
cpp/CMakeLists.txt (1)

931-940: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Propagate the visibility defaults to cuvs-cagra-search too.

cuvs-cagra-search is also compiled into the shared library, but it still builds with the compiler defaults. That leaves one object target outside the hidden-default setup, so the new export hygiene can still leak symbols.

♻️ Suggested fix
   set_target_properties(
     cuvs_objs
     PROPERTIES CXX_STANDARD 20
                CXX_STANDARD_REQUIRED ON
                CUDA_STANDARD 20
                CUDA_STANDARD_REQUIRED ON
                POSITION_INDEPENDENT_CODE ON
                CXX_VISIBILITY_PRESET hidden
                VISIBILITY_INLINES_HIDDEN ON
   )
+  set_target_properties(
+    cuvs-cagra-search
+    PROPERTIES CXX_VISIBILITY_PRESET hidden
+               VISIBILITY_INLINES_HIDDEN ON
+  )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/CMakeLists.txt` around lines 931 - 940, The cuvs-cagra-search target is
not receiving the hidden-visibility defaults, allowing symbol leakage; update
the CMake configuration so cuvs-cagra-search is configured like cuvs_objs (or
included in the same set_target_properties call) to set CXX_STANDARD 20,
CXX_STANDARD_REQUIRED ON, CUDA_STANDARD 20, CUDA_STANDARD_REQUIRED ON,
POSITION_INDEPENDENT_CODE ON, CXX_VISIBILITY_PRESET hidden, and
VISIBILITY_INLINES_HIDDEN ON for the cuvs-cagra-search target (reference the
existing set_target_properties call for cuvs_objs to mirror or extend the same
property list).
c/src/distance/pairwise_distance.cpp (2)

46-53: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Validate the public C API inputs before dereferencing them.

translate_exceptions won't save a null cuvsResources_t or DLManagedTensor*; this path will crash before it can return a cuvsError_t. Please reject null res, x_tensor, y_tensor, and distances_tensor before touching dl_tensor.
Based on coding guidelines, Input validation must check for negative or invalid dimensions, null pointers, and invalid parameter combinations before GPU operations.

🛡️ Suggested fix
 extern "C" cuvsError_t cuvsPairwiseDistance(cuvsResources_t res,
                                             DLManagedTensor* x_tensor,
                                             DLManagedTensor* y_tensor,
                                             DLManagedTensor* distances_tensor,
                                             cuvsDistanceType metric,
                                             float metric_arg)
 {
   return cuvs::core::translate_exceptions([=] {
+    if (res == nullptr || x_tensor == nullptr || y_tensor == nullptr ||
+        distances_tensor == nullptr) {
+      RAFT_FAIL("cuvsPairwiseDistance requires non-null inputs");
+    }
     auto x_dt    = x_tensor->dl_tensor.dtype;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/distance/pairwise_distance.cpp` around lines 46 - 53, The public C API
cuvsPairwiseDistance must validate inputs before any dereference: add checks at
the start of cuvsPairwiseDistance (before calling
cuvs::core::translate_exceptions or touching dl_tensor) to reject null pointers
for res, x_tensor, y_tensor, and distances_tensor and return an appropriate
cuvsError_t; also validate tensor dimensions/sizes and metric/metric_arg
combinations (negative dims, mismatched shapes, unsupported cuvsDistanceType)
and return errors early so GPU code never runs with invalid inputs. Ensure these
checks reference the same parameter names (cuvsResources_t res, DLManagedTensor*
x_tensor, DLManagedTensor* y_tensor, DLManagedTensor* distances_tensor,
cuvsDistanceType metric, float metric_arg) and occur prior to any use of
dl_tensor or other members.

54-56: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Read dtype metadata from the matching tensors.

y_dt and dist_dt are both taken from x_tensor right now, so mixed-dtype inputs will pass validation and dispatch the wrong specialization.

🐛 Suggested fix
-    auto x_dt    = x_tensor->dl_tensor.dtype;
-    auto y_dt    = x_tensor->dl_tensor.dtype;
-    auto dist_dt = x_tensor->dl_tensor.dtype;
+    auto x_dt    = x_tensor->dl_tensor.dtype;
+    auto y_dt    = y_tensor->dl_tensor.dtype;
+    auto dist_dt = distances_tensor->dl_tensor.dtype;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/distance/pairwise_distance.cpp` around lines 54 - 56, The dtype
variables are incorrectly read from x_tensor for all three; change the
assignments so y_dt reads from y_tensor->dl_tensor.dtype and dist_dt reads from
dist_tensor->dl_tensor.dtype (leave x_dt as x_tensor->dl_tensor.dtype) so that
mixed-dtype inputs validate and dispatch correctly for functions/logic that use
x_dt, y_dt, and dist_dt in pairwise_distance.cpp (variables x_tensor, y_tensor,
dist_tensor).
c/include/cuvs/preprocessing/quantize/scalar.h (1)

88-121: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add return documentation to the exported quantizer APIs.

cuvsScalarQuantizerTrain, cuvsScalarQuantizerTransform, and cuvsScalarQuantizerInverseTransform still have parameter docs only. As per coding guidelines: public C API functions must include complete Doxygen documentation describing parameters, return values, and side effects.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/preprocessing/quantize/scalar.h` around lines 88 - 121, The
three exported functions cuvsScalarQuantizerTrain, cuvsScalarQuantizerTransform,
and cuvsScalarQuantizerInverseTransform are missing `@return` (and any
side-effect) Doxygen documentation; update each function comment block to
include an `@return` describing the cuvsError_t values returned (e.g.,
CUVS_SUCCESS, error codes) and any side effects (e.g., device/host memory
requirements, in-place modification, ownership of DLManagedTensor pointers), so
the public C API docs are complete and follow the coding guideline.
c/include/cuvs/neighbors/tiered_index.h (1)

214-220: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add the missing @return section to cuvsTieredIndexSearch.

The newly exported search API still lacks return-value documentation, which leaves the public C API docs incomplete. As per coding guidelines: public C API functions must include complete Doxygen documentation describing parameters, return values, and side effects.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/tiered_index.h` around lines 214 - 220, Update the
Doxygen comment for the exported function cuvsTieredIndexSearch to include a
`@return` section that documents the cuvsError_t return values (e.g., CUVS_SUCCESS
on success, specific CUVS_* error codes on failure such as invalid arguments,
allocation or search errors) and any side-effects; locate the comment block
above the cuvsTieredIndexSearch declaration and add a concise `@return`
description listing possible return codes and their meaning so the public C API
docs are complete.
c/include/cuvs/preprocessing/quantize/binary.h (1)

104-139: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add return documentation to the exported binary quantizer APIs.

cuvsBinaryQuantizerTrain, cuvsBinaryQuantizerTransform, and cuvsBinaryQuantizerTransformWithParams still have parameter-only docs. As per coding guidelines: public C API functions must include complete Doxygen documentation describing parameters, return values, and side effects.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/preprocessing/quantize/binary.h` around lines 104 - 139, The
three exported APIs cuvsBinaryQuantizerTrain, cuvsBinaryQuantizerTransform, and
cuvsBinaryQuantizerTransformWithParams are missing `@return` documentation; update
each Doxygen comment to include an `@return` that describes the returned
cuvsError_t values (e.g., CUVS_SUCCESS on success and possible error codes on
failure such as invalid arguments, allocation failures, or
transform/train-specific errors) and note side effects like modifications to
output tensors (for cuvsBinaryQuantizerTransform and
cuvsBinaryQuantizerTransformWithParams the out DLManagedTensor is written, and
for cuvsBinaryQuantizerTrain the quantizer is created/initialized). Ensure the
`@return` text is clear and consistent across the three functions and references
the same error codes used elsewhere in the API.
c/include/cuvs/neighbors/hnsw.h (1)

580-585: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix broken deserialize example identifiers.

The sample uses misspelled/invalid symbols (cuvsHnsWIndexParams_t, malformed cuvsHnswDeserialize call), so copy-paste usage from docs will fail.

Suggested doc snippet correction
- * cuvsHnsWIndexParams_t hnsw_params;
+ * cuvsHnswIndexParams_t hnsw_params;
  * cuvsHnswIndexParamsCreate(&hnsw_params);
  * hnsw_params->hierarchy = NONE;
  * hnsw_index->dtype = index->dtype;
- * cuvsHnswDeserialize(res, hnsw_params, "/path/to/index", dim, metric hnsw_index);
+ * cuvsHnswDeserialize(res, hnsw_params, "/path/to/index", dim, metric, hnsw_index);

As per coding guidelines c/include/cuvs/**/*: “For public C API headers, additionally check: Doxygen documentation for all public functions/classes”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/hnsw.h` around lines 580 - 585, The example doc uses
misspelled/invalid symbols; replace the incorrect type/name and fix the
deserialize call so it matches the public API. Use cuvsHnswIndexParams_t and
cuvsHnswIndexParamsCreate for the params variable (hnsw_params) and call
cuvsHnswDeserialize with the correct argument list and commas (e.g.,
cuvsHnswDeserialize(res, hnsw_params, "/path/to/index", dim, metric,
hnsw_index)) so the example compiles and reflects the real function signature.
c/include/cuvs/neighbors/cagra.h (1)

768-778: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the example to match the exported function name.

The docs example calls cuvsCagraSerializeHnswlib, but the public API declaration is cuvsCagraSerializeToHnswlib. This is easy to copy incorrectly and fail at compile time.

Suggested doc fix
- * cuvsCagraSerializeHnswlib(res, "/path/to/index", index);
+ * cuvsCagraSerializeToHnswlib(res, "/path/to/index", index);

As per coding guidelines c/include/cuvs/**/*: “For public C API headers, additionally check: Doxygen documentation for all public functions/classes”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/cagra.h` around lines 768 - 778, The example in the
Doxygen block uses the wrong function name; update the sample call to use the
exported API name cuvsCagraSerializeToHnswlib (e.g., replace
cuvsCagraSerializeHnswlib(res, "/path/to/index", index); with
cuvsCagraSerializeToHnswlib(res, "/path/to/index", index);) so the docs match
the declared function cuvsCagraSerializeToHnswlib.
🧹 Nitpick comments (2)
c/include/cuvs/neighbors/hnsw.h (1)

277-282: ⚡ Quick win

Add dedicated Doxygen for cuvsHnswFromCagraWithDataset.

cuvsHnswFromCagraWithDataset is now explicitly exported but has no standalone parameter/return docs, unlike adjacent public APIs.

Proposed doc block pattern
+/**
+ * `@brief` Convert a CAGRA index to an HNSW index using an explicit dataset tensor.
+ *
+ * `@param`[in] res cuvsResources_t opaque C handle
+ * `@param`[in] params cuvsHnswIndexParams_t used to configure conversion
+ * `@param`[in] cagra_index cuvsCagraIndex_t to convert
+ * `@param`[out] hnsw_index cuvsHnswIndex_t output HNSW index
+ * `@param`[in] dataset_tensor DLManagedTensor* dataset used for conversion
+ * `@return` cuvsError_t
+ */
 CUVS_EXPORT cuvsError_t cuvsHnswFromCagraWithDataset(cuvsResources_t res,
                                          cuvsHnswIndexParams_t params,
                                          cuvsCagraIndex_t cagra_index,
                                          cuvsHnswIndex_t hnsw_index,
                                          DLManagedTensor* dataset_tensor);

As per coding guidelines c/include/cuvs/**/*: “For public C API headers, additionally check: Doxygen documentation for all public functions/classes”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/hnsw.h` around lines 277 - 282, Add a dedicated
Doxygen comment block for the public function cuvsHnswFromCagraWithDataset
documenting the purpose and behaviour, listing each parameter (res, params,
cagra_index, hnsw_index, dataset_tensor) with types/semantics (ownership,
whether pointers may be NULL, expected tensor layout/shape/dtype for
dataset_tensor and whether it is consumed or referenced), describe the return
value cuvsError_t and possible error conditions, and include any thread-safety
or precondition notes similar to the adjacent public APIs' doc style; place this
block immediately above the cuvsHnswFromCagraWithDataset declaration so it
appears as standalone Doxygen for the exported symbol.
c/include/cuvs/neighbors/ivf_pq.h (1)

272-290: ⚡ Quick win

Expand getter docs to full Doxygen format (@param/@return).

The newly exported getter APIs in this block currently have brief comments only; please align them with the fully documented style used elsewhere in this header.

As per coding guidelines c/include/cuvs/**/*: “For public C API headers, additionally check: Doxygen documentation for all public functions/classes”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/ivf_pq.h` around lines 272 - 290, The comments for
the public getter APIs (cuvsIvfPqIndexGetNLists, cuvsIvfPqIndexGetDim,
cuvsIvfPqIndexGetSize, cuvsIvfPqIndexGetPqDim, cuvsIvfPqIndexGetPqBits,
cuvsIvfPqIndexGetPqLen) must be expanded into full Doxygen style: add an `@brief`
line, an `@param` for the input index (describe expected handle) and for the
output pointer (describe what value is returned and that it must be non-NULL),
and an `@return` describing the cuvsError_t return values (e.g., CUVS_SUCCESS on
success, error codes on failure). Update each function’s comment block to follow
the same pattern used in other header docs so the public C API is fully
documented.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@c/include/cuvs/neighbors/ivf_flat.h`:
- Around line 159-163: Add full Doxygen comment blocks for the two new public
getters cuvsIvfFlatIndexGetNLists and cuvsIvfFlatIndexGetDim: replace the brief
inline comments with complete blocks that include `@brief` describing purpose,
`@param` for the cuvsIvfFlatIndex_t index and the output pointer (n_lists or dim)
documenting expected ownership/valid pointer requirements, `@return` documenting
the cuvsError_t return values (e.g., CUVS_SUCCESS and possible error codes) and
any side effects (none), and any notes about thread-safety or valid index
states; ensure the function names cuvsIvfFlatIndexGetNLists and
cuvsIvfFlatIndexGetDim appear in the blocks so the API docs pick them up.

In `@cpp/cmake/thirdparty/get_raft.cmake`:
- Around line 9-10: The code currently hardcodes RAFT_FORK and RAFT_PINNED_TAG
in get_raft.cmake (symbols RAFT_FORK and RAFT_PINNED_TAG); remove these fixed
set(...) calls and instead restore upstream defaults or make them configurable:
introduce CMake cache variables (e.g., RAFT_FORK and RAFT_PINNED_TAG via
option/set with CACHE STRING) with defaults pointing to the official upstream
(rapidsai/main or appropriate tag) and allow overrides from the environment or
CI (e.g., honor RAFT_FORK/RAFT_PINNED_TAG CMake cache entries or an env var like
RAFT_FORK_OVERRIDE); alternatively, delete the test pin here and move any
CI-only override into CI config so normal builds resolve upstream. Ensure
references still use the RAFT_FORK and RAFT_PINNED_TAG symbols but no longer
hardcode the test fork/tag in production CMake.

---

Outside diff comments:
In `@c/include/cuvs/neighbors/cagra.h`:
- Around line 768-778: The example in the Doxygen block uses the wrong function
name; update the sample call to use the exported API name
cuvsCagraSerializeToHnswlib (e.g., replace cuvsCagraSerializeHnswlib(res,
"/path/to/index", index); with cuvsCagraSerializeToHnswlib(res,
"/path/to/index", index);) so the docs match the declared function
cuvsCagraSerializeToHnswlib.

In `@c/include/cuvs/neighbors/hnsw.h`:
- Around line 580-585: The example doc uses misspelled/invalid symbols; replace
the incorrect type/name and fix the deserialize call so it matches the public
API. Use cuvsHnswIndexParams_t and cuvsHnswIndexParamsCreate for the params
variable (hnsw_params) and call cuvsHnswDeserialize with the correct argument
list and commas (e.g., cuvsHnswDeserialize(res, hnsw_params, "/path/to/index",
dim, metric, hnsw_index)) so the example compiles and reflects the real function
signature.

In `@c/include/cuvs/neighbors/tiered_index.h`:
- Around line 214-220: Update the Doxygen comment for the exported function
cuvsTieredIndexSearch to include a `@return` section that documents the
cuvsError_t return values (e.g., CUVS_SUCCESS on success, specific CUVS_* error
codes on failure such as invalid arguments, allocation or search errors) and any
side-effects; locate the comment block above the cuvsTieredIndexSearch
declaration and add a concise `@return` description listing possible return codes
and their meaning so the public C API docs are complete.

In `@c/include/cuvs/preprocessing/quantize/binary.h`:
- Around line 104-139: The three exported APIs cuvsBinaryQuantizerTrain,
cuvsBinaryQuantizerTransform, and cuvsBinaryQuantizerTransformWithParams are
missing `@return` documentation; update each Doxygen comment to include an `@return`
that describes the returned cuvsError_t values (e.g., CUVS_SUCCESS on success
and possible error codes on failure such as invalid arguments, allocation
failures, or transform/train-specific errors) and note side effects like
modifications to output tensors (for cuvsBinaryQuantizerTransform and
cuvsBinaryQuantizerTransformWithParams the out DLManagedTensor is written, and
for cuvsBinaryQuantizerTrain the quantizer is created/initialized). Ensure the
`@return` text is clear and consistent across the three functions and references
the same error codes used elsewhere in the API.

In `@c/include/cuvs/preprocessing/quantize/scalar.h`:
- Around line 88-121: The three exported functions cuvsScalarQuantizerTrain,
cuvsScalarQuantizerTransform, and cuvsScalarQuantizerInverseTransform are
missing `@return` (and any side-effect) Doxygen documentation; update each
function comment block to include an `@return` describing the cuvsError_t values
returned (e.g., CUVS_SUCCESS, error codes) and any side effects (e.g.,
device/host memory requirements, in-place modification, ownership of
DLManagedTensor pointers), so the public C API docs are complete and follow the
coding guideline.

In `@c/src/distance/pairwise_distance.cpp`:
- Around line 46-53: The public C API cuvsPairwiseDistance must validate inputs
before any dereference: add checks at the start of cuvsPairwiseDistance (before
calling cuvs::core::translate_exceptions or touching dl_tensor) to reject null
pointers for res, x_tensor, y_tensor, and distances_tensor and return an
appropriate cuvsError_t; also validate tensor dimensions/sizes and
metric/metric_arg combinations (negative dims, mismatched shapes, unsupported
cuvsDistanceType) and return errors early so GPU code never runs with invalid
inputs. Ensure these checks reference the same parameter names (cuvsResources_t
res, DLManagedTensor* x_tensor, DLManagedTensor* y_tensor, DLManagedTensor*
distances_tensor, cuvsDistanceType metric, float metric_arg) and occur prior to
any use of dl_tensor or other members.
- Around line 54-56: The dtype variables are incorrectly read from x_tensor for
all three; change the assignments so y_dt reads from y_tensor->dl_tensor.dtype
and dist_dt reads from dist_tensor->dl_tensor.dtype (leave x_dt as
x_tensor->dl_tensor.dtype) so that mixed-dtype inputs validate and dispatch
correctly for functions/logic that use x_dt, y_dt, and dist_dt in
pairwise_distance.cpp (variables x_tensor, y_tensor, dist_tensor).

In `@cpp/CMakeLists.txt`:
- Around line 931-940: The cuvs-cagra-search target is not receiving the
hidden-visibility defaults, allowing symbol leakage; update the CMake
configuration so cuvs-cagra-search is configured like cuvs_objs (or included in
the same set_target_properties call) to set CXX_STANDARD 20,
CXX_STANDARD_REQUIRED ON, CUDA_STANDARD 20, CUDA_STANDARD_REQUIRED ON,
POSITION_INDEPENDENT_CODE ON, CXX_VISIBILITY_PRESET hidden, and
VISIBILITY_INLINES_HIDDEN ON for the cuvs-cagra-search target (reference the
existing set_target_properties call for cuvs_objs to mirror or extend the same
property list).

---

Nitpick comments:
In `@c/include/cuvs/neighbors/hnsw.h`:
- Around line 277-282: Add a dedicated Doxygen comment block for the public
function cuvsHnswFromCagraWithDataset documenting the purpose and behaviour,
listing each parameter (res, params, cagra_index, hnsw_index, dataset_tensor)
with types/semantics (ownership, whether pointers may be NULL, expected tensor
layout/shape/dtype for dataset_tensor and whether it is consumed or referenced),
describe the return value cuvsError_t and possible error conditions, and include
any thread-safety or precondition notes similar to the adjacent public APIs' doc
style; place this block immediately above the cuvsHnswFromCagraWithDataset
declaration so it appears as standalone Doxygen for the exported symbol.

In `@c/include/cuvs/neighbors/ivf_pq.h`:
- Around line 272-290: The comments for the public getter APIs
(cuvsIvfPqIndexGetNLists, cuvsIvfPqIndexGetDim, cuvsIvfPqIndexGetSize,
cuvsIvfPqIndexGetPqDim, cuvsIvfPqIndexGetPqBits, cuvsIvfPqIndexGetPqLen) must be
expanded into full Doxygen style: add an `@brief` line, an `@param` for the input
index (describe expected handle) and for the output pointer (describe what value
is returned and that it must be non-NULL), and an `@return` describing the
cuvsError_t return values (e.g., CUVS_SUCCESS on success, error codes on
failure). Update each function’s comment block to follow the same pattern used
in other header docs so the public C API is fully documented.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5ae306dc-74cc-4afc-b172-aa2efd8b1289

📥 Commits

Reviewing files that changed from the base of the PR and between 7cf69cb and 0d94e7d.

📒 Files selected for processing (69)
  • c/CMakeLists.txt
  • c/include/cuvs/cluster/kmeans.h
  • c/include/cuvs/core/c_api.h
  • c/include/cuvs/core/export.h
  • c/include/cuvs/distance/distance.h
  • c/include/cuvs/distance/pairwise_distance.h
  • c/include/cuvs/neighbors/all_neighbors.h
  • c/include/cuvs/neighbors/brute_force.h
  • c/include/cuvs/neighbors/cagra.h
  • c/include/cuvs/neighbors/common.h
  • c/include/cuvs/neighbors/hnsw.h
  • c/include/cuvs/neighbors/ivf_flat.h
  • c/include/cuvs/neighbors/ivf_pq.h
  • c/include/cuvs/neighbors/mg_cagra.h
  • c/include/cuvs/neighbors/mg_common.h
  • c/include/cuvs/neighbors/mg_ivf_flat.h
  • c/include/cuvs/neighbors/mg_ivf_pq.h
  • c/include/cuvs/neighbors/nn_descent.h
  • c/include/cuvs/neighbors/refine.h
  • c/include/cuvs/neighbors/tiered_index.h
  • c/include/cuvs/neighbors/vamana.h
  • c/include/cuvs/preprocessing/pca.h
  • c/include/cuvs/preprocessing/quantize/binary.h
  • c/include/cuvs/preprocessing/quantize/pq.h
  • c/include/cuvs/preprocessing/quantize/scalar.h
  • c/src/distance/pairwise_distance.cpp
  • cpp/CMakeLists.txt
  • cpp/cmake/thirdparty/get_raft.cmake
  • cpp/doxygen/Doxyfile
  • cpp/include/cuvs/cluster/agglomerative.hpp
  • cpp/include/cuvs/cluster/kmeans.hpp
  • cpp/include/cuvs/cluster/spectral.hpp
  • cpp/include/cuvs/core/bitmap.hpp
  • cpp/include/cuvs/core/bitset.hpp
  • cpp/include/cuvs/core/export.hpp
  • cpp/include/cuvs/distance/distance.hpp
  • cpp/include/cuvs/distance/grammian.hpp
  • cpp/include/cuvs/distance/kde.hpp
  • cpp/include/cuvs/neighbors/all_neighbors.hpp
  • cpp/include/cuvs/neighbors/ball_cover.hpp
  • cpp/include/cuvs/neighbors/brute_force.hpp
  • cpp/include/cuvs/neighbors/cagra.hpp
  • cpp/include/cuvs/neighbors/common.hpp
  • cpp/include/cuvs/neighbors/composite/index.hpp
  • cpp/include/cuvs/neighbors/dynamic_batching.hpp
  • cpp/include/cuvs/neighbors/epsilon_neighborhood.hpp
  • cpp/include/cuvs/neighbors/hnsw.hpp
  • cpp/include/cuvs/neighbors/ivf_flat.hpp
  • cpp/include/cuvs/neighbors/ivf_pq.hpp
  • cpp/include/cuvs/neighbors/knn_merge_parts.hpp
  • cpp/include/cuvs/neighbors/nn_descent.hpp
  • cpp/include/cuvs/neighbors/refine.hpp
  • cpp/include/cuvs/neighbors/scann.hpp
  • cpp/include/cuvs/neighbors/tiered_index.hpp
  • cpp/include/cuvs/neighbors/vamana.hpp
  • cpp/include/cuvs/preprocessing/pca.hpp
  • cpp/include/cuvs/preprocessing/quantize/binary.hpp
  • cpp/include/cuvs/preprocessing/quantize/pq.hpp
  • cpp/include/cuvs/preprocessing/quantize/scalar.hpp
  • cpp/include/cuvs/preprocessing/spectral_embedding.hpp
  • cpp/include/cuvs/selection/select_k.hpp
  • cpp/include/cuvs/stats/silhouette_score.hpp
  • cpp/include/cuvs/stats/trustworthiness_score.hpp
  • cpp/include/cuvs/util/cutlass_utils.hpp
  • cpp/include/cuvs/util/file_io.hpp
  • cpp/include/cuvs/util/host_memory.hpp
  • cpp/src/distance/kde.cu
  • cpp/src/neighbors/epsilon_neighborhood.cu
  • cpp/src/neighbors/ivf_flat_index.cpp

Comment thread c/include/cuvs/neighbors/ivf_flat.h Outdated
Comment thread cpp/cmake/thirdparty/get_raft.cmake Outdated
vyasr added 2 commits May 1, 2026 12:05
The ABI checker scans all.h to verify all C headers are listed.
export.h was missing after the rename from export.hpp.
PREDEFINED alone is insufficient because Doxygen parses the source
#define in export.h/hpp and overrides the PREDEFINED value, expanding
CUVS_EXPORT to __attribute__((visibility("default"))) in the XML
output. Excluding the files prevents the redefinition, so the
PREDEFINED CUVS_EXPORT= (expand to nothing) takes effect.
@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 3, 2026

CI on b3d67e9 shows a passing run (up to the usual flaky failures) on a run containing the raft changes.

vyasr added 2 commits May 3, 2026 08:58
Add @brief, @param, and @return documentation for
cuvsIvfFlatIndexGetNLists and cuvsIvfFlatIndexGetDim.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants