Multi-GPU Nearest Neighbors#
The Multi-GPU (SNMG, single-node multi-GPU) C API provides functions to deploy ANN indexes across multiple GPUs on one node for improved performance and scalability.
Common Types and Enums#
Common types and enums used across multi-GPU ANN algorithms.
#include <cuvs/neighbors/mg_common.h>
-
enum cuvsMultiGpuDistributionMode#
Distribution mode for multi-GPU indexes.
Values:
-
enumerator CUVS_NEIGHBORS_MG_REPLICATED#
Index is replicated on each device; favors throughput
-
enumerator CUVS_NEIGHBORS_MG_SHARDED#
Index is split across several devices; favors scaling
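The trade-off between the two distribution modes can be sketched as a simple memory check. This is only an illustrative heuristic, not part of the cuVS API: `demo_distribution_mode`, `demo_choose_mode` and `demo_bytes_per_gpu` are hypothetical stand-ins so the block compiles without the cuVS headers, and the byte counts are caller-supplied estimates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cuvsMultiGpuDistributionMode (not the real enum). */
typedef enum { DEMO_MG_REPLICATED, DEMO_MG_SHARDED } demo_distribution_mode;

/* Pick REPLICATED when one full copy of the index fits in a single GPU's
   memory budget; otherwise fall back to SHARDED. */
demo_distribution_mode demo_choose_mode(uint64_t index_bytes,
                                        uint64_t per_gpu_budget_bytes)
{
  return (index_bytes <= per_gpu_budget_bytes) ? DEMO_MG_REPLICATED
                                               : DEMO_MG_SHARDED;
}

/* Approximate bytes each GPU holds: a full copy when replicated,
   roughly 1/n_gpus of the index (rounded up) when sharded. */
uint64_t demo_bytes_per_gpu(demo_distribution_mode mode,
                            uint64_t index_bytes,
                            uint32_t n_gpus)
{
  assert(n_gpus > 0);
  if (mode == DEMO_MG_REPLICATED) { return index_bytes; }
  return (index_bytes + n_gpus - 1) / n_gpus;
}
```

In real code the chosen value would be written to the `mode` member of the algorithm's multi-GPU index params using the actual `CUVS_NEIGHBORS_MG_*` enumerators.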
-
enum cuvsMultiGpuReplicatedSearchMode#
Search mode when using a replicated index.
Values:
-
enumerator CUVS_NEIGHBORS_MG_LOAD_BALANCER#
Search queries are split to maintain equal load on GPUs
-
enumerator CUVS_NEIGHBORS_MG_ROUND_ROBIN#
Each search query is processed by a single GPU in a round-robin fashion
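The difference between the two replicated search modes can be pictured with a small scheduling sketch. It illustrates the idea only and makes no claim about how cuVS implements it: both helper names are hypothetical, and "search query" is assumed here to mean one search call (one batch of query rows).

```c
#include <assert.h>
#include <stdint.h>

/* Load-balancer idea: split one batch of n_queries rows into n_gpus
   contiguous, near-equal slices. The first (n_queries % n_gpus) GPUs
   receive one extra row. */
void demo_balanced_slice(int64_t n_queries, int n_gpus, int gpu,
                         int64_t *offset, int64_t *count)
{
  assert(n_gpus > 0 && gpu >= 0 && gpu < n_gpus && n_queries >= 0);
  int64_t base  = n_queries / n_gpus;
  int64_t extra = n_queries % n_gpus;
  *count  = base + (gpu < extra ? 1 : 0);
  *offset = (int64_t)gpu * base + (gpu < extra ? gpu : extra);
}

/* Round-robin idea: the whole search call number call_index is sent
   to a single GPU, cycling through the GPUs in order. */
int demo_round_robin_gpu(int64_t call_index, int n_gpus)
{
  assert(n_gpus > 0 && call_index >= 0);
  return (int)(call_index % n_gpus);
}
```

Under this reading, load balancing tends to suit large batches, while round-robin tends to suit many small concurrent calls.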
Multi-GPU IVF-Flat#
The Multi-GPU IVF-Flat method extends the IVF-Flat ANN algorithm to work across multiple GPUs. It provides two distribution modes: replicated (for higher throughput) and sharded (for handling larger datasets).
#include <cuvs/neighbors/mg_ivf_flat.h>
IVF-Flat Index Build Parameters#
-
typedef struct cuvsMultiGpuIvfFlatIndexParams *cuvsMultiGpuIvfFlatIndexParams_t#
- cuvsError_t cuvsMultiGpuIvfFlatIndexParamsCreate(cuvsMultiGpuIvfFlatIndexParams_t *index_params)#
Allocate Multi-GPU IVF-Flat Index params, and populate with default values.
- Parameters:
index_params – [in] cuvsMultiGpuIvfFlatIndexParams_t to allocate
- Returns:
cuvsError_t
- cuvsError_t cuvsMultiGpuIvfFlatIndexParamsDestroy(cuvsMultiGpuIvfFlatIndexParams_t index_params)#
De-allocate Multi-GPU IVF-Flat Index params.
- Parameters:
index_params – [in] cuvsMultiGpuIvfFlatIndexParams_t to de-allocate
- Returns:
cuvsError_t
-
struct cuvsMultiGpuIvfFlatIndexParams#
- #include <mg_ivf_flat.h>
Multi-GPU parameters to build IVF-Flat Index.
This structure extends the base IVF-Flat index parameters with multi-GPU specific settings.
Public Members
-
cuvsIvfFlatIndexParams_t base_params#
Base IVF-Flat index parameters
-
cuvsMultiGpuDistributionMode mode#
Distribution mode for multi-GPU setup
IVF-Flat Index Search Parameters#
-
typedef struct cuvsMultiGpuIvfFlatSearchParams *cuvsMultiGpuIvfFlatSearchParams_t#
- cuvsError_t cuvsMultiGpuIvfFlatSearchParamsCreate(cuvsMultiGpuIvfFlatSearchParams_t *params)#
Allocate Multi-GPU IVF-Flat search params, and populate with default values.
- Parameters:
params – [in] cuvsMultiGpuIvfFlatSearchParams_t to allocate
- Returns:
cuvsError_t
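- cuvsError_t cuvsMultiGpuIvfFlatSearchParamsDestroy(cuvsMultiGpuIvfFlatSearchParams_t params)#
De-allocate Multi-GPU IVF-Flat search params.
- Parameters:
params – [in] cuvsMultiGpuIvfFlatSearchParams_t to de-allocate
- Returns:
cuvsError_t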
-
struct cuvsMultiGpuIvfFlatSearchParams#
- #include <mg_ivf_flat.h>
Multi-GPU parameters to search IVF-Flat index.
This structure extends the base IVF-Flat search parameters with multi-GPU specific settings.
Public Members
-
cuvsIvfFlatSearchParams_t base_params#
Base IVF-Flat search parameters
-
cuvsMultiGpuReplicatedSearchMode search_mode#
Replicated search mode
-
cuvsMultiGpuShardedMergeMode merge_mode#
Sharded merge mode
-
int64_t n_rows_per_batch#
Number of rows per batch
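As a rough illustration of what a per-batch row limit implies, the arithmetic below computes how many batches a query set would be cut into. It assumes `n_rows_per_batch` caps the number of query rows processed per batch, which the description above does not spell out; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of batches needed to cover n_queries rows when at most
   n_rows_per_batch rows are processed per batch (ceiling division). */
int64_t demo_num_batches(int64_t n_queries, int64_t n_rows_per_batch)
{
  assert(n_queries >= 0 && n_rows_per_batch > 0);
  return (n_queries + n_rows_per_batch - 1) / n_rows_per_batch;
}

/* Number of rows in batch number `batch` (0-based); only the last
   batch can be smaller than n_rows_per_batch. */
int64_t demo_batch_rows(int64_t n_queries, int64_t n_rows_per_batch,
                        int64_t batch)
{
  assert(n_rows_per_batch > 0 && batch >= 0);
  int64_t start = batch * n_rows_per_batch;
  assert(start < n_queries);
  int64_t remaining = n_queries - start;
  return remaining < n_rows_per_batch ? remaining : n_rows_per_batch;
}
```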
IVF-Flat Index#
-
typedef cuvsMultiGpuIvfFlatIndex *cuvsMultiGpuIvfFlatIndex_t#
- cuvsError_t cuvsMultiGpuIvfFlatIndexCreate(cuvsMultiGpuIvfFlatIndex_t *index)#
Allocate Multi-GPU IVF-Flat index.
- Parameters:
index – [in] cuvsMultiGpuIvfFlatIndex_t to allocate
- Returns:
cuvsError_t
- cuvsError_t cuvsMultiGpuIvfFlatIndexDestroy(cuvsMultiGpuIvfFlatIndex_t index)#
De-allocate Multi-GPU IVF-Flat index.
- Parameters:
index – [in] cuvsMultiGpuIvfFlatIndex_t to de-allocate
- Returns:
cuvsError_t
-
struct cuvsMultiGpuIvfFlatIndex#
- #include <mg_ivf_flat.h>
Struct to hold address of cuvs::neighbors::mg_index<ivf_flat::index> and its active trained dtype.
IVF-Flat Index Build#
- cuvsError_t cuvsMultiGpuIvfFlatBuild(cuvsResources_t res, cuvsMultiGpuIvfFlatIndexParams_t params, DLManagedTensor *dataset_tensor, cuvsMultiGpuIvfFlatIndex_t index)#
Build a Multi-GPU IVF-Flat index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
params – [in] Multi-GPU IVF-Flat index parameters
dataset_tensor – [in] DLManagedTensor* training dataset
index – [out] Multi-GPU IVF-Flat index
- Returns:
cuvsError_t
IVF-Flat Index Search#
- cuvsError_t cuvsMultiGpuIvfFlatSearch(cuvsResources_t res, cuvsMultiGpuIvfFlatSearchParams_t params, cuvsMultiGpuIvfFlatIndex_t index, DLManagedTensor *queries_tensor, DLManagedTensor *neighbors_tensor, DLManagedTensor *distances_tensor)#
Search a Multi-GPU IVF-Flat index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
params – [in] Multi-GPU IVF-Flat search parameters
index – [in] Multi-GPU IVF-Flat index
queries_tensor – [in] DLManagedTensor* queries dataset
neighbors_tensor – [out] DLManagedTensor* output neighbors
distances_tensor – [out] DLManagedTensor* output distances
- Returns:
cuvsError_t
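When preparing the output tensors for a search, the caller has to size the neighbor and distance buffers. The sketch below assumes both outputs are row-major with shape [n_queries, k], where k is the number of neighbors requested per query; that layout, the symbol k and the helper names are assumptions for illustration, and the required element types should be checked against the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of elements in an output buffer of shape [n_queries, k]. */
size_t demo_result_elems(int64_t n_queries, int64_t k)
{
  assert(n_queries >= 0 && k > 0);
  return (size_t)n_queries * (size_t)k;
}

/* Flat offset of the rank-th nearest neighbor of a given query
   in a row-major [n_queries, k] buffer. */
size_t demo_result_offset(int64_t query, int64_t rank, int64_t k)
{
  assert(query >= 0 && rank >= 0 && rank < k);
  return (size_t)query * (size_t)k + (size_t)rank;
}
```

The same offset addresses matching entries in the neighbors and distances buffers, so a neighbor ID and its distance can be read as a pair.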
IVF-Flat Index Extend#
- cuvsError_t cuvsMultiGpuIvfFlatExtend(cuvsResources_t res, cuvsMultiGpuIvfFlatIndex_t index, DLManagedTensor *new_vectors_tensor, DLManagedTensor *new_indices_tensor)#
Extend a Multi-GPU IVF-Flat index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
index – [inout] Multi-GPU IVF-Flat index to extend
new_vectors_tensor – [in] DLManagedTensor* new vectors to add
new_indices_tensor – [in] DLManagedTensor* new indices (optional, can be NULL)
- Returns:
cuvsError_t
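Since `new_indices_tensor` is optional, a caller who wants explicit control over the IDs of the added vectors can generate them before calling extend. The helper below is a hypothetical sketch: it assumes 64-bit integer IDs and that the caller tracks the next unused ID; what the library assigns when NULL is passed is not stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill ids[0..n_new) with consecutive IDs starting at first_id, for
   example the number of rows already stored in the index. */
void demo_fill_sequential_ids(int64_t *ids, int64_t n_new, int64_t first_id)
{
  assert(ids != NULL && n_new >= 0);
  for (int64_t i = 0; i < n_new; ++i) {
    ids[i] = first_id + i;
  }
}
```

The filled array would then back the tensor passed as `new_indices_tensor`, with one ID per row of `new_vectors_tensor`.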
IVF-Flat Index Serialize#
- cuvsError_t cuvsMultiGpuIvfFlatSerialize(cuvsResources_t res, cuvsMultiGpuIvfFlatIndex_t index, const char *filename)#
Serialize a Multi-GPU IVF-Flat index to file.
- Parameters:
res – [in] cuvsResources_t opaque C handle
index – [in] Multi-GPU IVF-Flat index to serialize
filename – [in] Path to the output file
- Returns:
cuvsError_t
IVF-Flat Index Deserialize#
- cuvsError_t cuvsMultiGpuIvfFlatDeserialize(cuvsResources_t res, const char *filename, cuvsMultiGpuIvfFlatIndex_t index)#
Deserialize a Multi-GPU IVF-Flat index from file.
- Parameters:
res – [in] cuvsResources_t opaque C handle
filename – [in] Path to the input file
index – [out] Multi-GPU IVF-Flat index
- Returns:
cuvsError_t
IVF-Flat Index Distribute#
- cuvsError_t cuvsMultiGpuIvfFlatDistribute(cuvsResources_t res, const char *filename, cuvsMultiGpuIvfFlatIndex_t index)#
Distribute a local IVF-Flat index to create a Multi-GPU index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
filename – [in] Path to the local index file
index – [out] Multi-GPU IVF-Flat index
- Returns:
cuvsError_t
Multi-GPU IVF-PQ#
The Multi-GPU IVF-PQ method extends the IVF-PQ ANN algorithm to work across multiple GPUs. It provides two distribution modes: replicated (for higher throughput) and sharded (for handling larger datasets).
#include <cuvs/neighbors/mg_ivf_pq.h>
IVF-PQ Index Build Parameters#
-
typedef struct cuvsMultiGpuIvfPqIndexParams *cuvsMultiGpuIvfPqIndexParams_t#
- cuvsError_t cuvsMultiGpuIvfPqIndexParamsCreate(cuvsMultiGpuIvfPqIndexParams_t *index_params)#
Allocate Multi-GPU IVF-PQ Index params, and populate with default values.
- Parameters:
index_params – [in] cuvsMultiGpuIvfPqIndexParams_t to allocate
- Returns:
cuvsError_t
- cuvsError_t cuvsMultiGpuIvfPqIndexParamsDestroy(cuvsMultiGpuIvfPqIndexParams_t index_params)#
De-allocate Multi-GPU IVF-PQ Index params.
- Parameters:
index_params – [in] cuvsMultiGpuIvfPqIndexParams_t to de-allocate
- Returns:
cuvsError_t
-
struct cuvsMultiGpuIvfPqIndexParams#
- #include <mg_ivf_pq.h>
Multi-GPU parameters to build IVF-PQ Index.
This structure extends the base IVF-PQ index parameters with multi-GPU specific settings.
Public Members
-
cuvsIvfPqIndexParams_t base_params#
Base IVF-PQ index parameters
-
cuvsMultiGpuDistributionMode mode#
Distribution mode for multi-GPU setup
IVF-PQ Index Search Parameters#
-
typedef struct cuvsMultiGpuIvfPqSearchParams *cuvsMultiGpuIvfPqSearchParams_t#
- cuvsError_t cuvsMultiGpuIvfPqSearchParamsCreate(cuvsMultiGpuIvfPqSearchParams_t *params)#
Allocate Multi-GPU IVF-PQ search params, and populate with default values.
- Parameters:
params – [in] cuvsMultiGpuIvfPqSearchParams_t to allocate
- Returns:
cuvsError_t
- cuvsError_t cuvsMultiGpuIvfPqSearchParamsDestroy(cuvsMultiGpuIvfPqSearchParams_t params)#
De-allocate Multi-GPU IVF-PQ search params.
- Parameters:
params – [in] cuvsMultiGpuIvfPqSearchParams_t to de-allocate
- Returns:
cuvsError_t
-
struct cuvsMultiGpuIvfPqSearchParams#
- #include <mg_ivf_pq.h>
Multi-GPU parameters to search IVF-PQ index.
This structure extends the base IVF-PQ search parameters with multi-GPU specific settings.
Public Members
-
cuvsIvfPqSearchParams_t base_params#
Base IVF-PQ search parameters
-
cuvsMultiGpuReplicatedSearchMode search_mode#
Replicated search mode
-
cuvsMultiGpuShardedMergeMode merge_mode#
Sharded merge mode
-
int64_t n_rows_per_batch#
Number of rows per batch
IVF-PQ Index#
-
typedef cuvsMultiGpuIvfPqIndex *cuvsMultiGpuIvfPqIndex_t#
- cuvsError_t cuvsMultiGpuIvfPqIndexCreate(cuvsMultiGpuIvfPqIndex_t *index)#
Allocate Multi-GPU IVF-PQ index.
- Parameters:
index – [in] cuvsMultiGpuIvfPqIndex_t to allocate
- Returns:
cuvsError_t
- cuvsError_t cuvsMultiGpuIvfPqIndexDestroy(cuvsMultiGpuIvfPqIndex_t index)#
De-allocate Multi-GPU IVF-PQ index.
- Parameters:
index – [in] cuvsMultiGpuIvfPqIndex_t to de-allocate
- Returns:
cuvsError_t
-
struct cuvsMultiGpuIvfPqIndex#
- #include <mg_ivf_pq.h>
Struct to hold address of cuvs::neighbors::mg_index<ivf_pq::index> and its active trained dtype.
IVF-PQ Index Build#
- cuvsError_t cuvsMultiGpuIvfPqBuild(cuvsResources_t res, cuvsMultiGpuIvfPqIndexParams_t params, DLManagedTensor *dataset_tensor, cuvsMultiGpuIvfPqIndex_t index)#
Build a Multi-GPU IVF-PQ index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
params – [in] Multi-GPU IVF-PQ index parameters
dataset_tensor – [in] DLManagedTensor* training dataset
index – [out] Multi-GPU IVF-PQ index
- Returns:
cuvsError_t
IVF-PQ Index Search#
- cuvsError_t cuvsMultiGpuIvfPqSearch(cuvsResources_t res, cuvsMultiGpuIvfPqSearchParams_t params, cuvsMultiGpuIvfPqIndex_t index, DLManagedTensor *queries_tensor, DLManagedTensor *neighbors_tensor, DLManagedTensor *distances_tensor)#
Search a Multi-GPU IVF-PQ index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
params – [in] Multi-GPU IVF-PQ search parameters
index – [in] Multi-GPU IVF-PQ index
queries_tensor – [in] DLManagedTensor* queries dataset
neighbors_tensor – [out] DLManagedTensor* output neighbors
distances_tensor – [out] DLManagedTensor* output distances
- Returns:
cuvsError_t
IVF-PQ Index Extend#
- cuvsError_t cuvsMultiGpuIvfPqExtend(cuvsResources_t res, cuvsMultiGpuIvfPqIndex_t index, DLManagedTensor *new_vectors_tensor, DLManagedTensor *new_indices_tensor)#
Extend a Multi-GPU IVF-PQ index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
index – [inout] Multi-GPU IVF-PQ index to extend
new_vectors_tensor – [in] DLManagedTensor* new vectors to add
new_indices_tensor – [in] DLManagedTensor* new indices (optional, can be NULL)
- Returns:
cuvsError_t
IVF-PQ Index Serialize#
- cuvsError_t cuvsMultiGpuIvfPqSerialize(cuvsResources_t res, cuvsMultiGpuIvfPqIndex_t index, const char *filename)#
Serialize a Multi-GPU IVF-PQ index to file.
- Parameters:
res – [in] cuvsResources_t opaque C handle
index – [in] Multi-GPU IVF-PQ index to serialize
filename – [in] Path to the output file
- Returns:
cuvsError_t
IVF-PQ Index Deserialize#
- cuvsError_t cuvsMultiGpuIvfPqDeserialize(cuvsResources_t res, const char *filename, cuvsMultiGpuIvfPqIndex_t index)#
Deserialize a Multi-GPU IVF-PQ index from file.
- Parameters:
res – [in] cuvsResources_t opaque C handle
filename – [in] Path to the input file
index – [out] Multi-GPU IVF-PQ index
- Returns:
cuvsError_t
IVF-PQ Index Distribute#
- cuvsError_t cuvsMultiGpuIvfPqDistribute(cuvsResources_t res, const char *filename, cuvsMultiGpuIvfPqIndex_t index)#
Distribute a local IVF-PQ index to create a Multi-GPU index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
filename – [in] Path to the local index file
index – [out] Multi-GPU IVF-PQ index
- Returns:
cuvsError_t
Multi-GPU CAGRA#
The Multi-GPU CAGRA method extends the CAGRA graph-based ANN algorithm to work across multiple GPUs. It provides two distribution modes: replicated (for higher throughput) and sharded (for handling larger datasets).
#include <cuvs/neighbors/mg_cagra.h>
CAGRA Index Build Parameters#
-
typedef struct cuvsMultiGpuCagraIndexParams *cuvsMultiGpuCagraIndexParams_t#
- cuvsError_t cuvsMultiGpuCagraIndexParamsCreate(cuvsMultiGpuCagraIndexParams_t *index_params)#
Allocate Multi-GPU CAGRA Index params, and populate with default values.
- Parameters:
index_params – [in] cuvsMultiGpuCagraIndexParams_t to allocate
- Returns:
cuvsError_t
- cuvsError_t cuvsMultiGpuCagraIndexParamsDestroy(cuvsMultiGpuCagraIndexParams_t index_params)#
De-allocate Multi-GPU CAGRA Index params.
- Parameters:
index_params – [in] cuvsMultiGpuCagraIndexParams_t to de-allocate
- Returns:
cuvsError_t
-
struct cuvsMultiGpuCagraIndexParams#
- #include <mg_cagra.h>
Multi-GPU parameters to build CAGRA Index.
This structure extends the base CAGRA index parameters with multi-GPU specific settings.
Public Members
-
cuvsCagraIndexParams_t base_params#
Base CAGRA index parameters
-
cuvsMultiGpuDistributionMode mode#
Distribution mode for multi-GPU setup
CAGRA Index Search Parameters#
-
typedef struct cuvsMultiGpuCagraSearchParams *cuvsMultiGpuCagraSearchParams_t#
- cuvsError_t cuvsMultiGpuCagraSearchParamsCreate(cuvsMultiGpuCagraSearchParams_t *params)#
Allocate Multi-GPU CAGRA search params, and populate with default values.
- Parameters:
params – [in] cuvsMultiGpuCagraSearchParams_t to allocate
- Returns:
cuvsError_t
- cuvsError_t cuvsMultiGpuCagraSearchParamsDestroy(cuvsMultiGpuCagraSearchParams_t params)#
De-allocate Multi-GPU CAGRA search params.
- Parameters:
params – [in] cuvsMultiGpuCagraSearchParams_t to de-allocate
- Returns:
cuvsError_t
-
struct cuvsMultiGpuCagraSearchParams#
- #include <mg_cagra.h>
Multi-GPU parameters to search CAGRA index.
This structure extends the base CAGRA search parameters with multi-GPU specific settings.
Public Members
-
cuvsCagraSearchParams_t base_params#
Base CAGRA search parameters
-
cuvsMultiGpuReplicatedSearchMode search_mode#
Replicated search mode
-
cuvsMultiGpuShardedMergeMode merge_mode#
Sharded merge mode
-
int64_t n_rows_per_batch#
Number of rows per batch
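In sharded mode each GPU searches only its own part of the index, so the per-shard results have to be merged into one final top-k list. The sketch below shows the general idea of such a merge for two shards; it is not the library's implementation and does not model the individual merge modes. It assumes both input lists are sorted by ascending distance, that smaller distances are better, and that neighbor IDs are already global; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Merge two top-k lists (each sorted by ascending distance) into the
   k best candidates overall. Each input list must hold k entries. */
void demo_merge_topk(const float *dist_a, const int64_t *ids_a,
                     const float *dist_b, const int64_t *ids_b,
                     int64_t k, float *dist_out, int64_t *ids_out)
{
  assert(k > 0);
  int64_t a = 0, b = 0;
  for (int64_t i = 0; i < k; ++i) {
    /* a + b == i < k here, so both read positions stay in bounds. */
    if (dist_a[a] <= dist_b[b]) {
      dist_out[i] = dist_a[a];
      ids_out[i]  = ids_a[a];
      ++a;
    } else {
      dist_out[i] = dist_b[b];
      ids_out[i]  = ids_b[b];
      ++b;
    }
  }
}
```

With more than two shards the same step can be applied pairwise, either gathering everything on one GPU or combining results level by level.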
CAGRA Index#
-
typedef cuvsMultiGpuCagraIndex *cuvsMultiGpuCagraIndex_t#
- cuvsError_t cuvsMultiGpuCagraIndexCreate(cuvsMultiGpuCagraIndex_t *index)#
Allocate Multi-GPU CAGRA index.
- Parameters:
index – [in] cuvsMultiGpuCagraIndex_t to allocate
- Returns:
cuvsError_t
- cuvsError_t cuvsMultiGpuCagraIndexDestroy(cuvsMultiGpuCagraIndex_t index)#
De-allocate Multi-GPU CAGRA index.
- Parameters:
index – [in] cuvsMultiGpuCagraIndex_t to de-allocate
- Returns:
cuvsError_t
-
struct cuvsMultiGpuCagraIndex#
- #include <mg_cagra.h>
Struct to hold address of cuvs::neighbors::mg_index<cagra::index> and its active trained dtype.
CAGRA Index Build#
- cuvsError_t cuvsMultiGpuCagraBuild(cuvsResources_t res, cuvsMultiGpuCagraIndexParams_t params, DLManagedTensor *dataset_tensor, cuvsMultiGpuCagraIndex_t index)#
Build a Multi-GPU CAGRA index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
params – [in] Multi-GPU CAGRA index parameters
dataset_tensor – [in] DLManagedTensor* training dataset
index – [out] Multi-GPU CAGRA index
- Returns:
cuvsError_t
CAGRA Index Search#
- cuvsError_t cuvsMultiGpuCagraSearch(cuvsResources_t res, cuvsMultiGpuCagraSearchParams_t params, cuvsMultiGpuCagraIndex_t index, DLManagedTensor *queries_tensor, DLManagedTensor *neighbors_tensor, DLManagedTensor *distances_tensor)#
Search a Multi-GPU CAGRA index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
params – [in] Multi-GPU CAGRA search parameters
index – [in] Multi-GPU CAGRA index
queries_tensor – [in] DLManagedTensor* queries dataset
neighbors_tensor – [out] DLManagedTensor* output neighbors
distances_tensor – [out] DLManagedTensor* output distances
- Returns:
cuvsError_t
CAGRA Index Extend#
- cuvsError_t cuvsMultiGpuCagraExtend(cuvsResources_t res, cuvsMultiGpuCagraIndex_t index, DLManagedTensor *new_vectors_tensor, DLManagedTensor *new_indices_tensor)#
Extend a Multi-GPU CAGRA index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
index – [inout] Multi-GPU CAGRA index to extend
new_vectors_tensor – [in] DLManagedTensor* new vectors to add
new_indices_tensor – [in] DLManagedTensor* new indices (optional, can be NULL)
- Returns:
cuvsError_t
CAGRA Index Serialize#
- cuvsError_t cuvsMultiGpuCagraSerialize(cuvsResources_t res, cuvsMultiGpuCagraIndex_t index, const char *filename)#
Serialize a Multi-GPU CAGRA index to file.
- Parameters:
res – [in] cuvsResources_t opaque C handle
index – [in] Multi-GPU CAGRA index to serialize
filename – [in] Path to the output file
- Returns:
cuvsError_t
CAGRA Index Deserialize#
- cuvsError_t cuvsMultiGpuCagraDeserialize(cuvsResources_t res, const char *filename, cuvsMultiGpuCagraIndex_t index)#
Deserialize a Multi-GPU CAGRA index from file.
- Parameters:
res – [in] cuvsResources_t opaque C handle
filename – [in] Path to the input file
index – [out] Multi-GPU CAGRA index
- Returns:
cuvsError_t
CAGRA Index Distribute#
- cuvsError_t cuvsMultiGpuCagraDistribute(cuvsResources_t res, const char *filename, cuvsMultiGpuCagraIndex_t index)#
Distribute a local CAGRA index to create a Multi-GPU index.
- Parameters:
res – [in] cuvsResources_t opaque C handle
filename – [in] Path to the local index file
index – [out] Multi-GPU CAGRA index
- Returns:
cuvsError_t