pairwise_distances
- cuml.metrics.pairwise_distances(X, Y=None, metric='euclidean', convert_dtype=True, **kwds)
Compute the distance matrix from a feature array X and optional Y.
This function takes either one or two feature arrays, and returns a distance matrix.
- Parameters:
- X : {array-like, sparse matrix}, shape=(n_samples_X, n_features)
A feature array.
- Y : {array-like, sparse matrix}, shape=(n_samples_Y, n_features), default=None
A second feature array. If None, Y=X will be used.
- metric : str, default="euclidean"
The metric to use when calculating the distance between instances in a feature array. Valid options depend on the input type:
Dense and sparse data: ['canberra', 'chebyshev', 'cityblock', 'cosine', 'euclidean', 'hellinger', 'l1', 'l2', 'manhattan', 'minkowski', 'sqeuclidean'].
Dense data only: ['correlation', 'hamming', 'jensenshannon', 'kldivergence', 'nan_euclidean', 'russellrao'].
Sparse data only: ['dice', 'inner_product', 'jaccard'].
- convert_dtype : bool, optional (default = True)
When set to True, the function will, when necessary, convert Y to the same data type as X if they differ. The conversion increases the memory used by the function.
- **kwds : optional keyword parameters
Any additional metric-specific parameters. For example, with metric="minkowski", passing p sets the norm used.
- Returns:
- D : array, shape=(n_samples_X, n_samples_X) or (n_samples_X, n_samples_Y)
A distance matrix D such that, if Y is None, D_{i, j} is the distance between the ith and jth vectors of X. If Y is not None, D_{i, j} is the distance between the ith vector of X and the jth vector of Y.
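To make the shape and indexing of D concrete, here is a minimal NumPy reference sketch of the Euclidean case on the CPU. It is not how cuML computes distances on the GPU, and the helper name `reference_pairwise_euclidean` is hypothetical, introduced only for illustration.

```python
import numpy as np

def reference_pairwise_euclidean(X, Y=None):
    """CPU reference (hypothetical helper): D[i, j] = ||X[i] - Y[j]||_2."""
    X = np.asarray(X, dtype=float)
    # Mirror the documented behaviour: if Y is None, compare X with itself.
    Y = X if Y is None else np.asarray(Y, dtype=float)
    # Broadcast to (n_samples_X, n_samples_Y, n_features), then reduce
    # over the feature axis.
    diff = X[:, None, :] - Y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

X = np.array([[0., 0., 0.], [1., 1., 1.]])
Y = np.array([[1., 0., 0.], [1., 1., 0.], [2., 2., 2.]])

D_xx = reference_pairwise_euclidean(X)     # shape (n_samples_X, n_samples_X)
D_xy = reference_pairwise_euclidean(X, Y)  # shape (n_samples_X, n_samples_Y)
```

With two rows in X and three in Y, `D_xx` is 2 x 2 with a zero diagonal and `D_xy` is 2 x 3. This broadcasting approach materialises an intermediate array of size n_samples_X x n_samples_Y x n_features, so it is suitable only for small inputs.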
Examples
>>> import cupy as cp
>>> from cuml.metrics import pairwise_distances
>>> X = cp.array([[0., 0., 0.], [1., 1., 1.]])
>>> Y = cp.array([[1., 0., 0.], [1., 1., 0.]])
>>> pairwise_distances(X, metric="sqeuclidean")
array([[0., 3.],
       [3., 0.]])
>>> pairwise_distances(X, Y, metric="sqeuclidean")
array([[1., 2.],
       [2., 1.]])
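The `p` keyword described under `**kwds` selects the order of the Minkowski norm. The sketch below uses plain NumPy on the CPU to show what that choice means for the same X and Y as above. `reference_minkowski` is a hypothetical helper, not part of cuML; going by the documented signature, the corresponding cuML call would presumably be `pairwise_distances(X, Y, metric="minkowski", p=1)`.

```python
import numpy as np

def reference_minkowski(X, Y, p=2):
    """CPU reference (hypothetical helper): D[i, j] = (sum_k |X[i,k] - Y[j,k]|^p)^(1/p)."""
    # Absolute feature-wise differences, shape (n_samples_X, n_samples_Y, n_features).
    diff = np.abs(X[:, None, :] - Y[None, :, :])
    return (diff ** p).sum(axis=-1) ** (1.0 / p)

X = np.array([[0., 0., 0.], [1., 1., 1.]])
Y = np.array([[1., 0., 0.], [1., 1., 0.]])

D_p1 = reference_minkowski(X, Y, p=1)  # p=1 coincides with cityblock / manhattan
D_p2 = reference_minkowski(X, Y, p=2)  # p=2 coincides with euclidean
```

For these inputs, `D_p1` equals the squared Euclidean result shown above because every coordinate difference is 0 or 1; that is a property of this data, not of the metrics in general.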