make_blobs
- cuml.datasets.make_blobs(n_samples=100, n_features=2, centers=None, cluster_std=1.0, center_box=(-10.0, 10.0), shuffle=True, random_state=None, return_centers=False, order='F', dtype='float32')
Generate isotropic Gaussian blobs for clustering.
- Parameters:
- n_samples: int or array-like, optional (default=100)
If int, it is the total number of points equally divided among clusters. If array-like, each element of the sequence indicates the number of samples per cluster.
- n_features: int, optional (default=2)
The number of features for each sample.
- centers: int or array of shape [n_centers, n_features], optional (default=None)
The number of centers to generate, or the fixed center locations. If n_samples is an int and centers is None, 3 centers are generated. If n_samples is array-like, centers must be either None or an array of length equal to the length of n_samples.
- cluster_std: float or sequence of floats, optional (default=1.0)
The standard deviation of the clusters.
- center_box: pair of floats (min, max), optional (default=(-10.0, 10.0))
The bounding box for each cluster center when centers are generated at random.
- shuffle: boolean, optional (default=True)
Shuffle the samples.
- random_state: int, RandomState instance, optional (default=None)
Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls.
- return_centers: bool, optional (default=False)
If True, also return the centers of each cluster.
- order: str, optional (default='F')
The order of the generated samples.
- dtype: str, optional (default='float32')
The dtype of the generated samples.
- Returns:
- X: device array of shape [n_samples, n_features]
The generated samples.
- y: device array of shape [n_samples]
The integer labels for cluster membership of each sample.
- centers: device array, shape [n_centers, n_features]
The centers of each cluster. Only returned if return_centers=True.
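The parameter rules above can be sketched in plain Python. This is only an illustration of the documented behaviour, not cuML's implementation, which runs on the GPU and returns device arrays. The function name `sketch_make_blobs`, the `seed` argument (standing in for `random_state`), and the way a remainder is distributed when an integer `n_samples` does not divide evenly are assumptions made for this sketch.

```python
import random


def sketch_make_blobs(n_samples=100, n_features=2, centers=None,
                      cluster_std=1.0, center_box=(-10.0, 10.0),
                      shuffle=True, seed=None):
    """Pure-Python illustration of the documented parameters (not cuML code)."""
    rng = random.Random(seed)

    # Resolve the number of clusters and the per-cluster sample counts.
    if isinstance(n_samples, int):
        if centers is None:
            centers = 3  # documented default when n_samples is an int
        n_centers = centers if isinstance(centers, int) else len(centers)
        # Assumption: any remainder is given to the first clusters.
        base, extra = divmod(n_samples, n_centers)
        counts = [base + (1 if i < extra else 0) for i in range(n_centers)]
    else:
        counts = list(n_samples)
        n_centers = len(counts)
        if centers is not None:
            if isinstance(centers, int) or len(centers) != n_centers:
                raise ValueError(
                    "centers must be None or match the length of n_samples")

    # Draw centers uniformly inside center_box unless fixed ones were given.
    if centers is None or isinstance(centers, int):
        low, high = center_box
        centers = [[rng.uniform(low, high) for _ in range(n_features)]
                   for _ in range(n_centers)]

    # Broadcast a scalar cluster_std to one value per cluster.
    if isinstance(cluster_std, (int, float)):
        stds = [float(cluster_std)] * n_centers
    else:
        stds = list(cluster_std)

    # Isotropic Gaussian: the same standard deviation along every feature.
    X, y = [], []
    for label, (center, std, count) in enumerate(zip(centers, stds, counts)):
        for _ in range(count):
            X.append([rng.gauss(mu, std) for mu in center])
            y.append(label)

    # Shuffle samples and labels together so they stay aligned.
    if shuffle:
        order = list(range(len(X)))
        rng.shuffle(order)
        X = [X[i] for i in order]
        y = [y[i] for i in order]
    return X, y, centers


X, y, centers = sketch_make_blobs(n_samples=[3, 3, 4], seed=0)
print(len(X), len(X[0]))               # 10 2
print([y.count(k) for k in range(3)])  # [3, 3, 4]
```

Passing the same `seed` twice yields identical output, mirroring the reproducibility that an integer `random_state` provides. The generated values themselves will not match those produced by cuML.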
See also
make_classification: a more intricate variant
Examples
>>> from sklearn.datasets import make_blobs
>>> X, y = make_blobs(n_samples=10, centers=3, n_features=2,
...                   random_state=0)
>>> print(X.shape)
(10, 2)
>>> y
array([0, 0, 1, 0, 2, 2, 2, 1, 1, 0])
>>> X, y = make_blobs(n_samples=[3, 3, 4], centers=None, n_features=2,
...                   random_state=0)
>>> print(X.shape)
(10, 2)
>>> y
array([0, 1, 2, 0, 2, 2, 2, 1, 1, 0])