Note

This page is reference documentation. It only explains the class signature, not how to use the class. Please refer to the user guide for the big picture.

nilearn.regions.HierarchicalKMeans

class nilearn.regions.HierarchicalKMeans(n_clusters=None, init='k-means++', batch_size=1000, n_init=10, max_no_improvement=10, verbose=0, random_state=0, scaling=False)

Hierarchical KMeans.

First cluster the samples into large clusters, then cluster the samples within each large cluster into smaller ones.
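The two-stage idea can be sketched in plain NumPy. This is only an illustration under stated assumptions: it uses a basic Lloyd iteration instead of MiniBatchKMeans, and the helper names (lloyd_kmeans, two_stage_kmeans) and the way the cluster count is split between the two stages are hypothetical, not nilearn's implementation.

```python
import numpy as np


def lloyd_kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd k-means; returns one integer label per row of points."""
    # Start from k distinct rows picked at random (fancy indexing copies).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def two_stage_kmeans(points, n_coarse, n_fine_per_coarse, seed=0):
    """Hypothetical helper: coarse clustering, then clustering inside each part."""
    rng = np.random.default_rng(seed)
    coarse = lloyd_kmeans(points, n_coarse, rng)
    labels = np.empty(len(points), dtype=int)
    next_label = 0
    for c in np.unique(coarse):
        idx = np.flatnonzero(coarse == c)
        k = min(n_fine_per_coarse, len(idx))
        fine = lloyd_kmeans(points[idx], k, rng)
        # Renumber the fine labels so the final labels are contiguous.
        _, fine = np.unique(fine, return_inverse=True)
        labels[idx] = next_label + fine
        next_label += fine.max() + 1
    return labels


# Made-up data: 200 points in 3 dimensions.
points = np.random.default_rng(42).normal(size=(200, 3))
labels = two_stage_kmeans(points, n_coarse=4, n_fine_per_coarse=5)
print(labels.max() + 1, "clusters found")
```

Clustering inside each coarse group keeps every k-means problem small, which is the usual motivation for a hierarchical scheme. The number of clusters actually returned can be lower than the product of the two counts when some groups end up empty.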

Parameters:
n_clusters : int

The number of clusters to find.

init : {‘k-means++’, ‘random’} or ndarray, default=‘k-means++’

Method for initialization.

  • ‘k-means++’ : selects initial cluster centers for k-means clustering in a smart way to speed up convergence. See section Notes in k_init for more details.

  • ‘random’: choose k observations (rows) at random from data for the initial centroids.

  • If an ndarray is passed, it should be of shape (n_clusters, n_features) and gives the initial centers.

batch_size : int, default=1000

Size of the mini batches (k-means is performed through MiniBatchKMeans).

n_init : int, default=10

Number of random initializations that are tried. In contrast to KMeans, the algorithm is only run once, using the best of the n_init initializations as measured by inertia.

max_no_improvement : int, default=10

Control early stopping based on the number of consecutive mini batches that do not yield an improvement on the smoothed inertia. To disable convergence detection based on inertia, set max_no_improvement to None.

verbose : int, default=0

Verbosity level (0 means no message).

random_state : int, RandomState instance or None, default=0

Determines random number generation for centroid initialization and random reassignment. Use an int to make the randomness deterministic.

scaling : bool, default=False

If scaling is True, each cluster is scaled by the square root of its size during transform(), preserving the l2-norm of the image. inverse_transform() applies the inverse scaling to yield an image with the same l2-norm as the input.
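The arithmetic behind the scaling option can be checked with a small NumPy sketch. It does not call nilearn; it assumes the reduced signal is the per-cluster mean, and the labels and data are made up for illustration.

```python
import numpy as np

# Hypothetical example: 6 features grouped into 2 clusters of sizes 4 and 2.
labels = np.array([0, 0, 0, 0, 1, 1])
sizes = np.bincount(labels)

x = np.array([[1.0, 3.0, 2.0, 2.0, 5.0, 7.0]])  # one sample

# Reduce: per-cluster mean, then multiply by sqrt(cluster size).
means = np.stack(
    [x[:, labels == k].mean(axis=1) for k in range(len(sizes))], axis=1
)
x_red = means * np.sqrt(sizes)

# Expand: undo the scaling, then copy each cluster value to its features.
x_inv = (x_red / np.sqrt(sizes))[:, labels]

print(np.linalg.norm(x), np.linalg.norm(x_red), np.linalg.norm(x_inv))
```

In this sketch the reduced and expanded arrays always share the same l2-norm. That norm equals the norm of the original sample only when the signal is constant within each cluster; otherwise averaging removes some energy.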

Attributes:
labels_ : ndarray, shape = [n_features]

Cluster labels for each feature.

n_features_in_ : int

Number of features seen during fit.

sizes_ : ndarray, shape = [n_clusters]

Size of each cluster.
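As a rough illustration of how the two attributes relate, cluster sizes can be derived from a label vector by counting. The label values below are invented, and the sketch assumes labels are integers running from 0 to n_clusters - 1.

```python
import numpy as np

# Hypothetical labels_ for 6 features assigned to 3 clusters.
labels = np.array([2, 0, 1, 1, 0, 1])

# Count how many features carry each label: one entry per cluster.
sizes = np.bincount(labels)
print(sizes)  # [2 3 1]
```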

fit(X, y=None)

Compute clustering of the data.

Parameters:
X : ndarray, shape = [n_samples, n_features]

Training data.

y : None

This parameter is unused. It is solely included for scikit-learn compatibility.

Returns:
self
fit_predict(X, y=None, **kwargs)

Perform clustering on X and return cluster labels.

Parameters:
X : array-like of shape (n_samples, n_features)

Input data.

y : Ignored

Not used, present for API consistency by convention.

**kwargs : dict

Arguments to be passed to fit.

Added in version 1.4.

Returns:
labels : ndarray of shape (n_samples,), dtype=np.int64

Cluster labels.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
X : array-like of shape (n_samples, n_features)

Input samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params : dict

Additional fit parameters.

Returns:
X_new : ndarray of shape (n_samples, n_features_new)

Transformed array.

get_feature_names_out(input_features=None)

Get output feature names for transformation.

The output feature names will be prefixed by the lowercased class name. For example, if the transformer outputs 3 features, then the output feature names are: [“class_name0”, “class_name1”, “class_name2”].

Parameters:
input_features : array-like of str or None, default=None

Only used to validate feature names with the names seen in fit.

Returns:
feature_names_out : ndarray of str objects

Transformed feature names.
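For this class, the naming rule described above would presumably produce names like the ones built below. The sketch only reproduces the string pattern from the description and does not call nilearn; the output count of 3 is an arbitrary assumption.

```python
# Build names following the "lowercased class name + index" pattern.
class_name = "HierarchicalKMeans"
n_outputs = 3  # assumed number of output features, for illustration only

names = [f"{class_name.lower()}{i}" for i in range(n_outputs)]
print(names)
# ['hierarchicalkmeans0', 'hierarchicalkmeans1', 'hierarchicalkmeans2']
```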

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

inverse_transform(X_red)

Send the reduced 2D data matrix back to the original feature space (voxels).

Parameters:
X_red : ndarray, shape = [n_samples, n_clusters]

Data reduced with agglomerated signal for each cluster.

Returns:
X_inv : ndarray, shape = [n_samples, n_features]

Reduced data expanded back to the original feature space.
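Conceptually, the expansion copies each cluster value to every feature that belongs to that cluster. A minimal NumPy sketch, with made-up labels and values and with the scaling option ignored:

```python
import numpy as np

# Hypothetical fitted labels: 6 features assigned to 3 clusters.
labels = np.array([0, 0, 1, 1, 1, 2])

# Reduced data: 2 samples, one value per cluster.
X_red = np.array([[0.5, 3.0, 5.0],
                  [6.5, 9.0, 11.0]])

# Indexing the cluster axis with the labels repeats each cluster value
# once per member feature.
X_inv = X_red[:, labels]
print(X_inv.shape)  # (2, 6)
```

The result is constant within each cluster, so any variation inside a cluster that was averaged away during the reduction is not recovered.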

set_inverse_transform_request(*, X_red='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the inverse_transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to inverse_transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to inverse_transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
X_red : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for X_red parameter in inverse_transform.

Returns:
self : object

The updated object.

set_output(*, transform=None)

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform : {“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
self : estimator instance

Estimator instance.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

transform(X, y=None)

Apply clustering, reduce the dimensionality of the data.

Parameters:
X : ndarray, shape = [n_samples, n_features]

Data to transform with the fitted clustering.

y : None

This parameter is unused. It is solely included for scikit-learn compatibility.

Returns:
X_red: numpy.ndarray, pandas.DataFrame or polars.DataFrame

Data reduced with agglomerated signal for each cluster.

The type of the output is determined by set_output(); see the scikit-learn documentation.
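The reduction step can be pictured as averaging the columns of X that share a label. The sketch below is a NumPy illustration, not nilearn's code: it assumes the agglomerated signal is the per-cluster mean, ignores the scaling option, and uses a hypothetical helper name (reduce_by_cluster_mean).

```python
import numpy as np


def reduce_by_cluster_mean(X, labels):
    """Average the columns of X that share a label.

    X has shape (n_samples, n_features); labels has shape (n_features,).
    Assumes every label from 0 to labels.max() occurs at least once.
    """
    n_clusters = labels.max() + 1
    # One-hot membership matrix of shape (n_features, n_clusters).
    membership = np.eye(n_clusters)[labels]
    sizes = membership.sum(axis=0)
    # Summing member columns and dividing by the size gives the mean.
    return (X @ membership) / sizes


X = np.arange(12, dtype=float).reshape(2, 6)
labels = np.array([0, 0, 1, 1, 1, 2])
X_red = reduce_by_cluster_mean(X, labels)
print(X_red)
```

Writing the reduction as a matrix product keeps it vectorized over samples; for a large number of features a sparse membership matrix would be the more memory-friendly choice.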