This page discusses how clustering can be used to parcellate the brain into homogeneous regions from functional imaging data.

**Reference**

A big-picture reference on the use of clustering for brain parcellations.

Thirion, et al. “Which fMRI clustering gives good brain parcellations?” Frontiers in Neuroscience 8.167 (2014): 13.

Clustering is commonly applied to resting-state data, but any brain
functional data will give rise to a functional parcellation, capturing
intrinsic brain architecture in the case of resting-state data.
In the examples, we use rest data downloaded with the function
`fetch_adhd` (see *Inputing data: file names or image objects*).

Before clustering, the brain volumes need to be turned into a data matrix,
for instance of time series. The `nilearn.input_data.NiftiMasker`
extracts these from the voxels inside a mask. If no mask is given with the
data, the masker can compute one.

The masker can perform important *preprocessing operations*, such as detrending signals, standardizing
them, removing confounds, or smoothing the images.
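
As a rough illustration of what the masking step produces, the sketch below uses plain NumPy on synthetic data rather than `NiftiMasker` itself. The array shapes and variable names are made up for the example, and the code only mimics the extraction and standardization steps; it is not how the masker is implemented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a 4D functional image: (x, y, z, n_scans)
volumes = rng.normal(size=(4, 5, 3, 20))

# Synthetic boolean mask selecting the "brain" voxels
mask = np.zeros((4, 5, 3), dtype=bool)
mask[1:3, 1:4, :] = True

# Boolean indexing on the spatial axes gives (n_voxels, n_scans);
# transposing yields the (n_scans, n_voxels) data matrix
X = volumes[mask].T

# Standardize each voxel time series to zero mean and unit variance
X = (X - X.mean(axis=0)) / X.std(axis=0)

print(X.shape)  # (20, 18)
```

Each column of `X` is the time series of one in-mask voxel, which is the layout the clustering steps below expect.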

**Example code**

All the steps discussed in this section can be seen implemented in
*a full code example*.

**Which clustering to use**

The question of which clustering method to use is in itself subject to debate. There are many clustering methods; their computational cost will vary, as well as their results. A well-cited empirical comparison suggests that:

- For a large number of clusters, it is preferable to use Ward agglomerative clustering with spatial constraints.
- For a small number of clusters, it is preferable to use K-means clustering after spatially smoothing the data.

Both clustering algorithms (as well as many others) are provided by scikit-learn. Ward clustering is the easiest to use, as it can be done with the `FeatureAgglomeration` object. It is also very fast. We detail it below.

**Compute a connectivity matrix**

Before applying Ward’s method, we compute a spatial neighborhood matrix,
also known as a connectivity matrix. This is useful to constrain clusters
to form contiguous parcels (see the scikit-learn documentation).

This is done from the mask computed by the masker: the mask is a niimg, from which we extract a numpy array and then build the connectivity matrix.
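
A minimal sketch of this step, assuming the mask has already been extracted as a boolean NumPy array (a small synthetic one is used here), could rely on scikit-learn's `grid_to_graph`:

```python
import numpy as np
from sklearn.feature_extraction.image import grid_to_graph

# Synthetic boolean mask standing in for the array extracted from the mask image
mask = np.zeros((4, 5, 3), dtype=bool)
mask[1:3, 1:4, :] = True

# One row and one column per in-mask voxel; non-zero entries link
# voxels that are adjacent on the grid
connectivity = grid_to_graph(
    n_x=mask.shape[0], n_y=mask.shape[1], n_z=mask.shape[2], mask=mask
)

print(connectivity.shape)  # (18, 18)
```

The result is a sparse matrix, so it stays small even for masks with many voxels.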

**Ward clustering principle**

Ward’s algorithm is a hierarchical clustering algorithm: it
recursively merges voxels, then clusters, that have similar signals
(parameters, measurements or time courses).
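
To make the principle concrete, here is a small sketch on synthetic data (not the rest data used in the examples): a 6 x 6 grid is split into two regions, each with its own time course plus a little noise, and spatially constrained Ward clustering is asked for two clusters. The grid size, noise level and variable names are assumptions chosen for the illustration.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.feature_extraction.image import grid_to_graph

rng = np.random.default_rng(0)
n_scans, n_x, n_y = 50, 6, 6

# Two regions, each with its own time course plus a little noise
region_a, region_b = rng.normal(size=(2, n_scans))
data = np.empty((n_scans, n_x, n_y))
data[:, :3, :] = region_a[:, None, None]
data[:, 3:, :] = region_b[:, None, None]
data += 0.01 * rng.normal(size=data.shape)

# Data matrix of shape (n_scans, n_voxels) and grid connectivity
X = data.reshape(n_scans, n_x * n_y)
connectivity = grid_to_graph(n_x, n_y)

# Ward clustering merges the features (voxels) with similar time courses
ward = FeatureAgglomeration(n_clusters=2, connectivity=connectivity, linkage="ward")
ward.fit(X)

# Put the labels back on the grid to inspect the parcels
label_map = ward.labels_.reshape(n_x, n_y)
print(label_map)
```

Because voxels within a region are far more similar to each other than to voxels of the other region, the two recovered clusters should coincide with the two regions.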

**Caching** In practice the implementation of Ward clustering first
computes a tree of possible merges, and then, given a requested number of
clusters, breaks apart the tree at the right level.

As the tree is independent of the number of clusters, we can rely on
caching to speed things up when varying the number of clusters. In Ward
clustering, the *memory* parameter is used to cache the computed component
tree. You can give it either a *joblib.Memory* instance or the name of a
directory used for caching.
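
A sketch of how caching might be used when trying several numbers of clusters is given below, on synthetic data. The temporary cache directory is a placeholder, and `compute_full_tree=True` is an assumption made here so that the cached tree does not depend on `n_clusters`.

```python
import shutil
import tempfile

import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.feature_extraction.image import grid_to_graph

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8 * 8))      # synthetic (n_scans, n_voxels) matrix
connectivity = grid_to_graph(8, 8)

cache_dir = tempfile.mkdtemp()        # placeholder cache location
n_found = {}
try:
    for n_clusters in (2, 5, 10):
        ward = FeatureAgglomeration(
            n_clusters=n_clusters,
            connectivity=connectivity,
            linkage="ward",
            memory=cache_dir,         # a directory name or a joblib.Memory
            compute_full_tree=True,   # assumption: build the whole tree each time
        )
        ward.fit(X)                   # later fits can reuse the cached tree
        n_found[n_clusters] = len(np.unique(ward.labels_))
finally:
    # Remove the temporary cache; a real analysis would usually keep it
    shutil.rmtree(cache_dir, ignore_errors=True)

print(n_found)
```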

Note

Ward clustering into 1000 parcels typically runs in about 10 seconds. Admittedly, this is very fast.

For every scikit-learn clustering object, the labels of the parcellation
are found in its *labels_* attribute after fitting it to the data. To turn
them into a brain image, we need to unmask them with the `NiftiMasker`
*inverse_transform* method.

Note that by default, clusters are labeled from 0 to (n_clusters - 1), and the label 0 may be confused with a background.
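
One simple way around this is to shift the labels by one before unmasking. The NumPy sketch below illustrates the idea on a synthetic mask, with a made-up `labels` array standing in for the fitted object's `labels_`; it mimics the unmasking by hand and is not the masker's own code.

```python
import numpy as np

# Synthetic boolean mask
mask = np.zeros((4, 5, 3), dtype=bool)
mask[1:3, 1:4, :] = True

# Stand-in for the fitted labels: one integer in [0, n_clusters - 1]
# per in-mask voxel
labels = np.arange(mask.sum()) % 3

# Shift the labels by one so that 0 is reserved for the background,
# then place them back at the in-mask voxel positions
label_volume = np.zeros(mask.shape, dtype=int)
label_volume[mask] = labels + 1
```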

To visualize the clusters, we assign a random color to each label.
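
As a sketch of the idea, assuming a 2D slice of a label map in which 0 is the background (a random one is generated here), a random color lookup table can be built with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in label map: 0 is background, 1..n_clusters are parcels
n_clusters = 5
label_map = rng.integers(0, n_clusters + 1, size=(6, 6))

# One random RGB color per label, with black kept for the background
colors = rng.random((n_clusters + 1, 3))
colors[0] = 0.0

# Index the lookup table with the labels to get a (6, 6, 3) RGB image,
# which can be displayed with, e.g., matplotlib's imshow
rgb = colors[label_map]
```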

The clustering can be used to transform the data into a smaller representation, taking the average on each parcel:

- call *ward.transform* to obtain the mean value of each cluster (for each scan)
- call *ward.inverse_transform* on the previous result to turn it back into the masked picture shape

We can see that using only 2000 parcels, the original image is well approximated.
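
The two calls can be sketched on synthetic data as follows. The data matrix, grid size and number of clusters are assumptions chosen to keep the example small; they are far smaller than a real brain parcellation.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.feature_extraction.image import grid_to_graph

rng = np.random.default_rng(0)
n_scans, n_x, n_y = 30, 8, 8
X = rng.normal(size=(n_scans, n_x * n_y))   # synthetic (n_scans, n_voxels)
connectivity = grid_to_graph(n_x, n_y)

ward = FeatureAgglomeration(n_clusters=10, connectivity=connectivity, linkage="ward")
ward.fit(X)

# Mean signal of each cluster, for each scan: (n_scans, n_clusters)
X_reduced = ward.transform(X)

# Back to voxel space: every voxel gets its cluster mean, (n_scans, n_voxels)
X_approx = ward.inverse_transform(X_reduced)

print(X_reduced.shape, X_approx.shape)  # (30, 10) (30, 64)
```

With real data, `X_approx` would then be passed to the masker's *inverse_transform* to obtain the compressed images.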

**Example code**

All the steps discussed in this section can be seen implemented in
*a full code example*.