8.4.12. Functional connectivity matrices for group analysis of connectomes

This example compares different kinds of functional connectivity between regions of interest : correlation, partial correlation, as well as a kind called tangent. The resulting connectivity coefficients are used to discriminate ADHD patients from healthy controls and the tangent kind outperforms the standard connectivity kinds.

# A useful matrix plotting function
import numpy as np
import matplotlib.pylab as plt

def plot_matrices(matrices, matrix_kind):
    n_matrices = len(matrices)
    plt.figure(figsize=(n_matrices * 4, 4))
    for n_subject, matrix in enumerate(matrices):
        plt.subplot(1, n_matrices, n_subject + 1)
        matrix = matrix.copy()  # avoid side effects
        # Set diagonal to zero, for better visualization
        np.fill_diagonal(matrix, 0)
        vmax = np.max(np.abs(matrix))
        plt.imshow(matrix, vmin=-vmax, vmax=vmax, cmap='RdBu_r',
        plt.title('{0}, subject {1}'.format(matrix_kind, n_subject)) Load ADHD dataset and MSDL atlas

We study only 20 subjects from the ADHD dataset, to save computation time.

from nilearn import datasets

adhd_data = datasets.fetch_adhd(n_subjects=20)

We use probabilistic regions of interest (ROIs) from the MSDL atlas.

msdl_data = datasets.fetch_atlas_msdl()
msdl_coords = msdl_data.region_coords
n_regions = len(msdl_coords)
print('MSDL has {0} ROIs, part of the following networks :\n{1}.'.format(
    n_regions, msdl_data.networks))


MSDL has 39 ROIs, part of the following networks :
['Aud', 'Aud', 'Striate', 'DMN', 'DMN', 'DMN', 'DMN', 'Occ post', 'Motor', 'R V Att', 'R V Att', 'R V Att', 'R V Att', 'Basal', 'L V Att', 'L V Att', 'L V Att', 'D Att', 'D Att', 'Vis Sec', 'Vis Sec', 'Vis Sec', 'Salience', 'Salience', 'Salience', 'Temporal', 'Temporal', 'Language', 'Language', 'Language', 'Language', 'Language', 'Cereb', 'Dors PCC', 'Cing-Ins', 'Cing-Ins', 'Cing-Ins', 'Ant IPS', 'Ant IPS']. Region signals extraction

To extract regions time series, we instantiate a nilearn.input_data.NiftiMapsMasker object and pass the atlas the file name to it, as well as filtering band-width and detrending option.

from nilearn import input_data

masker = input_data.NiftiMapsMasker(
    msdl_data.maps, resampling_target="data", t_r=2.5, detrend=True,
    low_pass=.1, high_pass=.01, memory='nilearn_cache', memory_level=1)

Then we compute region signals and extract useful phenotypic informations.

adhd_subjects = []
pooled_subjects = []
site_names = []
adhd_labels = []  # 1 if ADHD, 0 if control
for func_file, confound_file, phenotypic in zip(
        adhd_data.func, adhd_data.confounds, adhd_data.phenotypic):
    time_series = masker.fit_transform(func_file, confounds=confound_file)
    is_adhd = phenotypic['adhd']
    if is_adhd:


print('Data has {0} ADHD subjects.'.format(len(adhd_subjects)))


Data has 13 ADHD subjects. ROI-to-ROI correlations of ADHD patients

The simpler and most commonly used kind of connectivity is correlation. It models the full (marginal) connectivity between pairwise ROIs. We can estimate it using nilearn.connectome.ConnectivityMeasure.

from nilearn.connectome import ConnectivityMeasure

correlation_measure = ConnectivityMeasure(kind='correlation')

From the list of ROIs time-series for ADHD subjects, the correlation_measure computes individual correlation matrices.

correlation_matrices = correlation_measure.fit_transform(adhd_subjects)

# All individual coefficients are stacked in a unique 2D matrix.
print('Correlations of ADHD patients are stacked in an array of shape {0}'


Correlations of ADHD patients are stacked in an array of shape (13, 39, 39)

as well as the average correlation across all fitted subjects.

mean_correlation_matrix = correlation_measure.mean_
print('Mean correlation has shape {0}.'.format(mean_correlation_matrix.shape))


Mean correlation has shape (39, 39).

We display the connectomes of the first 3 ADHD subjects and the mean correlation matrix over all ADHD patients.

from nilearn import plotting

plot_matrices(correlation_matrices[:4], 'correlation')
plotting.plot_connectome(mean_correlation_matrix, msdl_coords,
                         title='mean correlation over 13 ADHD subjects')
  • ../../_images/sphx_glr_plot_group_level_connectivity_001.png
  • ../../_images/sphx_glr_plot_group_level_connectivity_002.png

Look at blocks structure, reflecting functional networks. Examine partial correlations

We can also study direct connections, revealed by partial correlation coefficients. We just change the ConnectivityMeasure kind

partial_correlation_measure = ConnectivityMeasure(kind='partial correlation')

and repeat the previous operation.

partial_correlation_matrices = partial_correlation_measure.fit_transform(

Most of direct connections are weaker than full connections, resulting in a sparse mean connectome graph.

plot_matrices(partial_correlation_matrices[:4], 'partial')
    partial_correlation_measure.mean_, msdl_coords,
    title='mean partial correlation over 13 ADHD subjects')
  • ../../_images/sphx_glr_plot_group_level_connectivity_003.png
  • ../../_images/sphx_glr_plot_group_level_connectivity_004.png Extract subjects variabilities around a robust group connectivity

We can use both correlations and partial correlations to capture reproducible connectivity patterns at the group-level and build a robust group connectivity matrix. This is done by the tangent kind.

tangent_measure = ConnectivityMeasure(kind='tangent')

We fit our ADHD group and get the group connectivity matrix stored as in tangent_measure.mean_, and individual deviation matrices of each subject from it.

tangent_matrices = tangent_measure.fit_transform(adhd_subjects)

tangent_matrices model individual connectivities as perturbations of the group connectivity matrix tangent_measure.mean_. Keep in mind that these subjects-to-group variability matrices do not straight reflect individual brain connections. For instance negative coefficients can not be interpreted as anticorrelated regions.

plot_matrices(tangent_matrices[:4], 'tangent variability')
    tangent_measure.mean_, msdl_coords,
    title='mean tangent connectivity over 13 ADHD subjects')
  • ../../_images/sphx_glr_plot_group_level_connectivity_005.png
  • ../../_images/sphx_glr_plot_group_level_connectivity_006.png

The mean connectome graph is not as sparse as partial correlations graph, yet it is less dense than correlations graph. What kind of connectivity is most powerful for classification?

ConnectivityMeasure can output the estimated subjects coefficients as a 1D arrays through the parameter vectorize.

connectivity_biomarkers = {}
kinds = ['correlation', 'partial correlation', 'tangent']
for kind in kinds:
    conn_measure = ConnectivityMeasure(kind=kind, vectorize=True)
    connectivity_biomarkers[kind] = conn_measure.fit_transform(pooled_subjects)

# For each kind, all individual coefficients are stacked in a unique 2D matrix.
print('{0} correlation biomarkers for each subject.'.format(


780 correlation biomarkers for each subject.

Note that we use the pooled groups. This is crucial for tangent kind, to get the displacements from a unique mean_ of all subjects.

We stratify the dataset into homogeneous classes according to phenotypic and scan site. We then split the subjects into 3 folds with the same proportion of each class as in the whole cohort

from sklearn.cross_validation import StratifiedKFold

classes = ['{0}{1}'.format(site_name, adhd_label)
           for site_name, adhd_label in zip(site_names, adhd_labels)]
cv = StratifiedKFold(classes, n_folds=3)

and use the connectivity coefficients to classify ADHD patients vs controls.

from sklearn.svm import LinearSVC
from sklearn.cross_validation import cross_val_score

mean_scores = []
for kind in kinds:
    svc = LinearSVC(random_state=0)
    cv_scores = cross_val_score(svc, connectivity_biomarkers[kind],
                                y=adhd_labels, cv=cv, scoring='accuracy')

Finally, we can display the classification scores.

plt.figure(figsize=(6, 4))
positions = np.arange(len(kinds)) * .1 + .1
plt.barh(positions, mean_scores, align='center', height=.05)
yticks = [kind.replace(' ', '\n') for kind in kinds]
plt.yticks(positions, yticks)
plt.xlabel('Classification accuracy')


Total running time of the script: ( 0 minutes 40.177 seconds)

Generated by Sphinx-Gallery