.. only:: html
.. note::
:class: sphx-glr-download-link-note
Click :ref:`here ` to download the full example code or to run this example in your browser via Binder
.. rst-class:: sphx-glr-example-title
.. _sphx_glr_auto_examples_07_advanced_plot_age_group_prediction_cross_val.py:
Functional connectivity predicts age group
==========================================
This example compares different kinds of functional connectivity between
regions of interest: correlation, partial correlation, and tangent space
embedding.
The resulting connectivity coefficients can be used to
discriminate children from adults. In general, the tangent space embedding
**outperforms** the standard correlations: see `Dadi et al 2019
`_
for a careful study.
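
As a rough illustration of how two of these measures differ, the sketch below
computes a correlation matrix and a partial correlation matrix with NumPy on
synthetic signals. The three signals, their chain structure, and all variable
names are assumptions made for this illustration and are not part of the
example. The estimators behind nilearn's ``ConnectivityMeasure`` may differ in
detail, and tangent space embedding is not covered here.

.. code-block:: python

    import numpy as np

    # Synthetic chain x -> y -> z (hypothetical data, for illustration only)
    rng = np.random.default_rng(0)
    n_samples = 5000
    x = rng.standard_normal(n_samples)
    y = x + rng.standard_normal(n_samples)
    z = y + rng.standard_normal(n_samples)
    signals = np.column_stack([x, y, z])  # shape (n_samples, n_regions)

    # Correlation: marginal linear dependence between each pair of signals
    correlation = np.corrcoef(signals, rowvar=False)

    # Partial correlation: dependence between two signals once the others
    # are accounted for, obtained from the inverse covariance (precision)
    precision = np.linalg.inv(np.cov(signals, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    partial_correlation = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial_correlation, 1.0)

    print(np.round(correlation, 2))
    print(np.round(partial_correlation, 2))

In this toy setting ``x`` and ``z`` are clearly correlated, but their partial
correlation is close to zero because ``y`` mediates all of their shared
variance.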
Load brain development fMRI dataset and MSDL atlas
-------------------------------------------------------------------
We study only 60 subjects from the dataset, to save computation time.
.. code-block:: default

    from nilearn import datasets
    development_dataset = datasets.fetch_development_fmri(n_subjects=60)
We use probabilistic regions of interest (ROIs) from the MSDL atlas.
.. code-block:: default

    from nilearn.input_data import NiftiMapsMasker

    msdl_data = datasets.fetch_atlas_msdl()
    msdl_coords = msdl_data.region_coords

    masker = NiftiMapsMasker(
        msdl_data.maps, resampling_target="data", t_r=2, detrend=True,
        low_pass=.1, high_pass=.01, memory='nilearn_cache', memory_level=1).fit()
    masked_data = [masker.transform(func, confounds) for
                   (func, confounds) in zip(
                       development_dataset.func, development_dataset.confounds)]
What kind of connectivity is most powerful for classification?
--------------------------------------------------------------
We will use connectivity matrices as features to distinguish children from
adults. We use cross-validation and measure classification accuracy to
compare the different kinds of connectivity matrices.
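
Using a connectivity matrix as a feature vector means flattening it. Because
the matrix is symmetric, keeping one triangle is enough. The sketch below
shows the idea with NumPy on a made-up 4-region matrix. It is a simplified
illustration under that assumption: the exact ordering and scaling applied by
``ConnectivityMeasure(vectorize=True)`` may differ.

.. code-block:: python

    import numpy as np

    # Hypothetical signals for 4 regions (illustration only)
    n_regions = 4
    rng = np.random.default_rng(0)
    signals = rng.standard_normal((100, n_regions))
    matrix = np.corrcoef(signals, rowvar=False)  # symmetric, (4, 4)

    # Keep the lower triangle, diagonal included, as one flat feature vector
    rows, cols = np.tril_indices(n_regions)
    features = matrix[rows, cols]

    # A symmetric n x n matrix yields n * (n + 1) / 2 distinct values
    print(features.shape)
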
.. code-block:: default

    # prepare the classification pipeline
    from sklearn.pipeline import Pipeline
    from nilearn.connectome import ConnectivityMeasure
    from sklearn.svm import LinearSVC
    from sklearn.dummy import DummyClassifier
    from sklearn.model_selection import GridSearchCV

    kinds = ['correlation', 'partial correlation', 'tangent']

    pipe = Pipeline(
        [('connectivity', ConnectivityMeasure(vectorize=True)),
         ('classifier', GridSearchCV(LinearSVC(), {'C': [.1, 1., 10.]}, cv=5))])

    param_grid = [
        # baseline: always predict the most frequent class
        {'classifier': [DummyClassifier(strategy='most_frequent')]},
        {'connectivity__kind': kinds}
    ]
We use random splits of the subjects into training/testing sets.
``StratifiedShuffleSplit`` preserves the proportion of children in the
test set.
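
To make the effect of stratification concrete, the sketch below runs
``StratifiedShuffleSplit`` on made-up labels and counts the classes in each
test set. The 48/12 split between children and adults is a hypothetical
choice for this illustration, not the composition of the dataset, and the
feature array is a placeholder because only the labels drive the split.

.. code-block:: python

    import numpy as np
    from sklearn.model_selection import StratifiedShuffleSplit

    # Hypothetical labels: 48 children (0) and 12 adults (1)
    labels = np.array([0] * 48 + [1] * 12)
    placeholder_features = np.zeros((len(labels), 1))

    cv = StratifiedShuffleSplit(n_splits=5, random_state=0, test_size=10)

    # Count children and adults in every test set
    test_counts = [np.bincount(labels[test], minlength=2)
                   for _, test in cv.split(placeholder_features, labels)]

    for counts in test_counts:
        print(counts)

With these labels every test set of 10 subjects holds 8 children and 2
adults, which is the same 4:1 ratio as in the full sample.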
.. code-block:: default

    from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
    from sklearn.preprocessing import LabelEncoder

    groups = [pheno['Child_Adult'] for pheno in development_dataset.phenotypic]
    classes = LabelEncoder().fit_transform(groups)

    cv = StratifiedShuffleSplit(n_splits=30, random_state=0, test_size=10)
    gs = GridSearchCV(pipe, param_grid, scoring='accuracy', cv=cv, verbose=1,
                      refit=False, n_jobs=8)
    gs.fit(masked_data, classes)
    mean_scores = gs.cv_results_['mean_test_score']
    scores_std = gs.cv_results_['std_test_score']
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
Fitting 30 folds for each of 4 candidates, totalling 120 fits
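
The counts in this log follow from the parameter grid. A list of dictionaries
is searched as a union of grids, so the dummy baseline contributes one
candidate and the three connectivity kinds contribute three. The sketch below
enumerates them with scikit-learn's ``ParameterGrid``; it rebuilds the grid
from the example so that it runs on its own, and the variable names
``candidates``, ``n_candidates`` and ``n_fits`` are introduced here for
illustration.

.. code-block:: python

    from sklearn.dummy import DummyClassifier
    from sklearn.model_selection import ParameterGrid

    kinds = ['correlation', 'partial correlation', 'tangent']
    param_grid = [
        {'classifier': [DummyClassifier(strategy='most_frequent')]},
        {'connectivity__kind': kinds},
    ]

    # Each dictionary is expanded separately, then the results are chained
    candidates = list(ParameterGrid(param_grid))
    n_candidates = len(candidates)

    # One fit per candidate and per cross-validation split
    n_splits = 30
    n_fits = n_candidates * n_splits

    print(n_candidates, n_fits)

The inner grid search over ``C`` runs inside each of these fits and is not
counted in the log.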
Display the results.
.. code-block:: default

    from matplotlib import pyplot as plt

    plt.figure(figsize=(6, 4))
    positions = [.1, .2, .3, .4]
    plt.barh(positions, mean_scores, align='center', height=.05, xerr=scores_std)
    yticks = ['dummy'] + list(gs.cv_results_['param_connectivity__kind'].data[1:])
    yticks = [t.replace(' ', '\n') for t in yticks]
    plt.yticks(positions, yticks)
    plt.xlabel('Classification accuracy')
    plt.gca().grid(True)
    plt.gca().set_axisbelow(True)
    plt.tight_layout()
.. image:: /auto_examples/07_advanced/images/sphx_glr_plot_age_group_prediction_cross_val_001.png
    :alt: plot age group prediction cross val
    :class: sphx-glr-single-img
This is a small example to showcase nilearn features. In practice such
comparisons need to be performed on much larger cohorts and several
datasets.
`Dadi et al 2019
`_
showed that across many cohorts and clinical questions, the tangent
kind should be preferred.
.. code-block:: default

    plt.show()
.. rst-class:: sphx-glr-timing
**Total running time of the script:** (1 minute 26.634 seconds)
.. _sphx_glr_download_auto_examples_07_advanced_plot_age_group_prediction_cross_val.py:
.. only :: html
.. container:: sphx-glr-footer
:class: sphx-glr-footer-example
.. container:: binder-badge
.. image:: images/binder_badge_logo.svg
:target: https://mybinder.org/v2/gh/nilearn/nilearn.github.io/main?filepath=examples/auto_examples/07_advanced/plot_age_group_prediction_cross_val.ipynb
:alt: Launch binder
:width: 150 px
.. container:: sphx-glr-download sphx-glr-download-python
:download:`Download Python source code: plot_age_group_prediction_cross_val.py `
.. container:: sphx-glr-download sphx-glr-download-jupyter
:download:`Download Jupyter notebook: plot_age_group_prediction_cross_val.ipynb `
.. only:: html
.. rst-class:: sphx-glr-signature
`Gallery generated by Sphinx-Gallery `_