.. only:: html .. note:: :class: sphx-glr-download-link-note Click :ref:`here ` to download the full example code or to run this example in your browser via Binder .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_07_advanced_plot_age_group_prediction_cross_val.py: Functional connectivity predicts age group ========================================== This example compares different kinds of functional connectivity between regions of interest : correlation, partial correlation, and tangent space embedding. The resulting connectivity coefficients can be used to discriminate children from adults. In general, the tangent space embedding **outperforms** the standard correlations: see `Dadi et al 2019 `_ for a careful study. Load brain development fMRI dataset and MSDL atlas ------------------------------------------------------------------- We study only 60 subjects from the dataset, to save computation time. .. code-block:: default from nilearn import datasets development_dataset = datasets.fetch_development_fmri(n_subjects=60) We use probabilistic regions of interest (ROIs) from the MSDL atlas. .. code-block:: default from nilearn.input_data import NiftiMapsMasker msdl_data = datasets.fetch_atlas_msdl() msdl_coords = msdl_data.region_coords masker = NiftiMapsMasker( msdl_data.maps, resampling_target="data", t_r=2, detrend=True, low_pass=.1, high_pass=.01, memory='nilearn_cache', memory_level=1).fit() masked_data = [masker.transform(func, confounds) for (func, confounds) in zip( development_dataset.func, development_dataset.confounds)] .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /home/nicolas/anaconda3/envs/nilearn/lib/python3.8/site-packages/numpy/lib/npyio.py:2405: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default. output = genfromtxt(fname, **kwargs) /home/nicolas/GitRepos/nilearn-fork/nilearn/image/image.py:1106: FutureWarning: The parameter "sessions" will be removed in 0.9.0 release of Nilearn. Please use the parameter "runs" instead. data = signal.clean( What kind of connectivity is most powerful for classification? -------------------------------------------------------------- we will use connectivity matrices as features to distinguish children from adults. We use cross-validation and measure classification accuracy to compare the different kinds of connectivity matrices. .. code-block:: default # prepare the classification pipeline from sklearn.pipeline import Pipeline from nilearn.connectome import ConnectivityMeasure from sklearn.svm import LinearSVC from sklearn.dummy import DummyClassifier from sklearn.model_selection import GridSearchCV kinds = ['correlation', 'partial correlation', 'tangent'] pipe = Pipeline( [('connectivity', ConnectivityMeasure(vectorize=True)), ('classifier', GridSearchCV(LinearSVC(), {'C': [.1, 1., 10.]}, cv=5))]) param_grid = [ {'classifier': [DummyClassifier('most_frequent')]}, {'connectivity__kind': kinds} ] .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /home/nicolas/anaconda3/envs/nilearn/lib/python3.8/site-packages/sklearn/utils/validation.py:70: FutureWarning: Pass strategy=most_frequent as keyword args. From version 1.0 (renaming of 0.25) passing these as positional arguments will result in an error warnings.warn(f"Pass {args_msg} as keyword args. From version " We use random splits of the subjects into training/testing sets. StratifiedShuffleSplit allows preserving the proportion of children in the test set. .. code-block:: default from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit from sklearn.preprocessing import LabelEncoder groups = [pheno['Child_Adult'] for pheno in development_dataset.phenotypic] classes = LabelEncoder().fit_transform(groups) cv = StratifiedShuffleSplit(n_splits=30, random_state=0, test_size=10) gs = GridSearchCV(pipe, param_grid, scoring='accuracy', cv=cv, verbose=1, refit=False, n_jobs=8) gs.fit(masked_data, classes) mean_scores = gs.cv_results_['mean_test_score'] scores_std = gs.cv_results_['std_test_score'] .. rst-class:: sphx-glr-script-out Out: .. code-block:: none Fitting 30 folds for each of 4 candidates, totalling 120 fits display the results .. code-block:: default from matplotlib import pyplot as plt plt.figure(figsize=(6, 4)) positions = [.1, .2, .3, .4] plt.barh(positions, mean_scores, align='center', height=.05, xerr=scores_std) yticks = ['dummy'] + list(gs.cv_results_['param_connectivity__kind'].data[1:]) yticks = [t.replace(' ', '\n') for t in yticks] plt.yticks(positions, yticks) plt.xlabel('Classification accuracy') plt.gca().grid(True) plt.gca().set_axisbelow(True) plt.tight_layout() .. image:: /auto_examples/07_advanced/images/sphx_glr_plot_age_group_prediction_cross_val_001.png :alt: plot age group prediction cross val :class: sphx-glr-single-img This is a small example to showcase nilearn features. In practice such comparisons need to be performed on much larger cohorts and several datasets. `Dadi et al 2019 `_ Showed that across many cohorts and clinical questions, the tangent kind should be preferred. .. code-block:: default plt.show() .. rst-class:: sphx-glr-timing **Total running time of the script:** ( 1 minutes 26.634 seconds) .. _sphx_glr_download_auto_examples_07_advanced_plot_age_group_prediction_cross_val.py: .. only :: html .. container:: sphx-glr-footer :class: sphx-glr-footer-example .. container:: binder-badge .. image:: images/binder_badge_logo.svg :target: https://mybinder.org/v2/gh/nilearn/nilearn.github.io/main?filepath=examples/auto_examples/07_advanced/plot_age_group_prediction_cross_val.ipynb :alt: Launch binder :width: 150 px .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_age_group_prediction_cross_val.py ` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_age_group_prediction_cross_val.ipynb ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_