.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/07_advanced/plot_age_group_prediction_cross_val.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code or to run
        this example in your browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_07_advanced_plot_age_group_prediction_cross_val.py:


Functional connectivity predicts age group
==========================================

This example compares different kinds of functional connectivity between
regions of interest: correlation, partial correlation, and tangent space
embedding.

The resulting connectivity coefficients can be used to discriminate children
from adults. In general, the tangent space embedding **outperforms** the
standard correlations: see `Dadi et al 2019 `_ for a careful study.

.. include:: ../../../examples/masker_note.rst

.. GENERATED FROM PYTHON SOURCE LINES 20-23

Load brain development fMRI dataset and MSDL atlas
---------------------------------------------------
We study only 60 subjects from the dataset, to save computation time.

.. GENERATED FROM PYTHON SOURCE LINES 23-27

.. code-block:: default

    from nilearn import datasets

    development_dataset = datasets.fetch_development_fmri(n_subjects=60)

.. GENERATED FROM PYTHON SOURCE LINES 28-29

We use probabilistic regions of interest (ROIs) from the MSDL atlas.

.. GENERATED FROM PYTHON SOURCE LINES 29-41

.. code-block:: default

    from nilearn.maskers import NiftiMapsMasker

    msdl_data = datasets.fetch_atlas_msdl()
    msdl_coords = msdl_data.region_coords

    masker = NiftiMapsMasker(
        msdl_data.maps, resampling_target="data", t_r=2, detrend=True,
        low_pass=.1, high_pass=.01, memory='nilearn_cache', memory_level=1).fit()
    masked_data = [masker.transform(func, confounds) for
                   (func, confounds) in zip(
                       development_dataset.func, development_dataset.confounds)]

.. GENERATED FROM PYTHON SOURCE LINES 42-47

What kind of connectivity is most powerful for classification?
----------------------------------------------------------------
We will use connectivity matrices as features to distinguish children from
adults. We use cross-validation and measure classification accuracy to
compare the different kinds of connectivity matrices.

.. GENERATED FROM PYTHON SOURCE LINES 47-66

.. code-block:: default

    # prepare the classification pipeline
    from sklearn.pipeline import Pipeline
    from nilearn.connectome import ConnectivityMeasure
    from sklearn.svm import LinearSVC
    from sklearn.dummy import DummyClassifier
    from sklearn.model_selection import GridSearchCV

    kinds = ['correlation', 'partial correlation', 'tangent']

    pipe = Pipeline(
        [('connectivity', ConnectivityMeasure(vectorize=True)),
         ('classifier', GridSearchCV(LinearSVC(), {'C': [.1, 1., 10.]}, cv=5))])

    param_grid = [
        {'classifier': [DummyClassifier(strategy='most_frequent')]},
        {'connectivity__kind': kinds}
    ]

.. GENERATED FROM PYTHON SOURCE LINES 67-70

We use random splits of the subjects into training/testing sets.
StratifiedShuffleSplit allows preserving the proportion of children in the
test set.
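
As a quick aside (not part of the generated example code), the sketch below
uses made-up labels to illustrate why a stratified splitter is used here:
each random test fold keeps a class balance close to that of the full sample.
The names ``toy_classes`` and ``toy_cv`` and the 33/27 counts are hypothetical
and do not describe the development dataset.

.. code-block:: default

    # Illustrative sketch with hypothetical labels, not the dataset's real
    # child/adult counts: check that each test fold preserves class balance.
    import numpy as np
    from sklearn.model_selection import StratifiedShuffleSplit

    toy_classes = np.array([0] * 33 + [1] * 27)  # hypothetical label vector
    toy_cv = StratifiedShuffleSplit(n_splits=3, test_size=10, random_state=0)
    for _, test_index in toy_cv.split(np.zeros_like(toy_classes), toy_classes):
        # each 10-sample test fold stays close to the 33/27 class ratio
        print(np.bincount(toy_classes[test_index]))

A plain ShuffleSplit could, by chance, draw a test fold dominated by one
class; the stratified variant avoids that, which matters with only 60
subjects.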
.. GENERATED FROM PYTHON SOURCE LINES 70-83

.. code-block:: default

    from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
    from sklearn.preprocessing import LabelEncoder

    groups = [pheno['Child_Adult'] for pheno in development_dataset.phenotypic]
    classes = LabelEncoder().fit_transform(groups)
    cv = StratifiedShuffleSplit(n_splits=30, random_state=0, test_size=10)
    gs = GridSearchCV(pipe, param_grid, scoring='accuracy', cv=cv, verbose=1,
                      refit=False, n_jobs=8)
    gs.fit(masked_data, classes)
    mean_scores = gs.cv_results_['mean_test_score']
    scores_std = gs.cv_results_['std_test_score']

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Fitting 30 folds for each of 4 candidates, totalling 120 fits

.. GENERATED FROM PYTHON SOURCE LINES 84-85

Display the results.

.. GENERATED FROM PYTHON SOURCE LINES 85-99

.. code-block:: default

    from matplotlib import pyplot as plt

    plt.figure(figsize=(6, 4))
    positions = [.1, .2, .3, .4]
    plt.barh(positions, mean_scores, align='center', height=.05, xerr=scores_std)
    yticks = ['dummy'] + list(gs.cv_results_['param_connectivity__kind'].data[1:])
    yticks = [t.replace(' ', '\n') for t in yticks]
    plt.yticks(positions, yticks)
    plt.xlabel('Classification accuracy')
    plt.gca().grid(True)
    plt.gca().set_axisbelow(True)
    plt.tight_layout()

.. image-sg:: /auto_examples/07_advanced/images/sphx_glr_plot_age_group_prediction_cross_val_001.png
   :alt: plot age group prediction cross val
   :srcset: /auto_examples/07_advanced/images/sphx_glr_plot_age_group_prediction_cross_val_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 100-107

This is a small example to showcase nilearn features. In practice such
comparisons need to be performed on much larger cohorts and several
datasets.
`Dadi et al 2019 `_ showed that across many cohorts and clinical questions,
the tangent kind should be preferred.

.. GENERATED FROM PYTHON SOURCE LINES 107-109

.. code-block:: default

    plt.show()

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 1 minutes 30.134 seconds)

**Estimated memory usage:** 954 MB


.. _sphx_glr_download_auto_examples_07_advanced_plot_age_group_prediction_cross_val.py:

.. only:: html

  .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/nilearn/nilearn.github.io/main?filepath=examples/auto_examples/07_advanced/plot_age_group_prediction_cross_val.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_age_group_prediction_cross_val.py `

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_age_group_prediction_cross_val.ipynb `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_