Note
Go to the end to download the full example code. or to run this example in your browser via Binder
Voxel-Based Morphometry on Oasis dataset¶
This example uses Voxel-Based Morphometry (VBM) to study the relationship between aging and gray matter density.
The data come from the OASIS project. If you use it, you need to agree with the data usage agreement available on the website.
It has been run through a standard VBM pipeline (using SPM8 and NewSegment) to create VBM maps, which we study here.
Predictive modeling analysis: VBM bio-markers of aging?¶
We run a standard SVM-ANOVA nilearn pipeline to predict age from the VBM data. We use only 100 subjects from the OASIS dataset to limit the memory usage.
Note that for an actual predictive modeling study of aging, the study should be ran on the full set of subjects. Also, all parameters should be set by cross-validation. This includes the smoothing applied to the data and the number of features selected by the ANOVA step. Indeed, even these data-preparation parameter impact significantly the prediction score.
Also, parameters such as the smoothing should be applied to the data and the number of features selected by the ANOVA step should be set by nested cross-validation, as they impact significantly the prediction score.
Brain mapping with mass univariate¶
SVM weights are very noisy, partly because heavy smoothing is detrimental for the prediction here. A standard analysis using mass-univariate GLM (here permuted to have exact correction for multiple comparisons) gives a much clearer view of the important regions.
Note
If you are using Nilearn with a version older than 0.9.0
,
then you should either upgrade your version or import maskers
from the input_data
module instead of the maskers
module.
That is, you should manually replace in the following example all occurrences of:
from nilearn.maskers import NiftiMasker
with:
from nilearn.input_data import NiftiMasker
import numpy as np
from nilearn import datasets
from nilearn.image import get_data
from nilearn.maskers import NiftiMasker
n_subjects = 100 # more subjects requires more memory
Load Oasis dataset¶
oasis_dataset = datasets.fetch_oasis_vbm(
n_subjects=n_subjects, legacy_format=False
)
gray_matter_map_filenames = oasis_dataset.gray_matter_maps
age = oasis_dataset.ext_vars["age"].to_numpy()
# Split data into training set and test set
from sklearn.model_selection import train_test_split
gm_imgs_train, gm_imgs_test, age_train, age_test = train_test_split(
gray_matter_map_filenames, age, train_size=0.6, random_state=0
)
# print basic information on the dataset
print(
"First gray-matter anatomy image (3D) is located at: "
f"{oasis_dataset.gray_matter_maps[0]}"
)
print(
"First white-matter anatomy image (3D) is located at: "
f"{oasis_dataset.white_matter_maps[0]}"
)
[get_dataset_dir] Dataset found in /home/runner/nilearn_data/oasis1
First gray-matter anatomy image (3D) is located at: /home/runner/nilearn_data/oasis1/OAS1_0001_MR1/mwrc1OAS1_0001_MR1_mpr_anon_fslswapdim_bet.nii.gz
First white-matter anatomy image (3D) is located at: /home/runner/nilearn_data/oasis1/OAS1_0001_MR1/mwrc2OAS1_0001_MR1_mpr_anon_fslswapdim_bet.nii.gz
Preprocess data¶
nifti_masker = NiftiMasker(
standardize=False, smoothing_fwhm=2, memory="nilearn_cache"
) # cache options
gm_maps_masked = nifti_masker.fit_transform(gm_imgs_train)
# The features with too low between-subject variance are removed using
# :class:`sklearn.feature_selection.VarianceThreshold`.
from sklearn.feature_selection import VarianceThreshold
variance_threshold = VarianceThreshold(threshold=0.01)
variance_threshold.fit_transform(gm_maps_masked)
# Then we convert the data back to the mask image in order to use it for
# decoding process
mask = nifti_masker.inverse_transform(variance_threshold.get_support())
Prediction pipeline with ANOVA and SVR using
nilearn.decoding.DecoderRegressor
Object
In nilearn we can benefit from the built-in DecoderRegressor object to
do ANOVA with SVR instead of manually defining the whole pipeline.
This estimator also uses Cross Validation to select best models and ensemble
them. Furthermore, you can pass n_jobs=<some_high_value>
to the
DecoderRegressor class to take advantage of a multi-core system.
To save time (because these are anat images with many voxels), we include
only the 1-percent voxels most correlated with the age variable to fit. We
also want to set mask hyperparameter to be the mask we just obtained above.
from nilearn.decoding import DecoderRegressor
decoder = DecoderRegressor(
estimator="svr",
mask=mask,
scoring="neg_mean_absolute_error",
screening_percentile=1,
n_jobs=2,
standardize="zscore_sample",
)
# Fit and predict with the decoder
decoder.fit(gm_imgs_train, age_train)
# Sort test data for better visualization (trend, etc.)
perm = np.argsort(age_test)[::-1]
age_test = age_test[perm]
gm_imgs_test = np.array(gm_imgs_test)[perm]
age_pred = decoder.predict(gm_imgs_test)
prediction_score = -np.mean(decoder.cv_scores_["beta"])
print("=== DECODER ===")
print(f"explained variance for the cross-validation: {prediction_score:f}")
print()
=== DECODER ===
explained variance for the cross-validation: 10.670599
Visualization¶
weight_img = decoder.coef_img_["beta"]
# Create the figure
from nilearn.plotting import plot_stat_map, show
bg_filename = gray_matter_map_filenames[0]
z_slice = 0
display = plot_stat_map(
weight_img, bg_img=bg_filename, display_mode="z", cut_coords=[z_slice]
)
display.title("SVM weights")
show()
/home/runner/work/nilearn/nilearn/.tox/doc/lib/python3.9/site-packages/nilearn/plotting/displays/_slicers.py:1510: MatplotlibDeprecationWarning:
Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
Visualize the quality of predictions¶
import matplotlib.pyplot as plt
plt.figure(figsize=(6, 4.5))
plt.suptitle(f"Decoder: Mean Absolute Error {prediction_score:.2f} years")
linewidth = 3
plt.plot(age_test, label="True age", linewidth=linewidth)
plt.plot(age_pred, "--", c="g", label="Predicted age", linewidth=linewidth)
plt.ylabel("age")
plt.xlabel("subject")
plt.legend(loc="best")
plt.figure(figsize=(6, 4.5))
plt.plot(
age_test - age_pred, label="True age - predicted age", linewidth=linewidth
)
plt.xlabel("subject")
plt.legend(loc="best")
<matplotlib.legend.Legend object at 0x7ff0b5a7fbb0>
Inference with massively univariate model¶
print("Massively univariate model")
gm_maps_masked = NiftiMasker().fit_transform(gray_matter_map_filenames)
data = variance_threshold.fit_transform(gm_maps_masked)
# Statistical inference
from nilearn.mass_univariate import permuted_ols
# This can be changed to use more CPUs.
neg_log_pvals, t_scores_original_data, _ = permuted_ols(
age,
data, # + intercept as a covariate by default
n_perm=2000, # 1,000 in the interest of time; 10000 would be better
verbose=1, # display progress bar
n_jobs=2,
)
signed_neg_log_pvals = neg_log_pvals * np.sign(t_scores_original_data)
signed_neg_log_pvals_unmasked = nifti_masker.inverse_transform(
variance_threshold.inverse_transform(signed_neg_log_pvals)
)
# Show results
threshold = -np.log10(0.1) # 10% corrected
fig = plt.figure(figsize=(5.5, 7.5), facecolor="k")
display = plot_stat_map(
signed_neg_log_pvals_unmasked,
bg_img=bg_filename,
threshold=threshold,
cmap=plt.cm.RdBu_r,
display_mode="z",
cut_coords=[z_slice],
figure=fig,
)
title = (
"Negative $\\log_{10}$ p-values\n(Non-parametric + max-type correction)"
)
display.title(title)
n_detections = (get_data(signed_neg_log_pvals_unmasked) > threshold).sum()
print(f"\n{int(n_detections)} detections")
show()
Massively univariate model
[Parallel(n_jobs=2)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=2)]: Done 2 out of 2 | elapsed: 16.6s remaining: 0.0s
[Parallel(n_jobs=2)]: Done 2 out of 2 | elapsed: 16.6s finished
/home/runner/work/nilearn/nilearn/.tox/doc/lib/python3.9/site-packages/nilearn/plotting/displays/_slicers.py:1510: MatplotlibDeprecationWarning:
Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
1978 detections
Total running time of the script: (0 minutes 51.857 seconds)
Estimated memory usage: 2231 MB