An introduction tutorial to fMRI decoding
Here is a simple tutorial on decoding with nilearn. It reproduces the Haxby et al. [1] study on a face vs cat discrimination task in a mask of the ventral stream.
This tutorial is meant as an introduction to the various steps of a decoding analysis using the Nilearn meta-estimator Decoder.
It is not a minimalistic example, as it strives to be didactic. It is not meant to be copied to analyze new data: many of the steps are unnecessary.
import warnings
warnings.filterwarnings(
    "ignore", message="The provided image has no sform in its header."
)
Retrieve and load the fMRI data from the Haxby study
First download the data
The fetch_haxby function will download the Haxby dataset, if it is not already present on disk, into the nilearn data directory. Downloading about 310 MB of data from the Internet can take a while.
from nilearn import datasets
# By default 2nd subject will be fetched
haxby_dataset = datasets.fetch_haxby()
# 'func' is a list of filenames: one for each subject
fmri_filename = haxby_dataset.func[0]
# print basic information on the dataset
print(f"First subject functional nifti images (4D) are at: {fmri_filename}")
[get_dataset_dir] Dataset found in /home/runner/nilearn_data/haxby2001
First subject functional nifti images (4D) are at: /home/runner/nilearn_data/haxby2001/subj2/bold.nii.gz
Visualizing the fMRI volume
One way to visualize an fMRI volume is to use plot_epi. We will visualize the previously fetched fMRI data from the Haxby dataset.
Because fMRI data are 4D (they consist of many 3D EPI images), we cannot plot them directly using plot_epi, which accepts only 3D input. Here we use mean_img to extract a single 3D EPI image from the fMRI data.
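To make the 4D-to-3D reduction concrete, here is a minimal NumPy sketch. It uses a small synthetic array in place of the real fMRI data (the shape and values are made up for illustration) and averages over the last axis, time, which is conceptually what mean_img does.

```python
import numpy as np

# Synthetic stand-in for a 4D fMRI series: (x, y, z, time).
# The shape is arbitrary and chosen only for illustration.
rng = np.random.default_rng(0)
fmri_4d = rng.normal(size=(4, 5, 6, 10))

# Averaging over the last (time) axis leaves a single 3D volume.
mean_3d = fmri_4d.mean(axis=-1)

print(fmri_4d.shape)  # (4, 5, 6, 10)
print(mean_3d.shape)  # (4, 5, 6)
```

With the real data, the corresponding nilearn call would be along the lines of plot_epi(mean_img(fmri_filename)), with mean_img imported from nilearn.image and plot_epi from nilearn.plotting.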
Feature extraction: from fMRI volumes to a data matrix
These are some really lovely images, but for machine learning we need matrices to work with the actual data. Fortunately, the Decoder object we will use later on can automatically transform Nifti images into matrices. All we have to do for now is define a mask filename.
A mask of the Ventral Temporal (VT) cortex coming from the Haxby study is available:
from nilearn.plotting import plot_roi, show

mask_filename = haxby_dataset.mask_vt[0]
# Let's visualize it, using the subject's anatomical image as a
# background
plot_roi(mask_filename, bg_img=haxby_dataset.anat[0], cmap="Paired")
show()
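As a rough picture of what "transforming images into matrices" means, the sketch below applies a boolean 3D mask to a synthetic 4D array with plain NumPy. The array shapes and the masked region are hypothetical, and the Decoder's own masking involves more than this (for example standardization), so treat it as an illustration of the idea only.

```python
import numpy as np

# Synthetic 4D series (x, y, z, time) and a boolean 3D mask.
# Shapes and the masked region are hypothetical.
rng = np.random.default_rng(0)
fmri_4d = rng.normal(size=(4, 5, 6, 10))
mask_3d = np.zeros((4, 5, 6), dtype=bool)
mask_3d[1:3, 2:4, 0:2] = True  # 2 * 2 * 2 = 8 voxels inside the mask

# Boolean indexing keeps the masked voxels: shape (n_voxels, n_timepoints).
# Transposing gives the (n_samples, n_features) layout used in machine learning.
X = fmri_4d[mask_3d].T

print(X.shape)  # (10, 8)
```

Each row of X is one time point (one sample) and each column is one voxel inside the mask (one feature).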
Load the behavioral labels
Once the brain images are converted to a data matrix, we can apply machine learning to them, for instance to predict the task that the subject was doing. The behavioral labels are stored in a CSV file, with columns separated by spaces.
We use pandas to load them into a DataFrame.
import pandas as pd
# Load behavioral information
behavioral = pd.read_csv(haxby_dataset.session_target[0], delimiter=" ")
print(behavioral)
labels chunks
0 rest 0
1 rest 0
2 rest 0
3 rest 0
4 rest 0
... ... ...
1447 rest 11
1448 rest 11
1449 rest 11
1450 rest 11
1451 rest 11
[1452 rows x 2 columns]
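For readers without the dataset at hand, the same loading step can be tried on an in-memory file. The content below is made up; it only mimics the layout of a header row followed by one space-separated "labels chunks" pair per scan.

```python
import io

import pandas as pd

# A tiny made-up file in the same layout: a header row, then one
# space-separated "labels chunks" pair per scan.
text = "labels chunks\nrest 0\nface 0\ncat 0\nrest 1\nface 1\n"

behavioral_demo = pd.read_csv(io.StringIO(text), delimiter=" ")

print(behavioral_demo.shape)  # (5, 2)
print(behavioral_demo["labels"].tolist())
```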
The task was a visual-recognition task, and the labels denote the experimental condition: the type of object that was presented to the subject. This is what we are going to try to predict.
conditions = behavioral["labels"]
print(conditions)
0 rest
1 rest
2 rest
3 rest
4 rest
...
1447 rest
1448 rest
1449 rest
1450 rest
1451 rest
Name: labels, Length: 1452, dtype: object
Restrict the analysis to cats and faces
As we can see from the targets above, the experiment contains many conditions, so the data is quite large. Not all of it is of interest to us for decoding, so we will keep only the fMRI signals corresponding to faces or cats. We create a mask of the samples belonging to these conditions; this mask is then applied to the fMRI data to restrict the classification to the face vs cat discrimination.
The input data will become much smaller (i.e. fMRI signal is shorter):
condition_mask = conditions.isin(["face", "cat"])
Because the data is in one single large 4D image, we need to use index_img to do the split easily.
from nilearn.image import index_img
fmri_niimgs = index_img(fmri_filename, condition_mask)
We apply the same mask to the targets:
conditions = conditions[condition_mask]
conditions = conditions.to_numpy()
print(f"{conditions.shape=}")
conditions.shape=(216,)
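The selection logic can be sketched with NumPy alone: one boolean mask over scans is used to index both the time axis of the images and the label vector, which keeps the two aligned. The labels and array below are made up for illustration.

```python
import numpy as np

# Made-up labels for 8 scans and a matching synthetic 4D series.
labels = np.array(["rest", "face", "face", "rest", "cat", "cat", "house", "rest"])
fmri_4d = np.arange(2 * 2 * 2 * 8, dtype=float).reshape(2, 2, 2, 8)

# True where the scan belongs to one of the two conditions of interest.
keep = np.isin(labels, ["face", "cat"])

# Index the time axis of the images and the labels with the same mask,
# so that images and targets stay aligned.
fmri_kept = fmri_4d[..., keep]
labels_kept = labels[keep]

print(fmri_kept.shape)  # (2, 2, 2, 4)
print(labels_kept)      # ['face' 'face' 'cat' 'cat']
```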
Decoding with Support Vector Machine
As a decoder, we use a Support Vector Classifier with a linear kernel. We first create it using Decoder.
from nilearn.decoding import Decoder
decoder = Decoder(
    estimator="svc", mask=mask_filename, standardize="zscore_sample"
)
The decoder object is an object that can be fit (or trained) on data with labels, and then predict labels on data without.
We first fit it on the data:
decoder.fit(fmri_niimgs, conditions)
We can then predict the labels from the data:
prediction = decoder.predict(fmri_niimgs)
print(f"{prediction=}")
prediction=array(['face', 'face', 'face', 'face', 'face', 'face', 'face', 'face',
'face', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat',
'cat', 'face', 'face', 'face', 'face', 'face', 'face', 'face',
'face', 'face', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat',
'cat', 'cat', 'face', 'face', 'face', 'face', 'face', 'face',
'face', 'face', 'face', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat',
'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat',
'cat', 'cat', 'cat', 'face', 'face', 'face', 'face', 'face',
'face', 'face', 'face', 'face', 'face', 'face', 'face', 'face',
'face', 'face', 'face', 'face', 'face', 'cat', 'cat', 'cat', 'cat',
'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat',
'cat', 'cat', 'cat', 'cat', 'cat', 'face', 'face', 'face', 'face',
'face', 'face', 'face', 'face', 'face', 'face', 'face', 'face',
'face', 'face', 'face', 'face', 'face', 'face', 'cat', 'cat',
'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'face', 'face',
'face', 'face', 'face', 'face', 'face', 'face', 'face', 'cat',
'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'face',
'face', 'face', 'face', 'face', 'face', 'face', 'face', 'face',
'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat',
'face', 'face', 'face', 'face', 'face', 'face', 'face', 'face',
'face', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat',
'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat',
'cat', 'face', 'face', 'face', 'face', 'face', 'face', 'face',
'face', 'face', 'face', 'face', 'face', 'face', 'face', 'face',
'face', 'face', 'face', 'cat', 'cat', 'cat', 'cat', 'cat', 'cat',
'cat', 'cat', 'cat'], dtype='<U4')
Note that for this classification task both classes contain the same number of samples (the problem is balanced), so accuracy is a reasonable measure of the decoder’s performance. With the Decoder, this is done by setting accuracy as the scoring parameter; here we compute it by hand. Let’s measure the prediction accuracy:
print((prediction == conditions).sum() / float(len(conditions)))
1.0
This prediction accuracy score is meaningless. Why?
Measuring prediction scores using cross-validation
The proper way to measure error rates or prediction accuracy is via cross-validation: leaving out some data and testing on it.
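To see why accuracy measured on the training data says little, consider fitting a linear SVC to pure noise. The sketch below uses scikit-learn directly on random features with randomly assigned labels (the sizes are chosen only to resemble the problem here), so there is nothing to learn by construction. With more features than samples, the classifier can still separate the training points, while its score on held-out points is expected to hover around chance.

```python
import numpy as np
from sklearn.svm import SVC

# Pure noise: 216 samples, 464 features, labels assigned at random.
# There is nothing to learn here by construction.
rng = np.random.default_rng(0)
X = rng.normal(size=(216, 464))
y = rng.choice(["face", "cat"], size=216)

X_train, y_train = X[:-30], y[:-30]
X_test, y_test = X[-30:], y[-30:]

clf = SVC(kernel="linear").fit(X_train, y_train)

# Accuracy on the data used for fitting versus held-out data.
train_accuracy = np.mean(clf.predict(X_train) == y_train)
test_accuracy = np.mean(clf.predict(X_test) == y_test)

print(f"train: {train_accuracy:.3f}, held-out: {test_accuracy:.3f}")
```

Only the held-out number reflects how the model behaves on data it has not seen.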
Manually leaving out data
Let’s leave out the last 30 data points during training, and test the prediction on these 30 points:
fmri_niimgs_train = index_img(fmri_niimgs, slice(0, -30))
fmri_niimgs_test = index_img(fmri_niimgs, slice(-30, None))
conditions_train = conditions[:-30]
conditions_test = conditions[-30:]
decoder = Decoder(
    estimator="svc", mask=mask_filename, standardize="zscore_sample"
)
decoder.fit(fmri_niimgs_train, conditions_train)
prediction = decoder.predict(fmri_niimgs_test)
# The prediction accuracy is calculated on the test data: this is the accuracy
# of our model on examples it hasn't seen, which tells us how well the model
# performs in general.
prediction_accuracy = (prediction == conditions_test).sum() / float(
    len(conditions_test)
)
print(f"Prediction Accuracy: {prediction_accuracy:.3f}")
Prediction Accuracy: 0.767
Implementing a KFold loop
We can repeatedly split the data into train and test sets with a KFold strategy, using scikit-learn’s KFold object:
from sklearn.model_selection import KFold
cv = KFold(n_splits=5)
for fold, (train, test) in enumerate(cv.split(conditions), start=1):
    decoder = Decoder(
        estimator="svc", mask=mask_filename, standardize="zscore_sample"
    )
    decoder.fit(index_img(fmri_niimgs, train), conditions[train])
    prediction = decoder.predict(index_img(fmri_niimgs, test))
    prediction_accuracy = (prediction == conditions[test]).sum() / float(
        len(conditions[test])
    )
    print(
        f"CV Fold {fold:01d} | Prediction Accuracy: {prediction_accuracy:.3f}"
    )
CV Fold 1 | Prediction Accuracy: 0.886
CV Fold 2 | Prediction Accuracy: 0.767
CV Fold 3 | Prediction Accuracy: 0.767
CV Fold 4 | Prediction Accuracy: 0.698
CV Fold 5 | Prediction Accuracy: 0.744
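It can help to look at what KFold actually hands to the loop. The sketch below, which needs only scikit-learn and a dummy array of the same length as our 216 retained samples, prints the size and the first index of each test fold. Without shuffling, the folds are contiguous blocks of scans.

```python
import numpy as np
from sklearn.model_selection import KFold

n_samples = 216  # number of face and cat scans kept above

cv = KFold(n_splits=5)
# KFold only needs the number of samples, so a dummy array is enough.
fold_sizes = [len(test) for _, test in cv.split(np.zeros(n_samples))]
first_test_indices = [int(test[0]) for _, test in cv.split(np.zeros(n_samples))]

print(fold_sizes)  # [44, 43, 43, 43, 43]
print(first_test_indices)  # [0, 44, 87, 130, 173]
```

Because the blocks are contiguous in time, a test fold can start or end in the middle of an acquisition run.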
Cross-validation with the decoder
The decoder also implements a cross-validation loop by default and returns an array of shape (cross-validation parameters, n_folds). We can use accuracy score to measure its performance by defining accuracy as the scoring parameter.
n_folds = 5
decoder = Decoder(
    estimator="svc",
    mask=mask_filename,
    standardize="zscore_sample",
    cv=n_folds,
    scoring="accuracy",
)
decoder.fit(fmri_niimgs, conditions)
A cross-validation pipeline can also be implemented manually. More details can be found on the scikit-learn website.
We can then check the best-performing parameters per fold.
print(decoder.cv_params_["face"])
{'C': [100.0, 100.0, 100.0, 100.0, 100.0]}
Note
We can speed things up by using all the CPUs of our computer with the n_jobs parameter.
The best way to do cross-validation is to respect the structure of the experiment, for instance by leaving out full runs of acquisition.
The number of the run is stored in the CSV file giving the behavioral data. We have to apply our condition mask to it, to select only cats and faces.
run_label = behavioral["chunks"][condition_mask]
The fMRI data is acquired by runs, and the noise is autocorrelated within a given run. Hence, it is better to predict across runs when doing cross-validation. To leave a run out, pass the cross-validator object to the cv parameter of the decoder.
from sklearn.model_selection import LeaveOneGroupOut
cv = LeaveOneGroupOut()
decoder = Decoder(
    estimator="svc", mask=mask_filename, standardize="zscore_sample", cv=cv
)
decoder.fit(fmri_niimgs, conditions, groups=run_label)
print(f"{decoder.cv_scores_=}")
decoder.cv_scores_={'cat': [1.0, 1.0, 1.0, 1.0, 0.9629629629629629, 0.8518518518518519, 0.9753086419753086, 0.40740740740740744, 0.9876543209876543, 1.0, 0.9259259259259259, 0.8765432098765432], 'face': [1.0, 1.0, 1.0, 1.0, 0.9629629629629629, 0.8518518518518519, 0.9753086419753086, 0.40740740740740744, 0.9876543209876543, 1.0, 0.9259259259259259, 0.8765432098765432]}
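The splitting behaviour of LeaveOneGroupOut can be checked in isolation. The sketch below assumes a hypothetical layout of 12 runs with 18 retained scans each (216 in total) rather than reading the real run labels, and verifies that every test fold contains exactly one run that is absent from the corresponding training set.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Hypothetical layout: 12 runs with 18 retained scans each (216 in total).
groups = np.repeat(np.arange(12), 18)
X_dummy = np.zeros((len(groups), 1))

cv = LeaveOneGroupOut()
n_splits = cv.get_n_splits(groups=groups)

for train, test in cv.split(X_dummy, groups=groups):
    # The test set holds exactly one run, and that run never appears in training.
    assert len(np.unique(groups[test])) == 1
    assert not np.isin(groups[test], groups[train]).any()

print(n_splits)  # 12
```

One fold per run is consistent with the 12 scores per class listed in cv_scores_ above.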
Inspecting the model weights
Finally, it may be useful to inspect and display the model weights.
Turning the weights into a nifti image
We retrieve the SVC discriminating weights:
coef_ = decoder.coef_
print(f"{coef_=}")
coef_=array([[-3.89377744e-02, -1.87168683e-02, -3.23027933e-02,
-2.88747103e-02, 4.18696943e-02, 1.10743428e-02,
1.69998818e-02, -5.50956301e-02, -1.94203918e-02,
-3.51225221e-02, 1.08511952e-02, -1.28797852e-02,
-1.54677689e-02, -3.78907832e-02, -3.69171204e-02,
2.28087447e-02, 6.56429306e-03, -7.65738902e-03,
1.67105992e-02, -8.02135441e-03, 5.29514546e-02,
-8.17596782e-02, -6.36991546e-02, 2.41325849e-02,
4.59874059e-02, -2.22602778e-02, -1.77311560e-02,
2.22197324e-02, -9.53203957e-03, 5.76045888e-02,
2.14298686e-02, -9.14229255e-02, 4.03655244e-03,
-2.89275621e-02, -3.89029602e-02, -3.35113714e-02,
2.21395574e-03, 8.73156891e-03, -3.37416134e-02,
-2.41273261e-02, -6.81650750e-02, 1.65404475e-02,
2.70785332e-02, -6.56856235e-03, -1.21662164e-02,
5.47673775e-02, 8.13286163e-03, 3.60957102e-02,
-1.52763972e-02, 7.02912618e-02, 1.28111395e-03,
2.08011951e-02, -4.09969872e-03, 3.72430647e-02,
-3.77393747e-02, -1.03859686e-02, -2.38237699e-02,
-5.48880114e-02, 4.43029212e-02, -1.47419326e-01,
-2.34043811e-02, 1.87115769e-02, 6.65860064e-02,
-9.07603401e-02, -1.22034043e-02, -2.95625631e-03,
3.22091962e-02, -3.04053007e-02, 6.15346189e-02,
1.12249799e-02, 1.93778408e-02, -1.30545268e-02,
4.42975414e-02, -2.23066060e-02, 6.88148636e-02,
1.69391159e-02, 1.78949250e-02, 1.00277823e-02,
2.99186491e-02, -2.52167813e-02, 1.06156228e-02,
-6.31950849e-03, 2.21512242e-03, -2.23348444e-02,
1.42561434e-02, -1.53124701e-02, -1.98227386e-02,
-4.32639609e-02, -4.55123756e-02, 3.41588961e-02,
-2.79200019e-02, -2.80913519e-02, -3.70158193e-02,
-5.71450812e-02, -6.98950055e-02, 3.20166460e-03,
-8.35434381e-03, -3.37626474e-02, 3.04261767e-02,
8.68464455e-03, 6.19380860e-03, 5.94178093e-02,
9.07299927e-03, -1.48931608e-02, 1.43559662e-02,
-1.09026291e-02, 2.67698222e-02, 4.73789204e-02,
-2.96433361e-02, 3.09421546e-02, 1.57925629e-02,
-3.16718592e-02, -4.00107422e-02, -5.40259912e-02,
2.82611626e-02, -1.12100705e-02, -5.45402650e-02,
6.32178956e-02, -1.49996080e-02, 2.47544839e-03,
-4.56643800e-02, -1.83881757e-02, 1.19958183e-02,
-3.72173116e-02, -2.25516704e-03, 4.58654494e-02,
4.79165444e-02, 2.51837653e-03, -4.31723370e-02,
-5.35323911e-03, 5.76993755e-02, 7.40813797e-03,
-3.20589584e-02, 4.35704233e-03, 1.68303219e-02,
-2.92569844e-02, -2.24492420e-03, -8.30220402e-03,
-1.00012763e-02, 2.17136848e-02, -1.92626296e-03,
-1.33222836e-02, -2.80300296e-02, -1.75291906e-02,
-9.17806240e-03, -7.09945113e-03, -1.43033039e-02,
5.06832132e-02, -1.84813339e-02, -4.71510021e-02,
1.72569509e-02, -4.76643337e-02, -9.08944947e-04,
4.00771410e-02, 7.53996143e-02, 7.25623789e-03,
4.82604254e-02, 4.50556646e-02, 3.61202546e-02,
-8.16480000e-03, 1.95406599e-02, 3.57885737e-02,
4.89305915e-02, 3.82972671e-02, 6.23919586e-02,
6.13672972e-02, -1.68752257e-02, 1.66514002e-02,
3.35522936e-02, -1.80215217e-02, 4.46410116e-02,
-3.53245501e-02, -3.67292890e-02, -4.62254818e-03,
4.86831638e-02, 3.39666279e-02, 6.21702399e-03,
1.73612338e-02, 2.01699005e-02, 2.17095948e-02,
2.91415905e-02, 2.37778080e-02, 4.84695862e-02,
-9.22616607e-03, -2.82638013e-02, -2.13779411e-02,
1.80778766e-03, 4.79689453e-02, -9.78886371e-03,
1.11430765e-02, -1.65021135e-02, -2.89087894e-02,
2.42851816e-02, -1.22348585e-02, -2.92870677e-02,
-2.89848408e-02, -3.39532852e-02, -3.65277078e-03,
2.65323530e-02, 4.58038926e-02, -5.93380229e-02,
-2.13630624e-02, -3.09405346e-02, 5.50178571e-02,
-3.38818048e-02, 6.12645280e-03, 1.41482598e-02,
1.10217060e-02, 5.33810398e-02, -2.12339021e-02,
6.37409855e-03, -1.13075108e-02, -2.64227570e-02,
-2.22399468e-02, -5.31918907e-02, -3.98653719e-02,
-1.29727535e-01, -3.28093643e-02, -2.89707579e-02,
-9.13464196e-03, -7.28713817e-03, -3.71049989e-02,
-6.34906413e-02, 2.04382488e-03, -8.26796285e-02,
-6.71217875e-02, -2.29133663e-03, -2.33451168e-02,
1.77913406e-02, -8.74667015e-02, -2.76479257e-03,
-4.38273519e-02, -1.28052427e-02, 2.78033434e-02,
-4.32696902e-02, -3.22691604e-02, -2.28028012e-02,
-2.57414654e-02, 2.03622951e-02, -9.90256702e-03,
-3.15035044e-02, -1.81418688e-02, -1.12288589e-03,
-4.17435241e-02, -6.23478649e-02, 2.54773437e-04,
-6.73680303e-02, 6.53963948e-02, 1.06521194e-02,
2.21985815e-02, -1.98727892e-02, -1.85520620e-02,
4.05703271e-02, -3.02838807e-02, -8.10049163e-02,
-7.42459546e-02, -4.93851352e-02, -1.01769572e-02,
1.09409504e-02, -4.49253231e-02, 2.92749893e-02,
7.05313262e-03, 5.07541157e-03, -4.84047555e-03,
2.48709627e-03, 3.00655506e-02, -2.63104762e-03,
4.64683601e-03, 7.90209035e-02, 1.04858795e-02,
1.68079266e-02, -4.36718576e-02, -1.08853505e-02,
2.10244260e-02, -4.41963632e-02, 3.16504713e-03,
6.98672799e-02, 8.61634194e-02, 4.96234330e-02,
6.03893573e-03, 5.56492348e-02, -2.98919713e-02,
4.13045940e-03, -3.21953663e-02, -3.14991417e-02,
-5.31277482e-02, 2.67254758e-02, 3.14427898e-02,
6.67119559e-03, -1.28703023e-02, 2.20150641e-02,
5.68524985e-02, 2.25602848e-02, -2.04616470e-02,
5.10349025e-03, 2.85357118e-02, -1.81665991e-02,
-8.48441134e-03, -3.18822276e-02, -1.18500315e-02,
-4.10846516e-02, 3.11779366e-02, 9.63468533e-03,
-8.25937808e-03, -3.12230060e-02, 8.57664142e-03,
-9.70176246e-03, 1.32377270e-02, 4.06446596e-02,
8.23386736e-03, -3.27357129e-02, -4.33869015e-03,
-1.75530367e-02, 6.88847420e-03, 3.45132209e-02,
7.03299165e-02, 2.16786294e-02, 5.32215335e-03,
8.17563544e-02, 6.40061503e-02, -2.31137902e-03,
-1.17555633e-02, 1.75889175e-01, 3.18129429e-02,
-3.15886206e-02, 3.34028171e-02, 2.22785298e-02,
1.00229999e-02, -4.74916649e-02, -2.12760286e-02,
-3.98717296e-02, -6.04068482e-02, -4.65057888e-02,
1.03006018e-02, -3.05697523e-04, 1.80741576e-02,
-1.75454419e-02, -8.72592256e-02, 1.00662572e-01,
4.46107349e-03, 7.46871534e-02, -6.13410424e-02,
2.81703436e-02, -1.40979778e-02, 3.14638106e-02,
-1.63834163e-02, 3.66531652e-02, -5.15648735e-03,
1.45093764e-02, 6.35868080e-02, 2.34599227e-02,
8.81057849e-02, 6.15343397e-02, -1.39361418e-02,
2.07247139e-02, -3.15473607e-03, 5.15426175e-02,
-2.88767817e-02, 1.60263475e-02, 2.09703161e-02,
-3.29172299e-02, -2.59462113e-02, -5.60400880e-02,
-3.64628127e-02, 1.12883212e-02, 2.17266681e-02,
-1.51637255e-02, -7.82886222e-03, 2.42549684e-02,
9.47014322e-02, -2.63031463e-02, 1.17384412e-04,
-5.24178474e-03, 4.17989303e-02, 8.85680814e-02,
6.23639986e-03, 1.86599575e-02, 1.54629231e-02,
3.50535064e-03, 6.20606401e-03, -1.19790483e-02,
1.59526299e-02, 7.12124289e-03, -8.93194037e-02,
-3.54334544e-03, 1.23477512e-02, 3.03927521e-02,
-2.37294351e-02, -3.82794700e-02, -4.98744939e-02,
4.66894814e-02, -1.23291083e-02, -1.10332125e-02,
2.18107422e-02, 2.18720270e-02, 2.63537549e-02,
1.05279336e-02, 1.84618590e-02, 8.36095792e-04,
-6.65216361e-03, 3.49397998e-02, 1.49353965e-02,
-1.11599448e-02, 6.69096152e-03, -2.00057857e-02,
-3.99014382e-02, 3.01872301e-02, -1.09867185e-02,
-4.11777843e-02, 2.72051858e-02, 1.16426462e-02,
-1.55501661e-02, 3.27701258e-02, 3.95493760e-02,
8.48728578e-03, 2.19935174e-02, -9.88648510e-03,
-3.61421769e-02, -4.77019514e-02, 1.90072175e-02,
-5.58286614e-02, -3.31742315e-02, -2.24912419e-02,
-3.36177357e-02, -4.07356308e-02, 1.08863176e-02,
1.12811649e-02, 7.63146319e-02, 4.04811664e-03,
3.07013271e-02, 2.89177297e-02, 4.71625148e-03,
5.13381859e-02, -4.10363756e-02, 1.23252500e-03,
-2.50403343e-02, 5.85904487e-02, -1.04965514e-01,
-4.41704209e-02, 1.18518242e-02, -5.83203587e-02,
-4.82243653e-02, 9.17658664e-03, 1.03259720e-02,
-5.09181971e-03, -3.23391068e-02, -3.19384404e-02,
-1.53769579e-02, -5.21213118e-02, 1.55619390e-02,
2.93484967e-02, -1.92529920e-02, 1.76694211e-02,
2.67991906e-02, 5.76554384e-02, -1.38161654e-02,
2.60399524e-02, 1.50401038e-02, 1.27426795e-02,
-2.29243257e-02, -1.06665797e-02, 9.81943185e-03,
-4.77511727e-02, 1.64243717e-02]])
It’s a numpy array with only one coefficient per voxel:
print(f"{coef_.shape=}")
coef_.shape=(1, 464)
To get the Nifti image of these coefficients, we only need to retrieve coef_img_ from the decoder and select the class:
coef_img = decoder.coef_img_["face"]
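Conceptually, turning the weight vector back into an image means placing each coefficient at its voxel's position inside the mask. The NumPy sketch below shows that idea with a made-up mask and made-up weights; it is not how nilearn is implemented, and the real coef_img_ additionally carries the spatial information of a Nifti image.

```python
import numpy as np

# Hypothetical 3D boolean mask with 8 voxels set, and one weight per masked voxel.
mask_3d = np.zeros((4, 5, 6), dtype=bool)
mask_3d[1:3, 2:4, 0:2] = True
weights = np.arange(1, 9, dtype=float).reshape(1, 8)  # same layout as coef_: (1, n_voxels)

# Start from an all-zero volume and write the weights at the masked positions.
weight_volume = np.zeros(mask_3d.shape)
weight_volume[mask_3d] = weights[0]

print(weight_volume.shape)              # (4, 5, 6)
print(np.count_nonzero(weight_volume))  # 8
```

Voxels outside the mask stay at zero, which is why weight maps of this kind are only non-zero inside the region of interest.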
coef_img is now a NiftiImage. We can save the coefficients as a nii.gz file:
from pathlib import Path
output_dir = Path.cwd() / "results" / "plot_decoding_tutorial"
output_dir.mkdir(exist_ok=True, parents=True)
print(f"Output will be saved to: {output_dir}")
decoder.coef_img_["face"].to_filename(output_dir / "haxby_svc_weights.nii.gz")
Output will be saved to: /home/runner/work/nilearn/nilearn/examples/00_tutorials/results/plot_decoding_tutorial
Plotting the SVM weights
We can plot the weights, using the subject’s anatomical image as a background:
from nilearn.plotting import view_img

view_img(
    decoder.coef_img_["face"],
    bg_img=haxby_dataset.anat[0],
    title="SVM weights",
    dim=-1,
)