9.1.4. An introductory tutorial to fMRI decoding

Here is a simple tutorial on decoding with nilearn. It reproduces the Haxby 2001 study on a face vs cat discrimination task in a mask of the ventral stream.

  • J.V. Haxby et al. “Distributed and Overlapping Representations of Faces and Objects in Ventral Temporal Cortex”, Science vol. 293 (2001), pp. 2425-2430.

This tutorial is meant as an introduction to the various steps of a decoding analysis using the Nilearn meta-estimator nilearn.decoding.Decoder.

It is not a minimalistic example, as it strives to be didactic. It is not meant to be copied to analyze new data: many of the steps are unnecessary.

9.1.4.1. Retrieve and load the fMRI data from the Haxby study

9.1.4.1.1. First download the data

The nilearn.datasets.fetch_haxby function downloads the Haxby dataset into the nilearn data directory if it is not already present on disk. Downloading about 310 MB of data from the Internet can take a while.

from nilearn import datasets
# By default 2nd subject will be fetched
haxby_dataset = datasets.fetch_haxby()
# 'func' is a list of filenames: one for each subject
fmri_filename = haxby_dataset.func[0]

# print basic information on the dataset
print('First subject functional nifti images (4D) are at: %s' %
      fmri_filename)  # 4D data

Out:

First subject functional nifti images (4D) are at: /home/nicolas/nilearn_data/haxby2001/subj2/bold.nii.gz

9.1.4.1.2. Visualizing the fmri volume

One way to visualize an fMRI volume is using nilearn.plotting.plot_epi. We will visualize the previously fetched fMRI data from the Haxby dataset.

Because fMRI data are 4D (they consist of many 3D EPI images), we cannot plot them directly using nilearn.plotting.plot_epi (which accepts only 3D input). Here we use nilearn.image.mean_img to compute a single 3D image, the mean over time, from the fMRI data, and display it with the interactive viewer nilearn.plotting.view_img.

from nilearn import plotting
from nilearn.image import mean_img
plotting.view_img(mean_img(fmri_filename), threshold=None)
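To make the 4D-to-3D reduction concrete, here is a minimal sketch with plain NumPy rather than nilearn. The array `fmri_4d` is a hypothetical stand-in for a loaded run (random numbers, not the Haxby data); averaging over its last axis is roughly what a mean image amounts to.

```python
import numpy as np

# Hypothetical stand-in for a 4D fMRI run: 4 x 5 x 6 voxels, 10 time points.
rng = np.random.default_rng(0)
fmri_4d = rng.normal(size=(4, 5, 6, 10))

# Averaging over the last (time) axis leaves a single 3D volume.
mean_3d = fmri_4d.mean(axis=-1)

print(fmri_4d.shape)  # (4, 5, 6, 10)
print(mean_3d.shape)  # (4, 5, 6)
```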


9.1.4.1.3. Feature extraction: from fMRI volumes to a data matrix

These are some really lovely images, but for machine learning we need matrices to work with the actual data. Fortunately, the nilearn.decoding.Decoder object we will use later on can automatically transform Nifti images into matrices. All we have to do for now is define a mask filename.
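As a rough illustration of what "transforming images into matrices" means, the sketch below applies a boolean 3D mask to a toy 4D array with NumPy. The names `fmri_4d`, `mask_3d` and `X` are hypothetical, and the Decoder does more than this internally (standardization, for instance); the point is only the resulting (n_samples, n_voxels) layout.

```python
import numpy as np

rng = np.random.default_rng(0)
fmri_4d = rng.normal(size=(4, 5, 6, 10))  # toy 4D run: x, y, z, time

# Toy region of interest: a boolean 3D mask selecting a small block of voxels.
mask_3d = np.zeros((4, 5, 6), dtype=bool)
mask_3d[1:3, 2:4, 0:3] = True  # 2 * 2 * 3 = 12 voxels

# Boolean indexing keeps the in-mask voxels: shape (n_voxels, n_timepoints).
# Transposing gives the (n_samples, n_features) layout that estimators expect.
X = fmri_4d[mask_3d].T

print(X.shape)  # (10, 12)
```

Each row of `X` is one volume (one sample) and each column is one voxel inside the mask.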

A mask of the Ventral Temporal (VT) cortex coming from the Haxby study is available:

mask_filename = haxby_dataset.mask_vt[0]

# Let's visualize it, using the subject's anatomical image as a
# background
plotting.plot_roi(mask_filename, bg_img=haxby_dataset.anat[0],
                  cmap='Paired')

Out:

<nilearn.plotting.displays.OrthoSlicer object at 0x7f881e6ee820>

9.1.4.1.4. Load the behavioral labels

Now that we know how the brain images will be converted to a data matrix, we can apply machine learning to them, for instance to predict the task that the subject was doing. The behavioral labels are stored in a space-separated CSV file.

We use pandas to load them into a table.

import pandas as pd
# Load behavioral information
behavioral = pd.read_csv(haxby_dataset.session_target[0], delimiter=' ')
print(behavioral)

Out:

     labels  chunks
0      rest       0
1      rest       0
2      rest       0
3      rest       0
4      rest       0
...     ...     ...
1447   rest      11
1448   rest      11
1449   rest      11
1450   rest      11
1451   rest      11

[1452 rows x 2 columns]

The task was a visual-recognition task, and the labels denote the experimental condition: the type of object that was presented to the subject. This is what we are going to try to predict.

conditions = behavioral['labels']
print(conditions)

Out:

0       rest
1       rest
2       rest
3       rest
4       rest
        ...
1447    rest
1448    rest
1449    rest
1450    rest
1451    rest
Name: labels, Length: 1452, dtype: object

9.1.4.1.5. Restrict the analysis to cats and faces

As we can see from the targets above, the experiment contains many conditions. As a consequence, the data is quite big. Not all of this data is of interest to us for decoding, so we will keep only the fMRI signals corresponding to faces or cats. We create a mask of the samples belonging to these conditions; this mask is then applied to the fMRI data to restrict the classification to the face vs cat discrimination.

The input data will become much smaller (i.e. the fMRI signal is shorter):

condition_mask = conditions.isin(['face', 'cat'])

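Because the data is in one single large 4D image, we need to use index_img to do the split easily.

from nilearn.image import index_img
fmri_niimgs = index_img(fmri_filename, condition_mask)

We apply the same mask to the targets

conditions = conditions[condition_mask]
# Convert to a numpy array so that positional indexing works later on
conditions = conditions.values
print(conditions.shape)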

Out:

(216,)
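The selection step can be sketched on toy data with NumPy alone. The labels and signals below are made up for illustration; in the tutorial the same boolean mask is applied to the 4D image with index_img and to the label column.

```python
import numpy as np

# Toy label sequence, one label per acquired volume (hypothetical values).
labels = np.array(['rest', 'face', 'face', 'rest', 'cat', 'cat', 'house', 'rest'])
# Toy signals: 8 volumes, 3 voxels each.
signals = np.arange(len(labels) * 3).reshape(len(labels), 3)

# True where the volume belongs to one of the two conditions of interest.
keep = np.isin(labels, ['face', 'cat'])

print(keep.sum())           # 4
print(labels[keep])         # ['face' 'face' 'cat' 'cat']
print(signals[keep].shape)  # (4, 3)
```

Applying one and the same mask to both the signals and the labels keeps them aligned sample by sample.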

9.1.4.2. Decoding with Support Vector Machine

As a decoder, we use a Support Vector Classifier with a linear kernel. We first create it using nilearn.decoding.Decoder.

from nilearn.decoding import Decoder
decoder = Decoder(estimator='svc', mask=mask_filename, standardize=True)

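The decoder object is an object that can be fit (or trained) on data with labels, and can then predict labels for unlabeled data.

We first fit it on the data

decoder.fit(fmri_niimgs, conditions)

We can then predict the labels from the data

prediction = decoder.predict(fmri_niimgs)
print(prediction)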

Out:

['face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat']

Note that for this classification task both classes contain the same number of samples (the problem is balanced), so accuracy is a reasonable measure of the decoder's performance. Let’s measure the prediction accuracy:

print((prediction == conditions).sum() / float(len(conditions)))

Out:

1.0

This prediction accuracy score is meaningless. Why?
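One way to see the problem is to score a model on pure noise. The sketch below is an illustration under stated assumptions, not the tutorial's method: it swaps the SVC for a minimum-norm least-squares linear classifier built with NumPy, and all data are random. With more features than samples, such a model can reproduce its training labels exactly even though there is nothing to learn.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_features = 40, 1000, 200

# Pure noise: the labels carry no information about the features.
X_train = rng.normal(size=(n_train, n_features))
y_train = rng.choice([-1.0, 1.0], size=n_train)
X_test = rng.normal(size=(n_test, n_features))
y_test = rng.choice([-1.0, 1.0], size=n_test)

# Minimum-norm least-squares weights; with more features than samples
# the training labels can be fit exactly.
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Accuracy on the data used for fitting versus on unseen data.
train_acc = np.mean(np.sign(X_train @ w) == y_train)
test_acc = np.mean(np.sign(X_test @ w) == y_test)

print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
```

The training accuracy is perfect while the accuracy on unseen noise stays near 0.5, which is why a score computed on the training data says little about how well a model generalizes.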

9.1.4.3. Measuring prediction scores using cross-validation

The proper way to measure error rates or prediction accuracy is via cross-validation: leaving out some data and testing on it.

9.1.4.3.1. Manually leaving out data

Let’s leave out the last 30 data points during training, and test the prediction on these last 30 points:

fmri_niimgs_train = index_img(fmri_niimgs, slice(0, -30))
fmri_niimgs_test = index_img(fmri_niimgs, slice(-30, None))
conditions_train = conditions[:-30]
conditions_test = conditions[-30:]

decoder = Decoder(estimator='svc', mask=mask_filename, standardize=True)
decoder.fit(fmri_niimgs_train, conditions_train)

prediction = decoder.predict(fmri_niimgs_test)

# The prediction accuracy is calculated on the test data: this is the accuracy
# of our model on examples it hasn't seen, which tells us how well the model
# performs in general.

print("Prediction Accuracy: {:.3f}".format(
    (prediction == conditions_test).sum() / float(len(conditions_test))))

Out:

Prediction Accuracy: 0.767

9.1.4.3.2. Implementing a KFold loop

We can repeatedly split the data into train and test sets with a KFold strategy, using scikit-learn’s KFold object:

from sklearn.model_selection import KFold
cv = KFold(n_splits=5)

# The "cv" object's split method can now accept data and create a
# generator which can yield the splits.
fold = 0
for train, test in cv.split(conditions):
    fold += 1
    decoder = Decoder(estimator='svc', mask=mask_filename, standardize=True)
    decoder.fit(index_img(fmri_niimgs, train), conditions[train])
    prediction = decoder.predict(index_img(fmri_niimgs, test))
    print(
        "CV Fold {:01d} | Prediction Accuracy: {:.3f}".format(
            fold,
            (prediction == conditions[test]).sum() / float(len(
                conditions[test]))))

Out:

CV Fold 1 | Prediction Accuracy: 0.886
CV Fold 2 | Prediction Accuracy: 0.767
CV Fold 3 | Prediction Accuracy: 0.767
CV Fold 4 | Prediction Accuracy: 0.698
CV Fold 5 | Prediction Accuracy: 0.744
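To see what the five folds look like for the 216 retained samples, here is a small sketch that mimics an unshuffled K-fold split with `np.array_split`. This is a stand-in chosen for illustration, not scikit-learn's implementation; an unshuffled KFold likewise yields consecutive blocks, with the first `n_samples % n_splits` folds one sample larger than the others.

```python
import numpy as np

n_samples, n_splits = 216, 5
indices = np.arange(n_samples)

# Consecutive, nearly equal blocks; each block serves once as the test set.
test_folds = np.array_split(indices, n_splits)

for fold, test in enumerate(test_folds, start=1):
    # Everything outside the current block is used for training.
    train = np.setdiff1d(indices, test)
    print(f"fold {fold}: {len(train)} train, {len(test)} test")

fold_sizes = [len(test) for test in test_folds]
print(fold_sizes)  # [44, 43, 43, 43, 43]
```

These sizes are consistent with the accuracies printed above: 0.886 is 39/44, and 0.767, 0.698 and 0.744 are 33/43, 30/43 and 32/43.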

9.1.4.3.3. Cross-validation with the decoder

The decoder also implements a cross-validation loop by default and returns an array of shape (cross-validation parameters, n_folds). To measure its performance with the accuracy score, we set the scoring parameter to 'accuracy'.

n_folds = 5
decoder = Decoder(
    estimator='svc', mask=mask_filename,
    standardize=True, cv=n_folds,
    scoring='accuracy'
)
decoder.fit(fmri_niimgs, conditions)

A cross-validation pipeline can also be implemented manually. More details can be found on the scikit-learn website.

Then we can check the best performing parameters per fold.

print(decoder.cv_params_['face'])

Out:

{'C': [100.0, 100.0, 100.0, 100.0, 100.0]}

Note

We can speed things up by using all the CPUs of our computer with the n_jobs parameter.

The best way to do cross-validation is to respect the structure of the experiment, for instance by leaving out full sessions of acquisition.

The number of the session is stored in the CSV file giving the behavioral data. We have to apply our condition mask to it, to select only cats and faces.

session_label = behavioral['chunks'][condition_mask]

The fMRI data is acquired by sessions, and the noise is autocorrelated within a given session. Hence, it is better to predict across sessions when doing cross-validation. To leave a session out, pass the cross-validator object to the cv parameter of the decoder and the session labels to the groups parameter of fit.

from sklearn.model_selection import LeaveOneGroupOut
cv = LeaveOneGroupOut()

decoder = Decoder(estimator='svc', mask=mask_filename, standardize=True,
                  cv=cv)
decoder.fit(fmri_niimgs, conditions, groups=session_label)

print(decoder.cv_scores_)

Out:

{'cat': [1.0, 1.0, 1.0, 1.0, 0.9629629629629629, 0.8518518518518519, 0.9753086419753086, 0.40740740740740744, 0.9876543209876543, 1.0, 0.9259259259259259, 0.8765432098765432], 'face': [1.0, 1.0, 1.0, 1.0, 0.9629629629629629, 0.8518518518518519, 0.9753086419753086, 0.40740740740740744, 0.9876543209876543, 1.0, 0.9259259259259259, 0.8765432098765432]}
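The leave-one-session-out splitting can be sketched by hand with NumPy. The layout below is an assumption made for illustration (12 equal sessions of 18 retained volumes each, 216 in total), and the loop only mirrors the idea of the cross-validator, not its implementation.

```python
import numpy as np

# Hypothetical layout: 12 sessions with 18 retained volumes each (216 in total).
groups = np.repeat(np.arange(12), 18)

splits = []
for session in np.unique(groups):
    test = np.flatnonzero(groups == session)   # the whole held-out session
    train = np.flatnonzero(groups != session)  # all remaining sessions
    splits.append((train, test))

print(len(splits))                           # 12
print(len(splits[0][0]), len(splits[0][1]))  # 198 18
```

Because no session contributes to both the train and the test set of a fold, noise shared within a session cannot leak into the score. The number of folds equals the number of sessions, which matches the 12 scores per class printed above.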

9.1.4.4. Inspecting the model weights

Finally, it may be useful to inspect and display the model weights.

9.1.4.4.1. Turning the weights into a nifti image

We retrieve the SVC discriminating weights

coef_ = decoder.coef_
print(coef_)

Out:

[[-3.88469300e-02 -1.86752077e-02 -3.22275135e-02 -2.88102737e-02
   4.17749963e-02  1.10474970e-02  1.69627682e-02 -5.49689355e-02
  -1.93773796e-02 -3.50416310e-02  1.08280122e-02 -1.28502096e-02
  -1.54317345e-02 -3.78043212e-02 -3.68278674e-02  2.27559214e-02
   6.55005987e-03 -7.64033825e-03  1.66730252e-02 -8.00447312e-03
   5.28262073e-02 -8.15726467e-02 -6.35573880e-02  2.40756015e-02
   4.58824104e-02 -2.22077001e-02 -1.76883046e-02  2.21688218e-02
  -9.51053739e-03  5.74705130e-02  2.13813275e-02 -9.12119455e-02
   4.02843447e-03 -2.88624875e-02 -3.88123311e-02 -3.34317404e-02
   2.20847936e-03  8.71165010e-03 -3.36651909e-02 -2.40699153e-02
  -6.80079334e-02  1.65042971e-02  2.70137225e-02 -6.55177529e-03
  -1.21379942e-02  5.46412781e-02  8.11523817e-03  3.60125844e-02
  -1.52385087e-02  7.01266662e-02  1.28224210e-03  2.07522134e-02
  -4.09189616e-03  3.71550796e-02 -3.76546024e-02 -1.03609055e-02
  -2.37697484e-02 -5.47604419e-02  4.41999718e-02 -1.47077106e-01
  -2.33519538e-02  1.86652592e-02  6.64352095e-02 -9.05547178e-02
  -1.21727246e-02 -2.94679934e-03  3.21325620e-02 -3.03308322e-02
   6.13950790e-02  1.12010834e-02  1.93353456e-02 -1.30260569e-02
   4.41940713e-02 -2.22555592e-02  6.86546889e-02  1.69013065e-02
   1.78552869e-02  1.00063043e-02  2.98480953e-02 -2.51582758e-02
   1.05948033e-02 -6.30480627e-03  2.21078444e-03 -2.22817895e-02
   1.42261239e-02 -1.52784238e-02 -1.97786693e-02 -4.31616119e-02
  -4.54073619e-02  3.40811504e-02 -2.78553337e-02 -2.80246686e-02
  -3.69301450e-02 -5.70138878e-02 -6.97323489e-02  3.19243532e-03
  -8.33528782e-03 -3.36854424e-02  3.03556033e-02  8.66310335e-03
   6.17990568e-03  5.92798141e-02  9.05113983e-03 -1.48581740e-02
   1.43214374e-02 -1.08762183e-02  2.67057365e-02  4.72687428e-02
  -2.95714955e-02  3.08745064e-02  1.57552839e-02 -3.16008268e-02
  -3.99189612e-02 -5.38977552e-02  2.81973856e-02 -1.11835158e-02
  -5.44117473e-02  6.30732363e-02 -1.49667441e-02  2.47157325e-03
  -4.55570852e-02 -1.83423738e-02  1.19705558e-02 -3.71288818e-02
  -2.24799668e-03  4.57568716e-02  4.78067772e-02  2.51047979e-03
  -4.30722655e-02 -5.33657784e-03  5.75655195e-02  7.39163508e-03
  -3.19810227e-02  4.34816928e-03  1.67904779e-02 -2.91879430e-02
  -2.23728409e-03 -8.28198403e-03 -9.97899906e-03  2.16654965e-02
  -1.92084314e-03 -1.32900438e-02 -2.79644257e-02 -1.74907249e-02
  -9.15538163e-03 -7.08033766e-03 -1.42698502e-02  5.05703745e-02
  -1.84395078e-02 -4.70414591e-02  1.72191461e-02 -4.75574316e-02
  -9.07025728e-04  3.99828115e-02  7.52266668e-02  7.24258009e-03
   4.81499374e-02  4.49523587e-02  3.60353069e-02 -8.14546873e-03
   1.94962578e-02  3.57075135e-02  4.88176569e-02  3.82081241e-02
   6.22478794e-02  6.12261630e-02 -1.68380194e-02  1.66160508e-02
   3.34741077e-02 -1.79793563e-02  4.45395610e-02 -3.52426726e-02
  -3.66473941e-02 -4.61423767e-03  4.85718365e-02  3.38889058e-02
   6.20125140e-03  1.73239496e-02  2.01273887e-02  2.16578615e-02
   2.90729803e-02  2.37270705e-02  4.83612185e-02 -9.20492580e-03
  -2.81969812e-02 -2.13306608e-02  1.80421555e-03  4.78566786e-02
  -9.76548486e-03  1.11162097e-02 -1.64704482e-02 -2.88447847e-02
   2.42267869e-02 -1.22079183e-02 -2.92194387e-02 -2.89203497e-02
  -3.38759194e-02 -3.64225833e-03  2.64729394e-02  4.57032263e-02
  -5.92019380e-02 -2.13147182e-02 -3.08698252e-02  5.48930084e-02
  -3.38039954e-02  6.11180712e-03  1.41178959e-02  1.09947614e-02
   5.32573758e-02 -2.11837384e-02  6.35996689e-03 -1.12817934e-02
  -2.63616359e-02 -2.21911236e-02 -5.30671210e-02 -3.97748219e-02
  -1.29431122e-01 -3.27320546e-02 -2.89008951e-02 -9.11553324e-03
  -7.27002641e-03 -3.70177707e-02 -6.33424364e-02  2.04167719e-03
  -8.24861225e-02 -6.69635391e-02 -2.28577044e-03 -2.32902396e-02
   1.77468924e-02 -8.72664447e-02 -2.75690311e-03 -4.37284527e-02
  -1.27747362e-02  2.77376041e-02 -4.31669037e-02 -3.21909025e-02
  -2.27498582e-02 -2.56846160e-02  2.03155309e-02 -9.88064789e-03
  -3.14294681e-02 -1.81002210e-02 -1.11800312e-03 -4.16455499e-02
  -6.22036913e-02  2.55494549e-04 -6.72171386e-02  6.52439595e-02
   1.06279801e-02  2.21493015e-02 -1.98227521e-02 -1.85107213e-02
   4.04761407e-02 -3.02129168e-02 -8.08213759e-02 -7.40774881e-02
  -4.92687692e-02 -1.01545007e-02  1.09171367e-02 -4.48198588e-02
   2.92093597e-02  7.03556877e-03  5.06299133e-03 -4.82908256e-03
   2.48284392e-03  2.99975207e-02 -2.62531634e-03  4.63549808e-03
   7.88406635e-02  1.04608013e-02  1.67695133e-02 -4.35721740e-02
  -1.08621153e-02  2.09744979e-02 -4.40929563e-02  3.15757019e-03
   6.97068096e-02  8.59640734e-02  4.95095018e-02  6.02621002e-03
   5.55187086e-02 -2.98207742e-02  4.11938311e-03 -3.21183999e-02
  -3.14239748e-02 -5.30016951e-02  2.66641103e-02  3.13671515e-02
   6.65664882e-03 -1.28392470e-02  2.19675908e-02  5.67231359e-02
   2.25086785e-02 -2.04144919e-02  5.09069852e-03  2.84697984e-02
  -1.81223071e-02 -8.46466661e-03 -3.18111433e-02 -1.18213294e-02
  -4.09899222e-02  3.11041883e-02  9.61316405e-03 -8.24096474e-03
  -3.11480410e-02  8.55869718e-03 -9.67736228e-03  1.32031802e-02
   4.05487405e-02  8.21003901e-03 -3.26565562e-02 -4.32623868e-03
  -1.75125184e-02  6.87120646e-03  3.44345543e-02  7.01687256e-02
   2.16271047e-02  5.30863481e-03  8.15657397e-02  6.38545313e-02
  -2.30756407e-03 -1.17255813e-02  1.75482624e-01  3.17384801e-02
  -3.15195403e-02  3.33275603e-02  2.22248107e-02  9.99730991e-03
  -4.73820528e-02 -2.12284034e-02 -3.97808169e-02 -6.02665008e-02
  -4.63980160e-02  1.02753940e-02 -3.05333511e-04  1.80353507e-02
  -1.75046312e-02 -8.70568886e-02  1.00429860e-01  4.45201894e-03
   7.45126077e-02 -6.11977514e-02  2.81053605e-02 -1.40643967e-02
   3.13915696e-02 -1.63458529e-02  3.65698457e-02 -5.14600953e-03
   1.44761787e-02  6.34381461e-02  2.34023154e-02  8.79029319e-02
   6.13932828e-02 -1.39020044e-02  2.06772313e-02 -3.14592531e-03
   5.14254989e-02 -2.88097121e-02  1.59904444e-02  2.09223396e-02
  -3.28410531e-02 -2.58826572e-02 -5.59116612e-02 -3.63769942e-02
   1.12594004e-02  2.16794564e-02 -1.51320600e-02 -7.81039575e-03
   2.42019000e-02  9.44817024e-02 -2.62429942e-02  1.16247980e-04
  -5.23302283e-03  4.17019448e-02  8.83630985e-02  6.22086484e-03
   1.86172056e-02  1.54275368e-02  3.49552486e-03  6.19241780e-03
  -1.19521284e-02  1.59131296e-02  7.10290362e-03 -8.91137467e-02
  -3.53352070e-03  1.23197415e-02  3.03208384e-02 -2.36741261e-02
  -3.81953874e-02 -4.97581859e-02  4.65828989e-02 -1.23004665e-02
  -1.10093169e-02  2.17590598e-02  2.18215816e-02  2.62921305e-02
   1.05030662e-02  1.84179839e-02  8.31930222e-04 -6.63451506e-03
   3.48614863e-02  1.48989712e-02 -1.11346998e-02  6.67421500e-03
  -1.99597920e-02 -3.98119594e-02  3.01189775e-02 -1.09587621e-02
  -4.10856220e-02  2.71431922e-02  1.16140820e-02 -1.55127378e-02
   3.26913874e-02  3.94582453e-02  8.47095051e-03  2.19427921e-02
  -9.86333333e-03 -3.60583746e-02 -4.75929031e-02  1.89646929e-02
  -5.57029188e-02 -3.31009892e-02 -2.24389883e-02 -3.35394977e-02
  -4.06400599e-02  1.08606412e-02  1.12545391e-02  7.61358519e-02
   4.03813651e-03  3.06313082e-02  2.88499824e-02  4.70356878e-03
   5.12218200e-02 -4.09448970e-02  1.22999650e-03 -2.49837667e-02
   5.84588518e-02 -1.04721977e-01 -4.40682418e-02  1.18252564e-02
  -5.81870044e-02 -4.81118466e-02  9.15435646e-03  1.03017713e-02
  -5.07927531e-03 -3.22632584e-02 -3.18666555e-02 -1.53436644e-02
  -5.19996219e-02  1.55244223e-02  2.92793129e-02 -1.92077555e-02
   1.76311098e-02  2.67382023e-02  5.75229619e-02 -1.37858272e-02
   2.59814700e-02  1.50061194e-02  1.27144933e-02 -2.28690941e-02
  -1.06427516e-02  9.79317819e-03 -4.76402327e-02  1.63869018e-02]]

It’s a numpy array with only one coefficient per voxel:

print(coef_.shape)

Out:

(1, 464)

To get the Nifti image of these coefficients, we only need to retrieve the coef_img_ attribute of the decoder and select the class

coef_img = decoder.coef_img_['face']

coef_img is now a Nifti image. We can save the coefficients as a nii.gz file:

decoder.coef_img_['face'].to_filename('haxby_svc_weights.nii.gz')
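Going from a (1, n_voxels) weight vector back to a brain volume is the reverse of masking. A minimal NumPy sketch of the idea, with a hypothetical toy mask and made-up weights (the decoder handles this, plus the image affine and header, on its own):

```python
import numpy as np

# Toy 3D mask with 12 in-mask voxels and one weight per in-mask voxel.
mask_3d = np.zeros((4, 5, 6), dtype=bool)
mask_3d[1:3, 2:4, 0:3] = True
coef = np.linspace(-1.0, 1.0, int(mask_3d.sum())).reshape(1, -1)  # shape (1, 12)

# Write each weight back at its voxel position; voxels outside the mask stay 0.
coef_volume = np.zeros(mask_3d.shape)
coef_volume[mask_3d] = coef[0]

print(coef.shape)         # (1, 12)
print(coef_volume.shape)  # (4, 5, 6)
```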

9.1.4.4.2. Plotting the SVM weights

We can plot the weights, using the subject’s anatomical image as a background

plotting.view_img(
    decoder.coef_img_['face'], bg_img=haxby_dataset.anat[0],
    title="SVM weights", dim=-1
)


9.1.4.5. What is the chance level accuracy?

Does the model above perform better than chance? To answer this question, we compute the score obtained by simple strategies that ignore the brain data; these are implemented in the nilearn.decoding.Decoder object. Comparing the decoding performance to this chance-level score shows whether the model has learned anything.

Let’s define an object with a dummy estimator replacing ‘svc’ for the classification setting. This object initializes the estimator with the default dummy strategy.

dummy_decoder = Decoder(estimator='dummy_classifier', mask=mask_filename,
                        cv=cv)
dummy_decoder.fit(fmri_niimgs, conditions, groups=session_label)

# Now, we can compare these scores by simply taking a mean over folds
print(dummy_decoder.cv_scores_)

Out:

{'cat': [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], 'face': [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]}
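The comparison itself can be made by averaging the per-fold scores. The sketch below simply copies the numbers printed in the two outputs of this tutorial into lists (`svc_scores` and `dummy_scores` are names chosen here) and takes their means with NumPy.

```python
import numpy as np

# Per-fold scores copied from the two outputs printed in this tutorial.
svc_scores = [1.0, 1.0, 1.0, 1.0, 0.9629629629629629, 0.8518518518518519,
              0.9753086419753086, 0.40740740740740744, 0.9876543209876543,
              1.0, 0.9259259259259259, 0.8765432098765432]
dummy_scores = [0.5] * 12

svc_mean = np.mean(svc_scores)
dummy_mean = np.mean(dummy_scores)

print(f"SVC mean score:   {svc_mean:.3f}")    # 0.916
print(f"Dummy mean score: {dummy_mean:.3f}")  # 0.500
```

On average the SVC scores well above the dummy estimator, although one fold (0.407) falls below the chance level, which is a reminder to look at the spread across folds and not only at the mean.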