8.1.4. A introduction tutorial to fMRI decoding

Here is a simple tutorial on decoding with nilearn. It reproduces the Haxby 2001 study on a face vs cat discrimination task in a mask of the ventral stream.

This tutorial is meant as an introduction to the various steps of a decoding analysis.

It is not a minimalistic example, as it strives to be didactic. It is not meant to be copied to analyze new data: many of the steps are unecessary. Retrieve and load the fMRI data from the Haxby study First download the data

The nilearn.datasets.fetch_haxby function will download the Haxby dataset if not present on the disk, in the nilearn data directory. It can take a while to download about 310 Mo of data from the Internet.

from nilearn import datasets
# By default 2nd subject will be fetched
haxby_dataset = datasets.fetch_haxby()
# 'func' is a list of filenames: one for each subject
fmri_filename = haxby_dataset.func[0]

# print basic information on the dataset
print('First subject functional nifti images (4D) are at: %s' %
      fmri_filename)  # 4D data


First subject functional nifti images (4D) are at: /home/kshitij/nilearn_data/haxby2001/subj2/bold.nii.gz Convert the fMRI volume’s to a data matrix

We will use the nilearn.input_data.NiftiMasker to extract the fMRI data on a mask and convert it to data series.

The mask is a mask of the Ventral Temporal streaming coming from the Haxby study:

mask_filename = haxby_dataset.mask_vt[0]

# Let's visualize it, using the subject's anatomical image as a
# background
from nilearn import plotting
plotting.plot_roi(mask_filename, bg_img=haxby_dataset.anat[0],

Now we use the NiftiMasker.

We first create a masker, giving it the options that we care about. Here we use standardizing of the data, as it is often important for decoding

from nilearn.input_data import NiftiMasker
masker = NiftiMasker(mask_img=mask_filename, standardize=True)

# We give the masker a filename and retrieve a 2D array ready
# for machine learning with scikit-learn
fmri_masked = masker.fit_transform(fmri_filename)

The variable “fmri_masked” is a numpy array:



[[ 7.6757914e-01  2.3108709e+00 -2.0519446e-01 ... -1.0261141e+00
   8.7993503e-02  2.0720530e+00]
 [ 5.5640829e-01  1.6833442e+00 -2.4644940e-01 ... -7.0238096e-01
  -3.4570050e-01  2.0341012e+00]
 [ 7.6757914e-01  1.9186667e+00  1.0802227e-03 ... -9.9374086e-01
  -2.7630946e-01  2.1479564e+00]
 [-4.2905572e-01 -1.6896113e+00 -7.4150860e-01 ... -1.5440873e+00
   1.8054218e+00 -1.6709729e-01]
 [-1.4749458e-01 -1.8072726e+00 -2.4644940e-01 ... -1.7707006e+00
   1.5452055e+00  7.8169519e-01]
 [-2.1788488e-01 -1.4542887e+00  1.0802227e-03 ... -1.6412073e+00
   1.2676412e+00  8.9555031e-01]]

Its shape corresponds to the number of time-points times the number of voxels in the mask



(1452, 464) Load the behavioral labels

The behavioral labels are stored in a CSV file, separated by spaces.

We use pandas to load them in an array.

import pandas as pd
# Load behavioral information
behavioral = pd.read_csv(haxby_dataset.session_target[0], sep=" ")


labels  chunks
0         rest       0
1         rest       0
2         rest       0
3         rest       0
4         rest       0
5         rest       0
6     scissors       0
7     scissors       0
8     scissors       0
9     scissors       0
10    scissors       0
11    scissors       0
12    scissors       0
13    scissors       0
14    scissors       0
15        rest       0
16        rest       0
17        rest       0
18        rest       0
19        rest       0
20        rest       0
21        face       0
22        face       0
23        face       0
24        face       0
25        face       0
26        face       0
27        face       0
28        face       0
29        face       0
...        ...     ...
1422      rest      11
1423       cat      11
1424       cat      11
1425       cat      11
1426       cat      11
1427       cat      11
1428       cat      11
1429       cat      11
1430       cat      11
1431       cat      11
1432      rest      11
1433      rest      11
1434      rest      11
1435      rest      11
1436      rest      11
1437  scissors      11
1438  scissors      11
1439  scissors      11
1440  scissors      11
1441  scissors      11
1442  scissors      11
1443  scissors      11
1444  scissors      11
1445  scissors      11
1446      rest      11
1447      rest      11
1448      rest      11
1449      rest      11
1450      rest      11
1451      rest      11

[1452 rows x 2 columns]

Retrieve the experimental conditions, that we are going to use as prediction targets in the decoding

conditions = behavioral['labels']


0           rest
1           rest
2           rest
3           rest
4           rest
5           rest
6       scissors
7       scissors
8       scissors
9       scissors
10      scissors
11      scissors
12      scissors
13      scissors
14      scissors
15          rest
16          rest
17          rest
18          rest
19          rest
20          rest
21          face
22          face
23          face
24          face
25          face
26          face
27          face
28          face
29          face
1422        rest
1423         cat
1424         cat
1425         cat
1426         cat
1427         cat
1428         cat
1429         cat
1430         cat
1431         cat
1432        rest
1433        rest
1434        rest
1435        rest
1436        rest
1437    scissors
1438    scissors
1439    scissors
1440    scissors
1441    scissors
1442    scissors
1443    scissors
1444    scissors
1445    scissors
1446        rest
1447        rest
1448        rest
1449        rest
1450        rest
1451        rest
Name: labels, Length: 1452, dtype: object Restrict the analysis to cats and faces

As we can see from the targets above, the experiment contains many conditions, not all that interest us for decoding.

To keep only data corresponding to faces or cats, we create a mask of the samples belonging to the condition.

condition_mask = conditions.isin(['face', 'cat'])

# We apply this mask in the sampe direction to restrict the
# classification to the face vs cat discrimination
fmri_masked = fmri_masked[condition_mask]

We now have less samples



(216, 464)

We apply the same mask to the targets

conditions = conditions[condition_mask]


(216,) Decoding with an SVM

We will now use the scikit-learn machine-learning toolbox on the fmri_masked data.

As a decoder, we use a Support Vector Classification, with a linear kernel.

We first create it:

from sklearn.svm import SVC
svc = SVC(kernel='linear')


SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='linear', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)

The svc object is an object that can be fit (or trained) on data with labels, and then predict labels on data without.

We first fit it on the data

svc.fit(fmri_masked, conditions)

We can then predict the labels from the data

prediction = svc.predict(fmri_masked)


['face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat'
 'cat' 'cat' 'cat' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face' 'face'
 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat' 'cat']

Let’s measure the error rate:

print((prediction == conditions).sum() / float(len(conditions)))



This error rate is meaningless. Why? Measuring prediction scores using cross-validation

The proper way to measure error rates or prediction accuracy is via cross-validation: leaving out some data and testing on it. Manually leaving out data

Let’s leave out the 30 last data points during training, and test the prediction on these 30 last points:

svc.fit(fmri_masked[:-30], conditions[:-30])

prediction = svc.predict(fmri_masked[-30:])
print((prediction == conditions[-30:]).sum() / float(len(conditions[-30:])))


0.6666666666666666 Implementing a KFold loop

We can split the data in train and test set repetitively in a KFold strategy:

from sklearn.model_selection import KFold

cv = KFold(n_splits=5)

# The "cv" object's split method can now accept data and create a
# generator which can yield the splits.
for train, test in cv.split(X=fmri_masked):
    conditions_masked = conditions.values[train]
    svc.fit(fmri_masked[train], conditions_masked)
    prediction = svc.predict(fmri_masked[test])
    print((prediction == conditions.values[test]).sum()
           / float(len(conditions.values[test])))


0.7441860465116279 Cross-validation with scikit-learn

Scikit-learn has tools to perform cross-validation easier:

from sklearn.model_selection import cross_val_score
cv_score = cross_val_score(svc, fmri_masked, conditions)


[0.59722222 0.80555556 0.55555556]

Note that we can speed things up to use all the CPUs of our computer with the n_jobs parameter.

The best way to do cross-validation is to respect the structure of the experiment, for instance by leaving out full sessions of acquisition.

The number of the session is stored in the CSV file giving the behavioral data. We have to apply our session mask, to select only cats and faces.

session_label = behavioral['chunks'][condition_mask]

# By default, cross_val_score uses a 3-fold KFold. We can control this by
# passing the "cv" object, here a 5-fold:
cv_score = cross_val_score(svc, fmri_masked, conditions, cv=cv)

# To leave a session out, pass it to the groups parameter of cross_val_score.
from sklearn.model_selection import LeaveOneGroupOut
cv = LeaveOneGroupOut()
cv_score = cross_val_score(svc,


[0.97727273 0.76744186 0.79069767 0.53488372 0.74418605]
[0.55555556 1.         0.66666667 0.66666667 0.77777778 0.72222222
 0.88888889 0.38888889 0.66666667 0.5        0.77777778 0.66666667] Inspecting the model weights

Finally, it may be useful to inspect and display the model weights. Turning the weights into a nifti image

We retrieve the SVC discriminating weights

coef_ = svc.coef_


[[-4.10591439e-02 -8.38913669e-04 -2.34992127e-02 -3.47431514e-02
   3.44100694e-02  2.59043472e-02  2.39052353e-02 -4.95012599e-02
  -3.19443415e-02 -1.65893786e-02  1.87552899e-02 -7.78939157e-03
   1.02022966e-02 -3.23066175e-02  4.77513388e-03  2.20187375e-02
   1.50985089e-02  1.77166050e-03  2.41532650e-02 -3.24624084e-02
   1.30955876e-02 -9.66231411e-02 -7.07582442e-02  1.92832959e-02
   3.57739471e-02 -1.44028906e-02 -9.17654420e-03 -3.23610315e-02
   2.34853293e-02  1.01923043e-01  1.61348815e-02 -7.79778695e-02
   2.49681620e-03 -3.13738572e-02 -2.80103975e-02 -5.16189181e-02
  -6.94717904e-03 -1.00566568e-02 -2.96087048e-02 -2.25467990e-02
  -3.00431368e-02 -3.28028599e-02  2.65750951e-02  1.66043690e-02
  -1.63196266e-02  3.33049535e-02  4.58224672e-04  7.30845484e-03
   2.85276988e-02  1.98812475e-02  2.65800328e-02  1.87685655e-02
  -2.05262911e-02 -9.08269461e-03 -4.53160560e-02  1.59140903e-02
  -1.75433965e-02 -6.90762579e-02  3.89501904e-02 -1.28822089e-01
  -3.50391659e-02  6.48267437e-03  5.01267951e-02 -7.63127896e-02
  -1.70264978e-02  5.40262082e-03  5.61464364e-02 -1.73917761e-02
   7.37187399e-02  1.29712224e-02  3.31590258e-02  6.15625548e-03
   4.10859945e-02 -6.77459982e-03  7.23131333e-03  2.18362878e-02
   3.84148397e-02  2.63664841e-02  1.74112933e-02  1.61460386e-02
   2.82263367e-02 -5.51453962e-03 -1.24329869e-02 -1.65007919e-02
   1.53495049e-03 -3.58689292e-02 -2.45351639e-02 -4.79747313e-03
  -5.32036709e-02  3.68274125e-02 -1.69981985e-02  1.08511387e-02
  -4.80508461e-02 -3.69829093e-02 -4.36149708e-02 -2.27505354e-02
  -1.82647161e-02 -1.29988416e-02  3.45534849e-02  7.32459494e-03
  -5.96338242e-03  5.11828907e-02  1.78048855e-02 -4.10786820e-02
   3.44218340e-02  1.02061855e-02  3.24182697e-02  3.40190648e-02
  -9.72497936e-03  2.21466123e-02  1.51944122e-02 -5.13909282e-02
  -2.25212663e-02 -5.36059085e-02  1.74461316e-02 -1.73912821e-02
  -5.30948812e-02  4.28375575e-02 -1.75974389e-02  2.65186313e-03
  -2.93060690e-02 -2.90011563e-02  1.88280315e-02 -3.77559429e-02
   2.82039100e-03  4.76274597e-02  3.82002135e-02  7.30823353e-03
  -5.78025756e-02  4.88796086e-03  2.08386543e-02  3.10799309e-06
   7.64731722e-05 -5.30726179e-03  8.32036061e-04  6.96334682e-03
  -2.52437504e-02 -1.01739725e-02 -1.62671356e-02  3.42707627e-02
   1.00432285e-02 -2.50394237e-02 -6.19833205e-03 -2.77334894e-02
  -1.03254429e-02 -1.42163226e-02  3.39557786e-03  5.27978096e-02
  -2.05438140e-02 -5.70770065e-02  1.02214240e-02 -5.51708996e-02
  -1.29624200e-02  3.77777072e-03  7.88352021e-03  1.51368300e-02
   5.94512230e-02  3.75684469e-02  4.61933687e-02 -6.28617029e-03
  -2.31685868e-03  4.39664384e-02  6.82138779e-02  1.58698406e-02
   5.48997068e-02  8.31410463e-02 -8.97988526e-03  7.83481005e-03
   2.26074506e-02 -2.60458600e-02  4.45277489e-02 -2.86097190e-02
  -4.99059962e-02  1.71997729e-03  3.65673224e-02  8.15741229e-04
   1.83704939e-02  5.17197092e-02  6.35578528e-03  7.05183282e-04
   1.64023420e-02  4.50576924e-03  3.11797770e-02 -3.37322280e-03
  -1.27150811e-02 -2.60359954e-02  1.47512619e-02  6.84497712e-02
   1.07656559e-02  1.99667580e-02  7.49694140e-03 -3.16838856e-02
   3.10212576e-02 -1.52473819e-04 -3.06834455e-02 -1.12419283e-02
  -2.01262203e-02  1.44459697e-02  1.89684619e-02  1.71060489e-02
  -6.42078682e-02  2.56285714e-03 -6.97908819e-02  4.37415790e-02
  -2.19265925e-02  2.75412496e-02  2.26371659e-02  6.08747457e-03
   7.00948041e-02 -5.47876565e-02  5.00929706e-03 -1.32239805e-02
  -1.67932274e-02 -2.90672109e-02 -4.34679793e-02 -2.32697607e-02
  -1.14648197e-01  1.08002999e-03 -1.34715639e-02 -2.42857355e-02
   4.32411447e-03 -7.01753153e-03 -5.78315906e-02 -1.63359362e-02
  -7.83189241e-02 -6.56549588e-02  1.59472704e-02 -3.93885430e-02
   3.70071579e-02 -8.53304624e-02 -1.46804768e-02 -3.66368426e-02
   6.33237317e-04  3.15900362e-02 -3.74878316e-02 -2.36906438e-02
  -3.48493077e-02 -1.29455374e-02  3.04677148e-02 -5.92433891e-03
  -2.96713551e-02 -1.06306524e-02 -4.51622456e-02 -1.39410348e-02
  -3.72100330e-02  7.48408861e-03 -5.96724188e-02  6.34547444e-02
   3.87715096e-02  3.48643739e-03 -3.29345516e-02 -2.05291736e-02
   3.74104677e-02 -5.28466371e-02 -5.24368180e-02 -1.08615834e-01
  -3.93677233e-02  2.73577934e-02 -1.15112257e-02 -3.11615139e-02
   2.96191093e-02 -1.56250888e-02  8.23424923e-03 -2.21058103e-02
   1.46964377e-02  2.33201228e-02  2.13609673e-02 -8.54915916e-04
   9.25025018e-02  8.45063327e-03  4.39500980e-02 -7.08333209e-02
  -2.79867591e-02  1.02058740e-02 -6.08684855e-02  6.20570031e-03
   3.75086657e-02  9.68286171e-02  4.41023193e-02  4.73388356e-02
   3.91426944e-02 -1.99352505e-02  7.03359502e-03 -2.44020595e-02
  -2.99892130e-02 -3.25777097e-02  2.72322372e-03  7.77638452e-03
   5.01101070e-02  9.72297020e-03  8.82609430e-03  1.97665762e-02
   1.24510717e-03 -2.57096538e-03  3.11640278e-03  2.92457840e-02
   6.42228310e-03 -9.54802550e-03 -4.14943422e-02 -2.03235930e-02
  -4.09358168e-02  6.34522456e-03  3.81027536e-02 -1.24257807e-03
  -1.35866182e-02  3.19962258e-03  4.15927262e-03  2.61804263e-02
   6.27312985e-02 -9.58309235e-03 -3.81101069e-02 -8.98117138e-03
  -4.78051891e-03 -5.94786274e-02  3.94663899e-02  6.07790784e-02
  -8.31262780e-03 -6.94258109e-03  4.99962896e-02  2.73733671e-02
  -5.71262633e-03 -1.84389576e-03  1.76737707e-01  1.14519296e-02
  -3.12318244e-02  3.35009779e-02  5.39812542e-03 -1.24046393e-02
  -3.01223574e-02 -2.34884768e-02 -4.04753625e-02 -3.31808379e-02
  -3.44700460e-02  4.34950149e-03 -1.12128351e-02  2.66120489e-02
  -2.08353307e-02 -5.55524566e-02  9.02084978e-02 -5.42311876e-03
   4.40591115e-02 -3.83539141e-02 -1.54222839e-03 -2.17430721e-02
   2.45192255e-02 -1.80792703e-02  4.16507360e-02  3.39222577e-03
   1.35898003e-02  5.62169808e-02  2.05511201e-02  5.22713085e-02
   5.72247210e-02 -8.37975269e-03  4.06737553e-02  5.04869536e-03
   8.32129801e-02 -3.55796178e-02  5.42543214e-03  3.00522892e-02
  -2.63670321e-02 -2.92756811e-02 -8.30947903e-02 -1.35585937e-02
  -2.90020763e-02  1.64358003e-02 -2.01398046e-02 -1.03980476e-02
  -2.06189132e-03  7.90399394e-02 -3.72163735e-02 -1.83636584e-02
  -2.12257025e-02  2.37873219e-02  7.58760375e-02  3.17983915e-02
   2.44485483e-02 -2.00030934e-03 -6.50080563e-03 -3.43038946e-03
  -9.17293632e-03  2.16091063e-02 -1.86680470e-03 -8.67444519e-02
  -3.76717846e-02  1.93650951e-02  3.42706125e-02 -1.84720711e-02
  -5.55555972e-02 -2.37116507e-02  5.81776873e-02 -8.53892872e-03
  -2.07530777e-02 -8.53067546e-04  2.72667521e-02 -3.51271213e-03
   2.06427064e-02 -8.93358005e-03 -6.59351511e-04 -1.36124161e-02
   3.95871060e-02  2.60889355e-02 -4.33492580e-03 -1.18201950e-02
  -1.91292290e-02 -2.46696785e-02  2.43858751e-02  5.70631381e-03
  -2.25983571e-02  3.35543974e-02  2.40025148e-03  9.97905075e-03
   2.38300137e-02  4.01035787e-02  4.73236294e-02 -3.05304529e-02
  -3.46325886e-02  8.39215710e-04 -4.89761870e-02  6.89462038e-03
  -6.43273366e-02 -6.98259848e-02  5.46324448e-02 -3.85178692e-02
  -1.88327596e-02  2.04875750e-02  2.30507846e-02  2.90462605e-02
  -5.97637507e-03  4.81619008e-02  2.38865302e-02  3.72272632e-03
   4.98869102e-02 -4.44756682e-02  7.97887018e-03 -2.10699182e-02
   5.06694017e-02 -7.28995658e-02 -2.29531446e-02  2.04757321e-02
  -4.30773655e-02 -4.20149354e-02  1.34561591e-02  2.11104965e-02
   1.25247353e-02 -3.91501850e-02 -2.81964084e-02  5.69533437e-03
  -5.09280045e-02  3.13800843e-02 -6.28572542e-03 -5.08252999e-02
   1.54894409e-02  2.86118800e-02  4.85921862e-02 -3.17075185e-03
   2.22448348e-02  1.68500752e-02  2.89099486e-02 -2.41337610e-02
  -1.78519517e-02  6.16670437e-03 -3.18354357e-02  1.94965342e-02]]

It’s a numpy array



(1, 464)

We need to turn it back into a Nifti image, in essence, “inverting” what the NiftiMasker has done.

For this, we can call inverse_transform on the NiftiMasker:

coef_img = masker.inverse_transform(coef_)


<class 'nibabel.nifti1.Nifti1Image'>
data shape (40, 64, 64, 1)
[[  -3.5      0.       0.      68.25 ]
 [   0.       3.75     0.    -118.125]
 [   0.       0.       3.75  -118.125]
 [   0.       0.       0.       1.   ]]
<class 'nibabel.nifti1.Nifti1Header'> object, endian='<'
sizeof_hdr      : 348
data_type       : b''
db_name         : b''
extents         : 0
session_error   : 0
regular         : b''
dim_info        : 0
dim             : [ 4 40 64 64  1  1  1  1]
intent_p1       : 0.0
intent_p2       : 0.0
intent_p3       : 0.0
intent_code     : none
datatype        : float64
bitpix          : 64
slice_start     : 0
pixdim          : [-1.    3.5   3.75  3.75  1.    1.    1.    1.  ]
vox_offset      : 0.0
scl_slope       : nan
scl_inter       : nan
slice_end       : 0
slice_code      : unknown
xyzt_units      : 0
cal_max         : 0.0
cal_min         : 0.0
slice_duration  : 0.0
toffset         : 0.0
glmax           : 0
glmin           : 0
descrip         : b''
aux_file        : b''
qform_code      : unknown
sform_code      : aligned
quatern_b       : 0.0
quatern_c       : 1.0
quatern_d       : 0.0
qoffset_x       : 68.25
qoffset_y       : -118.125
qoffset_z       : -118.125
srow_x          : [-3.5   0.    0.   68.25]
srow_y          : [   0.       3.75     0.    -118.125]
srow_z          : [   0.       0.       3.75  -118.125]
intent_name     : b''
magic           : b'n+1'

coef_img is now a NiftiImage.

We can save the coefficients as a nii.gz file:

coef_img.to_filename('haxby_svc_weights.nii.gz') Plotting the SVM weights

We can plot the weights, using the subject’s anatomical as a background

from nilearn.plotting import plot_stat_map, show

plot_stat_map(coef_img, bg_img=haxby_dataset.anat[0],
              title="SVM weights", display_mode="yx")