5.6. Running scikit-learn functions for more control over the analysis¶
This section gives pointers for designing your own decoding pipelines with scikit-learn. It builds on the didactic introduction to decoding.
Note
This documentation gives links and additional definitions needed to work correctly with scikit-learn. For a full code example, please check out: Advanced decoding using scikit learn
5.6.1. Performing decoding with scikit-learn¶
5.6.1.1. Using scikit-learn estimators¶
You can easily import estimators from the scikit-learn machine-learning library: those available in the Decoder object and many others. They all expose fit and predict methods.
For example, you can directly import the versatile Support Vector Classifier (or SVC).
To learn more about the variety of classifiers available in scikit-learn, see the scikit-learn documentation on supervised learning.
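As an illustration, here is a minimal sketch of the scikit-learn estimator interface; the arrays X_train, y_train, and X_test are hypothetical placeholders for your own data:

```python
from sklearn.svm import SVC

# A support vector classifier; a linear kernel is a common choice for fMRI data
svc = SVC(kernel="linear")

# All scikit-learn estimators share this interface
svc.fit(X_train, y_train)          # learn from training data
predictions = svc.predict(X_test)  # predict labels for unseen data
```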
5.6.1.2. Cross-validation with scikit-learn¶
To perform cross-validation using a scikit-learn estimator, you should first mask the data using a nilearn.maskers.NiftiMasker: it extracts only the voxels inside the mask of interest and transforms the 4D input fMRI data into 2D arrays (of shape (n_timepoints, n_voxels)) that estimators can work on.
Note
This example shows how to use masking: Simple example of NiftiMasker use
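Schematically, the masking step looks like this; mask_img and fmri_img are hypothetical placeholders for your own mask and 4D fMRI images:

```python
from nilearn.maskers import NiftiMasker

# Keep only in-mask voxels and reshape the 4D images
# into a 2D array of shape (n_timepoints, n_voxels)
masker = NiftiMasker(mask_img=mask_img, standardize=True)
X = masker.fit_transform(fmri_img)
```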
Then use sklearn.model_selection.cross_val_score, a function that computes the score of your model for each fold of the cross-validation.
You can change many parameters of the cross-validation here (see the sketch after this list), for example:

- use a different cross-validation scheme, for example sklearn.model_selection.LeaveOneGroupOut;
- speed up the computation with n_jobs=-1, which spreads the computation equally across all processors;
- use a different scoring function, passed as a keyword (such as scoring="roc_auc") or imported from scikit-learn.
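Putting these together, here is a minimal sketch, assuming X and y are the masked data and labels from the previous step, the problem is binary classification, and groups is a hypothetical array giving the session of each timepoint:

```python
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

svc = SVC(kernel="linear")

# Leave one group (e.g. one fMRI session) out at each fold
scores = cross_val_score(
    svc, X, y,
    groups=groups,
    cv=LeaveOneGroupOut(),
    scoring="roc_auc",  # area under the ROC curve instead of accuracy
    n_jobs=-1,          # spread the folds across all processors
)
```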
See also
If you need more than just the cross-validation scores (i.e. the predictions or the models for each fold), or if you want to learn more about the various cross-validation schemes, see here.
5.6.1.3. Measuring the chance level¶
Dummy estimators: The simplest way to measure prediction performance at chance level is to use a “dummy” classifier: sklearn.dummy.DummyClassifier.

Permutation testing: A more controlled, but slower, way is to do permutation testing on the labels, with sklearn.model_selection.permutation_test_score.
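Both approaches can be sketched as follows, again assuming X and y come from the masking step above; the number of permutations is an arbitrary illustration:

```python
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score, permutation_test_score
from sklearn.svm import SVC

# Chance-level scores from a classifier that ignores the input data
dummy_scores = cross_val_score(DummyClassifier(strategy="stratified"), X, y, cv=5)

# Permutation testing: refit the real model on shuffled labels
# and compare the true score to the permutation distribution
svc = SVC(kernel="linear")
score, perm_scores, pvalue = permutation_test_score(
    svc, X, y, cv=5, n_permutations=100, n_jobs=-1
)
```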
5.6.2. Going further with scikit-learn¶
We have seen a very simple analysis with scikit-learn, but you can easily add intermediate processing steps if your analysis requires it. Some common examples are:

- adding a feature selection step using scikit-learn pipelines;
- using any model available in scikit-learn (or compatible with it) at any step;
- adding more intermediate steps, such as clustering.
5.6.2.1. Decoding without a mask: Anova-SVM using scikit-learn¶
We can also implement feature selection before decoding as a scikit-learn pipeline (sklearn.pipeline.Pipeline). For this, we need to import the sklearn.feature_selection module and use sklearn.feature_selection.f_classif, a simple F-score-based feature selection (a.k.a. Anova).
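A minimal sketch of such a pipeline could look like this; the number of voxels kept (k=500) is an arbitrary choice for illustration, and X_train, y_train, and X_test are hypothetical placeholders:

```python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Keep the 500 voxels with the highest F-scores, then classify with a linear SVM
anova_svm = Pipeline([
    ("anova", SelectKBest(f_classif, k=500)),
    ("svm", SVC(kernel="linear")),
])

# The pipeline behaves like any other estimator and can be cross-validated as above
anova_svm.fit(X_train, y_train)
predictions = anova_svm.predict(X_test)
```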
5.6.2.2. Using any other model in the pipeline¶
Anova-SVM is a good baseline that will give reasonable results in common settings. However, it may be interesting for you to explore the wide variety of supervised learning algorithms in scikit-learn. These can readily replace the SVM in your pipeline and might be better suited to some use cases, as discussed in the previous section.
The feature selection step can also be tuned. For example, we could use a more sophisticated scheme, such as Recursive Feature Elimination (RFE), or add a clustering step before feature selection (see the sketch below). This always amounts to creating a pipeline that links those steps together and applying a sensible cross-validation scheme to it; scikit-learn usually takes care of the rest for us.
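As an illustration, here is one way such an RFE-based pipeline might look; the number of selected features and the elimination step size are arbitrary assumptions:

```python
from sklearn.feature_selection import RFE
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Recursive feature elimination: repeatedly fit a linear SVM and drop
# the least informative voxels (25% at a time) until 500 remain
rfe_svm = Pipeline([
    ("rfe", RFE(SVC(kernel="linear"), n_features_to_select=500, step=0.25)),
    ("svm", SVC(kernel="linear")),
])
```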
See also
The corresponding full code example for practicing with pipelines: Advanced decoding using scikit learn
The scikit-learn documentation with detailed explanations on a large variety of estimators and machine learning techniques. To become better at decoding, you need to study it.
5.6.3. Setting estimator parameters¶
Most estimators have parameters that can be set to optimize their performance. Importantly, this must be done via nested cross-validation.
Indeed, there is noise in the cross-validation score, and when we vary the parameter, the curve showing the score as a function of the parameter will have bumps and peaks due to this noise. These will not generalize to new data and chances are that the corresponding choice of parameter will not perform as well on new data.
With scikit-learn, nested cross-validation is done via sklearn.model_selection.GridSearchCV. It is unfortunately time-consuming, but the n_jobs argument can spread the load over multiple CPUs.
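Schematically, the nested scheme can be sketched as follows; the grid of C values is an arbitrary illustration, and X and y are again assumed to come from the masking step:

```python
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

# Inner loop: choose the SVM regularization parameter C by grid search
grid = GridSearchCV(
    SVC(kernel="linear"),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=5,
    n_jobs=-1,  # parallelize the grid search over CPUs
)

# Outer loop: unbiased estimate of the tuned model's performance
nested_scores = cross_val_score(grid, X, y, cv=5)
```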