Note
This page is reference documentation. It explains only the class signature, not how to use the class. Please refer to the user guide for the big picture.
8.3.3. nilearn.decoding.FREMClassifier
class nilearn.decoding.FREMClassifier(estimator='svc', mask=None, cv=30, param_grid=None, clustering_percentile=10, screening_percentile=20, scoring='roc_auc', smoothing_fwhm=None, standardize=True, target_affine=None, target_shape=None, mask_strategy='background', low_pass=None, high_pass=None, t_r=None, memory=None, memory_level=0, n_jobs=1, verbose=0)
State of the art decoding scheme applied to usual classifiers.
FREM applies an implicit spatial regularization through fast clustering and aggregates a large number of estimators trained on various splits of the training set. The result is a very robust decoder at a lower computational cost than other spatially regularized methods [1].
Parameters: estimator : str, optional
The estimator to choose among: ‘svc’, ‘svc_l2’, ‘svc_l1’, ‘logistic’, ‘logistic_l1’, ‘logistic_l2’ and ‘ridge_classifier’. Note that ‘svc’ and ‘svc_l2’; ‘logistic’ and ‘logistic_l2’ correspond to the same estimator. Default ‘svc’.
mask : filename, Nifti1Image, NiftiMasker, or MultiNiftiMasker, optional
Mask to be used on data. If a masker instance is passed, its mask and parameters are used. If no mask is given, a mask is computed automatically from the provided images by a built-in masker with default parameters. Refer to NiftiMasker or MultiNiftiMasker for the default parameters. Default: None.
cv : int or cross-validation generator, optional (default 30)
If int, number of stratified shuffled splits returned, which is usually the right way to train many different classifiers. A good trade-off between stability of the aggregated model and computation time is 50 splits. Shuffled splits are seeded by default for reproducibility. Can also be a cross-validation generator.
param_grid : dict of str to sequence, or sequence of such. Default None
The parameter grid to explore, as a dictionary mapping estimator parameters to sequences of allowed values.
None or an empty dict signifies default parameters.
A sequence of dicts signifies a sequence of grids to search, and is useful to avoid exploring parameter combinations that make no sense or have no effect. See scikit-learn documentation for more information, for example: https://scikit-learn.org/stable/modules/grid_search.html
clustering_percentile : int, float, optional, in closed interval [0, 100]
Used to perform a fast ReNA clustering on input data as a first step of fit. It agglomerates similar features together to reduce their number down to this percentile. ReNA is typically efficient for clustering_percentile equal to 10. Default: 10.
screening_percentile : int, float, optional, in closed interval [0, 100]
The percentage of brain volume that will be kept with respect to a full MNI template. In particular, if it is lower than 100, a univariate feature selection based on the Anova F-value for the input data will be performed. A float is interpreted as the percentile of highest-scoring features to keep. Default: 20.
scoring : str, callable or None, optional. Default: ‘roc_auc’
The scoring strategy to use. See the scikit-learn documentation at https://scikit-learn.org/stable/modules/model_evaluation.html#the-scoring-parameter-defining-model-evaluation-rules
If callable, it takes as arguments the fitted estimator, the test data (X_test) and the test target (y_test) if y is not None, e.g. scorer(estimator, X_test, y_test).
For classification, valid entries are: ‘accuracy’, ‘f1’, ‘precision’, ‘recall’ or ‘roc_auc’.
smoothing_fwhm : float, optional. Default: None
If smoothing_fwhm is not None, it gives the size in millimeters of the spatial smoothing to apply to the signal.
standardize : bool, optional. Default: True
If standardize is True, the time-series are centered and normed: their variance is put to 1 in the time dimension.
target_affine : 3x3 or 4x4 matrix, optional. Default: None
This parameter is passed to image.resample_img. Please see the related documentation for details.
target_shape : 3-tuple of int, optional. Default: None
This parameter is passed to image.resample_img. Please see the related documentation for details.
low_pass : None or float, optional
This parameter is passed to signal.clean. Please see the related documentation for details.
high_pass : None or float, optional
This parameter is passed to signal.clean. Please see the related documentation for details.
t_r : float, optional. Default: None
This parameter is passed to signal.clean. Please see the related documentation for details.
mask_strategy : {‘background’ or ‘epi’}, optional. Default: ‘background’
The strategy used to compute the mask: use ‘background’ if your images present a clear homogeneous background, and ‘epi’ if they are raw EPI images. Depending on this value, the mask will be computed from masking.compute_background_mask or masking.compute_epi_mask.
This parameter will be ignored if a mask image is provided.
memory : instance of joblib.Memory or str
Used to cache the masking process. By default, no caching is done. If a str is given, it is the path to the caching directory.
memory_level : int, optional. Default: 0
Rough estimator of the amount of memory used by caching. Higher value means more memory for caching.
n_jobs : int, optional. Default: 1.
The number of CPUs to use to do the computation. -1 means ‘all CPUs’.
verbose : int, optional. Default: 0.
Verbosity level.
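The two accepted shapes of param_grid (a single dict, or a sequence of dicts) can be illustrated with a minimal standard-library sketch. It only mimics how a grid expands into candidate parameter settings; it is not nilearn's or scikit-learn's implementation. The helper name `expand_grid` and the parameter names `C` and `penalty` are hypothetical and chosen for illustration only.

```python
from itertools import product


def expand_grid(param_grid):
    """Return every parameter combination described by param_grid.

    Hypothetical helper for illustration; not part of nilearn.
    """
    # None or an empty dict means "use the estimator defaults".
    if not param_grid:
        return [{}]
    # A single dict is treated as a one-element sequence of grids.
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    combos = []
    for grid in grids:
        names = sorted(grid)
        # Cartesian product of the allowed values within one grid.
        for values in product(*(grid[name] for name in names)):
            combos.append(dict(zip(names, values)))
    return combos


# One grid: three candidate settings (parameter name is an assumption).
single = {"C": [0.1, 1.0, 10.0]}

# A sequence of grids: combinations are built per grid, never across grids,
# so only the pairings that make sense are explored.
several = [
    {"C": [0.1, 1.0], "penalty": ["l2"]},
    {"C": [1.0], "penalty": ["l1"]},
]

print(expand_grid(single))
print(expand_grid(several))
```

With a sequence of grids, the total number of candidates is the sum of the sizes of the individual grids rather than the product over all values.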
See also
nilearn.decoding.Decoder
- Classification strategies for Neuroimaging,
nilearn.decoding.FREMRegressor
- State of the art regression pipeline for Neuroimaging
References
[1] A. Hoyos-Idrobo, G. Varoquaux, J. Kahn and B. Thirion, “FReM – scalable and stable decoding with fast regularized ensemble of models”, NeuroImage, Elsevier, 2017, pp. 1-16, 11 October 2017. https://hal.archives-ouvertes.fr/hal-01615015/
__init__(estimator='svc', mask=None, cv=30, param_grid=None, clustering_percentile=10, screening_percentile=20, scoring='roc_auc', smoothing_fwhm=None, standardize=True, target_affine=None, target_shape=None, mask_strategy='background', low_pass=None, high_pass=None, t_r=None, memory=None, memory_level=0, n_jobs=1, verbose=0)
Initialize self. See help(type(self)) for accurate signature.
decision_function(X)
Compute confidence scores for samples in X.
Parameters: X: list of Niimg-like objects
See http://nilearn.github.io/manipulating_images/input_output.html Data on which prediction is to be made. If this is a list, the affine is considered the same for all.
Returns: ndarray, shape (n_samples,) if n_classes == 2 else (n_samples, n_classes)
Confidence scores per (sample, class) combination. In the binary case, confidence score for self.classes_[1], where >0 means this class would be predicted.
fit(X, y, groups=None)
Fit the decoder (learner).
Parameters: X: list of Niimg-like objects
See http://nilearn.github.io/manipulating_images/input_output.html Data on which the model is to be fitted. If this is a list, the affine is considered the same for all.
y: numpy.ndarray of shape=(n_samples) or list of length n_samples
The dependent variable (age, sex, IQ, yes/no, etc.). Target variable to predict. Must have exactly as many elements as there are 3D images in X.
groups: None
Group labels for the samples used while splitting the dataset into train/test set. Default None.
Note that this parameter must be specified in some scikit-learn cross-validation generators to calculate the number of splits, e.g. sklearn.model_selection.LeaveOneGroupOut or sklearn.model_selection.LeavePGroupsOut.
For more details see https://scikit-learn.org/stable/modules/cross_validation.html#cross-validation-iterators-for-grouped-data
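Why some cross-validation generators need groups to determine the number of splits can be seen in a minimal sketch of the leave-one-group-out idea, written with the standard library only. This is an illustration of the principle, not scikit-learn's LeaveOneGroupOut implementation; the helper name and the subject labels are hypothetical.

```python
def leave_one_group_out(groups):
    """Return one (train_indices, test_indices) pair per distinct group.

    Hypothetical helper for illustration; not part of nilearn or scikit-learn.
    """
    splits = []
    for held_out in sorted(set(groups)):
        # All samples of the held-out group form the test set.
        test = [i for i, g in enumerate(groups) if g == held_out]
        # Every other sample is used for training.
        train = [i for i, g in enumerate(groups) if g != held_out]
        splits.append((train, test))
    return splits


# Five samples from three (made-up) subjects.
groups = ["sub-01", "sub-01", "sub-02", "sub-02", "sub-03"]
splits = leave_one_group_out(groups)

for train, test in splits:
    print("train:", train, "test:", test)
```

The number of splits equals the number of distinct group labels, which is why it cannot be computed without groups.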
Attributes
`masker_` (instance of NiftiMasker or MultiNiftiMasker)
The NiftiMasker used to mask the data.
`mask_img_` (Nifti1Image)
Mask computed by the masker object.
`classes_` (numpy.ndarray)
Classes to predict. For classification only.
`screening_percentile_` (float)
Screening percentile corrected according to the volume of the mask, relative to the volume of a standard brain.
`coef_` (numpy.ndarray, shape=(n_classes, n_features))
Contains the mean of the models' weight vectors across folds for each class.
`coef_img_` (dict of Nifti1Image)
Dictionary containing coef_ with class names as keys, and coef_ transformed into Nifti1Images as values. In the case of a regression, it contains a single Nifti1Image at the key ‘beta’.
`intercept_` (ndarray, shape (n_classes,))
Intercept (a.k.a. bias) added to the decision function.
`cv_` (list of pairs of lists)
List of the (n_folds,) folds. For the corresponding fold, each pair is composed of two lists of indices, one for the train samples and one for the test samples.
`std_coef_` (numpy.ndarray, shape=(n_classes, n_features))
Contains the standard deviation of the models' weight vectors across folds for each class. Note that folds are not independent, see https://scikit-learn.org/stable/modules/cross_validation.html#cross-validation-iterators-for-grouped-data
`std_coef_img_` (dict of Nifti1Image)
Dictionary containing std_coef_ with class names as keys, and std_coef_ transformed into Nifti1Images as values. In the case of a regression, it contains a single Nifti1Image at the key ‘beta’.
`cv_params_` (dict of lists)
Best point in the parameter grid for each tested fold in the inner cross-validation loop.
`cv_scores_` (dict, (classes, n_folds))
Scores (misclassification) for each parameter, and on each fold.
get_params(deep=True)
Get parameters for this estimator.
Parameters: deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params : mapping of string to any
Parameter names mapped to their values.
predict(X)
Predict a label for all X vectors indexed by the first axis.
Parameters: X: {array-like, sparse matrix}, shape = (n_samples, n_features)
Samples.
Returns: y_pred: ndarray, shape (n_samples,)
Predicted class label per sample.
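The relation between confidence scores and predicted labels can be sketched in plain Python. The binary rule (a score above 0 selects classes_[1]) is the one stated in this reference; picking the highest-scoring class in the multiclass case is an assumption based on the usual convention for linear classifiers. The helper name and the class labels are hypothetical.

```python
def labels_from_scores(scores, classes):
    """Map confidence scores to class labels.

    scores: list of floats (binary case) or list of per-class score lists.
    classes: class labels, ordered as in classes_.
    Hypothetical helper for illustration; not part of nilearn.
    """
    labels = []
    for s in scores:
        if isinstance(s, (int, float)):
            # Binary case: a positive score selects classes[1].
            labels.append(classes[1] if s > 0 else classes[0])
        else:
            # Multiclass case (assumed convention): highest score wins.
            best = max(range(len(s)), key=lambda i: s[i])
            labels.append(classes[best])
    return labels


binary = labels_from_scores([-0.5, 1.2], ["face", "house"])
multi = labels_from_scores(
    [[0.1, 0.9, -0.3], [2.0, 0.0, 1.0]], ["a", "b", "c"]
)
print(binary, multi)
```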
score(X, y, sample_weight=None)
Return the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.
Parameters: X : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for X.
sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
Returns: score : float
R^2 of self.predict(X) wrt. y.
Notes
The R2 score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with the default value of r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
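The R^2 definition given for score, (1 - u/v), can be checked by hand with a short standard-library sketch. It is a direct transcription of the formula for a single output without sample weights, not scikit-learn's r2_score; the function name is hypothetical.

```python
def r2_from_definition(y_true, y_pred):
    """Compute R^2 = 1 - u/v for one output, without sample weights.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    mean_true = sum(y_true) / len(y_true)
    # u: residual sum of squares.
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares around the mean of y_true.
    v = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - u / v


y_true = [1.0, 2.0, 3.0]

perfect = r2_from_definition(y_true, [1.0, 2.0, 3.0])   # u = 0
constant = r2_from_definition(y_true, [2.0, 2.0, 2.0])  # always predicts the mean
reversed_ = r2_from_definition(y_true, [3.0, 2.0, 1.0])  # worse than the mean

print(perfect, constant, reversed_)
```

The three cases correspond to the statements above: a perfect prediction scores 1.0, a constant prediction of the mean scores 0.0, and a model worse than the constant one gets a negative score.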
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Parameters: **params : dict
Estimator parameters.
Returns: self : object
Estimator instance.
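The <component>__<parameter> naming scheme can be illustrated by a small sketch that separates an estimator's own parameters from those addressed to a nested component. This only shows the naming convention; it is not scikit-learn's set_params code. The helper name and the component name `clf` are hypothetical.

```python
def split_nested_params(params):
    """Separate top-level parameters from <component>__<parameter> ones.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params({"cv": 10, "clf__C": 1.0, "clf__tol": 1e-3})
print(own)     # parameters of the outer object
print(nested)  # parameters grouped by nested component
```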