Note

This page is reference documentation. It explains only the function signature, not how to use the function. Please refer to the user guide for the big picture.

nilearn.datasets.fetch_neurovault

nilearn.datasets.fetch_neurovault(max_images=100, collection_terms=None, collection_filter=<function _empty_filter>, image_terms=None, image_filter=<function _empty_filter>, mode='download_new', data_dir=None, fetch_neurosynth_words=False, resample=False, vectorize_words=True, verbose=3, **kwarg_image_filters)

Download data from neurovault.org that match certain criteria.

Any downloaded data is saved on the local disk and subsequent calls to this function will first look for the data locally before querying the server for more if necessary.

We explore the metadata for Neurovault collections and images, keeping those that match a certain set of criteria, until we have skimmed through the whole database or until an (optional) maximum number of images to fetch has been reached.

For more information, see Gorgolewski et al.[1], and Yarkoni et al.[2].

Parameters:
max_images : int, default=100

Maximum number of images to fetch.

collection_terms : dict, default=None

Key, value pairs used to filter collection metadata. Collections for which collection_metadata['key'] == value is not True for every key, value pair will be discarded. See the documentation for basic_collection_terms for a description of the default selection criteria. If None is passed, defaults to basic_collection_terms().
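The matching rule described above can be sketched in plain Python. This is an illustration of the stated rule only, not nilearn's implementation; the helper name matches_terms and the sample metadata are hypothetical.

```python
def matches_terms(metadata, terms):
    """Return True if metadata[key] == value holds for every (key, value) pair."""
    # A missing key is read as None, so it cannot equal a non-None value.
    return all(value == metadata.get(key) for key, value in terms.items())


# Hypothetical collection metadata, for illustration only.
collections = [
    {"id": 35, "owner": 1},
    {"id": 40, "owner": 2},
]

# Keep only the collections whose metadata satisfies every term.
kept = [c["id"] for c in collections if matches_terms(c, {"owner": 2})]
print(kept)
```

An empty dict matches every collection, which is why passing collection_terms={} disables term-based filtering.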

collection_filter : Callable, default=empty_filter

Collections for which collection_filter(collection_metadata) is False will be discarded.

image_terms : dict, default=None

Key, value pairs used to filter image metadata. Images for which image_metadata['key'] == value is not True for every key, value pair will be discarded. See documentation for basic_image_terms for a description of the default selection criteria. Will default to basic_image_terms() if None is passed.

image_filter : Callable, default=empty_filter

Images for which image_filter(image_metadata) is False will be discarded.
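A filter callable receives one metadata dictionary and returns True to keep the item. The sketch below shows one possible image filter applied to hand-written metadata; the function name, the number_of_subjects field, and the threshold are assumptions made for illustration.

```python
def has_many_subjects(image_metadata, minimum=20):
    """Hypothetical image filter: keep images reporting at least `minimum` subjects."""
    n_subjects = image_metadata.get("number_of_subjects")
    # Metadata fields may be absent or None, so guard before comparing.
    return n_subjects is not None and n_subjects >= minimum


# Hypothetical image metadata, for illustration only.
images_meta = [
    {"id": 1, "number_of_subjects": 45},
    {"id": 2, "number_of_subjects": None},
    {"id": 3},
    {"id": 4, "number_of_subjects": 8},
]

kept_ids = [meta["id"] for meta in images_meta if has_many_subjects(meta)]
print(kept_ids)
```

A function of this shape could then be passed as image_filter=has_many_subjects.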

mode : {‘download_new’, ‘overwrite’, ‘offline’}

When to fetch an image from the server rather than the local disk.

  • ‘download_new’ (the default) means download only files that are not already on disk (regardless of modify date).

  • ‘overwrite’ means ignore files on disk and overwrite them.

  • ‘offline’ means load only data from disk; don’t query server.
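The three modes amount to a small decision rule, sketched here in plain Python. The helper should_download is hypothetical and only restates the list above; it is not part of nilearn.

```python
def should_download(mode, exists_on_disk):
    """Decide whether a file is requested from the server under a given mode."""
    if mode == "overwrite":
        return True  # ignore whatever is on disk
    if mode == "offline":
        return False  # never query the server
    if mode == "download_new":
        return not exists_on_disk  # the modification date is not considered
    raise ValueError(f"unknown mode: {mode!r}")


for mode in ("download_new", "overwrite", "offline"):
    print(mode, should_download(mode, exists_on_disk=True))
```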

data_dir : str, optional

The directory we want to use for nilearn data. A subdirectory named “neurovault” will contain Neurovault data.

fetch_neurosynth_words : bool, default=False

Whether to collect words from Neurosynth.

vectorize_words : bool, default=True

If Neurosynth words are downloaded, create a matrix of word counts and add it to the result. Also add a vocabulary list to the result. See sklearn.feature_extraction.text.CountVectorizer for more info.

resample : bool, default=False

Resamples downloaded images to a 3x3x3 grid before saving them, to save disk space.

interpolation : str, default=’continuous’

Can be ‘continuous’, ‘linear’, or ‘nearest’. Indicates the resample method. Argument passed to nilearn.image.resample_img.

verbose : int, default=3

An integer in [0, 1, 2, 3] to control the verbosity level.

kwarg_image_filters

Keyword arguments are understood to be filter terms for images, so for example map_type='Z map' means only download Z-maps; collection_id=35 means download images from collection 35 only.
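One way to picture the keyword filters is as extra entries added to the image terms. The sketch below assumes that a keyword filter replaces a same-named entry in image_terms; that precedence, the helper merge_image_terms, and the placeholder term is_thresholded are assumptions, not statements about nilearn's internals.

```python
def merge_image_terms(image_terms, **kwarg_image_filters):
    """Combine explicit image terms with keyword filters into one dict."""
    # Assumption: a keyword filter overrides a same-named entry in image_terms.
    return dict(image_terms, **kwarg_image_filters)


# "is_thresholded" is a placeholder term used only for this illustration.
terms = merge_image_terms(
    {"is_thresholded": False}, map_type="Z map", collection_id=35
)
print(terms)
```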

Returns:
Bunch

A dict-like object which exposes its items as attributes. It contains:

  • ‘images’, the paths to downloaded files.

  • ‘images_meta’, the metadata for the images in a list of dictionaries.

  • ‘collections_meta’, the metadata for the collections.

  • ‘description’, a short description of the Neurovault dataset.

If fetch_neurosynth_words and vectorize_words were set, it also contains:

  • ‘vocabulary’, a list of words

  • ‘word_frequencies’, the weight of the words returned by neurosynth.org for each image, such that the weight of word vocabulary[j] for the image found in images[i] is word_frequencies[i, j]
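The relation between vocabulary and word_frequencies can be illustrated with a small hand-written example. The data below are invented, and nested lists stand in for the matrix, so an entry is read as word_frequencies[i][j] rather than word_frequencies[i, j].

```python
def top_words(vocabulary, weights, n=3):
    """Return the n (word, weight) pairs with the largest weights for one image."""
    ranked = sorted(zip(vocabulary, weights), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


# Invented values: one row per image, one column per vocabulary word.
vocabulary = ["auditory", "language", "motor", "visual"]
word_frequencies = [
    [0.0, 0.1, 0.7, 0.2],  # weights for images[0]
    [0.6, 0.3, 0.0, 0.1],  # weights for images[1]
]

# Weight of the word vocabulary[2] for the image images[0].
print(vocabulary[2], word_frequencies[0][2])
print(top_words(vocabulary, word_frequencies[1], n=2))
```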

See also

nilearn.datasets.fetch_neurovault_ids

Fetch collections and images from Neurovault by explicitly specifying their ids.

Notes

Images and collections from disk are fetched before remote data.

Some helpers are provided in the neurovault module to express filtering criteria more concisely:

ResultFilter, IsNull, NotNull, NotEqual, GreaterOrEqual, GreaterThan, LessOrEqual, LessThan, IsIn, NotIn, Contains, NotContains, Pattern.
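Helpers of this kind can stand in for a plain value in a terms dictionary because the terms are compared with ==. The sketch below shows the general idea with a hypothetical class, AtLeast, that overloads equality; it is a simplified stand-in written for this illustration and not the code of GreaterOrEqual or any other nilearn helper.

```python
class AtLeast:
    """Hypothetical criterion that compares equal to any value >= bound."""

    def __init__(self, bound):
        self.bound = bound

    def __eq__(self, other):
        # Missing (None) values never satisfy the criterion.
        return other is not None and other >= self.bound

    def __repr__(self):
        return f"AtLeast({self.bound!r})"


def matches_terms(metadata, terms):
    # The criterion is on the left so that its __eq__ is the one evaluated.
    return all(value == metadata.get(key) for key, value in terms.items())


# Invented metadata, for illustration only.
images_meta = [
    {"id": 1, "number_of_subjects": 45},
    {"id": 2, "number_of_subjects": 8},
    {"id": 3},
]
terms = {"number_of_subjects": AtLeast(20)}
kept = [meta["id"] for meta in images_meta if matches_terms(meta, terms)]
print(kept)
```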

If you pass a single value to match against the collection id (whether as the ‘id’ field of the collection metadata or as the ‘collection_id’ field of the image metadata), the server is directly queried for that collection, so fetch_neurovault(collection_id=40) is as efficient as fetch_neurovault_ids(collection_ids=[40]) (but in the former version the other filters will still be applied). This is not true for the image ids.

If you pass a single value to match against any of the fields listed in _COL_FILTERS_AVAILABLE_ON_SERVER, i.e., ‘DOI’, ‘name’, and ‘owner’, the server applies these filters itself, which limits the amount of metadata to download and makes fetching faster.

In download_new mode, if a file exists on disk, it is not downloaded again, even if the version on the server is newer. Use overwrite mode to force a new download (you can filter on the field modify_date to re-download the files that are newer on the server - see Examples section).

The function tries to fetch max_images images; it stops early if all the images matching the filters have been fetched or if too many downloads fail in a row.
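The stopping rule can be sketched as a simple loop. Everything in this sketch is hypothetical: the helper fetch_until, the download callable, the use of OSError to signal a failed download, and the failure threshold of 3 are illustrative choices, not nilearn's actual values or code.

```python
def fetch_until(candidates, download, max_images=None, max_consecutive_fails=3):
    """Collect downloads until a limit is reached or failures pile up."""
    fetched = []
    fails_in_a_row = 0
    for candidate in candidates:
        if max_images is not None and len(fetched) >= max_images:
            break  # the requested number of images has been reached
        try:
            fetched.append(download(candidate))
            fails_in_a_row = 0  # a success resets the failure counter
        except OSError:
            fails_in_a_row += 1
            if fails_in_a_row >= max_consecutive_fails:
                break  # too many failures in a row
    return fetched


def flaky_download(image_id):
    # Simulated failures for ids 2, 3 and 4.
    if image_id in (2, 3, 4):
        raise OSError("simulated download failure")
    return image_id


print(fetch_until(range(10), flaky_download))
print(fetch_until(range(10), lambda image_id: image_id, max_images=2))
```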

References

Examples

To download all the collections and images from Neurovault:

from nilearn.datasets import fetch_neurovault
fetch_neurovault(max_images=None, collection_terms={}, image_terms={})

To further limit the default selection to collections which specify a DOI (which reference a published paper, as they may be more likely to contain good images):

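from nilearn.datasets import fetch_neurovault
from nilearn.datasets.neurovault import NotNull, basic_collection_terms

fetch_neurovault(
    max_images=None,
    collection_terms=dict(basic_collection_terms(), DOI=NotNull()),
)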

To update all the images (matching the default filters):

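from nilearn.datasets import fetch_neurovault
from nilearn.datasets.neurovault import GreaterThan

# `newest` is not defined here: set it to the latest modify_date already on disk.
fetch_neurovault(
    max_images=None, mode="overwrite", modify_date=GreaterThan(newest)
)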
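The value newest in the last example is left undefined. One way to obtain it is to take the latest modify_date among the image metadata already fetched, as sketched below. The sketch assumes that modify_date values are ISO 8601 strings in a single consistent format, so that string order matches chronological order; the helper name and the sample dates are invented.

```python
def newest_modify_date(images_meta):
    """Return the latest modify_date found in a list of image metadata dicts."""
    dates = [meta["modify_date"] for meta in images_meta if meta.get("modify_date")]
    # Assumption: same-format ISO 8601 strings sort chronologically.
    return max(dates) if dates else None


# Invented metadata, for illustration only.
images_meta = [
    {"id": 1, "modify_date": "2016-01-21T17:22:00.123456Z"},
    {"id": 2, "modify_date": "2017-03-02T09:15:00.000000Z"},
    {"id": 3, "modify_date": None},
]

newest = newest_modify_date(images_meta)
print(newest)
```

In practice images_meta could come from a previous call with mode="offline", which reads only what is already on disk.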

Examples using nilearn.datasets.fetch_neurovault

NeuroVault cross-study ICA maps