API reference

Detailed information about toolbox functions and classes

last modified

2023–12–7

This section documents the public API (application programming interface) of functions and classes provided by the CvCrossManova toolbox. The toolbox contains further helper functions which may also be useful, and they are documented with standard Matlab help comments, but they are not part of the public interface and are not described here.

General notes

Programming interfaces

The CvCrossManova toolbox has two interfaces to perform analyses:

A simplified interface to analyse fMRI data which have been modeled in SPM is provided by the functions ccmRegion for region-of-interest analyses and ccmSearchlight for searchlight analyses. Both functions take a parameter analyses which is a cell array of Analysis objects specifying analyses.

A more complex but also more versatile interface, which can be used with arbitrary suitable data, is provided by objects of class CvCrossManova. Upon creation, a ModeledData object containing data and design matrices as well as an analyses cell array have to be provided. After that, analyses can be run on subsets of dependent variables using the method runAnalyses.

Pattern distinctness vs pattern stability

The CvCrossManova toolbox implements both Cross-validated MANOVA resulting in pattern distinctness D and Cross-validated (Cross-) MANOVA resulting in pattern stability D×. Which of these a given Analysis specification corresponds to depends on the contrast matrices and the regressors involved in them.

In the simplest case where a regressor has the same meaning (corresponds to the same experimental condition) in all sessions (the same set of regressors is present in all sessions in the same order), Cross-validated MANOVA uses a single contrast matrix, which is used for both ‘training’ and ‘validation’, and Cross-validated Cross-MANOVA uses two different contrast matrices. However, if e.g. some experimental events occur only in some sessions and the corresponding regressors therefore do not occur in the others, it may happen that two different contrast matrices are necessary for Cross-validated MANOVA and that a single contrast matrix can be used for Cross-validated Cross-MANOVA.

Tip

It is recommended to always include all regressors in all sessions, even if they may be empty in some of them. This will lead to the corresponding parameter to be inestimable in those sessions, but as long as the contrasts used are estimable, this is not a problem for the analysis.

Permutations

analysis.addPermutations adds information to an Analysis object that permutations should be applied, so that subsequent analysis runs return not just the actual estimate but also permutation values, which can be used for a permutation test.

The ‘permutation’ method implemented is to switch the sign (±) of GLM parameter estimates in each session separately, as proposed by Allefeld & Haynes (2014). For m sessions there are formally 2m per-session sign permutations, but not all of them lead to different permutation values of D; the number of unique permutations depends on the ‘training’ and ‘validation’ sessions used in different folds. addPermutations determines which permutations can lead to different outcomes and makes sure only those unique permutations are included.

Caution

Sign permutations are used to test a per-unit (e.g. participant) null hypothesis of no effect in Cross-validated MANOVA analyses resulting in estimates of D. It does not make sense to apply them to Cross-MANOVA analyses resulting in estimates of D×.

If the number of unique permutations is too large, a maximum number of permutations to be applied can be specified via the parameter maxPerms. In that case, a random subset of permutations is chosen which can be used for a Monte-Carlo permutation test (Dwass, 1957).

Tip

Note that if the permutations are chosen randomly, the result of the permutation test is in principle random, too. To ensure reproducible results, it is recommended to run s = rng('shuffle') once, note the values of s.Seed and s.Type, and then to include rng(<Seed>, <Type>) in your analysis pipeline. This is also necessary for the checkpointing mechanism in ccmSearchlight to be effective.

If permutations are applied, an analysis which would normally return a single value (the actual estimate) instead returns an array of permutation values. Permutations always include the neutral permutation which does not modify the data, and it is always permutation 1, i.e. the actual estimate is always the first in an array of permutation values.

The toolbox does not implement the actual permutation test. For a single unit (participant), a permutation test can be implemented by comparing the actual value with all permutation values (including the actual value resulting from the neutral permutation) and rejecting the null hypothesis at significance level α if the actual value ranks within the upper α-quantile. A simple implementation is pValue = mean(Ds >= Ds(1)). See Ernst (2004) for details, including how to correct for multiple comparisons. For permutation-based population inference for MVPA, see Allefeld et al. (2016).

Searchlight radius and size

The numerical parameter radius of ccmSearchlight specifies the radius of the searchlight sphere.

Caution

Note that a voxel is included in a searchlight if its distance from the center voxel is smaller than or equal to the radius. This convention may be different from the one used by other software packages.

By default the radius is in voxels. That can be changed to millimeters by additionally passing mmUnits = true, which uses the information on dimensions and orientation of voxels stored in fMRI data files. If the voxel dimensions are different in different directions, the searchlight sphere in physical space will be an ellipsoid in voxel space.

Fractional values of radius are possible and meaningful:

Radius in voxels
 
radius p
0     1
1     7
1.5  19
1.8  27
2    33
2.3  57
2.5  81
2.9  93
3   123
Radius in mm
for voxel size 1 × 1 × 2 mm
radius p
0     1
1     5
1.5   9
2    15
2.3  31
2.5  39
2.9  51
3    71
3.2  79
3.5  87
3.7 103
3.8 119
4   125

p is the maximum number of voxels, which is reached by a searchlight completely within the brain mask; at its boundaries the actual number will be smaller.

You can use the function searchlightSize to explore meaningful radii and corresponding searchlight sizes before running an analysis.

Shrinkage regularization

The numerical parameter lambda (from 0 to 1) of ccmRegion, ccmSearchlight, and CvCrossManova controls the amount of shrinkage regularization applied to the estimate of the error covariance matrix, which improves the numerical and statistical stability of the estimates of D and D×. A drawback of regularization is that with it this estimation is no longer unbiased. The small default value of 10−8 was chosen to limit this effect while still providing somewhat improved numerical stability.

A larger value may be necessary if a larger number of dependent variables (voxels) enter an analysis. While there are approaches to determine the optimal amount of shrinkage regularization from the data (cf. Schäfer & Strimmer, 2005), in our experience they are not sufficiently reliable to be applied automatically.

Tip

It is recommended to usually keep the default value and rather restrict the number of dependent variables (voxels) analyzed together. If that is not possible, one pragmatic approach would be to manually optimize lambda on one unit (e.g. participant) which is then not used for further analysis.

The target of the shrinkage regularization is the diagonal matrix where every diagonal element is identical to the error variance averaged across dependent variables. A lambda value of 1 can therefore be used to disregard the error covariance structure, because it replaces the estimated error covariance matrix by this shrinkage target. This can be useful for Cross-MANOVA if it is intended to quantify orthogonality w.r.t. the original data space (Euclidean metric) instead of the whitened space.

Whitening and filtering

The logical parameter wf of ccmRegion, ccmSearchlight, and ModeledData.fromSPM specifies whether to apply the whitening and high-pass filtering set up in SPM to data and design matrices. The default value is true.

Caution

It is recommended to keep the default value. While the pattern distinctness and pattern stability estimators should still be unbiased because the residual degrees of freedom are adjusted, not whitening will lead to decreased precision and not filtering will retain low-frequency signal components which may act as a confound for experimental effects.

function ccmRegion

run Cross-validated (Cross-) MANOVA on regions

[Ds, ps] = ccmRegion(modelDir, regions, analyses, wf = true, lambda = 1e-8)

modelDir is the directory where the SPM.mat file referring to an estimated model is located.

regions is a cell array of logical region masks specified as three-dimensional arrays or filenames.

analyses is a cell array of Analysis objects specifying analyses.

The optional wf specifies whether to apply the whitening and high-pass filtering set up in SPM to data and design matrices. It should usually be kept at its default value.

The optional lambda (from 0 to 1) controls the amount of shrinkage regularization applied to the estimate of the error covariance matrix. It should usually be kept at its default value.

Ds is a two-dimensional cell array of analysis results. Each cell corresponds to the combination of an analysis (rows) and a region (columns) and contains either a scalar value or an array of permutation values. Whether a result is an estimate of pattern distinctness D or pattern stability D× depends on the contrasts of the corresponding analysis and the regressors involved in them.

ps is an array with the numbers of voxels contained in each region.

function ccmSearchlight

run Cross-validated (Cross-) MANOVA on searchlight

ccmSearchlight(modelDir, radius, analyses, ...
    mmUnits = false, wf = true, lambda = 1e-8)

modelDir is the directory where the SPM.mat file referring to an estimated model is located.

radius is the radius of the searchlight sphere.

analyses is a cell array of Analysis objects specifying analyses.

The optional mmUnits specifies whether the searchlight radius is in voxels or millimeters. By default the value of radius is interpreted as voxels, while interpretation as mm has to be requested by mmUnits = true.

The optional wf specifies whether to apply the whitening and high-pass filtering set up in SPM to data and design matrices. It should usually be kept at its default value.

The optional lambda (from 0 to 1) controls the amount of shrinkage regularization applied to the estimate of the error covariance matrix. It should usually be kept at its default value.

Analysis results are written to image files in the same directory modelDir, with names of the form spmD_A####_P####.nii. The number following A indicates the analysis and the number following P indicates the permutation. Whether a result is an estimate of pattern distinctness D or pattern stability D× depends on the contrasts of the corresponding analysis and the regressors involved in them.

In addition, an image file VPSL.nii is written containing the numbers of voxels for each searchlight, and a Matlab file ccmSearchlightParams.mat is written containing a record of the analysis parameters.

The searchlight procedure includes a checkpointing mechanism: Intermediate results are saved to a file ccmsCheckpoint….mat, where stands for a 32-digit hexadecimal checksum of the analysis parameters. If a run of ccmSearchlight is interrupted, running it again recovers partial results from the checkpoint file and continues from there. Note that checkpointing only works if all analysis parameters remain identical. In particular, if the analysis includes randomly selected permutations, make sure to initialize Matlab’s random number generator before the selection.

class Analysis

An Analysis object encapsulates ‘training’ and ‘validation’ contrasts and sessions as well as permutations, which together specify a Cross-validated (Cross-) MANOVA analysis.

properties

CA ‘training’ contrast matrix, regressors × subcontrasts
CB ‘validation’ contrast matrix, regressors × subcontrasts
sessionsA ‘training’ sessions logical matrix, folds × sessions
sessionsB ‘validation’ sessions logical matrix, folds × sessions
L number of folds
m number of sessions
perms sign permutations, permutations × sessions

constructor Analysis

create Analysis object

analysis = Analysis(CA, CB, sessionsA, sessionsB)

CA and CB are ‘training’ and ‘validation’ contrast matrices, respectively.

sessionsA and sessionsB are logical matrices which indicate which sessions (columns) are used for ‘training’ and ‘validation’, respectively, in each fold (rows).

static method leaveOneSessionOut

create Analysis object for leave-one-session-out cross-validation

analysis = Analysis.leaveOneSessionOut(m, CA, CB)
analysis = Analysis.leaveOneSessionOut(m, C)

m is the number of sessions. The object is created with ‘training’ sessions not(logical(eye(m))) and ‘validation’ sessions logical(eye(m)), the specification of standard leave-one-session-out cross-validation.

CA and CB are ‘training’ and ‘validation’ contrast matrices, respectively. If only one contrast matrix C is specified, it is used for both.

method addPermutations

add sign permutations of per-session parameter estimates

analysis.addPermutations(maxPerms = 1000)

This method adds information to analysis that sign permutations should be applied, so that different values of pattern distinctness D for the different permutations are computed.

The optional maxPerms specifies the maximum number of permutations.

method disp

textually display information about the object

analysis.disp()

This method overrides Matlab’s disp, so you can also use disp(analysis) or simply analysis without semicolon to get the same output.

method show

graphically display information about the object

fig = analysis.show()

This method creates a figure showing contrast matrices, session matrices, and if present permutations.

fig is the handle of the created figure.

class ModeledData

A ModeledData object encapsulates data and design matrices for multiple sessions along with supplementary information. Upon creation, it estimates and stores GLM parameters and errors. Combined with one or more Analysis objects, it is the basis for analyses within a CvCrossManova object.

properties

Xs cell array of per-session data matrices, observations × independent variables
Ys cell array of per-session design matrices, observations × dependent variables
fs array of per-session residual degrees of freedom
m number of sessions
ns array of per-session numbers of observations
p number of dependent variables
qs array of per-session numbers of independent variables
names cell array of per-session string arrays of names of independent variables

constructor ModeledData

create ModeledData object

modeledData = ModeledData(Ys, Xs, fs = ..., names = ...)

Ys and Xs are cell arrays of per-session data and design matrices, respectively.

The optional fs is an array of per-session residual degrees of freedom. Their default values are size(Xs{k}, 1) - rank(Xs{k}), corresponding to the assumption that observations are uncorrelated. If observations have been only approximately decorrelated or have been filtered, correct values should be explicitly specified.

The optional names is a cell array of per-session string arrays of names of independent variables. Default names are identical in all sessions and of the form "reg(#)", where # is the index of the independent variable. If independent variables with the same index in different sessions have different meanings, correct names should be explicitly specified.

static method fromSPM

create ModeledData object from SPM data

[modeledData, misc] = fromSPM(modelDir, regions = {}, wf = true)

modelDir is the directory where the SPM.mat file referring to an estimated model is located.

The optional regions is a cell array of logical region masks specified as three-dimensional arrays or filenames. Without it, only the SPM brain mask is applied.

The optional wf specifies whether to apply the whitening and high-pass filtering set up in SPM to data and design matrices. It should usually be kept at its default value.

Dependent variables (columns of data matrices Ys) are only read from voxels within the SPM analysis brain mask after it was intersected with the union of region masks (if provided).

misc is a structure of fields containing additional information:

mask logical 3D volume indicating the voxels which dependent variables correspond to
mat 3D voxel indices to mm transformation matrix
rmvi indices of dependent variables corresponding to each region, cell array of arrays

method disp

textually display information about the object

modeledData.disp()

This method overrides Matlab’s disp, so you can also use disp(modeledData) or simply modeledData without semicolon to get the same output.

method show

graphically display information about the object

fig = analysis.show(rescale = true)

This method creates a figure showing the design matrices of all sessions.

The optional rescale specifies whether independent variables are individually rescaled to an absolute maximum of 1, to aid visibility of details.

fig is the handle of the created figure.

class CvCrossManova

A CvCrossManova object implements the Cross-validated (Cross-) MANOVA algorithm. It combines the data and design matrices encapsulated in a ModeledData object with the analysis specifications of one or more Analysis objects. If its method runAnalyses is called for a set of dependent variables, all specified analyses are performed on these variables concurrently. This enables a more efficient implementation of the algorithm, because partial results shared between analyses only have to be computed once.

properties

modeledData data and design information, ModeledData object
analyses analysis specifications, cell array of Analysis objects
lambda amount of shrinkage regularization
nAnalyses number of analyses

constructor CvCrossManova

create CvCrossManova object

ccm = CvCrossManova(modeledData, analyses, lambda = 1e-8)

modeledData is a ModeledData object encapsulating data and design matrices for multiple sessions.

analyses is a cell array of Analysis objects specifying analyses.

The optional lambda (from 0 to 1) controls the amount of shrinkage regularization applied to the estimate of the error covariance matrix. It should usually be kept at its default value.

method runAnalyses

run analyses on (a subset of) the dependent variables

Ds = ccm.runAnalyses(vi)
Ds = ccm.runAnalyses()

vi specifies the variables to be included in the analysis, as column indices into the data matrices Ys of modeledData. If omitted, all variables are included.

Ds is a cell array of analysis results where each cell corresponds to a cell of analyses, either a scalar value or an array of permutation values. Whether a result is an estimate of pattern distinctness D or pattern stability D× depends on the contrasts of the corresponding analysis and the regressors involved in them.

To determine how many values will be included in each of the cells of Ds before actually running the analyses, e.g. for preallocation, use the method nResults.

method optimizeLambda

optimize regularization parameter for (a subset of) the dependent variables

[ol, MSE] = ccm.optimizeLambda(vi)
[ol, MSE] = ccm.optimizeLambda()

vi specifies the variables to be included in the analysis, as column indices into the data matrices Ys of modeledData. If omitted, all variables are included.

ol is the optimal shrinkage regularization parameter lambda, a number between 0 and 1.

MSE is the mean squared error of whitening at ol.

The optimal value is determined via leave-one-session-out cross-validation. The error covariance matrix is estimated from ‘training’ and ‘validation’ sessions, and the latter is whitened using a regularized version of the former. The deviation from the optimal result, the identity matrix, is quantified by the squared error, averaged across matrix elements and cross-validation folds. The regularization parameter is chosen such that the mean squared error is minimal. Because the ‘training’ error covariance matrix used for whitening is calculated from less than the full number of samples, ol will likely overestimate the optimum for the actual analysis.

method nResults

return the number of values returned by runAnalyses

nr = ccm.nResults()

nr is an array where each element corresponds to an analysis, and the value specifies the number of results returned for that analysis. For an analysis without permutations, that value is 1; otherwise it is the number of permutations.

method checkEstimability

check estimability of all contrasts of all analyses in all sessions

[estimability, problems] = ccm.checkEstimability()

estimability is a table with one row per session and one column per analysis & contrast. ‘true’ means that the contrast is estimable, ‘false’ that it is not estimable, ‘–’ that the contrast does not apply to the session.

problems indicates whether there are any inestimable contrasts (logical).

It is not usually necessary to use this method explicitly, because estimability is checked upon creation of a CvCrossManova object.

method disp

textually display information about the object

ccm.disp()

This method overrides Matlab’s disp, so you can also use disp(ccm) or simply ccm without semicolon to get the same output.

function searchlightSize

searchlight size as a function of searchlight radius

p = searchlightSize(radius, mat = eye(3), file = "")
tbl = searchlightSize(mat = eye(3), file = "")
tbl = searchlightSize([radiusMin, radiusMax], mat = eye(3), file = "")

With the first syntax, the number of voxels contained in a searchlight of radius radius is returned as p.

With the other syntaxes, a table of corresponding values of radius and p is returned as tbl. By default, the table includes radius values from 0 to 5, but another range can be specified as [radiusMin, radiusMax]. The tabulated values of radius are chosen such that all possible searchlight sizes in that range are listed.

Without further information, radius is in voxels. The optional mat can be used to specify a transformation matrix from voxel space to physical space, and then radius is in the same physical units as mat. Alternatively, the optional file can be used to specify the filename of an SPM.mat file or NIfTI file from which to read the transformation matrix, and then radius is in mm. If both are given, file overrides mat.

References

Allefeld, C., Görgen, K., & Haynes, J.-D. (2016). Valid population inference for information-based imaging: From the second-level t-test to prevalence inference. NeuroImage, 141, 378–392. https://doi.org/10.1016/j.neuroimage.2016.07.040
Allefeld, C., & Haynes, J.-D. (2014). Searchlight-based multi-voxel pattern analysis of fMRI by cross-validated MANOVA. NeuroImage, 89, 345–357. https://doi.org/10.1016/j.neuroimage.2013.11.043
Dwass, M. (1957). Modified randomization tests for nonparametric hypotheses. Annals of Mathematical Statistics, 28(1), 181–187. https://doi.org/10.1214/aoms/1177707045
Ernst, M. D. (2004). Permutation methods: A basis for exact inference. Statistical Science, 19(4), 676–685. https://doi.org/10.1214/088342304000000396
Schäfer, J., & Strimmer, K. (2005). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4(1). https://doi.org/10.2202/1544-6115.1175