API reference
Detailed information about toolbox functions and classes
This section documents the public API (application programming interface) of functions and classes provided by the CvCrossManova toolbox. The toolbox contains further helper functions which may also be useful, and they are documented with standard Matlab help comments, but they are not part of the public interface and are not described here.
General notes
Programming interfaces
The CvCrossManova toolbox has two interfaces to perform analyses:
A simplified interface to analyse fMRI data which have been modeled in SPM is provided by the functions ccmRegion
for region-of-interest analyses and ccmSearchlight
for searchlight analyses. Both functions take a parameter analyses
which is a cell array of Analysis
objects specifying analyses.
A more complex but also more versatile interface, which can be used with arbitrary suitable data, is provided by objects of class CvCrossManova
. Upon creation, a ModeledData
object containing data and design matrices as well as an analyses
cell array have to be provided. After that, analyses can be run on subsets of dependent variables using the method runAnalyses
.
Pattern distinctness vs pattern stability
The CvCrossManova toolbox implements both Cross-validated MANOVA resulting in pattern distinctness D and Cross-validated (Cross-) MANOVA resulting in pattern stability D×. Which of these a given Analysis
specification corresponds to depends on the contrast matrices and the regressors involved in them.
In the simplest case where a regressor has the same meaning (corresponds to the same experimental condition) in all sessions (the same set of regressors is present in all sessions in the same order), Cross-validated MANOVA uses a single contrast matrix, which is used for both ‘training’ and ‘validation’, and Cross-validated Cross-MANOVA uses two different contrast matrices. However, if e.g. some experimental events occur only in some sessions and the corresponding regressors therefore do not occur in the others, it may happen that two different contrast matrices are necessary for Cross-validated MANOVA and that a single contrast matrix can be used for Cross-validated Cross-MANOVA.
It is recommended to always include all regressors in all sessions, even if they may be empty in some of them. This will lead to the corresponding parameter to be inestimable in those sessions, but as long as the contrasts used are estimable, this is not a problem for the analysis.
Permutations
analysis.addPermutations
adds information to an Analysis
object that permutations should be applied, so that subsequent analysis runs return not just the actual estimate but also permutation values, which can be used for a permutation test.
The ‘permutation’ method implemented is to switch the sign (±) of GLM parameter estimates in each session separately, as proposed by Allefeld & Haynes (2014). For m sessions there are formally 2m per-session sign permutations, but not all of them lead to different permutation values of D; the number of unique permutations depends on the ‘training’ and ‘validation’ sessions used in different folds. addPermutations
determines which permutations can lead to different outcomes and makes sure only those unique permutations are included.
Sign permutations are used to test a per-unit (e.g. participant) null hypothesis of no effect in Cross-validated MANOVA analyses resulting in estimates of D. It does not make sense to apply them to Cross-MANOVA analyses resulting in estimates of D×.
If the number of unique permutations is too large, a maximum number of permutations to be applied can be specified via the parameter maxPerms
. In that case, a random subset of permutations is chosen which can be used for a Monte-Carlo permutation test (Dwass, 1957).
Note that if the permutations are chosen randomly, the result of the permutation test is in principle random, too. To ensure reproducible results, it is recommended to run s = rng('shuffle')
once, note the values of s.Seed
and s.Type
, and then to include rng(<Seed>, <Type>)
in your analysis pipeline. This is also necessary for the checkpointing mechanism in ccmSearchlight
to be effective.
If permutations are applied, an analysis which would normally return a single value (the actual estimate) instead returns an array of permutation values. Permutations always include the neutral permutation which does not modify the data, and it is always permutation 1, i.e. the actual estimate is always the first in an array of permutation values.
The toolbox does not implement the actual permutation test. For a single unit (participant), a permutation test can be implemented by comparing the actual value with all permutation values (including the actual value resulting from the neutral permutation) and rejecting the null hypothesis at significance level α if the actual value ranks within the upper α-quantile. A simple implementation is pValue = mean(Ds >= Ds(1))
. See Ernst (2004) for details, including how to correct for multiple comparisons. For permutation-based population inference for MVPA, see Allefeld et al. (2016).
Searchlight radius and size
The numerical parameter radius
of ccmSearchlight
specifies the radius of the searchlight sphere.
Note that a voxel is included in a searchlight if its distance from the center voxel is smaller than or equal to the radius. This convention may be different from the one used by other software packages.
By default the radius is in voxels. That can be changed to millimeters by additionally passing mmUnits = true
, which uses the information on dimensions and orientation of voxels stored in fMRI data files. If the voxel dimensions are different in different directions, the searchlight sphere in physical space will be an ellipsoid in voxel space.
Fractional values of radius
are possible and meaningful:
radius |
p |
---|---|
0 | 1 |
1 | 7 |
1.5 | 19 |
1.8 | 27 |
2 | 33 |
2.3 | 57 |
2.5 | 81 |
2.9 | 93 |
3 | 123 |
radius |
p |
---|---|
0 | 1 |
1 | 5 |
1.5 | 9 |
2 | 15 |
2.3 | 31 |
2.5 | 39 |
2.9 | 51 |
3 | 71 |
3.2 | 79 |
3.5 | 87 |
3.7 | 103 |
3.8 | 119 |
4 | 125 |
p
is the maximum number of voxels, which is reached by a searchlight completely within the brain mask; at its boundaries the actual number will be smaller.
You can use the function searchlightSize
to explore meaningful radii and corresponding searchlight sizes before running an analysis.
Shrinkage regularization
The numerical parameter lambda
(from 0 to 1) of ccmRegion
, ccmSearchlight
, and CvCrossManova
controls the amount of shrinkage regularization applied to the estimate of the error covariance matrix, which improves the numerical and statistical stability of the estimates of D and D×. A drawback of regularization is that with it this estimation is no longer unbiased. The small default value of 10−8 was chosen to limit this effect while still providing somewhat improved numerical stability.
A larger value may be necessary if a larger number of dependent variables (voxels) enter an analysis. While there are approaches to determine the optimal amount of shrinkage regularization from the data (cf. Schäfer & Strimmer, 2005), in our experience they are not sufficiently reliable to be applied automatically.
It is recommended to usually keep the default value and rather restrict the number of dependent variables (voxels) analyzed together. If that is not possible, one pragmatic approach would be to manually optimize lambda
on one unit (e.g. participant) which is then not used for further analysis.
The target of the shrinkage regularization is the diagonal matrix where every diagonal element is identical to the error variance averaged across dependent variables. A lambda
value of 1 can therefore be used to disregard the error covariance structure, because it replaces the estimated error covariance matrix by this shrinkage target. This can be useful for Cross-MANOVA if it is intended to quantify orthogonality w.r.t. the original data space (Euclidean metric) instead of the whitened space.
Whitening and filtering
The logical parameter wf
of ccmRegion
, ccmSearchlight
, and ModeledData.fromSPM
specifies whether to apply the whitening and high-pass filtering set up in SPM to data and design matrices. The default value is true
.
It is recommended to keep the default value. While the pattern distinctness and pattern stability estimators should still be unbiased because the residual degrees of freedom are adjusted, not whitening will lead to decreased precision and not filtering will retain low-frequency signal components which may act as a confound for experimental effects.
function ccmRegion
run Cross-validated (Cross-) MANOVA on regions
Ds, ps] = ccmRegion(modelDir, regions, analyses, wf = true, lambda = 1e-8) [
modelDir
is the directory where the SPM.mat
file referring to an estimated model is located.
regions
is a cell array of logical region masks specified as three-dimensional arrays or filenames.
analyses
is a cell array of Analysis
objects specifying analyses.
The optional wf
specifies whether to apply the whitening and high-pass filtering set up in SPM to data and design matrices. It should usually be kept at its default value.
The optional lambda
(from 0 to 1) controls the amount of shrinkage regularization applied to the estimate of the error covariance matrix. It should usually be kept at its default value.
Ds
is a two-dimensional cell array of analysis results. Each cell corresponds to the combination of an analysis (rows) and a region (columns) and contains either a scalar value or an array of permutation values. Whether a result is an estimate of pattern distinctness D or pattern stability D× depends on the contrasts of the corresponding analysis and the regressors involved in them.
ps
is an array with the numbers of voxels contained in each region.
function ccmSearchlight
run Cross-validated (Cross-) MANOVA on searchlight
ccmSearchlight(modelDir, radius, analyses, ...
mmUnits = false, wf = true, lambda = 1e-8)
modelDir
is the directory where the SPM.mat
file referring to an estimated model is located.
radius
is the radius of the searchlight sphere.
analyses
is a cell array of Analysis
objects specifying analyses.
The optional mmUnits
specifies whether the searchlight radius is in voxels or millimeters. By default the value of radius
is interpreted as voxels, while interpretation as mm has to be requested by mmUnits = true
.
The optional wf
specifies whether to apply the whitening and high-pass filtering set up in SPM to data and design matrices. It should usually be kept at its default value.
The optional lambda
(from 0 to 1) controls the amount of shrinkage regularization applied to the estimate of the error covariance matrix. It should usually be kept at its default value.
Analysis results are written to image files in the same directory modelDir
, with names of the form spmD_A####_P####.nii
. The number following A
indicates the analysis and the number following P
indicates the permutation. Whether a result is an estimate of pattern distinctness D or pattern stability D× depends on the contrasts of the corresponding analysis and the regressors involved in them.
In addition, an image file VPSL.nii
is written containing the numbers of voxels for each searchlight, and a Matlab file ccmSearchlightParams.mat
is written containing a record of the analysis parameters.
The searchlight procedure includes a checkpointing mechanism: Intermediate results are saved to a file ccmsCheckpoint….mat
, where …
stands for a 32-digit hexadecimal checksum of the analysis parameters. If a run of ccmSearchlight
is interrupted, running it again recovers partial results from the checkpoint file and continues from there. Note that checkpointing only works if all analysis parameters remain identical. In particular, if the analysis includes randomly selected permutations, make sure to initialize Matlab’s random number generator before the selection.
class Analysis
An Analysis
object encapsulates ‘training’ and ‘validation’ contrasts and sessions as well as permutations, which together specify a Cross-validated (Cross-) MANOVA analysis.
properties
CA |
‘training’ contrast matrix, regressors × subcontrasts |
CB |
‘validation’ contrast matrix, regressors × subcontrasts |
sessionsA |
‘training’ sessions logical matrix, folds × sessions |
sessionsB |
‘validation’ sessions logical matrix, folds × sessions |
L |
number of folds |
m |
number of sessions |
perms |
sign permutations, permutations × sessions |
constructor Analysis
create Analysis
object
analysis = Analysis(CA, CB, sessionsA, sessionsB)
CA
and CB
are ‘training’ and ‘validation’ contrast matrices, respectively.
sessionsA
and sessionsB
are logical matrices which indicate which sessions (columns) are used for ‘training’ and ‘validation’, respectively, in each fold (rows).
static method leaveOneSessionOut
create Analysis
object for leave-one-session-out cross-validation
analysis = Analysis.leaveOneSessionOut(m, CA, CB)
analysis = Analysis.leaveOneSessionOut(m, C)
m
is the number of sessions. The object is created with ‘training’ sessions not(logical(eye(m)))
and ‘validation’ sessions logical(eye(m))
, the specification of standard leave-one-session-out cross-validation.
CA
and CB
are ‘training’ and ‘validation’ contrast matrices, respectively. If only one contrast matrix C
is specified, it is used for both.
method addPermutations
add sign permutations of per-session parameter estimates
analysis.addPermutations(maxPerms = 1000)
This method adds information to analysis
that sign permutations should be applied, so that different values of pattern distinctness D for the different permutations are computed.
The optional maxPerms
specifies the maximum number of permutations.
method disp
textually display information about the object
analysis.disp()
This method overrides Matlab’s disp
, so you can also use disp(analysis)
or simply analysis
without semicolon to get the same output.
method show
graphically display information about the object
fig = analysis.show()
This method creates a figure showing contrast matrices, session matrices, and if present permutations.
fig
is the handle of the created figure.
class ModeledData
A ModeledData
object encapsulates data and design matrices for multiple sessions along with supplementary information. Upon creation, it estimates and stores GLM parameters and errors. Combined with one or more Analysis
objects, it is the basis for analyses within a CvCrossManova
object.
properties
Xs |
cell array of per-session data matrices, observations × independent variables |
Ys |
cell array of per-session design matrices, observations × dependent variables |
fs |
array of per-session residual degrees of freedom |
m |
number of sessions |
ns |
array of per-session numbers of observations |
p |
number of dependent variables |
qs |
array of per-session numbers of independent variables |
names |
cell array of per-session string arrays of names of independent variables |
constructor ModeledData
create ModeledData
object
modeledData = ModeledData(Ys, Xs, fs = ..., names = ...)
Ys
and Xs
are cell arrays of per-session data and design matrices, respectively.
The optional fs
is an array of per-session residual degrees of freedom. Their default values are size(Xs{k}, 1) - rank(Xs{k})
, corresponding to the assumption that observations are uncorrelated. If observations have been only approximately decorrelated or have been filtered, correct values should be explicitly specified.
The optional names
is a cell array of per-session string arrays of names of independent variables. Default names are identical in all sessions and of the form "reg(#)"
, where #
is the index of the independent variable. If independent variables with the same index in different sessions have different meanings, correct names should be explicitly specified.
static method fromSPM
create ModeledData
object from SPM data
modeledData, misc] = fromSPM(modelDir, regions = {}, wf = true) [
modelDir
is the directory where the SPM.mat
file referring to an estimated model is located.
The optional regions
is a cell array of logical region masks specified as three-dimensional arrays or filenames. Without it, only the SPM brain mask is applied.
The optional wf
specifies whether to apply the whitening and high-pass filtering set up in SPM to data and design matrices. It should usually be kept at its default value.
Dependent variables (columns of data matrices Ys
) are only read from voxels within the SPM analysis brain mask after it was intersected with the union of region masks (if provided).
misc
is a structure of fields containing additional information:
mask |
logical 3D volume indicating the voxels which dependent variables correspond to |
mat |
3D voxel indices to mm transformation matrix |
rmvi |
indices of dependent variables corresponding to each region, cell array of arrays |
method disp
textually display information about the object
modeledData.disp()
This method overrides Matlab’s disp
, so you can also use disp(modeledData)
or simply modeledData
without semicolon to get the same output.
method show
graphically display information about the object
fig = analysis.show(rescale = true)
This method creates a figure showing the design matrices of all sessions.
The optional rescale
specifies whether independent variables are individually rescaled to an absolute maximum of 1, to aid visibility of details.
fig
is the handle of the created figure.
class CvCrossManova
A CvCrossManova
object implements the Cross-validated (Cross-) MANOVA algorithm. It combines the data and design matrices encapsulated in a ModeledData
object with the analysis specifications of one or more Analysis
objects. If its method runAnalyses
is called for a set of dependent variables, all specified analyses are performed on these variables concurrently. This enables a more efficient implementation of the algorithm, because partial results shared between analyses only have to be computed once.
properties
modeledData |
data and design information, ModeledData object |
analyses |
analysis specifications, cell array of Analysis objects |
lambda |
amount of shrinkage regularization |
nAnalyses |
number of analyses |
constructor CvCrossManova
create CvCrossManova
object
ccm = CvCrossManova(modeledData, analyses, lambda = 1e-8)
modeledData
is a ModeledData
object encapsulating data and design matrices for multiple sessions.
analyses
is a cell array of Analysis
objects specifying analyses.
The optional lambda
(from 0 to 1) controls the amount of shrinkage regularization applied to the estimate of the error covariance matrix. It should usually be kept at its default value.
method runAnalyses
run analyses on (a subset of) the dependent variables
Ds = ccm.runAnalyses(vi)
Ds = ccm.runAnalyses()
vi
specifies the variables to be included in the analysis, as column indices into the data matrices Ys
of modeledData
. If omitted, all variables are included.
Ds
is a cell array of analysis results where each cell corresponds to a cell of analyses
, either a scalar value or an array of permutation values. Whether a result is an estimate of pattern distinctness D or pattern stability D× depends on the contrasts of the corresponding analysis and the regressors involved in them.
To determine how many values will be included in each of the cells of Ds
before actually running the analyses, e.g. for preallocation, use the method nResults
.
method optimizeLambda
optimize regularization parameter for (a subset of) the dependent variables
ol, MSE] = ccm.optimizeLambda(vi)
[ol, MSE] = ccm.optimizeLambda() [
vi
specifies the variables to be included in the analysis, as column indices into the data matrices Ys
of modeledData
. If omitted, all variables are included.
ol
is the optimal shrinkage regularization parameter lambda
, a number between 0 and 1.
MSE
is the mean squared error of whitening at ol
.
The optimal value is determined via leave-one-session-out cross-validation. The error covariance matrix is estimated from ‘training’ and ‘validation’ sessions, and the latter is whitened using a regularized version of the former. The deviation from the optimal result, the identity matrix, is quantified by the squared error, averaged across matrix elements and cross-validation folds. The regularization parameter is chosen such that the mean squared error is minimal. Because the ‘training’ error covariance matrix used for whitening is calculated from less than the full number of samples, ol
will likely overestimate the optimum for the actual analysis.
method nResults
return the number of values returned by runAnalyses
nr = ccm.nResults()
nr
is an array where each element corresponds to an analysis, and the value specifies the number of results returned for that analysis. For an analysis without permutations, that value is 1; otherwise it is the number of permutations.
method checkEstimability
check estimability of all contrasts of all analyses in all sessions
estimability, problems] = ccm.checkEstimability() [
estimability
is a table with one row per session and one column per analysis & contrast. ‘true’ means that the contrast is estimable, ‘false’ that it is not estimable, ‘–’ that the contrast does not apply to the session.
problems
indicates whether there are any inestimable contrasts (logical).
It is not usually necessary to use this method explicitly, because estimability is checked upon creation of a CvCrossManova
object.
method disp
textually display information about the object
ccm.disp()
This method overrides Matlab’s disp
, so you can also use disp(ccm)
or simply ccm
without semicolon to get the same output.
function searchlightSize
searchlight size as a function of searchlight radius
p = searchlightSize(radius, mat = eye(3), file = "")
tbl = searchlightSize(mat = eye(3), file = "")
tbl = searchlightSize([radiusMin, radiusMax], mat = eye(3), file = "")
With the first syntax, the number of voxels contained in a searchlight of radius radius
is returned as p
.
With the other syntaxes, a table of corresponding values of radius
and p
is returned as tbl
. By default, the table includes radius
values from 0 to 5, but another range can be specified as [radiusMin, radiusMax]
. The tabulated values of radius
are chosen such that all possible searchlight sizes in that range are listed.
Without further information, radius
is in voxels. The optional mat
can be used to specify a transformation matrix from voxel space to physical space, and then radius
is in the same physical units as mat
. Alternatively, the optional file
can be used to specify the filename of an SPM.mat
file or NIfTI file from which to read the transformation matrix, and then radius
is in mm. If both are given, file
overrides mat
.