src.uf_toolbox.uf_designmat(EEG, varargin)
Takes an EEG structure with events and returns it with an EEG.unfold.X field containing the design matrix. If you pass multiple event types and formulas as cell arrays, the function calls itself once per formula and combines the results into one large design matrix. The design matrix is not yet ready for deconvolution; use uf_timeexpandDesignmat for that.
cfg.formula (string) –
Formula in Wilkinson notation. In addition to the MATLAB default, one can specify 'cat(X)' so that X is interpreted as a categorical variable and therefore dummy/effect-coded. Using spl(Y,5) defines a "non-linear" predictor using 5 cubic B-spline basis functions. The more splines one uses, the more flexible the relations that can be fitted, but the higher the risk of overfitting. Custom spline functions are possible by calling uf_designmat_spline() after the initial call of uf_designmat.
Example with multiple formulas: {'y~A+spl(B,5)', 'y~x+cat(y)', 'y~1'}
Be sure to define multiple eventtypes if you use multiple formulas.
Example with a more complex formula: {'y ~ stimulus_type + color * size + stimulus_type:color'}
This formula would add the following main effects: "stimulus_type, color, size" and the following interactions: "stimulus_type:color, color:size".
To define the reference category, have a look at cfg.categorical {cell} below. By default the levels are sorted and the first level is chosen as the reference.
cfg.eventtypes (cell of strings or cell of cells of strings) – The events the formula is fit on. Make sure that all fields are filled for all events. Special case, multiple eventtypes: you can fit different formulas on different events concurrently. The specification could be as follows: {{'A1','A2','A3'},{'B'},{'C'}}. If more than one formula is specified, you are expected to specify the eventtypes each formula should be applied to.
cfg.categorical (cell-array) – default {}; list of the EEG.event fields that should be treated as categorical effects (and thus dummy/effect coded). You can also specify directly in the formula which variables are categorical. You can specify the order of the levels, for example: {'predictorA',{'level3','level1','level2'}; 'predictorB',{'level2','level1'}}. For predictorA, level3 is now used as the reference group; for predictorB, level2. The second column of the cell array is optional, e.g. {'predictorA','predictorB'} will treat both predictors as categorical.
cfg.splinespacing (string) – defines how the knots of the splines should be placed. Possible values:
'linear': linear spacing with boundary splines at the respective min/max
'log': logarithmically increasing spacing
'logreverse': logarithmically decreasing spacing
'quantiles' (default): heuristic spacing at the quantiles
cfg.codingschema (string) – default 'reference'; can also be 'effects'. This is relevant if you define categorical input variables. Reference coding is also known as treatment coding. Can also be 'full', but in that case an overcomplete design matrix is returned: the resulting betas are not estimable functions and would need to be combined by an estimable contrast.
Returns the EEG structure with the additional fields in EEG.unfold
X: The design matrix
colnames: For each column of ‘X’, which predictor it represents
formula: The original cfg.formula
event: the cfg.eventtypes
cols2eventtypes: For each column of ‘X’ which event it represents
EEG-struct
A classical 2x2 factorial design with interaction
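To make the effect of reference coding on such a 2x2 design concrete, here is a minimal sketch in plain Python (not the toolbox's MATLAB code). The factor names, levels and column names are hypothetical, and the column layout is an assumption for illustration; only the rule "sort the levels, take the first as reference" comes from the description above.

```python
# Illustrative only: reference (treatment) coding of a 2x2 design with interaction.
# Factor names, levels and column names are hypothetical, not toolbox output.

def reference_code(levels, value):
    """Dummy-code `value`: sort the levels, use the first as reference (all zeros)."""
    ordered = sorted(levels)
    return [1 if value == lvl else 0 for lvl in ordered[1:]]

events = [
    {"color": "blue", "size": "big"},
    {"color": "blue", "size": "small"},
    {"color": "red", "size": "big"},
    {"color": "red", "size": "small"},
]

colnames = ["(Intercept)", "color_red", "size_small", "color_red:size_small"]
X = []
for ev in events:
    c = reference_code({"blue", "red"}, ev["color"])[0]
    s = reference_code({"big", "small"}, ev["size"])[0]
    X.append([1, c, s, c * s])  # interaction column = product of the two dummies
```

Each row of `X` corresponds to one event; only the event with both non-reference levels has a 1 in the interaction column.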
src.uf_toolbox.uf_timeexpandDesignmat(EEG, varargin)
This function takes the design matrix (saved in EEG.unfold.X, an EEG.points times nPredictor matrix) and expands it over time (in the range of the window length).
cfg.method (string) –
default 'stick'; three methods are available:
'stick': We shift the signal over each point in time; uses the stick-function basis.
'splines': We use cubic splines (number = timeexpandparam) to approximate the signal. This makes use of neighbouring time points, which are very likely correlated.
'fourier': We use a Fourier set (up to the first timeexpandparam frequencies) to model the signal.
cfg.timelimits (2 integers) – defines the time range of the time expansion, analogous to the epoch size. It should be as long as you think overlap can happen in your data (in seconds).
cfg.timeexpandparam (integer) – depending on whether cfg.method is 'splines' or 'fourier', defines how many splines or Fourier frequencies should be used to convolve (for 'fourier', the effective parameter size is twice as large due to the sin/cos 'duplication'). In case of 'stick', the parameter is not used.
EEG.unfold.Xdc - the designmatrix for all time points
EEG.unfold.timebasis - the basis set for splines / fourier. This is used later to recover the values in the time-domain, not the basis-function domain
EEG.unfold.basisTime - the time of the unfold-window in seconds
EEG.Xdc_terms2cols - A unique specifier defining which of the deconvolution-additional-columns belongs to which predictor
EEG = uf_timeexpandDesignmat(EEG,'method','splines','windowlength',128,'timeexpandparam',30)
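The 'stick' expansion can be pictured with a small sketch in plain Python (not the toolbox's MATLAB code). The function name `timeexpand_stick`, the 0-based sample indices and the predictor-major column order are assumptions made for illustration.

```python
# Illustrative only: stick-basis time expansion of a design matrix.
# Column order (predictor-major) and 0-based samples are assumptions.

def timeexpand_stick(n_samples, latencies, X, window):
    """Return Xdc as a list of rows: n_samples x (n_predictors * len(window))."""
    n_pred = len(X[0])
    n_lags = len(window)
    Xdc = [[0.0] * (n_pred * n_lags) for _ in range(n_samples)]
    for lat, row in zip(latencies, X):
        for p, value in enumerate(row):
            for k, lag in enumerate(window):
                t = lat + lag
                if 0 <= t < n_samples:  # ignore lags that fall outside the recording
                    Xdc[t][p * n_lags + k] += value
    return Xdc

# Two events (samples 2 and 4), intercept-only design, window of 3 samples.
Xdc = timeexpand_stick(n_samples=10, latencies=[2, 4], X=[[1], [1]], window=range(0, 3))
```

At sample 4 the two events overlap: the row has an entry for lag 2 of the first event and for lag 0 of the second, which is what later allows the model to separate their contributions.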
src.uf_toolbox.uf_glmfit(EEG, varargin)
This function solves the equation X*beta = EEG.data, with X = Designmat. Multiple algorithms are implemented: a slow iterative algorithm that runs on sparse matrices (default) and solves each channel in turn, and the MATLAB algorithm, which solves all channels at the same time but takes quite a lot of memory.
cfg.method (string) –
"lsmr": default; an iterative solver is used. This is very memory efficient, but a lot slower than the 'matlab' option because each electrode has to be solved independently. The LSMR algorithm is used for sparse iterative solving.
"par-lsmr": same as lsmr, but uses parfor with ncpu-1. This does not seem to be any faster at the moment (unsure why). Not recommended.
"matlab": uses MATLAB's native A/b solver. For moderate to big design matrices it will need a lot of memory (40-60GB is easily reached).
"pinv": a naive pseudo-inverse, generally not recommended due to floating-point instability.
"glmnet": uses glmnet to fit the linear system. By default this uses the L1 norm, aka lasso (specified as cfg.glmnetalpha = 1). For ridge regression (L2 norm) use cfg.glmnetalpha = 0. Something in between results in elastic net. We use the cvglmnet functionality, which automatically does cross-validation to estimate the lambda parameter (i.e. how strongly parameter values should be regularised compared to the fit of the model). We use the glmnet-recommended 'lambda_1se', i.e. minimum lambda + 1SE buffer towards stricter regularisation.
cfg.lsmriterations – (default 400) defines how many steps the iterative solver should search for a solution. While the solver is mostly monotonic (see paper), it is recommended to increase the iterations. A limit is only defined because in our experience a high number of iterations is a result of strong collinearities and hints at a faulty model.
cfg.glmnetalpha – (default 1, as in glmnet) can be 0 for the L2 norm, 1 for the L1 norm, or something in between for elastic net.
cfg.fold_event – (default empty) (development / no unit test) defines the EEG.events on which the cross-validation folds for glmnet should be placed.
cfg.channel (array) – Restrict the beta calculation to a subset of channels. Default is all channels.
cfg.debug (boolean) – default 0; only with method 'matlab'; outputs additional details from the solver used.
cfg.precondition (boolean) – default 1; scales each row of Xdc to SD=1. This increases the solving speed by a factor of ~2. For very large matrices you might run into memory problems; deactivate it then.
cfg.ica (boolean) – default 0; use data (0) or ICA components (1; they have to be in EEG.icaact). cfg.channel chooses the components.
EEG – the EEG set; needs to have EEG.unfold.Xdc compatible with the size of EEG.data
Returns EEG.unfold.beta: array (nchan x ntime x npred) (ntime could be n-timesplines, n-fourierbasis or samples)
EEG = uf_glmfit(EEG);
EEG = uf_glmfit(EEG,'method','matlab','channel',[3 5]);
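The following toy sketch in plain Python shows why solving X*beta = data on a time-expanded design matrix corrects for overlap. It is not the toolbox's solver: instead of LSMR it uses the normal equations with Gaussian elimination, which is only reasonable for tiny dense problems, and the kernel, latencies and variable names are made up.

```python
# Illustrative only: least squares on a tiny time-expanded design matrix.
# Uses normal equations + Gaussian elimination, NOT the LSMR solver named above.

def solve_normal_equations(X, y):
    """Solve (X'X) b = X'y for small, dense, full-rank toy problems."""
    n, p = len(X), len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)] for r in range(p)]
    b = [sum(X[i][r] * y[i] for i in range(n)) for r in range(p)]
    for col in range(p):  # forward elimination with partial pivoting
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * p
    for r in reversed(range(p)):  # back substitution
        s = sum(A[r][c] * beta[c] for c in range(r + 1, p))
        beta[r] = (b[r] - s) / A[r][r]
    return beta

kernel = [1.0, 2.0, 3.0]          # hypothetical "true" response over 3 samples
latencies = [2, 4]                # two events whose responses overlap at sample 4
Xdc = [[0.0] * 3 for _ in range(10)]
for lat in latencies:
    for lag in range(3):
        Xdc[lat + lag][lag] += 1.0
y = [sum(x * k for x, k in zip(row, kernel)) for row in Xdc]  # noise-free signal

beta = solve_normal_equations(Xdc, y)
epoch_average = [sum(y[lat + lag] for lat in latencies) / len(latencies) for lag in range(3)]
```

In this noise-free example `beta` recovers the kernel, while the plain epoch average is distorted at the lags affected by overlap.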
src.uf_toolbox.uf_condense(EEG, varargin)
Condenses the results into a new structure. Returns a "ufresult" structure that contains the predictor betas over time and accompanying information. This structure is used by all plotting functions. The function also applies the time basis if necessary (i.e. if you specified something other than the default 'stick' in uf_timeexpandDesignmat()).
EEG (struct) – A struct containing EEG.unfold.beta_dc
cfg.deconv (integer) –
1: use EEG.unfold.beta_dc, the deconvolved betas
0: use EEG.unfold.beta_nodc, betas without deconvolution
-1: use and return both
cfg.channel (array) – Restrict the beta output to a subset of channels. Default is all channels.
Returns:
ufresult.beta = (nchans x time x parameters)
ufresult.beta_nodc = (nchans x time x parameters) (only if deconv=0 or -1)
ufresult.param = (struct size: parameters) each field contains the values of the respective parameter.
ufresult.unfold = EEG.unfold
ufresult.times = EEG.times
ufresult.chanlocs = EEG.chanlocs
Example:
ufresult = uf_condense(EEG)
ufresult.param(X):
* name: name of the variable, e.g. 'continuousA'
* value: value of the predictor, e.g. '50'
* event: event of the variable, e.g. 'eventA'
src.uf_toolbox.uf_continuousArtifactDetect(EEG, varargin)
Reject commonly recorded artifactual potentials (c.r.a.p.)
Note: This is an ERPLAB function that was heavily altered by Benedikt Ehinger to be included in the unfold toolbox. In particular, I removed all the filter-features and changed the input parser. Please cite the ERPLAB toolbox if you use this function (reference below). Benedikt Ehinger & Olaf Dimigen
There are a number of common artifacts that you will see in nearly every EEG data file. These include eyeblinks, slow voltage changes (caused mostly by skin potentials), muscle activity (from moving the head or tensing up the muscles in the face or neck), horizontal eye movements, and various types of C.R.A.P. (Commonly Recorded Artifactual Potentials).
Although we usually perform artifact rejection on the segmented data, it’s a good idea to examine the raw unsegmented EEG data first. You can usually identify patterns of artifacts, make sure there were no errors in the file, etc., more easily with the raw data [1].
crap.m allows you to automatically identify large peak-to-peak differences or extreme amplitude values, within a moving window, across your continuous EEG dataset. After performing crap.m, artifactual segments will be rejected and replaced by a ‘boundary’ event code.
EEG – continuous EEG dataset (EEGLAB's EEG structure)
'amplitudeThreshold' – Thresholds (values); data exceeding [-lim +lim] are marked
'windowsize' – moving window width (in msec, default: 2000 ms)
'stepsize' – moving window step (default: 1000 ms)
'combineSegments' – marked segment(s) closer than this value will be joined together
'channels' – channels to check for artifacts (e.g. [1:64]), default: all
Example:
winrej = uf_continuousArtifactDetect(EEG)
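The moving-window idea can be sketched in plain Python as follows. This is not the ERPLAB/unfold implementation: it checks only the peak-to-peak amplitude of a single channel, uses 0-based inclusive sample indices, and merges only overlapping or adjacent windows. The function name and the test signal are made up; the 2000 ms and 1000 ms defaults are taken from the parameter list above.

```python
# Illustrative only: moving-window peak-to-peak artifact detection on one channel.
# 0-based, inclusive [from, to] pairs; merging rule is a simplification.

def detect_artifacts(signal, srate, threshold, windowsize_ms=2000, stepsize_ms=1000):
    """Return [from, to] sample pairs of windows whose peak-to-peak exceeds threshold."""
    if not signal:
        return []
    win = int(windowsize_ms * srate / 1000)
    step = int(stepsize_ms * srate / 1000)
    winrej = []
    for start in range(0, max(len(signal) - win, 0) + 1, step):
        segment = signal[start:start + win]
        if max(segment) - min(segment) > threshold:
            stop = start + len(segment) - 1
            if winrej and start <= winrej[-1][1] + 1:
                winrej[-1][1] = stop          # extend the previous marked segment
            else:
                winrej.append([start, stop])  # start a new marked segment
    return winrej

srate = 10                      # hypothetical sampling rate in Hz
signal = [0.0] * 100
signal[45] = 500.0              # a single large spike
winrej = detect_artifacts(signal, srate, threshold=250)
```

Both windows that contain the spike are flagged and merged into one from-to pair, which is the shape of output the exclusion functions below expect.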
[1] ERP Boot Camp: Data Analysis Tutorials. Emily S. Kappenman, Marissa L. Gamble, and Steven J. Luck. UC Davis
This function is part of ERPLAB Toolbox Author: Javier Lopez-Calderon Center for Mind and Brain University of California, Davis, Davis, CA 2009
src.uf_toolbox.uf_continuousArtifactExclude(EEG, varargin)
This function expects a rejection vector and excludes those intervals from being modeled in the design matrix. That means all predictor values in the given intervals are set to 0 in the time-expanded design matrix and are therefore ignored in the modeling process.
cfg.winrej (integer) – An (n x 2) array with n from-to pairs of samples to be excluded from further processing. This is the same output as from EEGLAB's eegplot rej.
Returns the EEG structure:
* unfold.X: All elements between the from-to pairs are set to 0
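What "set to 0" means for the design matrix can be shown with a short plain-Python sketch (not the toolbox code). The function name is made up, and sample indices are 0-based and inclusive here, unlike MATLAB's 1-based indexing.

```python
# Illustrative only: zero all design-matrix rows inside rejected intervals.
# 0-based inclusive sample indices are an assumption of this sketch.

def exclude_intervals(Xdc, winrej):
    """Return a copy of Xdc with every row inside a [from, to] pair set to 0."""
    cleaned = [list(row) for row in Xdc]
    for start, stop in winrej:
        for t in range(start, min(stop, len(cleaned) - 1) + 1):
            cleaned[t] = [0] * len(cleaned[t])
    return cleaned

Xdc = [[1, 0], [0, 1], [1, 1], [0, 1], [1, 0]]
cleaned = exclude_intervals(Xdc, winrej=[[1, 2]])
```

Rows that are all zero contribute nothing to the least-squares fit, so the data recorded in those samples no longer influence the betas.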
src.uf_toolbox.uf_epoch(EEG, varargin)
Deconvolution works on continuous data; to compare it to the "normal" use case, the data have to be epoched. Because the data have not been cleaned yet, cleaning is done in this function as well. In addition, trials that were removed during epoching are removed from unfold.X. Afterwards you can use uf_glmfit_nodc to fit the model.
cfg.winrej (integer) – An (n x 2) array with n from-to pairs of samples to be excluded from further processing
cfg.timelimits (float) – min+max of the epoch in seconds
EEG (eeglab) – the EEG set; needs to have EEG.unfold.Xdc compatible with the size of EEG.data
Returns the EEG structure epoched to cfg.timelimits
EEG_epoch = uf_epoch(EEG,'winrej',winrej,'timelimits',cfgTimeexpand.timelimits)
src.uf_toolbox.uf_glmfit_nodc(EEG, varargin)
Simple function to fit a massive univariate linear model. The function expects EEG.data to be (CHAN,TIME,EPOCH), with EPOCH the same number as rows in EEG.unfold.X.
It is recommended to use uf_epoch for epoching, because you need to remove rows from EEG.unfold.X if the epoching function removed trials. Cleaning of data is also taken care of in uf_epoch.
cfg.method – (default 'pinv') 'glmnet', 'pinv', 'matlab' and 'lsmr' are available. See the uf_glmfit function for further information. With 'pinv', the linear model needs to be solved only once and can be applied to all electrodes. The other solvers iteratively solve for each electrode.
Returns a matrix (channel x pnts x predictors) of betas saved into EEG.unfold.beta_nodc
EEG = uf_glmfit_nodc(EEG)
src.uf_toolbox.uf_imputeMissing(EEG, varargin)
Deals with predictors for which some values are missing in the design matrix. You can either impute the missing values or remove the events for which some predictor values are missing.
cfg.method –
'drop' : remove the event (i.e. fill it with 0). This will lead to the event not being used for overlap correction!
'marginal' : impute from the marginal distribution of the predictor. In the future it might be interesting to implement not the marginal, but multivariate methods that preserve correlations between predictors (c.f. Horton & Kleinmann 2007)
'mean' : fill in the mean value
'median' : (Default) fill in the median value
Returns EEG.unfold.X in which missing NaN values were imputed ('marginal', 'mean', 'median') or in which the events with missing predictor information were removed ('drop'), which means set to 0
Example
EEG = uf_imputeMissing(EEG)
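The difference between imputing and dropping can be sketched in plain Python (not the toolbox code; the function names are made up). Imputation replaces the NaN by a summary of the observed values, while dropping zeroes the whole event row.

```python
# Illustrative only: median imputation vs. dropping events with missing values.
import math
from statistics import median

def impute_median(column):
    """Replace NaNs in one predictor column by the median of the observed values."""
    observed = [v for v in column if not math.isnan(v)]
    fill = median(observed)
    return [fill if math.isnan(v) else v for v in column]

def drop_events(X):
    """Set every row (event) that contains a NaN to all zeros."""
    return [[0.0] * len(row) if any(math.isnan(v) for v in row) else list(row)
            for row in X]

nan = float("nan")
imputed = impute_median([1.0, nan, 3.0, 10.0])
dropped = drop_events([[1.0, nan], [2.0, 3.0]])
```

With 'drop'-style handling the event no longer appears in the model at all, which is why the description above warns that it is then not used for overlap correction.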
src.uf_toolbox.uf_designmat_addcol(EEG, newcol, label, variablettype)
Sometimes it is useful to add e.g. continuous predictors manually. Note that this is somewhat experimental, and not all functions support it.
newcol (array) – The column(s) to add to the Xdc designmat
label (string) – The label/identifier of the column
eventtype (string) – (optional, default nan) the eventtype. E.g. for trf this should be {'trf'}. If you do not manually timeexpand, this should be nan … column … while reshaping.
Returns the EEG structure:
* unfold.Xdc: added column
* unfold.colnames: added label
src.uf_toolbox.uf_designmat_spline(EEG, varargin)
Helper function to generate the spline part of the design matrix
name – … spline predictor
paramValues – … splines should be calculated and evaluated. E.g. [-3 4,1,2,3, … 4]
nsplines – … overfitting, too few to underfitting. This number will be transformed into the number of knots later. As different spline functions have different requirements on the number of knots, we specify this number and fix the number of knots later.
knotsequence – … sequence of knots explicitly (else they are put on the quantiles or linearly, see splinespacing). An example would be [0,1,2,3,5,10,11,12,13]. This example could make sense if there is lots of data at predictor values 0-5 and again at 10-13. Defining the knot sequence explicitly is also useful when you want to estimate the same betas for all subjects directly. But beware of subject-specific ranges: not all subjects have the same range in their covariates.
splinespacing – … of the knots along the …
splinefunction – … function. This in principle also allows the use of polynomial regression
unfold.X: new entries for the spline
unfold.splines: new entry for the spline
In addition, the following fields of unfold are updated: colnames, variablenames, cols2variablenames, cols2eventtypes, variabletypes
spl: Same as EEG.unfold.spline{end}
nanlist: the paramValues that were nan (same as 'isnan(spl.paramValues)')
Example
EEG = uf_designmat_spline(EEG,'name','splineA','paramValues',[EEG.event.splineA],'nsplines',10,'splinespacing','linear');
EEG = uf_designmat_spline(EEG,'name','splineB','paramValues',[EEG.event.splineB],'knotsequence',linspace(0,2*pi,15),'splinefunction','cyclical');
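To show how 'linear' and 'quantiles' spacing differ for a skewed predictor, here is a minimal plain-Python sketch. It is not the toolbox's knot-placement code: the helper names are made up, and the quantile rule used (Python's `statistics.quantiles` with the inclusive method) is an assumption that may differ from the toolbox's heuristic.

```python
# Illustrative only: linear vs. quantile-based knot placement.
# The exact quantile rule is an assumption, not the toolbox's heuristic.
from statistics import quantiles

def knots_linear(values, n_knots):
    """Equally spaced knots between min and max of the predictor."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (n_knots - 1)
    return [lo + i * step for i in range(n_knots)]

def knots_quantile(values, n_knots):
    """Boundary knots at min/max, inner knots at equally spaced quantiles."""
    inner = quantiles(values, n=n_knots - 1, method="inclusive")
    return [min(values)] + inner + [max(values)]

values = [0, 1, 2, 3, 4, 100]            # hypothetical skewed predictor values
linear = knots_linear(values, 5)
by_quantile = knots_quantile(values, 5)
```

With linear spacing most knots land in the empty range between 4 and 100; quantile spacing puts them where the data are.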
src.uf_toolbox.uf_predictContinuous(ufresult, varargin)
This is similar to a predict function, but does not add the marginal of the other parameters. For that, please use uf_addmarginal().
Model estimates / parameters are defined for each time point and electrode and can also encompass multiple betas (in the case of spline predictors), so evaluating them is non-trivial; this function takes care of it. Note that this will overwrite the ufresult.beta field.
cfg.predictAt (cell) – One entry per parameter: {{‘par1’,[10 20 30]},{‘par2’,[0,1,2]}}. This evaluates parameter 1 at the values 10,20 and 30. Parameter 2 at 0, 1 and 2. Default behaviour: evaluates 7 linearly spaced values between the min + max. of the parameterdomain
cfg.auto_method (string) – 'quantile' (default), 'linear' or 'average'. 'quantile': the auto_n values are placed on the quantiles of the predictor. 'linear': the auto_n values are placed linearly over the range of the predictor. 'average': only evaluates at the average of the predictor; this is useful if you are interested in the marginal response.
cfg.auto_n (integer) – default 10; the number of automatically evaluated values
Returns the ufresult structure with the betas evaluated at the specified continuous values.
You calculated for a continuous variable “parameterA” a beta of 3. You want to know what the predicted signal of parameterA = [10,20,30] is. You call the function:
ufresult = uf_predictContinuous(ufresult,'predictAt',{{'parameterA',[10 20 30]}})
The output then would be the respective values 30,60 and 90.
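The arithmetic of this example, for a purely linear predictor, can be written out in a few lines of plain Python (not the toolbox code; the function name is made up, and spline predictors would require evaluating the spline basis instead of a single multiplication).

```python
# Illustrative only: evaluating a linear predictor's beta at chosen values.

def predict_continuous(beta_timecourse, predict_at):
    """Return one predicted time course per requested predictor value."""
    return {value: [b * value for b in beta_timecourse] for value in predict_at}

# A beta of 3 at a single time point, evaluated at 10, 20 and 30.
predicted = predict_continuous(beta_timecourse=[3.0], predict_at=[10, 20, 30])
```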
src.uf_toolbox.uf_addmarginal(ufresult, varargin)
Adds the marginal of the other predictors (i.e. continuous and spline predictors) to the beta estimates. Important: if dummy-coded (i.e. non-effect-coded) predictors and interactions exist, they are NOT added to the marginal effect. That is, the method returns the average ERP evaluated at the average of all spline/continuous predictors. The categorical predictors are kept at their reference level (which is X = 0; with dummy coding this is the reference, with sum/contrast coding this is the mean of the group means). Interactions are also ignored. This can be problematic, and in that case you have to calculate the marginal for your model by hand!
Note: This calculates the marginal effect at the mean (MEM), that is f(E(x)). In other words, it calculates the effect at the average of the continuous covariate. An alternative would be the average marginal effect (AME), E(f(x)), in other words the average effect of the predictor over all continuous covariate values. The latter is not implemented.
ufresult – unfold result structure generated by uf_condense()
cfg.channel (numeric) – (default: all) Calculate only for a subset of channels
cfg.betaSetname (string) – (default "beta" = deconvolution model) indicates which unfold.(field) to use (i.e. ufresult.beta for deconvolution vs. ufresult.beta_nodc for a massive univariate model)
… marginal effect (AME) of each spline/continuous predictor. "MEM" default option.
Example
For instance, the model 1 + cat(facA) + continuousB has the betas: intercept, facA==1, continuousB-slope
intercept: response with facA = 0 and continuousB = 0
facA==1: differential effect of facA == 1 (against facA==0)
continuousB-slope: the slope of continuousB
Using uf_predictContinuous(), we evaluate the continuous predictor at [0 50 100]. The beta output of uf_predictContinuous then means the following:
intercept: same as before
facA==1: same as before
continuousB@0: the differential effect if continuousB is 0
continuousB@50: the differential effect if continuousB is 50
continuousB@100: the differential effect if continuousB is 100
Using uf_addmarginal(), the average response is added to all predictors:
intercept: the response of facA==0 AND continuousB@mean(continuousB)
facA==1: the response of facA==1 AND continuousB@mean(continuousB)
continuousB@0: the response of facA==0 if continuousB is 0
continuousB@50: the response of facA==0 if continuousB is 50
continuousB@100: the response of facA==0 if continuousB is 100
Note that mean(continuousB) does not need to be a number we evaluated in the uf_predictContinuous step
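The example can be traced numerically with a small plain-Python sketch (not the toolbox code). All beta values and predictor values are invented, and the model is reduced to a single time point and channel.

```python
# Illustrative only: the "1 + cat(facA) + continuousB" example with made-up numbers.
from statistics import mean

intercept, facA, slopeB = 2.0, 0.5, 0.25   # hypothetical betas at one time point
continuousB = [0, 20, 40, 100]             # hypothetical predictor values of all events
at = [0, 50, 100]                          # values used in the predictContinuous step

# After the predictContinuous-like step: differential effect of B at each value.
effectB = {v: slopeB * v for v in at}

# Marginal effect at the mean (MEM): effect of B evaluated at mean(continuousB).
marginalB = slopeB * mean(continuousB)

# After the addmarginal-like step: every entry becomes a full response.
with_marginal = {
    "intercept": intercept + marginalB,         # facA==0, B at its mean
    "facA==1": intercept + facA + marginalB,    # facA==1, B at its mean
}
for v in at:
    with_marginal[f"continuousB@{v}"] = intercept + effectB[v]  # facA==0, B = v
```

Here mean(continuousB) is 40, which is not one of the evaluated values, matching the note above. For a linear predictor the MEM and the AME coincide; they differ only for non-linear (e.g. spline) effects.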
src.uf_toolbox.uf_unfold2csv(ufresult, varargin)
Returns a data table
cfg.deconv (boolean) – Use the unfold betas (unfold.beta_dc) or the no-unfold betas (unfold.beta_nodc)
cfg.channel (integer) – (Default: all channels) Limit to a list of specific channels
cfg.filename – filename for the csv file. If empty, only the table is returned
Each observation (voltage/beta) gets one row; channels, predictors etc. each get one column
Returns a data table in the "tidy" format
Example
uftable = uf_unfold2csv(ufresult,'filename','output.csv')
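What a "tidy" layout looks like for a (channel x time x parameter) beta array can be sketched in plain Python (not the toolbox code). The column names and values are hypothetical and need not match the columns the toolbox writes.

```python
# Illustrative only: flatten beta[channel][time][param] into one row per observation.
import csv
import io

def to_tidy(beta, times, param_names):
    """Return a list of dict rows, one per (channel, time, parameter) combination."""
    rows = []
    for c, chan in enumerate(beta):
        for t, sample in enumerate(chan):
            for p, value in enumerate(sample):
                rows.append({"channel": c + 1, "time": times[t],
                             "param": param_names[p], "beta": value})
    return rows

beta = [[[1.0, 0.2], [1.5, 0.3]]]   # 1 channel x 2 time points x 2 parameters
rows = to_tidy(beta, times=[-0.1, 0.0], param_names=["(Intercept)", "continuousA"])

# Write the rows as CSV into an in-memory buffer instead of a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["channel", "time", "param", "beta"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```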
src.uf_toolbox.uf_plotEventCorrmat(EEG, varargin)
It is possible to subselect the eventtype. Planned feature: allow plotting only the EEG.unfold.X field
eventtypes (cell) – Subselect the eventtypes; by default all are chosen
plot (0/1) – whether the corrmat should be plotted (default) or only returned (deprecated: "figure")
Returns the correlationMatrix
Example
uf_plotEventCorrmat(EEG)
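The quantity being plotted is a correlation matrix between event fields. A minimal plain-Python sketch follows; it is not the toolbox code, it assumes Pearson correlation, and the predictor names and values are made up.

```python
# Illustrative only: Pearson correlation matrix between predictor columns.
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def corrmat(columns):
    """columns: dict name -> values; returns a square matrix in dict order."""
    names = list(columns)
    return [[pearson(columns[a], columns[b]) for b in names] for a in names]

columns = {"contrast": [1, 2, 3, 4], "size": [2, 4, 6, 8], "noise": [1, -1, -1, 1]}
R = corrmat(columns)
```

Here "contrast" and "size" are perfectly correlated, the kind of collinearity such a plot is meant to reveal before fitting.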
src.uf_toolbox.uf_plotEventHistogram(EEG, varargin)
This function also adds a density estimate
cfg.eventtypes – Restrict the histogram to specific eventtypes
Return:
uf_plotEventHistogram(EEG,'eventA')
src.uf_toolbox.uf_plotDesignmat(EEG, varargin)
Plots the design matrix. If the matrix is very large (the time-expanded/Xdc matrix), we do not plot everything, but only the middle 60 s. In addition (for timeexpand) we plot the events as horizontal lines.
cfg.timeexpand (boolean) – 0: Plots EEG.unfold.X (default) 1: Plots EEG.unfold.Xdc
cfg.logColor (boolean) – plot the color on a log scale (default 0)
cfg.sort (boolean) – Sort the design matrix (only possible for X, not Xdc)
cfg.figure (1/0) – Open a new figure (default 1)
uf_plotDesignmat(EEG)
uf_plotDesignmat(EEG,'sort',1)
uf_plotDesignmat(EEG,'timeexpand',1) %plot the timeexpanded X
src.uf_toolbox.uf_plotParam(ufresult, varargin)
… predictor, where there are multiple lines for each predictor
‘ufresult’ needs to contain the ‘ufresult’ structure, the output from uf_condense()
Uses the ‘gramm’-toolbox for plotting
'channel' (integer) – Which channel to plot
'predictAt' (cell) – a cell of cell arrays, e.g. {{'parName',linspace(0,10,5)},{'parname2',1:5}}. This is a shortcut to uf_predictContinuous. We generally recommend calling that function explicitly.
'deconv' ([-1 0 1]) – default: -1; whether to plot ufresult.beta (1), ufresult.beta_nodc (0), or everything/autodetect (-1). Autodetect also detects other same-shaped predictors, e.g. if you want to compare multiple runs from different algorithms or similar.
'add_intercept' (boolean) – Add the intercept/constant to each subplot. This gives the ERP plots that are commonly used. Without add_intercept, the factors (if they are categorical) can be interpreted as difference plots or sometimes main-effect plots (if effects coding is used).
'baseline' (2 integers) – default none; performs a baseline correction on the given interval (in seconds = ufresult.times units).
'include_intercept' (boolean) – default 0; useful with "add_intercept", will add the constant/intercept to each subplot
'plotSeparate' ('all','event','none') – Each predictor will be plotted in a separate figure (‘all’), plotted in an event-specific figure (‘event’) or all subplots are in the same figure (‘none’, default)
'plotParam' (string/cell of strings) – Defines which parameters are to be plotted
'sameyaxis' ('all','row','independent') – Force the same y-axis (default ‘all’)
'gramm' – (gramm-object) plots the current data on top of the last gramm-object. This is useful to plot multiple subjects in a single figure.
'figure' (boolean) – Generate a new figure? (default 1)
Returns allAxesInFigure: all 'subplot' axes that were generated
uf_plotParam(ufresult,'channel',1)
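The 'baseline' option subtracts the mean of a time interval from a beta time course. A minimal plain-Python sketch of that operation follows (not the toolbox code; the function name is made up and treating the interval as inclusive on both ends is an assumption).

```python
# Illustrative only: baseline correction of one time course.
# The interval is treated as inclusive on both ends (an assumption).

def baseline_correct(timecourse, times, baseline):
    """Subtract the mean over the baseline interval (in seconds) from all samples."""
    lo, hi = baseline
    base = [v for v, t in zip(timecourse, times) if lo <= t <= hi]
    offset = sum(base) / len(base)
    return [v - offset for v in timecourse]

times = [-0.2, -0.1, 0.0, 0.1, 0.2]
corrected = baseline_correct([1.0, 3.0, 2.0, 6.0, 4.0], times, baseline=(-0.2, 0.0))
```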
src.uf_toolbox.uf_plotParam2d(ufresult, varargin)
… third dimension. This function plots an imagesc plot of time vs. the parameter of choice
'plotParam' – Name of the parameter to be plotted. Can be empty to plot all spline/continuous parameters
'add_intercept' – add the intercept to the plot, default 0
'channel' – Specify which channel index to plot
'betaSetName' – Default 'beta'. Can be any field of the ufresult struct
'caxis' – Default []; specifies the color axis
Example: uf_plotParam2d(ufresult,'plotParam','continuosPredictorA')
src.uf_toolbox.uf_plotParamTopo(ufresult, varargin)
If you are not interested in differences but in the predicted cells, it might be helpful to run uf_addmarginal() first. Then you do not only plot the simple/main effect; the intercept is added to the difference, resulting in both conditions.
'plotParam' – cell array of parameters to be plotted; if empty, plots all
'n_topos' – number of topographies to plot
'channel' – plot only a subset of channels
'baseline' (2 integers) – default none; performs a baseline correction on the given interval (in seconds = ufresult.times units).
'caxis' ('same', default []) – if 'same', generates the same color axis based on the 95th percentile of the selected beta values. Can be customized to any caxis, e.g. [-3 5]
'betaSetName' – Default 'beta'. Can be any field of the ufresult struct
'figure' – plot in a new figure (1) or the old one (0), default: (1)
Returns a structure of all plotting axes.
uf_plotParamTopo(ufresult,'plotParam',{'FactorX','FactorC'})
src.uf_toolbox.uf_plot2nd(d2nd, varargin)
This function allows plotting multiple subjects at the same time. It requires the data to be in the following format: ufresult.beta(CHAN,TIME,PARAM,SUBJECT)
Each line is one subject; it is possible to calculate confidence intervals
cfg.channel – Which channel to use
cfg.plotParam – Defines which parameters are to be plotted (see uf_plotParam)
cfg.bootci – (default 1) calculate and plot bootstrapped confidence intervals
cfg.singlesubjects – (default 1) plot the singlesubject lines
... – Other parameters are linked to uf_plotParam
nothing
Example
uf_plot2nd(ufresult2nd,'channel',2)