Title: | Group Iterative Multiple Model Estimation |
Version: | 0.9.1 |
Date: | 2025-06-26 |
Maintainer: | Kathleen M Gates <gateskm@email.unc.edu> |
Depends: | R (≥ 3.5.0) |
Imports: | lavaan (≥ 0.6-17), igraph (≥ 1.0-0), qgraph (≥ 1.9.8), data.tree, MIIVsem (≥ 0.5.4), imputeTS (≥ 3.0), nloptr, graphics, stats, MASS, tseries, utils |
Description: | Data-driven approach for arriving at person-specific time series models. The method first identifies which relations replicate across the majority of individuals to detect signal from noise. These group-level relations are then used as a foundation for starting the search for person-specific (or individual-level) relations. See Gates & Molenaar (2012) <doi:10.1016/j.neuroimage.2012.06.026>. |
License: | GPL-2 |
LazyData: | true |
URL: | https://github.com/GatesLab/gimme/, https://tarheels.live/gimme/tutorials/ |
BugReports: | https://github.com/GatesLab/gimme/issues |
ByteCompile: | true |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | no |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
Encoding: | UTF-8 |
Packaged: | 2025-06-26 16:55:57 UTC; gateskm |
Author: | Stephanie Lane [aut, trl], Kathleen M Gates [aut, cre, ccp], Zachary Fisher [aut], Cara Arizmendi [aut], Peter Molenaar [aut, ccp], Edgar Merkle [ctb], Michael Hallquist [ctb], Hallie Pike [ctb], Teague Henry [ctb], Kelly Duffy [ctb], Lan Luo [ctb], Adriene Beltz [csp], Aidan Wright [csp], Jonathan Park [ctb], Sebastian Castro Alvarez [ctb] |
Repository: | CRAN |
Date/Publication: | 2025-06-26 21:30:01 UTC |
Group iterative multiple model estimation
Description
This package contains functions to automatically identify the structure of group- and individual-level networks from a range of vector autoregressive models, estimated with structural equation modeling.
Details
Researchers across varied domains gather multivariate data for each individual unit of study across multiple occasions of measurement. Generally referred to as time series (or in the social sciences, intensive longitudinal) data, examples include psychophysiological processes such as neuroimaging and heart rate variability, daily diary studies, ecological momentary assessments, data passively collected from devices such as smartphones, and observational coding of social interactions among dyads.
A primary goal for acquiring these data is to understand dynamic processes.
The gimme package contains several functions for use with these data. These functions include gimmeSEM, which provides both group- and individual-level results by looking across individuals for patterns of relations among variables. A function that provides group-level results, aggSEM, is included, as well as a function that provides individual-level results, indSEM. The major functions within the gimme package all require the user to specify the data, although many additional options exist.
Author(s)
Stephanie Lane [aut, trl],
Kathleen Gates [aut, cre],
Zachary Fisher [aut],
Cara Arizmendi [aut],
Peter Molenaar [aut],
Michael Hallquist [ctb],
Hallie Pike [ctb],
Cara Arizmendi [ctb],
Teague Henry [ctb],
Kelly Duffy [ctb],
Lan Luo [ctb],
Adriene Beltz [csp]
Maintainer: KM Gates gateskm@email.unc.edu
Hemodynamic Response Function (HRF) GIMME example.
Description
This object contains a list of simulated time series data for twenty-five individuals. Each data set has 500 time points and five variables. The fifth variable represents an onset vector for stimulation.
Usage
HRFsim
Format
A list of data frames with 25 individuals, who each have 500 observations on 5 variables.
Group-level structural equation model search.
Description
Concatenates all individual-level data files and fits a group model to the data.
Usage
aggSEM(data = "",
out = "",
sep = "",
header = "",
ar = TRUE,
plot = TRUE,
paths = NULL,
exogenous = NULL,
outcome = NULL,
conv_vars = NULL,
conv_length = 16,
conv_interval = 1,
mult_vars = NULL,
mean_center_mult = FALSE,
standardize = FALSE,
hybrid = FALSE,
VAR = FALSE)
Arguments
data |
The path to the directory where the data files are located, or the name of the list containing each individual's time series. Each file or list element must contain a single individual's data as a T (time) by p (number of variables) matrix, where the columns represent variables and the rows represent time. If in list form, each item in the list (i.e., matrix) must be named. |
out |
The path to the directory where the results will be stored (optional). If specified, a copy of the output files will be placed in the directory. If the directory at the specified path does not exist, it will be created. |
sep |
The delimiter of the data files when data are in a directory. "" indicates space-delimited, "\t" indicates tab-delimited, "," indicates comma-delimited. Only necessary to specify if reading data in from a physical directory. |
header |
Logical. Indicate TRUE for data files with a header, FALSE otherwise. Only necessary to specify if reading data in from physical directory. |
ar |
Logical. If TRUE, begins search for group model with autoregressive (AR) paths open. Defaults to TRUE. |
plot |
Logical. If TRUE, figures depicting relations among variables of interest will automatically be created. For aggregate-level plot, red paths represent positive weights and blue paths represent negative weights. Dashed lines denote lagged relations (lag 1) and solid lines are contemporaneous (lag 0). Defaults to TRUE. |
paths |
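lavaan-style syntax containing paths with which to begin model estimation (optional). That is, Y~X indicates that Y is regressed on X, or X predicts Y. Defaults to NULL. |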
exogenous |
Vector of variable names to be treated as exogenous (optional). That is, exogenous variable X can predict Y but cannot be predicted by Y. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. Defaults to NULL. |
outcome |
Vector of variable names to be treated as outcome (optional). This is a variable that can be predicted by others but cannot predict. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. |
conv_vars |
Vector of variable names to be convolved via smoothed Finite Impulse Response (sFIR). Defaults to NULL. |
conv_length |
Expected response length in seconds. For functional MRI BOLD, 16 seconds (default) is typical for the hemodynamic response function. |
conv_interval |
Interval between data acquisitions. Currently conv_length/conv_interval must be an integer. For fMRI studies, this is the repetition time. Defaults to 1. |
mult_vars |
Vector of variable names to be multiplied to explore bilinear/modulatory effects (optional). All multiplied variables will be treated as exogenous (X can predict Y but cannot be predicted by Y). Within the vector, multiplication of two variables should be indicated with an asterisk (e.g. V1*V2). If no header is used, variables should be referred to with V followed by the column number (with no separation). If a header is used, each variable should be referred to using variable names. If multiplication with the lag 1 of a variable is desired, the variable name should be followed by "lag" with no separation (e.g. V1*V2lag). Note that if multiplied variables are desired, at least one variable in the dataset must be specified as exogenous. Defaults to NULL. |
mean_center_mult |
Logical. If TRUE, the variables indicated in mult_vars will be mean-centered before being multiplied together. Defaults to FALSE. |
standardize |
Logical. If TRUE, all variables will be standardized to have a mean of zero and a standard deviation of one. Defaults to FALSE. |
hybrid |
Logical. If TRUE, enables hybrid-VAR models where both directed contemporaneous paths and contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE. |
VAR |
Logical. If TRUE, enables VAR models where contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE. |
Details
Output is a list of results if saved as an object and/or files printed to a directory if the "out" argument is used.
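As a minimal sketch of the list input described under data, the base R code below builds a named list of T by p matrices from simulated noise and row-binds them to illustrate what concatenating individuals means. The object names (sim_list, stacked) and the simulated values are illustrative assumptions, not part of the package, and the preprocessing that aggSEM performs internally may differ.

```r
set.seed(1)

# Hypothetical data: 3 individuals, each with 50 time points on 4 variables.
n_ind <- 3
n_time <- 50
n_var <- 4
sim_list <- lapply(seq_len(n_ind), function(i) {
  m <- matrix(rnorm(n_time * n_var), nrow = n_time, ncol = n_var)
  colnames(m) <- paste0("V", seq_len(n_var))
  m
})

# Each list element must be named when data are supplied as a list.
names(sim_list) <- paste0("id_", seq_len(n_ind))

# Conceptual illustration of aggregation: stack all individuals' rows.
stacked <- do.call(rbind, sim_list)
dim(stacked)

# The list could then be passed to the package (not run here):
# fit <- gimme::aggSEM(data = sim_list)
```

The same named list can also be supplied to gimmeSEM and indSEM, which document the same data argument.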
Author(s)
Stephanie Lane
Examples
## Not run:
# 'ts' is assumed to be a named list of individual time series matrices
exFit <- aggSEM(data = ts)
plot(exFit)
## End(Not run)
Create tree structures for group search solutions.
Description
Create tree structures for group search solutions.
Usage
batch.create.tree(hist, ind_hist, ind_fit, subgroup, names.ts_list, sub)
Counts number of excellent fit indices
Description
Counts number of excellent fit indices
Usage
count.excellent(indices)
Arguments
indices |
A vector of fit indices from lavaan. |
Value
The number of fit indices that are excellent.
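The idea can be sketched as follows. This is a hypothetical re-implementation for illustration, not the package code: the function name count_excellent_sketch and the cutoffs used (RMSEA and SRMR at or below .05, NNFI and CFI at or above .95) are assumptions, and the index names follow those returned by lavaan's fitMeasures().

```r
# Hypothetical re-implementation for illustration; cutoffs are assumptions.
count_excellent_sketch <- function(indices) {
  # 'indices' is a named numeric vector, e.g. from lavaan::fitMeasures().
  checks <- c(
    indices[["rmsea"]] <= 0.05,
    indices[["srmr"]]  <= 0.05,
    indices[["nnfi"]]  >= 0.95,
    indices[["cfi"]]   >= 0.95
  )
  # Count how many of the four criteria are met.
  sum(checks, na.rm = TRUE)
}

example_indices <- c(rmsea = 0.04, srmr = 0.07, nnfi = 0.96, cfi = 0.97)
n_excellent <- count_excellent_sketch(example_indices)
n_excellent  # 3: only SRMR misses its cutoff
```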
Create structure of group search solutions.
Description
Create structure of group search solutions.
Usage
create.tree(history, subgroup, individual = FALSE, all.ind = FALSE)
Determines subgroups.
Description
Determines subgroups.
Usage
determine.subgroups(
data_list,
base_syntax,
n_subj,
chisq_cutoff,
file_order,
elig_paths,
confirm_subgroup,
out_path = NULL,
sub_feature,
sub_method,
sub_sim_thresh,
hybrid,
dir_prop_cutoff
)
Arguments
data_list |
A list of all datasets. |
base_syntax |
A character vector containing syntax that never changes. |
n_subj |
The number of subjects in the sample. |
chisq_cutoff |
Cutoff that a modification index (MI) must exceed to be considered significant. |
file_order |
A data frame containing the order of the files and the names of the files. Used to merge in subgroup assignment and preserve order. |
elig_paths |
A character vector containing eligible paths that gimme is allowed to add to the model. Ensures only expected parameter changes (EPCs) from allowable paths are considered in the creation of the similarity matrix. |
confirm_subgroup |
A data frame whose first column is a character vector of data file names without extensions and whose second column is an integer vector of subgroup labels. |
Value
Returns sub object containing similarity matrix, the number of subgroups, the modularity associated with the subgroup memberships, and a data frame containing the file names and subgroup memberships.
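A rough sketch of how a similarity matrix of shared paths might be built and sparsified is given below. The input layout (a 0/1 matrix of flagged paths per individual), the object names, and the choice to subtract the smallest off-diagonal count are all assumptions made for illustration; the package's own computation also draws on EPCs and may differ in detail.

```r
# Hypothetical indicator matrix: rows = individuals, columns = candidate paths;
# 1 means the path was flagged for that individual (assumed input).
flags <- rbind(
  p1 = c(1, 1, 0, 1, 1),
  p2 = c(1, 1, 0, 0, 1),
  p3 = c(0, 0, 1, 1, 1),
  p4 = c(0, 1, 1, 1, 1)
)
colnames(flags) <- c("V1~V2", "V2~V3", "V3~V4lag", "V4~V1lag", "V1~V1lag")

# Similarity = number of flagged paths shared by each pair of individuals.
sim <- tcrossprod(flags)

# "lowest"-style thresholding: subtract the smallest pairwise count so that
# the least similar pair has similarity zero, then clear the diagonal.
min_shared <- min(sim[upper.tri(sim)])
sim_thr <- sim - min_shared
diag(sim_thr) <- 0
sim_thr
```

A community detection method from igraph (for example Walktrap) would then be applied to a weighted graph built from the thresholded matrix.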
Provides unique combinations of two vectors.
Description
Provides unique combinations of two vectors.
Usage
expand.grid.unique(x, y, incl.eq = TRUE)
Arguments
x |
A character vector containing variable names. |
y |
A character vector containing variable names. |
incl.eq |
Logical. TRUE means that combinations are kept where a variable appears twice. |
Value
The unique combinations of the variable names. Used in syntax creation.
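The documented behaviour can be approximated in base R as shown below. The helper unique_pairs is hypothetical and is not the package's implementation; it treats (a, b) and (b, a) as the same combination and uses incl.eq to decide whether a variable may be paired with itself.

```r
# Hypothetical helper mirroring the documented behaviour; not the package code.
unique_pairs <- function(x, y, incl.eq = TRUE) {
  grid <- expand.grid(x = x, y = y, stringsAsFactors = FALSE)
  if (!incl.eq) grid <- grid[grid$x != grid$y, , drop = FALSE]
  # Treat (a, b) and (b, a) as the same combination by sorting within rows.
  key <- apply(grid, 1, function(r) paste(sort(r), collapse = "|"))
  out <- grid[!duplicated(key), , drop = FALSE]
  rownames(out) <- NULL
  out
}

vars <- c("V1", "V2", "V3")
with_self <- unique_pairs(vars, vars)                     # 6 combinations
without_self <- unique_pairs(vars, vars, incl.eq = FALSE) # 3 combinations
nrow(with_self)
nrow(without_self)
```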
Wrapup, create output files.
Description
Wrapup, create output files.
Usage
final.org(
dat,
grp,
sub,
sub_spec,
diagnos = FALSE,
store,
confirm_subgroup,
elig_paths = NULL
)
Arguments
dat |
A list containing information created in setup(). |
grp |
A list containing group-level information. NULL in aggSEM and indSEM. |
sub |
A list containing subgroup information. |
sub_spec |
A list containing information specific to each subgroup. |
store |
A list containing output from indiv.search(). |
elig_paths |
If subgroup = TRUE, eligible paths for the potential individual-level search. |
Value
Aggregated information, such as counts, levels, and plots.
Attempt to fit lavaan model.
Description
Attempt to fit lavaan model.
Usage
fit.model(syntax, data_file)
Arguments
syntax |
A character vector containing syntax. |
data_file |
A data frame containing individual data set. |
Value
If successful, returns fitted lavaan object. If not successful, catches and returns error message.
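The catch-and-return pattern described under Value can be sketched with tryCatch. The wrapper safe_fit and its fit_fun argument are hypothetical stand-ins for a model-fitting call such as one to lavaan; they are assumptions for illustration, not the package internals.

```r
# Hypothetical wrapper: 'fit_fun' stands in for a model-fitting call;
# it is an assumption, not the package internals.
safe_fit <- function(fit_fun, ...) {
  tryCatch(
    fit_fun(...),
    # On failure, return the error message instead of stopping.
    error = function(e) conditionMessage(e)
  )
}

# A call that succeeds returns its result unchanged.
ok <- safe_fit(function(x) sqrt(x), x = 16)
ok

# A call that fails returns the error message as a character string.
bad <- safe_fit(function() stop("model did not converge"))
bad
```

With this design a caller can test is.character() on the result to decide whether the fit failed, so one problematic individual does not halt a search across the whole sample.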
Grabs final coefficients for each individual.
Description
Grabs final coefficients for each individual.
Usage
get.params(dat, grp, ind, k, ms.print = TRUE)
Arguments
dat |
A list containing information created in setup(). |
grp |
A list containing group-level information. NULL in aggSEM and indSEM. |
ind |
A list containing individual- and (potentially) subgroup-level information. |
k |
The counter indicating the individual. |
Value
Individual-level information on fit, coefficients, and plots.
Group iterative multiple model estimation.
Description
This function identifies structural equation models for each individual that consist of both group-level and individual-level paths.
Usage
gimmeSEM(data = NULL,
out = NULL,
sep = NULL,
header = NULL,
ar = TRUE,
plot = TRUE,
subgroup = FALSE,
sub_feature = "lag & contemp",
sub_method = "Walktrap",
sub_sim_thresh = "lowest",
confirm_subgroup = NULL,
paths = NULL,
exogenous = NULL,
outcome = NULL,
conv_vars = NULL,
conv_length = 16,
conv_interval = 1,
mult_vars = NULL,
mean_center_mult = FALSE,
standardize = FALSE,
groupcutoff = .75,
subcutoff = .75,
diagnos = FALSE,
ms_allow = FALSE,
ms_tol = 1e-5,
lv_model = NULL,
lv_estimator = "miiv",
lv_scores = "regression",
lv_miiv_scaling = "first.indicator",
lv_final_estimator = "miiv",
lasso_model_crit = NULL,
hybrid = FALSE,
VAR = FALSE,
dir_prop_cutoff =0,
ordered = NULL,
group_correct = "Bonferoni Group")
Arguments
data |
The path to the directory where the data files are located, or the name of the list containing each individual's time series. Each file or list element must contain a single individual's data as a T (time) by p (number of variables) matrix, where the columns represent variables and the rows represent time. Individuals must have the same variables (p) but can have different lengths of observations (T). |
out |
The path to the directory where the results will be stored (optional). If specified, a copy of the output files will be placed in the directory. If the directory at the specified path does not exist, it will be created. |
sep |
The delimiter of the data files. Follows R convention: "" indicates space-delimited, "\t" indicates tab-delimited, "," indicates comma-delimited. Only necessary to specify if reading data in from a physical directory. |
header |
Logical. Indicate TRUE for data files with a header. Only necessary to specify if reading data in from physical directory. |
ar |
Logical. If TRUE, begins search for group model with autoregressive (AR) paths freed for estimation. If ms_allow=TRUE, it is recommended to set ar=FALSE. Multiple solutions are unlikely to be found when ar=TRUE. Defaults to TRUE. |
plot |
Logical. If TRUE, graphs depicting relations among variables of interest will automatically be created. Solid lines represent contemporaneous relations (lag 0) and dashed lines reflect lagged relations (lag 1). For individual-level plots, red paths represent positive weights and blue paths represent negative weights. Width of paths corresponds to estimated path weight. For the group-level plot, black represents group-level paths, grey represents individual-level paths, and (if subgroup = TRUE) green represents subgroup-level paths. For the group-level plot, the width of the edge corresponds to the count. Defaults to TRUE. |
subgroup |
Logical. If TRUE, subgroups are generated based on similarities in model features using the community detection method given in sub_method. Defaults to FALSE. |
sub_feature |
Option to indicate feature(s) used to subgroup individuals. Defaults to "lag & contemp" for lagged and contemporaneous, which is the original method. Can use "lagged" or "contemp" to subgroup solely on features related to lagged and contemporaneous relations, respectively. |
sub_method |
Community detection method used to cluster individuals into subgroups. Options align with those available in the igraph package: "Walktrap" (default), "Infomap", "Louvain", "Edge Betweenness", "Label Prop", "Fast Greedy", "Leading Eigen", and "Spinglass". |
sub_sim_thresh |
Threshold for inducing sparsity in similarity matrix. Options are: the percent of edges in the similarity matrix to set to zero (e.g., .25 would set the lower quartile), "lowest" (default) subtracts the minimum value from all values, and "search" searches across thresholds to arrive at one providing highest modularity. |
confirm_subgroup |
Data frame. Option only available when subgroup = TRUE. The data frame should contain two columns. The first column should specify file labels (the names of the data files without file extension), and the second should contain integer values (beginning at 1) specifying the subgroup membership for each individual. |
paths |
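lavaan-style syntax containing paths with which to begin model estimation (optional). That is, Y~X indicates that Y is regressed on X, or X predicts Y. Defaults to NULL. |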
exogenous |
Vector of variable names to be treated as exogenous (optional). That is, exogenous variable X can predict Y but cannot be predicted by Y. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. The default for exogenous variables is that lagged effects of the exogenous variables are not included in the model search. If lagged paths are wanted, "&lag" should be added to the end of the variable name with no separation. Defaults to NULL. |
outcome |
Vector of variable names to be treated as outcome (optional). This is a variable that can be predicted by others but cannot predict. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. |
conv_vars |
Vector of variable names to be convolved via smoothed Finite Impulse Response (sFIR). Note that conv_vars are not automatically considered exogenous variables. To treat conv_vars as exogenous, use the exogenous argument. Variables listed in conv_vars must be binary variables. Lagged variables cannot be convolved. If there are missing data in the endogenous variables, their values will be imputed for the convolution operation only. Defaults to NULL. |
conv_length |
Expected response length in seconds. For functional MRI BOLD, 16 seconds (default) is typical for the hemodynamic response function. |
conv_interval |
Interval between data acquisition. Currently conv_length/conv_interval must be an integer. For fMRI studies, this is the repetition time. Defaults to 1. |
mult_vars |
Vector of variable names to be multiplied to explore bilinear/modulatory effects (optional). All multiplied variables will be treated as exogenous (X can predict Y but cannot be predicted by Y). Within the vector, multiplication of two variables should be indicated with an asterisk (e.g. V1*V2). If no header is used, variables should be referred to with V followed by the column number (with no separation). If a header is used, each variable should be referred to using variable names. If multiplication with the lag 1 of a variable is desired, the variable name should be followed by "lag" with no separation (e.g. V1*V2lag). |
mean_center_mult |
Logical. If TRUE, the variables indicated in mult_vars will be mean-centered before being multiplied together. Defaults to FALSE. |
standardize |
Logical. If TRUE, all variables will be standardized to have a mean of zero and a standard deviation of one. Defaults to FALSE. |
groupcutoff |
Cutoff value for group-level paths. Defaults to .75, indicating that a path must be significant across 75% of individuals to be included as a group-level path. |
subcutoff |
Cutoff value for subgroup-level paths. Defaults to .75, indicating that a path must be significant across at least 75% of the individuals in a subgroup to be considered a subgroup-level path. |
diagnos |
Logical. If TRUE provides internal output for diagnostic purposes. Defaults to FALSE. |
ms_allow |
Logical. If TRUE provides multiple solutions when more than one path has identical modification index values. When ms_allow=TRUE, it is recommended to set ar=FALSE. Multiple solutions are unlikely to be found when ar=TRUE. Additionally, subgroup should be set to FALSE. Output files for individuals with multiple solutions will represent the last solution found for the individual, not necessarily the best solution for the individual. |
ms_tol |
Precision used when evaluating similarity of modification indices when ms_allow = TRUE. We recommend that ms_tol not be greater than the default, especially when standardize=TRUE. Defaults to 1e-5. |
lv_model |
Invoke latent variable modeling by providing the measurement model syntax here. lavaan conventions are used for relating observed variables to factors. Defaults to NULL. |
lv_estimator |
Estimator used for factor analysis. Options are "miiv" (default), "pml" (pseudo-ML) or "svd". |
lv_scores |
Method used for estimating latent variable scores from parameters obtained from the factor analysis when lv_model is not NULL. Options are: "regression" (Default), "bartlett". |
lv_miiv_scaling |
Type of scaling indicator to use when "miiv" selected for lv_estimator. Options are "first.indicator" (Default; the first observed variable in the measurement equation is used), "group" (best one for the group), or "individual" (each individual has the best one for them according to R2). |
lv_final_estimator |
Estimator for final estimations. "miiv" (Default) or "pml" (pseudo-ML). |
lasso_model_crit |
When not NULL, invokes the multiLASSO approach for the GIMME model search procedure. The value gives the model selection criterion: 'bic' (select on BIC), 'aic', 'aicc', 'hqc', or 'cv' (cross-validation). |
hybrid |
Logical. If TRUE, enables hybrid-VAR models where both directed contemporaneous paths and contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE. |
VAR |
Logical. If TRUE, enables VAR models where contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE. |
dir_prop_cutoff |
Option to require that the directionality of a relation has to be higher than the reverse direction for a prespecified proportion of individuals. Defaults to 0. |
ordered |
A character vector containing the names of all ordered categorical variables in the model. |
group_correct |
Indicate how to correct for multiple testing. "Bonferoni Group" (Default) corrects the alpha value for the number of people (N) in the sample; "Bonferoni Paths" corrects according to the number of eligible paths for that individual; a numeric value greater than 0 and less than 1 can be entered to indicate the alpha level desired. |
Details
Output is a list of results if saved as an object and/or files printed to a directory if the "out" argument is used.
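As a minimal sketch of the confirm_subgroup input described under Arguments, the base R code below builds a two-column data frame of file labels and integer subgroup memberships. The labels, column names, and the object my_list in the commented call are hypothetical.

```r
# Hypothetical file labels (data file names without extension).
file_labels <- c("sub_01", "sub_02", "sub_03", "sub_04")

# Two columns: labels first, integer subgroup membership (starting at 1) second.
confirm_df <- data.frame(
  file     = file_labels,
  subgroup = c(1L, 1L, 2L, 2L),
  stringsAsFactors = FALSE
)

# Basic checks matching the documented requirements.
stopifnot(ncol(confirm_df) == 2,
          is.integer(confirm_df$subgroup),
          min(confirm_df$subgroup) == 1L)

# Not run: passing the data frame to the package.
# fit <- gimme::gimmeSEM(data = my_list, subgroup = TRUE,
#                        confirm_subgroup = confirm_df)
```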
Value
A list with the following components:
data: list of data used in analyses. Contains lagged variables and any data manipulations done within gimme.
path_est_mats: N matrices of individual-level coefficient estimates for directed paths.
varnames: Variable names in order of data.
n_vars_total: total number of variables.
n_lagged: total number of lagged variables.
n_endog: total number of endogenous variables.
fit: Final fit indices, R-squared for each variable, convergence status, subgroup membership (if applicable), and modularity (if applicable).
path_se_est: Matrix of all individuals' unstandardized & standardized coefficient estimates, standard errors, level of each relation (e.g., "group"), and subgroup membership.
plots: If number of variables >3, N individual-level plots of directed lagged and contemporaneous relations among variables. Red = high / hot / positive values; Blue = low / cold / negative values. Line width corresponds with absolute value of beta estimate. Use plot() function.
plots_cov: If number of variables >3 and hybrid = TRUE or VAR = TRUE, N plots of contemporaneous covariances among residuals.
group_plot_paths: If number of variables >3, aggregated plot of directed relations. Black = group-level, Green = subgroup level, Grey = individual level. Line width corresponds with percent of individuals who have that path estimated.
group_plot_cov: If number of variables >3 and hybrid = TRUE or VAR = TRUE, aggregated plot of covariances among residuals.
sub_plots_paths: Aggregated directed subgroup plots for K subgroups if applicable.
sub_plots_cov: Aggregated covariance subgroup plots for K subgroups if applicable.
path_counts: Matrix containing counts of the number of people for whom a given directed relation is estimated.
path_counts_sub: K matrices containing counts of the number of people within that subgroup for whom a given directed relation is estimated.
cov_counts: Matrix containing counts of the number of people for whom a given covariance relation is estimated.
cov_counts_sub: K matrices containing counts of the number of people within that subgroup for whom a given covariance relation is estimated.
vcov: N matrices containing the estimated covariance matrix of parameters of interest in gimme.
vcovfull: N matrices of the full estimated covariance matrix of parameters.
psi: N standardized residual covariance matrices.
ps_unstd: N unstandardized residual covariance matrices.
sim_matrix: If subgroup = TRUE, similarity count matrix of how many edges are in common among each pair of individuals after the group-level search (also considers individual-level paths that may be added later via the EPC).
syntax: N individual slices containing lavaan-style syntax.
lvgimme: If provided, the latent variable model syntax (also included in the above).
rf_est: If variables to convolve are provided in conv_vars, the N response function estimates for individuals.
arguments: List of arguments provided by the user.
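The data component is described as containing lagged variables. A minimal sketch of what a lag-1 augmentation looks like is given below; the helper add_lag1 is hypothetical, and the column order (lagged columns first) is an assumption. The "lag" suffix matches the naming used in path syntax such as V4lag.

```r
# Hypothetical helper: append lag-1 copies of every column, named with a
# "lag" suffix (the suffix matches the path syntax, e.g. V4lag).
add_lag1 <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  lagged  <- x[-n, , drop = FALSE]   # values at time t - 1
  current <- x[-1, , drop = FALSE]   # values at time t
  colnames(lagged) <- paste0(colnames(x), "lag")
  cbind(lagged, current)
}

x <- cbind(V1 = 1:5, V2 = c(10, 20, 30, 40, 50))
lagged_x <- add_lag1(x)
lagged_x
```

One row is lost in the process, so an individual with T observations contributes T - 1 rows to the lagged data.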
Author(s)
Zachary Fisher, Kathleen Gates, & Stephanie Lane
References
Gates, K.M. & Molenaar, P.C.M. (2012). Group search algorithm recovers effective connectivity maps for individuals in homogeneous and heterogeneous samples. NeuroImage, 63, 310-319.
Lane, S.T. & Gates, K.M. (2017). Automated selection of robust individual-level structural equation models for time series data. Structural Equation Modeling.
Beltz, A.M. & Molenaar, P.C.M. (2016). Dealing with multiple solutions in structural vector autoregressive models. Multivariate Behavioral Research, 51(2-3), 357-373.
Examples
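## Not run:
# Paths to be estimated from the start of the search
paths <- 'V2 ~ V1
          V3 ~ V4lag'

fit <- gimmeSEM(data     = simData,
                out      = "C:/simData_out",
                subgroup = TRUE,
                paths    = paths)

# Summaries for the full sample and for a subgroup
print(fit, mean = TRUE)
print(fit, subgroup = 1, mean = TRUE)
# Estimates for one individual, fit measures for a subgroup
print(fit, file = "group_1_1", estimates = TRUE)
print(fit, subgroup = 2, fitMeasures = TRUE)
# Plot for one individual
plot(fit, file = "group_1_1")
## End(Not run)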
Write MS-GIMME results to data.frame.
Description
Write MS-GIMME results to data.frame.
Usage
gimmems.write(x)
Identifies highest MI from list of MIs.
Description
Identifies highest MI from list of MIs.
Usage
highest.mi(
mi_list,
indices,
elig_paths,
prop_cutoff,
n_subj,
chisq_cutoff,
allow.mult,
ms_tol,
hybrid,
dir_prop_cutoff
)
Arguments
mi_list |
A list of MIs across individuals |
indices |
A list of fit indices. Only relevant at the individual-level. |
elig_paths |
A character vector containing eligible paths that gimme is allowed to add to a model (e.g., no nonsense paths). |
prop_cutoff |
The proportion of individuals for whom a path must be significant in order for it to be added to the models. NULL if used at the individual-level. |
n_subj |
The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1. |
chisq_cutoff |
Cutoff used in order for MI to be considered significant. Value varies depending on stage of search (e.g., group, subgroup, individual). |
Value
Returns name of parameter associated with highest MI. If no MI meets the criteria, returns NA.
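The selection rule described by these arguments can be sketched as follows. This is an assumed simplification, not the package code: the function pick_highest_mi, the matrix layout, the tie-break on summed MIs, and the Bonferroni-style cutoff on a 1-df chi-square are all illustrative choices.

```r
# Hypothetical sketch: 'mi_mat' holds one row per individual and one column
# per eligible path; entries are modification indices (assumed layout).
pick_highest_mi <- function(mi_mat, prop_cutoff, chisq_cutoff) {
  sig    <- mi_mat > chisq_cutoff             # significant MI per person
  counts <- colSums(sig, na.rm = TRUE)        # people with a significant MI
  totals <- colSums(mi_mat, na.rm = TRUE)     # summed MI, used to break ties
  best   <- order(counts, totals, decreasing = TRUE)[1]
  if (counts[best] / nrow(mi_mat) >= prop_cutoff) colnames(mi_mat)[best] else NA
}

n_subj <- 4
# Bonferroni-style cutoff on a 1-df chi-square (an assumed convention).
chisq_cutoff <- qchisq(1 - 0.05 / n_subj, df = 1)

mi_mat <- rbind(
  c(12.1,  9.5, 0.4),
  c(10.3,  1.2, 0.9),
  c( 8.8,  7.7, 2.0),
  c( 3.0, 11.0, 1.1)
)
colnames(mi_mat) <- c("V2~V1", "V3~V2", "V1~V3lag")

chosen <- pick_highest_mi(mi_mat, prop_cutoff = 0.75, chisq_cutoff = chisq_cutoff)
chosen
```

In this toy input the first two paths are each significant for three of four individuals, so the summed MI decides between them. Raising prop_cutoff to 1 would return NA because no path is significant for everyone.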
Individual-level structural equation model search.
Description
This function identifies structural equation models for each individual. It does not utilize any shared information from the sample.
Usage
indSEM(data = NULL,
out = NULL,
sep = NULL,
header = NULL,
ar = TRUE,
plot = TRUE,
paths = NULL,
exogenous = NULL,
outcome = NULL,
conv_vars = NULL,
conv_length = 16,
conv_interval = 1,
mult_vars = NULL,
mean_center_mult = FALSE,
standardize = FALSE,
hybrid = FALSE,
VAR = FALSE)
Arguments
data |
The path to the directory where the data files are located, or the name of the list containing each individual's time series. Each file or matrix must contain one matrix for each individual containing a T (time) by p (number of variables) matrix where the columns represent variables and the rows represent time. |
out |
The path to the directory where the results will be stored (optional). If specified, a copy of output files will be replaced in directory. If directory at specified path does not exist, it will be created. |
sep |
The delimiter of the data files. "" indicates space-delimited, "\t" indicates tab-delimited, "," indicates comma-delimited. Only necessary to specify if reading data in from a physical directory. |
header |
Logical. Indicate TRUE for data files with a header. Only necessary to specify if reading data in from physical directory. |
ar |
Logical. If TRUE, begins search for individual models with autoregressive (AR) paths open. Defaults to TRUE. |
plot |
Logical. If TRUE, graphs depicting relations among variables of interest will automatically be created. Defaults to TRUE. For individual-level plots, red paths represent positive weights and blue paths represent negative weights. |
paths |
lavaan-style syntax containing paths with which to begin model estimation. That is, Y~X indicates that Y is regressed on X, or X predicts Y. Paths can also be set to a specific value for estimation. |
exogenous |
Vector of variable names to be treated as exogenous. That is, exogenous variable X can predict Y but cannot be predicted by Y. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. Defaults to NULL. |
outcome |
Vector of variable names to be treated as outcome (optional). This is a variable that can be predicted by others but cannot predict. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. |
conv_vars |
Vector of variable names to be convolved via smoothed Finite Impulse Response (sFIR). Defaults to NULL. |
conv_length |
Expected response length in seconds. For functional MRI BOLD, 16 seconds (default) is typical for the hemodynamic response function. |
conv_interval |
Interval between data acquisitions. Currently conv_length/conv_interval must be an integer. For fMRI studies, this is the repetition time. Defaults to 1. |
mult_vars |
Vector of variable names to be multiplied to explore bilinear/modulatory effects (optional). All multiplied variables will be treated as exogenous (X can predict Y but cannot be predicted by Y). Within the vector, multiplication of two variables should be indicated with an asterisk (e.g. V1*V2). If no header is used, variables should be referred to with V followed by the column number (with no separation). If a header is used, each variable should be referred to using variable names. If multiplication with the lag 1 of a variable is desired, the variable name should be followed by "lag" with no separation (e.g. V1*V2lag). Note that if multiplied variables are desired, at least one variable in the dataset must be specified as exogenous. Defaults to NULL. |
mean_center_mult |
Logical. If TRUE, the variables indicated in mult_vars will be mean-centered before being multiplied together. Defaults to FALSE. |
standardize |
Logical. If TRUE, all variables will be standardized to have a mean of zero and a standard deviation of one. Defaults to FALSE. |
hybrid |
Logical. If TRUE, enables hybrid-VAR models where both directed contemporaneous paths and contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE. |
VAR |
Logical. If TRUE, enables VAR models where contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE. |
Details
Output is a list of results if saved as an object and/or files printed to a directory if the "out" argument is used.
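When data are read from a directory, each individual needs a separate file whose format matches sep and header. The sketch below writes three simulated comma-delimited files without headers to a temporary folder; the folder name, file names, and simulated values are illustrative assumptions.

```r
set.seed(2)

# Hypothetical input directory inside the session's temporary folder.
in_dir <- file.path(tempdir(), "gimme_input_sketch")
dir.create(in_dir, showWarnings = FALSE)

# Write one comma-delimited file per individual, without a header row.
for (i in 1:3) {
  m <- matrix(rnorm(40 * 3), nrow = 40, ncol = 3)
  write.table(m, file = file.path(in_dir, paste0("person_", i, ".csv")),
              sep = ",", row.names = FALSE, col.names = FALSE)
}

# Reading a file back without a header yields the default names V1, V2, V3,
# which is how variables are referred to when header = FALSE.
chk <- read.table(file.path(in_dir, "person_1.csv"), sep = ",", header = FALSE)
names(chk)

# Not run: pointing the package at the directory.
# fit <- gimme::indSEM(data = in_dir, sep = ",", header = FALSE)
```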
Author(s)
Stephanie Lane
Examples
## Not run:
fit <- indSEM(data   = "C:/data100",
              out    = "C:/data100_indSEM_out",
              sep    = ",",
              header = FALSE)
print(fit, file = "group1.1", estimates = TRUE)
plot(fit, file = "group1.1")
## End(Not run)
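The mult_vars, mean_center_mult, and standardize arguments describe simple data transformations. The following base R sketch shows what a product term labelled V1*V2lag could look like. It assumes that lag 1 means the value at the previous time point and that centering is applied before multiplication. The data frame df and the helper make_product are hypothetical, and this is not the package's internal code.

```r
# Hypothetical illustration only; not gimme's internal implementation.
set.seed(1)
df <- data.frame(V1 = rnorm(50), V2 = rnorm(50))

# Lag-1 copy of V2: the value at time t - 1, with NA at the first time point.
V2lag <- c(NA, df$V2[-nrow(df)])

# Multiply two series, optionally mean-centering each one first.
make_product <- function(a, b, mean_center = FALSE) {
  if (mean_center) {
    a <- a - mean(a, na.rm = TRUE)
    b <- b - mean(b, na.rm = TRUE)
  }
  a * b
}

# Product term corresponding to the label "V1*V2lag".
V1_by_V2lag <- make_product(df$V1, V2lag, mean_center = TRUE)

# Standardizing rescales each column to mean 0 and standard deviation 1.
df_std <- as.data.frame(scale(df))
```

Centering before multiplication is a common way to reduce the correlation between a product term and its components, which is presumably why the option is offered.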
Individual-level search. Used in gimmeSEM, aggSEM, indSEM.
Description
Individual-level search. Used in gimmeSEM, aggSEM, indSEM.
Usage
indiv.search(dat, grp, ind, ind_cutoff = NULL, ind_z_cutoff = 1.96)
Arguments
dat |
A list containing information created in setup(). |
grp |
A list containing group-level information. NULL in aggSEM and indSEM. |
ind |
A list containing individual- and (potentially) subgroup-level information. |
ind_cutoff |
Chi square cutoff, .05 level adjusted for multiple tests. |
ind_z_cutoff |
Z score cutoff, .05 level adjusted for multiple tests. |
Value
Lists associated with coefficients, fit indices, etc.
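The two cutoffs above are described as .05-level values adjusted for multiple tests. As a rough illustration, the sketch below computes such cutoffs in base R under the assumption of a Bonferroni-style correction over k tests. The value of k is made up, and the correction that gimme actually applies may differ.

```r
# Assumption: Bonferroni-style correction over k tests; gimme's exact
# correction may differ.
alpha <- 0.05
k <- 45  # hypothetical number of candidate paths tested

# Unadjusted two-sided z cutoff, matching the documented default of 1.96.
z_unadjusted <- qnorm(1 - alpha / 2)

# Cutoffs adjusted for k tests.
z_adjusted     <- qnorm(1 - (alpha / k) / 2)
chisq_adjusted <- qchisq(1 - alpha / k, df = 1)

round(c(z = z_unadjusted, z_adj = z_adjusted, chisq_adj = chisq_adjusted), 3)
```

For a test with one degree of freedom the two adjusted cutoffs are equivalent, since the squared z cutoff equals the chi-square cutoff.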
Individual-level search. Used in gimmeSEM, aggSEM, indSEM.
Description
Individual-level search. Used in gimmeSEM, aggSEM, indSEM.
Usage
indiv.search.ms(dat, grp, ind, ms_tol, ms_allow, grp_num)
Arguments
dat |
A list containing information created in setup(). |
grp |
A list containing group-level information. NULL in aggSEM and indSEM. |
ind |
A list containing individual- and (potentially) subgroup-level information. |
Value
Lists associated with coefficients, fit indices, etc.
Identifies lowest z value from list of z values.
Description
Identifies lowest z value from list of z values.
Usage
lowest.z(z_list, elig_paths, prop_cutoff, n_subj, test_cutoff)
Arguments
z_list |
A list of z values across individuals. |
elig_paths |
A character vector containing eligible paths that gimme is allowed to drop from the model at a given stage. |
prop_cutoff |
The proportion of individuals for whom a path must be nonsignificant in order for it to be dropped from the models. NULL if used at the individual-level. |
n_subj |
The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1. |
test_cutoff |
Z score cutoff for significance testing. |
Value
Returns name of parameter associated with lowest z. If no z meets the criteria, returns NA.
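The selection rule described by these arguments can be sketched in base R. The function lowest_z_sketch below is hypothetical. It assumes that z_list holds one named numeric vector per individual, that a path is eligible for dropping when the proportion of individuals with |z| below test_cutoff reaches prop_cutoff, and that ties among eligible paths are broken by the lowest mean |z|. The package's own data layout and tie-breaking may differ.

```r
# Hypothetical sketch; data layout and tie-breaking are assumptions.
lowest_z_sketch <- function(z_list, elig_paths, prop_cutoff, n_subj,
                            test_cutoff = 1.96) {
  # Rows = individuals, columns = eligible paths.
  z_mat <- do.call(rbind, lapply(z_list, function(z) abs(z[elig_paths])))
  # Proportion of individuals for whom each path is nonsignificant.
  prop_nonsig <- colSums(z_mat < test_cutoff) / n_subj
  # At the individual level prop_cutoff is NULL and n_subj is 1.
  if (is.null(prop_cutoff)) prop_cutoff <- 1
  candidates <- elig_paths[prop_nonsig >= prop_cutoff]
  if (length(candidates) == 0) return(NA)
  mean_abs_z <- colMeans(z_mat[, candidates, drop = FALSE])
  names(which.min(mean_abs_z))
}

z_list <- list(
  c("V2~V1" = 3.1, "V3~V2" =  0.4, "V3~V1" = 1.2),
  c("V2~V1" = 2.8, "V3~V2" =  1.0, "V3~V1" = 2.5),
  c("V2~V1" = 2.2, "V3~V2" = -0.2, "V3~V1" = 0.9)
)
elig <- c("V2~V1", "V3~V2", "V3~V1")

lowest_z_sketch(z_list, elig, prop_cutoff = 0.75, n_subj = 3)
```

In this toy input only V3~V2 is nonsignificant for at least 75 percent of individuals, so it is the path returned.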
Fitted gimme object with multiple solutions
Description
This object contains a fitted gimme object where multiple solutions gimme was used. The simulated data had 25 individuals, each with 100 time points.
Usage
ms.fit
Format
A fitted gimme object, where multiple solutions gimme was used.
GIMME Predicted Values.
Description
This function calculates the predicted values of a fitted gimme model.
Usage
predict.gimme(x)
Arguments
x |
A fitted gimme object. |
Value
List of data frames. Each data frame contains the predicted values of a subject in the data.
Author(s)
Sebastian Castro-Alvarez
Examples
## Not run:
paths <- 'V2 ~ V1
V3 ~ V4lag'
fit <- gimmeSEM(data     = simData,
                out      = "C:/simData_out",
                subgroup = TRUE,
                paths    = paths)
predictions <- predict.gimme(fit)
## End(Not run)
Prunes paths. Ties together lowest.z and return.zs functions.
Description
Prunes paths. Ties together lowest.z and return.zs functions.
Usage
prune.paths(
base_syntax,
fixed_syntax,
add_syntax,
data_list,
n_paths,
n_subj,
prop_cutoff,
elig_paths,
subgroup_stage = FALSE,
test_cutoff
)
Arguments
base_syntax |
A character vector containing syntax that never changes. |
fixed_syntax |
A character vector containing syntax that does not change in a given stage of pruning. |
add_syntax |
A character vector containing the syntax that is allowed to change in a given stage of pruning. |
data_list |
A list of datasets to be used in a given stage of the search. Varies based on group, subgroup, or individual-level stage. |
n_paths |
The number of paths that are eligible for pruning. Equal to the number of paths in add_syntax. |
n_subj |
The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1. |
prop_cutoff |
The proportion of individuals for whom a path must be nonsignificant in order for it to be dropped from the models. NULL if used at the individual-level. |
elig_paths |
A character vector containing eligible paths that gimme is allowed to drop from the model at a given stage. |
subgroup_stage |
Logical. Only present in order to instruct gimme what message to print to console using writeLines. |
test_cutoff |
Z score cutoff for significance testing. |
Value
Returns updated values of n_paths and add_syntax.
Recode variable names.
Description
Recode variable names.
Usage
recode.vars(data, oldvalue, newvalue)
Arguments
data |
The vector of variable names to be recoded |
oldvalue |
A vector containing the latent variable names used internally. |
newvalue |
A vector containing the observed variable names, either provided by the user (as a header) or provided by R (e.g., V1, V2). |
Value
Recoded vector of variable names.
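A recoding of this kind can be sketched with match() in base R. The helper recode_sketch is hypothetical and assumes that names with no entry in oldvalue are returned unchanged. The package function may treat such names differently.

```r
# Hypothetical sketch of name recoding; not the package's implementation.
recode_sketch <- function(data, oldvalue, newvalue) {
  pos <- match(data, oldvalue)       # position of each name in oldvalue
  out <- data
  found <- !is.na(pos)
  out[found] <- newvalue[pos[found]] # replace only the names that matched
  out
}

recoded <- recode_sketch(data     = c("lv1", "lv3", "other"),
                         oldvalue = c("lv1", "lv2", "lv3"),
                         newvalue = c("V1", "V2", "V3"))
recoded
```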
GIMME Residuals.
Description
This function calculates the unstandardized and standardized residuals of a fitted gimme model.
Usage
residuals.gimme(x, lag)
Arguments
x |
A fitted gimme object. |
lag |
The number of lags tested in the Box-Pierce and Ljung-Box tests of the residuals. If user does not specify a value, default is the smaller of 10 or the length of the time series divided by 5. |
Value
List of four lists of data frames:
- residuals
List of the unstandardized residuals per subject.
- standardized.residuals
List of the standardized residuals per subject.
- Box.Pierce.test
List of the results of the Box-Pierce test for each subject's residuals.
- Ljung.Box.test
List of the results of the Ljung-Box test for each subject's residuals.
Author(s)
Sebastian Castro-Alvarez
Examples
## Not run:
paths <- 'V2 ~ V1
V3 ~ V4lag'
fit <- gimmeSEM(data     = simData,
                out      = "C:/simData_out",
                subgroup = TRUE,
                paths    = paths)
residuals <- residuals.gimme(fit)
residuals <- residuals.gimme(fit, lag = 5)
## End(Not run)
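The Box-Pierce and Ljung-Box tests named above are available in base R through stats::Box.test. The sketch below applies them to a simulated white-noise series rather than to gimme output, using the default lag rule described for the lag argument. Rounding T/5 down with floor() is an assumption.

```r
# Illustration with simulated white-noise "residuals"; not gimme output.
set.seed(123)
res <- rnorm(100)

# Default lag as described above: the smaller of 10 or T / 5.
# Rounding down with floor() is an assumption.
lag <- min(10, floor(length(res) / 5))

bp <- Box.test(res, lag = lag, type = "Box-Pierce")
lb <- Box.test(res, lag = lag, type = "Ljung-Box")

c(Box.Pierce = bp$p.value, Ljung.Box = lb$p.value)
```

Small p-values from either test suggest that autocorrelation remains in the residuals, meaning the fitted model may not have captured all of the temporal structure.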
Returns MIs from lavaan fit object.
Description
Returns MIs from lavaan fit object.
Usage
return.mis(fit, elig_paths)
Arguments
fit |
An object from lavaan. |
Value
If successful, returns MIs for an individual. If unsuccessful, returns NA.
Returns z values from lavaan fit object.
Description
Returns z values from lavaan fit object.
Usage
return.zs(fit, elig_paths)
Arguments
fit |
An object from lavaan. |
elig_paths |
Eligible paths at this stage. For subgrouping, group and fixed paths. For pruning, only group paths. |
Value
If successful, returns z values for an individual. If unsuccessful, returns NA.
Estimate response function for each person using smoothed Finite Impulse Response.
Description
Estimate response function for each person using smoothed Finite Impulse Response.
Usage
sFIR(data, stimuli, response_length = 16, interval = 1)
Arguments
data |
The data to be used to estimate the response function. |
stimuli |
A vector containing '0' when the stimulus of interest is not present and '1' otherwise. Number of observations across time must equal the length of data. |
interval |
Time between observations; for fMRI this is the repetition time. Defaults to 1. |
Value
Shape of response function and convolved time series vector.
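To make the idea of a finite impulse response estimate concrete, the sketch below builds a plain FIR design matrix from a binary stimulus vector and recovers a toy response shape by least squares. It omits the smoothing step that sFIR adds, and it assumes that the number of FIR bins is response_length / interval. The helper fir_design and all data here are hypothetical.

```r
# Plain (unsmoothed) FIR estimate by least squares; the smoothing step of
# sFIR is deliberately omitted. All names here are hypothetical.
fir_design <- function(stimuli, response_length = 16, interval = 1) {
  n_bins <- response_length / interval   # assumed number of FIR bins
  n <- length(stimuli)
  # Column k holds the stimulus vector delayed by k - 1 time points.
  sapply(seq_len(n_bins), function(k) {
    c(rep(0, k - 1), stimuli[seq_len(n - k + 1)])
  })
}

set.seed(42)
stimuli <- rbinom(200, 1, 0.3)                      # toy binary stimulus
true_response <- dgamma(0:15, shape = 6, rate = 1)  # toy response shape

X <- fir_design(stimuli)
y <- as.vector(X %*% true_response)                 # noiseless toy signal

# Least-squares estimate of the response at each of the 16 bins.
est_response <- qr.solve(X, y)
```

With noiseless data the estimate matches the toy response up to numerical error. With real, noisy data the unsmoothed estimate is typically jagged, which is the motivation for smoothing.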
Searches for paths. Ties together highest.mi and return.mis functions.
Description
Searches for paths. Ties together highest.mi and return.mis functions.
Usage
search.paths(
base_syntax,
fixed_syntax,
add_syntax,
n_paths,
data_list,
elig_paths,
prop_cutoff,
n_subj,
chisq_cutoff,
subgroup_stage = FALSE,
ms_allow = FALSE,
ms_tol = 1e-06,
hybrid = F,
dir_prop_cutoff = 0
)
Arguments
base_syntax |
A character vector containing syntax that never changes. |
fixed_syntax |
A character vector containing syntax that does not change in a given stage of searching. |
add_syntax |
A character vector containing the syntax that is allowed to change in a given stage of searching. |
n_paths |
The number of paths present in a given stage of searching. Equal to the number of paths in add_syntax. |
data_list |
A list of datasets to be used in a given stage of the search. Varies based on group, subgroup, or individual-level stage. |
elig_paths |
A character vector containing eligible paths that gimme is allowed to add to the model at a given stage. |
prop_cutoff |
The proportion of individuals for whom a path must be nonsignificant in order for it to be dropped from the models. NULL if used at the individual-level. |
n_subj |
The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1. |
chisq_cutoff |
Cutoff used in order for MI to be considered significant. Value varies depending on stage of search (e.g., group, subgroup, individual). |
subgroup_stage |
Logical. Only present in order to instruct gimme what message to print to console using writeLines. |
Value
Returns updated values of n_paths and add_syntax.
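The role of chisq_cutoff, prop_cutoff, and n_subj when a path is added can be sketched in base R. The function highest_mi_sketch is hypothetical. It assumes that mi_list holds one named vector of modification indices per individual, that the proportion is computed over individuals whose MI exceeds chisq_cutoff, and that candidate paths are ranked by that count and then by summed MI. The package's own rule may differ in detail.

```r
# Hypothetical sketch; ranking and data layout are assumptions.
highest_mi_sketch <- function(mi_list, elig_paths, prop_cutoff, n_subj,
                              chisq_cutoff) {
  # Rows = individuals, columns = eligible paths.
  mi_mat <- do.call(rbind, lapply(mi_list, function(mi) mi[elig_paths]))
  n_sig  <- colSums(mi_mat > chisq_cutoff)  # individuals with significant MI
  sum_mi <- colSums(mi_mat)
  # Rank by count of significant individuals, then by summed MI.
  best <- order(n_sig, sum_mi, decreasing = TRUE)[1]
  if (is.null(prop_cutoff)) prop_cutoff <- 1
  if (n_sig[best] / n_subj >= prop_cutoff) elig_paths[best] else NA
}

mi_list <- list(
  c("V2~V1" = 12.3, "V3~V1" =  1.1),
  c("V2~V1" =  9.8, "V3~V1" = 15.0),
  c("V2~V1" =  7.0, "V3~V1" =  0.5),
  c("V2~V1" =  2.0, "V3~V1" =  0.2)
)
elig <- c("V2~V1", "V3~V1")
cutoff <- qchisq(0.95, df = 1)

highest_mi_sketch(mi_list, elig, prop_cutoff = 0.75, n_subj = 4,
                  chisq_cutoff = cutoff)
```

Here V2~V1 is significant for three of four individuals, so it meets a 0.75 cutoff. With a stricter cutoff of 0.9 no path qualifies and NA is returned.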
Searches for individual-level paths. Ties together highest.mi, return.mis, prune, and get.params functions.
Description
Searches for individual-level paths. Ties together highest.mi, return.mis, prune, and get.params functions.
Usage
search.paths.ind(
dat,
k,
data_list,
base_syntax,
fixed_syntax,
elig_paths,
prop_cutoff,
n_subj,
chisq_cutoff,
subgroup_stage,
hybrid,
dir_prop_cutoff,
ind_z_cutoff
)
Arguments
dat |
Object created at beginning of gimme containing static info. |
k |
Which individual this is. |
data_list |
A list of datasets to be used in a given stage of the search. Varies based on group, subgroup, or individual-level stage. |
base_syntax |
A character vector containing syntax that never changes. |
fixed_syntax |
A character vector containing syntax that does not change in a given stage of searching. |
elig_paths |
A character vector containing eligible paths that gimme is allowed to add to the model at a given stage. |
prop_cutoff |
The proportion of individuals for whom a path must be nonsignificant in order for it to be dropped from the models. NULL if used at the individual-level. |
n_subj |
The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1. |
chisq_cutoff |
Cutoff used in order for MI to be considered significant. Value varies depending on stage of search (e.g., group, subgroup, individual). |
subgroup_stage |
Logical. Only present in order to instruct gimme what message to print to console using writeLines. |
Value
Returns updated values of n_paths and add_syntax.
Searches for paths. Ties together highest.mi and return.mis functions.
Description
Searches for paths. Ties together highest.mi and return.mis functions.
Usage
search.paths.ms(
obj,
data_list,
base_syntax,
fixed_syntax,
elig_paths,
prop_cutoff,
n_subj,
chisq_cutoff,
subgroup_stage,
ms_allow,
ms_tol,
hybrid,
dir_prop_cutoff
)
Arguments
data_list |
A list of datasets to be used in a given stage of the search. Varies based on group, subgroup, or individual-level stage. |
base_syntax |
A character vector containing syntax that never changes. |
fixed_syntax |
A character vector containing syntax that does not change in a given stage of searching. |
elig_paths |
A character vector containing eligible paths that gimme is allowed to add to the model at a given stage. |
prop_cutoff |
The proportion of individuals for whom a path must be nonsignificant in order for it to be dropped from the models. NULL if used at the individual-level. |
n_subj |
The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1. |
chisq_cutoff |
Cutoff used in order for MI to be considered significant. Value varies depending on stage of search (e.g., group, subgroup, individual). |
subgroup_stage |
Logical. Only present in order to instruct gimme what message to print to console using writeLines. |
Value
Returns updated values of n_paths and add_syntax.
Set up base syntax file.
Description
Set up base syntax file.
Usage
setupBaseSyntax(paths, remove, varLabels, ctrlOpts)
Group iterative multiple model estimation.
Description
This function estimates the basis vectors related to responses following a binary impulse and convolves that binary impulse vector.
Usage
convolveFIR(ts_list = NULL,
varLabels = NULL,
conv_length = 16,
conv_interval = 1)
Arguments
ts_list |
A list of data frames. |
varLabels |
a list of variable sets. Contains varLabels$coln, all column names, varLabels$conv, the names of variables to convolve, and varLabels$exog, a list of exogenous variables (if any). |
conv_length |
Expected response length in seconds. For functional MRI BOLD, 16 seconds (default) is typical for the hemodynamic response function. |
conv_interval |
Interval between data acquisition. Currently must be a scalar. For fMRI studies, this is the repetition time. Defaults to 1. |
Create a list of dataframes
Description
Create a list of dataframes
Usage
setupDataLists(data, ctrlOpts = NULL, lv_model = NULL)
Arguments
data |
a list or directory. |
ctrlOpts |
A list of control options. |
Get names for bilinear effects.
Description
Get names for bilinear effects.
Usage
setupMultVarNames(mult_vars)
Do some preliminary checks on the data.
Description
Do some preliminary checks on the data.
Usage
setupPrelimDataChecks(df)
Allows user to open and close certain paths.
Description
Allows user to open and close certain paths.
Usage
setupPrepPaths(paths, varLabels, ctrlOpts)
Arguments
paths |
|
Transform raw data as required.
Description
Transform raw data as required.
Usage
setupTransformData(
ts_list = NULL,
varLabels = NULL,
ctrlOpts = NULL,
ms_allow = FALSE
)
Arguments
ts_list |
a list or directory |
varLabels |
Variable labels. |
ctrlOpts |
List used in setup function. |
Large example, heterogeneous data, group, subgroup, and individual level effects.
Description
This object contains a list of simulated time series data for twenty-five individuals with 200 time points and 10 variables, or regions of interest.
Usage
simData
Format
A list of data frames with 25 individuals, who each have 200 observations on 10 variables.
Latent variable example, heterogeneous data, group, subgroup level effects.
Description
This object contains a list of simulated time series data for twenty individuals with 500 time points and 9 variables, or regions of interest.
Usage
simDataLV
Format
A list of data frames with 20 individuals, who each have 500 observations on 9 variables.
Simulate data from Vector AutoRegression (VAR) models.
Description
This function simulates data. It allows for structural VAR and VAR data generating models.
Usage
simulateVAR(A = NULL,
Phi = NULL,
Psi = NULL,
subAssign = NULL,
N = NULL,
ASign = "random",
PhiSign = "random",
Obs = NULL,
indA = 0.01,
indPhi = 0.01,
indPsi = 0.00)
Arguments
A |
A matrix (for no subgroups) or list of A matrices, with slice # = # of subgroups. |
Phi |
Phi matrix (for no subgroups) or list of Phi matrices, with slice # = # of subgroups. |
Psi |
Psi matrix (for no subgroups) or list of Psi matrices, with slice # = # of subgroups. |
subAssign |
Optional vector of length N that indicates which subgroup each individual is in. |
N |
Number of individuals. |
ASign |
Defaults to "random" for individual-level paths, which gives each path a 50 percent chance of being positive and a 50 percent chance of being negative. The other options are "neg" and "pos", which make all relations negative or all positive, respectively. |
PhiSign |
Defaults to "random" for individual-level paths, which gives each path a 50 percent chance of being positive and a 50 percent chance of being negative. The other options are "neg" and "pos", which make all relations negative or all positive, respectively. |
Obs |
Number of observations (T) per individual. A burn-in of 400 observations is generated and then discarded. |
indA |
Sparsity of individual-level A paths. 0 indicates no individual-level. Use decimals. Default is 0.01, meaning that each path that is not in the group-level A matrix has a 0.01 chance of being added. |
indPhi |
Sparsity of individual-level Phi paths. 0 indicates no individual-level. Use decimals. Default is 0.01, meaning that each path that is not in the group-level Phi matrix has a 0.01 chance of being added. |
indPsi |
Sparsity of individual-level Psi paths. 0 indicates no individual-level paths. Use decimals. Default is 0, meaning that each path that is not in the group-level Psi matrix has a 0 chance of being added at the individual level. Individual-level paths are added at this rate per individual. |
Author(s)
KM Gates, Ai Ye, Ethan McCormick, & Zachary Fisher
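The arguments above can be read against a small base R sketch of a structural VAR data-generating process for one individual. It assumes the model eta_t = A eta_t + Phi eta_(t-1) + zeta_t with residual covariance Psi, a burn-in of 400 observations, and individual-level A paths added with probability indA. All values are made up, and simulateVAR() itself may differ in parameterization and defaults.

```r
# Hypothetical sketch of one individual's data; simulateVAR() itself may
# differ in parameterization and defaults.
set.seed(2025)
p       <- 3
Obs     <- 100
burn_in <- 400

A   <- matrix(0, p, p); A[2, 1] <- 0.3   # contemporaneous path V1 -> V2
Phi <- diag(0.4, p)                      # lagged (autoregressive) paths
Psi <- diag(1, p)                        # residual covariance

# Individual-level paths: each empty off-diagonal cell of A is filled with
# probability indA, with a random sign (the "random" option).
indA  <- 0.01
empty <- which(A == 0 & row(A) != col(A))
add   <- empty[runif(length(empty)) < indA]
A[add] <- sample(c(-1, 1), length(add), replace = TRUE) * 0.2

# Reduced form: eta_t = (I - A)^{-1} (Phi eta_{t-1} + zeta_t).
IA_inv  <- solve(diag(p) - A)
n_total <- Obs + burn_in
zeta <- matrix(rnorm(n_total * p), n_total, p) %*% chol(Psi)
eta  <- matrix(0, n_total, p)
for (t in 2:n_total) {
  eta[t, ] <- IA_inv %*% (Phi %*% eta[t - 1, ] + zeta[t, ])
}

# Discard the burn-in so the retained series does not depend on the
# arbitrary starting value of zero.
sim <- as.data.frame(eta[-seq_len(burn_in), ])
```

The path magnitudes are kept small so that the simulated process is stationary regardless of which individual-level paths are drawn.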
Solution trees for multiple solutions gimme.
Description
This function allows for the exploration of divergences in multiple solutions gimme for both the group and individuals.
Usage
solution.tree(x,
level = c("group", "individual"),
cols = NULL,
ids = "all",
plot.tree = FALSE)
Arguments
x |
A fitted gimme object. |
level |
A character vector indicating what levels of the solution tree you would like returned. Options are "group", "individual", or c("group", "individual"). Defaults to c("group", "individual"). |
cols |
A character vector indicating additional information to include in tree plot. Options include "stage", "pruned", "rmsea", "nnfi", "cfi", "srmr", "grp_sol", "bic", "aic", and "modularity". Defaults to NULL. |
ids |
A character vector indicating the names of subjects to print. Defaults to "all". |
plot.tree |
Logical. If TRUE, plot of tree is produced. Defaults to FALSE. |
Details
solution.tree
Create structure of group search solutions.
Description
Create structure of group search solutions.
Usage
subgroupStage(
dat,
grp,
confirm_subgroup,
elig_paths,
sub_feature,
sub_method,
ms_tol,
ms_allow,
sub_sim_thresh,
hybrid,
dir_prop_cutoff,
group_correct
)
Create summary matrix of path counts and subgroup plots
Description
Create summary matrix of path counts and subgroup plots
Usage
summaryPathsCounts(dat, grp, store, sub, sub_spec)
Arguments
dat |
A list containing information created in setup(). |
grp |
A list containing group-level information. NULL in aggSEM and indSEM. |
store |
A list containing output from indiv.search(). |
sub |
A list containing subgroup information. |
sub_spec |
A list containing information specific to each subgroup. |
Value
Aggregated information, such as counts, levels, and plots.
Small example, heterogeneous data, group and individual level effects
Description
This object contains a list of simulated time series data for five individuals with 50 time points and 3 variables, or regions of interest.
Usage
ts
Format
A list of data frames with 5 individuals, who each have 50 observations on 3 variables.
Create edge list from weight matrix.
Description
Create edge list from weight matrix.
Usage
w2e(x)
Arguments
x |
The coefficient matrix from an individual. |
Value
A list of all non-zero edges to feed to qgraph
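The conversion from a weight matrix to an edge list can be sketched in base R. The helper weights_to_edges is hypothetical. It assumes that rows are outcomes and columns are predictors, and it returns a from/to/weight layout of the kind qgraph accepts as an edge list. The exact structure returned by w2e() may differ.

```r
# Hypothetical sketch; the layout that w2e() returns may differ.
# Assumption: rows are outcomes and columns are predictors.
weights_to_edges <- function(x) {
  idx <- which(x != 0, arr.ind = TRUE)   # positions of non-zero weights
  data.frame(from   = unname(idx[, "col"]),
             to     = unname(idx[, "row"]),
             weight = x[idx],
             row.names = NULL)
}

# Toy coefficient matrix: V1 -> V2 (0.5) and V2 -> V3 (-0.3).
W <- matrix(0, 3, 3)
W[2, 1] <- 0.5
W[3, 2] <- -0.3

edges <- weights_to_edges(W)
edges
```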