Title: Group Iterative Multiple Model Estimation
Version: 0.9.1
Date: 2025-06-26
Maintainer: Kathleen M Gates <gateskm@email.unc.edu>
Depends: R (≥ 3.5.0)
Imports: lavaan (≥ 0.6-17), igraph (≥ 1.0-0), qgraph (≥ 1.9.8), data.tree, MIIVsem (≥ 0.5.4), imputeTS (≥ 3.0), nloptr, graphics, stats, MASS, tseries, utils
Description: Data-driven approach for arriving at person-specific time series models. The method first identifies which relations replicate across the majority of individuals to detect signal from noise. These group-level relations are then used as a foundation for starting the search for person-specific (or individual-level) relations. See Gates & Molenaar (2012) <doi:10.1016/j.neuroimage.2012.06.026>.
License: GPL-2
LazyData: true
URL: https://github.com/GatesLab/gimme/, https://tarheels.live/gimme/tutorials/
BugReports: https://github.com/GatesLab/gimme/issues
ByteCompile: true
RoxygenNote: 7.2.3
NeedsCompilation: no
Suggests: knitr, rmarkdown
VignetteBuilder: knitr
Encoding: UTF-8
Packaged: 2025-06-26 16:55:57 UTC; gateskm
Author: Stephanie Lane [aut, trl], Kathleen M Gates [aut, cre, ccp], Zachary Fisher [aut], Cara Arizmendi [aut], Peter Molenaar [aut, ccp], Edgar Merkle [ctb], Michael Hallquist [ctb], Hallie Pike [ctb], Teague Henry [ctb], Kelly Duffy [ctb], Lan Luo [ctb], Adriene Beltz [csp], Aidan Wright [csp], Jonathan Park [ctb], Sebastian Castro Alvarez [ctb]
Repository: CRAN
Date/Publication: 2025-06-26 21:30:01 UTC

Group iterative multiple model estimation

Description

This package contains functions to automatically identify the structure of group- and individual-level networks from a range of vector autoregressive models, estimated with structural equation modeling.

Details

Researchers across varied domains gather multivariate data for each individual unit of study across multiple occasions of measurement. Generally referred to as time series (or in the social sciences, intensive longitudinal) data, examples include psychophysiological processes such as neuroimaging and heart rate variability, daily diary studies, ecological momentary assessments, data passively collected from devices such as smartphones, and observational coding of social interactions among dyads.

A primary goal for acquiring these data is to understand dynamic processes. The gimme package contains several functions for use with these data. These functions include gimmeSEM, which provides both group- and individual-level results by looking across individuals for patterns of relations among variables. A function that provides group-level results, aggSEM, is included, as well as a function that provides individual-level results, indSEM. The major functions within the gimme package all require the user to specify the data, although many additional options exist.

Author(s)

Stephanie Lane [aut, trl],
Kathleen Gates [aut, cre],
Zachary Fisher [aut],
Cara Arizmendi [aut],
Peter Molenaar [aut],
Michael Hallquist [ctb],
Hallie Pike [ctb],
Cara Arizmendi [ctb],
Teague Henry [ctb],
Kelly Duffy [ctb],
Lan Luo [ctb],
Adriene Beltz [csp]

Maintainer: KM Gates <gateskm@email.unc.edu>


Hemodynamic Response Function (HRF) GIMME example.

Description

This object contains a list of simulated time series data for twenty-five individuals. Each data set has 500 time points and five variables. The fifth variable represents an onset vector for stimulation.

Usage

HRFsim

Format

A list of data frames with 25 individuals, who each have 500 observations on 5 variables.
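
To make the format concrete, the sketch below builds a stand-in list with the same shape as described above: 25 named data frames, each 500 rows by 5 columns, with a binary fifth column. It does not reproduce HRFsim; the object name hrf_like, the element and column names, and the random values are assumptions for illustration only.

```r
set.seed(1)  # reproducible stand-in values

n_ind  <- 25   # individuals
n_time <- 500  # time points per individual

# Hypothetical stand-in: four continuous series plus a binary onset vector.
hrf_like <- lapply(seq_len(n_ind), function(i) {
  d <- as.data.frame(matrix(rnorm(n_time * 4), nrow = n_time, ncol = 4))
  d$V5 <- rbinom(n_time, size = 1, prob = 0.1)  # onset vector (0/1)
  d
})
names(hrf_like) <- paste0("ind_", seq_len(n_ind))  # each list item is named

# One row per individual: number of time points and number of variables.
dims <- t(vapply(hrf_like, dim, integer(2)))
```

A list shaped like this can be passed as the data argument, and a binary onset column of this kind is the sort of variable that conv_vars expects.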


Group-level structural equation model search.

Description

Concatenates all individual-level data files and fits a group model to the data.

Usage

aggSEM(data   = "",
       out    = "",
       sep    = "",
       header = "",
       ar     = TRUE,
       plot   = TRUE,
       paths  = NULL,
       exogenous = NULL, 
       outcome   = NULL, 
       conv_vars        = NULL,
       conv_length      = 16, 
       conv_interval    = 1, 
       mult_vars        = NULL,
       mean_center_mult = FALSE,
       standardize      = FALSE,
       hybrid = FALSE,
       VAR    = FALSE)

Arguments

data

The path to the directory where the data files are located, or the name of the list containing each individual's time series. Each file or list element must hold one individual's data as a T (time) by p (number of variables) matrix, where the columns represent variables and the rows represent time. If in list form, each item in the list (i.e., matrix) must be named.

out

The path to the directory where the results will be stored (optional). If specified, a copy of the output files will be placed in the directory. If the directory at the specified path does not exist, it will be created.

sep

The spacing of the data files when data are in a directory. "" indicates space-delimited, "\t" indicates tab-delimited, "," indicates comma-delimited. Only necessary to specify if reading data in from a physical directory.

header

Logical. Indicate TRUE for data files with a header, FALSE otherwise. Only necessary to specify if reading data in from physical directory.

ar

Logical. If TRUE, begins search for group model with autoregressive (AR) paths open. Defaults to TRUE.

plot

Logical. If TRUE, figures depicting relations among variables of interest will automatically be created. For aggregate-level plot, red paths represent positive weights and blue paths represent negative weights. Dashed lines denote lagged relations (lag 1) and solid lines are contemporaneous (lag 0). Defaults to TRUE.

paths

lavaan-style syntax containing paths with which to begin model estimation (optional). That is, Y~X indicates that Y is regressed on X, or X predicts Y. Paths can also be set to a specific value for estimation using lavaan-style syntax (e.g., 'V4 ~ 0.5*V3'), or set to 0 so that they will not be estimated (e.g., 'V4 ~ 0*V3'). If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. To reference lag variables, "lag" should be added to the end of the variable name with no separation. Defaults to NULL.

exogenous

Vector of variable names to be treated as exogenous (optional). That is, exogenous variable X can predict Y but cannot be predicted by Y. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. Defaults to NULL.

outcome

Vector of variable names to be treated as outcome (optional). This is a variable that can be predicted by others but cannot predict. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names.

conv_vars

Vector of variable names to be convolved via smoothed Finite Impulse Response (sFIR). Defaults to NULL.

conv_length

Expected response length in seconds. For functional MRI BOLD, 16 seconds (default) is typical for the hemodynamic response function.

conv_interval

Interval between data acquisitions. Currently conv_length/conv_interval must be an integer. For fMRI studies, this is the repetition time. Defaults to 1.

mult_vars

Vector of variable names to be multiplied to explore bilinear/modulatory effects (optional). All multiplied variables will be treated as exogenous (X can predict Y but cannot be predicted by Y). Within the vector, multiplication of two variables should be indicated with an asterisk (e.g. V1*V2). If no header is used, variables should be referred to with V followed by the column number (with no separation). If a header is used, each variable should be referred to using variable names. If multiplication with the lag 1 of a variable is desired, the variable name should be followed by "lag" with no separation (e.g. V1*V2lag). Note that if multiplied variables are desired, at least one variable in the dataset must be specified as exogenous. Defaults to NULL.

mean_center_mult

Logical. If TRUE, the variables indicated in mult_vars will be mean-centered before being multiplied together. Defaults to FALSE.

standardize

Logical. If TRUE, all variables will be standardized to have a mean of zero and a standard deviation of one. Defaults to FALSE.

hybrid

Logical. If TRUE, enables hybrid-VAR models where both directed contemporaneous paths and contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE.

VAR

Logical. If TRUE, fits VAR models in which contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE.
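
The data, sep, and header arguments work together when files are read from a directory. The following is a minimal sketch, using only base R, of preparing such a directory; the directory name, file names, and simulated values are hypothetical, and the aggSEM call is left as a comment because it requires the gimme package.

```r
set.seed(2)

# Hypothetical input directory inside the session's temporary folder.
in_dir <- file.path(tempdir(), "gimme_input_sketch")
dir.create(in_dir, showWarnings = FALSE)

# Write one comma-delimited file per individual, without a header row.
for (i in 1:3) {
  x <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)  # T = 50, p = 4
  write.table(x, file = file.path(in_dir, paste0("ind_", i, ".csv")),
              sep = ",", row.names = FALSE, col.names = FALSE)
}

# Reading a file back with matching `sep` and `header` gives a T by p table.
# With no header, read.table names the columns V1, V2, ..., which matches the
# "V followed by the column number" convention used for `paths`.
first <- read.table(file.path(in_dir, "ind_1.csv"), sep = ",", header = FALSE)

# The corresponding call (not run here) would be:
# fit <- aggSEM(data = in_dir, sep = ",", header = FALSE)
```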

Details

Output is a list of results if saved as an object and/or files printed to a directory if the "out" argument is used.

Author(s)

Stephanie Lane

Examples

## Not run: 
exFit <- aggSEM(data = ts)

plot(exFit)

## End(Not run)

Create tree structures for group search solutions.

Description

Create tree structures for group search solutions.

Usage

batch.create.tree(hist, ind_hist, ind_fit, subgroup, names.ts_list, sub)

Counts number of excellent fit indices

Description

Counts number of excellent fit indices

Usage

count.excellent(indices)

Arguments

indices

A vector of fit indices from lavaan.

Value

The number of fit indices that are excellent.
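
As an illustration of what such a count might look like, the sketch below tallies how many of four fit indices pass a threshold. It is a hypothetical re-implementation, not the package source: the function name count_excellent_sketch, the choice of indices (RMSEA, SRMR, CFI, NNFI), and the thresholds (.05 and .95) are assumptions based on commonly cited conventions.

```r
# Hypothetical re-implementation; names and thresholds are assumptions.
count_excellent_sketch <- function(indices) {
  # `indices` is a named numeric vector, as lavaan::fitMeasures() returns.
  sum(
    indices[["rmsea"]] <= 0.05,
    indices[["srmr"]]  <= 0.05,
    indices[["cfi"]]   >= 0.95,
    indices[["nnfi"]]  >= 0.95,
    na.rm = TRUE
  )
}

# Made-up fit indices: here only RMSEA and CFI pass their thresholds.
fit_idx <- c(rmsea = 0.04, srmr = 0.07, cfi = 0.97, nnfi = 0.93)
n_excellent <- count_excellent_sketch(fit_idx)
```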


Create structure of group search solutions.

Description

Create structure of group search solutions.

Usage

create.tree(history, subgroup, individual = FALSE, all.ind = FALSE)

Determines subgroups.

Description

Determines subgroups.

Usage

determine.subgroups(
  data_list,
  base_syntax,
  n_subj,
  chisq_cutoff,
  file_order,
  elig_paths,
  confirm_subgroup,
  out_path = NULL,
  sub_feature,
  sub_method,
  sub_sim_thresh,
  hybrid,
  dir_prop_cutoff
)

Arguments

data_list

A list of all datasets.

base_syntax

A character vector containing syntax that never changes.

n_subj

The number of subjects in the sample.

chisq_cutoff

Chi-square cutoff that a modification index (MI) must exceed to be considered significant.

file_order

A data frame containing the order of the files and the names of the files. Used to merge in subgroup assignment and preserve order.

elig_paths

A character vector containing eligible paths that gimme is allowed to add to the model. Ensures only EPCs from allowable paths are considered in the creation of the similarity matrix.

confirm_subgroup

A dataframe whose first column is a string vector of data file names without extensions and whose second column is an integer vector of subgroup labels.

Value

Returns sub object containing similarity matrix, the number of subgroups, the modularity associated with the subgroup memberships, and a data frame containing the file names and subgroup memberships.
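
The similarity matrix is typically made sparser before community detection (see sub_sim_thresh under gimmeSEM). Below is a rough sketch of the "lowest" and proportion-based options on a made-up matrix. The matrix values, the variable names, and the decision to take the minimum over off-diagonal entries are assumptions, and the igraph step runs only if that package is installed.

```r
# Hypothetical symmetric similarity matrix for four individuals.
sim <- matrix(c(0, 5, 4, 1,
                5, 0, 3, 2,
                4, 3, 0, 1,
                1, 2, 1, 0), nrow = 4, byrow = TRUE)

off_diag <- sim[upper.tri(sim)]  # each pair of individuals counted once

# "lowest": subtract the minimum (off-diagonal) value from all values.
sim_lowest <- sim - min(off_diag)
diag(sim_lowest) <- 0

# Numeric threshold, e.g. .25: set the lowest quartile of edges to zero.
thresh <- quantile(off_diag, probs = 0.25)
sim_quart <- sim
sim_quart[sim_quart <= thresh] <- 0

# Optional: Walktrap communities and their modularity on the sparser matrix.
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_adjacency_matrix(sim_lowest, mode = "undirected",
                                           weighted = TRUE, diag = FALSE)
  comm <- igraph::cluster_walktrap(g)
  membership <- igraph::membership(comm)
  mod <- igraph::modularity(comm)
}
```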


Provides unique combinations of two vectors.

Description

Provides unique combinations of two vectors.

Usage

expand.grid.unique(x, y, incl.eq = TRUE)

Arguments

x

A character vector containing variable names.

y

A character vector containing variable names.

incl.eq

Logical. TRUE means that combinations are kept where a variable appears twice.

Value

The unique combinations of the variable names. Used in syntax creation.
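
One way to obtain unordered unique pairs in base R is sketched below. This is a hypothetical re-implementation for illustration (the name unique_pairs_sketch and the de-duplication strategy are assumptions), not the package's own code.

```r
# Hypothetical re-implementation; not the package source.
unique_pairs_sketch <- function(x, y, incl.eq = TRUE) {
  grid <- expand.grid(x = x, y = y, stringsAsFactors = FALSE)
  # Optionally drop pairs in which a variable appears twice.
  if (!incl.eq) grid <- grid[grid$x != grid$y, ]
  # Sort each pair so that (a, b) and (b, a) share one key.
  key <- apply(grid, 1, function(r) paste(sort(r), collapse = "|"))
  out <- grid[!duplicated(key), ]
  rownames(out) <- NULL
  out
}

vars <- c("V1", "V2", "V3")
with_eq    <- unique_pairs_sketch(vars, vars)                   # 6 pairs
without_eq <- unique_pairs_sketch(vars, vars, incl.eq = FALSE)  # 3 pairs
```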


Wrapup, create output files.

Description

Wrapup, create output files.

Usage

final.org(
  dat,
  grp,
  sub,
  sub_spec,
  diagnos = FALSE,
  store,
  confirm_subgroup,
  elig_paths = NULL
)

Arguments

dat

A list containing information created in setup().

grp

A list containing group-level information. NULL in aggSEM and indSEM.

sub

A list containing subgroup information.

sub_spec

A list containing information specific to each subgroup.

store

A list containing output from indiv.search().

elig_paths

If subgroup = TRUE, eligible paths for the potential individual-level search.

Value

Aggregated information, such as counts, levels, and plots.


Attempt to fit lavaan model.

Description

Attempt to fit lavaan model.

Usage

fit.model(syntax, data_file)

Arguments

syntax

A character vector containing syntax.

data_file

A data frame containing an individual's data set.

Value

If successful, returns fitted lavaan object. If not successful, catches and returns error message.
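
The "fit or return the error message" behavior can be sketched with tryCatch. The wrapper below is hypothetical: fit_model_sketch and toy_fit are illustrative names, and fit_fun stands in for whatever fitting call is actually used, so the example runs without lavaan.

```r
# Hypothetical wrapper; `fit_fun` stands in for the actual fitting call.
fit_model_sketch <- function(syntax, data_file, fit_fun) {
  tryCatch(
    fit_fun(syntax, data = data_file),
    error = function(e) conditionMessage(e)  # return the message, do not stop
  )
}

# A toy fitting function that fails on empty syntax, to show both outcomes.
toy_fit <- function(syntax, data) {
  if (!nzchar(syntax)) stop("empty model syntax")
  list(syntax = syntax, n = nrow(data))
}

d    <- data.frame(V1 = rnorm(10), V2 = rnorm(10))
ok   <- fit_model_sketch("V2 ~ V1", d, toy_fit)  # a list (stand-in for a fit)
fail <- fit_model_sketch("", d, toy_fit)         # the error message as text

# With lavaan installed, `fit_fun` could be lavaan::sem, for example:
# fit_model_sketch("V2 ~ V1", d, lavaan::sem)
```

Returning the message instead of raising the error lets a search loop over many individuals continue when one model fails to fit.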


Grabs final coefficients for each individual.

Description

Grabs final coefficients for each individual.

Usage

get.params(dat, grp, ind, k, ms.print = TRUE)

Arguments

dat

A list containing information created in setup().

grp

A list containing group-level information. NULL in aggSEM and indSEM.

ind

A list containing individual- and (potentially) subgroup-level information.

k

The counter indicating the individual.

Value

Individual-level information on fit, coefficients, and plots.


Group iterative multiple model estimation.

Description

This function identifies structural equation models for each individual that consist of both group-level and individual-level paths.

Usage

gimmeSEM(data        = NULL,
         out         = NULL,
         sep         = NULL,
         header      = NULL,
         ar          = TRUE,
         plot        = TRUE,
         subgroup    = FALSE,
         sub_feature = "lag & contemp",
         sub_method = "Walktrap",
         sub_sim_thresh    = "lowest", 
         confirm_subgroup = NULL,
         paths       = NULL,
         exogenous = NULL,
         outcome   = NULL,
         conv_vars   = NULL,
         conv_length = 16, 
         conv_interval = 1,
         mult_vars   = NULL,
         mean_center_mult = FALSE,
         standardize = FALSE,
         groupcutoff = .75,
         subcutoff   = .75,
         diagnos     = FALSE, 
         ms_allow         = FALSE,
         ms_tol           = 1e-5,
         lv_model         = NULL, 
         lv_estimator     = "miiv",     
         lv_scores        = "regression",       
         lv_miiv_scaling  = "first.indicator", 
         lv_final_estimator = "miiv",
         lasso_model_crit    = NULL, 
         hybrid = FALSE,
         VAR = FALSE,
         dir_prop_cutoff =0,
         ordered = NULL,
         group_correct = "Bonferoni Group")

Arguments

data

The path to the directory where the data files are located, or the name of the list containing each individual's time series. Each file or list element must hold one individual's data as a T (time) by p (number of variables) matrix, where the columns represent variables and the rows represent time. Individuals must have the same variables (p) but can have different lengths of observations (T).

out

The path to the directory where the results will be stored (optional). If specified, a copy of the output files will be placed in the directory. If the directory at the specified path does not exist, it will be created.

sep

The spacing of the data files. Follows R convention. "" indicates space-delimited, "\t" indicates tab-delimited, "," indicates comma-delimited. Only necessary to specify if reading data in from a physical directory.

header

Logical. Indicate TRUE for data files with a header. Only necessary to specify if reading data in from physical directory.

ar

Logical. If TRUE, begins search for group model with autoregressive (AR) paths freed for estimation. If ms_allow=TRUE, it is recommended to set ar=FALSE. Multiple solutions are unlikely to be found when ar=TRUE. Defaults to TRUE.

plot

Logical. If TRUE, graphs depicting relations among variables of interest will automatically be created. Solid lines represent contemporaneous relations (lag 0) and dashed lines reflect lagged relations (lag 1). For individual-level plots, red paths represent positive weights and blue paths represent negative weights. Width of paths corresponds to estimated path weight. For the group-level plot, black represents group-level paths, grey represents individual-level paths, and (if subgroup = TRUE) green represents subgroup-level paths. For the group-level plot, the width of the edge corresponds to the count. Defaults to TRUE.

subgroup

Logical. If TRUE, subgroups are generated based on similarities in model features using the walktrap.community function from the igraph package. When ms_allow=TRUE, subgroup should be set to FALSE. Defaults to FALSE.

sub_feature

Option to indicate feature(s) used to subgroup individuals. Defaults to "lag & contemp" for lagged and contemporaneous, which is the original method. Can use "lagged" or "contemp" to subgroup solely on features related to lagged and contemporaneous relations, respectively.

sub_method

Community detection method used to cluster individuals into subgroups. Options align with those available in the igraph package: "Walktrap" (default), "Infomap", "Louvain", "Edge Betweenness", "Label Prop", "Fast Greedy", "Leading Eigen", and "Spinglass".

sub_sim_thresh

Threshold for inducing sparsity in similarity matrix. Options are: the percent of edges in the similarity matrix to set to zero (e.g., .25 would set the lower quartile), "lowest" (default) subtracts the minimum value from all values, and "search" searches across thresholds to arrive at one providing highest modularity.

confirm_subgroup

Dataframe. Option only available when subgroup = TRUE. Dataframe should contain two columns. The first column should specify file labels (the name of the data files without file extension), and the second should contain integer values (beginning at 1) specifying the subgroup membership for each individual. Defaults to NULL.

paths

lavaan-style syntax containing paths with which to begin model estimation (optional). That is, Y~X indicates that Y is regressed on X, or X predicts Y. Paths can also be set to a specific value for estimation using lavaan-style syntax (e.g., 'V4 ~ 0.5*V3'), or set to 0 so that they will not be estimated (e.g., 'V4 ~ 0*V3'). If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. To reference lag variables, "lag" should be added to the end of the variable name with no separation. Defaults to NULL.

exogenous

Vector of variable names to be treated as exogenous (optional). That is, exogenous variable X can predict Y but cannot be predicted by Y. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. The default for exogenous variables is that lagged effects of the exogenous variables are not included in the model search. If lagged paths are wanted, "&lag" should be added to the end of the variable name with no separation. Defaults to NULL.

outcome

Vector of variable names to be treated as outcome (optional). This is a variable that can be predicted by others but cannot predict. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names.

conv_vars

Vector of variable names to be convolved via smoothed Finite Impulse Response (sFIR). Note that conv_vars are not automatically considered exogenous variables. To treat conv_vars as exogenous, use the exogenous argument. Variables listed in conv_vars must be binary variables. Lagged variables cannot be specified in conv_vars. If there are missing data in the endogenous variables, their values will be imputed for the convolution operation only. Defaults to NULL.

conv_length

Expected response length in seconds. For functional MRI BOLD, 16 seconds (default) is typical for the hemodynamic response function.

conv_interval

Interval between data acquisition. Currently conv_length/conv_interval must be an integer. For fMRI studies, this is the repetition time. Defaults to 1.

mult_vars

Vector of variable names to be multiplied to explore bilinear/modulatory effects (optional). All multiplied variables will be treated as exogenous (X can predict Y but cannot be predicted by Y). Within the vector, multiplication of two variables should be indicated with an asterisk (e.g. V1*V2). If no header is used, variables should be referred to with V followed by the column number (with no separation). If a header is used, each variable should be referred to using variable names. If multiplication with the lag 1 of a variable is desired, the variable name should be followed by "lag" with no separation (e.g. V1*V2lag).

mean_center_mult

Logical. If TRUE, the variables indicated in mult_vars will be mean-centered before being multiplied together. Defaults to FALSE.

standardize

Logical. If TRUE, all variables will be standardized to have a mean of zero and a standard deviation of one. Defaults to FALSE.

groupcutoff

Cutoff value for group-level paths. Defaults to .75, indicating that a path must be significant across 75% of individuals to be included as a group-level path.

subcutoff

Cutoff value for subgroup-level paths. Defaults to .75, indicating that a path must be significant across at least 75% of the individuals in a subgroup to be considered a subgroup-level path.

diagnos

Logical. If TRUE, provides internal output for diagnostic purposes. Defaults to FALSE.

ms_allow

Logical. If TRUE, provides multiple solutions when more than one path has identical modification index values. When ms_allow=TRUE, it is recommended to set ar=FALSE, because multiple solutions are unlikely to be found when ar=TRUE. Additionally, subgroup should be set to FALSE. Output files for individuals with multiple solutions will represent the last solution found for the individual, not necessarily the best solution for the individual. Defaults to FALSE.

ms_tol

Precision used when evaluating similarity of modification indices when ms_allow = TRUE. We recommend that ms_tol not be greater than the default, especially when standardize=TRUE. Defaults to 1e-5.

lv_model

Invoke latent variable modeling by providing the measurement model syntax here. lavaan conventions are used for relating observed variables to factors. Defaults to NULL.

lv_estimator

Estimator used for factor analysis. Options are "miiv" (default), "pml" (pseudo-ML) or "svd".

lv_scores

Method used for estimating latent variable scores from parameters obtained from the factor analysis when lv_model is not NULL. Options are: "regression" (Default), "bartlett".

lv_miiv_scaling

Type of scaling indicator to use when "miiv" selected for lv_estimator. Options are "first.indicator" (Default; the first observed variable in the measurement equation is used), "group" (best one for the group), or "individual" (each individual has the best one for them according to R2).

lv_final_estimator

Estimator for final estimations. "miiv" (Default) or "pml" (pseudo-ML).

lasso_model_crit

When not NULL, invokes the multiLASSO approach for the GIMME model search procedure. The value indicates the model selection criterion to use: 'bic' (select on BIC), 'aic', 'aicc', 'hqc', or 'cv' (cross-validation). Defaults to NULL.

hybrid

Logical. If TRUE, enables hybrid-VAR models where both directed contemporaneous paths and contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE.

VAR

Logical. If TRUE, fits VAR models in which contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE.

dir_prop_cutoff

Option to require that the directionality of a relation has to be higher than the reverse direction for a prespecified proportion of individuals. Defaults to 0.

ordered

A character vector containing the names of all ordered categorical variables in the model.

group_correct

Indicate how to correct for multiple testing. "Bonferoni Group" (Default) corrects the alpha value for the number of people (N) in the sample; "Bonferoni Paths" corrects according to the number of eligible paths for that individual; a numeric value greater than 0 and less than 1 can be entered to indicate the alpha level desired.
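
To illustrate how group_correct and groupcutoff interact, here is a minimal sketch in base R. It assumes that the group-level correction divides alpha = .05 by N and converts the result to a chi-square critical value with one degree of freedom, and that a path qualifies when the proportion of individuals exceeding that value reaches groupcutoff. The sample size and modification indices are made up, and the package's internal computation may differ.

```r
n_subj <- 20     # hypothetical sample size
alpha  <- 0.05

# Assumed form of the group-level correction: alpha / N, expressed as a
# chi-square critical value with 1 degree of freedom.
chisq_cutoff <- qchisq(1 - alpha / n_subj, df = 1)

# Hypothetical modification indices for one candidate path, one per person.
set.seed(3)
mi <- rchisq(n_subj, df = 1, ncp = 15)

prop_sig <- mean(mi > chisq_cutoff)  # share of individuals above the cutoff
add_path <- prop_sig >= 0.75         # compare against groupcutoff = .75
```

Lowering groupcutoff admits more group-level paths; supplying a numeric alpha to group_correct would replace the alpha / n_subj term in this sketch.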

Details

Output is a list of results if saved as an object and/or files printed to a directory if the "out" argument is used.

Value

A list with the following components:

Author(s)

Zachary Fisher, Kathleen Gates, & Stephanie Lane

References

Gates, K.M. & Molenaar, P.C.M. (2012). Group search algorithm recovers effective connectivity maps for individuals in homogeneous and heterogeneous samples. NeuroImage, 63, 310-319.

Lane, S.T. & Gates, K.M. (2017). Automated selection of robust individual-level structural equation models for time series data. Structural Equation Modeling.

Beltz, A.M. & Molenaar, P.C.M. (2016). Dealing with multiple solutions in structural vector autoregressive models. Multivariate Behavioral Research, 51(2-3), 357-373.

Examples

 ## Not run: 
paths <- 'V2 ~ V1
          V3 ~ V4lag'

fit <- gimmeSEM(data     = simData,
                out      = "C:/simData_out",
                subgroup = TRUE, 
                paths    = paths)

print(fit, mean = TRUE)
print(fit, subgroup = 1, mean = TRUE)
print(fit, file = "group_1_1", estimates = TRUE)
print(fit, subgroup = 2, fitMeasures = TRUE)
plot(fit, file = "group_1_1")
 
## End(Not run)

Write MS-GIMME results to data.frame.

Description

Write MS-GIMME results to data.frame.

Usage

gimmems.write(x)

Identifies highest MI from list of MIs.

Description

Identifies highest MI from list of MIs.

Usage

highest.mi(
  mi_list,
  indices,
  elig_paths,
  prop_cutoff,
  n_subj,
  chisq_cutoff,
  allow.mult,
  ms_tol,
  hybrid,
  dir_prop_cutoff
)

Arguments

mi_list

A list of MIs across individuals

indices

A list of fit indices. Only relevant at the individual-level.

elig_paths

A character vector containing eligible paths that gimme is allowed to add to a model (e.g., no nonsense paths).

prop_cutoff

The proportion of individuals for whom a path must be significant in order for it to be added to the models. NULL if used at the individual-level.

n_subj

The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1.

chisq_cutoff

Cutoff used in order for MI to be considered significant. Value varies depending on stage of search (e.g., group, subgroup, individual).

Value

Returns name of parameter associated with highest MI. If no MI meets the criteria, returns NA.
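
The selection logic described by these arguments can be sketched as follows. This is a simplified, hypothetical version: the name highest_mi_sketch, the param/mi column layout, and the tie-breaking rule (summed MI) are assumptions, and the multiple-solution and hybrid arguments are omitted.

```r
# Hypothetical simplification: `mi_list` holds one data frame per individual
# with columns `param` (path label) and `mi` (modification index).
highest_mi_sketch <- function(mi_list, elig_paths, prop_cutoff,
                              n_subj, chisq_cutoff) {
  all_mi <- do.call(rbind, mi_list)
  all_mi <- all_mi[all_mi$param %in% elig_paths, ]
  if (nrow(all_mi) == 0) return(NA)
  # Per path: how many individuals exceed the cutoff, and the summed MI.
  count <- vapply(split(all_mi$mi > chisq_cutoff, all_mi$param),
                  sum, numeric(1))
  total <- vapply(split(all_mi$mi, all_mi$param), sum, numeric(1))
  # Rank by count, then by summed MI to break ties.
  best <- names(count)[order(-count, -total)][1]
  if (count[[best]] / n_subj >= prop_cutoff) best else NA
}

mi_list <- list(
  data.frame(param = c("V2~V1", "V3~V1"), mi = c(12.0, 2.0)),
  data.frame(param = c("V2~V1", "V3~V1"), mi = c(9.5, 11.0)),
  data.frame(param = c("V2~V1", "V3~V1"), mi = c(15.0, 1.0))
)
chosen <- highest_mi_sketch(mi_list, elig_paths = c("V2~V1", "V3~V1"),
                            prop_cutoff = 0.75, n_subj = 3,
                            chisq_cutoff = qchisq(1 - 0.05 / 3, df = 1))
```

In this toy input, "V2~V1" exceeds the cutoff for all three individuals and is selected, whereas "V3~V1" exceeds it for only one and would not meet the .75 proportion.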


Individual-level structural equation model search.

Description

This function identifies structural equation models for each individual. It does not utilize any shared information from the sample.

Usage

indSEM(data   = NULL,
       out    = NULL,
       sep    = NULL,
       header = NULL,
       ar     = TRUE,
       plot   = TRUE,
       paths  = NULL,
       exogenous        = NULL, 
       outcome          = NULL, 
       conv_vars        = NULL,
       conv_length      = 16, 
       conv_interval    = 1,
       mult_vars        = NULL,
       mean_center_mult = FALSE,
       standardize      = FALSE,
       hybrid = FALSE,
       VAR    = FALSE)

Arguments

data

The path to the directory where the data files are located, or the name of the list containing each individual's time series. Each file or list element must hold one individual's data as a T (time) by p (number of variables) matrix, where the columns represent variables and the rows represent time.

out

The path to the directory where the results will be stored (optional). If specified, a copy of the output files will be placed in the directory. If the directory at the specified path does not exist, it will be created.

sep

The spacing of the data files. "" indicates space-delimited, "\t" indicates tab-delimited, "," indicates comma-delimited. Only necessary to specify if reading data in from a physical directory.

header

Logical. Indicate TRUE for data files with a header. Only necessary to specify if reading data in from physical directory.

ar

Logical. If TRUE, begins search for individual models with autoregressive (AR) paths open. Defaults to TRUE.

plot

Logical. If TRUE, graphs depicting relations among variables of interest will automatically be created. For individual-level plots, red paths represent positive weights and blue paths represent negative weights. Defaults to TRUE.

paths

lavaan-style syntax containing paths with which to begin model estimation. That is, Y~X indicates that Y is regressed on X, or X predicts Y. Paths can also be set to a specific value for estimation using lavaan-style syntax (e.g., 'V4 ~ 0.5*V3'), or set to 0 so that they will not be estimated (e.g., 'V4 ~ 0*V3'). If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. To reference lag variables, "lag" should be added to the end of the variable name with no separation. Defaults to NULL.

exogenous

Vector of variable names to be treated as exogenous. That is, exogenous variable X can predict Y but cannot be predicted by Y. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. Defaults to NULL.

outcome

Vector of variable names to be treated as outcome (optional). This is a variable that can be predicted by others but cannot predict. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names.

conv_vars

Vector of variable names to be convolved via smoothed Finite Impulse Response (sFIR). Defaults to NULL.

conv_length

Expected response length in seconds. For functional MRI BOLD, 16 seconds (default) is typical for the hemodynamic response function.

conv_interval

Interval between data acquisitions. Currently conv_length/conv_interval must be an integer. For fMRI studies, this is the repetition time. Defaults to 1.

mult_vars

Vector of variable names to be multiplied to explore bilinear/modulatory effects (optional). All multiplied variables will be treated as exogenous (X can predict Y but cannot be predicted by Y). Within the vector, multiplication of two variables should be indicated with an asterisk (e.g. V1*V2). If no header is used, variables should be referred to with V followed by the column number (with no separation). If a header is used, each variable should be referred to using variable names. If multiplication with the lag 1 of a variable is desired, the variable name should be followed by "lag" with no separation (e.g. V1*V2lag). Note that if multiplied variables are desired, at least one variable in the dataset must be specified as exogenous. Defaults to NULL.

mean_center_mult

Logical. If TRUE, the variables indicated in mult_vars will be mean-centered before being multiplied together. Defaults to FALSE.

standardize

Logical. If TRUE, all variables will be standardized to have a mean of zero and a standard deviation of one. Defaults to FALSE.

hybrid

Logical. If TRUE, enables hybrid-VAR models where both directed contemporaneous paths and contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE.

VAR

Logical. If TRUE, fits VAR models in which contemporaneous covariances among residuals are candidate relations in the search space. Defaults to FALSE.

Details

Output is a list of results if saved as an object and/or files printed to a directory if the "out" argument is used.

Author(s)

Stephanie Lane

Examples

 ## Not run: 
fit <- indSEM(data   = "C:/data100",
              out    = "C:/data100_indSEM_out",
              sep    = ",",
              header = FALSE)
print(fit, file = "group1.1", estimates = TRUE)
plot(fit, file = "group1.1")
 
## End(Not run)
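
The mult_vars, mean_center_mult, and standardize arguments describe simple transformations that can be previewed by hand. The sketch below shows, on made-up data, what a lag-1 product term (V1*V2lag), its mean-centered variant, and standardization amount to. The column names and the helper center are illustrative assumptions, and the package's internal handling (for example of the first, unlagged row) may differ.

```r
set.seed(4)
d <- data.frame(V1 = rnorm(100, mean = 5), V2 = rnorm(100, mean = 2))

# Lag-1 copy of V2: the value from the previous time point.
d$V2lag <- c(NA, d$V2[-nrow(d)])

# Raw product, as with mean_center_mult = FALSE.
d$V1_by_V2lag <- d$V1 * d$V2lag

# Mean-centered product, as with mean_center_mult = TRUE.
center <- function(v) v - mean(v, na.rm = TRUE)
d$V1_by_V2lag_c <- center(d$V1) * center(d$V2lag)

# standardize = TRUE corresponds to z-scoring every column.
d_std <- as.data.frame(scale(d))
```

Mean-centering before multiplying usually reduces the correlation between the product term and its components, which is one common reason to enable mean_center_mult.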

Individual-level search. Used in gimmeSEM, aggSEM, indSEM.

Description

Individual-level search. Used in gimmeSEM, aggSEM, indSEM.

Usage

indiv.search(dat, grp, ind, ind_cutoff = NULL, ind_z_cutoff = 1.96)

Arguments

dat

A list containing information created in setup().

grp

A list containing group-level information. NULL in aggSEM and indSEM.

ind

A list containing individual- and (potentially) subgroup-level information.

ind_cutoff

Chi square cutoff, .05 level adjusted for multiple tests.

ind_z_cutoff

Z score cutoff, .05 level adjusted for multiple tests.

Value

Lists associated with coefficients, fit indices, etc.


Individual-level search. Used in gimmeSEM, aggSEM, indSEM.

Description

Individual-level search. Used in gimmeSEM, aggSEM, indSEM.

Usage

indiv.search.ms(dat, grp, ind, ms_tol, ms_allow, grp_num)

Arguments

dat

A list containing information created in setup().

grp

A list containing group-level information. NULL in aggSEM and indSEM.

ind

A list containing individual- and (potentially) subgroup-level information.

Value

Lists associated with coefficients, fit indices, etc.


Identifies lowest z value from list of z values.

Description

Identifies lowest z value from list of z values.

Usage

lowest.z(z_list, elig_paths, prop_cutoff, n_subj, test_cutoff)

Arguments

z_list

A list of z values across individuals.

elig_paths

A character vector containing eligible paths that gimme is allowed to drop from the model at a given stage.

prop_cutoff

The proportion of individuals for whom a path must be nonsignificant in order for it to be dropped from the models. NULL if used at the individual-level.

n_subj

The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1.

test_cutoff

Z score cutoff for significance testing.

Value

Returns name of parameter associated with lowest z. If no z meets the criteria, returns NA.
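
The selection rule described by these arguments can be sketched in base R as follows. This is a simplified illustration, not the function shipped with the package: the function name lowest_z_sketch, the input layout (one named numeric vector of z values per individual), and the tie-breaking by summed absolute z are all assumptions.

```r
# Hypothetical re-implementation of the pruning rule, for illustration only.
lowest_z_sketch <- function(z_list, elig_paths, prop_cutoff, n_subj, test_cutoff) {
  # Paths-by-subjects matrix of absolute z values for the eligible paths.
  z_mat <- do.call(cbind, lapply(z_list, function(z) abs(z[elig_paths])))

  # Number of individuals for whom each path is nonsignificant.
  n_nonsig <- rowSums(z_mat < test_cutoff)

  if (is.null(prop_cutoff)) {
    # Individual level: the path must be nonsignificant for every subject passed in.
    droppable <- n_nonsig == n_subj
  } else {
    # Group or subgroup level: nonsignificant for a large enough proportion.
    droppable <- (n_nonsig / n_subj) >= prop_cutoff
  }
  if (!any(droppable)) return(NA)

  # Among droppable paths, return the one with the smallest summed |z|.
  z_sum <- rowSums(z_mat)
  names(which.min(z_sum[droppable]))
}

z_list <- list(s1 = c("V2~V1" = 3.1, "V3~V2" = 0.4),
               s2 = c("V2~V1" = 2.5, "V3~V2" = 1.0),
               s3 = c("V2~V1" = 0.8, "V3~V2" = 2.2))
paths  <- c("V2~V1", "V3~V2")

drop_group  <- lowest_z_sketch(z_list, paths, prop_cutoff = 0.51, n_subj = 3, test_cutoff = 1.96)
drop_strict <- lowest_z_sketch(z_list, paths, prop_cutoff = 0.90, n_subj = 3, test_cutoff = 1.96)
```

In this toy input, V3~V2 is nonsignificant for two of three individuals, so it is the only candidate at a cutoff of 0.51, and no path qualifies at a cutoff of 0.90.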


Fitted gimme object with multiple solutions

Description

This object contains a fitted gimme object where multiple solutions gimme was used. The simulated data had 25 individuals, each with 100 time points.

Usage

ms.fit

Format

A fitted gimme object, where multiple solutions gimme was used.


GIMME Predicted Values.

Description

This function calculates the predicted values of a fitted gimme model.

Usage

predict.gimme(x)

Arguments

x

A fitted gimme object.

Value

List of data frames. Each data frame contains the predicted values of a subject in the data.

Author(s)

Sebastian Castro-Alvarez

Examples

 ## Not run: 
paths <- 'V2 ~ V1
          V3 ~ V4lag'

fit <- gimmeSEM(data     = simData,
                out      = "C:/simData_out",
                subgroup = TRUE, 
                paths    = paths)

predictions <- predict.gimme(fit)
 
## End(Not run)

Prunes paths. Ties together lowest.z and return.zs functions.

Description

Prunes paths. Ties together lowest.z and return.zs functions.

Usage

prune.paths(
  base_syntax,
  fixed_syntax,
  add_syntax,
  data_list,
  n_paths,
  n_subj,
  prop_cutoff,
  elig_paths,
  subgroup_stage = FALSE,
  test_cutoff
)

Arguments

base_syntax

A character vector containing syntax that never changes.

fixed_syntax

A character vector containing syntax that does not change in a given stage of pruning.

add_syntax

A character vector containing the syntax that is allowed to change in a given stage of pruning.

data_list

A list of datasets to be used in a given stage of the search. Varies based on group, subgroup, or individual-level stage.

n_paths

The number of paths that are eligible for pruning. Equal to the number of paths in add_syntax.

n_subj

The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1.

prop_cutoff

The proportion of individuals for whom a path must be nonsignificant in order for it to be dropped from the models. NULL if used at the individual-level.

elig_paths

A character vector containing eligible paths that gimme is allowed to drop from the model at a given stage.

subgroup_stage

Logical. Only present in order to instruct gimme what message to print to console using writeLines.

test_cutoff

Z score cutoff for significance testing.

Value

Returns updated values of n_paths and add_syntax.


Recode variable names.

Description

Recode variable names.

Usage

recode.vars(data, oldvalue, newvalue)

Arguments

data

The vector of variable names to be recoded.

oldvalue

A vector containing the latent variable names used internally.

newvalue

A vector containing the observed variable names, either provided by the user (as a header) or provided by R (e.g., V1, V2).

Value

Recoded vector of variable names.
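
A minimal sketch of this kind of recoding in base R is shown below. The helper name recode_sketch and the internal names VAR1 to VAR3 are hypothetical; the package's own function may handle edge cases differently.

```r
# Hypothetical helper: replace each internal name found in `oldvalue`
# with the name at the same position in `newvalue`.
recode_sketch <- function(data, oldvalue, newvalue) {
  idx   <- match(data, oldvalue)   # position of each name in oldvalue (NA if absent)
  found <- !is.na(idx)
  out   <- data
  out[found] <- newvalue[idx[found]]
  out                              # names not listed in oldvalue are left unchanged
}

recoded <- recode_sketch(data     = c("VAR1", "VAR3", "other"),
                         oldvalue = c("VAR1", "VAR2", "VAR3"),
                         newvalue = c("V1", "V2", "V3"))
```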


GIMME Residuals.

Description

This function calculates the unstandardized and standardized residuals of a fitted gimme model.

Usage

residuals.gimme(x, lag)

Arguments

x

A fitted gimme object.

lag

The number of lags tested in the Box-Pierce and Ljung-Box tests of the residuals. If user does not specify a value, default is the smaller of 10 or the length of the time series divided by 5.

Value

List of four lists of data frames:

residuals

List of the unstandardized residuals per subject.

standardized.residuals

List of the standardized residuals per subject.

Box.Pierce.test

List of the results of the Box-Pierce test for each subject's residuals.

Ljung.Box.test

List of the results of the Ljung-Box test for each subject's residuals.
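
To show what the two tests report, the sketch below applies stats::Box.test to a single simulated residual series, using the default lag rule described for the lag argument. The residual vector is simulated white noise, not output from a fitted gimme model, and rounding the lag down with floor() is an assumption.

```r
# Simulated residuals for one variable of one subject (white noise).
set.seed(7)
res <- rnorm(120)

# Default lag: the smaller of 10 or the series length divided by 5.
lag_default <- min(10, floor(length(res) / 5))

# Both tests have the null hypothesis of no autocorrelation up to `lag`.
bp <- Box.test(res, lag = lag_default, type = "Box-Pierce")
lb <- Box.test(res, lag = lag_default, type = "Ljung-Box")

bp$p.value
lb$p.value
```

Small p-values would suggest that autocorrelation remains in the residuals, that is, that the fitted model has not captured all of the temporal dependence in the series.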

Author(s)

Sebastian Castro-Alvarez

Examples

 ## Not run: 
paths <- 'V2 ~ V1
          V3 ~ V4lag'

fit <- gimmeSEM(data     = simData,
                out      = "C:/simData_out",
                subgroup = TRUE,
                paths    = paths)

residuals <- residuals.gimme(fit)
residuals <- residuals.gimme(fit, lag = 5)
 
## End(Not run)

Returns MIs from lavaan fit object.

Description

Returns MIs from lavaan fit object.

Usage

return.mis(fit, elig_paths)

Arguments

fit

An object from lavaan.

Value

If successful, returns MIs for an individual. If unsuccessful, returns NA.


Returns z values from lavaan fit object.

Description

Returns z values from lavaan fit object.

Usage

return.zs(fit, elig_paths)

Arguments

fit

An object from lavaan.

elig_paths

Eligible paths at this stage. For subgrouping, group and fixed paths. For pruning, only group paths.

Value

If successful, returns z values for an individual. If unsuccessful, returns NA.


Estimate response function for each person using smoothed Finite Impulse Response.

Description

Estimate response function for each person using smoothed Finite Impulse Response.

Usage

sFIR(data, stimuli, response_length = 16, interval = 1)

Arguments

data

The data to be used to estimate the response function.

stimuli

A vector containing '0' when the stimulus of interest is not present and '1' otherwise. Number of observations across time must equal the length of data.

interval

Time between observations; for fMRI this is the repetition time. Defaults to 1.

Value

Shape of response function and convolved time series vector.
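
The idea behind a finite impulse response (FIR) estimate can be sketched with ordinary least squares in base R. This sketch deliberately omits the smoothing step that sFIR applies, and the simulated data, the gamma-shaped response, and all object names are assumptions made for illustration.

```r
set.seed(1)
n_obs           <- 200
response_length <- 8

# Binary stimulus vector: 1 at stimulus onsets, 0 otherwise.
stimuli <- rep(0, n_obs)
stimuli[seq(10, 190, by = 20)] <- 1

# Hypothetical true response shape, used only to simulate data.
true_resp <- dgamma(0:(response_length - 1), shape = 3, rate = 1)

# Observed series: stimulus convolved with the response, plus noise.
signal <- stats::filter(stimuli, true_resp, method = "convolution", sides = 1)
signal[is.na(signal)] <- 0
y <- as.numeric(signal) + rnorm(n_obs, sd = 0.05)

# FIR design matrix: column j holds the stimulus vector lagged by j - 1 steps.
X <- sapply(0:(response_length - 1),
            function(l) c(rep(0, l), stimuli[seq_len(n_obs - l)]))

# Unsmoothed least-squares estimate of the response shape.
est_resp <- as.numeric(solve(crossprod(X), crossprod(X, y)))

# Convolved stimulus vector based on the estimated response.
convolved <- as.numeric(X %*% est_resp)
```

A smoothed FIR additionally penalizes differences between neighbouring coefficients, which stabilizes the estimated shape when stimuli are closely spaced or the data are noisy.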


Searches for paths. Ties together highest.mi and return.mis functions.

Description

Searches for paths. Ties together highest.mi and return.mis functions.

Usage

search.paths(
  base_syntax,
  fixed_syntax,
  add_syntax,
  n_paths,
  data_list,
  elig_paths,
  prop_cutoff,
  n_subj,
  chisq_cutoff,
  subgroup_stage = FALSE,
  ms_allow = FALSE,
  ms_tol = 1e-06,
  hybrid = F,
  dir_prop_cutoff = 0
)

Arguments

base_syntax

A character vector containing syntax that never changes.

fixed_syntax

A character vector containing syntax that does not change in a given stage of searching.

add_syntax

A character vector containing the syntax that is allowed to change in a given stage of searching.

n_paths

The number of paths present in a given stage of searching. Equal to the number of paths in add_syntax.

data_list

A list of datasets to be used in a given stage of the search. Varies based on group, subgroup, or individual-level stage.

elig_paths

A character vector containing eligible paths that gimme is allowed to add to the model at a given stage.

prop_cutoff

The proportion of individuals for whom a path must be significant in order for it to be added to the models. NULL if used at the individual-level.

n_subj

The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1.

chisq_cutoff

Cutoff used in order for MI to be considered significant. Value varies depending on stage of search (e.g., group, subgroup, individual).

subgroup_stage

Logical. Only present in order to instruct gimme what message to print to console using writeLines.

Value

Returns updated values of n_paths and add_syntax.


Searches for individual-level paths. Ties together highest.mi, return.mis, prune, and get.params functions.

Description

Searches for individual-level paths. Ties together highest.mi, return.mis, prune, and get.params functions.

Usage

search.paths.ind(
  dat,
  k,
  data_list,
  base_syntax,
  fixed_syntax,
  elig_paths,
  prop_cutoff,
  n_subj,
  chisq_cutoff,
  subgroup_stage,
  hybrid,
  dir_prop_cutoff,
  ind_z_cutoff
)

Arguments

dat

Object created at beginning of gimme containing static info.

k

Which individual this is.

data_list

A list of datasets to be used in a given stage of the search. Varies based on group, subgroup, or individual-level stage.

base_syntax

A character vector containing syntax that never changes.

fixed_syntax

A character vector containing syntax that does not change in a given stage of searching.

elig_paths

A character vector containing eligible paths that gimme is allowed to add to the model at a given stage.

prop_cutoff

The proportion of individuals for whom a path must be significant in order for it to be added to the models. NULL if used at the individual-level.

n_subj

The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1.

chisq_cutoff

Cutoff used in order for MI to be considered significant. Value varies depending on stage of search (e.g., group, subgroup, individual).

subgroup_stage

Logical. Only present in order to instruct gimme what message to print to console using writeLines.

Value

Returns updated values of n_paths and add_syntax.


Searches for paths. Ties together highest.mi and return.mis functions.

Description

Searches for paths. Ties together highest.mi and return.mis functions.

Usage

search.paths.ms(
  obj,
  data_list,
  base_syntax,
  fixed_syntax,
  elig_paths,
  prop_cutoff,
  n_subj,
  chisq_cutoff,
  subgroup_stage,
  ms_allow,
  ms_tol,
  hybrid,
  dir_prop_cutoff
)

Arguments

data_list

A list of datasets to be used in a given stage of the search. Varies based on group, subgroup, or individual-level stage.

base_syntax

A character vector containing syntax that never changes.

fixed_syntax

A character vector containing syntax that does not change in a given stage of searching.

elig_paths

A character vector containing eligible paths that gimme is allowed to add to the model at a given stage.

prop_cutoff

The proportion of individuals for whom a path must be significant in order for it to be added to the models. NULL if used at the individual-level.

n_subj

The number of subjects in a given stage of the search. If in the group stage, n_subj equals the number of subjects. If in the subgroup stage, n_subj equals the number of individuals in a given subgroup. At the individual stage, n_subj = 1.

chisq_cutoff

Cutoff used in order for MI to be considered significant. Value varies depending on stage of search (e.g., group, subgroup, individual).

subgroup_stage

Logical. Only present in order to instruct gimme what message to print to console using writeLines.

Value

Returns updated values of n_paths and add_syntax.


Set up base syntax file.

Description

Set up base syntax file.

Usage

setupBaseSyntax(paths, remove, varLabels, ctrlOpts)

Group iterative multiple model estimation.

Description

This function estimates the basis vectors related to responses following a binary impulse and convolves that binary impulse vector.

Usage

convolveFIR(ts_list = NULL, 
     varLabels = NULL, 
     conv_length = 16, 
     conv_interval = 1)

Arguments

ts_list

a list of dataframes.

varLabels

a list of variable sets. Contains varLabels$coln, all column names, varLabels$conv, the names of variables to convolve, and varLabels$exog, a list of exogenous variables (if any).

conv_length

Expected response length in seconds. For functional MRI BOLD, 16 seconds (default) is typical for the hemodynamic response function.

conv_interval

Interval between data acquisition. Currently must be a scalar. For fMRI studies, this is the repetition time. Defaults to 1.


Create a list of dataframes

Description

Create a list of dataframes

Usage

setupDataLists(data, ctrlOpts = NULL, lv_model = NULL)

Arguments

data

a list or directory.

ctrlOpts

a list of control options.


Get names for bilinear effects.

Description

Get names for bilinear effects.

Usage

setupMultVarNames(mult_vars)

Do some preliminary checks on the data.

Description

Do some preliminary checks on the data.

Usage

setupPrelimDataChecks(df)

Allows user to open and close certain paths.

Description

Allows user to open and close certain paths.

Usage

setupPrepPaths(paths, varLabels, ctrlOpts)

Arguments

paths

lavaan-style syntax containing paths with which to begin model estimation (optional). That is, Y~X indicates that Y is regressed on X, or X predicts Y. If no header is used, then variables should be referred to with V followed (with no separation) by the column number. If a header is used, variables should be referred to using variable names. To reference lag variables, "lag" should be added to the end of the variable name with no separation. Defaults to NULL.


Transform raw data as required.

Description

Transform raw data as required.

Usage

setupTransformData(
  ts_list = NULL,
  varLabels = NULL,
  ctrlOpts = NULL,
  ms_allow = FALSE
)

Arguments

ts_list

a list or directory

varLabels

Variable labels.

ctrlOpts

List used in setup function.


Large example, heterogeneous data, group, subgroup, and individual level effects.

Description

This object contains a list of simulated time series data for twenty-five individuals with 200 time points and 10 variables, or regions of interest.

Usage

simData

Format

A list of data frames with 25 individuals, who each have 200 observations on 10 variables.


Latent variable example, heterogeneous data, group, subgroup level effects.

Description

This object contains a list of simulated time series data for twenty individuals with 500 time points and 9 variables, or regions of interest.

Usage

simDataLV

Format

A list of data frames with 20 individuals, who each have 500 observations on 9 variables.


Simulate data from Vector AutoRegression (VAR) models.

Description

This function simulates data. It allows for structural VAR and VAR data generating models.

Usage

simulateVAR(A   = NULL, 
            Phi       = NULL, 
            Psi       = NULL, 
            subAssign = NULL, 
            N         = NULL, 
            ASign     = "random",  
            PhiSign   = "random",  
            Obs       = NULL,
            indA      = 0.01, 
            indPhi    = 0.01,
            indPsi    = 0.00)

Arguments

A

A matrix (if there are no subgroups) or a list of A matrices, with one matrix per subgroup.

Phi

Phi matrix (if there are no subgroups) or a list of Phi matrices, with one matrix per subgroup.

Psi

Psi matrix (if there are no subgroups) or a list of Psi matrices, with one matrix per subgroup.

subAssign

Optional vector of length N that indicates which subgroup each individual is in.

N

Number of individuals.

ASign

Sign of the individual-level A paths. Defaults to "random", which gives each path a 50 percent chance of being positive and a 50 percent chance of being negative. The other options are "neg" and "pos", which make all relations negative or all positive, respectively.

PhiSign

Sign of the individual-level Phi paths. Defaults to "random", which gives each path a 50 percent chance of being positive and a 50 percent chance of being negative. The other options are "neg" and "pos", which make all relations negative or all positive, respectively.

Obs

Number of observations (T) per individual. A burn-in of 400 observations is generated and then discarded.

indA

Sparsity of individual-level A paths. 0 indicates no individual-level paths. Use decimals. Default is 0.01, meaning that each path that is not in the group-level A matrix has a 0.01 chance of being added.

indPhi

Sparsity of individual-level Phi paths. 0 indicates no individual-level paths. Use decimals. Default is 0.01, meaning that each path that is not in the group-level Phi matrix has a 0.01 chance of being added.

indPsi

Sparsity of individual-level Psi paths. 0 indicates no individual-level paths. Use decimals. Default is 0, meaning that each path that is not in the group-level Psi matrix has a 0 chance of being added at the individual level. Individual-level paths are added at this rate per individual.

Author(s)

KM Gates, Ai Ye, Ethan McCormick, & Zachary Fisher
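
For readers unfamiliar with the data generating model, the base R sketch below simulates one individual from a structural VAR of order 1 with a burn-in period. It does not call simulateVAR and is not its implementation: the model form y_t = A y_t + Phi y_(t-1) + e_t, the convention that rows are outcomes and columns are predictors, the diagonal Psi, and all object names are assumptions.

```r
set.seed(123)
p     <- 3      # number of variables
n_obs <- 100    # observations kept (cf. Obs)
burn  <- 400    # burn-in observations, discarded afterwards

# Contemporaneous (A) and lagged (Phi) coefficient matrices.
A   <- matrix(0, p, p); A[2, 1] <- 0.4          # V1 predicts V2 at the same time point
Phi <- diag(0.5, p);    Phi[3, 2] <- 0.3        # autoregression, plus V2lag predicts V3
Psi <- diag(1, p)                               # residual covariance (diagonal here)

# Reduced form: y_t = (I - A)^{-1} (Phi y_{t-1} + e_t).
inv_IA <- solve(diag(p) - A)

y <- matrix(0, nrow = burn + n_obs, ncol = p)
for (tt in 2:(burn + n_obs)) {
  e <- rnorm(p, sd = sqrt(diag(Psi)))           # independent residuals
  y[tt, ] <- as.numeric(inv_IA %*% (Phi %*% y[tt - 1, ] + e))
}

# Drop the burn-in so the retained series does not depend on the starting values.
sim <- as.data.frame(y[-seq_len(burn), ])
names(sim) <- paste0("V", seq_len(p))
```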


Solution trees for multiple solutions gimme.

Description

This function allows for the exploration of divergences in multiple solutions gimme for both the group and individuals.

Usage

solution.tree(x,
              level     =  c("group", "individual"),
              cols      =  NULL,
              ids       =  "all",
              plot.tree =  FALSE)

Arguments

x

A fitted gimme object.

level

A character vector indicating what levels of the solution tree you would like returned. Options are "group", "individual", or c("group", "individual"). Defaults to c("group", "individual").

cols

A character vector indicating additional information to include in tree plot. Options include "stage", "pruned", "rmsea", "nnfi", "cfi", "srmr", "grp_sol", "bic", "aic", and "modularity". Defaults to NULL.

ids

A character vector indicating the names of subjects to print. Defaults to "all".

plot.tree

Logical. If TRUE, plot of tree is produced. Defaults to FALSE.

Details

solution.tree


Create structure of group search solutions.

Description

Create structure of group search solutions.

Usage

subgroupStage(
  dat,
  grp,
  confirm_subgroup,
  elig_paths,
  sub_feature,
  sub_method,
  ms_tol,
  ms_allow,
  sub_sim_thresh,
  hybrid,
  dir_prop_cutoff,
  group_correct
)

Create summary matrix of path counts and subgroup plots

Description

Create summary matrix of path counts and subgroup plots

Usage

summaryPathsCounts(dat, grp, store, sub, sub_spec)

Arguments

dat

A list containing information created in setup().

grp

A list containing group-level information. NULL in aggSEM and indSEM.

store

A list containing output from indiv.search().

sub

A list containing subgroup information.

sub_spec

A list containing information specific to each subgroup.

Value

Aggregated information, such as counts, levels, and plots.


Small example, heterogeneous data, group and individual level effects

Description

This object contains a list of simulated time series data for five individuals with 50 time points and 3 variables, or regions of interest.

Usage

ts

Format

A list of data frames with 5 individuals, who each have 50 observations on 3 variables.


Create edge list from weight matrix.

Description

Create edge list from weight matrix.

Usage

w2e(x)

Arguments

x

The coefficient matrix from an individual.

Value

A list of all non-zero edges to feed to qgraph
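
The conversion from a weight matrix to an edge list can be sketched in base R as follows. This is an illustration rather than the package's code: the example matrix, the column names of the result, and the convention that rows are outcomes and columns are predictors are assumptions.

```r
# Hypothetical coefficient matrix for one individual
# (assumed convention: row = outcome, column = predictor).
vars <- c("V1", "V2", "V3")
W <- matrix(c(0,   0.3,  0,
              0,   0,   -0.5,
              0.2, 0,    0),
            nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

# Row and column indices of all non-zero entries.
idx <- which(W != 0, arr.ind = TRUE)

# One row per non-zero edge: predictor, outcome, and weight.
edges <- data.frame(from   = colnames(W)[idx[, "col"]],
                    to     = rownames(W)[idx[, "row"]],
                    weight = W[idx])
```

An edge list in this three-column form (from, to, weight) can be passed to plotting functions that accept edge lists, so that only the paths actually present in the model are drawn.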