Help for package mmiCATs

Title:

Cluster Adjusted t Statistic Applications

Version:

0.2.0

Description:

Simulation results detailed in Esarey and Menger (2019) <doi:10.1017/psrm.2017.42> demonstrate that cluster adjusted t statistics (CATs) are an effective method for correcting standard errors in scenarios with a small number of clusters. The 'mmiCATs' package offers a suite of tools for working with CATs. The mmiCATs() function initiates a 'shiny' web application, facilitating the analysis of data utilizing CATs, as implemented in the cluster.im.glm() function from the 'clusterSEs' package. Additionally, the pwr_func_lmer() function is designed to simplify the process of conducting simulations to compare mixed effects models with CATs models. For educational purposes, the CloseCATs() function launches a 'shiny' application card game, aimed at enhancing users' understanding of the conditions under which CATs should be preferred over random intercept models.

License:

MIT + file LICENSE

Encoding:

UTF-8

RoxygenNote:

7.3.2

URL:

https://github.com/mightymetrika/mmiCATs

BugReports:

https://github.com/mightymetrika/mmiCATs/issues

Imports:

broom, broom.mixed, clusterSEs, DT, lmerTest, MASS, mmcards, pool, robust, robustbase, RPostgres, shiny, shinythemes

Suggests:

testthat (≥ 3.0.0)

Config/testthat/edition:

NeedsCompilation:

Packaged:

2024-08-26 00:38:48 UTC; Administrator

Author:

Mackson Ncube [aut, cre], mightymetrika, LLC [cph, fnd]

Maintainer:

Mackson Ncube <macksonncube.stats@gmail.com>

Repository:

CRAN

Date/Publication:

2024-08-26 04:20:01 UTC

CloseCATs Shiny Application

Description

This function creates and runs a Shiny application for the CloseCATs game. The application provides a user interface for setting up the game, dealing cards, swapping cards, and scoring the game. Cards are dealt to the player and the computer, the player may swap the cards in their column, and the game is scored by the misspecification distance calculated from the processed hands.

Usage

CloseCATs()

Details

The UI allows players to input various statistical parameters and preferences for the game setup. It also provides interactive elements for dealing cards, swapping cards within a column, and scoring the game based on the calculated misspecification distance.

The main components of the Shiny application include:

A sidebar for inputting game parameters and controls for dealing and scoring.
A main panel for displaying game cards, swap options, and results.
Reactive elements that update based on user interaction and game state.

Value

A Shiny app object which can be run to start the application.

Examples

# To run the CloseCATs Shiny application:
if(interactive()){
  CloseCATs()
}

Launch KenRCATs Shiny Application

Description

This function launches a 'shiny' application for conducting power analysis simulations using CATs (cluster adjusted t statistics) and Kenward-Roger methods. The app allows users to input simulation parameters, run simulations, view results, and manage data in a PostgreSQL database.

Usage

KenRCATs(dbname, datatable, host, port, user, password)

Arguments

dbname

Character string specifying the name of the PostgreSQL database.

datatable

Character string specifying the name of the table in the database.

host

Character string specifying the host name or IP address of the database server.

port

Integer specifying the port number on which the database is running.

user

Character string specifying the username for database connection.

password

Character string specifying the password for database connection.

Details

The KenRCATs function sets up a Shiny application with the following features:

Input fields for various simulation parameters
Ability to run power analysis simulations
Display of simulation results
Option to submit results to a PostgreSQL database
Functionality to download data from the database
Display of relevant citations

Value

A 'shiny' app object.

Examples

if(interactive()){
  KenRCATs(
    dbname = "your_database_name",
    datatable = "your_table_name",
    host = "localhost",
    port = 5432,
    user = "your_username",
    password = "your_password"
  )
}
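To avoid hard-coding credentials in a script, the connection arguments can be assembled from environment variables before the app is launched. The following is a minimal sketch; the environment variable names (MMICATS_DBNAME and so on) are hypothetical and are not defined by the package.

```r
# Hypothetical environment variable names; placeholders are used when unset.
db_args <- list(
  dbname    = Sys.getenv("MMICATS_DBNAME", "your_database_name"),
  datatable = Sys.getenv("MMICATS_TABLE", "your_table_name"),
  host      = Sys.getenv("MMICATS_HOST", "localhost"),
  port      = as.integer(Sys.getenv("MMICATS_PORT", "5432")),
  user      = Sys.getenv("MMICATS_USER", "your_username"),
  password  = Sys.getenv("MMICATS_PASSWORD", "your_password")
)

# Launch the app only in an interactive session, as in the example above.
if (interactive()) {
  do.call(mmiCATs::KenRCATs, db_args)
}
```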

Append KenRCATs Results with Input Parameters

Description

Append KenRCATs Results with Input Parameters

Usage

append_KenRCATs(df, input)

Arguments

df

A data frame containing simulation results.

input

A list of input parameters from the Shiny app.

Value

A list containing two data frames: the original results and the results with appended input parameters.

Column Swap in CloseCATs Game Grid

Description

This internal function performs a column swap in the CloseCATs game grid. It is designed to allow the player to swap the cards in their column (column 2). This swap changes the contribution of the cards to the random slope variance and the covariance between the random slope and the random intercept. The function modifies the order of cards in the specified column by reversing their positions.

Usage

cc_swapper(cards_matrix, swap_in_col = NULL)

Arguments

cards_matrix

A matrix representing the current state of the game grid. The matrix should have 2 rows and a number of columns equal to the number of players (including the computer). Each cell of the matrix contains a card.

swap_in_col

The column number where the swap should be performed. If this parameter is NULL or not equal to 2, no action is taken. The default value is NULL. Typically, this parameter should be set to 2 to perform a swap for the player's column.

Value

The modified game grid matrix after performing the swap in the specified column.
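The swap described above amounts to reversing the two cards in the player's column. A minimal sketch of that behaviour follows; swap_column_sketch() is a hypothetical stand-in written for illustration and is not the package's implementation, and plain character labels are used in place of card objects.

```r
# Hypothetical stand-in for the documented behaviour: reverse the cards in
# column 2, and leave the grid untouched for NULL or any other column.
swap_column_sketch <- function(cards_matrix, swap_in_col = NULL) {
  if (is.null(swap_in_col) || swap_in_col != 2) {
    return(cards_matrix)
  }
  cards_matrix[, swap_in_col] <- rev(cards_matrix[, swap_in_col])
  cards_matrix
}

# Column 1 is the computer, column 2 is the player (matrix() fills by column).
grid <- matrix(c("2_of_clubs", "king_of_hearts",
                 "7_of_diamonds", "ace_of_spades"), nrow = 2)
swapped <- swap_column_sketch(grid, swap_in_col = 2)
unchanged <- swap_column_sketch(grid)
```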

Cluster-Adjusted Confidence Intervals and p-Values for Robust GLMs

Description

Performs cluster-adjusted inference on a robust generalized linear model object, using robust generalized linear regression within each cluster. This function is tailored for models where observations are clustered, and standard errors need adjustment for clustering. The function applies a robust generalized linear regression model to each cluster using the specified family and method, and then aggregates the results.

Usage

cluster_im_glmRob(
  robmod,
  dat,
  cluster,
  ci.level = 0.95,
  drop = TRUE,
  return.vcv = FALSE,
  engine = "robust",
  ...
)

Arguments

robmod

A robust generalized linear model object created using robust::glmRob() or robustbase::glmrob(). It must contain elements 'formula', 'family', and 'method'.

dat

A data frame containing the data used in the model.

cluster

A formula indicating the clustering variable in dat.

ci.level

Confidence level for the confidence intervals, default is 0.95.

drop

Logical; if TRUE, drops clusters where the model does not converge.

return.vcv

Logical; if TRUE, the variance-covariance matrix of the cluster-averaged coefficients will be returned.

engine

Set the engine to "robust" to use robust::glmRob() or "robustbase" to use robustbase::glmrob(). Default is "robust".

...

Additional arguments to be passed to robust::glmRob() or robustbase::glmrob().

Value

An invisible list containing the following elements:

p.values: A matrix of p-values for each independent variable.
ci: A matrix with the lower and upper bounds of the confidence intervals for each independent variable.
vcv.hat: The variance-covariance matrix of the cluster-averaged coefficients, returned if return.vcv is TRUE.
beta.bar: The cluster-averaged coefficients, returned if return.vcv is TRUE.

Examples

iris_bin <- iris
# Create a binary indicator for long sepals (Sepal.Length > 5.8)
iris_bin$high_Sepal.Length <- as.factor(ifelse(iris_bin$Sepal.Length > 5.8, 1, 0))

robout <- robustbase::glmrob(formula = high_Sepal.Length ~ Petal.Length + Petal.Width,
                             family = binomial,
                             data = iris_bin)
cluster_im_glmRob(robout, dat = iris_bin, ~Species, return.vcv = TRUE,
                  engine = "robustbase")

Cluster-Adjusted Confidence Intervals and p-Values for Robust Linear Models

Description

Performs cluster-adjusted inference on a robust linear model object, using robust linear regression within each cluster. This function is designed to handle models where observations are clustered, and standard errors need to be adjusted to account for this clustering. The function applies a robust linear regression model to each cluster and then aggregates the results.

Usage

cluster_im_lmRob(
  robmod,
  formula,
  dat,
  cluster,
  ci.level = 0.95,
  drop = TRUE,
  return.vcv = FALSE,
  engine = "robust",
  ...
)

Arguments

robmod

A robust linear model object created using robust::lmRob() or robustbase::lmrob().

formula

A formula or a string that can be coerced to a formula.

dat

A data frame containing the data used in the model.

cluster

A formula indicating the clustering variable in dat.

ci.level

Confidence level for the confidence intervals, default is 0.95.

drop

Logical; if TRUE, drops clusters where the model does not converge.

return.vcv

Logical; if TRUE, the variance-covariance matrix of the cluster-averaged coefficients will be returned.

engine

Set the engine to "robust" to use robust::lmRob() or "robustbase" to use robustbase::lmrob(). Default is "robust".

...

Additional arguments to be passed to the robust::lmRob() or the robustbase::lmrob() function.

Value

A list containing the following elements:

p.values: A matrix of p-values for each independent variable.
ci: A matrix with the lower and upper bounds of the confidence intervals for each independent variable.
vcv.hat: The variance-covariance matrix of the cluster-averaged coefficients, returned if return.vcv is TRUE.
beta.bar: The cluster-averaged coefficients, returned if return.vcv is TRUE.

Examples

form <- Sepal.Length ~ Petal.Length + Petal.Width
mod <- robust::lmRob(formula = form, data = iris)
cluster_im_lmRob(robmod = mod, formula = form, dat = iris, cluster = ~Species)

Deal Cards to CloseCATs Game Grid

Description

This internal function deals cards to a 2 x 2 grid for the CloseCATs game. The game involves a deck of cards, where cards are dealt to both the computer and the player. The matrix format of the game grid is such that column 1 is for the computer, and column 2 is for the player. The first row of cards contributes to the random slope variance, and the second row contributes to the covariance between the random slope and the random intercept.

Usage

deal_cards_to_cc_grid(
  deck = mmcards::i_deck(deck = mmcards::shuffle_deck(), i_path = "www", i_names =
    c("2_of_clubs", "2_of_diamonds", "2_of_hearts", "2_of_spades", "3_of_clubs",
    "3_of_diamonds", "3_of_hearts", "3_of_spades", "4_of_clubs", "4_of_diamonds",
    "4_of_hearts", "4_of_spades", "5_of_clubs", "5_of_diamonds", "5_of_hearts",
    "5_of_spades", "6_of_clubs", "6_of_diamonds", "6_of_hearts", "6_of_spades",
    "7_of_clubs", "7_of_diamonds", "7_of_hearts", "7_of_spades", "8_of_clubs",
    "8_of_diamonds", "8_of_hearts", "8_of_spades", "9_of_clubs", "9_of_diamonds",
    "9_of_hearts", "9_of_spades", "10_of_clubs", "10_of_diamonds", "10_of_hearts",
    "10_of_spades", "jack_of_clubs", "jack_of_diamonds", "jack_of_hearts",
    "jack_of_spades", "queen_of_clubs", "queen_of_diamonds", "queen_of_hearts",
    "queen_of_spades", "king_of_clubs", "king_of_diamonds", "king_of_hearts",
    "king_of_spades", "ace_of_clubs", "ace_of_diamonds", "ace_of_hearts",
    "ace_of_spades")),
  n
)

Arguments

deck

A dataframe representing a deck of cards. If not provided, a shuffled deck is generated using mmcards::i_deck() and mmcards::shuffle_deck(). The deck should contain at least 2*n cards.

n

The number of players. It defines the number of columns in the grid. Each player, including the computer, will be dealt two cards.

Value

A matrix representing the game grid with dealt cards. Each cell of the matrix contains a card dealt to either the computer (column 1) or the player (column 2).

Handle Model Fitting Failures and Variable Dropping

Description

This internal function is designed to handle failures in cluster-specific model fitting and variable dropping issues in the context of cluster-adjusted robust inference functions. It checks for model fitting failures and whether independent variables have been dropped in the model.

Usage

fail_drop(drop, fail, clust.mod, ind_variables)

Arguments

drop

Logical; if TRUE, allows the function to return NA for failed model fits, otherwise stops execution with an error message.

fail

Logical; indicates whether the model fitting process has failed.

clust.mod

A model object resulting from cluster-specific fitting.

ind_variables

A character vector of independent variable names expected in the model.

Value

If fail is TRUE and drop is FALSE, the function stops with an error message. If fail is TRUE and drop is TRUE, it returns NA. If fail is FALSE, it returns the coefficients for the independent variables in clust.mod.
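The three outcomes listed above can be sketched as follows. fail_drop_sketch() is a hypothetical illustration of the documented return behaviour only; it does not reproduce the package's check for dropped variables, and the error message is invented.

```r
# Hypothetical sketch of the documented branches, not the package's code.
fail_drop_sketch <- function(drop, fail, clust.mod, ind_variables) {
  if (fail) {
    if (!drop) {
      stop("cluster-specific model failed to fit; consider drop = TRUE")
    }
    return(NA)
  }
  # Successful fit: keep only the coefficients of the requested variables.
  stats::coef(clust.mod)[ind_variables]
}

mod <- lm(mpg ~ wt, data = mtcars)
ok <- fail_drop_sketch(drop = TRUE, fail = FALSE, clust.mod = mod,
                       ind_variables = "wt")
dropped <- fail_drop_sketch(drop = TRUE, fail = TRUE, clust.mod = NULL,
                            ind_variables = "wt")
```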

Fit a Linear Model with Robust Estimation

Description

This function fits a linear model using robust estimation methods. It allows the use of either the 'robust' or 'robustbase' packages for fitting the model.

Usage

fit_model(engine, formula, data, ...)

Arguments

engine

A character string specifying the engine to be used for model fitting. Must be either "robust" or "robustbase".

formula

An object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted.

data

A data frame containing the variables in the model.

...

Additional arguments to be passed to the underlying fitting function.

Value

A fitted model.
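One way to picture the engine choice is a small dispatcher over the two fitting functions. The sketch below is an assumption about the general shape of such a helper, not the package's source; fit_model_sketch() is a hypothetical name, and the usage lines run only if 'robustbase' is installed.

```r
# Hypothetical dispatcher: validate the engine, then call the matching
# robust linear model fitter with the remaining arguments.
fit_model_sketch <- function(engine, formula, data, ...) {
  engine <- match.arg(engine, c("robust", "robustbase"))
  switch(engine,
    robust     = robust::lmRob(formula = formula, data = data, ...),
    robustbase = robustbase::lmrob(formula = formula, data = data, ...)
  )
}

if (requireNamespace("robustbase", quietly = TRUE)) {
  fit <- fit_model_sketch("robustbase", Sepal.Length ~ Petal.Length, iris)
  print(coef(fit))
}
```

Validating the engine with match.arg() makes an unsupported value fail early with a clear message rather than silently returning NULL from switch().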

Fit a Generalized Linear Model with Robust Estimation

Description

This function fits a generalized linear model using robust estimation methods. It allows the use of either the 'robust' or 'robustbase' packages for fitting the model.

Usage

fit_model_g(engine, formula, data, family, method, ...)

Arguments

engine

A character string specifying the engine to be used for model fitting. Must be either "robust" or "robustbase".

formula

An object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted.

data

A dataframe containing the variables in the model.

family

A description of the error distribution and link function to be used in the model (See stats::family for details of family functions).

method

Fitting method to be used. Can vary depending on the selected engine.

...

Additional arguments to be passed to the underlying fitting function.

Value

A fitted model.

Format Citation

Description

This internal function formats a citation object into a readable string. The function extracts relevant information such as the title, author, year, address, note, and URL from the citation object and formats it into a standardized citation format.

Usage

format_citation(cit)

Arguments

cit

A citation object typically obtained from citation().

Value

A character string with the formatted citation.
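As an illustration of the idea, a citation object returned by citation() exposes its fields with the $ operator, so a readable string can be pasted together from them. The sketch below uses only the author, year, and title fields and an arbitrary layout; format_citation_sketch() is a hypothetical name and its output format is not the one the package produces.

```r
# Hypothetical sketch: build "Author (Year). Title." from a citation object.
format_citation_sketch <- function(cit) {
  authors <- paste(format(cit$author), collapse = ", ")
  paste0(authors, " (", cit$year, "). ", cit$title, ".")
}

# citation() with no arguments returns the citation for R itself.
out <- format_citation_sketch(citation())
print(out)
```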

Generate UI Parameters for KenRCATs Shiny App

Description

This internal function creates a list of UI elements for the KenRCATs() 'shiny' application. It generates input fields for various parameters used in the power analysis simulation.

Usage

getUIParams()

Value

A shiny::tagList containing UI elements.

Note

This function is intended for internal use within the KenRCATs() 'shiny' application.

Extract Model Information for Cluster-Adjusted Robust Inference

Description

This internal function extracts essential information from a model object and associated data, specifically for use in cluster-adjusted robust inference functions like cluster_im_glmRob and cluster_im_lmRob. It handles variable extraction, clustering information, and filtering the dataset based on the model's usage.

Usage

info(formula = NULL, cluster, dat, robmod)

Arguments

formula

A formula used in the model fitting.

cluster

A formula or a character string indicating the clustering variable in dat.

dat

A data frame containing the data used in the model.

robmod

A robust model object from which to extract information.

Value

A list containing the following elements:

variables: Vector of all variable names used in the model.
clust.name: Name of the clustering variable.
dat: Filtered data set containing only the observations used in the model.
clust: Vector representing the cluster index for each observation in dat.
ind.variables.full: Names of all independent variables including dropped ones, if any, in the model.
ind.variables: Names of non-dropped independent variables used in the model.

Kenward-Roger Analysis Shiny Application

Description

A Shiny application that allows users to upload a dataset, modify variable types, and fit a mixed-effects model using the Kenward-Roger approximation for small sample inference.

Usage

kenward_roger()

Details

The application provides an interactive interface for setting up and running a mixed-effects model analysis with the Kenward-Roger method for estimating the degrees of freedom in linear mixed-effects models.

Value

A Shiny application object.

References

Kenward, M. G., & Roger, J. H. (1997). Small Sample Inference for Fixed Effects from Restricted Maximum Likelihood. Biometrics, 53, 983-997.

Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software, 82(13), 1-26. <doi:10.18637/jss.v082.i13>.

Examples

if (interactive()) {
  kenward_roger()
}

Handle Null Values for List Input

Description

Handle Null Values for List Input

Usage

list_null(par_input = "")

Arguments

par_input

A string input, default is "".

Value

NULL if input is NA or empty, otherwise a parsed list.

Set Up CATs Analysis in Shiny Application

Description

This function creates a Shiny application for performing CATs (Cluster-Adjusted t-statistics) analysis. It provides a user interface for uploading a CSV file, specifying the model and additional arguments, and running the analysis. The output includes variable selection, GLM (Generalized Linear Model) summary, and results of the CATs analysis.

Usage

mmiCATs()

Details

The application allows the user to upload a dataset, specify a GLM model and additional arguments, and run CATs analysis. The UI consists of various input elements like file upload, text input, numeric input, and action buttons. The server part handles the data processing, model fitting, and execution of the CATs analysis. The application outputs include the list of variables, GLM model summary, and the results from the CATs analysis.

Value

A Shiny app object which can be run to start the application.

References

Esarey J, Menger A. Practical and Effective Approaches to Dealing With Clustered Data. Political Science Research and Methods. 2019;7(3):541-559. doi:10.1017/psrm.2017.42

Examples

# To run the Shiny app
if(interactive()){
  mmiCATs()
}

Parse String Input to List

Description

Parse String Input to List

Usage

parse_list_input(input_string)

Arguments

input_string

A string representation of a list.

Value

A parsed list object.

Process Hand and Calculate Misspecification Distance in CloseCATs Game

Description

This function processes hands in the CloseCATs game and calculates the misspecification distance. It performs statistical computations based on the dealt cards and specified parameters, evaluating the performance of mixed effects models and cluster adjusted t-statistics models in the context of the game. The function considers various statistical parameters and model specifications to compute the results.

Usage

process_hand(
  x,
  process_col,
  beta_int = 0,
  beta_x1 = 0.25,
  beta_x2 = 1.5,
  mean_x1 = 0,
  sd_x1 = 1,
  mean_x2 = 0,
  sd_x2 = 4,
  N = 20,
  reps = 1,
  alpha = 0.05,
  n_time = 20,
  mean_i = 0,
  var_i = 1,
  mean_s = 0,
  mean_r = 0,
  var_r = 1,
  cor_pred = NULL,
  truncate = FALSE,
  var_s_factor = 1,
  cov_is_factor = 14.75
)

Arguments

x

A matrix representing the current hand in the game grid.

process_col

The column number (1 for computer, 2 for player) to process.

beta_int

Intercept for the mixed effects model.

beta_x1

Coefficient for the first predictor in the mixed effects model.

beta_x2

Coefficient for the second predictor in the mixed effects model.

mean_x1

Mean of the first predictor.

sd_x1

Standard deviation of the first predictor.

mean_x2

Mean of the second predictor.

sd_x2

Standard deviation of the second predictor.

N

Number of observations.

reps

Number of replications for the power analysis.

alpha

Significance level for the power analysis.

n_time

Number of time points.

mean_i

Mean of the random intercept.

var_i

Variance of the random intercept.

mean_s

Mean of the random slope.

mean_r

Mean of the residual.

var_r

Variance of the residual.

cor_pred

Correlation predictor, NULL if not specified.

truncate

Boolean to determine if truncation is applied in the model.

var_s_factor

Factor to adjust the variance of the random slope.

cov_is_factor

Factor to adjust the covariance between the random intercept and slope.

Value

A list containing two elements: 'mispec_dist', the misspecification distance, and 'results', a summary of model results and their statistical parameters.

Process Cluster-Adjusted Robust Inference Results

Description

This internal function processes results from cluster-specific model fittings for cluster-robust inference functions. It combines results, computes variance-covariance matrix, standard errors, t-statistics, p-values, and confidence intervals for each independent variable.

Usage

process_results(results, ind_variables, ci.level, drop, return.vcv)

Arguments

results

A list of results from cluster-specific model fittings.

ind_variables

A vector of independent variable names for which the results are computed.

ci.level

Confidence level for the confidence intervals.

drop

Logical; if TRUE, clusters with failed model fits are omitted from the results.

return.vcv

Logical; if TRUE, returns the variance-covariance matrix of the cluster-averaged coefficients.

Details

The workflow follows the one defined in the clusterSEs::cluster.im.glm() function.

Value

A list containing the following elements:

p.values: A matrix of p-values for each independent variable.
ci: A matrix with the lower and upper bounds of the confidence intervals for each independent variable.
vcv.hat: The variance-covariance matrix of the cluster-averaged coefficients, returned if return.vcv is TRUE.
beta.bar: The cluster-averaged coefficients, returned if return.vcv is TRUE.
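The aggregation step can be illustrated with base R alone. The sketch below fits the same model within each cluster, averages the cluster-specific coefficients, and bases the t-statistics on G - 1 degrees of freedom, where G is the number of clusters. It uses ordinary lm() instead of a robust fitter for brevity, and the object names are chosen for illustration; it is an approximation of the general approach, not the package's code.

```r
# Fit the same model separately within each cluster (plain lm() for brevity).
form <- Sepal.Length ~ Petal.Length + Petal.Width
fits <- lapply(split(iris, iris$Species), function(d) coef(lm(form, data = d)))
b_clust <- do.call(rbind, fits)      # G x k matrix of cluster-specific estimates

G <- nrow(b_clust)                   # number of clusters
beta_bar <- colMeans(b_clust)        # cluster-averaged coefficients
vcv_hat <- stats::cov(b_clust)       # covariance of the cluster-specific estimates
se <- sqrt(diag(vcv_hat)) / sqrt(G)  # standard error of each averaged coefficient

# t-statistics, two-sided p-values, and confidence intervals on G - 1 df.
t_stat <- beta_bar / se
p_values <- 2 * stats::pt(abs(t_stat), df = G - 1, lower.tail = FALSE)
ci_level <- 0.95
crit <- stats::qt(1 - (1 - ci_level) / 2, df = G - 1)
ci <- cbind(lower = beta_bar - crit * se, upper = beta_bar + crit * se)
```

With only three clusters, as here, the critical value from the t distribution is large, so the intervals are wide; this is the small-cluster caution the method is built around.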

Power Analysis for Clustered Data

Description

Conducts a power analysis for clustered data using simulation. This function allows for comparing the performance of different estimation methods in terms of power, rejection rate, root mean square error (RMSE), relative RMSE, coverage probability, and average confidence interval width.

Usage

pwr_func_lmer(
  betas = list(int = 0, x1 = -5, x2 = 2, x3 = 10),
  dists = list(x1 = stats::rnorm, x2 = stats::rbinom, x3 = stats::rnorm),
  distpar = list(x1 = list(mean = 0, sd = 1), x2 = list(size = 1, prob = 0.4), x3 =
    list(mean = 1, sd = 2)),
  N = 25,
  reps = 1000,
  alpha = 0.05,
  var_intr = "x1",
  grp = "ID",
  mod = paste0("out ~ x1 + x2 + x3 + (x3|", grp, ")"),
  catsmod = "out ~ x1 + x2 + x3",
  r_slope = "x1",
  r_int = "int",
  n_time = 20,
  mean_i = 0,
  var_i = 1,
  mean_s = 0,
  var_s = 1,
  cov_is = 0,
  mean_r = 0,
  var_r = 1,
  cor_mat = NULL,
  corvars = NULL,
  time_index = NULL
)

Arguments

betas

Named list of true coefficient values for the fixed effects.

dists

Named list of functions to generate random distributions for each predictor.

distpar

Named list of parameter lists for each distribution function in dists.

N

Integer specifying the number of groups.

reps

Integer specifying the number of replications for the simulation.

alpha

Numeric value specifying the significance level for hypothesis testing.

var_intr

Character string specifying the name of the variable of interest (for power calculations).

grp

Character string specifying the name of the grouping variable.

mod

Formula for fitting mixed-effects model to simulated data.

catsmod

Formula for the CATs model.

r_slope

Character string specifying the name of the random slope variable. This is the random slope when simulating data.

r_int

Character string specifying the name of the random intercept.

n_time

Integer or vector specifying the number of time points per group. If a vector, its length must equal N.

mean_i

Numeric value specifying the mean for the random intercept.

var_i

Numeric value specifying the variance for the random intercept.

mean_s

Numeric value specifying the mean for the random slope.

var_s

Numeric value specifying the variance for the random slope.

cov_is

Numeric value specifying the covariance between the random intercept and slope.

mean_r

Numeric value specifying the mean for the residual error.

var_r

Numeric value specifying the variance for the residual error.

cor_mat

Correlation matrix for correlated predictors, if any.

corvars

List of vectors, each vector containing names of correlated variables.

time_index

Either "linear" for a linear time trend, a custom function, or a character string that evaluates to a function for generating time values. If specified, 'time' should be included in betas but not in dists.

Value

A dataframe summarizing the results of the power analysis, including average coefficient estimate, rejection rate, root mean square error, relative root mean square error, coverage probability, and average confidence interval width for each method.

Examples

# Basic usage with default parameters
pwr_func_lmer(reps = 2)
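To make the roles of var_i, var_s, cov_is, and var_r concrete, the sketch below simulates one data set with a random intercept and a random slope on x1, using the default values shown in Usage. It is a rough illustration under the assumption that the random effects are drawn from a bivariate normal distribution; it is not the package's data-generating code, and all object names are chosen for illustration.

```r
set.seed(1)
N <- 25       # number of groups
n_time <- 20  # observations per group

# Random-effects covariance: var_i = 1, var_s = 1, cov_is = 0.
Sigma <- matrix(c(1, 0,
                  0, 1), nrow = 2)
re <- MASS::mvrnorm(N, mu = c(0, 0), Sigma = Sigma)  # column 1: intercept, column 2: slope

n_obs <- N * n_time
dat <- data.frame(
  ID = rep(seq_len(N), each = n_time),
  x1 = rnorm(n_obs, mean = 0, sd = 1),
  x2 = rbinom(n_obs, size = 1, prob = 0.4),
  x3 = rnorm(n_obs, mean = 1, sd = 2)
)

# Fixed effects int = 0, x1 = -5, x2 = 2, x3 = 10, plus group-level deviations
# and residual error with variance var_r = 1.
dat$out <- 0 + re[dat$ID, 1] +
  (-5 + re[dat$ID, 2]) * dat$x1 +
  2 * dat$x2 + 10 * dat$x3 +
  rnorm(n_obs, mean = 0, sd = sqrt(1))
```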

Render Card Grid in Shiny App

Description

This function takes a grid of card information, generates image tags for each card, and organizes them into a responsive grid layout for display in a Shiny application.

Usage

render_card_grid(new_card_grid)

Arguments

new_card_grid

A matrix or data frame where each row represents a card and each card has a property icard pointing to the image file relative to the mmiCATs package's www directory. The function expects this parameter to be structured with named columns where icard is one of the column names.

Value

A Shiny UI element (tagList) representing a grid of card images.

Handle Null Values for Sigma Matrix

Description

Handle Null Values for Sigma Matrix

Usage

sig_null(par_input = "")

Arguments

par_input

A string input, default is "".

Value

NULL if input is NA or empty, otherwise an evaluated expression.

Convert a Character String to a List

Description

This internal function converts a character string representing R arguments into a list of arguments. It is primarily used to facilitate the passing of additional arguments from the Shiny UI to internal functions within the app.

Usage

str2list(arg_str)

Arguments

arg_str

A character string representing R arguments.

Value

A list containing the arguments represented by arg_str. If arg_str is not a valid representation of R arguments, the function will throw an error.
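A common way to turn such a string into a list is to wrap it in a list() call and evaluate it. The sketch below shows that approach as an assumption about how a helper of this kind might work; str2list_sketch() is a hypothetical name, not the package's function.

```r
# Hypothetical sketch: parse "a = 1, b = TRUE" as the arguments of list().
# An invalid string makes parse() throw an error.
str2list_sketch <- function(arg_str) {
  eval(parse(text = paste0("list(", arg_str, ")")))
}

args <- str2list_sketch("ci.level = 0.9, drop = FALSE")
str(args)
```

Because the string is evaluated as R code, this pattern is only appropriate for input from a trusted user, such as someone running the app locally.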

Convert Text Input to Vector

Description

Convert Text Input to Vector

Usage

text_to_vector(text_input)

Arguments

text_input

A string input to be converted to a vector.

Value

A vector parsed from the input string.

Handle Null Values for Text to Vector Conversion

Description

Handle Null Values for Text to Vector Conversion

Usage

vec_null(par_input = "")

Arguments

par_input

A string input, default is "".

Value

NULL if input is NA or empty, otherwise a vector.
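The null-handling helpers share one pattern: return NULL for missing or empty text, otherwise parse it. A minimal sketch of that pattern for comma-separated input follows. The comma delimiter and the numeric-first conversion are assumptions made for illustration, and vec_null_sketch() is a hypothetical name.

```r
# Hypothetical sketch: NULL for NA or empty input, otherwise a vector.
vec_null_sketch <- function(par_input = "") {
  if (is.null(par_input) || is.na(par_input) || !nzchar(trimws(par_input))) {
    return(NULL)
  }
  parts <- trimws(strsplit(par_input, ",", fixed = TRUE)[[1]])
  nums <- suppressWarnings(as.numeric(parts))
  # Fall back to a character vector when any element is not numeric.
  if (anyNA(nums)) parts else nums
}

empty <- vec_null_sketch("")
nums <- vec_null_sketch("1, 2, 3")
chars <- vec_null_sketch("x1, x2")
```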