Title: | Cluster Adjusted t Statistic Applications |
Version: | 0.2.0 |
Description: | Simulation results detailed in Esarey and Menger (2019) <doi:10.1017/psrm.2017.42> demonstrate that cluster adjusted t statistics (CATs) are an effective method for correcting standard errors in scenarios with a small number of clusters. The 'mmiCATs' package offers a suite of tools for working with CATs. The mmiCATs() function initiates a 'shiny' web application, facilitating the analysis of data utilizing CATs, as implemented in the cluster.im.glm() function from the 'clusterSEs' package. Additionally, the pwr_func_lmer() function is designed to simplify the process of conducting simulations to compare mixed effects models with CATs models. For educational purposes, the CloseCATs() function launches a 'shiny' application card game, aimed at enhancing users' understanding of the conditions under which CATs should be preferred over random intercept models. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/mightymetrika/mmiCATs |
BugReports: | https://github.com/mightymetrika/mmiCATs/issues |
Imports: | broom, broom.mixed, clusterSEs, DT, lmerTest, MASS, mmcards, pool, robust, robustbase, RPostgres, shiny, shinythemes |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2024-08-26 00:38:48 UTC; Administrator |
Author: | Mackson Ncube [aut, cre], mightymetrika, LLC [cph, fnd] |
Maintainer: | Mackson Ncube <macksonncube.stats@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2024-08-26 04:20:01 UTC |
CloseCATs Shiny Application
Description
This function creates and runs a Shiny application for the CloseCATs game. The application provides a user interface for setting up the game, dealing cards, swapping cards, and scoring the game based on statistical computations. The game involves dealing cards to the player and the computer, allowing the player to swap the cards in their column, and scoring the game based on the misspecification distance calculated from the processed hands.
Usage
CloseCATs()
Details
The UI allows players to input various statistical parameters and preferences for the game setup. It also provides interactive elements for dealing cards, swapping cards within a column, and scoring the game based on the calculated misspecification distance.
The main components of the Shiny application include:
A sidebar for inputting game parameters and controls for dealing and scoring.
A main panel for displaying game cards, swap options, and results.
Reactive elements that update based on user interaction and game state.
Value
A Shiny app object which can be run to start the application.
Examples
# To run the CloseCATs Shiny application:
if(interactive()){
CloseCATs()
}
Launch KenRCATs Shiny Application
Description
This function launches a 'shiny' application for conducting power analysis simulations using CATs (Cluster-Adjusted t-statistics) and Kenward-Roger methods. The app allows users to input simulation parameters, run simulations, view results, and manage data in a PostgreSQL database.
Usage
KenRCATs(dbname, datatable, host, port, user, password)
Arguments
dbname |
Character string specifying the name of the PostgreSQL database. |
datatable |
Character string specifying the name of the table in the database. |
host |
Character string specifying the host name or IP address of the database server. |
port |
Integer specifying the port number on which the database is running. |
user |
Character string specifying the username for database connection. |
password |
Character string specifying the password for database connection. |
Details
The KenRCATs function sets up a Shiny application with the following features:
Input fields for various simulation parameters
Ability to run power analysis simulations
Display of simulation results
Option to submit results to a PostgreSQL database
Functionality to download data from the database
Display of relevant citations
Value
A 'shiny' app object.
Examples
if(interactive()){
KenRCATs(
dbname = "your_database_name",
datatable = "your_table_name",
host = "localhost",
port = 5432,
user = "your_username",
password = "your_password"
)
}
Append KenRCATs Results with Input Parameters
Description
Append KenRCATs Results with Input Parameters
Usage
append_KenRCATs(df, input)
Arguments
df |
A data frame containing simulation results. |
input |
A list of input parameters from the Shiny app. |
Value
A list containing two data frames: the original results and the results with appended input parameters.
Column Swap in CloseCATs Game Grid
Description
This internal function performs a column swap in the CloseCATs game grid. It is designed to allow the player to swap the cards in their column (column 2). This swap changes the contribution of the cards to the random slope variance and the covariance between the random slope and the random intercept. The function modifies the order of cards in the specified column by reversing their positions.
Usage
cc_swapper(cards_matrix, swap_in_col = NULL)
Arguments
cards_matrix |
A matrix representing the current state of the game grid. The matrix should have 2 rows and a number of columns equal to the number of players (including the computer). Each cell of the matrix contains a card. |
swap_in_col |
The column number where the swap should be performed. If this parameter is NULL or not equal to 2, no action is taken. The default value is NULL. Typically, this parameter should be set to 2 to perform a swap for the player's column. |
Value
The modified game grid matrix after performing the swap in the specified column.
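The swap rule described above can be sketched in base R. This is a minimal illustration, not the package's source: swap_column_sketch() is a hypothetical name, and plain character labels stand in for the card objects held in the real grid.

```r
# Hypothetical stand-in for cc_swapper(); character labels replace real cards.
swap_column_sketch <- function(cards_matrix, swap_in_col = NULL) {
  # Only the player's column (column 2) may be swapped; otherwise do nothing.
  if (is.null(swap_in_col) || swap_in_col != 2) {
    return(cards_matrix)
  }
  # Reverse the order of the cards within the chosen column.
  cards_matrix[, swap_in_col] <- rev(cards_matrix[, swap_in_col])
  cards_matrix
}

# Column 1 is the computer, column 2 is the player.
grid <- matrix(c("2C", "KH", "7D", "AS"), nrow = 2)
swapped <- swap_column_sketch(grid, swap_in_col = 2)
```

After the swap, the card that contributed to the random slope variance (row 1) contributes to the slope-intercept covariance (row 2), and vice versa.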
Cluster-Adjusted Confidence Intervals And p-Values Robust GLMs
Description
Performs cluster-adjusted inference on a robust generalized linear model object, using robust generalized linear regression within each cluster. This function is tailored for models where observations are clustered, and standard errors need adjustment for clustering. The function applies a robust generalized linear regression model to each cluster using the specified family and method, and then aggregates the results.
Usage
cluster_im_glmRob(
robmod,
dat,
cluster,
ci.level = 0.95,
drop = TRUE,
return.vcv = FALSE,
engine = "robust",
...
)
Arguments
robmod |
A robust generalized linear model object created using robust::glmRob() or robustbase::glmrob(). It must contain elements 'formula', 'family', and 'method'. |
dat |
A data frame containing the data used in the model. |
cluster |
A formula indicating the clustering variable in dat. |
ci.level |
Confidence level for the confidence intervals, default is 0.95. |
drop |
Logical; if TRUE, drops clusters where the model does not converge. |
return.vcv |
Logical; if TRUE, the variance-covariance matrix of the cluster-averaged coefficients will be returned. |
engine |
Set the engine to "robust" to use robust::glmRob() or "robustbase" to use robustbase::glmrob(). Default is "robust". |
... |
Additional arguments to be passed to the robust::glmRob() or the robustbase::glmrob() function. |
Value
An invisible list containing the following elements:
- p.values
A matrix of p-values for each independent variable.
- ci
A matrix with the lower and upper bounds of the confidence intervals for each independent variable.
- vcv.hat
The variance-covariance matrix of the cluster-averaged coefficients, returned if return.vcv is TRUE.
- beta.bar
The cluster-averaged coefficients, returned if return.vcv is TRUE.
Examples
iris_bin <- iris
# Create a binary indicator for Sepal.Length (1 if Sepal.Length > 5.8, otherwise 0)
iris_bin$high_Sepal.Length <- as.factor(ifelse(iris_bin$Sepal.Length > 5.8, 1, 0))
robout <- robustbase::glmrob(formula = high_Sepal.Length ~ Petal.Length + Petal.Width,
family = binomial,
data = iris_bin)
cluster_im_glmRob(robout, dat = iris_bin, ~Species, return.vcv = TRUE,
engine = "robustbase")
Cluster-Adjusted Confidence Intervals And p-Values Robust Linear Models
Description
Performs cluster-adjusted inference on a robust linear model object, using robust linear regression within each cluster. This function is designed to handle models where observations are clustered, and standard errors need to be adjusted to account for this clustering. The function applies a robust linear regression model to each cluster and then aggregates the results.
Usage
cluster_im_lmRob(
robmod,
formula,
dat,
cluster,
ci.level = 0.95,
drop = TRUE,
return.vcv = FALSE,
engine = "robust",
...
)
Arguments
robmod |
A robust linear model object created using robust::lmRob() or robustbase::lmrob(). |
formula |
A formula or a string that can be coerced to a formula. |
dat |
A data frame containing the data used in the model. |
cluster |
A formula indicating the clustering variable in dat. |
ci.level |
Confidence level for the confidence intervals, default is 0.95. |
drop |
Logical; if TRUE, drops clusters where the model does not converge. |
return.vcv |
Logical; if TRUE, the variance-covariance matrix of the cluster-averaged coefficients will be returned. |
engine |
Set the engine to "robust" to use robust::lmRob() or "robustbase" to use robustbase::lmrob(). Default is "robust". |
... |
Additional arguments to be passed to the robust::lmRob() or the robustbase::lmrob() function. |
Value
A list containing the following elements:
- p.values
A matrix of p-values for each independent variable.
- ci
A matrix with the lower and upper bounds of the confidence intervals for each independent variable.
- vcv.hat
The variance-covariance matrix of the cluster-averaged coefficients, returned if return.vcv is TRUE.
- beta.bar
The cluster-averaged coefficients, returned if return.vcv is TRUE.
Examples
form <- Sepal.Length ~ Petal.Length + Petal.Width
mod <- robust::lmRob(formula = form, data = iris)
cluster_im_lmRob(robmod = mod, formula = form, dat = iris, cluster = ~Species)
Deal Cards to CloseCATs Game Grid
Description
This internal function deals cards to a 2 x 2 grid for the CloseCATs game. The game involves a deck of cards, where cards are dealt to both the computer and the player. The matrix format of the game grid is such that column 1 is for the computer, and column 2 is for the player. The first row of cards contributes to the random slope variance, and the second row contributes to the covariance between the random slope and the random intercept.
Usage
deal_cards_to_cc_grid(
deck = mmcards::i_deck(deck = mmcards::shuffle_deck(), i_path = "www", i_names =
c("2_of_clubs", "2_of_diamonds", "2_of_hearts", "2_of_spades", "3_of_clubs",
"3_of_diamonds", "3_of_hearts", "3_of_spades", "4_of_clubs", "4_of_diamonds",
"4_of_hearts", "4_of_spades", "5_of_clubs", "5_of_diamonds", "5_of_hearts",
"5_of_spades", "6_of_clubs", "6_of_diamonds", "6_of_hearts", "6_of_spades",
"7_of_clubs", "7_of_diamonds", "7_of_hearts", "7_of_spades", "8_of_clubs",
"8_of_diamonds", "8_of_hearts", "8_of_spades",
"9_of_clubs", "9_of_diamonds",
"9_of_hearts", "9_of_spades", "10_of_clubs", "10_of_diamonds", "10_of_hearts",
"10_of_spades", "jack_of_clubs", "jack_of_diamonds", "jack_of_hearts",
"jack_of_spades", "queen_of_clubs", "queen_of_diamonds", "queen_of_hearts",
"queen_of_spades", "king_of_clubs", "king_of_diamonds", "king_of_hearts",
"king_of_spades", "ace_of_clubs", "ace_of_diamonds", "ace_of_hearts",
"ace_of_spades")),
n
)
Arguments
deck |
A dataframe representing a deck of cards. If not provided, a shuffled deck is generated using mmcards::shuffle_deck() and passed through mmcards::i_deck(), as shown in Usage. |
n |
The number of players. It defines the number of columns in the grid. Each player, including the computer, will be dealt two cards. |
Value
A matrix representing the game grid with dealt cards. Each cell of the matrix contains a card dealt to either the computer (column 1) or the player (column 2).
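The grid layout can be illustrated with a short base R sketch. It rests on assumptions: deal_grid_sketch() is a hypothetical name, a character vector stands in for the mmcards deck data frame, and the column-by-column dealing order is chosen for illustration only.

```r
# Hypothetical sketch; a character vector stands in for the mmcards deck.
deal_grid_sketch <- function(deck, n = 2) {
  stopifnot(length(deck) >= 2 * n)
  # Fill column by column, so each participant receives two consecutive cards.
  # Row 1 feeds the random slope variance, row 2 the slope-intercept covariance.
  matrix(deck[seq_len(2 * n)], nrow = 2, ncol = n)
}

set.seed(1)
ranks <- rep(c(2:10, "J", "Q", "K", "A"), times = 4)
suits <- rep(c("C", "D", "H", "S"), each = 13)
deck <- sample(paste0(ranks, suits))  # shuffled 52-card deck

grid <- deal_grid_sketch(deck, n = 2)  # column 1: computer, column 2: player
```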
Handle Model Fitting Failures and Variable Dropping
Description
This internal function is designed to handle failures in cluster-specific model fitting and variable dropping issues in the context of cluster-adjusted robust inference functions. It checks for model fitting failures and whether independent variables have been dropped in the model.
Usage
fail_drop(drop, fail, clust.mod, ind_variables)
Arguments
drop |
Logical; if TRUE, allows the function to return NA when model fitting fails instead of stopping with an error. |
fail |
Logical; indicates whether the model fitting process has failed. |
clust.mod |
A model object resulting from cluster-specific fitting. |
ind_variables |
A character vector of independent variable names expected in the model. |
Value
If fail is TRUE and drop is FALSE, the function stops with an error message.
If fail is TRUE and drop is TRUE, it returns NA.
If fail is FALSE, it returns the coefficients for the independent variables in clust.mod.
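The three outcomes listed above can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: fail_drop_sketch() is a hypothetical name, the error message is invented, an ordinary stats::lm() fit stands in for a cluster-specific robust fit, and the check for dropped variables mentioned in the Description is omitted.

```r
# Hypothetical stand-in for fail_drop(); mirrors only the documented return values.
fail_drop_sketch <- function(drop, fail, clust.mod, ind_variables) {
  if (fail && !drop) {
    # Assumed wording; the package's actual message may differ.
    stop("Model fitting failed in a cluster; set drop = TRUE to skip it.")
  }
  if (fail && drop) {
    return(NA)
  }
  # Successful fit: keep only the coefficients of the independent variables.
  stats::coef(clust.mod)[ind_variables]
}

fit <- stats::lm(Sepal.Length ~ Petal.Length + Petal.Width, data = iris)
ok <- fail_drop_sketch(drop = TRUE, fail = FALSE, clust.mod = fit,
                       ind_variables = c("Petal.Length", "Petal.Width"))
skipped <- fail_drop_sketch(drop = TRUE, fail = TRUE, clust.mod = NULL,
                            ind_variables = "Petal.Length")
```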
Fit a Linear Model with Robust Estimation
Description
This function fits a linear model using robust estimation methods. It allows the use of either the 'robust' or 'robustbase' packages for fitting the model.
Usage
fit_model(engine, formula, data, ...)
Arguments
engine |
A character string specifying the engine to be used for model fitting. Must be either "robust" or "robustbase". |
formula |
An object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted. |
data |
A dataframe containing the variables in the model. |
... |
Additional arguments to be passed to the underlying fitting function. |
Value
A fitted model.
Fit a Generalized Linear Model with Robust Estimation
Description
This function fits a generalized linear model using robust estimation methods. It allows the use of either the 'robust' or 'robustbase' packages for fitting the model.
Usage
fit_model_g(engine, formula, data, family, method, ...)
Arguments
engine |
A character string specifying the engine to be used for model fitting. Must be either "robust" or "robustbase". |
formula |
An object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted. |
data |
A dataframe containing the variables in the model. |
family |
A description of the error distribution and link function to be used in the model (See stats::family for details of family functions). |
method |
Fitting method to be used. Can vary depending on the selected engine. |
... |
Additional arguments to be passed to the underlying fitting function. |
Value
A fitted model.
Format Citation
Description
This internal function formats a citation object into a readable string. The function extracts relevant information such as the title, author, year, address, note, and URL from the citation object and formats it into a standardized citation format.
Usage
format_citation(cit)
Arguments
cit |
A citation object typically obtained from citation(). |
Value
A character string with the formatted citation.
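One way such formatting could work is sketched below. The function name format_citation_sketch() and the output layout are assumptions made for illustration; only the fields named in the Description (title, author, year, address, note, URL) are used, and the package's actual citation style may differ.

```r
# Hypothetical sketch; the layout of the resulting string is an assumption.
format_citation_sketch <- function(cit) {
  authors <- paste(format(cit$author), collapse = ", ")
  parts <- c(
    paste0(authors, " (", cit$year, ")."),
    paste0(cit$title, "."),
    cit$address,  # fields that are absent come back as NULL and are dropped by c()
    cit$note,
    cit$url
  )
  paste(parts[nzchar(parts)], collapse = " ")
}

# utils::citation() with no arguments returns the citation for R itself.
out <- format_citation_sketch(utils::citation())
```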
Generate UI Parameters for KenRCATs Shiny App
Description
This internal function creates a list of UI elements for the KenRCATs() 'shiny' application. It generates input fields for various parameters used in the power analysis simulation.
Usage
getUIParams()
Value
A shiny::tagList containing UI elements.
Note
This function is intended for internal use within the KenRCATs() 'shiny' application.
Extract Model Information for Cluster-Adjusted Robust Inference
Description
This internal function extracts essential information from a model object and associated data, specifically for use in cluster-adjusted robust inference functions like cluster_im_glmRob and cluster_im_lmRob. It handles variable extraction, clustering information, and filtering the dataset based on the model's usage.
Usage
info(formula = NULL, cluster, dat, robmod)
Arguments
formula |
A formula used in the model fitting. |
cluster |
A formula or a character string indicating the clustering variable in dat. |
dat |
A data frame containing the data used in the model. |
robmod |
A robust model object from which to extract information. |
Value
A list containing the following elements:
- variables
Vector of all variable names used in the model.
- clust.name
Name of the clustering variable.
- dat
Filtered data set containing only the observations used in the model.
- clust
Vector representing the cluster index for each observation in dat.
- ind.variables.full
Names of all independent variables including dropped ones, if any, in the model.
- ind.variables
Names of non-dropped independent variables used in the model.
Kenward-Roger Analysis Shiny Application
Description
A Shiny application that allows users to upload a dataset, modify variable types, and fit a mixed-effects model using the Kenward-Roger approximation for small sample inference.
Usage
kenward_roger()
Details
The application provides an interactive interface for setting up and running a mixed-effects model analysis with the Kenward-Roger method for estimating the degrees of freedom in linear mixed-effects models.
Value
A Shiny application object.
References
Kenward, M. G., & Roger, J. H. (1997). Small Sample Inference for Fixed Effects from Restricted Maximum Likelihood. Biometrics, 53, 983-997.
Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software, 82(13), 1-26. <doi:10.18637/jss.v082.i13>.
Examples
if (interactive()) {
kenward_roger()
}
Handle Null Values for List Input
Description
Handle Null Values for List Input
Usage
list_null(par_input = "")
Arguments
par_input |
A string input, default is "". |
Value
NULL if input is NA or empty, otherwise a parsed list.
Set Up CATs Analysis in Shiny Application
Description
This function creates a Shiny application for performing CATs (Cluster-Adjusted t-statistics) analysis. It provides a user interface for uploading a CSV file, specifying the model and additional arguments, and running the analysis. The output includes variable selection, GLM (Generalized Linear Model) summary, and results of the CATs analysis.
Usage
mmiCATs()
Details
The application allows the user to upload a dataset, specify a GLM model and additional arguments, and run CATs analysis. The UI consists of various input elements like file upload, text input, numeric input, and action buttons. The server part handles the data processing, model fitting, and execution of the CATs analysis. The application outputs include the list of variables, GLM model summary, and the results from the CATs analysis.
Value
A Shiny app object which can be run to start the application.
References
Esarey J, Menger A. Practical and Effective Approaches to Dealing With Clustered Data. Political Science Research and Methods. 2019;7(3):541-559. doi:10.1017/psrm.2017.42
Examples
# To run the Shiny app
if(interactive()){
mmiCATs()
}
Parse String Input to List
Description
Parse String Input to List
Usage
parse_list_input(input_string)
Arguments
input_string |
A string representation of a list. |
Value
A parsed list object.
Process Hand and Calculate Misspecification Distance in CloseCATs Game
Description
This function processes hands in the CloseCATs game and calculates the misspecification distance. It performs statistical computations based on the dealt cards and specified parameters, evaluating the performance of mixed effects models and cluster-adjusted t-statistics models in the context of the game. The function considers various statistical parameters and model specifications to compute the results.
Usage
process_hand(
x,
process_col,
beta_int = 0,
beta_x1 = 0.25,
beta_x2 = 1.5,
mean_x1 = 0,
sd_x1 = 1,
mean_x2 = 0,
sd_x2 = 4,
N = 20,
reps = 1,
alpha = 0.05,
n_time = 20,
mean_i = 0,
var_i = 1,
mean_s = 0,
mean_r = 0,
var_r = 1,
cor_pred = NULL,
truncate = FALSE,
var_s_factor = 1,
cov_is_factor = 14.75
)
Arguments
x |
A matrix representing the current hand in the game grid. |
process_col |
The column number (1 for computer, 2 for player) to process. |
beta_int |
Intercept for the mixed effects model. |
beta_x1 |
Coefficient for the first predictor in the mixed effects model. |
beta_x2 |
Coefficient for the second predictor in the mixed effects model. |
mean_x1 |
Mean of the first predictor. |
sd_x1 |
Standard deviation of the first predictor. |
mean_x2 |
Mean of the second predictor. |
sd_x2 |
Standard deviation of the second predictor. |
N |
Number of observations. |
reps |
Number of replications for the power analysis. |
alpha |
Significance level for the power analysis. |
n_time |
Number of time points. |
mean_i |
Mean of the random intercept. |
var_i |
Variance of the random intercept. |
mean_s |
Mean of the random slope. |
mean_r |
Mean of the residual. |
var_r |
Variance of the residual. |
cor_pred |
Correlation predictor, NULL if not specified. |
truncate |
Boolean to determine if truncation is applied in the model. |
var_s_factor |
Factor to adjust the variance of the random slope. |
cov_is_factor |
Factor to adjust the covariance between the random intercept and slope. |
Value
A list containing two elements: 'mispec_dist', the misspecification distance, and 'results', a summary of model results and their statistical parameters.
Process Cluster-Adjusted Robust Inference Results
Description
This internal function processes results from cluster-specific model fittings for cluster-robust inference functions. It combines results, computes variance-covariance matrix, standard errors, t-statistics, p-values, and confidence intervals for each independent variable.
Usage
process_results(results, ind_variables, ci.level, drop, return.vcv)
Arguments
results |
A list of results from cluster-specific model fittings. |
ind_variables |
A vector of independent variable names for which the results are computed. |
ci.level |
Confidence level for the confidence intervals. |
drop |
Logical; if TRUE, clusters with failed model fits are omitted from the results. |
return.vcv |
Logical; if TRUE, returns the variance-covariance matrix of the cluster-averaged coefficients. |
Details
Workflow defined in clusterSEs::cluster.im.glm() function.
Value
A list containing the following elements:
- p.values
A matrix of p-values for each independent variable.
- ci
A matrix with the lower and upper bounds of the confidence intervals for each independent variable.
- vcv.hat
The variance-covariance matrix of the cluster-averaged coefficients, returned if return.vcv is TRUE.
- beta.bar
The cluster-averaged coefficients, returned if return.vcv is TRUE.
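The aggregation step can be illustrated with a base R sketch of the general cluster-adjusted t-statistic idea: fit the model separately in each cluster, average the per-cluster coefficients, and test the average with a t distribution on G - 1 degrees of freedom, where G is the number of clusters. Several things here are assumptions: cats_sketch() is a hypothetical name, ordinary stats::lm() replaces the robust fits, failed fits are not handled, and vcv.hat is taken to be the between-cluster covariance of the per-cluster estimates.

```r
# Hypothetical sketch of the CATs aggregation; not the package's implementation.
cats_sketch <- function(formula, dat, cluster, ci.level = 0.95) {
  # Fit the model within each cluster and keep the slope estimates.
  pieces <- split(dat, dat[[cluster]], drop = TRUE)
  b_clust <- do.call(rbind, lapply(pieces, function(d) {
    stats::coef(stats::lm(formula, data = d))[-1]
  }))
  G <- nrow(b_clust)                 # number of clusters
  beta_bar <- colMeans(b_clust)      # cluster-averaged coefficients
  vcv_hat <- stats::cov(b_clust)     # between-cluster covariance (assumed definition)
  se <- sqrt(diag(vcv_hat)) / sqrt(G)
  t_stat <- beta_bar / se
  p_values <- 2 * stats::pt(abs(t_stat), df = G - 1, lower.tail = FALSE)
  crit <- stats::qt(1 - (1 - ci.level) / 2, df = G - 1)
  list(
    p.values = cbind(p.value = p_values),
    ci = cbind(lower = beta_bar - crit * se, upper = beta_bar + crit * se),
    vcv.hat = vcv_hat,
    beta.bar = beta_bar
  )
}

res <- cats_sketch(Sepal.Length ~ Petal.Length + Petal.Width, iris, "Species")
```

With only three clusters, as in iris, the reference distribution has two degrees of freedom, so the intervals are wide. That reflects the uncertainty of having few clusters.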
Power Analysis for Clustered Data
Description
Conducts a power analysis for clustered data using simulation. This function allows for comparing the performance of different estimation methods in terms of power, rejection rate, root mean square error (RMSE), relative RMSE, coverage probability, and average confidence interval width.
Usage
pwr_func_lmer(
betas = list(int = 0, x1 = -5, x2 = 2, x3 = 10),
dists = list(x1 = stats::rnorm, x2 = stats::rbinom, x3 = stats::rnorm),
distpar = list(x1 = list(mean = 0, sd = 1), x2 = list(size = 1, prob = 0.4), x3 =
list(mean = 1, sd = 2)),
N = 25,
reps = 1000,
alpha = 0.05,
var_intr = "x1",
grp = "ID",
mod = paste0("out ~ x1 + x2 + x3 + (x3|", grp, ")"),
catsmod = "out ~ x1 + x2 + x3",
r_slope = "x1",
r_int = "int",
n_time = 20,
mean_i = 0,
var_i = 1,
mean_s = 0,
var_s = 1,
cov_is = 0,
mean_r = 0,
var_r = 1,
cor_mat = NULL,
corvars = NULL,
time_index = NULL
)
Arguments
betas |
Named list of true coefficient values for the fixed effects. |
dists |
Named list of functions to generate random distributions for each predictor. |
distpar |
Named list of parameter lists for each distribution function in dists. |
N |
Integer specifying the number of groups. |
reps |
Integer specifying the number of replications for the simulation. |
alpha |
Numeric value specifying the significance level for hypothesis testing. |
var_intr |
Character string specifying the name of the variable of interest (for power calculations). |
grp |
Character string specifying the name of the grouping variable. |
mod |
Formula for fitting mixed-effects model to simulated data. |
catsmod |
Formula for the CATs model. |
r_slope |
Character string specifying the name of the random slope variable. This is the random slope when simulating data. |
r_int |
Character string specifying the name of the random intercept. |
n_time |
Integer or vector specifying the number of time points per group. If a vector, its length must equal N. |
mean_i |
Numeric value specifying the mean for the random intercept. |
var_i |
Numeric value specifying the variance for the random intercept. |
mean_s |
Numeric value specifying the mean for the random slope. |
var_s |
Numeric value specifying the variance for the random slope. |
cov_is |
Numeric value specifying the covariance between the random intercept and slope. |
mean_r |
Numeric value specifying the mean for the residual error. |
var_r |
Numeric value specifying the variance for the residual error. |
cor_mat |
Correlation matrix for correlated predictors, if any. |
corvars |
List of vectors, each vector containing names of correlated variables. |
time_index |
Either "linear" for a linear time trend, a custom function,
or a character string that evaluates to a function for
generating time values. If specified, 'time' should be
included in |
Value
A dataframe summarizing the results of the power analysis, including average coefficient estimate, rejection rate, root mean square error, relative root mean square error, coverage probability, and average confidence interval width for each method.
Examples
# Basic usage with default parameters
pwr_func_lmer(reps = 2)
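To make the roles of var_i, var_s, cov_is, and var_r concrete, the sketch below simulates one clustered data set with a random intercept and a random slope on x1 using only base R. It is an assumption-laden illustration, not the data-generating code inside pwr_func_lmer(): simulate_clustered_sketch() is a hypothetical name, only one predictor is generated, and the means of the random effects and residual are fixed at zero.

```r
# Hypothetical sketch of a random intercept / random slope data-generating process.
simulate_clustered_sketch <- function(N = 25, n_time = 20, beta_int = 0, beta_x1 = -5,
                                      var_i = 1, var_s = 1, cov_is = 0, var_r = 1) {
  # Covariance matrix of the (intercept, slope) random effects.
  Sigma <- matrix(c(var_i, cov_is, cov_is, var_s), nrow = 2)
  # Bivariate normal draws via the Cholesky factor: Z %*% chol(Sigma) has covariance Sigma.
  re <- matrix(stats::rnorm(2 * N), ncol = 2) %*% chol(Sigma)
  ID <- rep(seq_len(N), each = n_time)
  x1 <- stats::rnorm(N * n_time)
  out <- (beta_int + re[ID, 1]) + (beta_x1 + re[ID, 2]) * x1 +
    stats::rnorm(N * n_time, sd = sqrt(var_r))
  data.frame(ID = factor(ID), x1 = x1, out = out)
}

set.seed(42)
sim <- simulate_clustered_sketch()
```

A power simulation would repeat this step reps times, fit both the mixed-effects model and the CATs model to each data set, and record how often the coefficient named in var_intr is rejected at level alpha.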
Render Card Grid in Shiny App
Description
This function takes a grid of card information, generates image tags for each card, and organizes them into a responsive grid layout for display in a Shiny application.
Usage
render_card_grid(new_card_grid)
Arguments
new_card_grid |
A matrix or data frame where each row represents a card and
each card has a property |
Value
A Shiny UI element (tagList) representing a grid of card images.
Handle Null Values for Sigma Matrix
Description
Handle Null Values for Sigma Matrix
Usage
sig_null(par_input = "")
Arguments
par_input |
A string input, default is "". |
Value
NULL if input is NA or empty, otherwise an evaluated expression.
Convert a Character String to a List
Description
This internal function converts a character string representing R arguments into a list of arguments. It is primarily used to facilitate the passing of additional arguments from the Shiny UI to internal functions within the app.
Usage
str2list(arg_str)
Arguments
arg_str |
A character string representing R arguments. |
Value
A list containing the arguments represented by arg_str. If arg_str is not a valid representation of R arguments, the function will throw an error.
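One common way to implement this kind of conversion is to wrap the text in list(...) and evaluate it. The sketch below assumes that approach; str2list_sketch() is a hypothetical name and the package's internals may differ.

```r
# Hypothetical sketch: parse "name = value, ..." text into a named list.
str2list_sketch <- function(arg_str) {
  # Invalid R syntax makes parse() raise an error, matching the documented behavior.
  eval(parse(text = paste0("list(", arg_str, ")")))
}

args <- str2list_sketch("maxit = 100, method = 'MM'")
```

Because the text is evaluated as R code, this pattern is only appropriate for trusted input, such as a user running the app locally.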
Convert Text Input to Vector
Description
Convert Text Input to Vector
Usage
text_to_vector(text_input)
Arguments
text_input |
A string input to be converted to a vector. |
Value
A vector parsed from the input string.
Handle Null Values for Text to Vector Conversion
Description
Handle Null Values for Text to Vector Conversion
Usage
vec_null(par_input = "")
Arguments
par_input |
A string input, default is "". |
Value
NULL if input is NA or empty, otherwise a vector.
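The null-handling pattern shared by list_null(), sig_null(), and vec_null() can be sketched as below. Both function names are hypothetical, and the comma-separated parsing rule is an assumption chosen for illustration; the package's text_to_vector() may parse its input differently.

```r
# Hypothetical sketch: split comma-separated text into a vector.
text_to_vector_sketch <- function(text_input) {
  parts <- trimws(strsplit(text_input, ",", fixed = TRUE)[[1]])
  nums <- suppressWarnings(as.numeric(parts))
  # Return numbers when every piece is numeric, otherwise keep the strings.
  if (anyNA(nums)) parts else nums
}

# Hypothetical sketch: return NULL for empty or missing input, else a vector.
vec_null_sketch <- function(par_input = "") {
  if (is.null(par_input) || is.na(par_input) || !nzchar(trimws(par_input))) {
    return(NULL)
  }
  text_to_vector_sketch(par_input)
}

empty <- vec_null_sketch("")
values <- vec_null_sketch("1, 2, 3")
```

Returning NULL for blank fields lets the Shiny server pass optional inputs straight through to functions whose arguments default to NULL.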