Type: | Package |
Title: | Summary Tables and Plots for Statistical Models and Data: Beautiful, Customizable, and Publication-Ready |
Description: | Create beautiful and customizable tables to summarize several statistical models side-by-side. Draw coefficient plots, multi-level cross-tabs, dataset summaries, balance tables (a.k.a. "Table 1s"), and correlation matrices. This package supports dozens of statistical models, and it can produce tables in HTML, LaTeX, Word, Markdown, PDF, PowerPoint, Excel, RTF, JPG, or PNG. Tables can easily be embedded in 'Rmarkdown' or 'knitr' dynamic documents. Details can be found in Arel-Bundock (2022) <doi:10.18637/jss.v103.i01>. |
Version: | 2.4.0 |
URL: | https://modelsummary.com |
BugReports: | https://github.com/vincentarelbundock/modelsummary/issues/ |
Depends: | R (≥ 4.0.0) |
Imports: | checkmate (≥ 2.3.1), data.table (≥ 1.16.4), generics, glue, insight (≥ 1.3.0), methods, parameters (≥ 0.26.0), performance (≥ 0.14.0), tables (≥ 0.9.31), tinytable (≥ 0.9.0) |
Suggests: | AER, altdoc, Amelia, betareg, bookdown, brms, broom, broom.mixed, car, clubSandwich, correlation, covr, did, digest, DT, estimatr, fixest, flextable, future, future.apply, gamlss, ggdist, ggplot2, gh, gt (≥ 0.8.0), gtExtras, haven, huxtable, labelled, IRdisplay, ivreg, kableExtra, knitr, lavaan, lfe, lme4, lmtest, magick, magrittr, marginaleffects, MASS, mgcv, mice, nlme, nnet, officer, openxlsx, pandoc, parallel, pscl, psych, randomizr, remotes, rmarkdown, rstanarm, rsvg, sandwich, spelling, survey, survival, tibble, tictoc, tidyselect, tidyverse, tinysnapshot (≥ 0.1.0), tinytest, tinytex, webshot2, wesanderson |
License: | GPL-3 |
Encoding: | UTF-8 |
Config/testthat/edition: | 3 |
Language: | en-US |
RoxygenNote: | 7.3.2 |
Collate: | 'bind_est_gof.R' 'coef_rename.R' 'config_modelsummary.R' 'convenience.R' 'datasummary.R' 'datasummary_balance.R' 'datasummary_correlation.R' 'datasummary_crosstab.R' 'datasummary_df.R' 'datasummary_extract.R' 'datasummary_functions.R' 'datasummary_skim.R' 'dvnames.R' 'escape.R' 'factory.R' 'factory_DT.R' 'factory_dataframe.R' 'factory_flextable.R' 'factory_gt.R' 'factory_huxtable.R' 'factory_kableExtra.R' 'factory_markdown.R' 'factory_tinytable.R' 'factory_typst.R' 'fmt_factory.R' 'format_estimates.R' 'format_gof.R' 'format_msg.R' 'get_estimates.R' 'get_gof.R' 'get_vcov.R' 'glance_custom.R' 'gof_map.R' 'hush.R' 'map_estimates.R' 'map_gof.R' 'methods_did.R' 'methods_estimatr.R' 'methods_fixest.R' 'methods_lfe.R' 'methods_stats.R' 'modelplot.R' 'modelsummary.R' 'modelsummary_cbind.R' 'modelsummary_list.R' 'modelsummary_rbind.R' 'poorman.R' 'reexport.R' 'rename_statistics.R' 'sanitize_conf_level.R' 'sanitize_fmt.R' 'sanitize_gof_map.R' 'sanitize_models.R' 'sanitize_output.R' 'sanitize_shape.R' 'sanitize_statistic.R' 'sanitize_vcov.R' 'sanity_checks.R' 'settings.R' 'shape_estimates.R' 'span.R' 'stars.R' 'supported_models.R' 'themes.R' 'tidy_custom.R' 'update_modelsummary.R' 'utils_labels.R' 'utils_pad.R' 'utils_print.R' 'utils_replace.R' 'utils_stats.R' 'utils_warn.R' 'zzz.R' |
NeedsCompilation: | no |
Packaged: | 2025-06-08 18:45:41 UTC; vincent |
Author: | Vincent Arel-Bundock |
Maintainer: | Vincent Arel-Bundock <vincent.arel-bundock@umontreal.ca> |
Repository: | CRAN |
Date/Publication: | 2025-06-08 19:10:02 UTC |
Include all columns of a dataframe.
Description
Include all columns of a dataframe.
Usage
All(
df,
numeric = TRUE,
character = FALSE,
logical = FALSE,
factor = FALSE,
complex = FALSE,
raw = FALSE,
other = FALSE,
texify = getOption("tables.texify", FALSE)
)
Display all observations in a table.
Description
Display all observations in a table.
Usage
AllObs(data = NULL, show = FALSE, label = "Obsn.", within = NULL)
'Arguments' pseudo-function
Description
'Arguments' pseudo-function
Usage
Arguments(...)
'DropEmpty' pseudo-function
Description
'DropEmpty' pseudo-function
Usage
DropEmpty(empty = "", which = c("row", "col", "cell"))
Use a variable as a factor to give rows in a table.
Description
Use a variable as a factor to give rows in a table.
Usage
Factor(
x,
name = deparse(expr),
levelnames = levels(x),
texify = getOption("tables.texify", FALSE),
expr = substitute(x),
override = TRUE
)
Use a variable as a factor to give rows in a table.
Description
Use a variable as a factor to give rows in a table.
Usage
Format(...)
'Heading' pseudo-function
Description
'Heading' pseudo-function
Usage
Heading(name = NULL, override = TRUE, character.only = FALSE, nearData = TRUE)
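As an illustration, here is a minimal sketch of how Heading() might be used inside a datasummary() formula to replace a variable name with a custom row label. The label strings and the use of the mtcars data are assumptions made for this example; output = "data.frame" is chosen only so that the result can be inspected as a plain data frame.

```r
library(modelsummary)

# Heading() relabels the term that follows it in the formula.
# The label strings below are illustrative, not part of the package.
tab <- datasummary(
  Heading("Miles per gallon") * mpg + Heading("Horsepower") * hp ~ Mean + SD,
  data = mtcars,
  output = "data.frame"
)
tab
```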
datasummary statistic shortcut
Description
This function uses Unicode characters to create a histogram. This can
sometimes be useful, but is generally discouraged. Unicode characters can
only display a limited number of bar heights, and the accuracy of the
output is highly dependent on the platform (typeface, output type, Windows
vs. Mac, etc.). We recommend that you use the tinytable::plot_tt() function
instead.
Usage
Histogram(x, bins = 10)
Arguments
x |
variable to summarize |
bins |
number of histogram bars |
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
Max(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
Mean(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
Median(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
Min(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
Use a variable as a factor to give rows in a table.
Description
Use a variable as a factor to give rows in a table.
Usage
Multicolumn(
x,
name = deparse(expr),
levelnames = levels(x),
width = 2,
first = 1,
justify = "l",
texify = getOption("tables.texify", FALSE),
expr = substitute(x),
override = TRUE
)
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
N(x)
Arguments
x |
variable to summarize |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(Factor(cyl) ~ N, data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
NPercent(x, y)
Arguments
x |
variable to summarize |
y |
denominator variable |
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
NUnique(x, ...)
Arguments
x |
variable to summarize |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(cyl + hp ~ NUnique, data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
Ncol(x, ...)
Arguments
x |
variable to summarize |
... |
unused |
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
P0(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
P100(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
P25(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
P50(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
P75(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
Generate terms to paste values together in table.
Description
Generate terms to paste values together in table.
Usage
Paste(
...,
head,
digits = 2,
justify = "c",
prefix = "",
sep = "",
postfix = "",
character.only = FALSE
)
Pseudo-function to compute a statistic relative to a reference set.
Description
Pseudo-function to compute a statistic relative to a reference set.
Usage
Percent(denom = "all", fn = percent)
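The following sketch shows one way Percent() might be combined with the N shortcut in a datasummary() formula. It assumes the mtcars data and the default denom = "all", so that percentages are computed relative to the full sample; output = "data.frame" is used only to make the result easy to inspect.

```r
library(modelsummary)

# Counts and percentages of observations by number of cylinders.
# Percent() is called with its default arguments (denom = "all").
tab <- datasummary(
  Factor(cyl) ~ N + Percent(),
  data = mtcars,
  output = "data.frame"
)
tab
```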
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
PercentMissing(x)
Arguments
x |
variable to summarize |
Generate 'x +/- y' terms in table.
Description
Generate 'x +/- y' terms in table.
Usage
PlusMinus(x, y, head, xhead, yhead, digits = 2, character.only = FALSE, ...)
Use a variable as a factor to give rows in a table.
Description
Use a variable as a factor to give rows in a table.
Usage
RowFactor(
x,
name = deparse(expr),
levelnames = levels(x),
spacing = 3,
space = 1,
suppressfirst = TRUE,
nopagebreak = "\\nopagebreak ",
texify = getOption("tables.texify", FALSE),
expr = substitute(x),
override = TRUE
)
Display all observations in a table.
Description
Display all observations in a table.
Usage
RowNum(within = NULL, perrow = 5, show = FALSE, label = "Row", data = NULL)
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
SD(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
datasummary statistic shortcut
Description
datasummary statistic shortcut
Usage
Var(x, fmt = NULL, na.rm = TRUE, ...)
Arguments
x |
variable to summarize |
fmt |
passed to the |
na.rm |
a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds. |
... |
unused |
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
datasummary(mpg + hp ~ Mean + Median + P0 + P25 + P50 + P75 + P100 +
Min + Max + SD + Var,
data = mtcars)
}
Rename model terms
Description
A convenience function which can be passed to the coef_rename argument of
the modelsummary function.
Usage
coef_rename(
x,
factor = TRUE,
factor_name = TRUE,
poly = TRUE,
backticks = TRUE,
titlecase = TRUE,
underscore = TRUE,
asis = TRUE
)
Arguments
x |
character vector of term names to transform |
factor |
boolean remove the "factor()" label |
factor_name |
boolean remove the "factor()" label and the name of the variable |
poly |
boolean remove the "poly()" label and function arguments |
backticks |
boolean remove backticks |
titlecase |
boolean convert to title case |
underscore |
boolean replace underscores by spaces |
asis |
boolean remove the |
Examples
library(modelsummary)
dat <- mtcars
dat$horse_power <- dat$hp
mod <- lm(mpg ~ horse_power + factor(cyl), dat)
modelsummary(mod, coef_rename = coef_rename)
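Because coef_rename() takes a character vector of term names, it can also be called directly to preview the labels it would produce. The sketch below uses hypothetical term names chosen for illustration; the exact strings returned depend on the installed version, so no expected output is shown.

```r
library(modelsummary)

# Hypothetical term names, similar to what lm() might produce
terms <- c("horse_power", "factor(cyl)6", "`weight (kg)`")

# Default settings: every clean-up step is applied
renamed <- coef_rename(terms)
renamed

# Keep the variable name of factor() terms and skip title case
coef_rename(terms, factor_name = FALSE, titlecase = FALSE)
```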
Retrieve or modify the row or column labels.
Description
Retrieve or modify the row or column labels.
Usage
colLabels(x)
Persistent user settings for the modelsummary package
Description
Persistent user settings for the modelsummary package
Usage
config_modelsummary(
factory_default,
factory_latex,
factory_html,
factory_markdown,
startup_message,
reset = FALSE
)
Arguments
factory_default |
Default output format: "tinytable", "kableExtra", "gt", "flextable", "huxtable", "DT", or "markdown" |
factory_latex |
Name of package used to generate LaTeX output when |
factory_html |
Name of package used to generate HTML output when |
factory_markdown |
Name of package used to generate Markdown output when |
startup_message |
TRUE or FALSE to show warnings at startup |
reset |
TRUE to return to default settings. |
Summary tables using 2-sided formulae: crosstabs, frequencies, table 1s and more.
Description
datasummary can use any summary function which produces one numeric or
character value per variable. The examples section of this documentation
shows how to define custom summary functions.
modelsummary also supplies several shortcut summary functions which can be used in datasummary()
formulas: Min, Max, Mean, Median, Var, SD, NPercent, NUnique, Ncol, P0, P25, P50, P75, P100.
See the Details and Examples sections below, and the vignettes on the modelsummary website:
https://modelsummary.com/
https://modelsummary.com/vignettes/datasummary.html
Usage
datasummary(
formula,
data,
output = getOption("modelsummary_output", default = "default"),
fmt = 2,
title = getOption("modelsummary_title", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
align = getOption("modelsummary_align", default = NULL),
add_columns = getOption("modelsummary_add_columns", default = NULL),
add_rows = getOption("modelsummary_add_rows", default = NULL),
sparse_header = getOption("modelsummary_sparse_header", default = TRUE),
escape = getOption("modelsummary_escape", default = TRUE),
...
)
Arguments
formula |
A two-sided formula to describe the table: rows ~ columns. See the Examples section for a mini-tutorial and the Details section for more resources. Grouping/nesting variables can appear on both sides of the formula, but all summary functions must be on one side. |
data |
A data.frame (or tibble) |
output |
filename or object type (character string)
|
fmt |
how to format numeric values: integer, user-supplied function, or
|
title |
string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as |
notes |
list or vector of notes to append to the bottom of the table. |
align |
A string with a number of characters equal to the number of columns in
the table (e.g.,
|
add_columns |
a data.frame (or tibble) with the same number of rows as your main table. |
add_rows |
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. Positions can be defined using integers. In the
|
sparse_header |
TRUE or FALSE. TRUE eliminates column headers which
have a unique label across all columns, except for the row immediately above
the data. FALSE keeps all headers. The order in which terms are entered in
the formula determines the order in which headers appear. For example,
|
escape |
boolean TRUE escapes or substitutes LaTeX/HTML characters which could
prevent the file from compiling/displaying. |
... |
all other arguments are passed through to the table-making
functions tinytable::tt, kableExtra::kbl, gt::gt, DT::datatable, etc. depending on the |
Details
Visit the 'modelsummary' website for more usage examples: https://modelsummary.com
The 'datasummary' function is a thin wrapper around the 'tabular' function from the 'tables' package. More details about table-making formulas can be found in the 'tables' package documentation: ?tables::tabular
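To make the relationship with the 'tables' package concrete, the sketch below passes the same formula to tables::tabular() and to datasummary(). The formula and the use of mtcars are assumptions made for this example; output = "data.frame" is used only so that the datasummary() result is a plain data frame.

```r
library(modelsummary)

# One formula, two entry points
f <- mpg + hp ~ mean + sd

# Direct call to the underlying function from the 'tables' package
tab_tables <- tables::tabular(f, data = mtcars)

# Equivalent call through the datasummary() wrapper
tab_ms <- datasummary(f, data = mtcars, output = "data.frame")

tab_tables
tab_ms
```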
Hierarchical or "nested" column labels are only available for these output formats: tinytable, kableExtra, gt, html, rtf, and LaTeX. When saving tables to other formats, nested labels will be combined to a "flat" header.
Version 2.0.0, kableExtra, and tinytable
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend.
Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
Global Options
The behavior of modelsummary can be modified by setting global options. In particular, most of the arguments for most of the package's functions can be set using global options. For example:
- options(modelsummary_output = "modelsummary_list")
- options(modelsummary_statistic = '({conf.low}, {conf.high})')
- options(modelsummary_stars = TRUE)
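The Usage sections write argument defaults as getOption("modelsummary_<argument>", default = ...), which is why a global option takes effect whenever the argument is not passed explicitly. The base R sketch below illustrates that lookup and how to restore the previous settings afterwards; the variable names old, stars_now, and output_now are introduced here for illustration only.

```r
# Set two options and keep the previous values so they can be restored
old <- options(
  modelsummary_stars = TRUE,
  modelsummary_output = "markdown"
)

# This is the same lookup performed by the default argument values
stars_now <- getOption("modelsummary_stars", default = FALSE)
output_now <- getOption("modelsummary_output", default = "default")
stars_now   # TRUE
output_now  # "markdown"

# Restore the settings that were in place before
options(old)
```

Saving the return value of options() and passing it back is a common way to limit a change to one block of code rather than the whole session.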
Options not specific to given arguments are listed below.
Model labels: default column names
This global option changes the style of the default column headers:
- options(modelsummary_model_labels = "roman")
The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"
Table-making packages
modelsummary supports 6 table-making packages: tinytable, kableExtra, gt,
flextable, huxtable, and DT. Some of these packages have overlapping
functionalities. To change the default backend used for a specific file
format, you can use the options function:
options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
Table themes
Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/vignettes/appearance.html
- modelsummary_theme_gt
- modelsummary_theme_kableExtra
- modelsummary_theme_huxtable
- modelsummary_theme_flextable
- modelsummary_theme_dataframe
Model extraction functions
modelsummary can use two sets of packages to extract information from
statistical models: the easystats family (performance and parameters)
and broom. By default, it uses easystats first and then falls back on
broom in case of failure. You can change the order of priorities or include
goodness-of-fit extracted by both packages by setting:
options(modelsummary_get = "easystats")
options(modelsummary_get = "broom")
options(modelsummary_get = "all")
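A rough sketch of how this option might be combined with the package's extraction helpers, get_estimates() and get_gof(), to inspect what is extracted from a model. The model below is an arbitrary example fitted on mtcars, and the option is restored at the end so the change does not persist.

```r
library(modelsummary)

# Arbitrary example model
mod <- lm(mpg ~ hp + wt, data = mtcars)

# Use only the easystats family for extraction, remembering the old setting
old <- options(modelsummary_get = "easystats")

est <- get_estimates(mod)  # one row per model term
gof <- get_gof(mod)        # goodness-of-fit statistics

# Restore the previous extraction setting
options(old)

est
gof
```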
Formatting numeric entries
By default, LaTeX tables enclose all numeric entries in the \num{} command
from the siunitx package. To prevent this behavior, or to enclose numbers
in dollar signs (for LaTeX math mode), users can call:
options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")
A similar option can be used to display numerical entries using MathJax in HTML tables:
options(modelsummary_format_numeric_html = "mathjax")
LaTeX preamble
When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.
\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}
Examples
library(modelsummary)
# The left-hand side of the formula describes rows, and the right-hand side
# describes columns. This table uses the "mpg" variable as a row and the "mean"
# function as a column:
datasummary(mpg ~ mean, data = mtcars)
# This table uses the "mean" function as a row and the "mpg" variable as a column:
datasummary(mean ~ mpg, data = mtcars)
# Display several variables or functions of the data using the "+"
# concatenation operator. This table has 2 rows and 2 columns:
datasummary(hp + mpg ~ mean + sd, data = mtcars)
# Nest variables or statistics inside a "factor" variable using the "*" nesting
# operator. This table shows the mean of "hp" and "mpg" for each value of
# "cyl":
mtcars$cyl <- as.factor(mtcars$cyl)
datasummary(hp + mpg ~ cyl * mean, data = mtcars)
# If you don't want to convert your original data
# to factors, you can use the 'Factor()'
# function inside 'datasummary' to obtain an identical result:
datasummary(hp + mpg ~ Factor(cyl) * mean, data = mtcars)
# You can nest several variables or statistics inside a factor by using
# parentheses. This table shows the mean and the standard deviation for each
# subset of "cyl":
datasummary(hp + mpg ~ cyl * (mean + sd), data = mtcars)
# Summarize all numeric variables with 'All()'
datasummary(All(mtcars) ~ mean + sd, data = mtcars)
# Define custom summary statistics. Your custom function should accept a vector
# of numeric values and return a single numeric or string value:
minmax <- function(x) sprintf("[%.2f, %.2f]", min(x), max(x))
mean_na <- function(x) mean(x, na.rm = TRUE)
datasummary(hp + mpg ~ minmax + mean_na, data = mtcars)
# To handle missing values, you can pass arguments to your functions using
# '*Arguments()'
datasummary(hp + mpg ~ mean * Arguments(na.rm = TRUE), data = mtcars)
# For convenience, 'modelsummary' supplies several convenience functions
# with the argument `na.rm=TRUE` by default: Mean, Median, Min, Max, SD, Var,
# P0, P25, P50, P75, P100, NUnique, Histogram
#datasummary(hp + mpg ~ Mean + SD + Histogram, data = mtcars)
# These functions also accept a 'fmt' argument which allows you to
# round/format the results
datasummary(hp + mpg ~ Mean * Arguments(fmt = "%.3f") + SD * Arguments(fmt = "%.1f"), data = mtcars)
# Save your tables to a variety of output formats:
f <- hp + mpg ~ Mean + SD
#datasummary(f, data = mtcars, output = 'table.html')
#datasummary(f, data = mtcars, output = 'table.tex')
#datasummary(f, data = mtcars, output = 'table.md')
#datasummary(f, data = mtcars, output = 'table.docx')
#datasummary(f, data = mtcars, output = 'table.pptx')
#datasummary(f, data = mtcars, output = 'table.jpg')
#datasummary(f, data = mtcars, output = 'table.png')
# Display human-readable code
#datasummary(f, data = mtcars, output = 'html')
#datasummary(f, data = mtcars, output = 'markdown')
#datasummary(f, data = mtcars, output = 'latex')
# Return a table object to customize using a table-making package
#datasummary(f, data = mtcars, output = 'tinytable')
#datasummary(f, data = mtcars, output = 'gt')
#datasummary(f, data = mtcars, output = 'kableExtra')
#datasummary(f, data = mtcars, output = 'flextable')
#datasummary(f, data = mtcars, output = 'huxtable')
# add_rows
new_rows <- data.frame(a = 1:2, b = 2:3, c = 4:5)
attr(new_rows, 'position') <- c(1, 3)
datasummary(mpg + hp ~ mean + sd, data = mtcars, add_rows = new_rows)
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Balance table: Summary statistics for different subsets of the data (e.g., control and treatment groups)
Description
Creates balance tables with summary statistics for different subsets of the
data (e.g., control and treatment groups). It can also be used to create
summary tables for full data sets. See the Details and Examples sections
below, and the vignettes on the modelsummary website:
https://modelsummary.com/
https://modelsummary.com/vignettes/datasummary.html
Usage
datasummary_balance(
formula,
data,
output = getOption("modelsummary_output", default = "default"),
fmt = fmt_decimal(digits = 1, pdigits = 3),
title = getOption("modelsummary_title", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
align = getOption("modelsummary_align", default = NULL),
stars = getOption("modelsummary_stars", default = FALSE),
add_columns = getOption("modelsummary_add_columns", default = NULL),
add_rows = getOption("modelsummary_add_rows", default = NULL),
dinm = getOption("modelsummary_dinm", default = TRUE),
dinm_statistic = getOption("modelsummary_dinm_statistic", default = "std.error"),
escape = getOption("modelsummary_escape", default = TRUE),
...
)
Arguments
formula |
|
data |
A data.frame (or tibble). If this data includes columns called
"blocks", "clusters", and/or "weights", the "estimatr" package will consider
them when calculating the difference in means. If there is a |
output |
filename or object type (character string)
|
fmt |
how to format numeric values: integer, user-supplied function, or
|
title |
string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as |
notes |
list or vector of notes to append to the bottom of the table. |
align |
A string with a number of characters equal to the number of columns in
the table (e.g.,
|
stars |
to indicate statistical significance
|
add_columns |
a data.frame (or tibble) with the same number of rows as your main table. |
add_rows |
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. Positions can be defined using integers. In the
|
dinm |
TRUE calculates a difference in means with uncertainty
estimates. This option is only available if the |
dinm_statistic |
string: "std.error" or "p.value" |
escape |
boolean TRUE escapes or substitutes LaTeX/HTML characters which could
prevent the file from compiling/displaying. |
... |
all other arguments are passed through to the table-making
functions tinytable::tt, kableExtra::kbl, gt::gt, DT::datatable, etc. depending on the |
Version 2.0.0, kableExtra, and tinytable
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend.
Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
Global Options
The behavior of modelsummary can be modified by setting global options. In particular, most of the arguments for most of the package's functions can be set using global options. For example:
- options(modelsummary_output = "modelsummary_list")
- options(modelsummary_statistic = '({conf.low}, {conf.high})')
- options(modelsummary_stars = TRUE)
Options not specific to given arguments are listed below.
Model labels: default column names
This global option changes the style of the default column headers:
- options(modelsummary_model_labels = "roman")
The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"
Table-making packages
modelsummary supports 6 table-making packages: tinytable, kableExtra, gt,
flextable, huxtable, and DT. Some of these packages have overlapping
functionalities. To change the default backend used for a specific file
format, you can use the options function:
options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
Table themes
Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/vignettes/appearance.html
- modelsummary_theme_gt
- modelsummary_theme_kableExtra
- modelsummary_theme_huxtable
- modelsummary_theme_flextable
- modelsummary_theme_dataframe
Model extraction functions
modelsummary can use two sets of packages to extract information from
statistical models: the easystats family (performance and parameters)
and broom. By default, it uses easystats first and then falls back on
broom in case of failure. You can change the order of priorities or include
goodness-of-fit extracted by both packages by setting:
options(modelsummary_get = "easystats")
options(modelsummary_get = "broom")
options(modelsummary_get = "all")
Formatting numeric entries
By default, LaTeX tables enclose all numeric entries in the \num{} command
from the siunitx package. To prevent this behavior, or to enclose numbers
in dollar signs (for LaTeX math mode), users can call:
options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")
A similar option can be used to display numerical entries using MathJax in HTML tables:
options(modelsummary_format_numeric_html = "mathjax")
LaTeX preamble
When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.
\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}
Examples
library(modelsummary)
datasummary_balance(~am, mtcars)
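A slightly fuller sketch, under the assumption that a small subset of mtcars with a relabelled am column is a reasonable stand-in for treatment and control groups. Setting dinm = FALSE skips the difference in means, and output = "data.frame" returns a plain data frame. The second call uses the formula ~ 1 to summarize the full data set without splitting it into groups.

```r
library(modelsummary)

# Illustrative data: a few numeric columns plus a two-level grouping variable
dat <- mtcars[, c("mpg", "hp", "wt", "am")]
dat$am <- ifelse(dat$am == 1, "Manual", "Automatic")

# Summary statistics by group, without the difference in means
tab <- datasummary_balance(~ am, data = dat, dinm = FALSE, output = "data.frame")
tab

# Summary statistics for the full data set
tab_all <- datasummary_balance(~ 1, data = dat, output = "data.frame")
tab_all
```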
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Generate a correlation table for all numeric variables in your dataset.
Description
The names of the variables displayed in the correlation table are the names
of the columns in the data. You can rename those columns (with or without
spaces) to produce a table of human-readable variables. See the Details and
Examples sections below, and the vignettes on the modelsummary website:
https://modelsummary.com/
https://modelsummary.com/vignettes/datasummary.html
Usage
datasummary_correlation(
data,
output = getOption("modelsummary_output", default = "default"),
method = getOption("modelsummary_method", default = "pearson"),
fmt = 2,
align = getOption("modelsummary_align", default = NULL),
add_rows = getOption("modelsummary_add_rows", default = NULL),
add_columns = getOption("modelsummary_add_columns", default = NULL),
title = getOption("modelsummary_title", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
escape = getOption("modelsummary_escape", default = TRUE),
stars = getOption("modelsummary_stars", default = FALSE),
...
)
Arguments
data |
A data.frame (or tibble) |
output |
filename or object type (character string)
|
method |
character or function
|
fmt |
how to format numeric values: integer, user-supplied function, or
|
align |
A string with a number of characters equal to the number of columns in
the table (e.g.,
|
add_rows |
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. Positions can be defined using integers. In the
|
add_columns |
a data.frame (or tibble) with the same number of rows as your main table. |
title |
string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as |
notes |
list or vector of notes to append to the bottom of the table. |
escape |
boolean TRUE escapes or substitutes LaTeX/HTML characters which could
prevent the file from compiling/displaying. |
stars |
to indicate statistical significance
|
... |
other parameters are passed through to the table-making packages. |
Version 2.0.0, kableExtra, and tinytable
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend.
Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
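Because these are ordinary R options, the switch can be made reversible with base R alone. The following is a minimal sketch, not part of the package documentation: it stores the list returned by options() and restores it afterwards. The variable name old is an assumption made for illustration.

```r
# Switch the default backends to kableExtra and remember the previous values
old <- options(
  modelsummary_factory_default = "kableExtra",
  modelsummary_factory_latex = "kableExtra",
  modelsummary_factory_html = "kableExtra"
)

# The new value is visible for the rest of the session
getOption("modelsummary_factory_default")

# Restore whatever was set before (options that were unset become unset again)
options(old)
```

Saving and restoring the old values keeps the change local to a script instead of leaking into later code in the same session.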
Global Options
The behavior of modelsummary can be modified by setting global options. In particular, most of the arguments for most of the package's functions can be set using global options. For example:
- options(modelsummary_output = "modelsummary_list")
- options(modelsummary_statistic = '({conf.low}, {conf.high})')
- options(modelsummary_stars = TRUE)
Options not specific to given arguments are listed below.
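The Usage sections show the mechanism behind this: each argument default is a getOption() call with a fallback value. A minimal base R sketch of that lookup, using the modelsummary_stars option as an example (the variable names before and after are assumptions made for illustration):

```r
# Start from a clean state: make sure the option is unset
options(modelsummary_stars = NULL)

# Unset option: getOption() returns the fallback supplied as `default`
before <- getOption("modelsummary_stars", default = FALSE)

# Once the option is set, the same lookup returns the user's value
options(modelsummary_stars = TRUE)
after <- getOption("modelsummary_stars", default = FALSE)

# Unset the option again so later code sees the package default
options(modelsummary_stars = NULL)

c(before = before, after = after)
```

An argument passed explicitly in a function call still takes precedence, since the option only supplies the default.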
Model labels: default column names
This global option changes the style of the default column headers:
- options(modelsummary_model_labels = "roman")
The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"
Table-making packages
modelsummary supports 6 table-making packages: tinytable, kableExtra, gt, flextable, huxtable, and DT. Some of these packages have overlapping functionalities. To change the default backend used for a specific file format, you can use the options function:
options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
Table themes
Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/vignettes/appearance.html
- modelsummary_theme_gt
- modelsummary_theme_kableExtra
- modelsummary_theme_huxtable
- modelsummary_theme_flextable
- modelsummary_theme_dataframe
Model extraction functions
modelsummary can use two sets of packages to extract information from statistical models: the easystats family (performance and parameters) and broom. By default, it uses easystats first and then falls back on broom in case of failure. You can change the order of priorities or include goodness-of-fit extracted by both packages by setting:
options(modelsummary_get = "easystats")
options(modelsummary_get = "broom")
options(modelsummary_get = "all")
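As a rough illustration (not taken from the package documentation), the option can be set before calling the exported helper get_gof(), which returns the goodness-of-fit statistics of one model as a data frame. The model and the object names mod, old, and gof are assumptions made for this sketch.

```r
library(modelsummary)

# Illustrative model; any supported model object would do
mod <- lm(mpg ~ hp + wt, data = mtcars)

# Use only the easystats extractors, remembering the previous setting
old <- options(modelsummary_get = "easystats")
gof <- get_gof(mod)

# Restore the previous setting
options(old)

# One row of goodness-of-fit statistics for the model
gof
```

Setting the option to "broom" or "all" follows the same pattern, provided the broom package is installed.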
Formatting numeric entries
By default, LaTeX tables enclose all numeric entries in the \num{} command from the siunitx package. To prevent this behavior, or to enclose numbers in dollar signs (for LaTeX math mode), users can call:
options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")
A similar option can be used to display numerical entries using MathJax in HTML tables:
options(modelsummary_format_numeric_html = "mathjax")
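A short sketch of how the LaTeX option might be used in practice, assuming a simple linear model fitted to mtcars (the names mod and old are illustrative, not part of the package):

```r
library(modelsummary)

# Illustrative model
mod <- lm(mpg ~ hp, data = mtcars)

# Write numbers as plain text instead of wrapping them in \num{}
old <- options(modelsummary_format_numeric_latex = "plain")

# Print the LaTeX code of the table to inspect the numeric entries
modelsummary(mod, output = "latex")

# Restore the previous setting
options(old)
```

Comparing the printed code with and without the option is a quick way to check whether the siunitx package is still needed in the preamble.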
LaTeX preamble
When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.
\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}
Examples
library(modelsummary)

# clean variable names (base R)
dat <- mtcars[, c("mpg", "hp")]
colnames(dat) <- c("Miles / Gallon", "Horse Power")
datasummary_correlation(dat)

# clean variable names (tidyverse)
library(tidyverse)
dat <- mtcars %>%
  select(`Miles / Gallon` = mpg,
         `Horse Power` = hp)
datasummary_correlation(dat)

# `correlation` package objects
if (requireNamespace("correlation", quietly = TRUE)) {
  co <- correlation::correlation(mtcars[, 1:4])
  datasummary_correlation(co)

  # add stars to easycorrelation objects
  datasummary_correlation(co, stars = TRUE)
}

# alternative methods
datasummary_correlation(dat, method = "pearspear")

# custom function
cor_fun <- function(x) cor(x, method = "kendall")
datasummary_correlation(dat, method = cor_fun)

# rename columns alphabetically and include a footnote for reference
note <- sprintf("(%s) %s", letters[1:ncol(dat)], colnames(dat))
note <- paste(note, collapse = "; ")
colnames(dat) <- sprintf("(%s)", letters[1:ncol(dat)])
datasummary_correlation(dat, notes = note)

# `datasummary_correlation_format`: custom function with formatting
dat <- mtcars[, c("mpg", "hp", "disp")]
cor_fun <- function(x) {
  out <- cor(x, method = "kendall")
  datasummary_correlation_format(
    out,
    fmt = 2,
    upper_triangle = "x",
    diagonal = ".")
}
datasummary_correlation(dat, method = cor_fun)

# use kableExtra and psych to color significant cells
library(psych)
library(kableExtra)
dat <- mtcars[, c("vs", "hp", "gear")]
cor_fun <- function(dat) {
  # compute correlations and format them
  correlations <- data.frame(cor(dat))
  correlations <- datasummary_correlation_format(correlations, fmt = 2)

  # calculate pvalues using the `psych` package
  pvalues <- psych::corr.test(dat)$p

  # use `kableExtra::cell_spec` to color significant cells
  for (i in 1:nrow(correlations)) {
    for (j in 1:ncol(correlations)) {
      if (pvalues[i, j] < 0.05 && i != j) {
        correlations[i, j] <- cell_spec(correlations[i, j], background = "pink")
      }
    }
  }
  return(correlations)
}

# The `escape=FALSE` is important here!
datasummary_correlation(dat, method = cor_fun, escape = FALSE)
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Format the content of a correlation table
Description
Mostly for internal use, but can be useful when users supply a function to the method argument of datasummary_correlation.
Usage
datasummary_correlation_format(
x,
fmt,
leading_zero = FALSE,
diagonal = NULL,
upper_triangle = NULL,
stars = FALSE
)
Arguments
x |
square numeric matrix |
fmt |
how to format numeric values: integer, user-supplied function, or
|
leading_zero |
boolean. If |
diagonal |
character or NULL. If character, all elements of the diagonal are replaced by the same character (e.g., "1"). |
upper_triangle |
character or NULL. If character, all elements of the upper triangle are replaced by the same character (e.g., "" or "."). |
stars |
to indicate statistical significance
|
Examples
library(modelsummary)
dat <- mtcars[, c("mpg", "hp", "disp")]
cor_fun <- function(x) {
  out <- cor(x, method = "kendall")
  datasummary_correlation_format(
    out,
    fmt = 2,
    upper_triangle = "x",
    diagonal = ".")
}
datasummary_correlation(dat, method = cor_fun)
Cross tabulations for categorical variables
Description
Convenience function to tabulate counts, cell percentages, and row/column
percentages for categorical variables. See the Details section for a
description of the internal design. For more complex cross tabulations, use
datasummary directly. See the Details and Examples sections below,
and the vignettes on the modelsummary
website:
https://modelsummary.com/
https://modelsummary.com/vignettes/datasummary.html
Usage
datasummary_crosstab(
formula,
statistic = 1 ~ 1 + N + Percent("row"),
data,
output = getOption("modelsummary_output", default = "default"),
fmt = 1,
title = getOption("modelsummary_title", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
align = getOption("modelsummary_align", default = NULL),
add_columns = getOption("modelsummary_add_columns", default = NULL),
add_rows = getOption("modelsummary_add_rows", default = NULL),
sparse_header = getOption("modelsummary_sparse_header", default = TRUE),
escape = getOption("modelsummary_escape", default = TRUE),
...
)
Arguments
formula |
A two-sided formula to describe the table: rows ~ columns,
where rows and columns are variables in the data. Rows and columns may
contain interactions, e.g., |
statistic |
A formula of the form |
data |
A data.frame (or tibble) |
output |
filename or object type (character string)
|
fmt |
how to format numeric values: integer, user-supplied function, or
|
title |
string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as |
notes |
list or vector of notes to append to the bottom of the table. |
align |
A string with a number of characters equal to the number of columns in
the table (e.g.,
|
add_columns |
a data.frame (or tibble) with the same number of rows as your main table. |
add_rows |
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. Positions can be defined using integers. In the
|
sparse_header |
TRUE or FALSE. TRUE eliminates column headers which
have a unique label across all columns, except for the row immediately above
the data. FALSE keeps all headers. The order in which terms are entered in
the formula determines the order in which headers appear. For example,
|
escape |
boolean TRUE escapes or substitutes LaTeX/HTML characters which could
prevent the file from compiling/displaying. |
... |
all other arguments are passed through to the table-making
functions tinytable::tt, kableExtra::kbl, gt::gt, DT::datatable, etc. depending on the |
Details
datasummary_crosstab is a wrapper around the datasummary function. This wrapper works by creating a customized formula and by feeding it to datasummary. The customized formula comes in two parts.
First, we take a two-sided formula supplied by the formula argument. All variables of that formula are wrapped in a Factor() call to ensure that the variables are treated as categorical.
Second, the statistic argument gives a two-sided formula which specifies the statistics to include in the table. datasummary_crosstab modifies this formula automatically to include "clean" labels.
Finally, the formula and statistic formulas are combined into a single formula which is fed directly to the datasummary function to produce the table.
Variables in formula are automatically wrapped in Factor().
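The wrapping step can be approximated by hand. The sketch below compares the wrapper with a direct datasummary() call in which both variables are wrapped in Factor(). It is an approximation for illustration only, not the exact formula the wrapper builds internally, so labels and layout may differ. The object names wrapped and direct are assumptions.

```r
library(modelsummary)

# Wrapper: counts only, no totals
wrapped <- datasummary_crosstab(cyl ~ gear, statistic = ~ N,
                                data = mtcars, output = "data.frame")

# Hand-written approximation: both variables wrapped in Factor(),
# with the N counting function nested inside the column variable
direct <- datasummary(Factor(cyl) ~ Factor(gear) * N,
                      data = mtcars, output = "data.frame")

wrapped
direct
```

Returning data frames with output = "data.frame" makes the two results easy to compare side by side.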
Version 2.0.0, kableExtra, and tinytable
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend.
Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
Global Options
The behavior of modelsummary can be modified by setting global options. In particular, most of the arguments for most of the package's functions can be set using global options. For example:
- options(modelsummary_output = "modelsummary_list")
- options(modelsummary_statistic = '({conf.low}, {conf.high})')
- options(modelsummary_stars = TRUE)
Options not specific to given arguments are listed below.
Model labels: default column names
This global option changes the style of the default column headers:
- options(modelsummary_model_labels = "roman")
The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"
Table-making packages
modelsummary supports 6 table-making packages: tinytable, kableExtra, gt, flextable, huxtable, and DT. Some of these packages have overlapping functionalities. To change the default backend used for a specific file format, you can use the options function:
options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
Table themes
Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/vignettes/appearance.html
- modelsummary_theme_gt
- modelsummary_theme_kableExtra
- modelsummary_theme_huxtable
- modelsummary_theme_flextable
- modelsummary_theme_dataframe
Model extraction functions
modelsummary can use two sets of packages to extract information from statistical models: the easystats family (performance and parameters) and broom. By default, it uses easystats first and then falls back on broom in case of failure. You can change the order of priorities or include goodness-of-fit extracted by both packages by setting:
options(modelsummary_get = "easystats")
options(modelsummary_get = "broom")
options(modelsummary_get = "all")
Formatting numeric entries
By default, LaTeX tables enclose all numeric entries in the \num{} command from the siunitx package. To prevent this behavior, or to enclose numbers in dollar signs (for LaTeX math mode), users can call:
options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")
A similar option can be used to display numerical entries using MathJax in HTML tables:
options(modelsummary_format_numeric_html = "mathjax")
LaTeX preamble
When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.
\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}
Examples
library(modelsummary)

# crosstab of two variables, showing counts, row percentages, and row/column totals
datasummary_crosstab(cyl ~ gear, data = mtcars)

# crosstab of two variables, showing counts only and no totals
datasummary_crosstab(cyl ~ gear, statistic = ~ N, data = mtcars)

# crosstab of three variables
datasummary_crosstab(am * cyl ~ gear, data = mtcars)

# crosstab with two variables and column percentages
datasummary_crosstab(am ~ gear, statistic = ~ Percent("col"), data = mtcars)
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Draw a table from a data.frame
Description
Draw a table from a data.frame
Usage
datasummary_df(
data,
output = getOption("modelsummary_output", default = "default"),
fmt = 2,
align = getOption("modelsummary_align", default = NULL),
hrule = getOption("modelsummary_hrule", default = NULL),
title = getOption("modelsummary_title", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
add_rows = getOption("modelsummary_add_rows", default = NULL),
add_columns = getOption("modelsummary_add_columns", default = NULL),
escape = getOption("modelsummary_escape", default = TRUE),
...
)
Arguments
data |
A data.frame (or tibble) |
output |
filename or object type (character string)
|
fmt |
how to format numeric values: integer, user-supplied function, or
|
align |
A string with a number of characters equal to the number of columns in
the table (e.g.,
|
hrule |
position of horizontal rules (integer vector) |
title |
string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as |
notes |
list or vector of notes to append to the bottom of the table. |
add_rows |
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. Positions can be defined using integers. In the
|
add_columns |
a data.frame (or tibble) with the same number of rows as your main table. |
escape |
boolean TRUE escapes or substitutes LaTeX/HTML characters which could
prevent the file from compiling/displaying. |
... |
all other arguments are passed through to the table-making
functions tinytable::tt, kableExtra::kbl, gt::gt, DT::datatable, etc. depending on the |
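This help page has no Examples section, so here is a minimal sketch of typical use, assuming only the arguments listed above. The data subset and the object names dat and tab are illustrative choices, not taken from the package documentation.

```r
library(modelsummary)

# A small data frame to display
dat <- head(mtcars[, c("mpg", "hp", "wt")], 5)

# Draw the table with one decimal place and a title
datasummary_df(dat, fmt = 1, title = "First five rows of mtcars")

# Return a plain data.frame instead of drawing a table
tab <- datasummary_df(dat, output = "data.frame")
tab
```

The output argument accepts the same values as in the other datasummary functions, so the same call can also write to a file name.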
Version 2.0.0, kableExtra, and tinytable
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend.
Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Quick overview of numeric or categorical variables
Description
This function was inspired by the excellent skimr package for R.
See the Details and Examples sections below, and the vignettes on the modelsummary website:
https://modelsummary.com/
https://modelsummary.com/vignettes/datasummary.html
Usage
datasummary_skim(
data,
output = getOption("modelsummary_output", default = "default"),
type = getOption("modelsummary_type", default = "all"),
fmt = 1,
title = getOption("modelsummary_title", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
align = getOption("modelsummary_align", default = NULL),
escape = getOption("modelsummary_escape", default = TRUE),
by = getOption("modelsummary_by", default = NULL),
fun_numeric = getOption("modelsummary_fun_numeric", default = list(Unique = NUnique,
`Missing Pct.` = PercentMissing, Mean = Mean, SD = SD, Min = Min, Median = Median,
Max = Max, Histogram = function(x) "")),
...
)
Arguments
data |
A data.frame (or tibble) |
output |
filename or object type (character string)
|
type |
String. Variables to summarize: "all", "numeric", "categorical", "dataset" |
fmt |
how to format numeric values: integer, user-supplied function, or
|
title |
string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as |
notes |
list or vector of notes to append to the bottom of the table. |
align |
A string with a number of characters equal to the number of columns in
the table (e.g.,
|
escape |
boolean TRUE escapes or substitutes LaTeX/HTML characters which could
prevent the file from compiling/displaying. |
by |
Character vector of grouping variables to compute statistics over. |
fun_numeric |
Named list of functions to apply to each numeric column of |
... |
all other arguments are passed through to the table-making
functions tinytable::tt, kableExtra::kbl, gt::gt, DT::datatable, etc. depending on the |
Version 2.0.0, kableExtra, and tinytable
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend.
Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
Global Options
The behavior of modelsummary can be modified by setting global options. In particular, most of the arguments for most of the package's functions can be set using global options. For example:
- options(modelsummary_output = "modelsummary_list")
- options(modelsummary_statistic = '({conf.low}, {conf.high})')
- options(modelsummary_stars = TRUE)
Options not specific to given arguments are listed below.
Model labels: default column names
This global option changes the style of the default column headers:
- options(modelsummary_model_labels = "roman")
The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"
Table-making packages
modelsummary supports 6 table-making packages: tinytable, kableExtra, gt, flextable, huxtable, and DT. Some of these packages have overlapping functionalities. To change the default backend used for a specific file format, you can use the options function:
options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
Table themes
Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/vignettes/appearance.html
- modelsummary_theme_gt
- modelsummary_theme_kableExtra
- modelsummary_theme_huxtable
- modelsummary_theme_flextable
- modelsummary_theme_dataframe
Model extraction functions
modelsummary can use two sets of packages to extract information from statistical models: the easystats family (performance and parameters) and broom. By default, it uses easystats first and then falls back on broom in case of failure. You can change the order of priorities or include goodness-of-fit extracted by both packages by setting:
options(modelsummary_get = "easystats")
options(modelsummary_get = "broom")
options(modelsummary_get = "all")
Formatting numeric entries
By default, LaTeX tables enclose all numeric entries in the \num{} command from the siunitx package. To prevent this behavior, or to enclose numbers in dollar signs (for LaTeX math mode), users can call:
options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")
A similar option can be used to display numerical entries using MathJax in HTML tables:
options(modelsummary_format_numeric_html = "mathjax")
LaTeX preamble
When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.
\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Examples
dat <- mtcars
dat$vs <- as.logical(dat$vs)
dat$cyl <- as.factor(dat$cyl)
datasummary_skim(dat)
datasummary_skim(dat, type = "categorical")
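The fun_numeric argument listed under Usage can be used to choose which summary columns appear for numeric variables. The sketch below is a hedged illustration that keeps four of the package's shortcut functions (Mean, SD, Min, Max). The list name stats, the object name tab, and the choice of columns are assumptions made for this example.

```r
library(modelsummary)

# Keep only four summary columns for the numeric variables
stats <- list(Mean = Mean, SD = SD, Min = Min, Max = Max)

# Summarize three numeric columns and return the result as a data.frame
tab <- datasummary_skim(
  mtcars[, c("mpg", "hp", "wt")],
  type = "numeric",
  fun_numeric = stats,
  output = "data.frame"
)
tab
```

Each element of the list becomes one column of the table, named after the list element.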
safe do.call
Description
safe do.call
Usage
do_call(fun, args)
dsummary() is a shortcut to datasummary()
Description
datasummary can use any summary function which produces one numeric or character value per variable. The examples section of this documentation shows how to define custom summary functions.
modelsummary also supplies several shortcut summary functions which can be used in datasummary() formulas: Min, Max, Mean, Median, Var, SD, NPercent, NUnique, Ncol, P0, P25, P50, P75, P100.
See the Details and Examples sections below, and the vignettes on the modelsummary website:
https://modelsummary.com/
https://modelsummary.com/vignettes/datasummary.html
Usage
dsummary(
formula,
data,
output = getOption("modelsummary_output", default = "default"),
fmt = 2,
title = getOption("modelsummary_title", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
align = getOption("modelsummary_align", default = NULL),
add_columns = getOption("modelsummary_add_columns", default = NULL),
add_rows = getOption("modelsummary_add_rows", default = NULL),
sparse_header = getOption("modelsummary_sparse_header", default = TRUE),
escape = getOption("modelsummary_escape", default = TRUE),
...
)
Arguments
formula |
A two-sided formula to describe the table: rows ~ columns. See the Examples section for a mini-tutorial and the Details section for more resources. Grouping/nesting variables can appear on both sides of the formula, but all summary functions must be on one side. |
data |
A data.frame (or tibble) |
output |
filename or object type (character string)
|
fmt |
how to format numeric values: integer, user-supplied function, or
|
title |
string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as |
notes |
list or vector of notes to append to the bottom of the table. |
align |
A string with a number of characters equal to the number of columns in
the table (e.g.,
|
add_columns |
a data.frame (or tibble) with the same number of rows as your main table. |
add_rows |
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. Positions can be defined using integers. In the
|
sparse_header |
TRUE or FALSE. TRUE eliminates column headers which
have a unique label across all columns, except for the row immediately above
the data. FALSE keeps all headers. The order in which terms are entered in
the formula determines the order in which headers appear. For example,
|
escape |
boolean TRUE escapes or substitutes LaTeX/HTML characters which could
prevent the file from compiling/displaying. |
... |
all other arguments are passed through to the table-making
functions tinytable::tt, kableExtra::kbl, gt::gt, DT::datatable, etc. depending on the |
Details
Visit the 'modelsummary' website for more usage examples: https://modelsummary.com
The 'datasummary' function is a thin wrapper around the 'tabular' function from the 'tables' package. More details about table-making formulas can be found in the 'tables' package documentation: ?tables::tabular
Hierarchical or "nested" column labels are only available for these output formats: tinytable, kableExtra, gt, html, rtf, and LaTeX. When saving tables to other formats, nested labels will be combined to a "flat" header.
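Since dsummary() is described as a shortcut to datasummary(), the two calls in the sketch below are expected to produce the same table. The formula and the object names f, short, and full are illustrative assumptions.

```r
library(modelsummary)

# One formula, two entry points
f <- mpg + hp ~ Mean + SD
short <- dsummary(f, data = mtcars, output = "data.frame")
full <- datasummary(f, data = mtcars, output = "data.frame")

# Both calls should return tables with the same dimensions
identical(dim(short), dim(full))
short
```

Using output = "data.frame" returns a plain data frame, which is convenient for comparing results programmatically.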
Version 2.0.0, kableExtra, and tinytable
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend.
Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
Global Options
The behavior of modelsummary can be modified by setting global options. In particular, most of the arguments for most of the package's functions can be set using global options. For example:
- options(modelsummary_output = "modelsummary_list")
- options(modelsummary_statistic = '({conf.low}, {conf.high})')
- options(modelsummary_stars = TRUE)
Options not specific to given arguments are listed below.
Model labels: default column names
This global option changes the style of the default column headers:
- options(modelsummary_model_labels = "roman")
The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"
Table-making packages
modelsummary supports 6 table-making packages: tinytable, kableExtra, gt, flextable, huxtable, and DT. Some of these packages have overlapping functionalities. To change the default backend used for a specific file format, you can use the options function:
options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
Table themes
Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/vignettes/appearance.html
- modelsummary_theme_gt
- modelsummary_theme_kableExtra
- modelsummary_theme_huxtable
- modelsummary_theme_flextable
- modelsummary_theme_dataframe
Model extraction functions
modelsummary can use two sets of packages to extract information from statistical models: the easystats family (performance and parameters) and broom. By default, it uses easystats first and then falls back on broom in case of failure. You can change the order of priorities or include goodness-of-fit extracted by both packages by setting:
options(modelsummary_get = "easystats")
options(modelsummary_get = "broom")
options(modelsummary_get = "all")
Formatting numeric entries
By default, LaTeX tables enclose all numeric entries in the \num{} command from the siunitx package. To prevent this behavior, or to enclose numbers in dollar signs (for LaTeX math mode), users can call:
options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")
A similar option can be used to display numerical entries using MathJax in HTML tables:
options(modelsummary_format_numeric_html = "mathjax")
LaTeX preamble
When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.
\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}
Examples
library(modelsummary)

# The left-hand side of the formula describes rows, and the right-hand side
# describes columns. This table uses the "mpg" variable as a row and the "mean"
# function as a column:
datasummary(mpg ~ mean, data = mtcars)

# This table uses the "mean" function as a row and the "mpg" variable as a column:
datasummary(mean ~ mpg, data = mtcars)

# Display several variables or functions of the data using the "+"
# concatenation operator. This table has 2 rows and 2 columns:
datasummary(hp + mpg ~ mean + sd, data = mtcars)

# Nest variables or statistics inside a "factor" variable using the "*" nesting
# operator. This table shows the mean of "hp" and "mpg" for each value of
# "cyl":
mtcars$cyl <- as.factor(mtcars$cyl)
datasummary(hp + mpg ~ cyl * mean, data = mtcars)

# If you don't want to convert your original data to factors, you can use the
# 'Factor()' function inside 'datasummary' to obtain an identical result:
datasummary(hp + mpg ~ Factor(cyl) * mean, data = mtcars)

# You can nest several variables or statistics inside a factor by using
# parentheses. This table shows the mean and the standard deviation for each
# subset of "cyl":
datasummary(hp + mpg ~ cyl * (mean + sd), data = mtcars)

# Summarize all numeric variables with 'All()'
datasummary(All(mtcars) ~ mean + sd, data = mtcars)

# Define custom summary statistics. Your custom function should accept a vector
# of numeric values and return a single numeric or string value:
minmax <- function(x) sprintf("[%.2f, %.2f]", min(x), max(x))
mean_na <- function(x) mean(x, na.rm = TRUE)
datasummary(hp + mpg ~ minmax + mean_na, data = mtcars)

# To handle missing values, you can pass arguments to your functions using
# '*Arguments()'
datasummary(hp + mpg ~ mean * Arguments(na.rm = TRUE), data = mtcars)

# For convenience, 'modelsummary' supplies several convenience functions
# with the argument `na.rm=TRUE` by default: Mean, Median, Min, Max, SD, Var,
# P0, P25, P50, P75, P100, NUnique, Histogram
#datasummary(hp + mpg ~ Mean + SD + Histogram, data = mtcars)

# These functions also accept a 'fmt' argument which allows you to
# round/format the results
datasummary(hp + mpg ~ Mean * Arguments(fmt = "%.3f") + SD * Arguments(fmt = "%.1f"), data = mtcars)

# Save your tables to a variety of output formats:
f <- hp + mpg ~ Mean + SD
#datasummary(f, data = mtcars, output = 'table.html')
#datasummary(f, data = mtcars, output = 'table.tex')
#datasummary(f, data = mtcars, output = 'table.md')
#datasummary(f, data = mtcars, output = 'table.docx')
#datasummary(f, data = mtcars, output = 'table.pptx')
#datasummary(f, data = mtcars, output = 'table.jpg')
#datasummary(f, data = mtcars, output = 'table.png')

# Display human-readable code
#datasummary(f, data = mtcars, output = 'html')
#datasummary(f, data = mtcars, output = 'markdown')
#datasummary(f, data = mtcars, output = 'latex')

# Return a table object to customize using a table-making package
#datasummary(f, data = mtcars, output = 'tinytable')
#datasummary(f, data = mtcars, output = 'gt')
#datasummary(f, data = mtcars, output = 'kableExtra')
#datasummary(f, data = mtcars, output = 'flextable')
#datasummary(f, data = mtcars, output = 'huxtable')

# add_rows
new_rows <- data.frame(a = 1:2, b = 2:3, c = 4:5)
attr(new_rows, 'position') <- c(1, 3)
datasummary(mpg + hp ~ mean + sd, data = mtcars, add_rows = new_rows)
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Title models with their dependent variables
Description
A convenience function for use with a regression model or list of regression models. Returns a named list of models, where the names are the models' respective dependent variables. Pass your list of models to dvnames before sending to modelsummary to automatically get dependent variable-titled columns.
Usage
dvnames(models, number = FALSE, strip = FALSE, fill = "Model")
Arguments
models |
A regression model or list of regression models |
number |
Should the models be numbered (1), (2), etc., in addition to their dependent variable names? |
strip |
boolean FALSE returns the dependent variable names as they appear in the model. TRUE returns the dependent variable names as they appear in the data, without transformations. |
fill |
If |
Examples
m1 <- lm(mpg ~ hp, data = mtcars)
m2 <- lm(mpg ~ hp + wt, data = mtcars)
# Without dvnames, column names are (1) and (2)
modelsummary(list(m1, m2))
# With dvnames, they are "mpg" and "mpg"
modelsummary(dvnames(list(m1,m2)))
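The naming behavior described above can be checked without drawing a table. The following is a minimal sketch, assuming two illustrative models with different dependent variables (the object names `m_mpg`, `m_wt`, and `named` are made up for this example):

```r
library(modelsummary)

# Two models with different dependent variables
m_mpg <- lm(mpg ~ hp, data = mtcars)
m_wt <- lm(wt ~ hp, data = mtcars)

# dvnames() returns the same models in a list named after each dependent variable
named <- dvnames(list(m_mpg, m_wt))
names(named)

# number = TRUE additionally numbers the models
names(dvnames(list(m_mpg, m_wt), number = TRUE))
```

Because the result is an ordinary named list, it can be passed directly to modelsummary() as in the example above.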
Escape problematic characters to allow display in HTML
Description
Copied from knitr for internal use because it is unexported and CRAN rejects :::
Usage
escape_html(x)
Arguments
x |
a character string to escape |
Escape problematic characters to allow compilation in LaTeX
Description
Copied from knitr for internal use because it is unexported and CRAN rejects :::
Usage
escape_latex(x, newlines = FALSE, spaces = FALSE)
Arguments
x |
a character string to escape |
newlines |
boolean |
spaces |
boolean |
Make sure LaTeX and HTML are safe to compile
Description
Make sure LaTeX and HTML are safe to compile
Usage
escape_string(x, output_format = NULL)
Rounding with decimal digits in the fmt argument
Description
Rounding with decimal digits in the fmt argument
Usage
fmt_decimal(digits = 3, pdigits = NULL, ...)
Arguments
digits |
Number of decimal digits to keep, including trailing zeros. |
pdigits |
Number of decimal digits to keep for p values. If |
... |
Additional arguments are passed to the |
Rounding with number of digits determined by an equivalence test
Description
This function implements the suggestions of Astier & Wolak for the number of decimal digits to keep for coefficient estimates. The other statistics are rounded by fmt_significant().
Usage
fmt_equivalence(conf_level = 0.95, digits = 3, pdigits = NULL, ...)
Arguments
conf_level |
Confidence level to use for the equivalence test (1 - alpha). |
digits |
Number of significant digits to keep. |
pdigits |
Number of decimal digits to keep for p values. If |
... |
Additional arguments are passed to the |
References
Astier, Nicolas, and Frank A. Wolak. Credible Numbers: A Procedure for Reporting Statistical Precision in Parameter Estimates. No. w32124. National Bureau of Economic Research, 2024.
Examples
library(modelsummary)
mod <- lm(mpg ~ hp, mtcars)
# Default equivalence-based formatting
modelsummary(mod, fmt = fmt_equivalence())
# alpha = 0.2
modelsummary(mod, fmt = fmt_equivalence(conf_level = .8))
# default equivalence, but with alternative significant digits for other statistics
modelsummary(mod, fmt = fmt_equivalence(digits = 5))
Rounding using scientific notation
Description
Rounding using scientific notation
Usage
fmt_sci(digits = 3, ...)
Arguments
digits |
a positive integer indicating how many significant digits are to be used for numeric and complex |
... |
additional arguments passed to |
Rounding with significant digits in the fmt argument
Description
The number of decimal digits to keep after the decimal is assessed
Usage
fmt_significant(digits = 3, ...)
Arguments
digits |
Number of significant digits to keep. |
... |
Additional arguments are passed to the |
Rounding with the sprintf() function in the fmt argument
Description
Rounding with the sprintf() function in the fmt argument
Usage
fmt_sprintf(fmt)
Arguments
fmt |
A string to control |
Rounding with decimal digits on a per-statistic basis in the fmt argument for modelsummary()
Description
Rounding with decimal digits on a per-statistic basis in the fmt argument for modelsummary()
Usage
fmt_statistic(..., default = 3)
Arguments
... |
Statistic names and |
default |
Number of decimal digits to keep for unspecified terms |
Rounding with decimal digits on a per-term basis in the fmt argument for modelsummary()
Description
Rounding with decimal digits on a per-term basis in the fmt argument for modelsummary()
Usage
fmt_term(..., default = 3)
Arguments
... |
Term names and |
default |
Number of decimal digits to keep for unspecified terms |
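As an illustration of per-term rounding, the sketch below requests 4 decimals for hp, 1 for drat, and 2 for every other term. It assumes the "data.frame" output type so that the formatted strings can be inspected directly. The object names `mod` and `tab` are illustrative.

```r
library(modelsummary)

mod <- lm(mpg ~ hp + drat + qsec, data = mtcars)

# statistic = NULL and gof_map = NA keep only the coefficient estimates
tab <- modelsummary(
  mod,
  fmt = fmt_term(hp = 4, drat = 1, default = 2),
  statistic = NULL,
  gof_map = NA,
  output = "data.frame")
tab
```

The same pattern should apply to fmt_statistic(), with statistic names (e.g., estimate, conf.int) in place of term names.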
Extract model estimates in a tidy format.
Description
A unified approach to extract results from a wide variety of models. For some models get_estimates attaches useful attributes to the output. You can access this information by calling the attributes function:
attributes(get_estimates(model))
Usage
get_estimates(
model,
conf_level = 0.95,
vcov = NULL,
shape = NULL,
coef_rename = FALSE,
...
)
Arguments
model |
a single model object |
conf_level |
numeric value between 0 and 1. confidence level to use for
confidence intervals. Setting this argument to |
vcov |
robust standard errors and other manual statistics. The
|
shape |
|
coef_rename |
logical, named or unnamed character vector, or function
|
... |
all other arguments are passed through to three functions. See the documentation of these functions for lists of available arguments.
|
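A short usage sketch, assuming a simple linear model: get_estimates() returns one row per term, with the estimates and uncertainty statistics in columns. The object names `mod` and `est` are illustrative.

```r
library(modelsummary)

mod <- lm(mpg ~ hp + wt, data = mtcars)

# One row per coefficient; conf_level controls the width of the intervals
est <- get_estimates(mod, conf_level = 0.9)
est[, c("term", "estimate", "std.error", "conf.low", "conf.high")]

# Extra information, when available, is stored as attributes
names(attributes(est))
```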
Extract goodness-of-fit statistics in a tidy format.
Description
A unified approach to extract results from a wide variety of models. For some models get_gof attaches useful attributes to the output. You can access this information by calling the attributes function:
attributes(get_gof(model))
Usage
get_gof(model, gof_function = NULL, vcov_type = NULL, ...)
Arguments
model |
a single model object |
gof_function |
function which accepts a model object in the |
vcov_type |
string vcov type to add at the bottom of the table |
... |
all other arguments are passed through to three functions. See the documentation of these functions for lists of available arguments.
|
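A short usage sketch, assuming a simple linear model: get_gof() returns a single-row data frame with one column per goodness-of-fit statistic. The exact set of columns depends on the model class and on the extraction packages available. The object names `mod` and `gof` are illustrative.

```r
library(modelsummary)

mod <- lm(mpg ~ hp + wt, data = mtcars)

# One row, one column per statistic (e.g., number of observations, R2)
gof <- get_gof(mod)
gof
```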
Extract goodness-of-fit statistics from a single model using the broom package or another package which supplies a method for the generics::glance generic.
Description
Extract goodness-of-fit statistics from a single model using the broom package or another package which supplies a method for the generics::glance generic.
Usage
get_gof_broom(model, ...)
Extract goodness-of-fit statistics from a single model using the performance package
Description
Extract goodness-of-fit statistics from a single model using the performance package
Usage
get_gof_parameters(model, ...)
Allow users to override uncertainty estimates
Description
Allow users to override uncertainty estimates
Usage
get_vcov(model, vcov = NULL, ...)
Arguments
model |
object type with an available |
vcov |
robust standard errors and other manual statistics. The
|
... |
all other arguments are passed through to three functions. See the documentation of these functions for lists of available arguments.
|
Value
a numeric vector of test statistics
Allow users to override uncertainty estimates
Description
Allow users to override uncertainty estimates
Usage
## S3 method for class 'mlm'
get_vcov(model, vcov = NULL, conf_level = NULL, ...)
Arguments
model |
object type with an available |
vcov |
robust standard errors and other manual statistics. The
|
... |
all other arguments are passed through to three functions. See the documentation of these functions for lists of available arguments.
|
Value
a numeric vector of test statistics
Extract custom information from a model object and turn it into a tidy data.frame or tibble with a single row.
Description
To customize the output of a model of class lm, you can define a new method called glance_custom.lm which returns a one-row data.frame.
Usage
glance_custom(x, ...)
Arguments
x |
model or other R object to convert to single-row data frame |
... |
ellipsis |
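A minimal sketch of such a method, assuming we want to report the name of the dependent variable as an extra row at the bottom of the table. The column name "DV" is an arbitrary choice made for this example, not a name required by the package.

```r
library(modelsummary)

# Custom method for lm objects: must return a data.frame with a single row
glance_custom.lm <- function(x, ...) {
  data.frame("DV" = deparse(formula(x)[[2]]))
}

mod <- lm(mpg ~ hp, data = mtcars)

# Calling the generic directly shows what the method returns
glance_custom(mod)

# When the method is visible (e.g., defined in the global environment),
# modelsummary() can add this information to the bottom of the table
modelsummary(mod, output = "data.frame")
```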
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Description
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Usage
glance_custom_internal(x, ...)
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Description
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Usage
## Default S3 method:
glance_custom_internal(x, ...)
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Description
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Usage
## S3 method for class 'lm'
glance_custom_internal(x, vcov_type = NULL, gof = NULL, ...)
Data.frame used to clean up and format goodness-of-fit statistics
Description
By default, this data frame is passed to the 'gof_map' argument of the 'modelsummary' function. Users can modify this data frame to customize the list of statistics to display and their format. See example below.
Usage
gof_map
Format
data.frame with 4 columns of character data: raw, clean, fmt, omit
Examples
if (identical(Sys.getenv("pkgdown"), "true")) {
library(modelsummary)
mod <- lm(wt ~ drat, data = mtcars)
gm <- modelsummary::gof_map
gm$omit[gm$raw == 'deviance'] <- FALSE
gm$fmt[gm$raw == 'r.squared'] <- "%.5f"
modelsummary(mod, gof_map = gm)
}
Execute code silently
Description
Execute code silently
Usage
hush(code)
Add a label to a logical vector.
Description
Add a label to a logical vector.
Usage
labelSubset(subset, label)
Rename and reorder estimates from a single model (before merging to collapse)
Description
Rename and reorder estimates from a single model (before merging to collapse)
Usage
map_estimates(estimates, coef_rename, coef_map, coef_omit, group_map)
Internal function to subset, rename and re-order gof statistics
Description
Internal function to subset, rename and re-order gof statistics
Usage
map_gof(gof, gof_omit, gof_map)
Model Summary Plots with Estimates and Confidence Intervals
Description
Dot-Whisker plot of coefficient estimates with confidence intervals. For more information, see the Details and Examples sections below, and the vignettes on the modelsummary website: https://modelsummary.com/
Usage
modelplot(
models,
conf_level = getOption("modelsummary_conf_level", default = 0.95),
coef_map = getOption("modelsummary_coef_map", default = NULL),
coef_omit = getOption("modelsummary_coef_omit", default = NULL),
coef_rename = getOption("modelsummary_coef_rename", default = NULL),
vcov = getOption("modelsummary_vcov", default = NULL),
exponentiate = getOption("modelsummary_exponentiate", default = FALSE),
add_rows = getOption("modelsummary_add_rows", default = NULL),
facet = getOption("modelsummary_facet", default = FALSE),
draw = getOption("modelsummary_draw", default = TRUE),
background = getOption("modelsummary_background", default = NULL),
...
)
Arguments
models |
a model, (named) list of models, or nested list of models.
|
conf_level |
numeric value between 0 and 1. confidence level to use for
confidence intervals. Setting this argument to |
coef_map |
character vector. Subset, rename, and reorder coefficients.
Coefficients omitted from this vector are omitted from the table. The order
of the vector determines the order of the table. |
coef_omit |
integer vector or regular expression to identify which coefficients to omit (or keep) from the table. Positive integers determine which coefficients to omit. Negative integers determine which coefficients to keep. A regular expression can be used to omit coefficients, and perl-compatible "negative lookaheads" can be used to specify which coefficients to keep in the table. Examples:
|
coef_rename |
logical, named or unnamed character vector, or function
|
vcov |
robust standard errors and other manual statistics. The
|
exponentiate |
TRUE, FALSE, or logical vector of length equal to the
number of models. If TRUE, the |
add_rows |
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. Positions can be defined using integers. In the
|
facet |
TRUE or FALSE. When the 'models' argument includes several model objects, TRUE draws terms in separate facets, and FALSE draws terms side-by-side (dodged). |
draw |
TRUE returns a 'ggplot2' object, FALSE returns the data.frame used to draw the plot. |
background |
A list of 'ggplot2' geoms to add to the background of the plot. This is especially useful to display annotations "behind" the 'geom_pointrange' that 'modelplot' draws. |
... |
all other arguments are passed through to three functions. See the documentation of these functions for lists of available arguments.
|
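As a sketch of the draw argument described above: setting draw = FALSE returns the data behind the plot rather than a 'ggplot2' object, which is convenient for building a custom figure. The 'ggplot2' package is assumed to be installed. The object names `mod` and `dat` are illustrative.

```r
library(modelsummary)

mod <- lm(hp ~ vs + drat, data = mtcars)

# Data frame of estimates and confidence bounds, without the intercept
dat <- modelplot(mod, draw = FALSE, coef_omit = "Interc")
dat
```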
Examples
library(modelsummary)

# single model
mod <- lm(hp ~ vs + drat, mtcars)
modelplot(mod)

# omit terms with string matches or regexes
modelplot(mod, coef_omit = 'Interc')

# rename, reorder and subset with 'coef_map'
cm <- c('vs' = 'V-shape engine',
        'drat' = 'Rear axle ratio')
modelplot(mod, coef_map = cm)

# several models
models <- list()
models[['Small model']] <- lm(hp ~ vs, mtcars)
models[['Medium model']] <- lm(hp ~ vs + factor(cyl), mtcars)
models[['Large model']] <- lm(hp ~ vs + drat + factor(cyl), mtcars)
modelplot(models)

# add_rows: add an empty reference category
mod <- lm(hp ~ factor(cyl), mtcars)
add_rows <- data.frame(
  term = "factor(cyl)4",
  model = "(1)",
  estimate = NA)
attr(add_rows, "position") <- 3
modelplot(mod, add_rows = add_rows)

# customize your plots with 'ggplot2' functions
library(ggplot2)
modelplot(models) +
  scale_color_brewer(type = 'qual') +
  theme_classic()

# pass arguments to 'geom_pointrange' through the ... ellipsis
modelplot(mod, color = 'red', size = 1, fatten = .5)

# add geoms to the background, behind geom_pointrange
b <- list(geom_vline(xintercept = 0, color = 'orange'),
          annotate("rect", alpha = .1,
                   xmin = -.5, xmax = .5,
                   ymin = -Inf, ymax = Inf),
          geom_point(aes(y = term, x = estimate), alpha = .3,
                     size = 10, color = 'red', shape = 'square'))
modelplot(mod, background = b)

# logistic regression example
df <- as.data.frame(Titanic)
mod_titanic <- glm(
  Survived ~ Class + Sex,
  family = binomial,
  weights = Freq,
  data = df
)

# displaying odds ratio using a log scale
modelplot(mod_titanic, exponentiate = TRUE) +
  scale_x_log10() +
  xlab("Odds Ratios and 95% confidence intervals")
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Model Summary Tables
Description
Create beautiful and customizable tables to summarize several statistical models side-by-side. This function supports dozens of statistical models, and it can produce tables in HTML, LaTeX, Word, Markdown, Typst, PDF, PowerPoint, Excel, RTF, JPG, or PNG. The appearance of the tables can be customized extensively by specifying the output argument, and by using functions from one of the supported table customization packages: tinytable, kableExtra, gt, flextable, huxtable, DT. For more information, see the Details and Examples sections below, and the vignettes on the modelsummary website: https://modelsummary.com/
- The modelsummary Vignette includes dozens of examples of tables with extensive customizations.
- The Appearance Vignette shows how to modify the look of tables.
Usage
modelsummary(
models,
output = getOption("modelsummary_output", default = "default"),
fmt = getOption("modelsummary_fmt", default = 3),
estimate = getOption("modelsummary_estimate", default = "estimate"),
statistic = getOption("modelsummary_statistic", default = "std.error"),
vcov = getOption("modelsummary_vcov", default = NULL),
conf_level = getOption("modelsummary_conf_level", default = 0.95),
exponentiate = getOption("modelsummary_exponentiate", default = FALSE),
stars = getOption("modelsummary_stars", default = FALSE),
shape = getOption("modelsummary_shape", default = term + statistic ~ model),
coef_map = getOption("modelsummary_coef_map", default = NULL),
coef_omit = getOption("modelsummary_coef_omit", default = NULL),
coef_rename = getOption("modelsummary_coef_rename", default = FALSE),
gof_map = getOption("modelsummary_gof_map", default = NULL),
gof_omit = getOption("modelsummary_gof_omit", default = NULL),
gof_function = getOption("modelsummary_gof_function", default = NULL),
group_map = getOption("modelsummary_group_map", default = NULL),
add_columns = getOption("modelsummary_add_columns", default = NULL),
add_rows = getOption("modelsummary_add_rows", default = NULL),
align = getOption("modelsummary_align", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
title = getOption("modelsummary_title", default = NULL),
escape = getOption("modelsummary_escape", default = TRUE),
...
)
Arguments
models |
a model, (named) list of models, or nested list of models.
|
output |
filename or object type (character string)
|
fmt |
how to format numeric values: integer, user-supplied function, or
|
estimate |
a single string or a character vector of length equal to the
number of models. Valid entries include any column name of
the data.frame produced by
|
statistic |
vector of strings or
|
vcov |
robust standard errors and other manual statistics. The
|
conf_level |
numeric value between 0 and 1. confidence level to use for
confidence intervals. Setting this argument to |
exponentiate |
TRUE, FALSE, or logical vector of length equal to the
number of models. If TRUE, the |
stars |
to indicate statistical significance
|
shape |
|
coef_map |
character vector. Subset, rename, and reorder coefficients.
Coefficients omitted from this vector are omitted from the table. The order
of the vector determines the order of the table. |
coef_omit |
integer vector or regular expression to identify which coefficients to omit (or keep) from the table. Positive integers determine which coefficients to omit. Negative integers determine which coefficients to keep. A regular expression can be used to omit coefficients, and perl-compatible "negative lookaheads" can be used to specify which coefficients to keep in the table. Examples:
|
coef_rename |
logical, named or unnamed character vector, or function
|
gof_map |
rename, reorder, and omit goodness-of-fit statistics and other model information. This argument accepts 4 types of values:
|
gof_omit |
string regular expression (perl-compatible) used to determine which statistics to omit from the bottom section of the table. A "negative lookahead" can be used to specify which statistics to keep in the table. Examples:
|
gof_function |
function which accepts a model object in the |
group_map |
named or unnamed character vector. Subset, rename, and
reorder coefficient groups specified a grouping variable specified in the
|
add_columns |
a data.frame (or tibble) with the same number of rows as your main table. By default, columns are appended to the right of the table. You can define a "position" attribute of integers to set the column positions. See Examples section below. |
add_rows |
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. Positions can be defined using integers. In the
|
align |
A string with a number of characters equal to the number of columns in
the table (e.g.,
|
notes |
list or vector of notes to append to the bottom of the table. |
title |
string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as |
escape |
boolean TRUE escapes or substitutes LaTeX/HTML characters which could
prevent the file from compiling/displaying. |
... |
all other arguments are passed through to three functions. See the documentation of these functions for lists of available arguments.
|
Details
output
The modelsummary_list output is a lightweight format which can be used to save model results, so they can be fed back to modelsummary later to avoid extracting results again.
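A minimal sketch of this workflow, assuming the lightweight object is stored on disk with base R's saveRDS() (the file name and the object names `saved`, `restored`, and `tab` are illustrative):

```r
library(modelsummary)

models <- list(
  "Small" = lm(mpg ~ hp, data = mtcars),
  "Large" = lm(mpg ~ hp + wt, data = mtcars))

# Extract results once and keep only the lightweight representation
saved <- modelsummary(models, output = "modelsummary_list")

# Store and reload it (hypothetical file name in a temporary directory)
path <- file.path(tempdir(), "saved_models.rds")
saveRDS(saved, path)
restored <- readRDS(path)

# Feed the stored results back to modelsummary() to draw a table
tab <- modelsummary(restored, output = "data.frame")
tab
```

This can be useful when the original models are slow to fit or too large to keep in memory.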
When a file name with a valid extension is supplied to the output argument, the table is written immediately to file. If you want to customize your table by post-processing it with an external package, you need to choose a different output format and saving mechanism. Unfortunately, the approach differs from package to package:
- tinytable: set output="tinytable", post-process your table, and use the tinytable::save_tt function.
- gt: set output="gt", post-process your table, and use the gt::gtsave function.
- kableExtra: set output to your destination format (e.g., "latex", "html", "markdown"), post-process your table, and use the kableExtra::save_kable function.
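The tinytable route listed above might look like the following sketch. The styling step (bolding the first row) and the file name are arbitrary choices made for illustration.

```r
library(modelsummary)
library(tinytable)

mod <- lm(mpg ~ hp, data = mtcars)

# 1. Ask for a tinytable object instead of writing a file directly
tab <- modelsummary(mod, output = "tinytable")

# 2. Post-process the table with tinytable functions
tab <- style_tt(tab, i = 1, bold = TRUE)

# 3. Save the customized table (hypothetical file in a temporary directory)
path <- file.path(tempdir(), "table.html")
save_tt(tab, output = path, overwrite = TRUE)
```

The gt and kableExtra routes follow the same three steps, with gt::gtsave or kableExtra::save_kable as the final call.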
vcov
To use a string such as "robust" or "HC0", your model must be supported by the sandwich package. This includes objects such as: lm, glm, survreg, coxph, mlogit, polr, hurdle, zeroinfl, and more.
NULL, "classical", "iid", and "constant" are aliases which do not modify uncertainty estimates and simply report the default standard errors stored in the model object.
One-sided formulas such as ~clusterid are passed to the sandwich::vcovCL function.
Matrices and functions producing variance-covariance matrices are first passed to lmtest. If this does not work, modelsummary attempts to take the square root of the diagonal to adjust "std.error", but the other uncertainty estimates are not adjusted.
Numeric vectors are formatted according to fmt and placed in parentheses.
Character vectors are printed as given, without parentheses.
If your model type is supported by the lmtest package, the vcov argument will try to use that package to adjust all the uncertainty estimates, including "std.error", "statistic", "p.value", and "conf.int". If your model is not supported by lmtest, only the "std.error" will be adjusted by, for example, taking the square root of the matrix's diagonal.
Value
a regression table in a format determined by the output
argument.
Version 2.0.0, kableExtra, and tinytable
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend. Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
Global Options
The behavior of modelsummary can be modified by setting global options. In particular, most of the arguments for most of the package's functions can be set using global options. For example:
- options(modelsummary_output = "modelsummary_list")
- options(modelsummary_statistic = '({conf.low}, {conf.high})')
- options(modelsummary_stars = TRUE)
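A small sketch of how such options behave in practice, assuming we want to change the defaults temporarily and then restore the previous state. The object names `old` and `tab` are illustrative.

```r
library(modelsummary)

mod <- lm(mpg ~ hp, data = mtcars)

# options() returns the previous values, which makes it easy to restore them
old <- options(
  modelsummary_stars = TRUE,
  modelsummary_statistic = "({conf.low}, {conf.high})")

# No 'stars' or 'statistic' argument is needed: the global options apply
tab <- modelsummary(mod, output = "data.frame")
tab

# Restore the previous settings
options(old)
```

Arguments supplied explicitly in a call still take precedence, since the options only provide default values.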
Options not specific to given arguments are listed below.
Model labels: default column names
This global option changes the style of the default column headers:
- options(modelsummary_model_labels = "roman")
The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"
Table-making packages
modelsummary supports 6 table-making packages: tinytable, kableExtra, gt, flextable, huxtable, and DT. Some of these packages have overlapping functionalities. To change the default backend used for a specific file format, you can use the options function:
options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
Table themes
Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/vignettes/appearance.html
- modelsummary_theme_gt
- modelsummary_theme_kableExtra
- modelsummary_theme_huxtable
- modelsummary_theme_flextable
- modelsummary_theme_dataframe
Model extraction functions
modelsummary can use two sets of packages to extract information from statistical models: the easystats family (performance and parameters) and broom. By default, it uses easystats first and then falls back on broom in case of failure. You can change the order of priorities or include goodness-of-fit extracted by both packages by setting:
options(modelsummary_get = "easystats")
options(modelsummary_get = "broom")
options(modelsummary_get = "all")
Formatting numeric entries
By default, LaTeX tables enclose all numeric entries in the \num{} command from the siunitx package. To prevent this behavior, or to enclose numbers in dollar signs (for LaTeX math mode), users can call:
options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")
A similar option can be used to display numerical entries using MathJax in HTML tables:
options(modelsummary_format_numeric_html = "mathjax")
LaTeX preamble
When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.
\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}
Parallel computation
It can take a long time to compute and extract summary statistics from certain models (e.g., Bayesian). In those cases, users can parallelize the process. Since parallelization occurs at the model level, no speedup is available for tables with a single model. Users on macOS or Linux can launch parallel computation using the built-in parallel package. All they need to do is supply an mc.cores argument, which is passed on to the parallel::mclapply function:
modelsummary(model_list, mc.cores = 5)
All users can also use the future.apply package to parallelize model summaries. For example, to use 4 cores to extract results:
library(future.apply)
plan(multicore, workers = 4)
options("modelsummary_future" = TRUE)
modelsummary(model_list)
Note that the "multicore" plan only parallelizes under macOS or Linux. Windows users can use plan(multisession) instead. However, the first call to modelsummary() under multisession can take a fair bit longer, because of the extra costs of passing data to the workers and loading the required packages on them. Subsequent calls to modelsummary() will often be much faster.
Some users have reported difficult-to-reproduce errors when using the future package with some packages. The future parallelization in modelsummary can be disabled by calling:
options("modelsummary_future" = FALSE)
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Examples
# The `modelsummary` website includes many examples and tutorials:
# https://modelsummary.com
library(modelsummary)
# load data and estimate models
utils::data(trees)
models <- list()
models[["Bivariate"]] <- lm(Girth ~ Height, data = trees)
models[["Multivariate"]] <- lm(Girth ~ Height + Volume, data = trees)
# simple table
modelsummary(models)
# statistic
modelsummary(models, statistic = NULL)
modelsummary(models, statistic = "p.value")
modelsummary(models, statistic = "statistic")
modelsummary(models, statistic = "conf.int", conf_level = 0.99)
modelsummary(models, statistic = c(
"t = {statistic}",
"se = {std.error}",
"conf.int"))
# estimate
modelsummary(models,
statistic = NULL,
estimate = "{estimate} [{conf.low}, {conf.high}]")
modelsummary(models,
estimate = c(
"{estimate}{stars}",
"{estimate} ({std.error})"))
# vcov
modelsummary(models, vcov = "robust")
modelsummary(models, vcov = list("classical", "stata"))
modelsummary(models, vcov = sandwich::vcovHC)
modelsummary(models,
vcov = list(stats::vcov, sandwich::vcovHC))
modelsummary(models,
vcov = list(
c("(Intercept)" = "", "Height" = "!"),
c("(Intercept)" = "", "Height" = "!", "Volume" = "!!")))
# vcov with custom names
modelsummary(
models,
vcov = list(
"Stata Corp" = "stata",
"Newey Lewis & the News" = "NeweyWest"))
# fmt
mod <- lm(mpg ~ hp + drat + qsec, data = mtcars)
modelsummary(mod, fmt = 3)
modelsummary(mod, fmt = fmt_significant(3))
modelsummary(mod, fmt = NULL)
modelsummary(mod, fmt = fmt_decimal(4))
modelsummary(mod, fmt = fmt_sprintf("%.5f"))
modelsummary(mod, fmt = fmt_statistic(estimate = 4, conf.int = 1), statistic = "conf.int")
modelsummary(mod, fmt = fmt_term(hp = 4, drat = 1, default = 2))
m <- lm(mpg ~ I(hp * 1000) + drat, data = mtcars)
f <- function(x) format(x, digits = 3, nsmall = 2, scientific = FALSE, trim = TRUE)
modelsummary(m, fmt = f, gof_map = NA)
# coef_rename
modelsummary(models, coef_rename = c("Volume" = "Large", "Height" = "Tall"))
modelsummary(models, coef_rename = toupper)
modelsummary(models, coef_rename = coef_rename)
# coef_rename = TRUE for variable labels
datlab <- mtcars
datlab$cyl <- factor(datlab$cyl)
attr(datlab$hp, "label") <- "Horsepower"
attr(datlab$cyl, "label") <- "Cylinders"
modlab <- lm(mpg ~ hp * drat + cyl, data = datlab)
modelsummary(modlab, coef_rename = TRUE)
# coef_rename: unnamed vector of length equal to the number of terms in the final table
m <- lm(hp ~ mpg + factor(cyl), data = mtcars)
modelsummary(m, coef_omit = -(3:4), coef_rename = c("Cyl 6", "Cyl 8"))
# coef_map
modelsummary(models, coef_map = c("Volume" = "Large", "Height" = "Tall"))
modelsummary(models, coef_map = c("Volume", "Height"))
# coef_omit: omit the first and second coefficients
modelsummary(models, coef_omit = 1:2)
# coef_omit: omit coefficients matching one substring
modelsummary(models, coef_omit = "ei", gof_omit = ".*")
# coef_omit: omit a specific coefficient
modelsummary(models, coef_omit = "^Volume$", gof_omit = ".*")
# coef_omit: omit coefficients matching either one of two substrings
# modelsummary(models, coef_omit = "ei|rc", gof_omit = ".*")
# coef_omit: keep coefficients starting with a substring (using a negative lookahead)
# modelsummary(models, coef_omit = "^(?!Vol)", gof_omit = ".*")
# coef_omit: keep coefficients matching a substring
modelsummary(models, coef_omit = "^(?!.*ei|.*pt)", gof_omit = ".*")
# shape: multinomial model
library(nnet)
multi <- multinom(factor(cyl) ~ mpg + hp, data = mtcars, trace = FALSE)
# shape: term names and group ids in rows, models in columns
modelsummary(multi, shape = response ~ model)
# shape: term names and group ids in rows in a single column
modelsummary(multi, shape = term:response ~ model)
# shape: term names in rows and group ids in columns
modelsummary(multi, shape = term ~ response:model)
# shape = "rcollapse"
panels <- list(
"Panel A: MPG" = list(
"A" = lm(mpg ~ hp, data = mtcars),
"B" = lm(mpg ~ hp + factor(gear), data = mtcars)),
"Panel B: Displacement" = list(
"A" = lm(disp ~ hp, data = mtcars),
"C" = lm(disp ~ hp + factor(gear), data = mtcars))
)
# shape = "cbind"
modelsummary(panels, shape = "cbind")
modelsummary(
panels,
shape = "rbind",
gof_map = c("nobs", "r.squared"))
# title
modelsummary(models, title = "This is the title")
# title with LaTeX label (for numbering and referencing)
modelsummary(models, title = "This is the title \\label{tab:description}", escape = FALSE)
# add_rows
rows <- tibble::tribble(
~term, ~Bivariate, ~Multivariate,
"Empty row", "-", "-",
"Another empty row", "?", "?")
attr(rows, "position") <- c(1, 3)
modelsummary(models, add_rows = rows)
attr(rows, "position") <- "gof_start"
modelsummary(models, add_rows = rows)
# notes
modelsummary(models, notes = list("A first note", "A second note"))
# gof_map: tribble
library(tibble)
gm <- tribble(
~raw, ~clean, ~fmt,
"r.squared", "R Squared", 5)
modelsummary(models, gof_map = gm)
# gof_map: list of lists
f <- function(x) format(round(x, 3), big.mark = ",")
gm <- list(
list("raw" = "nobs", "clean" = "N", "fmt" = f),
list("raw" = "AIC", "clean" = "aic", "fmt" = f))
modelsummary(models, gof_map = gm)
msummary() is a shortcut to modelsummary()
Description
Create beautiful and customizable tables to summarize several statistical
models side-by-side. This function supports dozens of statistical models,
and it can produce tables in HTML, LaTeX, Word, Markdown, Typst, PDF, PowerPoint,
Excel, RTF, JPG, or PNG. The appearance of the tables can be customized
extensively by specifying the output argument, and by using functions from
one of the supported table customization packages: tinytable, kableExtra, gt,
flextable, huxtable, DT. For more information, see the Details and Examples
sections below, and the vignettes on the modelsummary website:
https://modelsummary.com/
- The modelsummary Vignette includes dozens of examples of tables with extensive customizations.
- The Appearance Vignette shows how to modify the look of tables.
Usage
msummary(
models,
output = getOption("modelsummary_output", default = "default"),
fmt = getOption("modelsummary_fmt", default = 3),
estimate = getOption("modelsummary_estimate", default = "estimate"),
statistic = getOption("modelsummary_statistic", default = "std.error"),
vcov = getOption("modelsummary_vcov", default = NULL),
conf_level = getOption("modelsummary_conf_level", default = 0.95),
exponentiate = getOption("modelsummary_exponentiate", default = FALSE),
stars = getOption("modelsummary_stars", default = FALSE),
shape = getOption("modelsummary_shape", default = term + statistic ~ model),
coef_map = getOption("modelsummary_coef_map", default = NULL),
coef_omit = getOption("modelsummary_coef_omit", default = NULL),
coef_rename = getOption("modelsummary_coef_rename", default = FALSE),
gof_map = getOption("modelsummary_gof_map", default = NULL),
gof_omit = getOption("modelsummary_gof_omit", default = NULL),
gof_function = getOption("modelsummary_gof_function", default = NULL),
group_map = getOption("modelsummary_group_map", default = NULL),
add_columns = getOption("modelsummary_add_columns", default = NULL),
add_rows = getOption("modelsummary_add_rows", default = NULL),
align = getOption("modelsummary_align", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
title = getOption("modelsummary_title", default = NULL),
escape = getOption("modelsummary_escape", default = TRUE),
...
)
Arguments
models |
a model, (named) list of models, or nested list of models.
|
output |
filename or object type (character string)
|
fmt |
how to format numeric values: integer, user-supplied function, or
|
estimate |
a single string or a character vector of length equal to the
number of models. Valid entries include any column name of
the data.frame produced by
|
statistic |
vector of strings or
|
vcov |
robust standard errors and other manual statistics. The
|
conf_level |
numeric value between 0 and 1. Confidence level to use for
confidence intervals. Setting this argument to |
exponentiate |
TRUE, FALSE, or logical vector of length equal to the
number of models. If TRUE, the |
stars |
to indicate statistical significance
|
shape |
|
coef_map |
character vector. Subset, rename, and reorder coefficients.
Coefficients omitted from this vector are omitted from the table. The order
of the vector determines the order of the table. |
coef_omit |
integer vector or regular expression to identify which coefficients to omit (or keep) from the table. Positive integers determine which coefficients to omit. Negative integers determine which coefficients to keep. A regular expression can be used to omit coefficients, and perl-compatible "negative lookaheads" can be used to specify which coefficients to keep in the table. Examples:
|
coef_rename |
logical, named or unnamed character vector, or function
|
gof_map |
rename, reorder, and omit goodness-of-fit statistics and other model information. This argument accepts 4 types of values:
|
gof_omit |
string regular expression (perl-compatible) used to determine which statistics to omit from the bottom section of the table. A "negative lookahead" can be used to specify which statistics to keep in the table. Examples:
|
gof_function |
function which accepts a model object in the |
group_map |
named or unnamed character vector. Subset, rename, and
reorder coefficient groups defined by a grouping variable specified in the
|
add_columns |
a data.frame (or tibble) with the same number of rows as your main table. By default, columns are appended to the right of the table. You can define a "position" attribute of integers to set the column positions. See Examples section below. |
add_rows |
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. Positions can be defined using integers. In the
|
align |
A string with a number of characters equal to the number of columns in
the table (e.g.,
|
notes |
list or vector of notes to append to the bottom of the table. |
title |
string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as |
escape |
boolean TRUE escapes or substitutes LaTeX/HTML characters which could
prevent the file from compiling/displaying. |
... |
all other arguments are passed through to three functions. See the documentation of these functions for lists of available arguments.
|
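The regular-expression behavior described for coef_omit and gof_omit above can be previewed with base R before building a table. This is a minimal sketch, assuming only that the patterns are matched against term names in perl-compatible mode, as the argument descriptions state. The would_omit() helper is hypothetical and is not part of modelsummary.

```r
# Term names for the multivariate model used in the Examples section
terms <- names(coef(lm(Girth ~ Height + Volume, data = trees)))

# Hypothetical helper: TRUE for the terms that a pattern would drop
would_omit <- function(pattern, x) grepl(pattern, x, perl = TRUE)

# Plain pattern: omit every term containing "ei"
omit_ei <- terms[would_omit("ei", terms)]

# Negative lookahead: omit every term that does NOT start with "Vol",
# which amounts to keeping only the terms that do
keep_vol <- terms[!would_omit("^(?!Vol)", terms)]

print(omit_ei)
print(keep_vol)
```

Checking a pattern this way is cheap, and it avoids surprises when a lookahead keeps or drops more terms than intended.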
Details
output
The modelsummary_list output is a lightweight format which can be used to save model results, so they can be fed back to modelsummary later to avoid extracting results again.
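The save-and-reuse workflow could look like the following sketch. The model, the use of saveRDS(), and the temporary file location are illustrative assumptions; the only modelsummary feature relied upon is the "modelsummary_list" output type described above. The calls are guarded so that the snippet also runs when the package is not installed.

```r
mod <- lm(mpg ~ hp, data = mtcars)

if (requireNamespace("modelsummary", quietly = TRUE)) {
  # Extract estimates and goodness-of-fit statistics once
  saved <- modelsummary::modelsummary(mod, output = "modelsummary_list")

  # Store the lightweight object on disk (illustrative location)
  path <- tempfile(fileext = ".rds")
  saveRDS(saved, path)

  # Later: reload and draw the table without extracting results again
  restored <- readRDS(path)
  modelsummary::modelsummary(restored)
}
```

This is mostly useful for models that are slow to summarize, since the stored object no longer requires the original fitted model.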
When a file name with a valid extension is supplied to the output argument,
the table is written immediately to file. If you want to customize your table
by post-processing it with an external package, you need to choose a
different output format and saving mechanism. Unfortunately, the approach
differs from package to package:
- tinytable: set output="tinytable", post-process your table, and use the tinytable::save_tt function.
- gt: set output="gt", post-process your table, and use the gt::gtsave function.
- kableExtra: set output to your destination format (e.g., "latex", "html", "markdown"), post-process your table, and use the kableExtra::save_kable function.
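As an illustration of the tinytable route listed above, the three steps can be sketched as follows. The styling choice (bolding the first row) and the temporary file location are arbitrary assumptions made for the example. The calls are guarded so that the snippet also runs when the packages are not installed.

```r
mod <- lm(mpg ~ hp, data = mtcars)

if (requireNamespace("modelsummary", quietly = TRUE) &&
    requireNamespace("tinytable", quietly = TRUE)) {
  # 1. Ask for a tinytable object instead of writing a file directly
  tab <- modelsummary::modelsummary(mod, output = "tinytable")

  # 2. Post-process: bold the first row of the table body
  tab <- tinytable::style_tt(tab, i = 1, bold = TRUE)

  # 3. Save with the backend's own function (illustrative location)
  out_file <- tempfile(fileext = ".html")
  tinytable::save_tt(tab, out_file, overwrite = TRUE)
}
```

The gt and kableExtra routes follow the same pattern, with gt::gtsave or kableExtra::save_kable in the last step.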
vcov
To use a string such as "robust" or "HC0", your model must be supported
by the sandwich package. This includes objects such as: lm, glm,
survreg, coxph, mlogit, polr, hurdle, zeroinfl, and more.
NULL, "classical", "iid", and "constant" are aliases which do not modify uncertainty estimates and simply report the default standard errors stored in the model object.
One-sided formulas such as ~clusterid are passed to the sandwich::vcovCL function.
Matrices and functions producing variance-covariance matrices are first
passed to lmtest. If this does not work, modelsummary attempts to take
the square root of the diagonal to adjust "std.error", but the other
uncertainty estimates are not adjusted.
Numeric vectors are formatted according to fmt and placed in brackets.
Character vectors are printed as given, without parentheses.
If your model type is supported by the lmtest package, the vcov
argument will try to use that package to adjust all the
uncertainty estimates, including "std.error", "statistic", "p.value", and
"conf.int". If your model is not supported by lmtest, only the "std.error"
will be adjusted by, for example, taking the square root of the matrix's
diagonal.
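The "square root of the diagonal" fallback mentioned above is ordinary matrix arithmetic, and it can be reproduced with base R. The sketch below is not modelsummary's internal code. It only shows that the operation recovers the standard errors reported by summary(), and, when the sandwich package happens to be installed, what the same operation gives for a robust matrix.

```r
mod <- lm(Girth ~ Height + Volume, data = trees)

# Classical standard errors: square root of the diagonal of the
# variance-covariance matrix stored with the model
se_classical <- sqrt(diag(stats::vcov(mod)))

# These agree with the standard errors that summary() reports
se_summary <- summary(mod)$coefficients[, "Std. Error"]
print(all.equal(se_classical, se_summary))

# The same operation on a robust matrix yields robust standard errors
if (requireNamespace("sandwich", quietly = TRUE)) {
  se_robust <- sqrt(diag(sandwich::vcovHC(mod)))
  print(cbind(classical = se_classical, robust = se_robust))
}
```

Only the standard errors come out of this calculation, which is why the other uncertainty estimates stay unadjusted when the lmtest route is unavailable.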
Value
a regression table in a format determined by the output argument.
Version 2.0.0, kableExtra, and tinytable
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend.
Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
Global Options
The behavior of modelsummary can be modified by setting global options. In particular, most of the arguments for most of the package's functions can be set using global options. For example:
- options(modelsummary_output = "modelsummary_list")
- options(modelsummary_statistic = '({conf.low}, {conf.high})')
- options(modelsummary_stars = TRUE)
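Because these are ordinary R options, they can be set temporarily and restored afterwards with base R alone. The following sketch uses only options() and getOption(), and it mirrors how the Usage section reads the default of the stars argument.

```r
# options() returns the previous values, which makes restoring them easy
old <- options(modelsummary_stars = TRUE)

# Same lookup as the default of the `stars` argument shown in Usage
stars_default <- getOption("modelsummary_stars", default = FALSE)
print(stars_default)

# Restore the previous state so that later tables are unaffected
options(old)
```

Restoring the old values matters in scripts that build several tables, since a global option silently affects every later call.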
Options not specific to given arguments are listed below.
Model labels: default column names
This global option changes the style of the default column headers:
- options(modelsummary_model_labels = "roman")
The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"
Table-making packages
modelsummary supports 6 table-making packages: tinytable, kableExtra, gt,
flextable, huxtable, and DT. Some of these packages have overlapping
functionalities. To change the default backend used for a specific file
format, you can use the options function:
options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
Table themes
Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/vignettes/appearance.html
- modelsummary_theme_gt
- modelsummary_theme_kableExtra
- modelsummary_theme_huxtable
- modelsummary_theme_flextable
- modelsummary_theme_dataframe
Model extraction functions
modelsummary can use two sets of packages to extract information from
statistical models: the easystats family (performance and parameters)
and broom. By default, it uses easystats first and then falls back on
broom in case of failure. You can change the order of priorities or include
goodness-of-fit extracted by both packages by setting:
options(modelsummary_get = "easystats")
options(modelsummary_get = "broom")
options(modelsummary_get = "all")
Formatting numeric entries
By default, LaTeX tables enclose all numeric entries in the \num{} command
from the siunitx package. To prevent this behavior, or to enclose numbers
in dollar signs (for LaTeX math mode), users can call:
options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")
A similar option can be used to display numerical entries using MathJax in HTML tables:
options(modelsummary_format_numeric_html = "mathjax")
LaTeX preamble
When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.
\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}
Parallel computation
It can take a long time to compute and extract summary statistics from
certain models (e.g., Bayesian). In those cases, users can parallelize the
process. Since parallelization occurs at the model level, no speedup is
available for tables with a single model. Users on macOS or Linux can launch
parallel computation using the built-in parallel package. All they need to
do is supply an mc.cores argument, which will be passed on to the
parallel::mclapply function:
modelsummary(model_list, mc.cores = 5)
All users can also use the future.apply package to parallelize model summaries.
For example, to use 4 cores to extract results:
library(future.apply)
plan(multicore, workers = 4)
options("modelsummary_future" = TRUE)
modelsummary(model_list)
Note that the "multicore" plan only parallelizes under macOS or Linux. Windows
users can use plan(multisession) instead. However, note that the first
call to modelsummary() under multisession can take a fair bit longer,
because of the extra cost of passing data to the workers and loading the
required packages on them. Subsequent calls to modelsummary() will often be much faster.
Some users have reported difficult-to-reproduce errors when using the
future package with some packages. The future parallelization in
modelsummary can be disabled by calling:
options("modelsummary_future" = FALSE)
References
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
Examples
# The `modelsummary` website includes many examples and tutorials:
# https://modelsummary.com
library(modelsummary)
# load data and estimate models
utils::data(trees)
models <- list()
models[["Bivariate"]] <- lm(Girth ~ Height, data = trees)
models[["Multivariate"]] <- lm(Girth ~ Height + Volume, data = trees)
# simple table
modelsummary(models)
# statistic
modelsummary(models, statistic = NULL)
modelsummary(models, statistic = "p.value")
modelsummary(models, statistic = "statistic")
modelsummary(models, statistic = "conf.int", conf_level = 0.99)
modelsummary(models, statistic = c(
"t = {statistic}",
"se = {std.error}",
"conf.int"))
# estimate
modelsummary(models,
statistic = NULL,
estimate = "{estimate} [{conf.low}, {conf.high}]")
modelsummary(models,
estimate = c(
"{estimate}{stars}",
"{estimate} ({std.error})"))
# vcov
modelsummary(models, vcov = "robust")
modelsummary(models, vcov = list("classical", "stata"))
modelsummary(models, vcov = sandwich::vcovHC)
modelsummary(models,
vcov = list(stats::vcov, sandwich::vcovHC))
modelsummary(models,
vcov = list(
c("(Intercept)" = "", "Height" = "!"),
c("(Intercept)" = "", "Height" = "!", "Volume" = "!!")))
# vcov with custom names
modelsummary(
models,
vcov = list(
"Stata Corp" = "stata",
"Newey Lewis & the News" = "NeweyWest"))
# fmt
mod <- lm(mpg ~ hp + drat + qsec, data = mtcars)
modelsummary(mod, fmt = 3)
modelsummary(mod, fmt = fmt_significant(3))
modelsummary(mod, fmt = NULL)
modelsummary(mod, fmt = fmt_decimal(4))
modelsummary(mod, fmt = fmt_sprintf("%.5f"))
modelsummary(mod, fmt = fmt_statistic(estimate = 4, conf.int = 1), statistic = "conf.int")
modelsummary(mod, fmt = fmt_term(hp = 4, drat = 1, default = 2))
m <- lm(mpg ~ I(hp * 1000) + drat, data = mtcars)
f <- function(x) format(x, digits = 3, nsmall = 2, scientific = FALSE, trim = TRUE)
modelsummary(m, fmt = f, gof_map = NA)
# coef_rename
modelsummary(models, coef_rename = c("Volume" = "Large", "Height" = "Tall"))
modelsummary(models, coef_rename = toupper)
modelsummary(models, coef_rename = coef_rename)
# coef_rename = TRUE for variable labels
datlab <- mtcars
datlab$cyl <- factor(datlab$cyl)
attr(datlab$hp, "label") <- "Horsepower"
attr(datlab$cyl, "label") <- "Cylinders"
modlab <- lm(mpg ~ hp * drat + cyl, data = datlab)
modelsummary(modlab, coef_rename = TRUE)
# coef_rename: unnamed vector of length equal to the number of terms in the final table
m <- lm(hp ~ mpg + factor(cyl), data = mtcars)
modelsummary(m, coef_omit = -(3:4), coef_rename = c("Cyl 6", "Cyl 8"))
# coef_map
modelsummary(models, coef_map = c("Volume" = "Large", "Height" = "Tall"))
modelsummary(models, coef_map = c("Volume", "Height"))
# coef_omit: omit the first and second coefficients
modelsummary(models, coef_omit = 1:2)
# coef_omit: omit coefficients matching one substring
modelsummary(models, coef_omit = "ei", gof_omit = ".*")
# coef_omit: omit a specific coefficient
modelsummary(models, coef_omit = "^Volume$", gof_omit = ".*")
# coef_omit: omit coefficients matching either one of two substrings
# modelsummary(models, coef_omit = "ei|rc", gof_omit = ".*")
# coef_omit: keep coefficients starting with a substring (using a negative lookahead)
# modelsummary(models, coef_omit = "^(?!Vol)", gof_omit = ".*")
# coef_omit: keep coefficients matching a substring
modelsummary(models, coef_omit = "^(?!.*ei|.*pt)", gof_omit = ".*")
# shape: multinomial model
library(nnet)
multi <- multinom(factor(cyl) ~ mpg + hp, data = mtcars, trace = FALSE)
# shape: term names and group ids in rows, models in columns
modelsummary(multi, shape = response ~ model)
# shape: term names and group ids in rows in a single column
modelsummary(multi, shape = term:response ~ model)
# shape: term names in rows and group ids in columns
modelsummary(multi, shape = term ~ response:model)
# shape = "rcollapse"
panels <- list(
"Panel A: MPG" = list(
"A" = lm(mpg ~ hp, data = mtcars),
"B" = lm(mpg ~ hp + factor(gear), data = mtcars)),
"Panel B: Displacement" = list(
"A" = lm(disp ~ hp, data = mtcars),
"C" = lm(disp ~ hp + factor(gear), data = mtcars))
)
# shape = "cbind"
modelsummary(panels, shape = "cbind")
modelsummary(
panels,
shape = "rbind",
gof_map = c("nobs", "r.squared"))
# title
modelsummary(models, title = "This is the title")
# title with LaTeX label (for numbering and referencing)
modelsummary(models, title = "This is the title \\label{tab:description}", escape = FALSE)
# add_rows
rows <- tibble::tribble(
~term, ~Bivariate, ~Multivariate,
"Empty row", "-", "-",
"Another empty row", "?", "?")
attr(rows, "position") <- c(1, 3)
modelsummary(models, add_rows = rows)
attr(rows, "position") <- "gof_start"
modelsummary(models, add_rows = rows)
# notes
modelsummary(models, notes = list("A first note", "A second note"))
# gof_map: tribble
library(tibble)
gm <- tribble(
~raw, ~clean, ~fmt,
"r.squared", "R Squared", 5)
modelsummary(models, gof_map = gm)
# gof_map: list of lists
f <- function(x) format(round(x, 3), big.mark = ",")
gm <- list(
list("raw" = "nobs", "clean" = "N", "fmt" = f),
list("raw" = "AIC", "clean" = "aic", "fmt" = f))
modelsummary(models, gof_map = gm)
tidy generic
Description
These objects are imported from other packages. Follow the links below to see their documentation.
Retrieve or modify the row or column labels.
Description
Retrieve or modify the row or column labels.
Usage
rowLabels(x)
List of model objects from which modelsummary can extract estimates and statistics
Description
List of model objects from which modelsummary can extract estimates and statistics
Usage
supported_models()
Extract custom information from a model object and turn it into a tidy data.frame or tibble
Description
To customize the output of a model of class lm, you can define a method
called tidy_custom.lm which returns a data.frame with a column called
"term", and the other columns you want to use as "estimate" or "statistic"
in your modelsummary() call. The output of this method must be similar to
the result of tidy(model).
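A method of this kind could be sketched as follows. The content of the method (replacing each estimate with the sign of the coefficient) is a hypothetical illustration chosen for brevity. The description above requires only a "term" column plus the columns to be used in the table. The modelsummary() call is guarded so that the snippet also runs when the package is not installed.

```r
# Hypothetical method: replace each estimate by the sign of the coefficient
tidy_custom.lm <- function(x, ...) {
  s <- summary(x)$coefficients
  data.frame(
    term = row.names(s),
    estimate = ifelse(s[, "Estimate"] > 0, "+", "-"),
    stringsAsFactors = FALSE
  )
}

mod <- lm(Girth ~ Height + Volume, data = trees)

# The method can be inspected on its own before it is used in a table
custom <- tidy_custom.lm(mod)
print(custom)

# With the method defined at the top level of the session, the table is
# expected to show the customized "estimate" column
if (requireNamespace("modelsummary", quietly = TRUE)) {
  modelsummary::modelsummary(mod, gof_map = NA)
}
```

Calling the method directly first is a quick way to confirm that it returns one row per term before relying on it in a table.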
Usage
tidy_custom(x)
Arguments
x |
An object to be converted into a tidy data.frame or tibble. |
Value
A data.frame or tibble with information about model components.
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Description
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Usage
tidy_custom_internal(x, ...)
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Description
Avoid namespace conflict when we want to customize glance internally and still allow users to do the same with their own functions
Usage
## Default S3 method:
tidy_custom_internal(x, ...)
Update modelsummary and its dependencies
Description
Update modelsummary and its dependencies to the latest R-Universe or CRAN versions. The R session needs to be restarted after install.
Usage
update_modelsummary(source = "development")
Arguments
source |
one of two strings: "development" or "cran" |