Version: | 0.7-4 |
Date: | 2023-09-02 |
Title: | Graphical Reporting for Clinical Trials |
Author: | Frank E Harrell Jr <fh@fharrell.com> |
Maintainer: | Frank E Harrell Jr <fh@fharrell.com> |
Depends: | Hmisc (≥ 4.0-0) |
Imports: | rms (≥ 5.0-0), lattice, latticeExtra, ggplot2, Formula, survival, methods, data.table |
Description: | Contains many functions useful for monitoring and reporting the results of clinical trials and other experiments in which treatments are compared. LaTeX is used to typeset the resulting reports, recommended to be in the context of 'knitr'. The 'Hmisc', 'ggplot2', and 'lattice' packages are used by 'greport' for high-level graphics. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
URL: | http://hbiostat.org/R/greport/, https://github.com/harrelfe/greport/ |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | no |
Packaged: | 2023-09-02 18:22:59 UTC; harrelfe |
Repository: | CRAN |
Date/Publication: | 2023-09-02 22:20:02 UTC |
Graphical Reporting for Clinical Trials
Description
Graphical clinical trial reporting based on Rmarkdown, LaTeX, and pdf
Usage
.noGenerics
Format
An object of class logical of length 1.
Author(s)
Frank E Harrell Jr fh@fharrell.com
Merge Multiple Data Frames or Data Tables
Description
Merges an arbitrarily large series of data frames or data tables containing common id variables (keys for data tables). Information about the number of observations and number of unique ids in the individual and final merged datasets is printed. The first data frame has special meaning in that all of its observations are kept whether they match ids in other data frames or not. For all other data frames, by default non-matching observations are dropped. The first data frame is also the one against which counts of unique ids are compared. Sometimes merge drops variable attributes such as labels and units. These are restored by Merge. If all objects are of class data.table, faster merging will be done using the data.table package's join operation. This assumes that all objects have identical key variables and that those are the variables on which to merge.
Usage
Merge(..., id, all = TRUE, verbose = TRUE)
Arguments
... |
two or more data frames or data tables |
id |
a formula containing all the identification variables such that the combination of these variables uniquely identifies subjects or records of interest. May be omitted for data tables; in that case the |
all |
set to |
verbose |
set to |
Examples
a <- data.frame(sid=1:3, age=c(20,30,40))
b <- data.frame(sid=c(1,2,2), bp=c(120,130,140))
d <- data.frame(sid=c(1,3,4), wt=c(170,180,190))
all <- Merge(a, b, d, id = ~ sid)
# For data.table, first file must be the master file and must
# contain all ids that ever occur. ids not in the master will
# not be merged from other datasets.
require(data.table)
a <- data.table(a); setkey(a, sid)
# data.table also does not allow duplicates without allow.cartesian=TRUE
b <- data.table(sid=1:2, bp=c(120,130)); setkey(b, sid)
d <- data.table(d); setkey(d, sid)
all <- Merge(a, b, d)
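The "first data frame is the master" behavior described above can be approximated in base R, which may help when checking what Merge reports. This is only a sketch using `merge()` with `all.x = TRUE`; the helper name `left_join_all` is hypothetical and is not part of Hmisc or greport, and it does not restore labels or units.

```r
# Same toy data as in the Examples above
a <- data.frame(sid = 1:3, age = c(20, 30, 40))
b <- data.frame(sid = c(1, 2, 2), bp = c(120, 130, 140))
d <- data.frame(sid = c(1, 3, 4), wt = c(170, 180, 190))

# Hypothetical helper: keep every row of the first data frame and
# drop rows of later data frames whose id is not in the first one
left_join_all <- function(..., by = "sid") {
  Reduce(function(x, y) merge(x, y, by = by, all.x = TRUE), list(...))
}

m <- left_join_all(a, b, d)
m
# Counts of unique ids, relative to the first (master) data frame
c(master = length(unique(a$sid)), merged = length(unique(m$sid)))
```

Here sid 4 appears only in `d`, so it is absent from the result, while sid 2 appears twice because `b` has two records for it.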
Accrual Report
Description
Generate graphics and LaTeX to analyze subject accrual
Usage
accrualReport(
formula,
data = NULL,
subset = NULL,
na.action = na.retain,
dateRange = NULL,
zoom = NULL,
targetN = NULL,
targetDate = NULL,
closeDate = NULL,
enrollmax = NULL,
studynos = TRUE,
minrand = 10,
panel = "accrual",
h = 2.5,
w = 3.75,
hb = 5,
wb = 5,
hdot = 3.5
)
Arguments
formula |
formula object, with time variables on the left (separated by +) and grouping variables on the right. Enrollment date, randomization date, region, country, and site, when present, must have the variables in parentheses preceded by the keywords |
data |
data frame. |
subset |
a subsetting expression for the entire analysis. |
na.action |
a NA handling function for data frames, default is |
dateRange |
|
zoom |
|
targetN |
integer vector with target sample sizes over time, same length as |
targetDate |
|
closeDate |
|
enrollmax |
numeric specifying the upper y-axis limit for cumulative enrollment when not zoomed |
studynos |
logical. Set to |
minrand |
integer. Minimum number of randomized subjects a country must have before a box plot of time to randomization is included. |
panel |
character string. Name of panel, which goes into file base names and figure labels for cross-referencing. |
h |
numeric. Height of ordinary plots, in inches. |
w |
numeric. Width of ordinary plots. |
hb |
numeric. Height of extended box plots. |
wb |
numeric. Width of extended box plots. |
hdot |
numeric. Height of dot charts in inches. |
Details
Typically the left-hand-side variables of the formula, in order, are date of enrollment and date of randomization, with subjects enrolled but not randomized having missing date of randomization. Given such date variables, this function generates cumulative frequencies optionally with target enrollment/randomization numbers and with time-zooming. Makes a variety of dot charts by right-hand-side variables: number of subjects, number of sites, number of subjects per site, fraction of enrolled subjects randomized, number per month, number per site-month.
Examples
## Not run:
# See test.Rnw in tests directory
## End(Not run)
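The cumulative frequencies described in Details can be illustrated without greport. The following base R sketch uses made-up dates and a hypothetical helper `cum_count`; it shows the kind of quantities accrualReport plots, not how the function computes them.

```r
# Made-up data: subjects enrolled but not randomized have NA randomization date
enroll <- as.Date(c("2023-01-05", "2023-01-20", "2023-02-03",
                    "2023-02-17", "2023-03-01"))
rand   <- as.Date(c("2023-01-12", NA, "2023-02-10", "2023-02-25", NA))

# Hypothetical helper: cumulative number of subjects by date
cum_count <- function(d) {
  d <- sort(d[!is.na(d)])
  data.frame(date = d, n = seq_along(d))
}

ce <- cum_count(enroll)   # cumulative enrollment
cr <- cum_count(rand)     # cumulative randomization

# Fraction of enrolled subjects who were randomized
frac_rand <- mean(!is.na(rand))

# Number enrolled per calendar month
per_month <- table(format(enroll, "%Y-%m"))
```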
Issue LaTeX section and/or subsection in appendix
Description
This is useful for copying section and subsection titles in the main body of the report to the appendix, to help in navigating supporting tables. LaTeX backslash characters need to be doubled.
Usage
appsection(section = NULL, subsection = NULL, main = FALSE, panel = "")
Arguments
section |
a character string that will cause a section command to be added to app.tex |
subsection |
a character string that will cause a subsection command to be added to app.tex |
main |
set to |
panel |
panel string; must be given if |
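The remark about doubling backslashes refers to R string syntax: a single LaTeX backslash must be typed as two in R source. A small sketch follows. The section title is made up, and the commented-out call assumes greport is loaded and its output locations are configured.

```r
# "\\&" in R source is the two characters \& in the resulting string
sec <- "Adverse Events \\& Safety"
cat(sec, "\n", sep = "")   # shows the string as LaTeX will see it

# Not run: would add a section command to app.tex
# appsection(section = sec)
```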
Draw Needles
Description
Creates a LaTeX picture to draw needles for current sample sizes. Uses colors set by a call to setgreportOption.
Usage
dNeedle(sf, name, file = "", append = TRUE)
Arguments
sf |
output of |
name |
character string name of LaTeX variable to create |
file |
output file name (character string) |
append |
set to |
Descriptive Statistics Report
Description
Generate graphics and LaTeX with descriptive statistics
Usage
dReport(
formula,
groups = NULL,
what = c("box", "proportions", "xy", "byx"),
byx.type = c("violin", "quantiles"),
violinbox = TRUE,
violinbox.opts = list(col = adjustcolor("blue", alpha.f = 0.25), border = FALSE),
summaryPsort = FALSE,
exclude1 = TRUE,
stable = TRUE,
fun = NULL,
data = NULL,
subset = NULL,
na.action = na.retain,
panel = "desc",
subpanel = NULL,
head = NULL,
tail = NULL,
continuous = 10,
h = 5.5,
w = 5.5,
outerlabels = TRUE,
append = FALSE,
sopts = NULL,
popts = NULL,
lattice = FALSE
)
Arguments
formula |
a formula accepted by the |
groups |
a superpositioning variable, usually treatment, for categorical charts. For continuous analysis variables, |
what |
|
byx.type |
set to |
violinbox |
set to |
violinbox.opts |
a list to pass to |
summaryPsort |
set to |
exclude1 |
logical used for |
stable |
set to |
fun |
a function that takes individual response variables (which may be matrices, as in |
data |
data frame |
subset |
a subsetting expression for the entire analysis |
na.action |
a NA handling function for data frames, default is |
panel |
character string. Name of panel, which goes into file base names and figure labels for cross-referencing |
subpanel |
If calling |
head |
character string. Specifies initial text in the figure caption, otherwise a default is used |
tail |
optional character string. Specifies final text in the figure caption, e.g., what might have been put in a footnote in an ordinary text page. This appears just before any needles. |
continuous |
the minimum number of numeric values a variable must have in order to be considered continuous. Also passed to |
h |
numeric. Height of plot, in inches |
w |
numeric. Width of plot |
outerlabels |
logical that if |
append |
logical. Set to |
sopts |
list specifying extra arguments to pass to |
popts |
list specifying extra arguments to pass to a plot method. One example is |
lattice |
set to |
Details
dReport generates multi-panel charts, separately for categorical analysis variables and continuous ones. The Hmisc summaryP function and its plot method are used for categorical variables, and bpplotM is used to make extended box plots for continuous ones unless what='byx'. Stratification is by treatment or other variables. The user must have defined a LaTeX macro \eboxpopup (which may be defined to do nothing) with one argument. This macro is called with argument extended box plot whenever that phrase appears in the legend, so that a PDF popup may be generated to show the prototype. See the example in report.Rnw in the tests directory. Similarly a popup macro \qintpopup must be defined, which generates a tooltip for the phrase quantile intervals.
Examples
# See test.Rnw in tests directory
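Since Details requires the macros \eboxpopup and \qintpopup to exist, one way to satisfy that when no popups are wanted is to define them as pass-through macros. The sketch below writes such definitions from R. The file name `popupmacros.tex` and the choice to echo the argument unchanged are assumptions, not something taken from the package.

```r
# One-argument macros that simply typeset their argument (no popup).
# Backslashes are doubled because these are R strings.
macros <- c(
  "\\newcommand{\\eboxpopup}[1]{#1}",
  "\\newcommand{\\qintpopup}[1]{#1}"
)

# Hypothetical location; \input this file in the report preamble
macro_file <- file.path(tempdir(), "popupmacros.tex")
writeLines(macros, macro_file)
```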
Event Report
Description
Generates graphics for binary event proportions
Usage
eReport(
formula,
data = NULL,
subset = NULL,
na.action = na.retain,
minincidence = 0,
conf.int = 0.95,
etype = "adverse events",
panel = "events",
subpanel = NULL,
head = NULL,
tail = NULL,
h = 6,
w = 7,
append = FALSE,
popts = NULL
)
Arguments
formula |
a formula with one or two left hand variables (the first representing major categorization and the second minor), and 1-2 right hand variables. One of these may be enclosed in |
data |
input data frame |
subset |
subsetting criteria |
na.action |
function for handling |
minincidence |
a number between 0 and 1 specifying the minimum incidence in any stratum that must hold before an event is included in the summary |
conf.int |
confidence level for difference in proportions |
etype |
a character string describing the nature of the events, for example |
panel |
panel string |
subpanel |
a subpanel designation to add to |
head |
character string. Specifies initial text in the figure caption, otherwise a default is used. |
tail |
a character string to add to end of automatic caption |
h |
height of graph |
w |
width of graph |
append |
set to |
popts |
a list of options to pass to graphing functions |
Details
Generates dot charts showing proportions on left and risk difference with confidence intervals on the right, if there is only one level of event categorization. Input data must contain one record per event, with this record containing the event name. If there is more than one event of a given type per subject, unique subject ID must be provided. Denominators come from greport options and it is assumed that only randomized subjects have records. Some of the graphics functions are modifications of those found in the HH package. The data are expected to have one record per event, and non-events are inferred from setgreportOption('denom'). It is also assumed that only randomized subjects are included in the dataset.
Author(s)
Frank Harrell
Examples
# See test.Rnw in tests directory
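To make the data layout in Details concrete, here is a base R sketch that turns one-record-per-event data plus externally supplied denominators into proportions and risk differences. The data and denominators are made up, and the Wald interval is an assumption chosen for illustration; the source does not say which interval eReport uses.

```r
# Made-up data: one record per event; subject 1 reports headache twice
events <- data.frame(
  id    = c(1, 1, 2, 5, 7, 8),
  treat = c("A", "A", "A", "B", "B", "B"),
  event = c("headache", "headache", "nausea",
            "headache", "headache", "nausea")
)
denom <- c(A = 50, B = 50)   # assumed numbers randomized per arm

u <- unique(events)                       # count each subject once per event type
n <- unclass(table(u$event, u$treat))     # event-by-treatment counts
p <- sweep(n, 2, denom[colnames(n)], "/") # proportions; non-events come from denom

rd <- p[, "B"] - p[, "A"]                 # risk difference, B minus A
se <- sqrt(p[, "A"] * (1 - p[, "A"]) / denom[["A"]] +
           p[, "B"] * (1 - p[, "B"]) / denom[["B"]])
z  <- qnorm(1 - (1 - 0.95) / 2)           # conf.int = 0.95
ci <- cbind(diff = rd, lower = rd - z * se, upper = rd + z * se)
ci
```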
Exclusion Report
Description
Generates graphics for sequential exclusion criteria
Usage
exReport(
formula,
data = NULL,
subset = NULL,
na.action = na.retain,
ignoreExcl = NULL,
ignoreRand = NULL,
plotExRemain = TRUE,
autoother = FALSE,
sort = TRUE,
whenapp = NULL,
erdata = NULL,
panel = "excl",
subpanel = NULL,
head = NULL,
tail = NULL,
apptail = NULL,
h = 5.5,
w = 6.5,
hc = 4.5,
wc = 5,
adjustwidth = "-0.75in",
append = FALSE,
popts = NULL,
app = TRUE
)
Arguments
formula |
a formula with only a right-hand side, possibly containing a term of the form |
data |
input data frame |
subset |
subsetting criteria |
na.action |
function for handling |
ignoreExcl |
a formula with only a right-hand side, specifying the names of exclusion variable names that are to be ignored when counting exclusions (screen failures) |
ignoreRand |
a formula with only a right-hand side, specifying the names of exclusion variable names that are to be ignored when counting randomized subjects marked as exclusions |
plotExRemain |
set to |
autoother |
set to |
sort |
set to |
whenapp |
a named character vector (with names equal to names of variables in formula). For each variable that is only assessed (i.e., is not |
erdata |
a data frame that is subsetted on the combination of |
panel |
panel string |
subpanel |
If calling |
head |
character string. Specifies initial text in the figure caption, otherwise a default is used. |
tail |
a character string to add to end of automatic caption |
apptail |
a character string to add to end of automatic caption for appendix table with listing of subject IDs |
h |
height of 2-panel graph |
w |
width of 2-panel graph |
hc |
height of cumulative exclusion 1-panel graph |
wc |
width of this 1-panel graph |
adjustwidth |
used to allow wide detailed exclusion table to go into left margin in order to be centered on the physical page. The default is |
append |
set to |
popts |
a list of options to pass to graphing functions |
app |
set to |
Details
With input being a series of essentially binary variables with positive indicating that a subject is excluded for a specific reason, orders the reasons so that the first excludes the highest number of subjects, the second excludes the highest number of remaining subjects, and so on. If a randomization status variable is present, actually randomized (properly or not) subjects are excluded from counts of exclusions. First draws a single vertical axis graph showing cumulative exclusions, then creates a 2-panel dot chart with the first panel showing that information, along with the marginal frequencies of exclusions and the second showing the number of subjects remaining in the study after the sequential exclusions. A pop-up table is created showing those quantities plus fractions. There is an option to not sort by descending exclusion frequencies but instead to use the original variable order. Assumes that when a factor exclusion variable has only one level and that level indicates a positive finding, the variable has a denominator equal to the overall number of subjects.
Author(s)
Frank Harrell
Examples
# See test.Rnw in tests directory
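The sequential ordering described in Details amounts to a greedy loop: pick the criterion that excludes the most subjects still remaining, remove those subjects, and repeat. A minimal base R sketch follows. The function name `sequential_exclusions` and the logical-matrix input are assumptions, and missing values, randomization status, and the `sort = FALSE` option are ignored.

```r
# Hypothetical helper; x is a logical matrix (rows = subjects,
# columns = exclusion criteria, TRUE = excluded for that reason)
sequential_exclusions <- function(x) {
  remaining <- rep(TRUE, nrow(x))
  todo <- colnames(x)
  rows <- list()
  while (length(todo) > 0) {
    counts <- colSums(x[remaining, todo, drop = FALSE])
    best <- names(counts)[which.max(counts)]   # ties: first criterion wins
    n_ex <- unname(counts[best])
    rows[[length(rows) + 1]] <- data.frame(
      criterion = best,
      excluded  = n_ex,
      remaining = sum(remaining) - n_ex
    )
    remaining <- remaining & !x[, best]
    todo <- setdiff(todo, best)
  }
  do.call(rbind, rows)
}

x <- cbind(
  age     = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE),
  lab     = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE),
  consent = c(FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)
)
res <- sequential_exclusions(x)
res
```

In this toy input `lab` has a marginal frequency of 3 but excludes only one additional subject once the `age` exclusions are removed, which is the distinction the 2-panel chart draws between marginal and sequential counts.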
Get greport Options
Description
Get greport options, assigning default values of unspecified options.
Usage
getgreportOption(opts = NULL)
Arguments
opts |
character vector containing list of option names to retrieve. If only one element, the result is a scalar, otherwise a list. If |
Setup lattice plots using greport options
Description
Initializes colors and other graphical attributes based on what is stored in system option greport.
Usage
latticeInit()
Mask Variables in a Data Frame
Description
Given a list of applicable variable names in a formula, runs maskVal on any variables in a data frame x whose name is found in formula.
Usage
maskDframe(x, formula, ...)
Arguments
x |
an input data frame or data table |
formula |
a formula specifying the variables to perturb |
... |
parameters to pass to |
Mask Values of a Vector
Description
Modifies the value of a vector so as to mask the information by generating random data subject to constraints and keeping the length, type, label, and units attributes of the original variable. For a binary numeric or logical variable a random vector with prevalence (by default) of 0.5 replaces the original. For a factor variable, a random multinomial sample is drawn, with equal expected frequencies of all levels. For a numeric variable, the range is preserved but the distribution is uniform over that range, and generated values are rounded by an amount equal to the minimum spacing between distinct values. Character variables are just randomly reordered. In the special case where the input vector contains only one unique non-NA value, the variable is assumed to be the type of variable where NA represents FALSE or "no", and the variable is replaced by a logical vector with the specified prevalence.
Usage
maskVal(x, prev = 0.5, NAs = TRUE)
Arguments
x |
an input vector |
prev |
a numeric scalar specifying the prevalence for binary variables |
NAs |
if the variable contains |
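The masking rules for numeric and factor variables can be sketched in base R as follows. The helper names `mask_numeric` and `mask_factor` are hypothetical, the rounding rule is one reading of "rounded by an amount equal to the minimum spacing between distinct values", and label and units attributes are not handled. This is an illustration, not the maskVal implementation.

```r
# Hypothetical helper: uniform over the observed range, snapped to a grid
# whose step is the minimum spacing between distinct observed values
mask_numeric <- function(x) {
  obs  <- sort(unique(x[!is.na(x)]))
  step <- min(diff(obs))
  r    <- range(obs)
  y <- runif(length(x), r[1], r[2])
  y <- r[1] + round((y - r[1]) / step) * step
  y <- pmin(pmax(y, r[1]), r[2])   # stay inside the original range
  y[is.na(x)] <- NA                # keep missing values where they were
  y
}

# Hypothetical helper: equal expected frequencies over all levels
mask_factor <- function(f) {
  factor(sample(levels(f), length(f), replace = TRUE), levels = levels(f))
}

set.seed(1)
x <- c(1.5, 2, 2.5, 4, NA)
y <- mask_numeric(x)
f <- factor(c("a", "b", "b", "c"))
g <- mask_factor(f)
```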
Compute mfrow Parameter
Description
Compute a good par("mfrow") given the number of figures to plot.
Usage
mfrowSuggest(n, small = FALSE)
Arguments
n |
numeric. Total number of figures to place in layout. |
small |
logical. Set to ‘TRUE’ if the plot area should be smaller to accommodate many plots. |
Value
return numeric vector. oldmfrow <- mfrowSet(8)
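For comparison, base R offers `grDevices::n2mfrow()`, which solves the same layout problem. The sketch below uses it as a stand-in to show how such a result is typically applied and then restored; it is not claimed to return the same layout as mfrowSuggest.

```r
n <- 8
layout_dims <- grDevices::n2mfrow(n)   # rows and columns for n figures

pdf(NULL)                              # null device: nothing is written to disk
old <- par(mfrow = layout_dims)        # apply the layout, remembering the old one
for (i in seq_len(n)) plot(1:10, main = paste("Figure", i))
par(old)                               # restore the previous layout
invisible(dev.off())
```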
Number at Risk Report
Description
Graph number of subjects at risk
Usage
nriskReport(
formula,
groups = NULL,
time0 = "randomization",
data = NULL,
subset = NULL,
na.action = na.retain,
ylab = "Number Followed",
panel = "nrisk",
head = NULL,
tail = NULL,
h = 5.5,
w = 5.5,
outerlabels = TRUE,
append = FALSE,
popts = NULL
)
Arguments
formula |
a formula with time on the left-hand side, and with variables on the right side being possible stratification variables. If no stratification put |
groups |
a character string naming a superpositioning variable. Must also be included in |
time0 |
a character string defining the meaning of time zero in follow-up. Default is |
data |
data frame |
subset |
a subsetting expression for the entire analysis |
na.action |
a NA handling function for data frames, default is |
ylab |
character string if you want to override |
panel |
character string. Name of panel, which goes into file base names and figure labels for cross-referencing. The default is |
head |
character string. Specifies initial text in the figure caption, otherwise a default is used |
tail |
optional character string. Specifies final text in the figure caption, e.g., what might have been put in a footnote in an ordinary text page. This appears just before any needles. |
h |
numeric. Height of plot, in inches |
w |
numeric. Width of plot |
outerlabels |
logical that if |
append |
logical. Set to |
popts |
list specifying extra arguments to pass to |
Details
nriskReport generates multi-panel charts, separately for categorical analysis variables. Each panel depicts the number at risk as a function of follow-up time. The Hmisc Ecdf function is used. Stratification is by treatment or other variables. It is assumed that this function is only run on randomized subjects. If an id variable is present but groups and stratification variables are not, other plots are also produced: a histogram of the number of visits per subject, a histogram of times at which subjects have visits, the average number of contacts as a function of elapsed time, and a histogram showing the distribution of the longest gap between visits over subjects.
Examples
# See test.Rnw in tests directory
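Two of the quantities named in Details are easy to compute by hand, which can help when checking a report. This base R sketch uses made-up follow-up and visit times; it does not use Ecdf and is not the function's own code.

```r
# Number still being followed at each time point
fu   <- c(0.5, 1.2, 2.0, 2.0, 3.5)   # made-up follow-up time per subject
grid <- 0:3
n_at_risk <- vapply(grid, function(t) sum(fu >= t), integer(1))

# Longest gap between consecutive visits, per subject
visits <- data.frame(id   = c(1, 1, 1, 2, 2),
                     time = c(0, 1, 4, 0, 2))
longest_gap <- tapply(visits$time, visits$id,
                      function(t) max(diff(sort(t))))
```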
Put Figure
Description
Includes a generated figure within a LaTeX document. tcaption and tlongcaption only apply if setgreportOption(tablelink="hyperref").
Usage
putFig(
panel,
name,
caption = NULL,
longcaption = NULL,
tcaption = caption,
tlongcaption = NULL,
poptable = NULL,
popfull = FALSE,
sidecap = FALSE,
outtable = FALSE,
append = TRUE
)
Arguments
panel |
character. Panel name. |
name |
character. Name for figure. |
caption |
character. Short caption for figure. |
longcaption |
character. Long caption for figure. |
tcaption |
character. Short caption for supporting table. |
tlongcaption |
character. Long caption for supporting table. |
poptable |
an optional character string containing LaTeX code that will be used as a pop-up tool tip for the figure (typically a tabular). Set to |
popfull |
set to |
sidecap |
set to |
outtable |
set to |
append |
logical. If ‘TRUE’ output will be appended instead of overwritten. |
Compute Sample Fractions
Description
Uses denominators stored with setgreportOption along with counts specified to sampleFrac to compute fractions of subjects in the current analysis.
Usage
sampleFrac(n, nobsY = NULL, table = TRUE)
Arguments
n |
integer vector, named with |
nobsY |
a result of the |
table |
set to |
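The underlying arithmetic is a division of current counts by stored denominators, matched by name. The sketch below assumes a named denominator vector with made-up category names and values; the structure that greport actually stores and what sampleFrac returns may differ.

```r
# Assumed denominators (hypothetical names and values)
denom <- c(enrolled = 500, randomized = 400, A = 200, B = 200)

# Counts of subjects in the current analysis, named like the denominators
n <- c(randomized = 310, A = 150, B = 160)

# Fraction of each denominator represented in the current analysis
frac <- n / denom[names(n)]
round(frac, 3)
```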
Set greport Options
Description
Set greport Options
Usage
setgreportOption(...)
Arguments
... |
a series of options for which non-default values are desired:
|
Plot Initialization
Description
Toggle plotting. Sets options by examining setgreportOption(gtype=).
Usage
startPlot(file, h = 7, w = 7, lattice = TRUE, ...)
endPlot()
Arguments
file |
character. Character string specifying file prefix. |
h |
numeric. Height of plot in inches, default=7. |
w |
numeric. Width of plot in inches, default=7. |
lattice |
logical. Set to |
... |
Arguments to be passed to |
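The start/end pairing follows the usual pattern of opening a graphics device, drawing, and closing it. A base R analogue is sketched below with hypothetical helpers `start_plot` and `end_plot`. It always writes a PDF, whereas startPlot chooses its behavior from the gtype option.

```r
# Hypothetical helpers mirroring the open/close pattern
start_plot <- function(file, h = 7, w = 7) {
  path <- paste0(file, ".pdf")      # file is a prefix, as in startPlot
  pdf(path, height = h, width = w)  # dimensions in inches
  invisible(path)
}
end_plot <- function() invisible(dev.off())

f <- start_plot(file.path(tempdir(), "demo"), h = 3, w = 4)
plot(1:10, main = "Demo")
end_plot()
```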
Survival Report
Description
Generate a Survival Report with Kaplan-Meier Estimates
Usage
survReport(
formula,
data = NULL,
subset = NULL,
na.action = na.retain,
ylab = NULL,
what = c("S", "1-S"),
conf = c("diffbands", "bands", "bars", "none"),
cause = NULL,
panel = "surv",
subpanel = NULL,
head = NULL,
tail = NULL,
h = 3,
w = 4.5,
multi = FALSE,
markevent = TRUE,
mfrow = NULL,
y.n.risk = 0,
mylim = NULL,
bot = 2,
aehaz = TRUE,
times = NULL,
append = FALSE,
opts = NULL,
...
)
Arguments
formula |
a formula with survival ( |
data |
data.frame |
subset |
optional subsetting criteria |
na.action |
function for handling |
ylab |
character. Passed to |
what |
|
conf |
character. See |
cause |
character vector or list. If a vector, every |
panel |
character string. Name of panel, which goes into file base names and figure labels for cross-referencing. |
subpanel |
character string. If calling |
head |
character string. Specifies initial text in the figure caption, otherwise a default is used. |
tail |
optional character string. Specifies final text in the figure caption, e.g., what might have been put in a footnote in an ordinary text page. This appears just before any needles. |
h |
numeric. Height of plots. |
w |
numeric. Width of plots in inches. |
multi |
logical. If |
markevent |
logical. Applies only if |
mfrow |
numeric 2-vector, used if |
y.n.risk |
used if |
mylim |
numeric 2-vector. Used to force expansion of computed y-axis limits. See |
bot |
number of spaces to reserve at bottom of plot for numbers at risk, if |
aehaz |
logical. Set to |
times |
numeric vector. If specified, prints cumulative incidence probabilities at those times on the plots. |
append |
logical. If |
opts |
list. A list specifying arguments to |
... |
ignored |
Examples
## See tests directory test.Rnw for a live example
## Not run:
set.seed(1)
n <- 400
dat <- data.frame(t1=runif(n, 2, 5), t2=runif(n, 2, 5),
e1=rbinom(n, 1, .5), e2=rbinom(n, 1, .5),
treat=sample(c('a','b'), n, TRUE))
dat <- upData(dat,
labels=c(t1='Time to operation',
t2='Time to rehospitalization',
e1='Operation', e2='Hospitalization',
treat='Treatment'),
units=c(t1='year', t2='year'))
survReport(Surv(t1, e1) + Surv(t2, e2) ~ treat, data=dat)
dat <- upData(dat, labels=c(t1='Follow-up Time', t2='Time'),
cause=factor(sample(c('death','MI','censor'), n, TRUE),
c('censor', 'MI', 'death')))
survReport(Surv(t1, cause) ~ treat, cause='death', data=dat)
survReport(Surv(t1, cause) + Surv(t2, cause) ~ treat,
cause=list(c('death', 'MI'), 'death'), data=dat)
# Two plots for t1, one plot for t2
## End(Not run)
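For readers who want to check the Kaplan-Meier estimates behind such a report, the same kind of simulated data can be fitted directly with the survival package. This sketch uses only `survfit()` and `summary()` from survival and shows both forms named by the `what` argument, S and 1 - S; it produces no report output.

```r
library(survival)

set.seed(1)
n <- 400
dat <- data.frame(t1    = runif(n, 2, 5),
                  e1    = rbinom(n, 1, 0.5),
                  treat = sample(c("a", "b"), n, TRUE))

# Kaplan-Meier estimates by treatment
fit <- survfit(Surv(t1, e1) ~ treat, data = dat)

# Estimates at selected times
s <- summary(fit, times = c(3, 4))
est <- data.frame(treat  = s$strata,
                  time   = s$time,
                  S      = s$surv,        # what = "S"
                  cuminc = 1 - s$surv)    # what = "1-S"
est
```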