Title: 'parsnip' Engines for Survival Models
Version: 0.3.3
Description: Engines for survival models from the 'parsnip' package. These include parametric models (e.g., Jackson (2016) <doi:10.18637/jss.v070.i08>), semi-parametric (e.g., Simon et al (2011) <doi:10.18637/jss.v039.i05>), and tree-based models (e.g., Buehlmann and Hothorn (2007) <doi:10.1214/07-STS242>).
License: MIT + file LICENSE
URL: https://github.com/tidymodels/censored, https://censored.tidymodels.org
BugReports: https://github.com/tidymodels/censored/issues
Depends: parsnip (≥ 1.3.0), R (≥ 3.5.0), survival (≥ 3.7-0)
Imports: cli, dials, dplyr (≥ 0.8.0.1), generics, glue, hardhat (≥ 1.4.1), lifecycle, mboost, prodlim (≥ 2023.03.31), purrr, rlang (≥ 1.0.0), stats, tibble (≥ 3.1.3), tidyr (≥ 1.0.0), vctrs
Suggests: aorsf (≥ 0.1.2), coin, covr, flexsurv (≥ 2.2.1), glmnet (≥ 4.1), ipred, partykit, pec, rmarkdown, rpart, testthat (≥ 3.0.0)
Config/Needs/website: tidymodels, tidyverse/tidytemplate
Config/testthat/edition: 3
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-02-14 19:21:15 UTC; hannah
Author: Emil Hvitfeldt [aut], Hannah Frick [aut, cre], Posit Software, PBC [cph, fnd]
Maintainer: Hannah Frick <hannah@posit.co>
Repository: CRAN
Date/Publication: 2025-02-14 20:20:02 UTC

censored: parsnip Engines for Survival Models

Description

censored provides engines for survival models from the parsnip package. The models include parametric survival models, proportional hazards models, decision trees, boosted trees, bagged trees, and random forests. See the "Fitting and Predicting with censored" article for various examples. See below for examples of classic survival models and how to fit them with censored.

Author(s)

Maintainer: Hannah Frick hannah@posit.co (ORCID)

Authors:

Emil Hvitfeldt

Other contributors:

Posit Software, PBC [copyright holder, funder]

See Also

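Useful links:

https://github.com/tidymodels/censored

https://censored.tidymodels.org

Report bugs at https://github.com/tidymodels/censored/issues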

Examples

# Accelerated Failure Time (AFT) model

fit_aft <- survival_reg(dist = "weibull") %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + sex + ph.karno, data = lung)
predict(fit_aft, lung[1:3, ], type = "time")


# Cox's Proportional Hazards model

fit_cox <- proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + sex + ph.karno, data = lung)
predict(fit_cox, lung[1:3, ], type = "time")


# Andersen-Gill model for recurring events

fit_ag <- proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(tstart, tstop, status) ~ treat + inherit + age + strata(hos.cat),
    data = cgd
  )
predict(fit_ag, cgd[1:3, ], type = "time")
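
The examples above predict survival time. As a minimal sketch of the other common prediction type, the code below requests survival probabilities at chosen evaluation times and flattens the nested result. It assumes that tidyr is installed; the object names lung_complete, pred_surv, and unnested are illustrative only, and rows with missing values are dropped up front to keep the example simple.

```r
library(censored) # also attaches parsnip and survival

# Keep only the columns used in the model and drop incomplete rows.
lung_complete <- na.omit(lung[, c("time", "status", "age", "sex", "ph.karno")])

fit_cox <- proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + sex + ph.karno, data = lung_complete)

# Survival probabilities for three observations at two evaluation times.
pred_surv <- predict(
  fit_cox,
  new_data = lung_complete[1:3, ],
  type = "survival",
  eval_time = c(100, 300)
)

# The result has one row per observation with a list column `.pred`;
# unnesting gives one row per observation and evaluation time.
unnested <- tidyr::unnest(pred_surv, cols = .pred)
unnested
```

The nested layout keeps one row per observation regardless of how many evaluation times are requested, so unnesting is only needed when the probabilities are inspected or plotted.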


Internal helper function for aorsf objects

Description

Internal helper function for aorsf objects

Usage

survival_prob_orsf(object, new_data, eval_time, time = deprecated())

Arguments

object

A parsnip model_fit object resulting from rand_forest() with engine = "aorsf".

new_data

A data frame to be predicted.

eval_time

A vector of times to predict the survival probability.

time

Deprecated in favor of eval_time. A vector of times to predict the survival probability.

Value

A tibble with a list column of nested tibbles.

Examples


mod <- rand_forest() %>%
  set_engine("aorsf") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = na.omit(lung))
preds <- survival_prob_orsf(mod, lung[1:3, ], eval_time = c(250, 100))


Boosted trees via mboost

Description

blackboost_train() is a wrapper for the blackboost() function in the mboost package that fits tree-based models where all of the model arguments are in the main function.

Usage

blackboost_train(
  formula,
  data,
  family,
  weights = NULL,
  teststat = "quadratic",
  testtype = "Teststatistic",
  mincriterion = 0,
  minsplit = 10,
  minbucket = 4,
  maxdepth = 2,
  saveinfo = FALSE,
  ...
)

Arguments

formula

A symbolic description of the model to be fitted.

data

A data frame containing the variables in the model.

family

A mboost::Family() object.

weights

An optional vector of weights to be used in the fitting process.

teststat

A character specifying the type of the test statistic to be applied for variable selection.

testtype

A character specifying how to compute the distribution of the test statistic: one of "Bonferroni", "MonteCarlo", "Univariate", or "Teststatistic". The first three options use p-values as the criterion, while "Teststatistic" uses the raw statistic. "Bonferroni" and "Univariate" relate to p-values from the asymptotic distribution (adjusted or unadjusted). Bonferroni-adjusted Monte-Carlo p-values are computed when both "Bonferroni" and "MonteCarlo" are given.

mincriterion

The value of the test statistic or 1 - p-value that must be exceeded in order to implement a split.

minsplit

The minimum sum of weights in a node in order to be considered for splitting.

minbucket

The minimum sum of weights in a terminal node.

maxdepth

The maximum depth of the tree. Setting maxdepth = Inf means that no restrictions are applied to tree sizes; the default here is 2.

saveinfo

Logical. Store information about variable selection procedure in info slot of each partynode.

...

Other arguments to pass.

Value

A fitted blackboost model.

Examples


blackboost_train(Surv(time, status) ~ age + ph.ecog,
  data = lung[-14, ], family = mboost::CoxPH()
)


Wrapper for glmnet for censored

Description

Not to be used directly by users.

Usage

coxnet_train(
  formula,
  data,
  alpha = 1,
  lambda = NULL,
  weights = NULL,
  ...,
  call = caller_env()
)

Arguments

formula

The model formula.

data

The data.

alpha

The elastic net mixing parameter, with 0 \le \alpha \le 1. The penalty is defined as

\frac{1 - \alpha}{2} \|\beta\|_2^2 + \alpha \|\beta\|_1.

alpha = 1 is the lasso penalty, and alpha = 0 the ridge penalty.
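
As a short worked illustration of the penalty above, the two endpoints and the midpoint of alpha can be written out explicitly. The symbol P_\alpha is introduced here only for convenience and is not notation used by glmnet.

```latex
\begin{align*}
P_\alpha(\beta) &= \frac{1-\alpha}{2}\,\|\beta\|_2^2 + \alpha\,\|\beta\|_1,
  \qquad 0 \le \alpha \le 1 \\
P_1(\beta) &= \|\beta\|_1
  && \text{(lasso)} \\
P_0(\beta) &= \tfrac{1}{2}\,\|\beta\|_2^2
  && \text{(ridge)} \\
P_{0.5}(\beta) &= \tfrac{1}{4}\,\|\beta\|_2^2 + \tfrac{1}{2}\,\|\beta\|_1
  && \text{(equal mixing weight)}
\end{align*}
```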

lambda

A user-supplied lambda sequence. Typical usage is to have the program compute its own lambda sequence based on nlambda and lambda.min.ratio. Supplying a value of lambda overrides this. WARNING: use with care. Avoid supplying a single value for lambda (for predictions after CV use predict() instead). Supply instead a decreasing sequence of lambda values. glmnet relies on its warm starts for speed, and it's often faster to fit a whole path than to compute a single fit.

weights

Observation weights. Can be total counts if responses are proportion matrices. Default is 1 for each observation.

...

Additional parameters passed to glmnet::glmnet().

call

The call used in errors and warnings.

Details

This wrapper translates from the formula interface to glmnet's matrix interface because of how stratification is specified. glmnet requires that the response is stratified via glmnet::stratifySurv(), whereas censored allows specification via a survival::strata() term on the right-hand side of the formula. The formula is used to generate the stratification information needed for stratifying the response. The formula without the strata term is used for generating the model matrix for glmnet.

The wrapper retains the original formula and the pre-processing elements including the training data to allow for predictions from the fitted model.

Value

A fitted glmnet model.

Examples


coxnet_mod <- coxnet_train(Surv(time, status) ~ age + sex, data = lung)
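
The stratification handling described in Details can be sketched through the parsnip interface, which is the intended entry point since this wrapper is not meant to be called directly. The example assumes that glmnet is installed; the names lung2, strat_fit, and lp are illustrative, and the penalty value of 0.1 is arbitrary.

```r
library(censored) # also attaches parsnip and survival

# Drop the one row with a missing value in ph.ecog.
lung2 <- lung[-14, ]

# strata(sex) on the right-hand side requests a stratified Cox model;
# it is used to stratify the response rather than as a predictor.
strat_fit <- proportional_hazards(penalty = 0.1) %>%
  set_engine("glmnet") %>%
  fit(Surv(time, status) ~ age + ph.ecog + strata(sex), data = lung2)

# new_data must contain the stratification variable as well.
lp <- predict(strat_fit, new_data = lung2[1:3, ], type = "linear_pred")
lp
```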


A wrapper for survival probabilities with coxnet models

Description

A wrapper for survival probabilities with coxnet models

Usage

survival_prob_coxnet(
  object,
  new_data,
  eval_time,
  time = deprecated(),
  output = "surv",
  penalty = NULL,
  multi = FALSE,
  ...
)

Arguments

object

A parsnip model_fit object resulting from proportional_hazards() with engine = "glmnet".

new_data

Data for prediction.

eval_time

A vector of integers for prediction times.

time

Deprecated in favor of eval_time. A vector of integers for prediction times.

output

One of "surv" or "haz".

penalty

Penalty value(s).

multi

Allow multiple penalty values? Defaults to FALSE.

...

Options to pass to survival::survfit().

Value

A tibble with a list column of nested tibbles.

Examples


cox_mod <- proportional_hazards(penalty = 0.1) %>%
  set_engine("glmnet") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_coxnet(cox_mod, new_data = lung[1:3, ], eval_time = 300)


A wrapper for survival probabilities with coxph models

Description

A wrapper for survival probabilities with coxph models

Usage

survival_prob_coxph(
  object,
  x = deprecated(),
  new_data,
  eval_time,
  time = deprecated(),
  output = "surv",
  interval = "none",
  conf.int = 0.95,
  ...
)

Arguments

object

A parsnip model_fit object resulting from proportional_hazards() with engine = "survival".

x

Deprecated. A model from coxph().

new_data

Data for prediction.

eval_time

A vector of integers for prediction times.

time

Deprecated in favor of eval_time. A vector of integers for prediction times.

output

One of "surv", "conf", or "haz".

interval

Add confidence interval for survival probability? Options are "none" or "confidence".

conf.int

The confidence level.

...

Options to pass to survival::survfit().

Value

A tibble with a list column of nested tibbles.

Examples

cox_mod <- proportional_hazards() %>% 
  set_engine("survival") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_coxph(cox_mod, new_data = lung[1:3, ], eval_time = 300)

A wrapper for survival probabilities with mboost models

Description

A wrapper for survival probabilities with mboost models

Usage

survival_prob_mboost(object, new_data, eval_time, time = deprecated())

Arguments

object

A parsnip model_fit object resulting from boost_tree() with engine = "mboost".

new_data

Data for prediction.

eval_time

A vector of integers for prediction times.

time

Deprecated in favor of eval_time. A vector of integers for prediction times.

Value

A tibble with a list column of nested tibbles.

Examples


mod <- boost_tree() %>%
  set_engine("mboost") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_mboost(mod, new_data = lung[1:3, ], eval_time = 300)


A wrapper for survival probabilities with partykit models

Description

A wrapper for survival probabilities with partykit models

Usage

survival_prob_partykit(
  object,
  new_data,
  eval_time,
  time = deprecated(),
  output = "surv"
)

Arguments

object

A parsnip model_fit object resulting from decision_tree() with engine = "partykit" or rand_forest() with engine = "partykit".

new_data

A data frame to be predicted.

eval_time

A vector of times to predict the survival probability.

time

Deprecated in favor of eval_time. A vector of times to predict the survival probability.

output

Type of output. Can be either "surv" or "haz".

Value

A tibble with a list column of nested tibbles.

Examples


tree <- decision_tree() %>%
  set_mode("censored regression") %>%
  set_engine("partykit") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung)
survival_prob_partykit(tree, lung[1:3, ], eval_time = 100)
forest <- rand_forest() %>%
  set_mode("censored regression") %>%
  set_engine("partykit") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung[1:100, ])
survival_prob_partykit(forest, lung[1:3, ], eval_time = 100)


A wrapper for survival probabilities with pecRpart models

Description

A wrapper for survival probabilities with pecRpart models

Usage

survival_prob_pecRpart(object, new_data, eval_time)

Arguments

object

A parsnip model_fit object resulting from decision_tree() with engine = "rpart".

new_data

Data for prediction.

eval_time

A vector of integers for prediction times.

Value

A tibble with a list column of nested tibbles.

Examples


mod <- decision_tree() %>%
  set_mode("censored regression") %>%
  set_engine("rpart") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_pecRpart(mod, new_data = lung[1:3, ], eval_time = 300)


A wrapper for survival probabilities with survbagg models

Description

A wrapper for survival probabilities with survbagg models

Usage

survival_prob_survbagg(object, new_data, eval_time, time = deprecated())

Arguments

object

A parsnip model_fit object resulting from bag_tree() with engine = "rpart".

new_data

Data for prediction.

eval_time

A vector of prediction times.

time

Deprecated in favor of eval_time. A vector of prediction times.

Value

A vctrs list of tibbles.

Examples


bagged_tree <- bag_tree() %>%
  set_engine("rpart") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung)
survival_prob_survbagg(bagged_tree, lung[1:3, ], eval_time = 100)


Internal helper functions for parametric survival models

Description

Internal helper functions for parametric survival models

Usage

survival_prob_survreg(object, new_data, eval_time, time = deprecated())

hazard_survreg(object, new_data, eval_time)

Arguments

object

A parsnip model_fit object resulting from survival_reg() with engine = "survival".

new_data

A data frame.

eval_time

A vector of time points.

time

Deprecated in favor of eval_time. A vector of time points.

Value

A tibble with a list column of nested tibbles.

Examples

mod <- survival_reg() %>% 
  set_engine("survival") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_prob_survreg(mod, lung[1:3, ], eval_time = 100)
hazard_survreg(mod, lung[1:3, ], eval_time = 100)

A wrapper for survival times with coxnet models

Description

A wrapper for survival times with coxnet models

Usage

survival_time_coxnet(object, new_data, penalty = NULL, multi = FALSE, ...)

Arguments

object

A parsnip model_fit object resulting from proportional_hazards() with engine = "glmnet".

new_data

Data for prediction.

penalty

Penalty value(s).

multi

Allow multiple penalty values?

...

Options to pass to survival::survfit().

Value

A vector.

Examples


cox_mod <- proportional_hazards(penalty = 0.1) %>%
  set_engine("glmnet") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_time_coxnet(cox_mod, new_data = lung[1:3, ], penalty = 0.1)


A wrapper for survival times with coxph models

Description

A wrapper for survival times with coxph models

Usage

survival_time_coxph(object, new_data)

Arguments

object

A parsnip model_fit object resulting from proportional_hazards() with engine = "survival".

new_data

Data for prediction.

Value

A vector.

Examples

cox_mod <- proportional_hazards() %>% 
  set_engine("survival") %>%
  fit(Surv(time, status) ~ ., data = lung)
survival_time_coxph(cox_mod, new_data = lung[1:3, ])

A wrapper for mean survival times with mboost models

Description

A wrapper for mean survival times with mboost models

Usage

survival_time_mboost(object, new_data)

Arguments

object

A parsnip model_fit object resulting from boost_tree() with engine = "mboost".

new_data

Data for prediction.

Value

A tibble.

Examples


boosted_tree <- boost_tree() %>%
  set_engine("mboost") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung[-14, ])
survival_time_mboost(boosted_tree, new_data = lung[1:3, ])


A wrapper for survival times with survbagg models

Description

A wrapper for survival times with survbagg models

Usage

survival_time_survbagg(object, new_data)

Arguments

object

A parsnip model_fit object resulting from bag_tree() with engine = "rpart".

new_data

Data for prediction.

Value

A vector.

Examples


bagged_tree <- bag_tree() %>%
  set_engine("rpart") %>%
  set_mode("censored regression") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung)
survival_time_survbagg(bagged_tree, lung[1:3, ])


Number of days before a movie grosses $1M USD

Description

These data are a somewhat biased random sample of 551 movies released between 2015 and 2018. Columns include

Details

Value

time_to_million

a tibble