%\VignetteIndexEntry{Using mstate for the analyses of Putter et al. (Stat Med 2007)}
%\VignetteDepends{cmprsk}

\SweaveOpts{eval=TRUE}
\SweaveOpts{keep.source=FALSE}

\documentclass[11pt,twoside,a4paper,final]{article}

\usepackage[leqno]{amsmath}
\usepackage{xspace}
\usepackage{amsbsy}
\usepackage{fancyhdr}
\usepackage{comment}
\usepackage{fullpage}
\usepackage{natbib}
\usepackage{hyperref}
\usepackage{Sweave}
\usepackage[utf8]{inputenc}

\renewcommand{\headrulewidth}{0.4pt}
\renewcommand{\headsep}{25pt}
\renewcommand{\footskip}{30pt}
\renewcommand{\footrulewidth}{0pt}

\newcommand{\Rcode}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rfunction}[1]{\textsl{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}
\newcommand{\Rfunarg}[1]{{\textit{#1}}}
\newcommand{\Rmethod}[1]{{\textit{#1}}}
\newcommand{\Ritem}[1]{{\texttt{#1}}} % self defined, item in a list
\newcommand{\Rcolumn}[1]{{\texttt{#1}}} % self defined, column name in data frame

\newcommand{\R}{\texttt{R}\xspace}
\newcommand{\survival}{\Rpackage{survival}\xspace}
\newcommand{\mstate}{\Rpackage{mstate}\xspace}
\newcommand{\msprep}{\Rfunction{msprep}\xspace}
\newcommand{\msfit}{\Rfunction{msfit}\xspace}
\newcommand{\expandcovs}{\Rfunction{expand.covs}\xspace}
\newcommand{\events}{\Rfunction{events}\xspace}
\newcommand{\plot}{\Rfunction{plot}\xspace}
\newcommand{\paths}{\Rfunction{paths}\xspace}
\newcommand{\probtrans}{\Rfunction{probtrans}\xspace}
\newcommand{\Cuminc}{\Rfunction{Cuminc}\xspace}
\newcommand{\coxph}{\Rfunction{coxph}\xspace}
\newcommand{\survfit}{\Rfunction{survfit}\xspace}
\newcommand{\stratum}{\Rcolumn{stratum}\xspace}
\newcommand{\strata}{\Rcolumn{strata}\xspace}
\newcommand{\prtime}{\Rcolumn{prtime}\xspace}
\newcommand{\pr}{\Rcolumn{pr}\xspace}
\newcommand{\from}{\Rcolumn{from}\xspace}
%\newcommand{\direct}{\texttt{direct}\xspace}
\newcommand{\Tstart}{\Rcolumn{Tstart}\xspace}
\newcommand{\Tstop}{\Rcolumn{Tstop}\xspace}
\newcommand{\status}{\Rcolumn{status}\xspace}
\newcommand{\patid}{\Rcolumn{patid}\xspace}
\newcommand{\trans}{\Rcolumn{trans}\xspace}
\newcommand{\keep}{\Rcolumn{keep}\xspace}
\newcommand{\NA}{\Robject{NA}\xspace}
\newcommand{\FALSE}{\Robject{FALSE}\xspace}
\newcommand{\TRUE}{\Robject{TRUE}\xspace}
\newcommand{\newdata}{\Rfunarg{newdata}\xspace}
\newcommand{\Haz}{\Ritem{Haz}\xspace}
\newcommand{\varHaz}{\Ritem{varHaz}\xspace}
\newcommand{\data}{\texttt{data}\xspace}

\newcommand{\tut}{the tutorial\xspace}
\newcommand{\Tx}{transplantation\xspace}
\newcommand{\PR}{platelet recovery\xspace}
\newcommand{\RD}{relapse/death\xspace}
\newcommand{\PH}{proportional hazards\xspace}
\newcommand{\se}{standard error\xspace}
\newcommand{\ses}{standard errors\xspace}
\newcommand{\msm}{multi-state model\xspace}
\newcommand{\msms}{multi-state models\xspace}
\newcommand{\crs}{competing risks\xspace}
\newcommand{\ci}{cumulative incidence\xspace}
\newcommand{\cis}{cumulative incidences\xspace}
\newcommand{\LLambda}{\boldsymbol{\Lambda}}
\newcommand{\II}{{\mathbf I}}
\newcommand{\PP}{{\mathbf P}}
\newcommand{\given}{\,|\,}


\begin{document}
\SweaveOpts{concordance=TRUE}
\title{Tutorial in biostatistics: Competing risks and multi-state models\\
    Analyses using the \mstate package}
\author{Hein Putter\\
    Department of Medical Statistics and Bioinformatics\\
    Leiden University Medical Center\\
    Postzone S-5-P\\
    PO Box 9600\\
    2300 RC\ \ Leiden\\
    The Netherlands\\
    E-mail: \texttt{h.putter@lumc.nl}}

\maketitle

\newpage

\section{Introduction}\label{sec:intro}

This is a companion file both for the \mstate package and for the Tutorial in Biostatistics: Competing risks and multi-state models~\citep{Tut}, simply referred to henceforth as the tutorial. Emphasis in this document will be on the use of \mstate, not on the theory of competing risks and multi-state models. The only exception is that I have added some theory about the Aalen-Johansen estimator that is implemented in \mstate but did not appear in \tut. For other theory on multi-state models, and for interpretation of the results of the analyses, we will repeatedly refer to \tut. I will occasionally give more detail and show more analyses than in \tut.  Also I sometimes give more details on the function in \mstate than strictly necessary for the analyses in \tut, but not all features will be shown either. This file and the \mstate package, which in turn contains all the data used in \tut, can be found at \url{https://github.com/hputter/mstate}. This file is also a vignette of the \mstate package. Type \Rcode{vignette("Tutorial")} after having installed and loaded \mstate to access this document within R.

I do not follow the order of \tut. Rather, I will start with multi-state models, Section 4 of \tut, and finally switch back to the special case of \crs models. Sections~\ref{sec:prep}, \ref{sec:est} and \ref{sec:pred} of this document will discuss data preparation, estimation and prediction, respectively in multi-state models. In Section~\ref{sec:comprisk} I illustrate some functions of \mstate designed especially for competing risks.

After installation, the \mstate package is loaded in the usual way.
<<label=mstate>>=
library(mstate)
@
The versions of \R and \mstate used in this document are as follows:
<<label=mstate>>=
R.version$version.string
packageDescription("mstate", fields = "Version")
@

\section{Data preparation}\label{sec:prep}

The data used in Section 4 of \tut are 2204 patients transplanted at the EBMT between 1995 and 1998. These data are included in the \mstate package. For (a tiny bit) more background on the data, refer to \tut, or type \Rcode{help(ebmt3)}.
<<label=data>>=
data(ebmt3)
head(ebmt3)
@
Let us first have a look at the covariates. For instance disease subclassification:
<<label=tables>>=
n <- nrow(ebmt3)
table(ebmt3$dissub); round(100*table(ebmt3$dissub)/n)
@
The output of the other covariates is omitted.
<<label=tables2,eval=FALSE>>=
table(ebmt3$age); round(100*table(ebmt3$age)/n)
table(ebmt3$drmatch); round(100*table(ebmt3$drmatch)/n)
table(ebmt3$tcd); round(100*table(ebmt3$tcd)/n)
@
The first step in a \msm analysis is to set up the transition matrix. The transition matrix specifies which direct transitions are possible (those with \NA are impossible) and assigns numbers to the transitions for future reference. This can be done explicitly.
<<label=tmat>>=
tmat <- matrix(NA,3,3)
tmat[1,2:3] <- 1:2
tmat[2,3] <- 3
dimnames(tmat) <- list(from=c("Tx","PR","RelDeath"),to=c("Tx","PR","RelDeath"))
tmat
@
Steven McKinney has kindly provided a convenient function \Rfunction{transMat} to define transition matrices. The same transition matrix may be constructed as follows.
<<label=transMat>>=
tmat <- transMat(x=list(c(2,3),c(3),c()), names=c("Tx","PR","RelDeath"))
tmat
@
For common \msms, such as the illness-death model (and competing risks models, Section~\ref{sec:comprisk}) there is a built-in function to obtain these transition matrices more easily.
<<label=tmat2>>=
tmat <- trans.illdeath(names=c("Tx","PR","RelDeath"))
tmat
@
The function \paths can be used to give a list of all possible paths through the multi-state model. This function should not be used for transition matrices specifying a multi-state model with loops, since there will be infinitely many paths. At the moment there is no check for the presence of loops, but this will be included shortly.
<<label=paths>>=
paths(tmat)
@
Time in the \Robject{ebmt3} data is reported in days; before doing any analysis, we first convert this to years.
<<label=times>>=
ebmt3$prtime <- ebmt3$prtime/365.25
ebmt3$rfstime <- ebmt3$rfstime/365.25
@
In order to prepare data in long format, we specify the names of the covariates that we are interested in modeling. Note that I am adding \prtime, which is not really a covariate, but specifying the time of platelet recovery. The purpose of this will become clear later. The specified covariates are to be retained in the dataset in long format (this is the argument \Rfunarg{keep}), which we are going to call \Robject{msbmt}. For the original dataset \Robject{ebmt3}, each row corresponds to a single patient. For the long format data \Robject{msbmt}, each row will correspond to a transition for which a patient is at risk. See \tut for more detailed information.
<<label=msprep>>=
covs <- c("dissub", "age", "drmatch", "tcd", "prtime")
msbmt <- msprep(time=c(NA,"prtime","rfstime"),
    status=c(NA,"prstat","rfsstat"), data=ebmt3, trans=tmat,
    keep=covs)
@
The result is an S3 object of class \Rclass{msdata} and \Rclass{data.frame}. An \Rclass{msdata} object is actually only a data frame with a \Ritem{trans} attribute holding the transition matrix used to define it. A \Rfunction{print} method has been defined for \Rclass{msdata} objects, which also prints the transition matrix if requested (set argument \Rfunarg{trans} to \TRUE, default is \FALSE).
<<label=msbmt>>=
head(msbmt)
@
In the above call of \msprep, the \Rfunarg{time} and \Rfunarg{status} arguments specify the column names in the data \Robject{ebmt3} corresponding to the three states in the multi-state model. Since all the patients start in state 1 at time 0, the \Rfunarg{time} and \Rfunarg{status} arguments corresponding to the first state do not really have a value. In such cases, the corresponding elements of \Rfunarg{time} and \Rfunarg{status} may be given the value \NA. An alternative way of specifying \Rfunarg{time} and \Rfunarg{status} (and \Rfunarg{keep} as well) is as matrices of dimension $n \times S$ with $S$ the number of states (and $n \times p$ with $p$ the number of covariates for \Rfunarg{keep}). The \Rfunarg{data} argument doesn't need to be specified then.

The number of events in the data can be summarized with the function \events.
<<label=events>>=
events(msbmt)
@
For regression purposes, we now add transition-specific covariates to the dataset. For more details on transition-specific covariates, refer to \tut. For a numerical covariate \Rcolumn{cov}, the names of the expanded (transition-specific) covariates are \Rcolumn{cov.1}, \Rcolumn{cov.2} etc. The extension \Rcolumn{.i} refers to transition number $i$. First, we define these transition-specific covariates as a separate dataset, by setting \Rfunarg{append} to \FALSE.
<<label=expandcovs>>=
expcovs <- expand.covs(msbmt,covs[2:3],append=FALSE)
head(expcovs)
@
We see that this expanded covariates dataset is quite large, and that the covariate names are quite long. For categorical covariates, the default names of the expanded covariates are a combination of the covariate name, the level (similar to the names of the regression coefficients that you see in regression output), followed by the transition number, in such a way that the combination is allowed as column name. If these names are too long, the user may set the value of \Rfunarg{longnames} (default=\TRUE) to \FALSE. In this case, the covariate name is followed by \Rcolumn{1}, \Rcolumn{2} etc, before the transition number. In case of a covariate with only two levels, the covariate name is just followed by the transition number. Confident that this will work out, we also set \Rfunarg{append} to \TRUE (default), which will append the expanded covariates to the dataset.
<<label=expandcovs2>>=
msbmt <- expand.covs(msbmt,covs,append=TRUE,longnames=FALSE)
head(msbmt)
@
The names indeed are quite a bit shorter. The downside however is that we need to remember for ourselves to which category for instance the number \Rcolumn{1} in \Rcolumn{age1.2} corresponds (age 20-40 with $\leq 20$ as reference category).

\section{Estimation}\label{sec:est}

After having prepared the data in long format, estimation of covariate effects using Cox regression is straightforward using the \coxph function of the \survival package. This is not at all a feature of the \mstate package, other than that \msprep has facilitated preparation of the data. Let us consider the Markov model, where we assume different effects of the covariates for different transitions; hence we use the transition-specific covariates obtained by \expandcovs. The delayed entry aspect of this model for transition 3 (see discussion in \tut) is achieved by specifying \Rfunarg{Surv(Tstart,Tstop,status)}, where (this is reflected in the long format data) \Tstart is the time of entry in the state, and \Tstop the event or censoring time, depending on the value of \Rcolumn{status}. We consider first the model without any proportionality assumption on the baseline hazards; this is achieved by adding \Rfunarg{strata(trans)} to the formula, which estimates separate baseline hazards for different values of \trans (the transitions).
The results appear in the left column of Table III of \tut.
<<label=Markovnonprop>>=
c1 <- coxph(Surv(Tstart,Tstop,status) ~ dissub1.1 + dissub2.1 + age1.1 + age2.1 + drmatch.1 + tcd.1 
    + dissub1.2 + dissub2.2 + age1.2 + age2.2 + drmatch.2 + tcd.2 
    + dissub1.3 + dissub2.3 + age1.3 + age2.3 + drmatch.3 + tcd.3 
    + strata(trans),
    data=msbmt, method="breslow")
c1
@
The interpretation is discussed in \tut.

The next model considered is the Markov model where the transition hazards into relapse or death (these correspond to transitions 2 and 3) are assumed to be proportional. For this purpose transition 1 (\Tx $\to$ \PR) belongs to one stratum and transitions 2 (\Tx $\to$ \RD) and 3 (\PR $\to$ \RD) belong to a second stratum. Transitions 2 and 3 have the same receiving state, hence the same value of \Rcolumn{to}, so the two strata can be distinguished by the variable \Rcolumn{to} in our dataset. In order to distinguish between transitions 2 and 3, we introduce a time-dependent covariate \pr that indicates whether or not platelet recovery has already occurred. For transition 2 (Tx $\to$ RelDeath) the value of \pr equals 0, while for transition 3 (PR $\to$ RelDeath) the value of \pr equals 1. Results are found in the middle of Table III of \tut.
<<label=Markovprop>>=
msbmt$pr <- 0
msbmt$pr[msbmt$trans==3] <- 1
c2 <- coxph(Surv(Tstart,Tstop,status) ~ dissub1.1 + dissub2.1 + age1.1 + age2.1 + drmatch.1 + tcd.1
    + dissub1.2 + dissub2.2 + age1.2 + age2.2 + drmatch.2 + tcd.2
    + dissub1.3 + dissub2.3 + age1.3 + age2.3 + drmatch.3 + tcd.3
    + pr
    + strata(to),
    data=msbmt, method="breslow")
c2
@
For a discussion of the results we again refer to \tut. The hazard ratio of \pr (0.685) and its $p$-value (0.073) indicate a trend-significant beneficial effect of \PR on relapse-free survival. Later on we will look at the corresponding baseline transition intensities for these two models and see as a graphical check that the assumption of proportionality of the baseline hazards for transitions 2 and 3 is reasonable. This can also be tested formally using the function \Rfunction{cox.zph} (part of the \survival package, not of \mstate).
<<label=coxzph>>=
cox.zph(c2)
@
There is no evidence of non-proportionality of the baseline transition intensities of transitions 2 ($p$=0.496 for \pr). There is strong evidence that the \PH assumption for \Rcolumn{dissub2} (CML vs AML) is violated, at least for the transitions into relapse and death. This makes sense, clinically, since CML and AML are two diseases with completely different biological pathways. It would have been much better to  study separate \msms for the three disease subclassifications. However, since the purpose of this manuscript is to illustrate the use of \mstate, we will blatantly ignore the clear evidence of non-proportionality for the disease subclassifications.

Building on the Markov PH model, we can investigate whether the time at which a patient arrived in state 2 (PR) influences the subsequent RFS rate, that is, the transition hazard of PR $\to$ RelDeath. Here the purpose of expanding \prtime becomes apparent. Since \prtime only makes sense for transition 3 (PR $\to$ RelDeath), we need the transition-specific covariate of \prtime for transition 3, which is \Rcolumn{prtime.3}. The corresponding model is termed the "state arrival extended Markov PH" model in \tut, and appears on the right of Table III.
<<label=nonMarkovPH>>=
c3 <- coxph(Surv(Tstart,Tstop,status) ~ dissub1.1 + dissub2.1 + age1.1 + age2.1 + drmatch.1 + tcd.1
    + dissub1.2 + dissub2.2 + age1.2 + age2.2 + drmatch.2 + tcd.2
    + dissub1.3 + dissub2.3 + age1.3 + age2.3 + drmatch.3 + tcd.3
    + pr
    + prtime.3
    + strata(to),
    data=msbmt, method="breslow")
c3
@
The influence of the time at which \PR occurred seems small and is not significant ($p$=0.62, last row).

The clock-reset models may be obtained very similarly to those of the clock-forward models. The only difference is that \Rfunarg{Surv(Tstart,Tstop,status)} is replaced by \Rfunarg{Surv(time,status)}. This reflects the fact (recall that in our long format data each row corresponds to a transition) that for each transition the time starts at 0, rather than \Tstart, the time since start of study at which the state has been entered. We will only show the code, not the output; the reader may try this for him-or herself.
<<label=semiMarkov>>=
c4 <- coxph(Surv(time,status) ~ dissub1.1 + dissub2.1 + age1.1 + age2.1 + drmatch.1 + tcd.1 
    + dissub1.2 + dissub2.2 + age1.2 + age2.2 + drmatch.2 + tcd.2 
    + dissub1.3 + dissub2.3 + age1.3 + age2.3 + drmatch.3 + tcd.3 
    + strata(trans),
    data=msbmt, method="breslow")
c5 <- coxph(Surv(time,status) ~ dissub1.1 + dissub2.1 + age1.1 + age2.1 + drmatch.1 + tcd.1
    + dissub1.2 + dissub2.2 + age1.2 + age2.2 + drmatch.2 + tcd.2
    + dissub1.3 + dissub2.3 + age1.3 + age2.3 + drmatch.3 + tcd.3
    + pr
    + strata(to),
    data=msbmt, method="breslow")
c6 <- coxph(Surv(time,status) ~ dissub1.1 + dissub2.1 + age1.1 + age2.1 + drmatch.1 + tcd.1
    + dissub1.2 + dissub2.2 + age1.2 + age2.2 + drmatch.2 + tcd.2
    + dissub1.3 + dissub2.3 + age1.3 + age2.3 + drmatch.3 + tcd.3
    + pr
    + prtime.3
    + strata(to),
    data=msbmt, method="breslow")
@

\section{Prediction}\label{sec:pred}

In order to obtain prediction probabilities in the context of the Markov \msms discussed in the previous section, basically two steps are involved. The first is to use the estimated parameters and baseline transition hazards and the covariate values of a patient of interest, to obtain patient-specific transition hazards for that patient, for each of the transitions in the \msm. This is what the function \msfit is designed to do. The second step is to use the resulting patient-specific transition hazards (and variances and covariances) as input for \probtrans to obtain (patient-specific) transition probabilities.

I will first show how \msfit can be used to obtain the baseline hazards associated with the Markov stratified and PH models. The hazards of the Markov stratified models (and their variances and covariates) are obtained by first creating a new dataset containing the (expanded) covariates along with their values (in this case 0). This is very similar to the use of \survfit from the \survival package. The important difference is that for one patient, this \newdata data frame needs to have exactly one line for each transition. When transition-specific covariates have been used in the model, the easiest way to obtain such a data frame is to first create a data frame with the basic covariates and then using \expandcovs to obtain the transition-specific covariates. Since \expandcovs expects an \Rclass{msdata} object, we set the class of the \newdata data to \Rclass{msdata} explicitly. We also copy the levels of the categorical covariates before expanding, although this is not really necessary here.
<<label=newdata>>=
newd <- data.frame(dissub=rep(0,3),age=rep(0,3),drmatch=rep(0,3),tcd=rep(0,3),trans=1:3)
newd$dissub <- factor(newd$dissub,levels=0:2,labels=levels(ebmt3$dissub))
newd$age <- factor(newd$age,levels=0:2,labels=levels(ebmt3$age))
newd$drmatch <- factor(newd$drmatch,levels=0:1,labels=levels(ebmt3$drmatch))
newd$tcd <- factor(newd$tcd,levels=0:1,labels=levels(ebmt3$tcd))
attr(newd, "trans") <- tmat
class(newd) <- c("msdata","data.frame")
newd <- expand.covs(newd,covs[1:4],longnames=FALSE)
newd$strata=1:3
newd
@
The last command where the column \Rcolumn{strata} is added is important and points to a second major difference between \survfit and \msfit. The \newdata data frame needs to have a column \strata specifying to which stratum in the \Robject{coxph} object each transition belongs. Here each transition corresponds to a separate stratum, so we specify 1, 2, and 3.

To obtain an estimate of the baseline cumulative hazard for the "stratified hazards" model, \msfit can be called with the first Cox model, \Robject{c1}, as input model, and \Robject{newd} as \Rfunarg{newdata} argument.
<<label=msfit>>=
msf1 <- msfit(c1, newdata=newd, trans=tmat)
@
The result is an object of class \Rclass{msfit}, which is a list with three items, \Haz, \varHaz, and \Ritem{trans}. The item \Ritem{trans} records the transition matrix used when constructing the \Rclass{msfit} object. \Haz contains the estimated cumulative hazard for each of the transitions for the particular patient specified in \Robject{newd}, while \varHaz contains the estimated variances of these cumulative hazards, as well as the covariances for each combination of two transitions. All are evaluated at the time points for which any event in any transition occurs, possibly augmented with the largest (non-event) time point in the data. The \Rfunction{summary} method for \Rclass{msfit} objects is most conveniently used for a summary. If we also would like to have a look at the covariances, we could set the argument \Rfunarg{variance} equal to \TRUE.
<<label=msfitsummary>>=
summary(msf1)
@
Let us have a closer look at some of the variances and covariances as well.
<<label=msf1cov>>=
vH1 <- msf1$varHaz
head(vH1[vH1$trans1==1 & vH1$trans2==1,])
tail(vH1[vH1$trans1==1 & vH1$trans2==1,])
tail(vH1[vH1$trans1==1 & vH1$trans2==2,])
tail(vH1[vH1$trans1==1 & vH1$trans2==3,])
tail(vH1[vH1$trans1==2 & vH1$trans2==3,])
@
Note that the covariances of the estimated cumulative hazards are practically (apart from rounding errors) 0. Theoretically, they should be 0, because with separate strata and separate covariate effects for the different transitions, the estimates of the three transitions could in fact have been estimated as three separate Cox models (this would give exactly the same results).

The estimated baseline cumulative hazards for the Markov PH model are obtained in mostly the same way. The only exception is the specification of the \Rfunarg{strata} argument in \Robject{newd}. Instead of taking the values 1, 2, and 3, for the three transitions, they take values 1, 2, 2, to indicate that transition 1 corresponds to stratum 1, and both transitions 2 and 3 correspond to stratum 2 (the order of the strata as defined in the \Robject{coxph} object). Also the time-dependent covariate \pr needs to be included, taking the value 0 for transitions 1 and 2, and 1 for transition 3.
<<label=msfit2>>=
newd$strata=c(1,2,2)
newd$pr <- c(0,0,1)
msf2 <- msfit(c2, newdata=newd, trans=tmat)
summary(msf2)
vH2 <- msf2$varHaz
tail(vH2[vH2$trans1==1 & vH2$trans2==2,])
tail(vH2[vH2$trans1==1 & vH2$trans2==3,])
tail(vH2[vH2$trans1==2 & vH2$trans2==3,])
@
Note that the estimated cumulative hazards and variances for transition 1 are identical to those from \Robject{msf1}. We saw earlier that the estimated regression coefficients were also identical for the Markov stratified and the Markon PH models. Note also that the variance of the cumulative hazard of transition 3 (and 2, not shown) is smaller than with \Robject{msf1}. The cumulative hazard estimates of transitions 1 and 2 are still uncorrelated (and 1 and 3), but those of transitions 2 and 3 are correlated now, because they share a common baseline. 

Let us compare the baseline hazards of the Markov stratified and PH models graphically. For this we use the \Rfunction{plot} method for \Rclass{msfit} objects. Figure~\ref{fig:fig14} corresponds to Figure~14 in \tut.
<<label=fig14a,eval=FALSE>>=
par(mfrow=c(1,2))
plot(msf1,cols=rep(1,3),lwd=2,lty=1:3,
    xlab="Years since transplant",ylab="Stratified baseline hazards",legend.pos=c(2,0.9))
plot(msf2,cols=rep(1,3),lwd=2,lty=1:3,
    xlab="Years since transplant",ylab="Proportional baseline hazards",legend.pos=c(2,0.9))
par(mfrow=c(1,1))
@
\begin{figure}[h!]
    \centering
<<label=fig14b,echo=FALSE,fig=TRUE>>=
par(mfrow=c(1,2))
plot(msf1,cols=rep(1,3),lwd=2,lty=1:3,
    xlab="Years since transplant",ylab="Stratified baseline hazards",legend.pos=c(2,0.9))
plot(msf2,cols=rep(1,3),lwd=2,lty=1:3,
    xlab="Years since transplant",ylab="Proportional baseline hazards",legend.pos=c(2,0.9))
par(mfrow=c(1,1))
@
    \caption{Baseline cumulative hazard curves for the EBMT illness-death model.
    On the left the Markov stratified hazards model, on the right the Markov PH model.}
    \label{fig:fig14}
\end{figure}
\setkeys{Gin}{width=0.65\textwidth}
Define the \msm as $X(t)$, a random process taking values in $1,\ldots,S$ ($S$ being the number of states). We are interested in estimating so called transition probabilities $P_{gh}(s,t) = P(X(t) = h \given X(s) = g)$, possibly depending on covariates. For instance, $P_{13}(0,t)$ indicates the probability of having relapsed/died (state 3) by time $t$, given that the individual was alive without relapse or \PR (state 1) at time $s=0$. By fixing $s$ and varying $t$, we can predict the future behavior of the \msm given the present at time $s$. For Markov models, these probabilities will depend only on the state at time $s$, not on what happened before. For these Markov models there is a powerful relation between these transition probabilities and the transition intensities, given by
\begin{equation}\label{eq:Kolm}
    \PP(s,t) = \prod_{(s,t]} (\II + d\LLambda(u))
\end{equation}
Here $\PP(s,t)$ is an $S \times S$ matrix with as $(g,h)$ element the $P_{gh}(s,t)$ in which we are interested, and $\LLambda(t)$ is an $S \times S$ matrix with as off-diagonal $(g,h)$ elements the transition intensities $\Lambda_{gh}(t)$ of transition $g \to h$. If such a direct transition is not possible, then $\Lambda_{gh}(t)=0$. The diagonal elements of $\LLambda(t)$ are defined as $\Lambda_{gg}(t) = -\sum_{h \not= g} \Lambda_{gh}(t)$, i.e.~as minus the sum of the transition intensities of the transitions out from state $g$. Finally, $\II$ is the $S \times S$ identity matrix. Equation~\eqref{eq:Kolm} describes a theoretical relation between the true underlying transition intensities and transition probabilities. The product is a so called product integral~\citep{ABGK} when the transition intensities are continuous.

We already have estimates of all the transition intensities. If we gather these in a matrix and plug them in equation~\eqref{eq:Kolm}, we get
\begin{equation}\label{eq:AJ}
    \hat{\PP}(s,t) = \prod_{s < u \leq t} \left(
        \II + d\hat{\LLambda}(u)
    \right)
\end{equation}
as an estimate of the transition probabilities. This estimator is called the Aalen-Johansen estimator, and it is implemented in \probtrans. By working with matrices, we immediately get all the transition probabilities from all the starting states $g$ to all the receiving states $h$ in one go. When we fix $s$, we can calculate all these transition probabilities by forward matrix multiplications using the simple recursive relation
\[
    \hat{\PP}(s,t+) = \hat{\PP}(s,t) \cdot
        \left( \II + d\hat{\LLambda}(t+) \right)\ .
\]
\citet{ABGK} and \citet{deW} also describe recursive formulas for the covariance matrix of $\hat{\PP}(s,t)$, with and without covariates, which are implemented in \mstate.

Let us see all this theory in action and let us recreate Figure~15 of \tut. For this we need to calculate transition probabilities for a baseline patient, based on the Markov PH model. We thus use \Robject{msf2} as input for \probtrans. By default, \probtrans uses forward prediction, which means that $s$ is kept fixed and $t>s$. The argument \Rfunarg{predt} specifies either $s$ or $t$. In this case (forward prediction) it specifies $s$. From version 0.2.3 on, \probtrans no longer needs a \Rfunarg{trans} argument, but takes that from the \Ritem{trans} item of the \Rclass{msfit} object.
<<label=probtrans1>>=
pt <- probtrans(msf2, predt=0)
@
The result of \probtrans is a \Rclass{probtrans} object, which is a list, where item \Ritem{[[i]]} contains predictions from state $i$. Each item of the list is a data frame with \Rcolumn{time} containing all event time points, and \Rcolumn{pstate1}, \Rcolumn{pstate2}, etc the probabilities of being in state 1, 2, etc, and finally \Rcolumn{se1}, \Rcolumn{se2} etc the standard errors of these estimated probabilities. The item \Ritem{[[3]]} contains predictions $\hat{P}_{3h}(0,t)$ (we chose $s=0$) starting from the RelDeath state, which is absorbing.
<<label=pt13>>=
head(pt[[3]])
tail(pt[[3]])
@
We see that these prediction probabilities are not so interesting; the probabilities are all 0 or 1, and, since there is no randomness, all the SE's are 0. Item \Ritem{[[2]]} contains predictions $\hat{P}_{2h}(0,t)$ from state 2.

It is easier to use the \Rfunction{summary} method for \Rclass{probtrans} objects. The user may specify a \Rfunarg{from} argument, specifying from which state the predictions are to be printed. The \Rfunction{summary} method prints a selection, the \Rfunction{head} and \Rfunction{tail} by default unless there are fewer than 12 time points. When \Rfunarg{complete} is set to \TRUE, predictions for all time points are printed. If the \Rfunarg{from} argument is missing in the function call, then predictions from all states are printed.
<<label=pt12>>=
summary(pt, from=2)
@
From state 2 it is only possible to visit state 3 or to remain in state 2. The probability of going to state 1 is 0. The predictions $\hat{P}_{1h}(0,t)$ from state 1 in \Ritem{[[1]]} are perhaps of most interest here.
<<label=pt11>>=
summary(pt, from=1)
@
But we see that we do not have enough information to create Figure 15 of \tut, since the probability of the relapse/death state (\Rcolumn{pstate3}) does not distinguish between relapse/death before or after \PR. The remedy is actually easy in this case. Consider a different \msm with two RelDeath states, the first one (state 3) after \PR, the second one (state 4) without \PR. The transition matrix of this \msm is defined as
<<label=tmat2>>=
tmat2 <- transMat(x=list(c(2,4),c(3),c(),c()))
tmat2
@
The \msm has four states and the same three transitions as before. If we apply \probtrans to this new \msm with the same estimated cumulative hazards and standard errors as before, we get exactly what we want. Thus, we just have to call \probtrans with the old \Robject{msf2} and the new \Robject{tmat2}. From version 0.2.3 on, since the transition matrix is in the \Rclass{msfit} object, we just need to replace the \Ritem{trans} item of \Robject{msf2} by \Robject{tmat2}. In the elements of the resulting lists, \Rcolumn{pstate3} will indicate the probability of relapse/death after \PR and \Rcolumn{pstate4} the probability of relapse/death without \PR.
<<label=probtrans2>>=
msf2$trans <- tmat2
pt <- probtrans(msf2, predt=0)
summary(pt, from=1)
@
The reader may check that the \Rcolumn{pstate3} and \Rcolumn{pstate4} probabilities of this new Aalen-Johansen estimator sum up to the \Rcolumn{pstate3} probability of the result of the previous call to \probtrans, and that the \Rcolumn{pstate1} and \Rcolumn{pstate2} probabilities are unchanged.

Figure~\ref{fig:fig15} contains a plot of \Robject{pt1}. For this we use the \Rfunction{plot} method for \Rclass{probtrans} objects.
<<label=pt2plot,eval=FALSE>>=
plot(pt,ord=c(2,3,4,1),lwd=2,
    xlab="Years since transplant",ylab="Prediction probabilities",cex=0.75,
    legend=c("Alive in remission, no PR","Alive in remission, PR",
    "Relapse or death after PR","Relapse or death without PR"))
@
\begin{figure}[h!]
    \centering
<<label=pt2plot,echo=FALSE,fig=TRUE>>=
plot(pt,ord=c(2,3,4,1),lwd=2,
    xlab="Years since transplant",ylab="Prediction probabilities",cex=0.75,
    legend=c("Alive in remission, no PR","Alive in remission, PR",
    "Relapse or death after PR","Relapse or death without PR"))
@
    \caption{Stacked prediction probabilities at $s=0$ for a reference patient. PR stands for \PR}
    \label{fig:fig15}
\end{figure}
The argument \Rfunarg{from} determines from which state the transition probabilities are to be plotted. The default is from state 1, which is what we want, so the \Rfunarg{from} argument is omitted here. The default \Rfunarg{type} of the \Rfunction{plot} method for \Rclass{probtrans} objects is a "stacked" plot, for which the difference between two adjacent lines represents the probability of being in a state. The argument \Rfunarg{ord} specifies the order of the states of which the probabilities are stacked. The present order, 2, 3, 4, 1, allows states 2 and 3 to be combined visually (states with \PR) and states 3 and 4 (death states). Other plot types are "filled", which is like "stacked", but uses colors to fill the space between adjacent lines, "single", which simply plots the transition probabilities as different lines in a single plot, and "separate", which uses separate plots for the transition probabilities.

To obtain the predictions $\hat{P}_{1h}(s,t)$ for $s=0.5$, which are plotted in Figure 16 of \tut, we simply change the value of \Rfunarg{predt} in the call to \probtrans.
<<label=probtrans2>>=
pt <- probtrans(msf2, predt=0.5)
summary(pt, from=1)
@
The result now contains only time points $t \geq 0.5$. Figure~\ref{fig:fig16} contains a plot of \Robject{pt1}.
<<label=pt2plot2state,eval=FALSE>>=
plot(pt,ord=c(2,3,4,1),lwd=2,
    xlab="Years since transplant",ylab="Prediction probabilities",cex=0.75,
    legend=c("Alive in remission, no PR","Alive in remission, PR",
    "Relapse or death after PR","Relapse or death without PR"))
@
\begin{figure}
    \centering
<<label=pt2plot2do,echo=FALSE,fig=TRUE>>=
plot(pt,ord=c(2,3,4,1),lwd=2,
    xlab="Years since transplant",ylab="Prediction probabilities",cex=0.75,
    legend=c("Alive in remission, no PR","Alive in remission, PR",
    "Relapse or death after PR","Relapse or death without PR"))
@
    \caption{Stacked prediction probabilities at $s=0.5$ for a reference patient}
    \label{fig:fig16}
\end{figure}
Figure 17 of \tut distinguishes between three patients, one being the good old (or rather young) reference patient, for which we have already calculated the probabilities, one for a patient in the age category 20-40, and one for a patient older than 40. To obtain prediction probabilities for the latter two patients as well, we have to repeat part of the calculations, changing only the value of age in the \Robject{newdata} data frame.
<<label=makefig17a,keep.source=TRUE>>=
msf2$trans <- tmat
msf.20 <- msf2 # copy msfit result for reference (young) patient
newd <- newd[,1:5] # use the basic covariates of the reference patient
newd2 <- newd
newd2$age <- 1
newd2$age <- factor(newd2$age,levels=0:2,labels=levels(ebmt3$age))
attr(newd2, "trans") <- tmat
class(newd2) <- c("msdata","data.frame")
newd2 <- expand.covs(newd2,covs[1:4],longnames=FALSE)
newd2$strata=c(1,2,2)
newd2$pr <- c(0,0,1)
msf.2040 <- msfit(c2, newdata=newd2, trans=tmat)
newd3 <- newd
newd3$age <- 2
newd3$age <- factor(newd3$age,levels=0:2,labels=levels(ebmt3$age))
attr(newd3, "trans") <- tmat
class(newd3) <- c("msdata","data.frame")
newd3 <- expand.covs(newd3,covs[1:4],longnames=FALSE)
newd3$strata=c(1,2,2)
newd3$pr <- c(0,0,1)
msf.40 <- msfit(c2, newdata=newd3, trans=tmat)
pt.20 <- probtrans(msf.20,predt=0) # original young (<= 20) patient
pt.201 <- pt.20[[1]]; pt.202 <- pt.20[[2]]
pt.2040 <- probtrans(msf.2040,predt=0) # patient 20-40
pt.20401 <- pt.2040[[1]]; pt.20402 <- pt.2040[[2]]
pt.40 <- probtrans(msf.40,predt=0) # patient > 40
pt.401 <- pt.40[[1]]; pt.402 <- pt.40[[2]]
@
The 5-years transition probabilities $P_{13}(0,5)$ and $P_{23}(0,5)$ are estimated as 0.30275 and 0.26210 respectively.
<<label=rfs5,keep.source=TRUE>>=
pt.201[488:489,] # 5 years falls between 488th and 489th time point
pt.202[488:489,] # 5-years probabilities
@
Figure~\ref{fig:Fig17} shows relapse-free survival probabilities without distinction between before or after \PR, so we can use the first transition matrix \Robject{tmat}. The probabilities we want are $1-\hat{P}_{13}(0,t)$ and $1-\hat{P}_{23}(0,t)$, the first one conditioning on being in state 1 (transplantation, i.e.~no PR), the second in being in state 2 (PR).
<<label=makefig17b,eval=FALSE>>=
plot(pt.201$time,1-pt.201$pstate3,ylim=c(0.425,1),type="s",lwd=2,col="red",xlab="Years since transplant",ylab="Relapse-free survival")
lines(pt.20401$time,1-pt.20401$pstate3,type="s",lwd=2,col="blue")
lines(pt.401$time,1-pt.401$pstate3,type="s",lwd=2,col="green")
lines(pt.202$time,1-pt.202$pstate3,type="s",lwd=2,col="red",lty=2)
lines(pt.20402$time,1-pt.20402$pstate3,type="s",lwd=2,col="blue",lty=2)
lines(pt.402$time,1-pt.402$pstate3,type="s",lwd=2,col="green",lty=2)
legend(6,1,c("no PR","PR"),lwd=2,lty=1:2,xjust=1,bty="n")
legend("topright",c("<=20","20-40",">40"),lwd=2,col=c("red","blue","green"),bty="n")
@
\begin{figure}
    \centering
<<label=makefig17c,echo=FALSE,fig=TRUE>>=
plot(pt.201$time,1-pt.201$pstate3,ylim=c(0.4,1),type="s",lwd=2,col="red",xlab="Years since transplant",ylab="Relapse-free survival")
lines(pt.20401$time,1-pt.20401$pstate3,type="s",lwd=2,col="blue")
lines(pt.401$time,1-pt.401$pstate3,type="s",lwd=2,col="green")
lines(pt.202$time,1-pt.202$pstate3,type="s",lwd=2,col="red",lty=2)
lines(pt.20402$time,1-pt.20402$pstate3,type="s",lwd=2,col="blue",lty=2)
lines(pt.402$time,1-pt.402$pstate3,type="s",lwd=2,col="green",lty=2)
legend(6,1,c("no PR","PR"),lwd=2,lty=1:2,xjust=1,bty="n")
legend("topright",c("<=20","20-40",">40"),lwd=2,col=c("red","blue","green"),bty="n")
@
    \caption{Predicted relapse-free survival probabilities for three patients in different
    age categories, given platelet recovery (dashed) and given no \PR (solid). The time of
    prediction was at transplant (note: in \tut this was at 1 month after transplant).}
    \label{fig:Fig17}
\end{figure}
It is also possible to do prediction with a fixed horizon. This should not be understood as attempting to predict the past. It means that in our prediction probabilities $P_{gh}(s,t)$, we fix $t$, a time horizon, and we want to study how $P_{gh}(s,t)$ changes as more and more information on a patient becomes available. From a computational point of view this just means that the order of the matrix multiplication in~\eqref{eq:AJ} is reversed. We will plot $1-\hat{P}_{13}(s,5)$ and $1-\hat{P}_{23}(s,5)$, the 5-years relapse-free survival probabilities given that the patient is in state 1 (no PR) and in state 2 (PR), respectively, for the same three patients as before.
<<label=backward>>=
pt.20 <- probtrans(msf.20,direction="fixedhorizon",predt=5)
pt.201 <- pt.20[[1]]; pt.202 <- pt.20[[2]]
head(pt.201); head(pt.202)
@
Here item \Ritem{[[1]]} gives estimates $\hat{P}_{1h}(s,5)$ and \Ritem{[[2]]} gives estimates $\hat{P}_{2h}(s,5)$. For item \Ritem{[[g]]}, the column \Rcolumn{time} gives the different values of $s$ and \Rcolumn{pstate1} etc give the estimated probabilities of being in state 1 etc at 5 years, conditional on being in state $g$ at time $s$. In \Robject{pt.201} we recognize at \Rcolumn{time} ($s$)=0) 0.30275 as $\hat{P}_{1h}(0,5)$ and in \Robject{pt.202} we see 0.26210 as $\hat{P}_{2h}(0,5)$. The backward transition probabilities for the other two patients are calculated similarly.
<<label=backward2>>=
pt.2040 <- probtrans(msf.2040,direction="fixedhorizon",predt=5) # patient 20-40
pt.20401 <- pt.2040[[1]]; pt.20402 <- pt.2040[[2]]
pt.40 <- probtrans(msf.40,direction="fixedhorizon",predt=5) # patient > 40
pt.401 <- pt.40[[1]]; pt.402 <- pt.40[[2]]
@
As mentioned before, in $s=0$, these probabilities are the same as the five-years probabilities of Figure~\ref{fig:Fig17}, and as $s$ approaches 5, the probabilities approach 1, since both $\hat{P}_{13}(s,5)$ and $\hat{P}_{23}(s,5)$ approach 0. Figure~\ref{fig:new} shows 5-years relapse-free survival probabilities, both with and without \PR, with the prediction time $s$ varying.
<<label=makenewa,eval=FALSE>>=
plot(pt.201$time,1-pt.201$pstate3,ylim=c(0.425,1),type="s",
    lwd=2,col="red",xlab="Years since transplant",
    ylab="Relapse-free survival")
lines(pt.20401$time,1-pt.20401$pstate3,type="s",lwd=2,col="blue")
lines(pt.401$time,1-pt.401$pstate3,type="s",lwd=2,col="green")
lines(pt.202$time,1-pt.202$pstate3,type="s",lwd=2,col="red",lty=2)
lines(pt.20402$time,1-pt.20402$pstate3,type="s",lwd=2,col="blue",lty=2)
lines(pt.402$time,1-pt.402$pstate3,type="s",lwd=2,col="green",lty=2)
legend("topleft",c("<=20","20-40",">40"),lwd=2,col=c("red","blue","green"),bty="n")
legend(1,1,c("no PR","PR"),lwd=2,lty=1:2,bty="n")
title(main="Backward prediction")
@
\begin{figure}
    \centering
<<label=makenewb,echo=FALSE,fig=TRUE>>=
plot(pt.201$time,1-pt.201$pstate3,ylim=c(0.425,1),type="s",
    lwd=2,col="red",xlab="Prediction time",
    ylab="Relapse-free survival")
lines(pt.20401$time,1-pt.20401$pstate3,type="s",lwd=2,col="blue")
lines(pt.401$time,1-pt.401$pstate3,type="s",lwd=2,col="green")
lines(pt.202$time,1-pt.202$pstate3,type="s",lwd=2,col="red",lty=2)
lines(pt.20402$time,1-pt.20402$pstate3,type="s",lwd=2,col="blue",lty=2)
lines(pt.402$time,1-pt.402$pstate3,type="s",lwd=2,col="green",lty=2)
legend("topleft",c("<=20","20-40",">40"),lwd=2,col=c("red","blue","green"),bty="n")
legend(1,1,c("no PR","PR"),lwd=2,lty=1:2,bty="n")
title(main="Backward prediction")
@
    \caption{Predicted probabilities of 5-years relapse-free survival, conditional on being alive without relapse with (PR) and without \PR (no PR).
    Patients in three age categories.}
    \label{fig:new}
\end{figure}

\section{Competing risks}\label{sec:comprisk}

The data used in Section 3 of \tut is available in \mstate under the name \Robject{aidssi}. See the help file for more information.
<<label=aidssi,keep.source=TRUE>>=
data(aidssi)
si <- aidssi # Just a shorter name
head(si)
table(si$status)
@
To prepare data in long format, it is possible to use \msprep. In this case there is not a huge advantage in using \msprep; the long data may just as easily be prepared directly. Nevertheless we will illustrate the use of \msprep to obtain data in long format. The function \Rfunction{trans.comprisk} prepares a transition matrix for \crs models. The first argument is the number of causes of failure; in the \Rfunarg{names} argument a character vector of length three (the total number of states in the \msm including the failure-free state) may be given. The transition matrix has three states with stte 1 being the failure-free state and the subsequent sttes representing the different causes of failure.
<<label=tmatcr>>=
tmat <- trans.comprisk(2,names=c("event-free","AIDS","SI"))
tmat
@
Now follows the actual call to \msprep.
<<label=dataprep>>=
si$stat1 <- as.numeric(si$status==1)
si$stat2 <- as.numeric(si$status==2)
silong <- msprep(time=c(NA,"time","time"),status=c(NA,"stat1","stat2"),data=si,
                    keep="ccr5",trans=tmat)
@
We can use \events to check whether the number of events from original data (\Robject{si}) corresponds with long data.
<<label=check>>=
events(silong)
@
For the regression analyses to be performed later we add transition-specific covariates. In the context of \crs one could call them cause-specific covariates. Since the factor levels of CCR5 are quite short we keep the default setting (\TRUE) of \Rfunarg{longnames}.
<<label=dummies>>=
silong <- expand.covs(silong,"ccr5")
silong[1:8,] # shows the first four patients in long format, as in tutorial
@
To illustrate the fact that naive Kaplan-Meiers are biased estimators of the probabilities of failing from the different causes of failure, we just make use of the functions in the \Rpackage{survival} package. I am using \coxph below, probably this could be done quicker.
<<label=naive>>=
c1 <- coxph(Surv(time,status) ~ 1, data=silong, subset = (trans==1), method="breslow")
c2 <- coxph(Surv(time,status) ~ 1, data=silong, subset = (trans==2), method="breslow")
h1 <- survfit(c1)
h1 <- data.frame(time=h1$time,surv=h1$surv)
h2 <- survfit(c2)
h2 <- data.frame(time=h2$time,surv=h2$surv)
@
These naive Kaplan-Meier curves are shown in Figure~\ref{fig:Fig2} (Figure 2 in \tut). The Kaplan-Meier estimate of AIDS is plotted as a survival curve, while that of SI appearance is shown as a distribution function. There is some extra code to chop the time at 13 years. This was just done to make the picture prettier.
<<label=fig2,eval=FALSE,keep.source=TRUE>>=
idx1 <- (h1$time<13) # this restricts the plot to the first 13 years
plot(c(0,h1$time[idx1],13),c(1,h1$surv[idx1],min(h1$surv[idx1])),type="s",
    xlim=c(0,13),ylim=c(0,1),xlab="Years from HIV infection",ylab="Probability",lwd=2)
idx2 <- (h2$time<13)
lines(c(0,h2$time[idx2],13),c(0,1-h2$surv[idx2],max(1-h2$surv[idx2])),type="s",lwd=2)
text(8,0.71,adj=0,"AIDS")
text(8,0.32,adj=0,"SI")
@
\begin{figure}
    \centering
<<label=fig2do,echo=FALSE,fig=TRUE,keep.source=TRUE>>=
idx1 <- (h1$time<13) # this restricts the plot to the first 13 years
plot(c(0,h1$time[idx1],13),c(1,h1$surv[idx1],min(h1$surv[idx1])),type="s",
    xlim=c(0,13),ylim=c(0,1),xlab="Years from HIV infection",ylab="Probability",lwd=2)
idx2 <- (h2$time<13)
lines(c(0,h2$time[idx2],13),c(0,1-h2$surv[idx2],max(1-h2$surv[idx2])),type="s",lwd=2)
text(8,0.71,adj=0,"AIDS")
text(8,0.32,adj=0,"SI")
@
    \caption{Estimated survival curve for AIDS and probability of SI appearance, based on the naive Kaplan-Meier estimator.}
    \label{fig:Fig2}
\end{figure}
Cumulative incidence functions can be computed using the function \Cuminc. It takes as main arguments \Rfunarg{time} and \Rfunarg{status}, which can be provided as vectors
<<label=Cuminc1>>=
ci <- Cuminc(time=si$time, status=si$status)
@
or, alternatively, as column names representing time and status, along with a \Rfunarg{data} argument containing these column names.
<<label=Cuminc2>>=
ci <- Cuminc(time="time", status="status", data=aidssi)
@
The result is a data frame containing the failure-free probabilities (\Rcolumn{Surv}) and the \ci functions with their standard errors. Other arguments allow to specify the codes for the causes of failure and a group identifier.
<<label=ci>>=
head(ci); tail(ci)
@
The \ci functions just obtained can be used to reproduce Figure 3 of \tut. The plots are shown in Figure~\ref{fig:Fig3}.
<<label=fig3,eval=FALSE>>=
idx0 <- (ci$time<13)
plot(c(0,ci$time[idx0],13),c(1,1-ci$CI.1[idx0],min(1-ci$CI.1[idx0])),type="s",
  xlim=c(0,13),ylim=c(0,1),xlab="Years from HIV infection",ylab="Probability",lwd=2)
idx1 <- (h1$time<13)
lines(c(0,h1$time[idx1],13),c(1,h1$surv[idx1],min(h1$surv[idx1])),type="s",lwd=2,col=8)
lines(c(0,ci$time[idx0],13),c(0,ci$CI.2[idx0],max(ci$CI.2[idx0])),type="s",lwd=2)
idx2 <- (h2$time<13)
lines(c(0,h2$time[idx2],13),c(0,1-h2$surv[idx2],max(1-h2$surv[idx2])),type="s",lwd=2,col=8)
text(8,0.77,adj=0,"AIDS")
text(8,0.275,adj=0,"SI")
@
\begin{figure}
    \centering
<<label=fig3do,echo=FALSE,fig=TRUE>>=
idx0 <- (ci$time<13)
plot(c(0,ci$time[idx0],13),c(1,1-ci$CI.1[idx0],min(1-ci$CI.1[idx0])),type="s",
  xlim=c(0,13),ylim=c(0,1),xlab="Years from HIV infection",ylab="Probability",lwd=2)
idx1 <- (h1$time<13)
lines(c(0,h1$time[idx1],13),c(1,h1$surv[idx1],min(h1$surv[idx1])),type="s",lwd=2,col=8)
lines(c(0,ci$time[idx0],13),c(0,ci$CI.2[idx0],max(ci$CI.2[idx0])),type="s",lwd=2)
idx2 <- (h2$time<13)
lines(c(0,h2$time[idx2],13),c(0,1-h2$surv[idx2],max(1-h2$surv[idx2])),type="s",lwd=2,col=8)
text(8,0.77,adj=0,"AIDS")
text(8,0.275,adj=0,"SI")
@
    \caption{Estimates of probabilities of AIDS and SI appearance, based on the naive Kaplan-Meier (grey) and on cumulative incidence functions (black).}
    \label{fig:Fig3}
\end{figure}
The stacked plots of Figure 4 of \tut are shown in Figure~\ref{fig:Fig4}.
<<label=fig4,eval=FALSE>>=
idx0 <- (ci$time<13)
plot(c(0,ci$time[idx0]),c(0,ci$CI.1[idx0]),type="s",
  xlim=c(0,13),ylim=c(0,1),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,ci$time[idx0]),c(0,ci$CI.1[idx0]+ci$CI.2[idx0]),type="s",lwd=2)
text(13,0.5*max(ci$CI.1[idx0]),adj=1,"AIDS")
text(13,max(ci$CI.1[idx0])+0.5*max(ci$CI.2[idx0]),adj=1,"SI")
text(13,0.5+0.5*max(ci$CI.1[idx0])+0.5*max(ci$CI.2[idx0]),adj=1,"Event-free")
@
\begin{figure}
    \centering
<<label=fig4do,echo=FALSE,fig=TRUE>>=
idx0 <- (ci$time<13)
plot(c(0,ci$time[idx0]),c(0,ci$CI.1[idx0]),type="s",
  xlim=c(0,13),ylim=c(0,1),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,ci$time[idx0]),c(0,ci$CI.1[idx0]+ci$CI.2[idx0]),type="s",lwd=2)
text(13,0.5*max(ci$CI.1[idx0]),adj=1,"AIDS")
text(13,max(ci$CI.1[idx0])+0.5*max(ci$CI.2[idx0]),adj=1,"SI")
text(13,0.5+0.5*max(ci$CI.1[idx0])+0.5*max(ci$CI.2[idx0]),adj=1,"Event-free")
@
    \caption{Cumulative incidence curves of AIDS and SI appearance. The cumulative incidence functions are stacked; the distances between two curves represent the probabilities of the different events.}
    \label{fig:Fig4}
\end{figure}

\subsection*{Regression}

The section on regression in \tut already shows some \R code and occasional output. Because of the fact that I used \msprep to prepare the long data, occasionally there will be very small differences with the code in \tut. We start with regression on cause-specific hazards. Using the original dataset, we can apply ordinary Cox regression for cause 1 (AIDS), taking only the AIDS cases as events. This is done by specifying \Rcode{status==1} below (observations with status=0 (true censorings) and status=2 (SI) are treated as censorings). Similarly for cause 2 (SI appearance), where \Rcode{status==2} indicates that only failures due to SI appearance are to be treated as events.
<<label=csh1,keep.source=TRUE>>=
coxph(Surv(time, status == 1) ~ ccr5, data = si) # AIDS
coxph(Surv(time, status == 2) ~ ccr5, data = si) # SI appearance
@
The same analysis can be performed using the long format dataset \Robject{silong} in several ways. For instance, as separate Cox regressions.
<<label=csh2>>=
coxph(Surv(time,status) ~ ccr5, data=silong, subset = (trans==1), method="breslow")
coxph(Surv(time,status) ~ ccr5, data=silong, subset = (trans==2), method="breslow")
@
And in a single analysis, using the expanded covariates.
<<label=csh3>>=
coxph(Surv(time, status) ~ ccr5WM.1 + ccr5WM.2 + strata(trans), data = silong)
@
The same model, but now using a covariate by cause interaction.
<<label=csh4>>=
coxph(Surv(time, status) ~ ccr5 * factor(trans) + strata(trans), data = silong)
@
In the model below we assume that the effect of CCR5 on the two cause-specific hazards is equal. The significant effect of the interaction in the model we just saw indicates that this is not a good idea. But, again, this is just for educational purposes.
<<label=csh5>>=
coxph(Surv(time, status) ~ ccr5 + strata(trans), data = silong)
@
There are two alternative ways yielding the same result. First, we can actually leave out the \Rfunarg{strata} term.
<<label=csh6>>=
coxph(Surv(time, status) ~ ccr5, data = silong)
@
Second, since the \Rfunarg{strata} term is not needed we can use \Robject{si}.
<<label=csh7>>=
coxph(Surv(time, status != 0) ~ ccr5, data = si)
@
Note: the actual estimated baseline hazards may be different, whether or not the strata term is used.

Assuming that baseline hazards for AIDS and SI are proportional (this is generally not a realistic assumption by the way, but just for illustration purposes).
<<label=csh7>>=
coxph(Surv(time, status) ~ ccr5WM.1 + ccr5WM.2 + factor(trans), data = silong)
@
Or, again using covariate by cause (transition) interaction.
<<label=csh8>>=
coxph(Surv(time, status) ~ ccr5 * factor(trans), data = silong)
@
Note that, even though patients are replicated in the long format, it is not necessary to use robust standard errors. Any of the previous analyses with the \Robject{silong} dataset gives identical results when a \Rcode{cluster(id)} term is added. For instance,
<<label=csh9>>=
coxph(Surv(time, status) ~ ccr5 * factor(trans) + cluster(id), data = silong)
@
gives the same result as before.

So far in the regression context we have just used the \coxph function of the \survival package. In order to obtain predicted cumulative incidences, \msprep is useful. First let us store our analysis with separate covariate effects for the two causes.
<<label=ci1>>=
c1 <- coxph(Surv(time, status) ~ ccr5WM.1 + ccr5WM.2 + strata(trans), data = silong, method="breslow")
@
If we want the predicted cumulative incidences for an individual with CCR5 wild-type (WW), we make a \Rfunarg{newdata} data frame containing the (transition-specific) covariate values for each of the transitions for the individual of interest. Then we apply \msfit as illustrated earlier in the context of \msms.
<<label=ci12>>=
WW <- data.frame(ccr5WM.1=c(0,0),ccr5WM.2=c(0,0),trans=c(1,2),strata=c(1,2))
msf.WW <- msfit(c1, WW, trans=tmat)
@
And finally, to obtain the \cis we apply \probtrans. Item \Ritem{[[1]]} is selected because the prediction starts from state 1 (event-free) at time $s=0$.
<<label=ci13>>=
pt.WW <- probtrans(msf.WW, 0)[[1]]
@
Similarly for an individual with the CCR5 mutant (WM) genotype.
<<label=ci22>>=
WM <- data.frame(ccr5WM.1=c(1,0),ccr5WM.2=c(0,1),trans=c(1,2),strata=c(1,2))
msf.WM <- msfit(c1, WM, trans=tmat)
pt.WM <- probtrans(msf.WM, 0)[[1]]
@
We now plot these \ci curves for AIDS (\Rcolumn{pstate2}) and SI appearance (\Rcolumn{pstate3}), for wild-type (WW) and mutant (WM) in Figure~\ref{fig:Fig5} (Figure 5 in \tut).
<<label=fig5,eval=FALSE>>=
idx1 <- (pt.WW$time<13)
idx2 <- (pt.WM$time<13)
## AIDS
plot(c(0,pt.WW$time[idx1]),c(0,pt.WW$pstate2[idx1]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,pt.WM$time[idx2]),c(0,pt.WM$pstate2[idx2]),type="s",lwd=2,col=8)
title(main="AIDS")
text(9.2,0.345,"WW",adj=0,cex=0.75)
text(9.2,0.125,"WM",adj=0,cex=0.75)
## SI appearance
plot(c(0,pt.WW$time[idx1]),c(0,pt.WW$pstate3[idx1]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,pt.WM$time[idx2]),c(0,pt.WM$pstate3[idx2]),type="s",lwd=2,col=8)
title(main="SI appearance")
text(7.5,0.31,"WW",adj=0,cex=0.75)
text(7.5,0.245,"WM",adj=0,cex=0.75)
@
\setkeys{Gin}{width=0.45\textwidth}
\begin{figure}
    \centering
<<label=fig5do,echo=FALSE,fig=TRUE>>=
idx1 <- (pt.WW$time<13)
idx2 <- (pt.WM$time<13)
## AIDS
plot(c(0,pt.WW$time[idx1]),c(0,pt.WW$pstate2[idx1]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,pt.WM$time[idx2]),c(0,pt.WM$pstate2[idx2]),type="s",lwd=2,col=8)
title(main="AIDS")
text(9.2,0.345,"WW",adj=0,cex=0.75)
text(9.2,0.125,"WM",adj=0,cex=0.75)
@
<<label=fig5do2,echo=FALSE,fig=TRUE>>=
## SI appearance
plot(c(0,pt.WW$time[idx1]),c(0,pt.WW$pstate3[idx1]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,pt.WM$time[idx2]),c(0,pt.WM$pstate3[idx2]),type="s",lwd=2,col=8)
title(main="SI appearance")
text(7.5,0.31,"WW",adj=0,cex=0.75)
text(7.5,0.245,"WM",adj=0,cex=0.75)
@
    \caption{Cumulative incidence functions for AIDS (left) and SI appearance (right), for wild-type (\texttt{WW})
    and mutant (\texttt{WM}) CCR5 genotype, based on a proportional hazards model on the cause-specific hazards.}
    \label{fig:Fig5}
\end{figure}
\setkeys{Gin}{width=0.65\textwidth}
The illustration of the phenomenon that the same cause-specific hazard ratio may have different effects on the cumulative incidences (Figure 7 in \tut) may be performed as well, by replacing the appropriate parts of the cumulative hazard of AIDS (\Rcolumn{trans}=1), and calling \probtrans. We are interested in SI appearance and adjust the hazards of the competing risk (AIDS) while keeping the remainder the same (Figure 7 in \tut). The result is shown in Figure~\ref{fig:Fig7}. We multiply the baseline hazard of AIDS with factors (\Robject{ff} = 0, 0.5, 1, 1.5, 2, 4).
<<label=fig7,eval=FALSE>>=
ffs <- c(0,0.5,1,1.5,2,4)
newmsf.WW <- msf.WW
newmsf.WM <- msf.WM
par(mfrow=c(2,3))
for (ff in ffs) {
    # WW
    newmsf.WW$Haz$Haz[newmsf.WW$Haz$trans==1] <- ff*msf.WW$Haz$Haz[msf.WW$Haz$trans==1]
    pt.WW <- probtrans(newmsf.WW, 0, variance=FALSE)[[1]]
    # WM
    newmsf.WM$Haz$Haz[newmsf.WM$Haz$trans==1] <- ff*msf.WM$Haz$Haz[msf.WM$Haz$trans==1]
    pt.WM <- probtrans(newmsf.WM, 0, variance=FALSE)[[1]]
    # Plot
    idx1 <- (pt.WW$time<13)
    idx2 <- (pt.WM$time<13)
    plot(c(0,pt.WW$time[idx1]),c(0,pt.WW$pstate3[idx1]),type="s",ylim=c(0,0.52),xlab="Years from HIV infection",ylab="Probability",lwd=2)
    lines(c(0,pt.WM$time[idx2]),c(0,pt.WM$pstate3[idx2]),type="s",lwd=2,col=8)
    title(main=paste("Factor =",ff))
}
par(mfrow=c(1,1))
@
\begin{figure}
    \centering
<<label=fig7do,echo=FALSE,fig=TRUE>>=
### Multiply baseline hazard of AIDS with factors (ff)
### of ff = 0, 0.5, 1, 1.5, 2, 4
ffs <- c(0,0.5,1,1.5,2,4)
newmsf.WW <- msf.WW
newmsf.WM <- msf.WM
par(mfrow=c(2,3))
for (ff in ffs) {
    # WW
    newmsf.WW$Haz$Haz[newmsf.WW$Haz$trans==1] <- ff*msf.WW$Haz$Haz[msf.WW$Haz$trans==1]
    pt.WW <- probtrans(newmsf.WW, 0, variance=FALSE)[[1]]
    # WM
    newmsf.WM$Haz$Haz[newmsf.WM$Haz$trans==1] <- ff*msf.WM$Haz$Haz[msf.WM$Haz$trans==1]
    pt.WM <- probtrans(newmsf.WM, 0, variance=FALSE)[[1]]
    # Plot
    idx1 <- (pt.WW$time<13)
    idx2 <- (pt.WM$time<13)
    plot(c(0,pt.WW$time[idx1]),c(0,pt.WW$pstate3[idx1]),type="s",ylim=c(0,0.52),xlab="Years from HIV infection",ylab="Probability",lwd=2)
    lines(c(0,pt.WM$time[idx2]),c(0,pt.WM$pstate3[idx2]),type="s",lwd=2,col=8)
    title(main=paste("Factor =",ff))
}
par(mfrow=c(1,1))
@
    \caption{Cumulative incidence functions for Si appearance, for CCR5 wild-type \texttt{WW} (black) and mutant \texttt{WM} (grey).
    The baseline hazard of AIDS was multiplied with different factors, while keeping everything else the same.}
    \label{fig:Fig7}
\end{figure}
Fine and Gray regression on cumulative incidence functions is not implemented in \mstate, but in the \R package \Rpackage{cmprsk}. Since our main purpose here is illustration of \mstate, we just give the code and the output.
<<label=FG,keep.source=TRUE>>=
library(cmprsk)
sic <- si[!is.na(si$ccr5),]
ftime <- sic$time
fstatus <- sic$status
cov <- as.numeric(sic$ccr5)-1
# for failures of type 1 (AIDS)
z1 <- crr(ftime,fstatus,cov)
z1
# for failures of type 2 (SI)
z2 <- crr(ftime,fstatus,cov,failcode=2)
z2
@
The result (Figure 8 in \tut) is shown in Figure~\ref{fig:Fig8}.
<<label=fig8,eval=FALSE,keep.source=TRUE>>=
z1.pr <- predict(z1,matrix(c(0,1),2,1))
# this will contain predicted cum inc curves, both for WW (2nd column) and WM (3rd)
z2.pr <- predict(z2,matrix(c(0,1),2,1))
# Standard plots, not shown
par(mfrow=c(1,2))
plot(z1.pr,lty=1,lwd=2,color=c(8,1))
plot(z2.pr,lty=1,lwd=2,color=c(8,1))
par(mfrow=c(1,1))
## AIDS
n1 <- nrow(z1.pr) # remove last jump
plot(c(0,z1.pr[-n1,1]),c(0,z1.pr[-n1,2]),type="s",ylim=c(0,0.5),
    xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,z1.pr[-n1,1]),c(0,z1.pr[-n1,3]),type="s",lwd=2,col=8)
title(main="AIDS")
text(9.3,0.35,"WW",adj=0,cex=0.75)
text(9.3,0.14,"WM",adj=0,cex=0.75)
## SI appearance
n2 <- nrow(z2.pr) # again remove last jump
plot(c(0,z2.pr[-n2,1]),c(0,z2.pr[-n2,2]),type="s",ylim=c(0,0.5),
    xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,z2.pr[-n2,1]),c(0,z2.pr[-n2,3]),type="s",lwd=2,col=8)
title(main="SI appearance")
text(7.9,0.28,"WW",adj=0,cex=0.75)
text(7.9,0.31,"WM",adj=0,cex=0.75)
@
\setkeys{Gin}{width=0.45\textwidth}
\begin{figure}
    \centering
<<label=fig8do,echo=FALSE,fig=TRUE>>=
z1.pr <- predict(z1,matrix(c(0,1),2,1)) # this will contain predicted cum inc curves, both for WW (2nd column) and WM (3rd)
z2.pr <- predict(z2,matrix(c(0,1),2,1))
## AIDS
n1 <- nrow(z1.pr) # remove last jump
plot(c(0,z1.pr[-n1,1]),c(0,z1.pr[-n1,2]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,z1.pr[-n1,1]),c(0,z1.pr[-n1,3]),type="s",lwd=2,col=8)
title(main="AIDS")
text(9.3,0.35,"WW",adj=0,cex=0.75)
text(9.3,0.14,"WM",adj=0,cex=0.75)
@
<<label=fig8do2,echo=FALSE,fig=TRUE>>=
## SI appearance
n2 <- nrow(z2.pr) # again remove last jump
plot(c(0,z2.pr[-n2,1]),c(0,z2.pr[-n2,2]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,z2.pr[-n2,1]),c(0,z2.pr[-n2,3]),type="s",lwd=2,col=8)
title(main="SI appearance")
text(7.9,0.28,"WW",adj=0,cex=0.75)
text(7.9,0.31,"WM",adj=0,cex=0.75)
@
    \caption{Cumulative incidence functions for AIDS (left) and SI appearance (right), for CCR5 wild-type \texttt{WW} and mutant \texttt{WM}, based on the Fine and Gray model.}
    \label{fig:Fig8}
\end{figure}
To judge the "fit" of the cause-specific and Fine \& Gray regression models we estimate cumulative incidence curves nonparametrically, i.e., for two subgroups of WW and WM CCR5-genotypes. Here we can use the \Rfunarg{group} argument of \Cuminc.
<<label=nonp>>=
ci <- Cuminc(si$time,si$status,group=si$ccr5)
ci.WW <- ci[ci$group=="WW",]
ci.WM <- ci[ci$group=="WM",]
@
We show these nonparametric estimates in Figure~\ref{fig:Fig9} (Figure 9 in \tut).
<<label=Fig9,eval=FALSE>>=
idx1 <- (ci.WW$time<13)
idx2 <- (ci.WM$time<13)
# AIDS
plot(c(0,ci.WW$time[idx1]),c(0,ci.WW$CI.1[idx1]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,ci.WM$time[idx2]),c(0,ci.WM$CI.1[idx2]),type="s",lwd=2,col=8)
title(main="AIDS")
text(9.3,0.35,"WW",adj=0,cex=0.75)
text(9.3,0.11,"WM",adj=0,cex=0.75)
# SI appearance
plot(c(0,ci.WW$time[idx1]),c(0,ci.WW$CI.2[idx1]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,ci.WM$time[idx2]),c(0,ci.WM$CI.2[idx2]),type="s",lwd=2,col=8)
title(main="SI appearance")
text(7.9,0.32,"WW",adj=0,cex=0.75)
text(7.9,0.245,"WM",adj=0,cex=0.75)
@
\begin{figure}
    \centering
<<label=Fig9do,echo=FALSE,fig=TRUE>>=
idx1 <- (ci.WW$time<13)
idx2 <- (ci.WM$time<13)
# AIDS
plot(c(0,ci.WW$time[idx1]),c(0,ci.WW$CI.1[idx1]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,ci.WM$time[idx2]),c(0,ci.WM$CI.1[idx2]),type="s",lwd=2,col=8)
title(main="AIDS")
text(9.3,0.35,"WW",adj=0,cex=0.75)
text(9.3,0.11,"WM",adj=0,cex=0.75)
@
<<label=Fig9do2,echo=FALSE,fig=TRUE>>=
# SI appearance
plot(c(0,ci.WW$time[idx1]),c(0,ci.WW$CI.2[idx1]),type="s",ylim=c(0,0.5),xlab="Years from HIV infection",ylab="Probability",lwd=2)
lines(c(0,ci.WM$time[idx2]),c(0,ci.WM$CI.2[idx2]),type="s",lwd=2,col=8)
title(main="SI appearance")
text(7.9,0.32,"WW",adj=0,cex=0.75)
text(7.9,0.245,"WM",adj=0,cex=0.75)
@
    \caption{Non-parametric cumulative incidence functions for AIDS (left) and SI appearance (right), for CCR5 wild-type \texttt{WW} and mutant \texttt{WM}.}
    \label{fig:Fig9}
\end{figure}
\setkeys{Gin}{width=0.65\textwidth}
%\section{Discussion}\label{sec:discussion}
%\setcounter{equation}{0}
%\markright{}

\bibliographystyle{agsm}
\bibliography{Tutorial}

\end{document}