USUBJID | TRTAN | OS | OS.status | TTP | TTP.status |
---|---|---|---|---|---|
015246-036-0001-00001 | 2 | 1430 | 0 | 376 | 0 |
015246-036-0001-00002 | 2 | 535 | 1 | 313 | 1 |
2024-01-23
Introduction of multi-state model
Multi-state model
Application to three states data
What if the markov assumption is not respected ?
Conclusion
Classical survival analysis refers to the analysis of time to event (longitudinal data)
. The response is often referred as a failure time, survival time, or event time.
multiple events or states
for an individual over the time.It is used to describe the dynamic of transitions between states
(different events).
The study of these models consists in analyzing the transition forces (transition intensities) between the different states.
Obviously, the events cannot be seen as independent
.
Example graphs corresponding to the survival model, the 2-state competing risk model and the illness-death model.
The survival model is the classical case where the individual can only move from the state 1 to the state 2
.
The competing riks model is a special case of the multi-state model. Each transition leads to an absorbing state
.
One exemple mainly used in epidemology is the Illness-death model
. In our example, the state 2 is the diseased state and the state 3 is the death state.
In this presentation, we will focus on the illness-death model :
Exemple of a dataset.
The dataset contains 6 variables : USUBJID (patient ID), TRTAN (treatment), OS (overall survival), OS.status (status indicator for OS), TTP (time to progression), TTP.status (status indicator for TTP). It contains 299 patients.
USUBJID | TRTAN | OS | OS.status | TTP | TTP.status |
---|---|---|---|---|---|
015246-036-0001-00001 | 2 | 1430 | 0 | 376 | 0 |
015246-036-0001-00002 | 2 | 535 | 1 | 313 | 1 |
The value of the matrix corresponds to the name of the transition
.
basal | progression | death | |
---|---|---|---|
basal | NA | 1 | 2 |
progression | NA | NA | 3 |
death | NA | NA | NA |
Then, we need to transform the data in a long format.
In the long format, each row corresponds to a transition. It exists also some convention for the time and the status.
USUBJID | from | to | trans | Tstart | Tstop | time | status | TRTAN |
---|---|---|---|---|---|---|---|---|
015246-036-0001-00001 | 1 | 2 | 1 | 0 | 376 | 376 | 0 | 2 |
015246-036-0001-00001 | 1 | 3 | 2 | 0 | 376 | 376 | 0 | 2 |
015246-036-0001-00002 | 1 | 2 | 1 | 0 | 313 | 313 | 1 | 2 |
015246-036-0001-00002 | 1 | 3 | 2 | 0 | 313 | 313 | 0 | 2 |
015246-036-0001-00002 | 2 | 3 | 3 | 313 | 535 | 222 | 1 | 2 |
USUBJID | TRTAN | OS | OS.status | TTP | TTP.status |
---|---|---|---|---|---|
015246-036-0001-00002 | 2 | 535 | 1 | 313 | 1 |
015246-036-0001-00003 | 2 | 1373 | 0 | 77 | 1 |
Here’s the illustration on how to read the data (by convention of the long format) :
Extension of traditional
binary outcomes can’t fully describe the dynamic of the disease :
summarising information
: death/progression as unwanted eventstatic analysis
: the dynamic of the disease is not taken into accountdynamic of transitions
between states :
introduce more complex model
: certain states can be visited multiple times (multiple remissions)more granular analysis
: probability of being in a state at a given time, probability of moving from one state to another, hazard ratio of a treatment for each transition, etc…For each individual, we observe: \(T_{i}=X_{i} \bigcap C_{i}\) and \(\delta_{i}=1_{X_{i} \leq C_{i}}\) with \(\delta_{i}\) the indicator of censoring (where \(\delta_{i}=1\) if \(T_{i}=X_{i}\) and \(\delta_{i}=0\) if \(T_{i}=C_{i}\)).
We can define function of interest as :
Each of these function remains valid in the multi-state context. We just need to specify the state of interest (from x to y).
The previous notations for multi-state analysis is the basis for all multi-state model.
The estimators can be estimated based on the counting process properties.
It exists some assumptions to simplify the calculations of these estimators :
Presentation of the quantities of interest in multi-state model (transition dependent quantities):
Probability of progression free survival time (Time until a progression or death) : \(P(X>t)\) (ex : Kaplan meier estimator on summarized information)
Probability of being still alive at time t : \(P(X>t)\) (ex : Kaplan meier estimator on overall survival time)
The probability of staying in state 1 : \(P_{11}(t)\)
The opposite probability of death in being in state 1 : \(1- P_{13}(t)\)
The following estimators are a generalization to Markov models estimators.
Reminder : non-homogeneous Markov model is a Markov model where the instantaneous hazard function depends only of the time of follow-up (\(\alpha_{hj}(t,d)=\alpha_{hj}(t)\)).
Non parametric estimators can be seen as purely descriptive
. Covariates can’t be included in the estimators and prediction is not possible.
Function of interest | Estimator | Simple survival model | Multi-state model (by transition) |
---|---|---|---|
Cumulative hazard function | Nelson-Aalen | \(\hat{\Lambda}(t)=\sum_{i:t_{j} \leq t} (\frac{d_{i}}{n_{i}})\) | \(\hat{A}_{hj}(t)=\int_{0}^{t}\frac{J_{h}(u)}{Y_{h}(u)}dN_{hj}(u)\) [1] |
Probability of transition | Aalen-Johansen | no equivalent | \(\hat{P}(s,t)=\prod_{l:t_{l} \leq t} (Id + \Delta \hat{A}(T_{l}))\) [2] |
Note : In the following examples, we will consider that the censoring is right censoring and that the censoring is i.i.d (independent and identically distributed) and not informative.
[1] Details and demonstrations in appendix
[2] Goodman and Johansen (1973)
Some strong assumptions are necessary: \(\alpha_{hj,i}(t)=\alpha_{hj,0}(t)exp(\beta^{T}Z_{i})\) (proportional risk model). It is the same assumption as the classical Cox model.
Semi parametric estimators enable to include covariates
in the estimators. Prediction
is possible.
Function of interest | Estimator | Simple survival model | Multi-state model (by transition) |
---|---|---|---|
The baseline cumulative hazard function | Breslow | \(\Lambda_{0}(t)=\sum_{i:T_{i} \leq t} \frac{d_{i}}{\sum_{j \in R(T_{i})} exp(\hat{\beta'}*Z_{j})}\) | \(A_hj0(t)=\int_0^t \frac{J_h(u)}{Y_h(u)*exp(\beta^T_{hj}*Z_{i})}dN_hj(u)\) |
Regression coefficient \(\beta\) | Cox model | maximum partial likelihood | maximum partial likelihood |
Probability of transition | Cox model | no equivalent | \(\hat{P}(s,t|Z)=\prod_{l:t_{l} \leq t} (Id + \Delta \hat{A}(T_{l}|Z))\) |
WARNING :
The semi-parametric assumption have to be checked (Graphical method, Schoenfeld residuals, etc.)
USUBJID | TRTAN | OS | OS.status | TTP | TTP.status |
---|---|---|---|---|---|
015246-036-0001-00001 | 2 | 1430 | 0 | 376 | 0 |
015246-036-0001-00002 | 2 | 535 | 1 | 313 | 1 |
015246-036-0001-00003 | 2 | 1373 | 0 | 77 | 1 |
In usual survival analysis, the study considers :
Progression-free survival (PFS)
time defined by the time from randomization until disease progression or death from any cause.
Overall survival
time from randomization until death from any cause.
In order to compare the multi-state model with the classical survival analysis, we will first analyse the data with the classical survival analysis
.
First, we will summarise the data : death and progession as unwanted event.
```{r}
#| code-line-numbers: "5-8"
datable$status <- ifelse(datable$OS.status == 1 | datable$TTP.status == 1, 1, 0)
# if OS < TTP and OS.status =1 and TTP.status =1, then time = OS
# if OS.status = 1 and TTP.status = 0, then time = OS
# if OS.status = 0 and TTP.status = 1, then time = TTP
# if OS.status = 0 and TTP.status = 0, then time = min(OS, TTP)
datable$time <- ifelse(datable$OS.status == 1 & datable$TTP.status == 1, pmin(datable$OS,datable$TTP), ifelse(datable$OS.status == 1 & datable$TTP.status == 0, datable$OS, ifelse(datable$OS.status == 0 & datable$TTP.status == 1, datable$TTP, ifelse(datable$OS.status == 0 & datable$TTP.status == 0, pmin(datable$OS, datable$TTP), NA))))
datable$OS <- datable$OS/30
datable$TTP <- datable$TTP/30
datable$time <- datable$time/30
```
Progression free survival : time until a progression or death
```{r}
# Kaplan meier of proportion of surviving without progression
km <- survfit(Surv(time, status) ~ strata(TRTAN), data = datable)
km
```
Call: survfit(formula = Surv(time, status) ~ strata(TRTAN), data = datable)
n events median 0.95LCL 0.95UCL
strata(TRTAN)=TRTAN=1 122 86 19.3 16.3 25.3
strata(TRTAN)=TRTAN=2 177 98 31.6 25.1 43.0
The median follow-up time for progression-free survival was 19.3 months (95% CI: 16.3-25.3) in the treatment group 1
and 31.6 months (95% CI: 25.1-43) in the treatment group 2
.
After checking the proportional risk hypothesis, we can calculate the hazard ratio :
```{r}
c1 <- coxph(Surv(time,status)~ TRTAN,data=datable)
c1
# CI for HR 95%
exp(confint(c1,level=0.95))
```
Call:
coxph(formula = Surv(time, status) ~ TRTAN, data = datable)
coef exp(coef) se(coef) z p
TRTAN -0.5427 0.5812 0.1492 -3.637 0.000276
Likelihood ratio test=12.94 on 1 df, p=0.0003216
n= 299, number of events= 184
2.5 % 97.5 %
TRTAN 0.4337851 0.7786188
The hazard ratio for the treatment 1 vs treatment 2 is 0.58 (95% CI: 0.43-0.78)
.
The treatment 2 is associated with a lower risk of progression or death (p-value : 0.0003) : The PFS time decreases by 42% with the treatment 2
compared to the treatment 1.
Overall survival : Time from randomization until death from any cause
Call:
coxph(formula = Surv(OS, OS.status) ~ TRTAN, data = datable)
coef exp(coef) se(coef) z p
TRTAN -0.2618 0.7696 0.1855 -1.411 0.158
Likelihood ratio test=1.97 on 1 df, p=0.1602
n= 299, number of events= 117
The overall survival seems to be identical between the two treatments graphically.
It is confirmed by the HR : The effect of the treatment on the overall survival is not significant (p-value : 0.16).
The treatment 2 is associated with a lower risk of progression or death (p-value : 0.0003) : The PFS time decreases by 42% with the treatment 2
compared to the treatment 1.
The effect of the treatment on the overall survival is not significant (p-value : 0.16).
It can be resumed by the following graph :
For computing the estimators in the context of non-homogeneous Markov model, the packages mstate
, msSurv
and etm
can be used.
Before analysing the data with the multi-state model, we need to check the assumptions of the markov multi-state model in using the package mstate
.
It exists some functions which check the assumptions of the markov multi-state model in applying log-rank test for each transition (recommanded in Titman and Putter (2020) detailed in appendix).
Reminder : Nelson-Aalen is an estimator for the cumulative hazard function and can be expressed as \(\hat{A}_{hj}(t)=\int_{0}^{t}\frac{J_{h}(u)}{Y_{h}(u)}dN_{hj}(u)\)
Thanks to this estimator, the probability of transition can be computed by Aalen-Johansen estimator [1].
[1] Details and demonstration in appendix
Non-parametric estimator for the probability of transition. It can be expressed as \(P_{hj}(s,t)= P(X_{t}=j | X_{s}=h)\).
What the plot shows is \(P_{1-j}(0,t)\), the probability of being in state 1 at time 0 and in state j at time t :
What the plot shows is \(P_{2-j}(0,t)\), the probability of being in state 2 at time 0 and in state j at time t :
What the plot shows is \(P_{3-j}(0,t)\), the probability of being in state 3 at time 0 and in state j at time t :
Semi-parametric estimator of the probability of transition. It enables prediction.
The treatment is the only covariate included in the model. The covariates have to be in long format also with 1 information per transition : TRTAN2.1 (logical value on treatment 2 in transition 1), TRTAN2.2, TRTAN2.3.
id | trans | TRTAN | TRTAN2.1 | TRTAN2.2 | TRTAN2.3 | |
---|---|---|---|---|---|---|
1 | 015246-036-0001-00001 | 1 | 2 | 1 | 0 | 0 |
2 | 015246-036-0001-00001 | 2 | 2 | 0 | 1 | 0 |
5 | 015246-036-0001-00002 | 3 | 2 | 0 | 0 | 1 |
```{r}
fit_cox <- coxph(Surv(Tstart,Tstop,status)~ TRTAN2.1+TRTAN2.2+TRTAN2.3 +strata(trans), id=id, data = datable_long_transformed, method="breslow")
fit_cox
#exp(confint(fit_cox,level=0.95))
```
Call:
coxph(formula = Surv(Tstart, Tstop, status) ~ TRTAN2.1 + TRTAN2.2 +
TRTAN2.3 + strata(trans), data = datable_long_transformed,
method = "breslow", id = id)
coef exp(coef) se(coef) robust se z p
TRTAN2.1 -0.6139 0.5412 0.1723 0.1712 -3.586 0.000335
TRTAN2.2 -0.3274 0.7208 0.3017 0.3023 -1.083 0.278768
TRTAN2.3 0.1465 1.1577 0.2397 0.2238 0.654 0.512822
Likelihood ratio test=13.99 on 3 df, p=0.002915
n= 735, number of events= 254
The p-value (p) must be less than 0.05 to admit the significance of a coefficient.
To help our understanding :
Transition | Coefficient regression for Treatment 2 | HR (CI 95%) | p-value for HR |
---|---|---|---|
1 | -0.6139 (0.17) | 0.54 (0.38-0.75) | 0.0003 |
2 | -0.3274 (0.17) | 0.72 (0.40-1.30) | 0.28 |
3 | 0.1465 (0.17) | 1.15 (0.75-1.8) | 0.52 |
In using this method, coefficient regression then HR (exp(coef)) are computed for each transition.
For transition 1 (from initial state to progression): The hazard ratio for the treatment 1 vs treatment 2 is 0.54 (95% CI: 0.38-0.75). The treatment 2 is significantly (p-value : 0.0003) associated with a lower risk of progression (reduction of 46% of having a progression)
.
For transition 2 (from initial state to death): The hazard ratio for the treatment 1 vs treatment 2 is 0.72 (95% CI: 0.40-1.30). The treatment 2 is not significantly (p-value=0.28) associated with a lower risk of death from the initial state.
For transition 3 (from progression to death) : The hazard ratio for the treatment 1 vs treatment 2 is 1.15 (95% CI: 0.75-1.8). The treatment 2 is not significantly (p-value=0.52) associated with a lower risk of death from progression state.
The plot shows the prediction probability of transition in having the treatment 1 from the initial state.
The plot shows the prediction probability of transition in having the treatment 2 from the initial state.
It confirms visually the results of the cox model : the treatment 2 gives a higher probability of staying in the initial state
(basal) than the treatment 1.
Example of question : what is the probability of death for patient having treatment 2 and a progression at time 10 months ?
The answer
Basically, the answer of the question is \(P_{2-3}(10,t)\), the probability of being in state 2 (prgression) at time 10 and in state 3 (death) at time t :
PFS
for multi-state analysis : \(p_{11}\)
Overall survival
for multi-state analysis : 1-\(p_{13}\)
PFS
for usual survival analysis :
Overall survival
for usual survival analysis :
The markov assumption is very strong.
In some case, the validity of markov assumption can be discussed :
When a patient is in a progression state (state 2), the probability of death could be different if the patient has a progression at time 1 month or at time 20 months.
In a more complex setting (with possible multiple remission, progression, etc.), the markov assumption cannot take into account the number of total progression for the probability of transition to death.
Semi-markov model can be set as an alternative model to the markov model.
In the simple case (homogeneous semi-markov model), the quantities to compute are different :
These two quantities are used to compute the intensity of transition then the probability of transition.
The package 'SemiMarkov'
can be used to compute semi-markov model.
The multi-state model enables to analyse the dynamic of the disease :
In our case, we have seen :
Instead of
Whereas the ordinary integral of a function a provides a solution of the equation :
\[ y'(x) = a(x) \]
The product integral helps us to find solutions of the equation : \[ y'(x) = a(x)y(x) \]
The integral product is introduced by Voltera in 1887 :
\(\prod_{s \leq u \leq t} (Id + d\hat{A}(u)) = \lim_{max |t_{i}-t_{i-1}| \to 0 } (Id + \hat{A}(T_{i})-\hat{A}(T_{i-1}))\)
The probability of transition of a Markov process verify :
\[ \begin{aligned} & \forall i, j \in S=\{1, \ldots, k\} \text { et } \forall 0<s<u<t, \\ & p_{h j}(s, t)=\sum_{k \in S} p_{h k}(s, u) p_{k j}(u, t), \end{aligned} \]
Thus, the probability of transition can be expressed by the Kolmogorov forward equation :
\[ \frac{\delta P_{hj}(s,t)}{\delta t} = P(s,t)*A(dt) \]
The derivate of the probability of transition is equal to the probability of transition multiplied by the infinitesimal cumulative hazard matrix.
We need to introduce other notations for multi-state analysis :
\(N_{hj}(t)=\sum_{i:1}^{n}1_{X_{i} \leq t}\) is the number of transitions between the state h and j that have occurred at time t for all patients. \(N_{hj}(t)\) is a right statusored counting process.
Thanks to some counting process properties : \[
N_{hj}(t)=\Delta_{hj}(t)+M_{hj}(t)
\]
with \[
\Delta_{hj}(t)=\int_{0}^{t} \lambda_{hj}(u)du = \int_{0}^{t} \alpha_{hj}(u) Y_{h}(u)du
\] and \(M_{hj}(t)\) a local martingale (random noise with \(E(M)=0\)) SAINT PIERRE (2021).
WARNING : \(\Delta_{hj}(t)\) is not the cumulative hazard function.
NOTE : \(\alpha_{hj}(t)\) can be seen as the instantaneous hazard function of the transition from the state h to the state j at time t (\(\alpha_{hj}(s)=\lim_{\Delta t \to 0} \frac{p_{hj}(s, s+\Delta t )}{\Delta t}\)).
\(N_{hj}(t)=\sum_{i:1}^{n}1_{X_{i} \leq t}\) is the number of transitions between the state h and j that have occurred at time t for all patients. \(N_{hj}(t)\) is a right statusored counting process.
Thus (appendix), thanks to Doob-Meyer decomposition, we can decompose the counting process as a sum of a predictable process (\(\Delta\)) and a martingale (\(M\)) : \(N_{hj}(t)=\Delta_{hj}(t)+M_{hj}(t)\)
with \(\Delta_{hj}(t)=\int_{0}^{t} \lambda_{hj}(u)du = \int_{0}^{t} \alpha_{hj}(u) Y_{h}(u)du\),
with \(\lambda\) the intensity of the counting process \(N\) (Property), \(Y_{h}(t)=\sum_{i:1}^{n}1_{X_{i} \geq t}\) (the number of patients at risk in the state h at time t) and \(M_{hj}(t)\) a local martingale (random noise with \(E(M)=0\)).
WARNING : \(\Delta_{hj}(t)\) is not the cumulative hazard function.
Note : \(\alpha_{hj}(t)\) can be seen as the instantaneous hazard function of the transition from the state h to the state j at time t (\(\alpha_{hj}(s)=\lim_{\Delta t \to 0} \frac{p_{hj}(s, s+\Delta t )}{\Delta t}\)).
Assumption : Non-homogeneous markov model (the instantaneous hazard function depends only of the time of follow-up).
By definition :
1- the counting process is equal to \(N(t)=\Delta(t)+M(t)\) with \(\Delta(t)=\int_{0}^{t}\alpha(u)Y(u)du\).
2- \(M(t)\) is a local martingale thus \(E(M(t))=0\) and considered as random noise.
3- the cumulative hazard function is \(A(t)=\int_{0}^{t}\alpha(u)du\) with \(\alpha(u)\) the instantaneous hazard function.
\[ \begin{aligned} N(t) &=\Delta(t)+M(t) \text{ by (1)} \\ \Rightarrow \frac{dN(t)}{dt} &=\frac{d\Delta(t)}{dt}+\frac{dM(t)}{dt} \\ \Rightarrow dN(t) &=\alpha(t)Y_{h}(t)dt+dM(t) \\ \Rightarrow \alpha(t) &=(dN(t)-dM(t))*\frac{1}{Y(t)} \text{ by (2) as M is a random noise}\\ \Rightarrow A(t) &= \int_{0}^{t}\frac{1}{Y(t)}*dN(t) \text{ by (3)}\\ \Rightarrow A(t) &= \int_{0}^{t}\frac{J(t)}{Y(t)}*dN(t) \text{ with $J(t)=1_{Y(t) \geq 0}$}\\ \end{aligned} \]