--- title: Time inhomogeneous Markov cohort models date: "`r Sys.Date()`" output: html_vignette: toc: yes toc_depth: 2 number_sections: TRUE pkgdown: as_is: false vignette: > %\VignetteIndexEntry{Time inhomogeneous Markov cohort models} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- # Overview We will illustrate a cohort discrete time state transition model (cDTSTM) with a time inhomogeneous Markov process by replicating the total hip replacement (THR) example from the [*Decision Modeling for Health Economic Evaluation*](https://www.herc.ox.ac.uk/downloads/decision-modelling-for-health-economic-evaluation) textbook. The analysis compares two treatment strategies, a "standard" prosthesis and a "new" prosthesis. A probabilistic sensitivity analysis (PSA) will be used to quantify parameter uncertainty. The model consists 5 health states: (i) primary THR, (ii) successful primary, (iii) revision THR, (iv) successful revision, and (v) death. ```{r, out.width = "700px", echo = FALSE} knitr::include_graphics("markov-inhomogeneous.png") ``` The model is time-inhomogeneous because (i) the prosthesis survival time is modeled using a Weibull survival model so that the probability of failure is time-varying and (ii) background mortality rates increase as patients age. As in the [Simple Markov cohort model](markov-cohort.html) vignette, we define the model in terms of expressions with `define_model()`. We demonstrate an alternative approach in which we construct the transition probability matrices directly in the [Appendix](#standard-eval). # Model setup We set up the model for two treatment strategies and 10 subgroups defined by age and sex. ```{r, warning = FALSE, message = FALSE} library("hesim") library("data.table") # Treatment strategies strategies <- data.table(strategy_id = 1:2, strategy_name = c("Standard prosthesis", "New prosthesis")) # Patients ages <- seq(55, 75, 5) age_weights <- c(.05, .1, .4, .25, .20) sex_weights <- c(.65, .35) weights <- rep(age_weights, times = 2) * rep(sex_weights, each = length(ages)) patients <- data.table(patient_id = 1:10, grp_id = 1:10, sex = rep(c("Female", "Male"), each = length(ages)), age = rep(ages, times = 2), patient_wt = weights) # Health states states <- data.table(state_id = 1:4, state_name = c("Primary THR", "Successful primary", "Revision THR", "Successful revision")) # Combined data hesim_dat <- hesim_data(strategies = strategies, patients = patients, states = states) print(hesim_dat) ``` `get_labels()` is used to obtain nice labels for plots and summary tables. ```{r} labs <- get_labels(hesim_dat) print(labs) ``` # Model parameters ## Estimates from the literature ### Transition probabilities Transition probabilities from state $i$ (rows) to state $j$ (columns) are determined using the parameters shown in the table below: * **C**: The complement of other probabilities in a given row * **omrPTHR**: Operative mortality rate following primary THR * **rr**: Revision rate for prosthesis * **mr**: Age and sex-specific mortality rates * **omrRTHR**: Operative mortality rate following revision THR * **rrr**: Re-revision rate ```{r, echo = FALSE} library("kableExtra") tpmat <- matrix( c(0, "C", 0, 0, "omrPTHR", 0, "C", "rr", 0, "mr", 0, 0, 0, "C", "omrRTHR + mr", 0, 0, "rrr", "C", "mr", 0, 0, 0, 0, 1), nrow = 5, ncol = 5, byrow = TRUE ) colnames(tpmat) <- rownames(tpmat) <- names(labs$state_id) knitr::kable(tpmat) %>% kable_styling() ``` Three parameters of the transition probability matrix are assumed to be constant in the model: (i) the operative mortality rate following primary THR, *omrPTHR*, (ii) the operative mortality rate following revision THR, *omrRTHR*, and the re-revision rate, *rrr*. In a sample of 100 patients receiving primary THR 2 died implying that *omrPTHR* can be characterized by a beta distribution with $\alpha = 2$ and $\beta = 98$. Similarly, in a sample of 100 patients experiencing a revision procedure, four patients had another procedure within one year suggesting that *rrr* can be characterized by a beta distribution with $\alpha = 4$ and $\beta = 96$. Finally. there was information available for *omrRTHR*, so it assumed to follow the same beta distribution ($\alpha = 2$ and $\beta = 98$) as *omrPTHR*. The remaining parameters, *rr* and *mr*, are time-varying. The yearly mortality rates, *mr*, are stratified by age and sex. ```{r, echo = FALSE} mort_tbl <- rbind( c(35, 45, .00151, .00099), c(45, 55, .00393, .0026), c(55, 65, .0109, .0067), c(65, 75, .0316, .0193), c(75, 85, .0801, .0535), c(85, Inf, .1879, .1548) ) colnames(mort_tbl) <- c("age_lower", "age_upper", "male", "female") mort_tbl <- data.frame(mort_tbl) ``` ```{r} print(mort_tbl) ``` Revision rates, *rr*, were modeled using a proportional hazards Weibull model. The scale parameter was modeled as a function of age and indicators for male sex and and whether a new prosthesis was used. That is, given a scale parameter $\lambda$ and a shape parameter $\gamma$, the survival function is, $$ S(t) = \exp(-\lambda t^\gamma) \\ \lambda = \exp(\beta_0 + \beta_1 \cdot age + \beta_2 \cdot male + \beta_3 \cdot new) \\ $$ Note that this Weibull distribution corresponds to the distribution implemented in `flexsurv::pweibullPH()` with $a = \gamma$ and $m = \lambda$. It can be shown (see Appendix below) that the transition probabilities (with a one-year model cycle) are given by, $$ 1 - \exp \left\{\lambda\left[(t-1)^\gamma - t^\gamma \right] \right\} $$ The coefficients from the regression model and the variance-covariance matrix used for the PSA are stored in `rr_coef` and `rr_vcov`, respectively. ```{r, echo = FALSE} # Coefficients rr_coef <- c(0.3740968, -5.490935, -0.0367022, 0.768536, -1.344474) names(rr_coef) <- c("lngamma", "cons", "age", "male", "np1") # Variance-covariance matrix rr_vcov <- matrix( c(0.0474501^2, -0.005691, 0.000000028, 0.0000051, 0.000259, -0.005691, 0.207892^2, -0.000783, -0.007247, -0.000642, 0.000000028, -0.000783, 0.0052112^2, 0.000033, -0.000111, 0.0000051, -0.007247, 0.000033, 0.109066^2, 0.000184, 0.000259, -0.000642, -0.000111, 0.000184, 0.3825815^2), ncol = 5, nrow = 5, byrow = TRUE ) rownames(rr_vcov) <- colnames(rr_vcov) <- names(rr_coef) ``` ```{r} print(rr_coef) print(rr_vcov) ``` ### Utility The mean (standard error) of utility was estimated to be 0.85 (0.03), 0.30 (0.03), and 0.75 (0.04) in the successful primary, revision, and successful revision health states, respectively. ### Costs The largest costs are the cost of the prostheses themselves. The standard prosthesis costs $£394$ while the new prosthesis costs $£579$. Both are assumed to be known with certainty. The model assumes that there are no ongoing medical costs. The only remaining cost is therefore the cost of the revision procedure, which was estimated to have a mean of $£5,294$ and standard error of $£1,487$. ### Combining all parameters All underlying parameter estimates are stored in a list. ```{r} params <- list( # Transition probabilities ## Operative mortality following primary THR omrPTHR_shape1 = 2, omrPTHR_shape2 = 98, ## Revision rate for prosthesis rr_coef = rr_coef, rr_vcov = rr_vcov, ## Mortality_rates mr = mort_tbl, ## Operative mortality following revision THR omrRTHR_shape1 = 2, omrRTHR_shape2 = 98, ## re-revision rate rrr_shape1 = 4, rrr_shape2 = 96, # Utility u_mean = c(PrimaryTHR = 0, SuccessP = .85, Revision = .30, SuccessR = .75), u_se = c(PrimaryTHR = 0, SuccessP = .03, Revision = .03, SuccessR = .04), # Costs c_med_mean = c(PrimaryTHR = 0, SuccessP = 0, Revision = 5294, SuccessR = 0), c_med_se = c(PrimaryTHR = 0, SuccessP = 0, Revision = 1487, SuccessR = 0), c_Standard = 394, c_NP1 = 579 ) ``` ## Random number generation As noted above, *omrPTHR*, *omrRTHR*, and *rrr* are drawn from beta distributions with $\alpha$ and $\beta$ (i.e. `shape1` and `shape2`) specified. Similarly, utility is drawn from a beta distribution, but `shape1` and `shape2` are derived from the mean and standard error using the method of moments. The mortality rate and the cost of the prostheses are assumed to be known with certainty. The medical costs associated with health states are drawn from a gamma distribution, for which, like utility, the underlying parameters are derived from the mean and standard error using the method of moments. Finally, the parameters of the Weibull survival model are drawn from a multivariate normal distribution using the point estimates and the variance-covariance matrix. 500 samples will be drawn for the PSA. ```{r} rng_def <- define_rng({ list( omrPTHR = beta_rng(shape1 = omrPTHR_shape1, shape2 = omrPTHR_shape2), rr_coef = multi_normal_rng(mu = rr_coef, Sigma = rr_vcov), mr_male = fixed(mr$male, names = mr$age_lower), mr_female = fixed(mr$female, names = mr$age_lower), omrRTHR = beta_rng(shape1 = omrRTHR_shape1, shape2 = omrRTHR_shape2), rrr = beta_rng(shape1 = rrr_shape1, shape2 = rrr_shape2), u = beta_rng(mean = u_mean, sd = u_se), c_med = gamma_rng(mean = c_med_mean, sd = c_med_se), c_Standard = c_Standard, c_NP1 = c_NP1 ) }, n = 500) ``` ## Transformed parameters The sampled parameter values are "transformed" as a function of input data, which consists of all treatment strategy and patient combinations. ```{r} input_data <- expand(hesim_dat, by = c("strategies", "patients")) head(input_data) ``` When evaluating the `define_tparams()` expression, operations are vectorized and performed for each combination of the sampled parameters, input data, and time periods. The model will be simulated for 60 model cycles (i.e., 60 years) so we will create 60 time intervals of length 1. Separate transformation functions are used for the transition model and the cost/utility models. This is done for computational efficiency since only the transition model depends on cycle time. In the transition model, the revision rate (*rr*) depends on the scale and the shape parameters, which, in turn, depend on the sampled values of the parameters from the Weibull model. The mortality rate depends on sex and a patient's age during each model cycle. ```{r} transitions_def <- define_tparams({ # Regression for revision risk male <- ifelse(sex == "Female", 0, 1) np1 <- ifelse(strategy_name == "Standard prosthesis", 0, 1) scale <- exp(rr_coef$cons + rr_coef$age * age + rr_coef$male * male + rr_coef$np1 * np1) shape <- exp(rr_coef$lngamma) rr <- 1 - exp(scale * ((time - 1)^shape - time^shape)) # Mortality rate age_new <- age + time mr <- mr_female[["35"]] * (sex == "Female" & age_new >= 35 & age_new < 45) + mr_female[["45"]] * (sex == "Female" & age_new >= 45 & age_new < 55) + mr_female[["55"]] * (sex == "Female" & age_new >= 55 & age_new < 65) + mr_female[["65"]] * (sex == "Female" & age_new >= 65 & age_new < 75) + mr_female[["75"]] * (sex == "Female" & age_new >= 75 & age_new < 85) + mr_female[["85"]] * (sex == "Female" & age_new >= 85) + mr_male[["35"]] * (sex == "Male" & age_new >= 35 & age_new < 45) + mr_male[["45"]] * (sex == "Male" & age_new >= 45 & age_new < 55) + mr_male[["55"]] * (sex == "Male" & age_new >= 55 & age_new < 65) + mr_male[["65"]] * (sex == "Male" & age_new >= 65 & age_new < 75) + mr_male[["75"]] * (sex == "Male" & age_new >= 75 & age_new < 85) + mr_male[["85"]] * (sex == "Male" & age_new >= 85) list( tpmatrix = tpmatrix( 0, C, 0, 0, omrPTHR, 0, C, rr, 0, mr, 0, 0, 0, C, omrRTHR + mr, 0, 0, rrr, C, mr, 0, 0, 0, 0, 1) ) }, times = 1:60) statevals_def <- define_tparams({ c_prosthesis <- ifelse(strategy_name == "Standard prosthesis", c_Standard, c_NP1) list( utility = u, costs = list( prosthesis = c_prosthesis, medical = c_med ) ) }) ``` # Simulation ## Constructing the economic model The economic model is defined by specifying parameter estimates, a random number generation expression for the PSA, and the transformed parameter expressions. ```{r} mod_def <- define_model(tparams_def = list(transitions_def, statevals_def), rng_def = rng_def, params = params) ``` The economic model is then created from the input data and the defined model. Arguments for instantiating the cost models (a list of objects of class `StateVals`) can be specified with a named list containing the values of arguments to pass to `StateVals$new()`. In this case, we note that the model for the cost of the prostheses is based on one-time costs accrued at the start of the model (with no discounting); conversely, medical costs accrue by weighting state values by time spent in the health states (in the current model simply the cost of revision over 1 year). ```{r echo = FALSE} ptm <- proc.time() ``` ```{r econmod} cost_args <- list( prosthesis = list(method = "starting"), medical = list(method = "wlos") ) econmod <- create_CohortDtstm(mod_def, input_data, cost_args = cost_args) ``` ```{r echo = FALSE} run_time1 <- time <- proc.time() - ptm ``` ## Simulating outcomes ## Health state probabilities Health state probabilities are simulated for 60 model cycles. ```{r simStateprobs} econmod$sim_stateprobs(n_cycles = 60) ``` The expected number of patients per 1,000 in each health state is plotted below. The counts in the plot are for `patient_id = 2`, a 60 year old female as in the original textbook example. Since a PSA was performed both mean values and 95 percent credible intervals can be displayed. Patients are more likely to have a revision with a standard prosthesis than with the new prosthesis. ```{r simStateprobsPlot, echo = FALSE, fig.width = 7, fig.height = 4} library("ggplot2") theme_set(theme_bw()) stateprob_summary <- econmod$stateprobs_[patient_id == 2, .(count_mean = mean(prob) * 1000, count_lower = 1000 * quantile(prob, .025), count_upper = 1000 * quantile(prob, .975)), by = c("strategy_id", "patient_id", "state_id", "t")] set_labels(stateprob_summary, labels = labs, new_names = c("strategy_name", "state_name")) ggplot(stateprob_summary, aes(x = t, y = count_mean)) + geom_line(aes(col = strategy_name)) + geom_ribbon(aes(x = t, ymin = count_lower, ymax = count_upper, fill = strategy_name), alpha = .3) + facet_wrap(~state_name, scale = "free_y") + xlab("Year") + ylab("Count") + scale_fill_discrete("Strategy") + scale_color_discrete("Strategy") ``` ## Costs and QALYs To again be consistent with the textbook example, QALYs and costs are simulated with discount rates of .015 and 0.06, respectively. QALYs and costs (when `method = "wlos"`) are computed with a right Riemann sum, meaning that they are measured at the right endpoint of each model cycle. ```{r simStateVals} econmod$sim_qalys(dr = .015, integrate_method = "riemann_right") econmod$sim_costs(dr = .06, integrate_method = "riemann_right") ``` # Decision analysis ## Subgroup analysis Subgroup analyses can be performed with `$summarize()` using the `by_grp = TRUE` option. Cost-effectiveness analyses are then performed for each subgroup (`grp_id`) from input data. We will begin by reporting summary results using `summary.ce()` for the 60-year old female, which are consistent with those from the textbook. ```{r cea} ce_sim <- econmod$summarize(by_grp = TRUE) wtp <- seq(0, 25000, 500) cea_pw_out <- cea_pw(ce_sim, comparator = 1, dr_qalys = 0.015, dr_costs = .06, k = wtp) summary(ce_sim, labels = labs)[grp == 2] %>% format() ``` We can then compute incremental cost-effectiveness ratios for each subgroup. There is considerable variation which is not unexpected since the the revision rate for prostheses (*rr*) depends on covariates. ```{r icerSubgroup} icer(cea_pw_out) %>% format(pivot_from = "outcome") ``` ## Overall Estimates aggregated across all patients can also be computed with `$sim_summarize()` by using `by_grp = FALSE`. In this case, overall costs and QALYs are a weighted average (using the column `patient_wt`) of costs and QALYs for each patient in the input data. ```{r icerOverall} ce_sim <- econmod$summarize(by_grp = FALSE) cea_pw_out <- cea_pw(ce_sim, comparator = 1, dr_qalys = .015, dr_costs = .06, k = wtp) icer(cea_pw_out, labels = labs) %>% format(digits_qalys = 3) ``` # Appendix ## Transition probabilities from a Weibull survival model In general, the transition probability of an event for a given model cycle is one minus the probability a patient does not have an event during the cycle conditional on not having an event up until that cycle. That is, given a survivor function, $S(t)$ and a cycle of length $u$, the transition probability, $tp(t_u)$ of an event is, $$ \begin{align} tp(t_u) &= 1 - \frac{S(t)}{S(t-u)} \\ &= 1 - \frac{exp\{-H(t)\}}{exp\{-H(t-u)\}} \\ &= 1 - exp\{H(t-u) - H(t)\}, \end{align} $$ where $H(t)$ is the cumulative hazard function. The proportional hazard Weibull distribution with scale parameter $\lambda$ and shape parameter, $\gamma$ has the cumulative hazard function, $$ H(t) = \lambda t^\gamma $$ The transition probability of an event for the Weibull model is thus, $$ \begin{align} tp(t_u) &= 1 - exp\{\lambda (t-u)^\gamma -\lambda t^\gamma\} \\ &= 1 - exp\{\lambda[(t-u)^\gamma - t^\gamma]\} \end{align} $$ ## Direct construction of transition matrices {#standard-eval} Transition probability matrices can also be constructed without using `define_model()`. One way to do this is to construct the transition probability matrices using matrix algebra. That is, for a given sample of the parameters from the PSA, we can predict the element for the $r$th row and $s$th column as $p_{rs} = g(x^T\theta)$ where $x$ is an input matrix, $\theta$ is a vector of coefficients (i.e., a parameter sample), and $g()$ is a transformation function. An input matrix will always vary by treatment strategy and patient; in a time-inhomogeneous model, it will vary by time period as well. This approach is more computationally efficient than defining a model because `define_model()` is a general function, resulting in some overhead. ```{r echo = FALSE} ptm <- proc.time() ``` ### Random number generation To get started, we sample the parameters for the PSA using `eval_rng()`. ```{r} params_rng <- eval_rng(rng_def, params = params) ``` ### Input data We then create input data and add a few variables needed to construct the input matrices. Note that the variable `age_new` reflects a patients' age as a function of time in the model. ```{r} # Create new variables tpmatrix_data <- expand(hesim_dat, by = c("strategies", "patients"), times = 1:60) tpmatrix_data[, time_stop := ifelse(is.infinite(time_stop), 61, time_stop)] tpmatrix_data[, male := ifelse(sex == "Male", 1, 0)] tpmatrix_data[, np1 := ifelse(strategy_name == "New prosthesis", 1, 0)] tpmatrix_data[, age_new := age + time_stop] tpmatrix_data[, cons := 1] ``` Different input matrices are constructed using the input data for different elements of the transition probability matrix. Some of the parameters aren't a function of any patient characteristics or time and can therefore be predicted using a matrix with a single column of ones. A separate matrix is needed for the revision rate given the covariates in the Weibull survival model. ```{r} x_cons <- as.matrix(tpmatrix_data$cons) x_rr <- as.matrix(tpmatrix_data[, .(cons, age, male, np1)]) head(x_cons) head(x_rr) ``` As in our `define_model()` block above, predicting the mortality rate is a little more cumbersome. One way to do it is to create dummy variables for each age and sex category. To do this it's helpful to reshape the mortality table above to a longer format. ```{r, echo = FALSE} mort_tbl2 <- melt(data.table(mort_tbl), id.vars = c("age_lower", "age_upper"), variable.name = "sex", value.name = "rate", variable.factor = FALSE) mort_tbl2[, sex := ifelse(sex == "female", "Female", "Male")] mort_tbl2[, age_sex := paste0(sex, "_", age_lower, "_", age_upper)] mort_tbl2 <- mort_tbl2[, .(age_lower, age_upper, sex, age_sex, rate)] setorderv(mort_tbl2, c("sex", "age_lower")) ``` ```{r} print(mort_tbl2) ``` We can write a short function to create a matrix of dummy variables. ```{r} create_age_sex_dummies <- function(data, mort_tbl) { x <- matrix(0, nrow = nrow(data), ncol = nrow(mort_tbl2)) colnames(x) <- mort_tbl2$age_sex for (i in 1:nrow(mort_tbl2)) { x[, i] <- ifelse(data$sex == mort_tbl2[i]$sex & data$age_new >= mort_tbl2[i]$age_lower & data$age_new < mort_tbl2[i]$age_upper, 1, 0) } return(x) } x_age_sex <- create_age_sex_dummies(tpmatrix_data, mort_tbl2) head(x_age_sex) ``` ### Transformed parameters With input matrices in hand, we can transform the parameter samples to construct a `tparams_transprobs()` object, which contains all the predicted transition probabilities needed for simulation. These objects are most easily constructed using `tpmatrix()` and `tpmatrix_id()` objects. We will start with `tpmatrix_id()` which contains the ID variables for each transition probability matrix in `tparams_transprobs()` objects. ```{r} tpmat_id <- tpmatrix_id(tpmatrix_data, 500) head(tpmat_id) ``` `tpmatrix()` produces a transition probability matrix for each row of `tpmatrix_id`. We predict the elements of the matrix using the input matrices we derived above. Matrix multiplication of an $n \times k$ input matrix by a $k \times s$ coefficient matrix results in a $n \times s$ matrix of predicted values. In other words, there is one row for each of the $n$ rows in the input data and one column for each of the $s$ parameter samples. This matrix can itself be flattened into a vector of length $n \cdot s$ using the `c()` function. The result is a vector of predicted values stacked by parameter sample; that is, a prediction for each row produced by `tpmatrix_id()`. ```{r} # "Transformed" parameters time <- tpmat_id$time_stop omrPTHR <- c(x_cons %*% params_rng$omrPTHR) omrRTHR <- omrPTHR rr_scale <- exp(c(x_rr %*% t(params_rng$rr_coef[, -1]))) rr_shape <- exp(c(x_cons %*% params_rng$rr_coef$lngamma)) rr <- 1 - exp(rr_scale * ((time - 1)^rr_shape - time^rr_shape)) mr <- c(x_age_sex %*% t(cbind(params_rng$mr_female, params_rng$mr_male))) rrr <- c(x_cons %*% params_rng$rrr) # Transition probability matrix tpmat <- tpmatrix( 0, C, 0, 0, omrPTHR, 0, C, rr, 0, mr, 0, 0, 0, C, omrRTHR + mr, 0, 0, rrr, C, mr, 0, 0, 0, 0, 1 ) tpmat[1, ] ``` Finally, we combine the ID variables and the transition probability matrices into a single object. ```{r} tparams_transprobs <- tparams_transprobs(tpmat, tpmat_id) ``` ```{r echo = FALSE} run_time2 <- time <- proc.time() - ptm ``` ### Simulation To compare with the `define_model()` approach, we will (re)construct and (re)simulate the model. Estimated costs and QALYs are very similar, as expected. ```{r} transmod <- CohortDtstmTrans$new(params = tparams_transprobs) econmod$trans_model <- transmod econmod$sim_stateprobs(n_cycles = 60) econmod$sim_qalys(dr = .015, integrate_method = "riemann_right") econmod$sim_costs(dr = .06, integrate_method = "riemann_right") ce_sim <- econmod$summarize(by_grp = TRUE) summary(ce_sim, labels = labs)[grp == 2] %>% format() ``` However, time to construct the transition model was `r formatC(run_time1["elapsed"]/run_time2["elapsed"], digits = 2)` times faster. Although both are quite fast (time in the table is in seconds), this difference would become more pronounced as the number of parameter samples, treatment strategies, patients, and/or distinct time periods increased. ```{r, hide = TRUE} data.table( Method = c("define_model()", "Custom tpmatrix()"), rbind( run_time1[1:3], run_time2[1:3] ) ) %>% kable() %>% kable_styling() ```