class: bottom, left, inverse, title-slide # A Practical Tour in the World of
Joint Modelling
### Alessandro Gasparini ###
alessandro.gasparini@ki.se
### 2021-03-02 --- class: middle <script type="text/x-mathjax-config"> MathJax.Hub.Config({ TeX: { Macros: { '\\_': '_' } } }); </script> <!-- # Abstract Joint models are used increasingly often in a wide variety of settings of medical research. However, what researchers (including statisticians) refer to when talking about joint modelling is not always clear. I will introduce a variety of joint models — in common and less-common settings — with several applications illustrating some of the advantages and useful predictions that can be obtained after fitting such models. --> .pull-left[ * Undergrad in Statistics, then M.Sc. in Biostatistics * Worked at KI in 2014-2016 on kidney disease epidemiology * PhD in Biostatistics from the University of Leicester, graduated in early 2020 * Joined MEB as a post-doc in September 2019 * Currently working with Keith Humphreys on extending statistical models for breast cancer growth and spread ] .pull-right[ <img src="pic.jpg" width="90%" style="display: block; margin: auto; border-radius: 15px"> ] .footnote[ Photo by [Gunilla Sonnebring](https://staff.ki.se/people/gunson). ] --- class: center, middle, inverse # What does joint modelling mean? --- # It Depends™ <img src="chris-liverani-9cd8qOgeNIY-unsplash.jpg" width="75%" style="display: block; margin: auto; border-radius: 15px"> .footnote[ Photo by [Chris Liverani](https://unsplash.com/@chrisliverani) on [Unsplash](https://unsplash.com/photos/9cd8qOgeNIY). ] --- class: center, middle # In the (bio)-statistical literature, that often refers to .cyclamen[joint longitudinal] and .cyclamen[time-to-event modelling]. --- class: outline # The plan for today We are going to see: -- 1. Joint _longitudinal_ and _time-to-event_ modelling; -- 1. Joint _observation process_ and _longitudinal_ modelling; -- 1. (Briefly), joint _any number and kind of outcomes_ modelling; -- 1. Joint _tumour growth_ and _spread_ modelling. --- class: center, middle, inverse # Joint _longitudinal_ <br/> and _time-to-event_ --- # Joint longitudinal and time-to-event This is the traditional setting. First, we have a longitudinal, mixed-effects model: `$$Y_i(t_i) = m_i(t_i) + \epsilon_i(t_i)$$` with `$$m_i(t_i) = X_i(t_i) \beta + Z_i(t_i) b_i$$` and `$$\epsilon_i \sim N(0, \sigma^2_{\epsilon})$$` `$$b_i \sim N(0, \Sigma)$$` --- # Joint longitudinal and time-to-event This is the traditional setting. Then, we have a proportional hazards survival model: `$$h(t_i | M_i(t_i), W_i) = h_0(t_i) \exp(W_i \psi + \alpha m_i(t_i))$$` where `$$M_i(t_i) = \{ m_i(s_i) \ \forall \ 0 \le s_i \le t_i \}$$` -- `\(\alpha\)` represents the parameter of a given .cyclamen[association structure] that links the two sub-models. Examples are: * Expected value of `\(m_i(t_i)\)`, * Expected rate of change of `\(m_i(t_i)\)`, * ... --- # Joint longitudinal and time-to-event Generally focus is on the survival component with a time-varying longitudinal covariate (e.g. a biomarker). Classical examples: * Studying the association between viral load and time to AIDS, * Studying the association between serum bilirubin and the time to death in liver cirrhosis, * Studying the association between prostate-specific antigen (PSA) levels and the time to prostate cancer. -- Other examples from studies hosted at MEB: * Kidney function over time in SCREAM, * Breast density over time in KARMA (Maya is working on it). --- class: middle, center # In those settings, joint longitudinal-survival models can accommodate .cyclamen[repeated measures] of an internal covariate, potentially .cyclamen[measured with error]. --- class: middle, center # This makes .cyclamen[full use] of the available information — which is much better than e.g. focussing on a single baseline measurement. --- # Predictions <img src="dynpred-final.png" width="85%" style="display: block; margin: auto;" /> --- # _Dynamic_ predictions <img src="dynpred.gif" width="85%" style="display: block; margin: auto;" /> --- class: center, middle # However, we might want to focus on the longitudinal outcome instead --- # Focussing on the longitudinal outcome Mixed-effects models assume that the longitudinal outcome and the drop-out process are independent. This is often .cyklamen[unreasonable] and might introduce bias in the analysis. -- A joint longitudinal-survival model can account for informative drop-out by directly modelling it in a unified framework. This works when the drop-out process is _at least_ .cyklamen[at random], according to the definition of Rubin (DOI: [10.1093/biomet/63.3.581](https://doi.org/10.1093/biomet/63.3.581)). -- We'll now see an example using data from a post hoc analysis of a clinical trial in intensive care medicine. .footnote[ See Harhay, Gasparini et al. for more details. DOI: [10.1097/CCE.0000000000000104](https://journals.lww.com/ccejournal/Fulltext/2020/04000/Assessing_the_Course_of_Organ_Dysfunction_Using.20.aspx) ] --- # VASST VASST was a multi-centre, double-blind randomised controlled trial that randomised patients with septic shock to either low-dose vasopressin or norepinephrine plus open-label vasopressors. .footnote[ See Harhay, Gasparini et al. for more details. DOI: [10.1097/CCE.0000000000000104](https://journals.lww.com/ccejournal/Fulltext/2020/04000/Assessing_the_Course_of_Organ_Dysfunction_Using.20.aspx) ] -- The primary end-point was mortality rate at 28 days, and no significant difference between treatment arms was found. -- The Sequential Organ Failure Assessment [SOFA] score was measured over time during the trial, and it is considered to be a relevant clinical outcome given that treatment-associated changes in SOFA score are reliably and consistently associated with observed mortality. -- > However, SOFA score trajectories might be informatively truncated due to mortality. --- # Raw trajectories <img src="20210302-faculty-event_files/figure-html/vasst-dropout-trajectories-A-1.png" width="85%" style="display: block; margin: auto;" /> --- # Fitted trajectories <img src="20210302-faculty-event_files/figure-html/vasst-jm-trajectories-1.png" width="85%" style="display: block; margin: auto;" /> --- # Fitted difference between arms <img src="20210302-faculty-event_files/figure-html/vasst-jm-diff-1.png" width="85%" style="display: block; margin: auto;" /> --- class: center, middle, inverse # Joint _observation process_ <br/> and _longitudinal_ --- # Informative observation process Here we focus on the analysis of longitudinal data, e.g. to _understand_ the evolution of a disease and the _effect_ of interventions over time. -- Traditional methods used to analyse longitudinal data assume that the underlying mechanism that controls observation times is independent of disease severity. This is not always the case, e.g. in administrative records data: * Observation times are likely correlated with disease severity; -- * <span style='color: #E74C3C'>Worse disease status</span> is likely reflected in <span style='color: #E74C3C'>worse biomarker values</span> recorded at such visits; -- * <span style='color: #E74C3C'>Abnormal</span> values are likely <span style='color: #E74C3C'>over-represented</span>, and <span style='color: #27AE60'>normal</span> values are likely <span style='color: #27AE60'>under-represented</span>. --- class: center, middle # If we naïvely apply traditional methods when the observation process is informative we obtain biased results. .footnote[ See e.g. Pullenayegum et al., DOI: [10.1177/0962280214536537](https://doi.org/10.1177/0962280214536537) ] --- # A joint modelling approach We can formulate a joint model for longitudinal data and the observation process. First, the observation times `\(\tilde{t}_{ij}\)` follow a recurrent events, frailty model: `$$r(\tilde{t}_{ij} | W_{ij}, u_i) = r_0(\tilde{t}_{ij}) \exp(W_{ij} \psi + u_i)$$` where * `\(W_{ij}\)` are (potentially time-varying) covariates; * `\(u_i\)` is a subject-specific random effect (e.g. a frailty). .footnote[ This approach is described in more detail in Gasparini et al. DOI: [10.1111/stan.12188](https://doi.org/10.1111/stan.12188) ] --- # A joint modelling approach We can formulate a joint model for longitudinal data and the observation process. Then, we have a longitudinal mixed-effects model: `$$Y_i(t_i) = m_i(t_i) + \epsilon_i(t_i) = X_i(t_i) \beta + \gamma u_i + Z_i(t_i) b_i + \epsilon_i(t_i)$$` Note that `\(u_i\)` appears in the observation process sub-model too. That is the random effect that .cyclamen[links] the longitudinal and observation times sub-models, with associated parameter `\(\gamma\)`. .footnote[ This approach is described in more detail in Gasparini et al. DOI: [10.1111/stan.12188](https://doi.org/10.1111/stan.12188) ] --- # PSP-CKD The _Primary-Secondary Care Partnership to Prevent Adverse Outcomes in Chronic Kidney Disease_ (PSP-CKD) study is a cluster randomised controlled pragmatic trial of enhanced chronic kidney disease (CKD) care against usual primary care management. Primary care practices in Northamptonshire, United Kingdom, were randomised to either routine care or enhanced care, and data on adult individuals with CKD was extracted from the practices. We use a dataset with 187,671 observations for 35,822 individuals, over approximately 3 years of follow-up. -- > PSP-CKD investigators concluded that CKD management programs in primary care did not slow disease progression, but improved care with the potential to decrease costs and comorbidity burden. --- # Is the observation process informative? Statistically significant Spearman's rank correlations between gap times and treatment, age at baseline, and sex (of small magnitude). -- A linear mixed model for gap times versus treatment, age at baseline and gender with a random intercept yielded significant associations. On average: * Females had 12.23-days (95% C.I.: 9.44 - 15.02) longer gap times ; * Treated individuals had 3.41-days (0.68 - 6.14) shorter gap-times; * 2.74-days (2.16 - 3.32) shorter gap times every 5-years age difference. -- A (recurrent events) Andersen-Gill model for the observation process alone showed similar results. --- class: center, middle # Gap times are associated with gender, age at baseline, and treatment modality. Hence, the observation process is .cyclamen[likely] informative. --- # Joint modelling results The main result here is that _the time by treatment interaction is not significant_. -- Furthermore: * Reduced risk of having a new observation for females compared to males (~8%, hazard ratio of 0.920 with 95% C.I.: 0.902 - 0.938); * Increased risk for treated individuals (7%, hazard ratio of 1.069 with 95% C.I.: 1.048 - 1.090); * Reduced risk for higher age at baseline (hazard ratio for a 5-years increase: 0.993, 95% C.I.: 0.989 - 0.997); * Estimated value of the association parameter `\(\gamma\)` = -2.682 (95% CI: -2.900 to -2.468). --- # Fitted trajectories <img src="20210302-faculty-event_files/figure-html/psp-ckd-jm-app-long-1.png" width="85%" style="display: block; margin: auto;" /> --- # Extended joint models <img src="20210302-faculty-event_files/figure-html/psp-ckd-jm-app-4jm-long-1.png" width="85%" style="display: block; margin: auto;" /> --- class: center, middle, inverse # Joint _any number <br/> and kind of outcomes_ --- class: middle, center # This is where we aim to go. --- # Joint models and healthcare data With data extracted from administrative records, we might suffer from _both_ informative observation process and drop-out in longitudinal studies. > Naturally, we might want to fit the three processes jointly! -- This can now be done within the `merlin` (🧙♂️) framework developed by Michael Crowther. `merlin` is a framework for modelling multiple outcomes, with any number of levels, and with any number of random effects at each level; standard distributions are supported, as well as non-standard user-defined and non-linear models. .footnote[ See e.g. [arXiv.org:1710.02223](https://arxiv.org/abs/1710.02223), [arXiv:1806.01615](https://arxiv.org/abs/1806.01615), and [arXiv:2007.14109](https://arxiv.org/abs/2007.14109). ] --- # Multivariate joint models `merlin` not only allows us to fit tri-variate joint models, but it also lets us fit (_almost_) any joint model we might think of. For instance, we could model: * Multiple longitudinal biomarkers, * (Potentially) each with their own drop-out and observation process, * Competing risks, * ... -- And of course, we can get model-based predictions from these joint models too. --- # Computational issues _Standard_ joint models are readily available in statistical software, with reasonable performance. -- However, with large studies or many sub-models (with several random effects) the performance of the estimating algorithms degrades swiftly. > In short, we pay for <span style='color: #27AE60'>flexibility</span> with <span style='color: #E74C3C'>computational speed</span> -- Some work is being done on improving estimation algorithms. Among others: * Using Quasi-Monte Carlo methods, * Using sample weights and clever study designs, * Using variational approximations (Benjamin is working on this). --- class: center, middle, inverse # Joint _tumour growth_ <br/> and _spread_ --- # Setup <img src="20210302-faculty-event_files/figure-html/unnamed-chunk-4-1.png" width="85%" style="display: block; margin: auto;" /> --- # Setup <img src="20210302-faculty-event_files/figure-html/unnamed-chunk-5-1.png" width="85%" style="display: block; margin: auto;" /> --- # Setup <img src="20210302-faculty-event_files/figure-html/unnamed-chunk-6-1.png" width="85%" style="display: block; margin: auto;" /> --- class: middle, center # The main difference here is that we are dealing with latent processes that are observed only at diagnosis. --- class: middle, center # However, we can infer the past probabilistically by making _stronger_ parametric assumptions. --- # Joint tumour growth and local spread This builds upon work by Gabriel Isheden during his PhD at MEB. -- In his thesis, Gabriel developed a joint model with 1. A sub-model for exponential tumour growth in breast cancer with a random effect on growth rates, 1. A sub-model for the number of affected lymph nodes with a random effect for spread heterogeneity, and 1. A sub-model for mode of detection (symptomatic vs screen-detected). -- > Gabriel showed that fit of the new model to real data is considerably better than other models in the literature. --- # Predictions <img src="ln.png" width="85%" style="display: block; margin: auto;" /> .footnote[ From Isheden et al., DOI: [10.1177%2F0962280218819568](https://doi.org/10.1177%2F0962280218819568) ] --- # Joint tumour growth and distant metastatic spread Keith and I developed a joint model for mammography screening data with 1. A sub-model for exponential tumour growth with a random effect on growth rates, 1. A sub-model for time to diagnosis of distant metastasis with a random effect for spread heterogeneity, which (conveniently) follows a Weibull-like distribution, and 1. A sub-model for mode of detection. > Here we account for screening patterns, but we do not consider (yet) loco-regional spread to the lymph nodes. --- class: middle, center # Nevertheless, we can obtain useful predictions out of this model. --- # Predictions <img src="cahres-predictions-by-volume.png" width="85%" style="display: block; margin: auto;" /> .footnote[ Predictions for three hypothetical patients .cyclamen[detected via screening], with a history of .cyclamen[two negative screens], and with varying tumour size at detection. ] --- # Predictions <img src="cahres-predictions-by-detection.png" width="85%" style="display: block; margin: auto;" /> .footnote[ Predictions for two hypothetical patients with a tumour of .cyclamen[20 mm] at detection and .cyclamen[two negative screens]. One subject is .cyclamen[detected via screening], one is .cyclamen[detected symptomatically] six months after the previous negative screen. ] --- # Predictions <img src="probs-plot-1.png" width="85%" style="display: block; margin: auto;" /> .footnote[ Estimated probability of latent and detected metastases at detection of the primary tumour (diagnosis) as a function of tumour size, based on a model estimated on CAHRES data. ] --- # The next steps Now, we are working to jointly model all that we discussed so far: 1. Breast cancer tumour growth and mode of detection, 1. Spread to the lymph nodes, and 1. Distant metastatic spread. -- This is fundamentally important: * We need to consider both lymph nodes and distant metastasis to model the full prognosis of breast cancer; -- * We need a model bridging the latent growth of a tumour to observable future outcomes to use (for instance) in microsimulation studies of e.g. effectiveness of dynamic breast cancer screening. --- class: center, middle, inverse # Closing time --- class: middle # .cyclamen[1.] Joint modelling lets us make better use of the available data, studying a variety of outcomes together (and how they interact) --- class: middle ### .cyclamen[1.] Joint modelling lets us make better use of the available data, studying a variety of outcomes together (and how they interact) # .cyclamen[2.] We saw a variety of useful (static and dynamic) prediction that could be extracted from these models --- class: middle ### .cyclamen[1.] Joint modelling lets us make better use of the available data, studying a variety of outcomes together (and how they interact) ### .cyclamen[2.] We saw a variety of useful (static and dynamic) prediction that could be extracted from these models # .cyclamen[3.] The methodology is still relatively novel, with several ongoing developments --- class: middle ### .cyclamen[1.] Joint modelling lets us make better use of the available data, studying a variety of outcomes together (and how they interact) ### .cyclamen[2.] We saw a variety of useful (static and dynamic) prediction that could be extracted from these models ### .cyclamen[3.] The methodology is still relatively novel, with many ongoing developments # .cyclamen[4.] Complexity grows quickly --- class: center, middle # That's all from me for today. # Thanks for listening! .footnote[ Slides available online: https://tinyurl.com/faculty-0302 ]