Open Tools to Analyse and Report Monte Carlo Simulation Studies

class: bottom, left, inverse, title-slide

# Open Tools to Analyse and Report Monte Carlo Simulation Studies
### Alessandro Gasparini
### <a href="mailto:alessandro.gasparini@ki.se" class="email">alessandro.gasparini@ki.se</a> · <a href='https://twitter.com/ellessenne'>@ellessenne</a>
### FMS Spring Meeting · 2020-05-29

---

class: outline

# Outline

## 1. Monte Carlo simulation studies

## 2. The {`rsimsum`} R package

## 3. The INTEREST Shiny app

## 4. Some considerations and conclusions

---
class: center, middle, inverse

# Monte Carlo simulation studies

---
class: center, middle

# Monte Carlo simulation studies are computer experiments that involve generating data by pseudo-random sampling.

---
class: center, middle

.large[
From now on, I will be using the terms _Monte Carlo simulation study_ and _simulation study_ interchangeably.
]

---

# The Process&trade;

In a simulation study, we design an experiment such that:

1. Generate data from a known distribution/model (so that we know the _truth_);

2. Analyse the data;

3. Compare the analysis results with the truth;

4. Then, we (generally) repeat steps 1-3 _B_ times.

> After running the experiment _B_ times, we collect and analyse the results to draw our conclusions.

---

# Simulation studies are useful...

Monte Carlo simulation studies provide an invaluable tool for statistical and biostatistical research.

They can help to answer questions such as:

* Is an estimator biased in a finite sample?

* Do confidence intervals for a given parameter achieve the desired nominal level of coverage?

* How does a newly developed method compare to an established one?

* What is the power to detect a desired effect size under complex experimental
settings and analysis methods?

* ...you name it!

---

# ...and increasingly popular!

.footnote[.tiny[
Scopus search key: `TITLE-ABS-KEY ("simulation study") AND SUBJAREA (math)`
]]

---

# There are issues...

1. Monte Carlo standard errors of performance measures are often not computed/reported;

2. Reproducibility of results;

3. Dissemination of results.

> As statisticians, we should follow our advice — and could do better — when it comes to the design, reporting, and reproducibility of our own _experiments_.

---
class: center, middle

# Disclaimer:

This talk will not cover how to plan, design, and run a Monte Carlo simulation study.
A full example can be found in the [tutorial paper by Morris, White, and Crowther](https://onlinelibrary.wiley.com/doi/10.1002/sim.8086) (2019).

---
class: center, middle, inverse

# The {`rsimsum`} R package

---

# Enter `rsimsum`

> Why _yet_ another R package?

* There is a similar package in [Stata](https://www.stata-journal.com/article.html?article=st0200), but I could not find anything comparable in R;

* Most performance measures are supported out of the box — no need to code tedious (and error-prone) calculations by hand;

* Monte Carlo standard errors are computed and displayed by default;

* Several quick plotting options for fast iteration/exploration.

---

# Example (1)

We compare three different methods for estimating the hazard ratio in a randomised trial with a survival outcome; it's the same example from the tutorial paper by Morris, White, and Crowther (2019).

Specifically, we fit the proportional hazards model

.center[.large[
hi(t) = h0(t)exp(Xi&theta;) 
]]

where Xi is a binary treatment and &theta; is the log-hazard ratio.

The dataset with the results of the simulation is readily available online:

```r
library(haven)
df <- haven::read_dta(file = "http://tiny.cc/example-data-dta")
```

---

# Example (2)

## Aims of the simulation study

Evaluating the impact of model misspecification when using proportional hazards survival models to estimate the hazard ratio.

## Data-generating mechanisms

500 patients, with Xi a binary treatment with simple randomisation and equal allocation ratio and &theta; = -0.5.

Survival times are simulated assuming

1. An exponential baseline hazard with scale parameter &lambda; = 0.1;

1. A Weibull baseline hazard with scale &lambda; = 0.1 and shape &gamma; = 1.5.

---

# Example (3)

## Estimand(s)

The log-hazard ratio &theta;.

## Methods

An (1) exponential, (2) Weibull, and (3) Cox proportional hazards model.

## Performance measure(s)

The study focuses on convergence, bias, coverage, empirical and model-based standard errors for &theta;.

1,600 replications per DGM are required to control Monte Carlo SE of bias to be lower than 0.005 (assuming Var(&theta;) &le; 0.04).

---

# `rsimsum`'s main function: `simsum`

```r
str(rsimsum::simsum)
```

```
## function (data, estvarname, se = NULL, true = NULL, 
##     methodvar = NULL, ref = NULL, by = NULL, ci.limits = NULL, 
##     df = NULL, dropbig = FALSE, x = FALSE, control = list())
```

> The full documentation is available online at https://ellessenne.github.io/rsimsum/

---

# Summarising a simulation study (1)

```r
s <- rsimsum::simsum(
 data = df, estvarname = "theta", se = "se", true = -0.5,
 methodvar = "method", by = "dgm", x = TRUE
)
s
```

```
## Summary of a simulation study with a single estimand.
## True value of the estimand: -0.5 
## 
## Method variable: method 
## 	Unique methods: Exp, Wei, Cox 
## 	Reference method: Exp 
## 
## By factors: dgm 
## 
## Monte Carlo standard errors were computed.
```

---

# Summarising a simulation study (2)

```r
summary(s, stats = c("bias", "cover"))
```

```
## Values are:
## 	Point Estimate (Monte Carlo Standard Error)
## 
## Bias in point estimate:
##  dgm              Exp              Wei              Cox
##    1 -0.0029 (0.0052) -0.0032 (0.0052) -0.0021 (0.0052)
##    2  0.0494 (0.0035)  0.0048 (0.0038)  0.0062 (0.0038)
## 
## Coverage of nominal 95% confidence interval:
##  dgm             Exp             Wei             Cox
##    1 0.9544 (0.0052) 0.9544 (0.0052) 0.9544 (0.0052)
##    2 0.9600 (0.0049) 0.9556 (0.0051) 0.9575 (0.0050)
```

---

# Performance measures in `rsimsum`

The following performance measures are implemented in `rsimsum`:

* Bias;

* Empirical SE, relative % increase in precision, model-based SE, and relative % error in model-based SE;

* Mean squared error (MSE);

* Coverage probability and bias-corrected coverage probability;

* Power of type I error.

Each performance measure is described in more detail in the [tutorial paper](https://onlinelibrary.wiley.com/doi/10.1002/sim.8086).

---
class: center, middle

# `rsimsum` supports a variety of (opinionated) fast plotting options for point estimates, standard errors, summary statistics

---

# Plotting point estimates (1)

```r
autoplot(object = s, type = "est")
```

---

# Plotting point estimates (2)

```r
autoplot(object = s, type = "est_density")
```

---

# Plotting point estimates (3)

```r
autoplot(object = s, type = "est_hex")
```

---

# Plotting point estimates (4)

```r
autoplot(object = s, type = "est_ba")
```

---

# Plotting point estimates (5)

```r
autoplot(object = s, type = "est_ridge")
```

---

# Same plots available for SEs too!

```r
autoplot(object = s, type = "se_ba")
```

---

# Plotting summary statistics (1)

```r
autoplot(object = summary(s), type = "forest", stats = "bias")
```

---

# Plotting summary statistics (2)

```r
autoplot(object = summary(s), type = "lolly", stats = "cover")
```

---

# Plotting summary statistics (3)

```r
autoplot(object = s, type = "heat", stats = "bias")
```

---

# Plotting coverage probability

```r
autoplot(object = s, type = "zip", zoom = 0.1)
```

---

# Nested loop plots

```r
autoplot(object = obj, type = "nlp", stats = "bias")
```

.footnote[
This uses a different dataset: `data("nlp", package = "rsimsum")`
]

---

# Customising plots

```r
* autoplot(object = s, type = "est_ridge") +
    viridis::scale_fill_viridis(discrete = TRUE) +
    viridis::scale_color_viridis(discrete = TRUE) +
    ggplot2::scale_x_continuous(labels = scales::comma) +
    ggplot2::theme_minimal(
      base_size = 12,
      base_family = "Arial Narrow"
    ) +
    ggplot2::theme(legend.position = "top") +
    ggplot2::labs(
      x = expression(Point ~ estimates ~ of ~ theta),
      y = "DGM",
      color = "",
      fill = ""
    )
```

---
class: center, middle

---

# Multiple estimands

```r
str(rsimsum::multisimsum)
```

```
## function (data, par, estvarname, se = NULL, true = NULL, 
##     methodvar = NULL, ref = NULL, by = NULL, ci.limits = NULL, 
##     df = NULL, dropbig = FALSE, x = FALSE, control = list())
```

> This supports multiple estimands at once via the `par` argument.
> Everything else works analogously as with `simsum`, including plotting!

---
class: center, middle, inverse

# The INTEREST Shiny app

---
class: center, middle

# INTEREST:

## **IN**teractive **T**ool for **E**xploring **RE**sults

## from **S**imulation s**T**udies

---

# INTEREST: Why?

> Why a Shiny app?

* Dissemination of results and open science;

* Fast(er) iteration and exploration of results, with a point-and-click interface;

* Supporting devices where R does not run natively (smartphones, iPads, Chromebooks, ...);

* Supporting devices where you are not allowed to install R (e.g. because of IT policies).

---

# INTEREST: Homepage

---

# INTEREST: Workflow

---
class: center, middle

# Demo:

## http://interest.shinyapps.io/interest/

---
class: center, middle, inverse

# Considerations and conclusions

---

# How to get `rsimsum` and INTEREST?

`rsimsum` can be installed directly from CRAN:

```r
install.packages("rsimsum")
# Development version on GitHub:
# require("remotes")
# remotes::install_github(repo = "ellessenne/rsimsum")
```

INTEREST is on GitHub:

```r
# require("remotes")
remotes::install_github(repo = "ellessenne/interest")
```

---
class: center, middle

## `rsimsum` and INTEREST are actively developed: bugs are getting squished, new features are being implemented, ...

### If there is anything you'd like to see implemented, feel free to [e-mail me](mailto:alessandro.gasparini@ki.se) or [open an issue on GitHub](https://github.com/ellessenne).

---
class: center, middle

## Report _Monte Carlo standard errors_ — how would you feel if a paper in NEJM did not report standard errors of effect estimates?

---
class: center, middle

## The importance of _reproducibility_ and _dissemination_ of research findings

---
class: center, middle

## The opportunities of _open-source software_

---

# That's all — thank you!

Some references:

* _Using simulation studies to evaluate statistical methods_. Morris TP, White IR, and Crowther MJ (2019). Statistics in Medicine 38(11):2074--2102, DOI: 10.1002/sim.8086

* _rsimsum: Summarise results from Monte Carlo simulation studies_. Gasparini A (2018). Journal of Open Source Software 3(26):739, DOI: 10.21105/joss.00739

* _INTEREST: INteractive Tool for Exploring REsults from Simulation sTudies_. Gasparini A, Morris TP, Crowther MJ (2020). arXiv:1909.03813

* `rsimsum`'s website: https://ellessenne.github.io/rsimsum/