- 1. Randomization inference for the Average Treatment Effect
- 2. Randomization inference for alternative designs
- 3. Randomization inference with covariate adjustment
- 4. Randomization inference for a balance test
- 5. Randomization inference for treatment effect heterogeneity by subgroups
- 6. Randomization inference for unmodeled treatment effect heterogeneity
- 7. Randomization inference for multi-arm trials
- 8. Randomization inference for joint significance
- 9. Randomization inference under noncompliance
- 10. Randomization inference for arbitrary test statistics

Randomization inference is a procedure for conducting hypothesis tests that takes explicit account of a study’s randomization procedure. See 10 things about Randomization Inference for more about the theory behind randomization inference. In this guide, we’ll see how to use the `ri2` package for R to conduct 10 different analyses. This package was developed with funding from EGAP’s inaugural round of standards grants, which are aimed at projects designed to improve the quality of experimental research.

To illustrate what you can do with `ri2`, we’ll use some
data from a hypothetical experiment involving 200 students in 20
schools. We’ll consider how to do randomization inference using a
variety of different designs, including complete random assignment,
block random assignment, cluster random assignment, and a multi-arm
trial. You can check the kinds of random assignment methods guide
for more on the varieties of random assignment.

Follow the links below to download the four datasets we’ll use in the examples:

- complete randomization assignment dataset
- blocked randomization assignment dataset
- clustered randomization assignment dataset
- three-arm randomization assignment dataset

We’ll start with the most common randomization inference task: testing an observed average treatment effect estimate against the sharp null hypothesis of no effect for any unit.

In `ri2`, you always “declare” the random assignment
procedure so the computer knows how treatments were assigned. In the
first design we’ll consider, exactly half of the 200 students were
assigned to treatment using complete random assignment.

`library(ri2)`


```
complete_dat <- read.csv("complete_dat.csv")
complete_dec <- declare_ra(N = 200)
```

Now all that remains is a call to `conduct_ri`. The
`sharp_hypothesis` argument is set to `0` by default, corresponding
to the sharp null hypothesis of no effect for any unit. We can see
the output using the `summary` and `plot` commands.

```
sims <- 10000
ri_out <-
  conduct_ri(
    Y ~ Z,
    declaration = complete_dec,
    sharp_hypothesis = 0,
    data = complete_dat,
    sims = sims
  )
summary(ri_out)
```

```
##   term estimate two_tailed_p_value
## 1    Z    41.98             0.1144
```

`plot(ri_out)`
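
To make concrete what a call like this computes, here is a minimal from-scratch sketch in base R. It is not `ri2`'s implementation, and it runs on simulated data rather than `complete_dat`: the data, the effect size of 0.5, and the helper `diff_in_means` are all assumptions made for illustration. Under the sharp null the outcomes are treated as fixed, so the treatment vector is re-drawn many times and the test statistic is recomputed each time.

```r
# Hypothetical data: 200 units, exactly 100 treated, assumed true effect of 0.5.
set.seed(42)
N <- 200
Z <- sample(rep(c(0, 1), each = N / 2))
Y <- rnorm(N) + 0.5 * Z

# Test statistic: difference in means between treated and control units.
diff_in_means <- function(y, z) mean(y[z == 1]) - mean(y[z == 0])
observed <- diff_in_means(Y, Z)

# Under the sharp null of no effect for any unit, Y does not change with the
# assignment, so we re-draw Z (shuffling keeps exactly 100 treated) and
# recompute the statistic on the same outcomes.
sims <- 2000
simulated <- replicate(sims, diff_in_means(Y, sample(Z)))

# Two-tailed p-value: share of simulated statistics at least as large in
# absolute value as the observed one.
p_two_tailed <- mean(abs(simulated) >= abs(observed))
p_two_tailed
```

Shuffling the observed `Z` is a valid re-draw only because the design is complete random assignment with a fixed number of treated units; other designs need their own re-draw step.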


You can obtain one-sided p-values with a call to `summary`:

`summary(ri_out, p = "upper")`

```
##   term estimate upper_p_value
## 1    Z    41.98        0.0564
```

`summary(ri_out, p = "lower")`

```
##   term estimate lower_p_value
## 1    Z    41.98        0.9436
```
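
The three kinds of p-value differ only in which tail of the simulated distribution they count. The sketch below shows one common counting convention in base R; it uses a made-up null distribution (`simulated`) and a made-up observed statistic (`observed`), neither of which comes from the datasets in this guide, and it is not taken from the `ri2` source.

```r
set.seed(7)

# Hypothetical stand-ins for the simulated null distribution and the estimate.
simulated <- rnorm(5000)
observed <- 1.2

# Upper p-value: share of simulated statistics at least as large as observed.
p_upper <- mean(simulated >= observed)

# Lower p-value: share of simulated statistics at least as small as observed.
p_lower <- mean(simulated <= observed)

# Two-tailed p-value: share at least as large in absolute value.
p_two_tailed <- mean(abs(simulated) >= abs(observed))

c(upper = p_upper, lower = p_lower, two_tailed = p_two_tailed)
```

In the output above, the upper and lower p-values (0.0564 and 0.9436) sum to one, which is what this counting rule gives when no simulated statistic exactly equals the observed one.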

The answer that `ri2` produces depends deeply on the
randomization procedure. The next example imagines that the treatment
was blocked at the school level.

```
blocked_dat <- read.csv("blocked_dat.csv")
blocked_dec <- declare_ra(blocks = blocked_dat$schools)
ri_out <-
  conduct_ri(
    Y ~ Z,
    declaration = blocked_dec,
    data = blocked_dat,
    sims = sims
  )
summary(ri_out)
```

```
##   term estimate two_tailed_p_value
## 1    Z    91.98              2e-04
```

`plot(ri_out)`
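
The reason the declaration matters for a blocked design can be sketched by hand: the re-drawn assignments have to respect the blocks. The base-R example below is only an illustration on simulated data. The layout (20 schools of 10 students, 5 treated per school), the school effects, and the helper names `assign_within_blocks` and `diff_in_means` are assumptions, not details of `blocked_dat` or of `ri2`.

```r
set.seed(1)

# Hypothetical layout: 20 schools with 10 students each.
schools <- rep(1:20, each = 10)

# Blocked assignment: treat exactly half of the students within each school.
assign_within_blocks <- function(block) {
  ave(numeric(length(block)), block,
      FUN = function(x) sample(rep(c(0, 1), length.out = length(x))))
}

Z <- assign_within_blocks(schools)
school_effect <- rnorm(20, sd = 2)[schools]  # outcomes vary a lot by school
Y <- school_effect + rnorm(200) + 0.5 * Z

diff_in_means <- function(y, z) mean(y[z == 1]) - mean(y[z == 0])
observed <- diff_in_means(Y, Z)

# Null distribution that follows the actual (blocked) procedure.
simulated <- replicate(2000, diff_in_means(Y, assign_within_blocks(schools)))

# Null distribution that wrongly ignores the blocks, for comparison.
simulated_naive <- replicate(2000, diff_in_means(Y, sample(Z)))

p_blocked <- mean(abs(simulated) >= abs(observed))
c(sd_blocked = sd(simulated), sd_naive = sd(simulated_naive))
```

Because every re-draw keeps each school balanced, school-level differences cancel out of the blocked null distribution, which is why it is narrower than the one that ignores the blocks.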


A very similar syntax accommodates a cluster randomized trial.

```
clustered_dat <- read.csv("clustered_dat.csv")
clustered_dec <- declare_ra(clusters = clustered_dat$schools)
ri_out <-
  conduct_ri(
    Y ~ Z,
    declaration = clustered_dec,
    data = clustered_dat,
    sims = sims
  )
summary(ri_out)
```

```
##   term estimate two_tailed_p_value
## 1    Z    79.32             0.0111
```

`plot(ri_out)`
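
For a clustered design, the re-draw step assigns whole clusters rather than individuals. The following base-R sketch illustrates the idea on simulated data; the layout (20 schools of 10 students, 10 schools treated), the school effects, and the helper `assign_clusters` are assumptions for illustration, not a description of `clustered_dat` or of how `ri2` works internally.

```r
set.seed(2)

# Hypothetical layout: 20 schools with 10 students each.
schools <- rep(1:20, each = 10)

# Clustered assignment: pick half of the schools, then treat every student
# in the selected schools.
assign_clusters <- function(cluster) {
  ids <- unique(cluster)
  treated_ids <- sample(ids, size = length(ids) / 2)
  as.numeric(cluster %in% treated_ids)
}

Z <- assign_clusters(schools)
Y <- rnorm(20)[schools] + rnorm(200) + 0.5 * Z  # shared school-level shocks

diff_in_means <- function(y, z) mean(y[z == 1]) - mean(y[z == 0])
observed <- diff_in_means(Y, Z)

# Null distribution that follows the actual (clustered) procedure.
simulated <- replicate(2000, diff_in_means(Y, assign_clusters(schools)))

# Null distribution that wrongly shuffles individual students, for comparison.
simulated_naive <- replicate(2000, diff_in_means(Y, sample(Z)))

p_clustered <- mean(abs(simulated) >= abs(observed))
c(sd_clustered = sd(simulated), sd_naive = sd(simulated_naive))
```

Here the clustered null distribution is wider than the naive one, so treating the data as individually randomized would overstate the evidence against the null.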


Covariate adjustment can often produce large gains in precision. To
analyze an experiment with covariate adjustment, simply include the
covariates in the `formula` argument of `conduct_ri`:

```
complete_dec <- declare_ra(N = 200)
ri_out <-
  conduct_ri(
    Y ~ Z + PSAT,
    declaration = complete_dec,
    data = complete_dat,
    sims = sims
  )
summary(ri_out)
```

```
##   term estimate two_tailed_p_value
## 1    Z 59.27132                  0
```

`plot(ri_out)`
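
A small simulation can show where the precision gain comes from. In the base-R sketch below, the test statistic is the coefficient on treatment from a regression that includes a prognostic covariate, and its null distribution is compared with that of the unadjusted difference in means. The data-generating process, the simulated `PSAT` variable, and the helpers `adjusted` and `unadjusted` are all assumptions made for illustration.

```r
set.seed(3)
N <- 200

# Hypothetical data: a covariate that strongly predicts the outcome.
PSAT <- rnorm(N, mean = 50, sd = 10)
Z <- sample(rep(c(0, 1), each = N / 2))
Y <- 2 * PSAT + 5 * Z + rnorm(N, sd = 5)

# Adjusted statistic: coefficient on treatment, controlling for PSAT.
adjusted <- function(z) coef(lm(Y ~ z + PSAT))[["z"]]

# Unadjusted statistic: simple difference in means.
unadjusted <- function(z) mean(Y[z == 1]) - mean(Y[z == 0])

observed <- adjusted(Z)

# Re-draw the assignment and recompute each statistic on the fixed outcomes.
simulated_adj <- replicate(1000, adjusted(sample(Z)))
simulated_unadj <- replicate(1000, unadjusted(sample(Z)))

p_adjusted <- mean(abs(simulated_adj) >= abs(observed))
c(sd_adjusted = sd(simulated_adj), sd_unadjusted = sd(simulated_unadj))
```

The adjusted statistic has a much tighter null distribution in this setup, so the same true effect stands out more clearly against it.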


You can use randomization inference to conduct a balance test (or
randomization check). In this case, we write a function of `data`
that returns some balance statistic (here, the F-statistic from a
regression of the treatment assignment on two covariates).

```
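balance_fun <- function(data) {
  # F-statistic from regressing treatment on the covariates
  # (`$fstatistic` spelled out rather than relying on partial matching of `$f`)
  summary(lm(Z ~ professionalism + PSAT, data = data))$fstatistic[1]
}
ri_out <-
  conduct_ri(
    test_function = balance_fun,
    declaration = complete_dec,
    data = complete_dat,
    sims = sims
  )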
```

```
## Warning in data.frame(est_sim = test_stat_sim, est_obs = test_stat_obs, : row
## names were found from a short variable and have been discarded
```

`summary(ri_out)`

```
##                    term  estimate two_tailed_p_value
## 1 Custom Test Statistic 0.2924994             0.7489
```

`plot(ri_out)`
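
The same logic can be written out by hand for a balance test. This base-R sketch uses simulated covariates, not `complete_dat`; the data frame `covs` and the helper `f_stat` are assumptions for illustration. Because an F-statistic is never negative, counting simulated values that are at least as large in absolute value is the same as counting values that are at least as large, so the two-tailed and upper p-values coincide under that counting rule.

```r
set.seed(4)
N <- 200

# Hypothetical pre-treatment covariates, unrelated to assignment by construction.
covs <- data.frame(
  professionalism = rnorm(N),
  PSAT = rnorm(N, mean = 50, sd = 10)
)
Z <- sample(rep(c(0, 1), each = N / 2))

# Balance statistic: F-statistic from regressing treatment on the covariates.
f_stat <- function(z) {
  fit <- lm(z ~ professionalism + PSAT, data = covs)
  unname(summary(fit)$fstatistic[1])
}

observed <- f_stat(Z)

# Re-draw the assignment and recompute the F-statistic each time.
simulated <- replicate(1000, f_stat(sample(Z)))

p_upper <- mean(simulated >= observed)
p_two_tailed <- mean(abs(simulated) >= abs(observed))
c(upper = p_upper, two_tailed = p_two_tailed)
```

A large p-value here means the observed imbalance is typical of what the randomization procedure produces by chance.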
