Note
The fixes package currently supports data with annual time intervals only.
For datasets with finer time intervals, such as monthly or quarterly data, I recommend creating a new column with sequential time numbers (e.g., 1, 2, 3, …) representing the time order.
This column can then be used for analysis.
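As a minimal sketch of that workaround, the base R snippet below builds a sequential time index from separate year and month columns. The data frame and the column names (id, month, time_index) are hypothetical and only serve as an illustration; the treatment timing would then presumably be given on the same index scale.

```r
# Toy monthly panel: two hypothetical units observed from 2019-11 to 2020-02
df <- data.frame(
  id    = rep(1:2, each = 4),
  year  = rep(c(2019, 2019, 2020, 2020), times = 2),
  month = rep(c(11, 12, 1, 2), times = 2)
)

# Count months on a common scale, then shift so the earliest month is 1
months_total  <- df$year * 12 + df$month
df$time_index <- months_total - min(months_total) + 1

df
```

Counting months on a common scale (rather than ranking the observed dates) keeps the spacing correct even if some months are missing from the data.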
The fixes package is designed for conducting analysis and creating plots for event studies, a method used to verify the parallel trends assumption in two-way fixed effects (TWFE) difference-in-differences (DID) analysis.
The package includes two main functions:
- run_es(): Accepts a data frame, generates lead and lag variables, and performs event study analysis. The function returns the results as a data frame.
- plot_es(): Creates plots using ggplot2 based on the data frame generated by run_es(). Users can choose between a plot with geom_ribbon() or geom_errorbar() to visualize the results.

You can install the package like so:
# install.packages("pak")
pak::pak("fixes")
or
install.packages("fixes")
If you want to install the development version, install it from the GitHub repository:
pak::pak("yo5uke/fixes")
First, load the library.
library(fixes)
The data frame to be analyzed must include the following variables:
- A dummy variable indicating treated units (e.g., is_treated).
- A time variable (e.g., year).

For example, a data frame like the following:
firm_id | state_id | year | is_treated | y |
---|---|---|---|---|
1 | 21 | 1980 | 1 | 0.8342158 |
1 | 21 | 1981 | 1 | -0.5354355 |
1 | 21 | 1982 | 1 | 1.1372828 |
1 | 21 | 1983 | 1 | 0.7339165 |
1 | 21 | 1984 | 1 | 1.4232840 |
1 | 21 | 1985 | 1 | 1.2783362 |
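If you want something to experiment with, a data frame of this shape can be simulated in base R. The sketch below is only an illustration: the number of firms, the state codes, the years, and the size of the treatment effect are all made-up assumptions, and is_treated is taken to be a unit-level dummy that does not vary over time, as in the table above.

```r
set.seed(1)

n_firms <- 6
years   <- 1995:2002

# One row per firm-year
df <- expand.grid(year = years, firm_id = seq_len(n_firms))

# Hypothetical state codes: firms 1-3 are in state 21, firms 4-6 in state 22
df$state_id <- ifelse(df$firm_id <= 3, 21, 22)

# Unit-level treatment dummy: firms in state 21 are the treated group
df$is_treated <- as.integer(df$state_id == 21)

# Outcome: noise plus an assumed effect of 0.5 for treated firms from 1998 on
df$y <- rnorm(nrow(df)) + 0.5 * df$is_treated * (df$year >= 1998)

df <- df[, c("firm_id", "state_id", "year", "is_treated", "y")]
head(df)
```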
run_es()
run_es() takes 12 arguments, including required variables and optional specifications like covariates and clustering.
Argument | Description |
---|---|
data | Data frame to be used. |
outcome | Outcome variable. Can be specified as a raw variable or a transformation (e.g., log(y)). Provide it unquoted. |
treatment | Dummy variable indicating the treated units. Provide it unquoted. Accepts both 0/1 and TRUE/FALSE. |
time | Time variable. Provide it unquoted. |
timing | Time value indicating when the treatment occurs. |
lead_range | Number of pre-treatment periods to include (e.g., 3 = lead3, lead2, lead1). |
lag_range | Number of post-treatment periods to include (e.g., 2 = lag0, lag1, lag2). |
covariates | Additional covariates to include in the regression. Must be a one-sided formula (e.g., ~ x1 + x2). |
fe | Fixed effects to control for unobserved heterogeneity. Must be a one-sided formula (e.g., ~ id + year). |
cluster | Specifies clustering for standard errors. Can be a character vector (e.g., c("id", "year")) or a formula (e.g., ~ id + year, ~ id^year). |
baseline | Relative time value to be used as the reference category. The corresponding dummy is excluded from the regression. Must be within the specified lead/lag range. |
interval | Time interval between observations (e.g., 1 for yearly data, 5 for 5-year intervals). |
event_study <- run_es(
  data = df,
  outcome = y,
  treatment = is_treated,
  time = year,
  timing = 1998,
  lead_range = 5,
  lag_range = 5,
  fe = ~ firm_id + year,
  cluster = ~ state_id,
  baseline = -1,
  interval = 1
)
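To make timing, lead_range, lag_range, baseline, and interval more concrete, here is a rough base R sketch of how calendar years map to relative time and to lead/lag labels. It is not the package's internal code, only an illustration built from the argument descriptions above; the window data frame and its column names are hypothetical, and how run_es() handles years outside the window is not covered here.

```r
years      <- 1990:2005
timing     <- 1998
interval   <- 1
lead_range <- 5
lag_range  <- 5
baseline   <- -1

# Relative time: 0 in the treatment year, negative before, positive after
rel_time <- (years - timing) / interval

# Keep only periods inside the requested lead/lag window
in_window <- rel_time >= -lead_range & rel_time <= lag_range

# Label pre-treatment periods lead5..lead1 and post-treatment periods lag0..lag5
term <- ifelse(rel_time < 0,
               paste0("lead", abs(rel_time)),
               paste0("lag", rel_time))

window <- data.frame(
  year     = years[in_window],
  rel_time = rel_time[in_window],
  term     = term[in_window]
)

# The baseline period is the reference category, so its dummy is left out
window$is_baseline <- window$rel_time == baseline
window
```

With these settings the window spans 1993 to 2003, and the reference period (baseline = -1) corresponds to lead1, that is, 1997.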
Note: The fe argument must be specified as a one-sided formula (e.g., ~ firm_id + year). The cluster argument can be specified either as a one-sided formula (e.g., ~ state_id) or as a character vector (e.g., c("firm_id", "year")).
The run_es() function returns a tidy data frame with estimated event-study coefficients, confidence intervals, and metadata such as relative timing and baseline identification [1].
event_study <- run_es(
  data = df,
  outcome = y,
  treatment = is_treated,
  time = year,
  timing = 1998,
  lead_range = 5,
  lag_range = 5,
  covariates = ~ cov1 + cov2 + cov3,
  fe = ~ firm_id + year,
  cluster = ~ state_id,
  baseline = -1,
  interval = 1
)
You can use this result to create custom plots, or take advantage of the built-in plot_es() function to visualize the estimates and confidence intervals with minimal code.
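For the custom-plot route, a sketch along the following lines may be a useful starting point. It does not call run_es(); instead it uses a small made-up data frame as a stand-in for the result, and the column names (relative_time, estimate, conf_low, conf_high) are assumptions, so check the names in your actual output before adapting it.

```r
library(ggplot2)

# Hypothetical stand-in for a run_es() result; values and column names
# are illustrative assumptions, not real estimates
es <- data.frame(
  relative_time = -3:3,
  estimate      = c(0.05, -0.02, 0.00, 0.30, 0.45, 0.50, 0.48),
  std_error     = c(0.10, 0.09, 0.00, 0.10, 0.11, 0.12, 0.13)
)

# Approximate 95% confidence bounds from the standard errors
es$conf_low  <- es$estimate - 1.96 * es$std_error
es$conf_high <- es$estimate + 1.96 * es$std_error

p <- ggplot(es, aes(x = relative_time, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_ribbon(aes(ymin = conf_low, ymax = conf_high), alpha = 0.2) +
  geom_line() +
  geom_point()

p
```

Swapping geom_ribbon() for geom_errorbar(aes(ymin = conf_low, ymax = conf_high), width = 0.2) gives the error-bar style instead.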
plot_es()
The plot_es() function creates a plot based on ggplot2.
plot_es() has 12 arguments.
Arguments | Description |
---|---|
data | Data frame created by run_es() |
type | The type of confidence interval visualization: "ribbon" (default) or "errorbar" |
vline_val | The x-intercept for the vertical reference line (default: 0) |
vline_color | Color for the vertical reference line (default: "#000") |
hline_val | The y-intercept for the horizontal reference line (default: 0) |
hline_color | Color for the horizontal reference line (default: "#000") |
linewidth | The width of the lines for the plot (default: 1) |
pointsize | The size of the points for the estimates (default: 2) |
alpha | The transparency level for ribbons (default: 0.2) |
barwidth | The width of the error bars (default: 0.2) |
color | The color for the lines and points (default: "#B25D91FF") |
fill | The fill color for ribbons (default: "#B25D91FF") |
If you don’t care about the details, you can just pass the data frame created with run_es() and the plot will be complete.
plot_es(event_study)
plot_es(event_study, type = "errorbar")
plot_es(event_study, type = "errorbar", vline_val = -.5)
Because the plot is built with ggplot2, you can adjust its details by adding further ggplot2 layers.
plot_es(event_study, type = "errorbar") +
  ggplot2::scale_x_continuous(breaks = seq(-5, 5, by = 1)) +
  ggplot2::ggtitle("Result of Event Study")
- […] plot_es() […] (e.g., conf_level = 0.90)
- […] (e.g., facet_by = "group")

If you find an issue, please report it on the GitHub Issues page.
[1] Behind the scenes, estimation is performed using fixest::feols().