The edstan package for R provides convenience functions and predefined Stan models related to item response theory (IRT). Its purpose is to make fitting common IRT models using Stan easy. edstan relies on the rstan package, which should be installed first. See here for instructions on installing rstan.
The following table lists the models packaged with edstan. Each of these may optionally included a latent regression of ability. The table includes also links to case studies for the models, though it should be noted that the Stan code and prior distributions have changed somewhat this they were written.
Model | Stan file |
---|---|
Rasch | rasch_latent_reg.stan |
Partial credit | pcm_latent_reg.stan |
Rating scale | rsm_latent_reg.stan |
Two-parameter logistic | 2pl_latent_reg.stan |
Generalized partial credit | gpcm_latent_reg.stan |
Generalized rating scale | grsm_latent_reg.stan |
The next table lists the functions packaged with edstan.
Function | Purpose |
---|---|
irt_data() |
Prepares data for fitting |
irt_stan() |
Wrapper for Running MCMC |
print_irt_stan() |
Show table of output |
stan_columns_plot() |
Create plot of convergence statistics |
labelled_integer() |
Create vector of consecutive integers |
rescale_binary() |
Appropriately scale binary person covariates |
rescale_continuous() |
Appropriately scale continuous person covariates |
edstan_model_code() |
Print the Stan file for a given model |
The R code below first loads edstan, which implicitly loads rstan. Then the two commands that follow set options related to rstan, which are generally recommended. The first causes compiled Stan models to be saved to disc, which allows models to run more quickly after the first time. The second causes Stan to run multiple MCMC chains in parallel.
Next we preview the spelling data set.
The data set is a response matrix indicating whether the 658
respondents spelled four words correctly. It also includes a dummy
variable for whether the respondent is male. Data is fed into Stan in
list form, and the next block of code demonstrates how a suitable data
list may be constructed using the irt_data()
function. The
response matrix (that is, all columns but the first) is provided as the
response_matrix
option.
This data list may be fed to the irt_stan()
function to
fit a Rasch or 2PL model, but for this example we will add a person
covariate to the model using the optional arguments covariates
and formula. The covariates option takes a data frame
of person-related covariates and the formula option takes a
formula to be applied to that data frame. The left side of the formula
should be left empty, though implicitly it is latent ability. The choice
of what to include in the latent regression (if anything) is made when
calling irt_data()
to assemble a data list.
In the next block a data list is made that includes male
as a covariate. We use the rescale_binary()
function to
rescale this covariate for compatibility with the prior distributions
specified in edstan models.
# Make a data list with person covariates
latent_reg_list <- irt_data(
response_matrix = spelling[, -1],
covariates = spelling,
formula = ~ rescale_binary(male)
)
str(latent_reg_list)
The only difference between the two lists are K
and
W
. The first list has W
with \(K=1\) columns, which is a constant for the
model intercept. The second list has W
with \(K=2\) columns, which correspond to the
constant for the model intercept and the male indicator variable. In
either list, W
has one row per person.
Next we fit the model using the irt_stan()
function,
which is a wrapper for rstan::stan()
.
latent_reg_list
is provided as the data, and the name of
one of the .stan files to use is provided as the model
argument. We also choose the number of chains and number of iterations
per chain. By default the first half of each chain is discarded as warm
up.
# Fit the Rasch model
fit_rasch <- irt_stan(latent_reg_list, model = "rasch_latent_reg.stan",
iter = 2000, chains = 4)
The stan_columns_plot()
function shows convergence
statistics for the fitted model. As an aside, it may also be used to
show other statistics such as posterior means or numbers of effective
samples.
print_irt_stan()
provides a summary of parameter
posteriors. It is a wrapper for the print()
method for
fitted Stan models, selecting and organizing the most interesting
parameters. Labels for the items are automatically taken from the column
names of the response matrix. Regarding the parameters in the output:
beta refers to item difficulties, lambda refers to
latent regression coefficients, and sigma is the standard
deviation of the ability distribution.
A convenient way to view the Stan code for the model is the
edstan_model_code()
function.
If we prefer to fit the 2PL model with latent regression, we could
call irt_stan()
and select that .stan file.
The second example will use the verbal aggression data to fit a polytomous item response theory model.
The data consists of 316 persons responding to 24 items with
responses scored as 0, 1, or 2. The variable person
is a
person ID, item
is an item ID, and poly
is the
scored response. In this way, the data may be said to be in “long form”
as there is one row per person-item combination. The data includes two
person covariates: male
is a dummy variable for whether the
respondent is male, and anger
is the person’s trait anger
raw score, a separate measure.
Next we make the data list using irt_data()
. However,
we’ll use a different set of arguments because the data are in long
form: y
takes the responses in vector form, ii
takes a vector of item IDs, and jj
takes a vector of person
IDs. The use of covariates and formula are the same as
before. For the latent regression part, we use the
rescale_binary()
and rescale_continuous()
functions to rescale the covariates for compatibility with the prior
distributions specified in edstan models.
# Make the data list
agg_list <- irt_data(
y = aggression$poly,
ii = aggression$description,
jj = aggression$person,
covariates = aggression,
formula = ~ rescale_binary(male) * rescale_continuous(anger)
)
str(agg_list)
The generalized partial credit model is fit to the these data.
# Fit the generalized partial credit model
fit_gpcm <- irt_stan(agg_list, model = "gpcm_latent_reg.stan",
iter = 2000, chains = 4)
As before, a plot of convergence statistics is made.
And lastly we obtain a table of posterior summaries. Here, alpha refers to discrimination parameters, beta refers to step difficulty parameters, and lambda refers to latent regression coefficients.
Users will be able to fit the edstan models without full knowledge of the technical details, though these are provided in this section. All that is really needed for interpreting results is to know the meanings assigned to the Greek letters.
Variables and parameters are similar across edstan models. The variables used are:
The parameters used are:
The .stan files and the notation for the models below closely adhere to these conventions.
rasch_latent_reg.stan
\[ \mathrm{logit} [ \Pr(y_{ij} = 1 | \theta_j, \beta_i) ] = \theta_j - \beta_i \]
pcm_latent_reg.stan
\[ \Pr(Y_{ij} = y,~y > 0 | \theta_j, \beta_i) = \frac{\exp \sum_{s=1}^y (\theta_j - \beta_{is})} {1 + \sum_{k=1}^{m_i} \exp \sum_{s=1}^k (\theta_j - \beta_{is})} \]
\[ \Pr(Y_{ij} = y,~y = 0 | \theta_j, \beta_i) = \frac{1} {1 + \sum_{k=1}^{m_i} \exp \sum_{s=1}^k (\theta_j - \beta_{is})} \]
rsm_latent_reg.stan
\[ \Pr(Y_{ij} = y,~y > 0 | \theta_j, \beta_i, \kappa_s) = \frac{\exp \sum_{s=1}^y (\theta_j - \beta_i - \kappa_s)} {1 + \sum_{k=1}^{m} \exp \sum_{s=1}^k (\theta_j - \beta_i - \kappa_s)} \]
\[ \Pr(Y_{ij} = y,~y = 0 | \theta_j, \beta_i, \kappa_s) = \frac{1} {1 + \sum_{k=1}^{m} \exp \sum_{s=1}^k (\theta_j - \beta_i - \kappa_s)} \]
2pl_latent_reg.stan
\[ \mathrm{logit} [ \Pr(y_{ij} = 1 | \alpha_i, \beta_i, \theta_j) ] = \alpha_i \theta_j - \beta_i \]
gpcm_latent_reg.stan
\[ \Pr(Y_{ij} = y,~y > 0 | \theta_j, \alpha_i, \beta_i) = \frac{\exp \sum_{s=1}^y (\alpha_i \theta_j - \beta_{is})} {1 + \sum_{k=1}^{m_i} \exp \sum_{s=1}^k (\alpha_i \theta_j - \beta_{is})} \]
\[ \Pr(Y_{ij} = y,~y = 0 | \theta_j, \alpha_i, \beta_i) = \frac{1} {1 + \sum_{k=1}^{m_i} \exp \sum_{s=1}^k (\alpha_i \theta_j + w_{j}' \lambda - \beta_{is})} \]
grsm_latent_reg.stan
\[ \Pr(Y_{ij} = y,~y > 0 | \theta_j, \lambda, \alpha_i, \beta_i, \kappa_s) = \frac{\exp \sum_{s=1}^y (\alpha_i \theta_j - \beta_i - \kappa_s)} {1 + \sum_{k=1}^{m} \exp \sum_{s=1}^k (\alpha_i \theta_j - \beta_i - \kappa_s)} \]
\[ \Pr(Y_{ij} = y,~y = 0 | \theta_j, \lambda, \alpha_i, \beta_i, \kappa_s) = \frac{1} {1 + \sum_{k=1}^{m} \exp \sum_{s=1}^k (\alpha_i \theta_j - \beta_i - \kappa_s)} \]
For Rasch family models, the prior distributions for the person-related parameters are
For models with discrimination parameters, the priors are
In either case, the prior for \(\lambda\) is applied to centered and scaled versions of the person covariates. Specifically: (1) continuous covariates are mean-centered and then divided by two times their standard deviations, (2) binary covariates are mean-centered and divided their maximum minus minimum values, and (3) no change is made to the constant, set to one, for the model intercept. \(t_7\) is the Student’s \(t\) distribution with seven degrees of freedom.
The priors for the item parameters are
It is expected that edstan users will eventually want to write their
own Stan models. In this case, the edstan functions may still be useful.
If the user-written models use the same conventions for naming variables
and parameters as the edstan models, then the irt_data()
and print_irt_stan()
functions will work just fine for the
user-written models. The function stan_columns_plot()
works
with any Stan model, and labelled_integer()
may help in
data preparation for Stan models in general.