10.2 Monte Carlo Approach Using R

Johnson and Sinharay (2018) fit a loglinear CDM to a set of data from the Examination for the Certificate of Proficiency in English (ECPE) Grammar Test, which includes the responses of 2,922 examinees to 28 items measuring knowledge of (1) morphosyntactic rules, (2) cohesive rules, and (3) lexical rules. The data set and Q-matrix can be obtained by the following code:

library(GDINA)
Y <- ecpe$dat
Q <- ecpe$Q

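To apply the Monte Carlo method, we first fit the saturated G-DINA model to the data (Step 1). The saturated G-DINA model is equivalent to the loglinear CDM, which is its logit-link form, so the call below leaves the link function at the package default: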

lcdm <- GDINA::GDINA(Y, Q, model = "GDINA", verbose = 0)

In Step 2, the estimated item parameters and latent class proportions are extracted as follows:

# item parameter estimates
item.parameters <- coef(lcdm)
# proportion of latent classes
phat <- c(GDINA::extract(lcdm, "posterior.prob"))
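
Before simulating, it can help to confirm the shape of these two objects. The sketch below refits the model so that it runs on its own; the names `fit`, `item_par`, `class_prop`, and `n_required` are illustrative and not part of the original analysis. It assumes that `coef()` returns, for each item, one success probability per combination of the attributes that item requires.

```r
library(GDINA)

# Refit the saturated G-DINA model to the ECPE data (same call as in Step 1)
fit <- GDINA::GDINA(ecpe$dat, ecpe$Q, model = "GDINA", verbose = 0)

# Item parameters: a list with one vector of success probabilities per item
item_par <- coef(fit)
# Latent class proportions, flattened to a plain vector
class_prop <- c(GDINA::extract(fit, "posterior.prob"))

# Number of attributes each item requires according to the Q-matrix
n_required <- rowSums(as.matrix(ecpe$Q))

# An item requiring k attributes should have 2^k success probabilities
stopifnot(length(item_par) == nrow(ecpe$Q))
stopifnot(all(lengths(item_par) == 2^n_required))

# Three attributes give 2^3 = 8 latent classes whose proportions sum to 1
stopifnot(length(class_prop) == 2^ncol(ecpe$Q))
stopifnot(abs(sum(class_prop) - 1) < 1e-6)
```

If any of these checks fails, the inputs passed to the simulation step are probably not in the form it expects.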

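In Step 3, we simulate responses for 100,000 examinees from the fitted model, drawing latent classes from the estimated proportions: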

set.seed(12345)
sim <- simGDINA(N = 1e+05, Q = Q, catprob.parm = item.parameters, att.prior = phat, att.dist = "categorical")
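
To make the simulation step less of a black box, here is a minimal base-R sketch of the same idea on a toy problem. It is not how `simGDINA` is implemented internally; all values (`class_prob`, `p_correct`, `N`) are hypothetical and chosen only for illustration. Each simulee first receives a latent class drawn from a categorical distribution, and then each item response is a Bernoulli draw with that class's success probability.

```r
set.seed(1)

# Hypothetical toy setup: 2 attributes (4 latent classes) and 3 items
patterns <- as.matrix(expand.grid(A1 = 0:1, A2 = 0:1))  # rows: 00, 10, 01, 11
class_prob <- c(0.4, 0.2, 0.1, 0.3)                      # hypothetical class proportions

# P(correct | latent class): rows are items, columns are latent classes
p_correct <- rbind(
  item1 = c(0.2, 0.8, 0.2, 0.8),  # depends on A1 only
  item2 = c(0.2, 0.2, 0.8, 0.8),  # depends on A2 only
  item3 = c(0.1, 0.3, 0.3, 0.9)   # depends on both attributes
)

N <- 1000  # hypothetical number of simulees

# Draw each simulee's latent class from the categorical distribution
true_class <- sample.int(nrow(patterns), N, replace = TRUE, prob = class_prob)

# N x 3 matrix of success probabilities, one row per simulee
P <- t(p_correct[, true_class])

# Bernoulli draws, filled column by column to match the layout of P
responses <- matrix(rbinom(length(P), size = 1, prob = P), nrow = N,
                    dimnames = list(NULL, rownames(p_correct)))

head(cbind(true_class, responses))
```

Because the true class of every simulee is known, any classification rule applied to `responses` can be scored exactly, which is what makes the Monte Carlo approach work.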

Step 4 estimates the attribute profiles of the simulated examinees while holding the item parameters fixed at their Step 2 values:

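# simulated item responses, stored separately so the observed data Y are not overwritten
Y.sim <- GDINA::extract(sim, what = "dat")
# maxitr = 0 runs no EM iterations, so the supplied item parameters and class proportions are used as given
est <- GDINA::GDINA(dat = Y.sim, Q = Q, catprob.parm = item.parameters, att.prior = phat, control = list(maxitr = 0))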
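
With the parameters fixed, classification reduces to an application of Bayes' rule. The following base-R sketch shows the computation on a toy example; the values in `class_prob`, `p_correct`, and `responses` are hypothetical, and the code illustrates the principle rather than the package's internal routine. The log posterior of each class is the log prior plus the log-likelihood of the response vector under that class, and the MAP estimate is the class that maximizes it.

```r
# Hypothetical toy values: 4 latent classes, 3 items
class_prob <- c(0.4, 0.2, 0.1, 0.3)
p_correct <- rbind(
  c(0.2, 0.8, 0.2, 0.8),
  c(0.2, 0.2, 0.8, 0.8),
  c(0.1, 0.3, 0.3, 0.9)
)  # rows are items, columns are latent classes

# Three hypothetical response vectors (rows are examinees, columns are items)
responses <- rbind(c(0, 0, 0), c(1, 0, 0), c(1, 1, 1))

# Log-likelihood of each response vector under each class (examinees x classes)
loglik <- responses %*% log(p_correct) + (1 - responses) %*% log(1 - p_correct)

# Unnormalized log posterior: add the log prior of each class to its column
log_post <- sweep(loglik, 2, log(class_prob), "+")

# Normalized posterior probabilities; subtracting the row maximum avoids underflow
post <- exp(log_post - apply(log_post, 1, max))
post <- post / rowSums(post)

# MAP classification: the class with the largest posterior for each examinee
map_class <- apply(log_post, 1, which.max)
map_class
```

For these toy values, the three response vectors are assigned to classes 1, 2, and 4, respectively.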

Step 5 compares the estimated attribute profiles with the simulated ones.

# individual attribute level accuracy based on MAP estimation
colMeans(GDINA::extract(sim, "attribute") == GDINA::personparm(est, "MAP")[, 1:3])

##     A1     A2     A3 
## 0.9127 0.8680 0.9338
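
Because these accuracies are proportions computed from a finite simulated sample, they carry Monte Carlo error. A rough way to gauge it, assuming independent simulees and a binomial approximation, is sketched below using the values printed above; the names `acc`, `mc_se`, and `ci` are illustrative.

```r
# Attribute-level accuracies reported above and the simulated sample size
acc <- c(A1 = 0.9127, A2 = 0.8680, A3 = 0.9338)
N <- 1e5

# Binomial standard error of each simulated proportion
mc_se <- sqrt(acc * (1 - acc) / N)
round(mc_se, 4)

# Approximate 95% Monte Carlo intervals
ci <- cbind(lower = acc - 1.96 * mc_se, upper = acc + 1.96 * mc_se)
round(ci, 4)
```

The standard errors are on the order of 0.001, so the simulation sample is large enough for the third decimal place to be roughly stable. This reflects simulation error only; it does not account for uncertainty in the estimated item parameters and class proportions.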

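The code below assesses the pattern-level accuracy of the MAP classification separately for each true latent class. As an exercise, explore why it works. As a hint, sim$att.group stores the index of each simulee's true latent class, and which.max applied to each row of GDINA::indlogPost(est) returns the index of the class with the highest posterior probability: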

# attribute pattern level accuracy
aggregate(sim$att.group == apply(GDINA::indlogPost(est), 1, which.max), list(sim$att.group),
    mean)

##   Group.1       x
## 1       1 0.91431
## 2       2 0.05366
## 3       3 0.00000
## 4       4 0.46246
## 5       5 0.44890
## 6       6 0.19371
## 7       7 0.63537
## 8       8 0.91974
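
To read this table, each Group.1 index needs to be mapped to an attribute pattern. The sketch below assumes that `simGDINA` numbers the latent classes in the row order of `GDINA::attributepattern()`; that ordering is an assumption and is worth confirming against `sim$attribute` before relying on it. The accuracy values are copied from the output above, and `pattern_table` is an illustrative name.

```r
library(GDINA)

# All 2^3 attribute patterns, in the row order used by attributepattern()
patterns <- GDINA::attributepattern(3)
colnames(patterns) <- c("A1", "A2", "A3")

# Pattern-level accuracies copied from the aggregate() output above
accuracy <- c(0.91431, 0.05366, 0.00000, 0.46246,
              0.44890, 0.19371, 0.63537, 0.91974)

# Assumption: Group.1 index i corresponds to row i of `patterns`
pattern_table <- data.frame(group = seq_len(nrow(patterns)), patterns, accuracy)
pattern_table
```

Under this assumption, the two best-recovered classes are the first and the last, that is, examinees who have mastered none or all of the three attributes.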