7.1 Relative Model-Data Fit at Test Level

Let \(L(\mathbf{Y})\) be the likelihood of the observed item response vectors of the \(N\) students.

Theoretically, a model with a larger \(L(\mathbf{Y})\) or, equivalently, a smaller \(-2\log L(\mathbf{Y})\) is preferred because it makes the observed data more likely to occur.

However, under this rule, a model with more parameters is usually "preferred", which leads to overfitting.

To counter this, we can add a penalty term to \(-2\log L(\mathbf{Y})\) that grows with the number of model parameters.
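
As an illustration of how such a penalty changes the comparison, the sketch below uses two widely used penalized criteria, AIC (\(-2\log L + 2P\)) and BIC (\(-2\log L + P\log N\)), where \(P\) is the number of model parameters. The choice of these two criteria, the function name `penalized_fit`, and all numbers are assumptions made for this example; they are not taken from any fitted model in the text.

```python
import math


def penalized_fit(log_lik, n_params, n_students):
    """Return -2 log L, AIC and BIC for one fitted model.

    log_lik    : maximized log-likelihood, log L(Y)
    n_params   : number of estimated model parameters, P
    n_students : sample size, N
    """
    deviance = -2.0 * log_lik
    aic = deviance + 2.0 * n_params                    # penalty: 2 per parameter
    bic = deviance + n_params * math.log(n_students)   # penalty: log(N) per parameter
    return {"deviance": deviance, "AIC": aic, "BIC": bic}


# Hypothetical values: the complex model fits slightly better
# (larger log-likelihood) but uses twice as many parameters.
simple = penalized_fit(log_lik=-5230.0, n_params=30, n_students=1000)
complex_ = penalized_fit(log_lik=-5215.0, n_params=60, n_students=1000)

for name, fit in [("simple", simple), ("complex", complex_)]:
    print(name, {k: round(v, 2) for k, v in fit.items()})
```

With these made-up numbers, the complex model has the smaller \(-2\log L(\mathbf{Y})\), yet the simple model has the smaller AIC and BIC, so the penalty reverses the preference.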