7.1 Relative Model-Data Fit at Test Level
Let \(L(\mathbf{Y})\) be the likelihood of the observed item response vectors \(\mathbf{Y}\) of \(N\) students under a given model.
In principle, a model with a larger \(L(\mathbf{Y})\), or equivalently a smaller \(-2\log L(\mathbf{Y})\), is preferred because it makes the observed data more likely.
However, under this rule a model with more parameters is usually "preferred", which leads to overfitting.
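One remedy is to add a penalty term to \(-2\log L(\mathbf{Y})\) that grows with the number of model parameters, so that extra parameters are rewarded only when they improve the fit enough.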
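
As a rough illustration of how such a penalty works, the sketch below computes two widely used penalized criteria, AIC (\(-2\log L + 2k\)) and BIC (\(-2\log L + k\log N\)), where \(k\) is the number of model parameters. The function names and all numbers are hypothetical, chosen only to show the mechanics; they are not results from any fitted model.

```python
import math


def aic(neg2_loglik: float, n_params: int) -> float:
    """AIC: -2 log L plus a penalty of 2 per parameter."""
    return neg2_loglik + 2 * n_params


def bic(neg2_loglik: float, n_params: int, n_students: int) -> float:
    """BIC: -2 log L plus a penalty of log(N) per parameter."""
    return neg2_loglik + n_params * math.log(n_students)


# Hypothetical values for two competing models fitted to the same data.
N = 1000  # number of students (assumed)
reduced = {"neg2ll": 10250.0, "k": 30}    # fewer parameters, slightly worse fit
saturated = {"neg2ll": 10200.0, "k": 80}  # more parameters, slightly better fit

aic_reduced = aic(reduced["neg2ll"], reduced["k"])
aic_saturated = aic(saturated["neg2ll"], saturated["k"])
bic_reduced = bic(reduced["neg2ll"], reduced["k"], N)
bic_saturated = bic(saturated["neg2ll"], saturated["k"], N)

print(f"AIC: reduced={aic_reduced:.2f}, saturated={aic_saturated:.2f}")
print(f"BIC: reduced={bic_reduced:.2f}, saturated={bic_saturated:.2f}")
```

With these made-up numbers, the larger model has the smaller \(-2\log L(\mathbf{Y})\), yet both penalized criteria are smaller for the model with fewer parameters, because its gain in fit does not offset the penalty for 50 additional parameters.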