The Topic Testlet Model (TTM) integrates topic modeling (Latent Dirichlet Allocation) with psychometric models (Partial Credit Model) to calibrate testlet-based assessments. This approach uses the textual content of student responses to account for local item dependence (LID) caused by shared stimuli.
This vignette demonstrates how to:
1. Pre-process item scores and textual responses.
2. Determine the optimal number of latent topics.
3. Extract person-specific topic proportions (delta).
4. Calibrate the TTM to estimate student ability (theta), topic penalties (lambda), and testlet effects (gamma).
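For readers less familiar with the psychometric half of the model, the following is a minimal sketch of category probabilities under the standard Partial Credit Model for a single item. It is not the TTM itself: the topic and testlet terms that the TTM adds are described in Xiong et al. (2025) and are not reproduced here. The helper name `pcm_probs` and the example step difficulties are hypothetical and not part of the package.

```r
# Category probabilities under the standard Partial Credit Model (PCM).
# theta: person ability; steps: step difficulties for one item.
# Hypothetical helper for illustration; not part of the package API.
pcm_probs <- function(theta, steps) {
  # Cumulative sums of (theta - step) give the log-numerators for
  # categories 1..m; category 0 has log-numerator 0 by convention.
  log_num <- c(0, cumsum(theta - steps))
  # Subtract the maximum before exponentiating for numerical stability.
  num <- exp(log_num - max(log_num))
  num / sum(num)
}

# An item scored 0-2 has two steps, matching the score range used in this vignette.
p <- pcm_probs(theta = 0.5, steps = c(-0.3, 0.8))
round(p, 3)
```

The returned vector has one probability per score category (0, 1, 2) and sums to 1.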
# Simulation parameters
N_students <- 100
J_items <- 4
K_topics_true <- 2
# A. Simulate Numeric Scores (0-2)
score_matrix <- matrix(
  sample(0:2, N_students * J_items, replace = TRUE),
  nrow = N_students,
  ncol = J_items
)
# B. Simulate Textual Essays
# Define vocabularies for two topics
vocab_topic1 <- c("logic", "reasoning", "evidence", "fact", "analysis", "data")
vocab_topic2 <- c("feeling", "story", "character", "plot", "narrative", "mood")
# Helper function to generate a random essay
generate_essay <- function() {
  words <- sample(c(vocab_topic1, vocab_topic2), size = 20, replace = TRUE)
  paste(words, collapse = " ")
}
# Create matrix of essays
essay_matrix <- matrix(
  replicate(N_students * J_items, generate_essay()),
  nrow = N_students,
  ncol = J_items
)
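Note that `generate_essay()` draws every word uniformly from both vocabularies, so the simulated essays carry no student-specific topic signal. If a simulation with recoverable topic structure is wanted, one possible variant is sketched below. The function `generate_mixed_essay`, the Beta(0.5, 0.5) mixing distribution, and the seed are illustrative assumptions and are not used elsewhere in this vignette.

```r
set.seed(1)  # illustrative seed, chosen arbitrarily

vocab_a <- c("logic", "reasoning", "evidence", "fact", "analysis", "data")
vocab_b <- c("feeling", "story", "character", "plot", "narrative", "mood")

# Hypothetical variant of generate_essay(): p_a is the probability that
# each word is drawn from the first vocabulary rather than the second.
generate_mixed_essay <- function(p_a, n_words = 20) {
  from_a <- runif(n_words) < p_a
  words <- ifelse(from_a,
                  sample(vocab_a, n_words, replace = TRUE),
                  sample(vocab_b, n_words, replace = TRUE))
  paste(words, collapse = " ")
}

# One mixing proportion per student; Beta(0.5, 0.5) pushes students
# toward one topic or the other.
p_student <- rbeta(5, 0.5, 0.5)
essays <- vapply(p_student, generate_mixed_essay, character(1))
```

Students with `p_a` near 1 write mostly "analytic" words and those near 0 mostly "narrative" words, which gives a topic model something to separate.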
# Preview the data
head(score_matrix, 3)
#> [,1] [,2] [,3] [,4]
#> [1,] 1 1 0 1
#> [2,] 1 0 2 0
#> [3,] 0 2 1 0
substr(essay_matrix[1,1], 1, 50) # First 50 chars of first essay
#> [1] "reasoning story reasoning analysis data story mood"

text_vector <- aggregate_responses(essay_matrix)
# Check the first student's aggregated text
substr(text_vector[1], 1, 60)
#> [1] "reasoning story reasoning analysis data story mood character"

# In a real analysis, you might check a wider range (e.g., 2:10)
perp_results <- ttm_perplexity(text_vector, k_range = 2:3)
#> Calculating perplexity...
#> Fitting LDA with k = 2
#> Fitting LDA with k = 3
print(perp_results)
#> k perplexity
#> 1 2 12.00455
#> 2 3 12.01140
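To put these values in context, perplexity under its usual definition is the exponential of the negative mean log-probability per token. The sketch below computes it for a baseline that assigns every word in the 12-word simulated vocabulary the same probability. How `ttm_perplexity()` computes its value internally is not shown in this vignette, so treating the two as comparable is an assumption; `perplexity_from_probs` is a hypothetical helper.

```r
# Perplexity = exp(-mean log-probability per token).
# Hypothetical helper for illustration; not part of the package API.
perplexity_from_probs <- function(token_probs) {
  exp(-mean(log(token_probs)))
}

# Baseline: every token gets probability 1/12 (two vocabularies of
# 6 words each), regardless of which document it appears in.
vocab_size <- 12
uniform_probs <- rep(1 / vocab_size, 80)  # e.g. 4 essays x 20 words
baseline <- perplexity_from_probs(uniform_probs)
baseline
```

The baseline equals the vocabulary size, 12 (up to floating-point error). Fitted perplexities of about 12.0 for both k = 2 and k = 3 therefore suggest that the topic model does no better than a uniform guess, which is what uniformly sampled essays would lead one to expect.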
# Select the K with the lowest perplexity
best_k <- perp_results$k[which.min(perp_results$perplexity)]
cat("Optimal number of topics:", best_k)
#> Optimal number of topics: 2

delta_matrix <- ttm_lda(text_vector, k = best_k)
#> Fitting LDA with k = 2
# The result is an N x K matrix
head(delta_matrix)
#> [,1] [,2]
#> [1,] 0.4968303 0.5031697
#> [2,] 0.4973594 0.5026406
#> [3,] 0.4971421 0.5028579
#> [4,] 0.4999181 0.5000819
#> [5,] 0.5029592 0.4970408
#> [6,] 0.4991809 0.5008191

# We use max_iter = 50 for speed in this vignette.
# For operational use, allow more iterations for convergence.
ttm_results <- ttm_est(
  scores = score_matrix,
  delta = delta_matrix,
  max_iter = 50
)
#> Iter: 1 | LogLik: -371.8979 | Diff: Inf
#> Iter: 2 | LogLik: -375.6258 | Diff: 3.7280
#> Iter: 3 | LogLik: -376.0733 | Diff: 0.4475
#> Iter: 4 | LogLik: -376.1024 | Diff: 0.0291
#> Iter: 5 | LogLik: -376.1017 | Diff: 0.0007
#> Iter: 6 | LogLik: -376.0989 | Diff: 0.0028
#> Iter: 7 | LogLik: -376.0989 | Diff: 0.0000
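The AIC and BIC reported for this fit can be cross-checked against the final log-likelihood using the usual definitions, AIC = 2k - 2 logLik and BIC = k log(n) - 2 logLik. The sketch below assumes n is the number of students (100) and k = 8 free parameters. The value of k is inferred from the printed statistics rather than taken from the package documentation, so treat it as an assumption.

```r
log_lik <- -376.0989  # final log-likelihood from the iteration log
n_obs   <- 100        # assumption: sample size is N_students
n_par   <- 8          # assumption: inferred from the printed AIC, not documented

aic <- 2 * n_par - 2 * log_lik
bic <- n_par * log(n_obs) - 2 * log_lik
round(c(AIC = aic, BIC = bic), 2)
```

Under these assumptions the rounded values are 768.2 and 789.04, which match the statistics printed by the vignette.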
# Model Fit Statistics
print(paste("AIC:", round(ttm_results$AIC, 2)))
#> [1] "AIC: 768.2"
print(paste("BIC:", round(ttm_results$BIC, 2)))
#> [1] "BIC: 789.04"

plot(ttm_results$theta, ttm_results$gamma,
     xlab = "Student Ability (Theta)",
     ylab = "Testlet Effect (Gamma)",
     main = "Relationship between Ability and Testlet Effect",
     pch = 19, col = rgb(0, 0, 1, 0.6))
grid()
abline(lm(ttm_results$gamma ~ ttm_results$theta), col = "red", lwd = 2)

In this simulated example, we can examine the distribution of the estimated abilities:
hist(ttm_results$theta,
     main = "Distribution of Estimated Abilities",
     xlab = "Theta",
     col = "lightblue",
     border = "white")

References
Xiong, J., Kuang, H., Tang, C., Liu, Q., Wang, B., Engelhard, G., Cohen, A. S., Xiong, X., & Sheng, R. (2025). A Topic Testlet Model for Calibrating Testlet Constructed Responses. Journal of Educational Measurement.