Notable other

When Rating Scales Fall Short: LLM-Assisted Discovery of ADHD Signals in Turkish Teacher Narratives

Baris Karacan, Irem Aktar Songur, Ahmet Ozaslan, Elvan Iseri

Published: Jun 1, 2026 — 17:22 UTC

Problem
This study addresses the inadequacy of standardized rating scales, such as the Conners’ Teacher Rating Scale-Revised Short Form (CTRS-R:S), in fully capturing ADHD-related behaviors in children. The authors highlight a gap in the literature regarding the potential of open-ended teacher narratives to provide complementary insights that structured assessments may overlook. This work is particularly relevant as it is a preprint and has not undergone peer review, indicating the need for further validation of its findings.

Method
The authors employ a large language model (LLM) to analyze de-identified Turkish teacher evaluation forms, which include both CTRS-R:S scores and open-ended narratives. The methodology involves a theme discovery pipeline that leverages NLP techniques to extract and interpret behavioral patterns from the narrative text. The study compares predictive signals derived from structured scores against those from narrative-based models, focusing on cases where structured assessments fail to differentiate between ADHD and non-ADHD students. The LLM’s architecture and specific training details are not disclosed, but the approach emphasizes the integration of qualitative data to enhance diagnostic accuracy.

Results
The findings reveal that narrative-based models successfully identify distinct behavioral patterns in cases where CTRS-R:S scores are ambiguous. Specifically, the study reports that the narrative model captures ADHD signals in instances with minimal overlap with the structured assessments, suggesting that the two modalities encode complementary information. While exact performance metrics are not provided, the qualitative analysis indicates a significant improvement in diagnostic clarity when incorporating teacher narratives alongside traditional rating scales.

Limitations
The authors acknowledge several limitations, including the reliance on a single cultural context (Turkish teacher narratives), which may affect the generalizability of the results. Additionally, the study does not quantify the performance of the LLM in terms of standard metrics such as accuracy or F1 score, which would provide a clearer benchmark against existing models. The lack of peer review also raises questions about the robustness of the findings.

Why it matters
This research underscores the importance of integrating qualitative data from teacher narratives into ADHD assessments, potentially leading to more accurate diagnoses and tailored interventions. The use of LLMs in this context opens avenues for further exploration of NLP applications in clinical settings, particularly in enhancing traditional screening tools. The implications of this work are significant for future research in ADHD diagnostics and the broader field of educational psychology, as published in arXiv.

By Callan Zhang · Jun 1, 2026 · Editorial standards →

Summarised from the primary source with AI assistance under human editorial oversight. Turing Wire is not a primary source — read the original for the authoritative account.

Source: arXiv cs.CL