Hey Chat, Can You Teach Me? Structuring Socratic Dialogue for Human Learning in the Wild
Sidney Tio, Arunesh Sinha, Pradeep Varakantham
- Published
- Jun 10, 2026 — 07:20 UTC
Problem
The paper addresses the inadequacy of current large language models (LLMs) in structured tutoring scenarios, where interactions are typically unstructured and lack a curriculum. Existing models struggle to effectively tutor students over extended sessions due to their inability to sequence content, conduct Socratic dialogue, and infer the student’s knowledge state simultaneously. This gap is particularly pronounced in informal learning contexts, where prior knowledge is not explicitly recorded. The authors highlight that merely scaling LLMs does not resolve these issues, indicating a need for a more systematic approach to tutoring.
Method
The authors propose a novel architecture that separates the responsibilities of curriculum sequencing and Socratic dialogue. The system constructs a prerequisite knowledge graph, where nodes represent subtopics and edges denote dependencies between them. This graph serves as the foundation for determining the next topic to teach. A lightweight Proximal Policy Optimization (PPO) policy is employed to make decisions on which node to teach next and how many dialogue turns to allocate for each topic. The LLM is tasked with conducting the Socratic dialogue at the selected node and providing feedback on student progress. This dual-architecture approach allows for a more structured and effective tutoring process.
Results
The proposed PPO-paired tutor was evaluated against heuristic baselines, state-of-the-art general-purpose models, and a model specifically designed for Socratic dialogue across various STEM and non-STEM topics. The results indicate that the PPO-paired tutor significantly outperforms these baselines in two key metrics: the rate at which students achieve full curriculum mastery and the number of dialogue turns required to reach that mastery. While specific numerical results are not disclosed in the abstract, the authors assert that the explicit curriculum structure leads to improvements that scaling the underlying model alone cannot achieve.
Limitations
The authors acknowledge that their approach may not generalize across all subject areas or learning contexts, as the effectiveness of the knowledge graph and PPO policy may vary depending on the complexity of the topics. Additionally, the reliance on a structured curriculum may limit flexibility in addressing spontaneous student inquiries. The paper does not address potential scalability issues when applied to larger or more diverse student populations, nor does it explore the long-term retention of knowledge acquired through this method.
Why it matters
This work has significant implications for the design of intelligent tutoring systems, particularly in informal learning environments where structured curricula are often absent. By demonstrating that explicit curriculum structuring can enhance learning outcomes beyond what is achievable through model scaling, the authors provide a framework that could inform future research in educational AI. This structured approach may lead to more effective and personalized learning experiences, as highlighted in the findings available on arXiv.
By Turing Wire editorial staff · Jun 10, 2026 · Editorial standards →
Source: arXiv cs.CL