Major foundation models

Dep-LLM: Training-Free Depression Diagnosis via Evidence-Guided Structured Multi-factor with Reliable LLM Reasoning

Yiqing Lyu, Xianbing Zhao, Buzhou Tang, Ronghuan Jiang

Published
Jun 9, 2026 — 12:44 UTC

Problem
Automatic Depression Detection (ADD) from clinical interviews is a critical task in computational mental health, yet it faces significant challenges. The primary issues include the difficulty of modeling complex, sparsely distributed depression indicators within lengthy, multi-topic dialogues, which often leads to superficial and unreliable reasoning. Additionally, the scarcity of labeled data due to clinical privacy concerns, combined with the high costs associated with training and fine-tuning models, limits the deployment of supervised ADD systems. This paper addresses these gaps by proposing a training-free framework, Dep-LLM, which leverages existing large language models (LLMs) without requiring additional training.

Method
Dep-LLM operates in three distinct stages. First, it employs a Chain-of-Thought (CoT) Depression Multi-factor Analysis module that decomposes lengthy dialogues into five clinically relevant themes, generating evidence-grounded rationales to effectively manage long-context dependencies. Second, the Confidence Analysis and Modulation module quantifies the epistemic reliability of each rationale using token-level entropy, applying intra-label and inter-theme modulation to enhance trustworthy signals while suppressing uncertain ones, all without additional training. Finally, the Collaborative Multi-factor Prediction module integrates these multi-factor signals, weighted by their confidence levels, to produce a final diagnosis. The framework utilizes frozen off-the-shelf foundation LLMs, making it both efficient and scalable.

Results
Extensive experiments were conducted on the DAIC-WOZ and E-DAIC datasets, demonstrating the effectiveness and generalizability of Dep-LLM. The framework surpassed zero-shot baselines across 21 foundation LLMs on 9 metrics, including accuracy, macro F1, and weighted-average F1. Notably, it outperformed state-of-the-art supervised domain-specific LLMs and the latest closed-source commercial LLMs, showcasing its competitive edge while requiring no additional training.

Limitations
The authors acknowledge that while Dep-LLM effectively addresses the challenges of ADD, it may still be limited by the inherent biases present in the foundation LLMs it utilizes. Additionally, the reliance on frozen models may restrict the adaptability of the framework to rapidly evolving clinical practices or emerging mental health indicators. The paper does not explore the potential impact of varying the number of themes or the specific clinical contexts in which the model is applied.

Why it matters
The implications of this work are significant for the field of computational mental health, as it provides a novel approach to depression diagnosis that circumvents the need for extensive labeled datasets and training processes. By leveraging existing LLMs in a structured manner, Dep-LLM can facilitate more accessible and efficient mental health assessments, potentially leading to improved patient outcomes. This research contributes to the growing body of literature on training-free methodologies in machine learning, as discussed in arXiv.

Turing Wire

By Turing Wire editorial staff · Jun 9, 2026 · Editorial standards →

Source: arXiv cs.CL