Notable reasoning

When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang, Yunzhi Yao, Chiyu Wu

Published
May 28, 2026 — 16:52 UTC

Problem
This paper addresses the gap in the capability of large language models (LLMs) to manage contextual beliefs over long-horizon interactions. Specifically, it introduces the concept of Contextual Belief Management (CBM), which involves determining when to update, preserve, or ignore information based on task relevance. The authors highlight that existing models struggle with this task, leading to failures in maintaining accurate belief states. The work is presented as a preprint and has not yet undergone peer review.

Method
The authors propose a benchmark called BeliefTrack, designed to evaluate CBM in a closed-world setting. This benchmark encompasses tasks such as Rule Discovery and Circuit Diagnosis, utilizing a finite belief space and symbolic verifiers for precise turn-level assessment. The study identifies three primary failure modes: Failed Stay, Failed Update, and Failed Isolation. The authors evaluate multiple LLMs, including vanilla models and those enhanced with explicit belief-tracking prompts. They introduce a reinforcement learning (RL) approach that incorporates belief-state rewards, which significantly improves performance. The training process and architecture specifics are not disclosed, but the results indicate that the RL method effectively reduces failure rates by an average of 70.9%. Additionally, representation-level steering techniques are employed, yielding a further 46.1% reduction in failure rates across two tasks.

Results
The paper reports substantial improvements in CBM performance when using the proposed RL approach compared to baseline models. Specifically, the RL method reduces failure rates by 70.9% on average across the identified failure modes. In contrast, vanilla models exhibit significant CBM failures, and explicit belief-tracking prompts provide only marginal improvements. The representation-level steering further enhances performance, achieving a 46.1% reduction in failure rates across two tasks. These results underscore the effectiveness of the proposed methods in addressing the identified shortcomings in LLMs.

Limitations
The authors acknowledge that their approach is limited to the specific tasks defined in the BeliefTrack benchmark and may not generalize to all types of long-horizon interactions. They also note that while the RL method shows promise, it requires careful tuning of reward structures, which may not be straightforward in more complex scenarios. Additionally, the paper does not explore the computational costs associated with the RL training process or the scalability of the proposed methods to larger models or more diverse tasks.

Why it matters
This work has significant implications for the development of more robust LLMs capable of managing contextual information over extended interactions. By formalizing the concept of CBM and providing a benchmark for evaluation, the authors lay the groundwork for future research aimed at improving the reliability of LLMs in dynamic environments. The findings suggest that integrating belief-state management into LLM training could enhance their performance in real-world applications, such as dialogue systems and decision-making tasks, where maintaining accurate contextual understanding is crucial.

Authors: Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang, Yunzhi Yao, Chiyu Wu, Jin Shang, Yu Gong et al.
Source: arXiv:2605.30219
URL: https://arxiv.org/abs/2605.30219v1

Turing Wire

By Turing Wire editorial staff · May 28, 2026 · Editorial standards →

Source: arXiv cs.LG