FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents
Jia Deng, Yimeng Chen, Xiaoqing Xiang, Ziyang Zeng, Shuo Tang, Wayne Xin Zhao
- Published
- Jun 10, 2026 — 13:49 UTC
Problem
The paper addresses the challenge of training deep search agents with verifiable questions that require substantial evidence for resolution. Existing methods for synthesizing search tasks often increase difficulty through complex graph structures, but this does not necessarily translate to effective search challenges due to the presence of shortcuts. The authors formalize this issue with a shortcut-aware difficulty framework, identifying four specific shortcut risks: evidence co-coverage, single-clue selectivity, exposed constants, and prior-knowledge binding. This work is presented as a preprint, indicating it has not yet undergone peer review.
Method
The authors propose FORT, a Framework of Shortcut-Resistant Training-Data Synthesis, which systematically constructs training data that mitigates the identified shortcut risks. FORT operates across four key dimensions: entity selection, evidence graph construction, question formulation, and adversarial refinement. The framework employs trajectory signatures—metrics such as solving cost, answer hit time, and prior-shortcut rate—to evaluate the effectiveness of the synthesized tasks. The resulting dataset is used to train FORT-Searcher, which utilizes supervised fine-tuning (SFT) exclusively.
Results
FORT-Searcher demonstrates superior performance on challenging deep search benchmarks compared to existing open-source search agents of comparable size. The experiments reveal that FORT leads to longer pre-answer search durations and a reduction in shortcut patterns when compared to traditional datasets. Specific performance metrics are not disclosed in the abstract, but the overall claim is that FORT-Searcher achieves the best performance among its peers, indicating a significant improvement in search task difficulty and agent training efficacy.
Limitations
The authors acknowledge that while FORT effectively reduces shortcut risks, the framework may still be susceptible to unanticipated shortcut patterns not covered by their analysis. Additionally, the reliance on supervised fine-tuning may limit the generalizability of FORT-Searcher to unsupervised or semi-supervised settings. The paper does not discuss the computational resources required for training or the scalability of the proposed methods.
Why it matters
The implications of this work are significant for the development of robust deep search agents capable of tackling complex search tasks without falling prey to shortcuts. By formalizing the shortcut risks and providing a systematic approach to mitigate them, this research lays the groundwork for future advancements in search agent training methodologies. The findings contribute to the broader discourse on improving AI search capabilities, as highlighted in related works on search task synthesis and agent training strategies, and are available for further exploration on arXiv.
By Turing Wire editorial staff · Jun 10, 2026 · Editorial standards →
Source: arXiv cs.CL