SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

Pu Ning, Quan Chen, Kun Tao, Xinyu Tang, Tianshu Wang, Qianggang Cao

Published
Jun 8, 2026 — 16:52 UTC

Problem
The paper addresses the challenge of enabling large language models (LLMs) to manage complex, long-horizon tasks that require extensive contextual understanding. Current LLMs have finite context windows, which restrict their ability to process and integrate information over extended interactions. Recent approaches decompose tasks and delegate the pieces to subagents, but the literature lacks a systematic method for training models to develop delegation intelligence: the ability to decompose tasks, decide when to delegate, and synthesize the results. The gap is most pronounced in the open-source community, where training data for these capabilities is scarce.

Method
The authors propose a framework called SearchSwarm, built around a harness that guides LLMs toward high-quality task decomposition and delegation. The harness constrains subagents to return results that the main agent can use directly, which keeps the overall workflow intact. The authors then apply supervised fine-tuning to the trajectories the harness generates, so that the delegation decisions recorded in those trajectories are learned by the model itself. The resulting model, SearchSwarm-30B-A3B, is a 30-billion-parameter LLM trained to internalize delegation intelligence. The training process and architecture are not fully disclosed, but the authors emphasize the role of the harness in generating effective training data.
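The control flow described above (decompose, decide whether to delegate, constrain what the subagent returns, synthesize, and log the decisions for later fine-tuning) can be illustrated with a short Python sketch. This is not the authors' harness: every name here (run_harness, SubResult, Trajectory, the toy_* stubs, and the max_answer_chars limit) is a hypothetical stand-in, and the stubs replace what would be LLM calls in a real system.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class SubResult:
    """What a subagent hands back: a compact answer plus sources, not its full transcript."""
    subtask: str
    answer: str
    sources: List[str]


@dataclass
class Trajectory:
    """Log of the main agent's decisions; records like this could serve as fine-tuning data."""
    task: str
    steps: List[dict] = field(default_factory=list)


def run_harness(
    task: str,
    decompose: Callable[[str], List[str]],
    should_delegate: Callable[[str], bool],
    subagent: Callable[[str], SubResult],
    solve_directly: Callable[[str], str],
    synthesize: Callable[[str, List[SubResult]], str],
    max_answer_chars: int = 200,  # assumed constraint; the paper's actual constraint is not specified here
) -> Tuple[str, Trajectory]:
    traj = Trajectory(task)
    results: List[SubResult] = []
    for sub in decompose(task):
        if should_delegate(sub):
            raw = subagent(sub)
            # Constrain the subagent's output so the main agent's context stays small.
            res = SubResult(raw.subtask, raw.answer[:max_answer_chars], raw.sources)
            action = "delegate"
        else:
            res = SubResult(sub, solve_directly(sub), [])
            action = "self"
        traj.steps.append({"subtask": sub, "action": action, "answer": res.answer})
        results.append(res)
    final = synthesize(task, results)
    traj.steps.append({"action": "synthesize", "answer": final})
    return final, traj


# Toy stand-ins for model calls (purely illustrative).
def toy_decompose(task: str) -> List[str]:
    return [part.strip() for part in task.split(" and ")]


def toy_should_delegate(sub: str) -> bool:
    return "find" in sub.lower()


def toy_subagent(sub: str) -> SubResult:
    return SubResult(sub, f"found: {sub}", ["https://example.com"])


def toy_solve(sub: str) -> str:
    return f"done: {sub}"


def toy_synthesize(task: str, results: List[SubResult]) -> str:
    return "; ".join(r.answer for r in results)


if __name__ == "__main__":
    answer, trajectory = run_harness(
        "find the release year and compare the two dates",
        toy_decompose, toy_should_delegate, toy_subagent, toy_solve, toy_synthesize,
    )
    print(answer)
    for step in trajectory.steps:
        print(step)
```

The point of the sketch is the shape of the logged trajectory: each step records which subtask was considered and whether it was delegated, which is the kind of decision the paper says the fine-tuned model learns to make on its own.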

Results
SearchSwarm-30B-A3B scores 68.1 on the BrowseComp benchmark and 73.3 on BrowseComp-ZH, outperforming all other models of comparable scale. The results suggest that trained delegation and task management improve the model's handling of long-horizon tasks. The authors also compare SearchSwarm against existing models, where it shows the stronger performance on deep research tasks.

Limitations
The authors acknowledge that their work is a preliminary exploration and that the harness and model are specifically tailored for deep research tasks, which may limit generalizability to other domains. Additionally, the reliance on supervised fine-tuning may introduce biases based on the quality of the generated training data. The paper does not address potential scalability issues or the computational costs associated with training such large models, which could be a concern for broader applications.

Why it matters
SearchSwarm advances the ability of LLMs to manage complex tasks through delegation intelligence, which matters for real-world applications that require sustained interaction and context management. By releasing the harness, model weights, and training data, the authors aim to support further research in this area, which could lead to more robust and capable agentic LLMs. The paper, posted to arXiv under cs.AI, lays groundwork for future work on task delegation in LLMs.

Turing Wire

By Turing Wire editorial staff · Jun 8, 2026

Source: arXiv cs.AI