Major reasoning

Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

Taebong Kim, Youngsik Hong, Minsik Kim, Sunyoung Choi, Jaewon Jang, Junghoon Shin

Published: May 14, 2026 — 05:09 UTC

Problem
This paper addresses the gap in the capability of large language models (LLMs) to enhance reasoning performance without the need for additional training. The authors propose a novel framework, Darwin Family, which allows for the evolutionary merging of existing model checkpoints through gradient-free weight-space recombination. This work is presented as a preprint and has not yet undergone peer review.

Method
The core technical contribution of the Darwin Family framework consists of three innovative components:

Adaptive Merge Genome: A 14-dimensional genome that facilitates fine-grained recombination at both the component and block levels of the model architecture. This allows for targeted merging of specific capabilities encoded in the model checkpoints.
MRI-Trust Fusion: This mechanism adaptively balances the importance of diagnostic layer signals with an evolutionary search process, utilizing a learnable trust parameter to guide the merging process effectively.
Architecture Mapper: This component enables cross-architecture breeding, allowing for the combination of heterogeneous model families, such as Transformer and Mamba architectures.

The Darwin Family framework supports recursive multi-generation evolution, enabling the creation of new models by merging existing ones without any gradient-based training.

Results
The flagship model, Darwin-27B-Opus, achieves an impressive score of 86.9% on the GPQA Diamond benchmark, ranking #6 among 1,252 evaluated models. Notably, it outperforms its fully trained foundation model, demonstrating the effectiveness of the training-free evolutionary merging approach. Across various scales, from 4 billion to 35 billion parameters, Darwin models consistently show performance improvements over their parent models, indicating the robustness of the evolutionary merging strategy.

Limitations
The authors acknowledge that the framework’s reliance on existing checkpoints may limit the diversity of capabilities that can be merged, as it is constrained by the latent knowledge encoded in those models. Additionally, the performance gains may vary depending on the specific architectures being merged, and the framework’s effectiveness in real-world applications remains to be fully validated. The paper does not address potential computational costs associated with the evolutionary merging process, nor does it explore the implications of model interpretability in the context of the merged architectures.

Why it matters
The Darwin Family framework presents a significant advancement in the field of LLMs by providing a practical and reproducible method for enhancing reasoning capabilities without incurring the costs associated with traditional post-training pipelines. This approach could lead to more efficient model development cycles and enable researchers to leverage existing models more effectively. The implications of this work extend to various applications in natural language processing, where improved reasoning performance is critical.

Authors: Taebong Kim, Youngsik Hong, Minsik Kim, Sunyoung Choi, Jaewon Jang, Junghoon Shin, Minseo Kim
Source: arXiv:2605.14386
URL: https://arxiv.org/abs/2605.14386v1

By Callan Zhang · May 14, 2026 · Editorial standards →

Summarised from the primary source with AI assistance under human editorial oversight. Turing Wire is not a primary source — read the original for the authoritative account.

Source: arXiv cs.NE