Notable training methods

Activation-Based Active Learning for In-Context Learning: Challenges and Insights

Yaseen M. Osman, Geoff V. Merrett, Stuart E. Middleton

Published
Jun 3, 2026 — 17:39 UTC

Problem
This work addresses the gap in the literature regarding the application of transformer activation signals for optimizing in-context sample selection in deep active learning. Previous studies have explored active learning for large language models (LLMs), but none have leveraged the recent advancements in understanding transformer activations. The authors present a comprehensive analysis of this approach, which remains unreviewed, highlighting the need for empirical validation of activation-based methods in the context of in-context learning.

Method
The authors propose a framework that utilizes MLP (Multi-Layer Perceptron) activation signals to inform the selection of in-context examples. They conduct experiments using two base models: Llama-3.2-3B and Qwen2.5-3B, applying various attention masking strategies to assess their impact on active learning performance across diverse classification and generative datasets. The study evaluates the correlation between model activations—specifically massive activations and the first four statistical moments—and the quality of selected examples. The training compute details are not disclosed, but the methodology emphasizes the analysis of activation patterns as a potential signal for improving sample selection.

Results
The findings reveal a significant limitation of the proposed method: the correlation between MLP outputs and example quality or task performance is weak. The absolute Spearman correlation coefficient does not exceed 0.33 across all tasks and models tested, indicating that activation-based sampling is ineffective for in-context learning. This negative result suggests that the anticipated benefits of using activation signals for active learning do not materialize, challenging the initial hypothesis.

Limitations
The authors acknowledge the primary limitation of their approach, which is the lack of correlation between activation signals and task performance. They hypothesize that this may stem from the phenomenon of superposition, where models encode more features than their dimensionality allows. This suggests that the complexity of the learned representations may obscure the utility of activation signals for sample selection. Additionally, the study does not explore alternative architectures or more sophisticated sampling techniques that could potentially yield better results.

Why it matters
The implications of this research are significant for the field of active learning and in-context learning. The findings indicate that relying on activation-based methods for sample selection may lead to suboptimal performance, prompting researchers to reconsider the use of such signals in their models. The authors suggest that future work could explore Sparse Autoencoders (SAEs) as a promising direction to address the limitations identified in their study. This paper contributes to the ongoing discourse on effective strategies for active learning in LLMs, as published in arXiv cs.LG.

Turing Wire

By Turing Wire editorial staff · Jun 3, 2026 · Editorial standards →

Source: arXiv cs.LG