Signal-to-Noise Ratio and Sample Size Govern Representational Alignment in Neural Networks
Ali Hussaini Umar, Alessandro Laio
- Published
- May 26, 2026 — 12:58 UTC
Problem
This preprint addresses the gap in understanding how signal-to-noise ratio (SNR) and sample size affect representational alignment in neural networks. While previous studies have noted that neural networks develop aligned representations across different architectures and training conditions, the specific roles of SNR and sample size in this alignment process remain underexplored. The authors aim to clarify these relationships in both controlled and real-world settings.
Method
The authors conduct experiments using an ensemble of neural networks trained on both regression and classification tasks. They manipulate the training datasets by introducing independent noise processes to vary the SNR. The study includes both a simple linear network with a single hidden layer, where alignment can be analytically estimated, and more complex nonlinear networks. The alignment of representations is quantitatively assessed across different SNR levels and sample sizes. The authors employ a systematic approach to analyze how these factors influence alignment, revealing a monotonic relationship with SNR and a non-monotonic relationship with sample size.
Results
The findings indicate that representational alignment increases monotonically with SNR across both linear and nonlinear networks, as well as across synthetic and real-world datasets. Conversely, alignment exhibits a non-monotonic relationship with training sample size, showing a minimum near the interpolation threshold. Notably, the authors report that stronger alignment does not correlate with improved generalization error, suggesting that alignment and generalization performance are decoupled. These results are consistent across various tasks and architectures, providing robust evidence for the proposed relationships.
Limitations
The authors acknowledge that their study is limited by the controlled nature of the experiments, which may not fully capture the complexities of real-world scenarios. Additionally, the focus on specific tasks (regression and classification) may restrict the generalizability of the findings to other types of tasks or architectures. The analytical estimation of alignment in the linear case may not extend to more complex networks, potentially limiting the applicability of the insights gained.
Why it matters
This work has significant implications for the design and training of neural networks, particularly in understanding how data quality (SNR) and quantity (sample size) influence representational alignment. The decoupling of alignment from generalization performance challenges existing assumptions about the relationship between these factors, suggesting that improving alignment may not necessarily lead to better model performance. This insight could inform future research on network architectures and training protocols, guiding engineers and researchers in optimizing neural network training strategies.
Authors: Ali Hussaini Umar, Alessandro Laio
Source: arXiv:2605.26973
URL: https://arxiv.org/abs/2605.26973v1
By Turing Wire editorial staff · May 26, 2026 · Editorial standards →
Source: arXiv cs.NE