QLAM: A Quantum Long-Attention Memory Approach to Long-Sequence Token Modeling
Hoang-Quan Nguyen, Sankalp Pandey, Khoa Luu
- Published
- May 13, 2026 — 17:56 UTC
Problem
Modeling long-range dependencies in sequential data is a persistent challenge in machine learning, particularly due to the limitations of existing architectures like Transformers, which exhibit quadratic complexity with respect to sequence length. This paper addresses the gap in capability by introducing Quantum Long-Attention Memory (QLAM), a novel approach that leverages quantum mechanics to enhance state-space models (SSMs) for long-sequence token modeling. This work is presented as a preprint and has not yet undergone peer review.
Method
QLAM is a hybrid quantum-classical memory mechanism that extends traditional state-space models by representing the hidden state as a quantum state. This quantum state encodes a superposition of historical information, allowing for a richer memory representation. The evolution of the state is governed by parameterized quantum circuits that are conditioned on the input, facilitating a globally coherent update mechanism. Unlike conventional attention mechanisms that compute pairwise interactions, QLAM captures global dependencies implicitly through the evolution of the quantum state. The architecture maintains the recurrent and linear-time structure of SSMs while enhancing memory capabilities through quantum superposition. The authors do not disclose specific details regarding the training compute or the exact architecture of the quantum circuits used.
Results
QLAM was evaluated on sequential variants of standard image classification benchmarks: sMNIST, sFashion-MNIST, and sCIFAR-10, where images are flattened into token sequences. The results indicate that QLAM consistently outperforms both recurrent baselines and transformer-based models across all tasks. While specific numerical results are not provided in the abstract, the authors claim significant improvements, suggesting a substantial effect size in the context of long-sequence modeling.
Limitations
The authors acknowledge that the implementation of QLAM relies on quantum computing resources, which may not be widely accessible, potentially limiting practical applications. Additionally, the paper does not address the scalability of QLAM in terms of the number of tokens or the depth of quantum circuits, which could impact performance in extremely long sequences. The reliance on quantum mechanics also raises questions about the interpretability of the model compared to classical approaches.
Why it matters
The introduction of QLAM represents a significant step towards integrating quantum computing principles into machine learning, particularly for tasks requiring the modeling of long-range dependencies. By enhancing state-space models with quantum superposition, this work opens avenues for more efficient and powerful sequence modeling techniques. The implications extend to various domains, including natural language processing and time-series analysis, where capturing long-term dependencies is crucial. Future research could explore the practical implementation of QLAM on quantum hardware and its potential to outperform classical models in real-world applications.
Authors: Hoang-Quan Nguyen, Sankalp Pandey, Khoa Luu
Source: arXiv:2605.13833
URL: https://arxiv.org/abs/2605.13833v1
By Turing Wire editorial staff · May 13, 2026 · Editorial standards →
Source: arXiv cs.LG