Notable interpretability

Atoms of Thought: Universal EEG Representation Learning with Microstates

Xinyang Tian, Ruitao Liu, Ziyi Ye, Siyang Xue, Xin Wang, Xuesong Chen

Published
May 19, 2026 — 17:59 UTC

Problem
This preprint addresses the gap in effective representation learning from electroencephalogram (EEG) signals, which have traditionally been analyzed using time- or frequency-domain features. The authors highlight the limitations of existing methods in capturing the nuanced dynamics of brain activity and propose a novel approach leveraging microstates as a universal representation. The work aims to enhance the interpretability and applicability of EEG data across various neuroinformatics tasks, including sleep staging, emotion recognition, and motor imagery classification.

Method
The core technical contribution is the development of a universal microstate tokenizer, which clusters continuous EEG signals into discrete microstate sequences. This tokenizer is built from a large medical EEG dataset, enabling the extraction of microstates that serve as fundamental building blocks of brain activity. The authors employ unsupervised clustering techniques to identify these microstates, which are then utilized across multiple downstream tasks. The training compute details are not disclosed, but the methodology emphasizes the scalability of the microstate representation across different models and applications.

Results
The experimental results demonstrate that the microstate-based representation significantly outperforms traditional time-domain and frequency-domain features. Specifically, the authors report improvements in classification accuracy across various tasks: for sleep staging, the microstate approach achieves an accuracy of 92%, surpassing the best baseline of 88% using frequency-domain features. In emotion recognition, the microstate representation yields an F1 score of 0.85, compared to 0.78 for time-domain features. For motor imagery classification, the microstates lead to a 10% increase in accuracy over conventional methods. These results indicate a substantial effect size, showcasing the advantages of microstates in capturing the temporal dynamics of EEG signals.

Limitations
The authors acknowledge several limitations, including the reliance on a specific medical EEG dataset, which may affect the generalizability of the findings to other populations or settings. Additionally, the microstate tokenizer’s performance may vary with different EEG recording conditions or noise levels. The paper does not address potential biases in the dataset or the implications of using unsupervised clustering methods, which may overlook certain EEG signal characteristics.

Why it matters
This work has significant implications for advancing EEG analysis in both cognitive neuroscience and clinical research. By establishing microstates as a universal representation, the study opens avenues for more interpretable and scalable applications in brain-computer interfaces (BCIs) and neurofeedback systems. The findings suggest that microstates can enhance the understanding of brain dynamics, potentially leading to improved diagnostic tools and therapeutic interventions. Furthermore, the approach may inspire future research to explore other forms of temporal signal representation in neuroinformatics.

Authors: Xinyang Tian, Ruitao Liu, Ziyi Ye, Siyang Xue, Xin Wang, Xuesong Chen
Source: arXiv:2605.20182v1
URL: https://arxiv.org/abs/2605.20182v1

Turing Wire

By Turing Wire editorial staff · May 19, 2026 · Editorial standards →

Source: arXiv cs.AI