Boosting Brain-to-Image Decoding with TRIBE v2 Data Augmentation

Yohann Benchetrit, Marlène Careil, Simon Dahan, Hubert Banville, Stéphane d'Ascoli, Jean-Rémi King

Published: Jun 4, 2026 — 16:18 UTC

Problem
The paper addresses the scarcity of labeled neural data in brain decoding, which limits how well decoding models perform in low-data regimes. The authors test whether small fMRI datasets can be augmented with synthetic data generated by TRIBE v2, a pretrained model that predicts fMRI responses to a range of stimuli. The work is a preprint and has not yet been peer reviewed.

Method
The authors use TRIBE v2, which is pretrained on more than 1,000 hours of fMRI data recorded in response to video, audio, and language stimuli. They run experiments on two datasets: the 7T fMRI Natural Scenes Dataset and the 3T fMRI BOLD5000. The core contribution is a systematic sweep over the amount of synthetic data used to train image decoders, with decoding performance evaluated at each setting. The architecture of the image decoders is not explicitly detailed, but the study stresses that the proportion of augmented data has to be tuned for each dataset. Training compute is not disclosed, although the size of TRIBE v2 suggests that the computational cost is substantial.
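To make the idea of varying the proportion of synthetic data concrete, here is a minimal sketch of one way to build such a mixed training set. It is not the authors' pipeline. The function name `mix_training_set`, the `synthetic_ratio` parameter, and the definition of the proportion as "synthetic samples divided by total samples" are all assumptions made for illustration, and samples are represented by placeholder ID pairs rather than real fMRI arrays.

```python
import random
from typing import List, Sequence, Tuple

# Placeholder sample type: (fmri_id, image_id). Real data would hold arrays.
Sample = Tuple[str, str]


def mix_training_set(
    real: Sequence[Sample],
    synthetic: Sequence[Sample],
    synthetic_ratio: float,
    seed: int = 0,
) -> List[Sample]:
    """Keep every real sample and add synthetic ones until they make up
    roughly `synthetic_ratio` of the mixed set (hypothetical definition)."""
    if not 0.0 <= synthetic_ratio < 1.0:
        raise ValueError("synthetic_ratio must be in [0, 1)")

    # Solve n_syn / (n_real + n_syn) = ratio for n_syn.
    n_syn = round(synthetic_ratio * len(real) / (1.0 - synthetic_ratio))
    # Never request more synthetic samples than are available.
    n_syn = min(n_syn, len(synthetic))

    rng = random.Random(seed)
    chosen = rng.sample(list(synthetic), n_syn)  # sample without replacement

    mixed = list(real) + chosen
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    real = [(f"real_{i}", f"img_{i}") for i in range(100)]
    synthetic = [(f"syn_{i}", f"img_{i}") for i in range(1000)]
    for ratio in (0.0, 0.25, 0.5, 0.75):
        mixed = mix_training_set(real, synthetic, ratio)
        print(ratio, len(mixed))
```

A sweep of this kind would train one decoder per ratio and compare retrieval accuracy across ratios. The synthetic-only case corresponds to training on the synthetic set directly, without any real samples.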

Results
The authors report an improvement of up to 68% in Top-10 image-retrieval accuracy over decoders trained solely on real data. Image decoders trained exclusively on synthetic fMRI data also perform above chance in some settings, which suggests that zero-shot brain-to-image decoding may be feasible. Baseline accuracies for the decoders trained only on real data are not provided, so the size of the gain in absolute terms is hard to judge.
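For readers unfamiliar with the metric, the sketch below shows one common way to compute Top-k retrieval accuracy and the corresponding chance level. It assumes that the decoder outputs an embedding per fMRI trial and that candidates are ranked by cosine similarity. Both are assumptions for illustration, since the decoder and similarity measure used in the paper are not described here, and the function names are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_accuracy(
    predicted: Sequence[Vector], candidates: Sequence[Vector], k: int = 10
) -> float:
    """Fraction of trials whose true image is among the k most similar candidates.

    predicted[i] is the embedding decoded from trial i, and candidates[i] is
    the embedding of the image actually shown on trial i.
    """
    hits = 0
    for i, pred in enumerate(predicted):
        sims = [cosine(pred, cand) for cand in candidates]
        # Rank of the true image = number of candidates strictly more similar.
        rank = sum(1 for s in sims if s > sims[i])
        if rank < k:
            hits += 1
    return hits / len(predicted)


def chance_level(n_candidates: int, k: int = 10) -> float:
    """Expected Top-k accuracy of a random ranking over n_candidates images."""
    return min(k, n_candidates) / n_candidates


if __name__ == "__main__":
    candidates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    predicted = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.0]]  # last trial is decoded wrongly
    print(top_k_accuracy(predicted, candidates, k=1))
    print(chance_level(n_candidates=1000, k=10))
```

A decoder performs "above chance" in this sense when its Top-k accuracy exceeds `chance_level` for the same candidate set size, which is why the size of the retrieval set matters when comparing reported numbers.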

Limitations
The authors acknowledge that the benefit of synthetic data augmentation depends on the source of the real data, so the proportion of augmented data has to be adjusted carefully. They do not discuss the risk of overfitting to synthetic data, or whether the findings generalize across brain regions and stimulus types. Because the work has not been peer reviewed, the results should be treated as preliminary.

Why it matters
This research could advance brain decoding methods, particularly where labeled data is scarce. By showing that large-scale models such as TRIBE v2 can improve data efficiency, the findings point to new ways of improving brain-computer interfaces and neuroimaging analysis. The possibility of zero-shot decoding also invites further work on what synthetic data can do in neuroscience applications. The preprint is available on arXiv.

By Callan Zhang · Jun 4, 2026

Summarised from the primary source with AI assistance under human editorial oversight. Turing Wire is not a primary source — read the original for the authoritative account.

Source: arXiv cs.LG