A Multi-dimensional Framework for Evaluating Generalization in EEG Foundation Models
Aditya Kommineni, Emily Zhou, Kleanthis Avramidis, Tiantian Feng, Shrikanth Narayanan
Why It Matters
What makes this one worth your time
Understanding EEG model performance in low-resource settings is crucial for practical applications in neurotechnology and clinical fields.
A framework to evaluate EEG models under realistic constraints reveals strengths and weaknesses in foundation models.
Summary
The paper proposes a multi-dimensional framework to evaluate EEG foundation models under realistic low-resource conditions, contrasting their performance with supervised models across various tasks and datasets.
Key contributions
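- Proposes a multi-dimensional framework for evaluating EEG models under realistic low-resource conditions: limited labeled data, reduced sensor coverage, and parameter-efficient adaptation.
- Provides an empirical analysis of supervised EEG models and recent EEG foundation models (LaBraM, CSBrain, CBraMod) across six datasets under that framework.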
Notable insights
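- EEG foundation models consistently improve performance on long-context tasks such as sleep stage prediction and mental health state classification, but show limited robustness on short-window tasks and in channel-constrained settings.
- On short-window, brain-computer interface (BCI) style tasks, supervised models achieve comparable performance despite having substantially fewer parameters.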
Possible limitations
- Not stated in the abstract
Abstract
arXiv:2605.28563v1 (cross-listed)
Evaluating foundation models under appropriate adaptation settings is essential for understanding the quality and transferability of the learned representations. Recent EEG foundation models have demonstrated promising transfer capabilities across tasks and datasets, motivating their growing use in neurotechnology and clinical applications. However, these models are typically evaluated under full fine-tuning on well-curated downstream datasets, a setting that does not reflect biomedical domain constraints such as limited labeled data, reduced sensor coverage, or parameter-efficient adaptation. In this work, we propose a multi-dimensional evaluation framework for assessing EEG models under realistic low-resource conditions. Empirical analysis of both supervised EEG models and recent EEG foundation models, including LaBraM, CSBrain, and CBraMod, across 6 different datasets is performed under the proposed multi-dimensional evaluation framework. We find that EEG foundation models consistently provide performance gains on long-context tasks such as sleep stage prediction and mental health state classification. In contrast, for short-window brain-computer interface (BCI) style tasks, supervised models achieve comparable performance despite having substantially fewer parameters. Additional analyses demonstrate that current foundation models provide limited robustness to short-window tasks and channel-constrained settings. Together, these findings motivate the use of multi-dimensional evaluation protocols that characterize model behavior under realistic use constraints.
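To make the idea of a multi-dimensional evaluation more concrete, the sketch below enumerates an evaluation grid over the three constraints the abstract names: labeled-data budget, sensor coverage, and adaptation strategy. It is only an illustration, not the authors' protocol. The class and function names, the specific label fractions and channel counts, and the adaptation labels (`linear_probe`, `lora`, `full_finetune`) are all assumptions, since the abstract does not state which values or methods were used.

```python
from dataclasses import dataclass
from itertools import product
import random


@dataclass(frozen=True)
class EvalCondition:
    """One cell of the evaluation grid (all field values are hypothetical)."""
    label_fraction: float  # share of labeled training trials kept
    n_channels: int        # number of EEG channels made available
    adaptation: str        # how the model is adapted to the downstream task


def build_grid(label_fractions, channel_counts, adaptations):
    """Cross every value of each axis to get the full set of conditions."""
    return [
        EvalCondition(f, c, a)
        for f, c, a in product(label_fractions, channel_counts, adaptations)
    ]


def subsample_indices(n_trials, fraction, seed=0):
    """Pick a reproducible random subset of trial indices (limited labels)."""
    rng = random.Random(seed)
    k = max(1, round(n_trials * fraction))
    return sorted(rng.sample(range(n_trials), k))


def select_channels(all_channels, n_channels):
    """Keep the first n_channels entries (reduced sensor coverage).

    A real study would likely choose montage-aware subsets; taking a
    prefix is a simplification made for this sketch.
    """
    return list(all_channels[:n_channels])


# Hypothetical axis values, chosen only for illustration.
grid = build_grid(
    label_fractions=[0.01, 0.1, 1.0],
    channel_counts=[4, 8, 19],
    adaptations=["linear_probe", "lora", "full_finetune"],
)
```

Each `EvalCondition` would then drive one training and evaluation run per model and dataset, so that supervised baselines and foundation models are compared cell by cell rather than only under full fine-tuning on complete data.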