FETS Benchmark: Foundation Models Outperform Dataset-specific Machine Learning in Energy Time Series Forecasting

Marco Obermeier, Marco Pruckner, Florian Haselbeck, Andreas Zeiselmair

Published Apr 27, 2026

Why It Matters

This research highlights the potential of foundation models to provide scalable and generalizable solutions for energy forecasting, which is crucial for efficient energy system planning and operation.

Foundation models outperform dataset-specific machine learning across all forecasting settings and data categories in the benchmark.

Summary

The paper introduces the FETS benchmark to evaluate foundation models in energy time series forecasting. Across 54 datasets in 9 data categories and several forecasting settings, foundation models consistently outperform dataset-specific classical machine learning models, with covariate-informed foundation models performing best.

Key contributions

Introduction of the FETS benchmark for energy time series forecasting, with 54 datasets across 9 data categories.
Demonstration that foundation models outperform dataset-specific classical machine learning models in all settings and data categories.
Analysis of performance factors such as spectral entropy, context length, and aggregation level.

Notable insights

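Foundation models outperform the dataset-specific baselines even though only the baselines were trained on the full historic target data.
Performance correlates strongly with spectral entropy, saturates beyond a certain context length, and improves at higher aggregation levels such as national load, district heating, and power grid data.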

Possible limitations

Not stated in the abstract

Abstract

arXiv:2604.22328v1

Driven by the transition towards a climate-neutral energy system, accurate energy time series forecasting is critical for planning and operation. Yet, it remains largely a dataset-specific task, requiring comprehensive training data, limiting scalability, and resulting in high model development and maintenance effort. Recently, foundation models that aim to learn generalizable patterns via extensive pretraining have shown superior performance in multiple prediction tasks. Despite their success and strong potential to address challenges in energy forecasting, their application in this domain remains largely unexplored. We address this gap by presenting the Foundation Models in Energy Time Series Forecasting (FETS) benchmark. We (1) provide a structured overview of energy forecasting use cases along three main dimensions: stakeholders, attributes, and data categories; (2) collect and analyze 54 datasets across 9 data categories, guided by typical stakeholder interests; (3) benchmark foundation models against classical machine learning approaches across different forecasting settings. Foundation models consistently outperform dataset-specific optimized machine learning approaches across all settings and data categories, despite the latter having seen the full historic target data during training. In particular, covariate-informed foundation models achieve the strongest performance. Further analysis reveals a strong correlation between predictive performance and spectral entropy, performance saturation beyond a certain context length, and improved performance at higher aggregation levels such as national load, district heating, and power grid data. Overall, our findings highlight the strong potential of foundation models as scalable and generalizable forecasting solutions for the energy domain, particularly in data-constrained and privacy-sensitive settings.
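
The abstract names spectral entropy as a factor related to predictive performance but does not define it. The sketch below uses one common definition, the Shannon entropy of the normalized power spectrum, and is an assumption rather than the authors' exact formulation. The function name `spectral_entropy`, the synthetic daily pattern, and the noise levels are all hypothetical and chosen only for illustration; none of the numbers come from the FETS datasets.

```python
import numpy as np


def spectral_entropy(x, normalize=True):
    """Shannon entropy of the normalized power spectrum of a 1-D series.

    Assumed definition (not taken from the paper): values near 0 mean the
    power sits in a few frequencies, values near 1 mean a flat spectrum.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                          # remove the constant offset
    power = np.abs(np.fft.rfft(x)) ** 2       # unscaled periodogram
    power = power[1:]                         # drop the zero-frequency bin
    total = power.sum()
    if len(power) < 2 or total == 0:
        return 0.0                            # too short or constant series
    p = power / total                         # spectrum as a probability mass
    nonzero = p[p > 0]
    entropy = -np.sum(nonzero * np.log2(nonzero))
    if normalize:
        entropy /= np.log2(len(power))        # scale to the range [0, 1]
    return float(entropy)


# Synthetic example: 28 days of hourly values with a daily cycle.
rng = np.random.default_rng(0)
t = np.arange(24 * 28)
daily = np.sin(2 * np.pi * t / 24)
white_noise = rng.standard_normal(t.size)

# 100 hypothetical households: shared daily pattern plus independent noise.
households = daily + 2.0 * rng.standard_normal((100, t.size))
single = households[0]
aggregate = households.mean(axis=0)           # noise averages out

print("pure daily cycle :", spectral_entropy(daily))
print("white noise      :", spectral_entropy(white_noise))
print("single household :", spectral_entropy(single))
print("aggregate of 100 :", spectral_entropy(aggregate))
```

In this toy setup the aggregated series has a lower spectral entropy than a single household, because independent noise averages out while the shared daily cycle remains. That is one plausible reading of why the aggregation level and spectral entropy findings might be related, not a result reported in the abstract.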