
Agentic Physical AI toward a Domain-Specific Foundation Model for Nuclear Reactor Control

Yoon Pyo Lee, Samrendra Roy, Jay Yoo, Kazuma Kobayashi, Sajedul Talukder, Seid Koric, Souvik Chakraborty, Syed Bahauddin Alam

Published May 22, 2026
Editorial review: 6.8
Relevance: 0.469
Freshness: 0.000

Why It Matters

What makes this one worth your time

This work is relevant for AI engineers and researchers interested in developing reliable AI systems for safety-critical applications like nuclear reactor control.

The paper introduces a compact language model for nuclear reactor control whose policy is shaped by physics-based validation of executed actions rather than by perceptual inference.

Summary

The paper proposes a route toward domain-specific foundation models for nuclear reactor control: compact language models whose policy optimization is driven by physics-based validation rather than perceptual inference. A 360-million-parameter model is trained on synthetic control scenarios, with the dataset scaled from 10^3 to 10^5 examples. Closed-loop reliability under nominal simulated conditions improves with scale, and policy distillation emerges without reinforcement learning.

Key contributions

  • Introduction of a compact (360-million-parameter) language model for nuclear reactor control.
  • Demonstration of improved closed-loop reliability through physics-based validation.
  • Emergent policy distillation without reinforcement learning.

Notable insights

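  • Despite balanced exposure to four actuation families, the model autonomously rejects approximately 70% of the training distribution and concentrates 95% of runtime execution on a single-bank strategy, without explicit reward engineering.
  • At larger training-set scale, the model undergoes variance collapse (an approximately 500-fold reduction), which stabilizes execution-level behavior within the sampled distribution.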

Possible limitations

  • Not stated in the abstract

Abstract

arXiv:2512.23292v3

The prevailing paradigm in AI for physical systems (scaling general-purpose foundation models toward universal multimodal reasoning) confronts a fundamental barrier at the control interface. Recent benchmarks show that even frontier vision-language models achieve only 50-53% accuracy on basic quantitative physics tasks, behaving as approximate guessers that preserve semantic plausibility by violating physical constraints. This input unfaithfulness is not a scaling deficiency but a structural limitation: perception-centric architectures optimize parameter-space imitation, whereas safety-critical control demands outcome-space guarantees over executed actions. Here, we present a fundamentally different pathway "toward" domain-specific foundation models by introducing compact language models operating as Agentic Physical AI, in which policy optimization is driven by physics-based validation rather than perceptual inference. We train a 360-million-parameter model on synthetic nuclear reactor control scenarios, scaling the dataset from 10^3 to 10^5 examples. Scaling induces strong improvements in closed-loop reliability under nominal simulated conditions, with a steep but smooth gain at strict tolerances: small-scale systems exhibit high-variance imitation with severe tail excursions, while large-scale models undergo variance collapse (approximately 500 times reduction), stabilizing execution-level behavior within the sampled distribution. Despite balanced exposure to four actuation families, the model autonomously rejects approximately 70% of the training distribution, concentrating 95% of runtime execution on a single-bank strategy. This emergent policy distillation arises without reinforcement learning or reward engineering, driven solely by outcome-level success under physical execution.
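
The abstract quotes two summary statistics: a variance reduction factor and the share of runtime execution per actuation family. A minimal sketch of how such figures could be computed from logged runs follows. It is not the authors' evaluation code. The function names, the family labels, and every number are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import pvariance


def variance_reduction(errors_small, errors_large):
    """Ratio of population variances: small-data run over large-data run."""
    return pvariance(errors_small) / pvariance(errors_large)


def strategy_shares(executed_families):
    """Fraction of executed actions that fall in each actuation family."""
    counts = Counter(executed_families)
    total = sum(counts.values())
    return {family: n / total for family, n in counts.items()}


# Made-up tracking errors, NOT data from the paper.
errors_small = [-2.0, 2.0, -2.0, 2.0]  # wide spread
errors_large = [-0.1, 0.1, -0.1, 0.1]  # tight spread
ratio = variance_reduction(errors_small, errors_large)

# Made-up execution log with hypothetical family labels.
executed = ["single_bank"] * 19 + ["other_family"] * 1
shares = strategy_shares(executed)

print(f"variance reduction factor: {ratio:.1f}")
print(f"share on single_bank: {shares['single_bank']:.2f}")
```

With real logs, the same two functions would be applied to closed-loop tracking errors and to the actuation family chosen at each step. How the paper defines its error metric is not stated in the abstract.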