Evaluating Explainability in Safety-Critical ATR Systems: Limitations of Post-Hoc Methods and Paths Toward Robust XAI
Vanessa Buhrmester, David Muench, Dimitri Bulatov, Michael Arens
Why It Matters
Explanations that look convincing but are unreliable pose a risk in safety-critical AI, particularly in defense and security applications.
This paper critiques current Explainable AI (XAI) methods for Automatic Target Recognition (ATR) systems and proposes a structured evaluation framework.
Summary
The paper evaluates explainability methods in safety-critical Automatic Target Recognition (ATR) systems, identifies limitations of post-hoc approaches, and assesses the methods along four dimensions: interpretability, robustness, vulnerability to manipulation, and suitability for validation and verification.
Key contributions
- Introduces a taxonomy for evaluating explainability methods in ATR systems.
- Identifies systematic limitations of current post-hoc explanation methods.
- Proposes directions for developing more robust and causally grounded explainability methods.
Notable insights
- The paper formalizes explainability as an assurance-oriented assessment problem, shifting the focus from visually plausible explanations to system-level assurance.
- It derives critical failure modes of existing XAI techniques: spurious explanations, instability under perturbations, and overtrust induced by visually convincing outputs.
Possible limitations
- Not stated in the abstract.
Abstract
arXiv:2605.05748v1
Explainable Artificial Intelligence (XAI) is increasingly recognized as essential for deploying machine learning systems in safety-critical environments. In Automatic Target Recognition (ATR), where models operate on image, video, radar, and multisensor data, high predictive performance alone is insufficient. Model decisions must also be interpretable, reliable, and suitable for validation. This paper presents a structured evaluation of explainability methods in the context of safety-critical ATR systems: We identify major XAI paradigms, including saliency-based, attention-based, and surrogate approaches, as well as recent detection-aware extensions. Based on this, we formalize explainability as an assurance-oriented assessment problem, introduce a taxonomy, and assess these methods with respect to four key dimensions: interpretability, robustness, vulnerability to manipulation, and suitability for validation and verification. The analysis identifies systematic limitations of current post-hoc explanation methods. In particular, we derive critical failure modes such as spurious explanations, instability under perturbations, and overtrust induced by visually convincing outputs. These findings indicate that widely used XAI techniques may be insufficient for safety-critical deployment. Finally, we discuss implications for ATR systems and outline directions toward more robust, causally grounded, and physically informed explainability methods. Our results emphasize the need to move beyond visually plausible explanations toward approaches that support reliable decision making and system-level assurance.
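To make the "instability under perturbations" failure mode concrete, here is a minimal sketch of one way such instability could be quantified. It is not the paper's method: the toy two-layer ReLU scorer, the gradient-based saliency, the top-k overlap metric, and all names (`gradient_saliency`, `topk_overlap`, `saliency_stability`, `sigma`, `k`) are assumptions chosen for illustration. The idea is to compare the most salient input features before and after adding small Gaussian noise to the input.

```python
import random

def gradient_saliency(x, W1, w2):
    """Absolute input gradient of the toy score w2 . relu(W1 x) (hypothetical model)."""
    # ReLU derivative: 1 where the hidden pre-activation is positive, else 0.
    active = [1.0 if sum(w * xi for w, xi in zip(row, x)) > 0 else 0.0 for row in W1]
    # d score / d x_j = sum_i w2[i] * active[i] * W1[i][j]
    return [abs(sum(w2[i] * active[i] * W1[i][j] for i in range(len(W1))))
            for j in range(len(x))]

def topk_overlap(a, b, k):
    """Fraction of the k highest-saliency indices shared by two saliency maps."""
    def top(s):
        return set(sorted(range(len(s)), key=s.__getitem__)[-k:])
    return len(top(a) & top(b)) / k

def saliency_stability(x, W1, w2, sigma=0.05, trials=100, k=5, seed=0):
    """Mean top-k overlap between the clean saliency map and maps of noisy inputs."""
    rng = random.Random(seed)
    base = gradient_saliency(x, W1, w2)
    total = 0.0
    for _ in range(trials):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        total += topk_overlap(base, gradient_saliency(noisy, W1, w2), k)
    return total / trials

# Toy setup with random weights; a real ATR model would replace this scorer.
rng = random.Random(42)
W1 = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(16)]
w2 = [rng.gauss(0, 1) for _ in range(16)]
x = [rng.gauss(0, 1) for _ in range(32)]
print(saliency_stability(x, W1, w2))
```

A value near 1 means the explanation highlights the same features under noise, and lower values indicate that small input changes reshuffle the explanation. A single overlap score like this covers only one facet of the robustness dimension and says nothing about whether the explanation is causally faithful.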