AI Integrity: A New Paradigm for Verifiable AI Governance

Seulki Lee

Published Apr 14, 2026 | Featured #4 | In the daily list Apr 15, 2026
Daily score: 71.3
Editorial review: 8.5
Relevance: 0.479
Freshness: 0.722

Why It Matters

What makes this one worth your time

This paper matters because it shifts the focus of AI governance from evaluating outcomes to verifying the reasoning process itself. It requires that the path from evidence to conclusion be transparent and auditable, which could make AI applications more trustworthy in high-stakes sectors such as healthcare, law, defense, and education.

AI Integrity offers a novel approach to AI governance by ensuring the verifiability of reasoning processes.

Summary

The paper introduces 'AI Integrity' as a new governance paradigm focusing on the verifiability of AI systems' reasoning processes, proposing a layered Authority Stack model and the PRISM framework for operationalizing this concept.

Key contributions

  • Introduction of AI Integrity as a new governance paradigm
  • Development of the Authority Stack model
  • Specification of the PRISM framework for measuring reasoning integrity

Notable insights

  • AI Integrity emphasizes the importance of protecting the reasoning process from corruption and bias, rather than just evaluating outcomes.

Possible limitations

  • The concept may face challenges in practical implementation due to the complexity of auditing AI systems' reasoning processes.

Abstract

arXiv:2604.11065v1

AI systems increasingly shape high-stakes decisions in healthcare, law, defense, and education, yet existing governance paradigms -- AI Ethics, AI Safety, and AI Alignment -- share a common limitation: they evaluate outcomes rather than verifying the reasoning process itself. This paper introduces AI Integrity, a concept defined as a state in which the Authority Stack of an AI system -- its layered hierarchy of values, epistemological standards, source preferences, and data selection criteria -- is protected from corruption, contamination, manipulation, and bias, and maintained in a verifiable manner. We distinguish AI Integrity from the three existing paradigms, define the Authority Stack as a 4-layer cascade model (Normative, Epistemic, Source, and Data Authority) grounded in established academic frameworks -- Schwartz Basic Human Values for normative authority, Walton argumentation schemes with GRADE/CEBM hierarchies for epistemic authority, and Source Credibility Theory for source authority -- characterize the distinction between legitimate cascading and Authority Pollution, and identify Integrity Hallucination as the central measurable threat to value consistency. We further specify the PRISM (Profile-based Reasoning Integrity Stack Measurement) framework as the operational methodology, defining six core metrics and a phased research roadmap. Unlike normative frameworks that prescribe which values are correct, AI Integrity is a procedural concept: it requires that the path from evidence to conclusion be transparent and auditable, regardless of which values a system holds.
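
To make the 4-layer cascade more concrete, here is a minimal Python sketch of how the Authority Stack might be represented as a data structure, with a simple procedural check on a recorded reasoning trace. Only the four layer names and their order come from the abstract. The `TraceStep` record, the `is_auditable_cascade` function, and the check itself are hypothetical illustrations; they are not the PRISM metrics, which the abstract does not spell out, and they do not attempt to define Authority Pollution.

```python
from dataclasses import dataclass
from enum import IntEnum


class AuthorityLayer(IntEnum):
    """The four layers named in the abstract, listed in cascade order."""
    NORMATIVE = 1
    EPISTEMIC = 2
    SOURCE = 3
    DATA = 4


@dataclass(frozen=True)
class TraceStep:
    """One recorded decision in a reasoning trace (hypothetical format)."""
    layer: AuthorityLayer
    rationale: str


def is_auditable_cascade(trace: list[TraceStep]) -> bool:
    """Illustrative check, not the paper's method.

    Returns True only if the trace records each layer exactly once,
    in top-down cascade order, and every step states a rationale.
    """
    layers = [step.layer for step in trace]
    has_all_layers_in_order = layers == list(AuthorityLayer)
    has_rationales = all(step.rationale.strip() for step in trace)
    return has_all_layers_in_order and has_rationales


# Example trace for a hypothetical clinical question.
example_trace = [
    TraceStep(AuthorityLayer.NORMATIVE, "prioritize patient safety"),
    TraceStep(AuthorityLayer.EPISTEMIC, "prefer systematic reviews over case reports"),
    TraceStep(AuthorityLayer.SOURCE, "restrict to peer-reviewed journals"),
    TraceStep(AuthorityLayer.DATA, "select trials matching the patient's age group"),
]
print(is_auditable_cascade(example_trace))
```

The check is deliberately procedural: it looks only at whether each layer's decision is recorded and justified, not at whether the values chosen at the normative layer are the right ones, which mirrors the abstract's distinction between procedural and normative frameworks.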