AI Integrity: A New Paradigm for Verifiable AI Governance
Seulki Lee
Feedback
Why It Matters
This paper shifts the focus from outcome-based evaluation of AI systems to a procedural approach that emphasizes transparency and auditability of the reasoning process itself, which could make AI applications in critical sectors such as healthcare and law more trustworthy.
Contributions
- Introduction of AI Integrity as a new governance paradigm
- Development of the Authority Stack model
- Specification of the PRISM framework for measuring reasoning integrity
Insights
- AI Integrity emphasizes the importance of protecting the reasoning process from corruption and bias, rather than just evaluating outcomes.
Limitations
- The concept may face challenges in practical implementation due to the complexity of auditing AI systems' reasoning processes.
Tags
- alignment
- interpretability
- security
Abstract
arXiv:2604.11065v1

AI systems increasingly shape high-stakes decisions in healthcare, law, defense, and education, yet existing governance paradigms -- AI Ethics, AI Safety, and AI Alignment -- share a common limitation: they evaluate outcomes rather than verifying the reasoning process itself. This paper introduces AI Integrity, defined as a state in which the Authority Stack of an AI system -- its layered hierarchy of values, epistemological standards, source preferences, and data selection criteria -- is protected from corruption, contamination, manipulation, and bias, and is maintained in a verifiable manner. We distinguish AI Integrity from the three existing paradigms and define the Authority Stack as a 4-layer cascade model (Normative, Epistemic, Source, and Data Authority) grounded in established academic frameworks: Schwartz Basic Human Values for normative authority, Walton argumentation schemes with GRADE/CEBM evidence hierarchies for epistemic authority, and Source Credibility Theory for source authority. We characterize the distinction between legitimate cascading and Authority Pollution, and identify Integrity Hallucination as the central measurable threat to value consistency. We further specify the PRISM (Profile-based Reasoning Integrity Stack Measurement) framework as the operational methodology, defining six core metrics and a phased research roadmap. Unlike normative frameworks that prescribe which values are correct, AI Integrity is a procedural concept: it requires that the path from evidence to conclusion be transparent and auditable, regardless of which values a system holds.
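The 4-layer cascade described in the abstract can be sketched as a simple data structure. This is an illustrative sketch only: the paper does not publish an implementation, and all class and field names here are assumptions, not part of the PRISM specification.

```python
from dataclasses import dataclass
from enum import Enum

class Layer(Enum):
    """Top-down order of the Authority Stack (per the abstract)."""
    NORMATIVE = 1  # values, e.g. Schwartz Basic Human Values
    EPISTEMIC = 2  # evidence standards, e.g. GRADE/CEBM hierarchies
    SOURCE = 3     # source preferences (Source Credibility Theory)
    DATA = 4       # data selection criteria

@dataclass
class AuthorityLayer:
    layer: Layer
    criteria: list  # hypothetical: the criteria active at this layer

class AuthorityStack:
    """Ordered cascade in which each layer constrains the one below it."""

    def __init__(self, layers):
        # Enforce the fixed top-down order regardless of input order.
        self.layers = sorted(layers, key=lambda a: a.layer.value)

    def cascade(self):
        """Yield layers top-down; legitimate cascading flows this way,
        whereas Authority Pollution would violate this ordering."""
        yield from self.layers
```

For example, a stack built from layers supplied out of order still cascades from Normative down to Data, mirroring the legitimate top-down flow the paper contrasts with Authority Pollution.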