Which Changes Matter? Towards Trustworthy Legal AI via Relevance-Sensitive Evaluation and Solver-Grounded Reasoning

Chen Linze, Cai Yufan, Hou Zhe, Dong Jin Song

Published May 27, 2026 | Featured #6 | In the daily list May 28, 2026


Daily score: 64.0

Editorial review: 7.0

Relevance: 0.458

Freshness: 0.722

Why It Matters

What makes this one worth your time

Understanding and improving the sensitivity of legal AI to relevant legal changes is crucial for developing trustworthy systems that can be relied upon in judicial contexts.

LexGuard aims to make legal AI more reliable by keeping outputs stable under legally irrelevant changes while still responding to legally material ones.

Summary

The paper introduces a legal-relevance-sensitive evaluation framework for legal AI, highlighting the need for legal models to be sensitive only to legally relevant changes. It presents LexGuard, a multi-agent framework using formal reasoning and SMT solvers to improve legal reasoning reliability by reducing sensitivity to irrelevant changes and enhancing consistency.
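The should-change / should-not-change idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names PerturbationCase, relevance_sensitivity, and toy_model are hypothetical, the three example cases are invented, and the two rates are simple stand-ins rather than the metrics defined in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class PerturbationCase:
    """One original/perturbed pair (hypothetical structure, not from the paper)."""
    original: str
    perturbed: str
    should_change: bool  # True if the perturbation alters a legally material point


def relevance_sensitivity(
    model: Callable[[str], str], cases: Iterable[PerturbationCase]
) -> dict[str, Optional[float]]:
    """Return how often the model changes when it should, and holds when it should."""
    change_hits = change_total = stable_hits = stable_total = 0
    for case in cases:
        changed = model(case.original) != model(case.perturbed)
        if case.should_change:
            change_total += 1
            change_hits += int(changed)
        else:
            stable_total += 1
            stable_hits += int(not changed)
    return {
        "sensitivity": change_hits / change_total if change_total else None,
        "stability": stable_hits / stable_total if stable_total else None,
    }


def toy_model(text: str) -> str:
    """Deliberately brittle keyword classifier standing in for a legal LLM."""
    return "liable" if "without consent" in text.lower() else "not liable"


BASE = "The defendant took the bicycle without consent."
cases = [
    # Irrelevant attribute added: the outcome should not change.
    PerturbationCase(BASE, "The defendant, a 40-year-old nurse, took the bicycle without consent.", False),
    # Material fact flipped: the outcome should change.
    PerturbationCase(BASE, "The defendant took the bicycle with the owner's consent.", True),
    # Benign rewording: the outcome should not change, but the toy model flips.
    PerturbationCase(BASE, "The defendant took the bicycle without permission.", False),
]

scores = relevance_sensitivity(toy_model, cases)
print(scores)
```

The toy model reacts correctly to the material change but also flips on the benign rewording, so it scores well on the should-change side and poorly on the should-not-change side. Reporting the two rates separately keeps that failure visible instead of folding it into a single accuracy number.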

Key contributions

Introduction of a legal-relevance-sensitive evaluation framework.
Development of LexGuard, an adversarial multi-agent framework grounded in formal reasoning.
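Experiments showing that LexGuard reduces vulnerability to manipulative framing, improves disambiguation among similar statutes, limits the influence of legally irrelevant attributes, and increases consistency under benign reformulations.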

Notable insights

The use of adversarial multi-agent frameworks combined with formal reasoning and SMT solvers to verify legal satisfaction and logical consistency is a novel approach.
The focus on distinguishing legally relevant from irrelevant changes addresses a critical gap in current legal AI systems.

Possible limitations

Not stated in the abstract

Abstract

arXiv:2605.26530v1

Legal reasoning requires distinguishing changes that matter from those that do not. Legal AI should remain stable under legally irrelevant perturbations, but should change when perturbations alter legally material points. We formulate this requirement as a legal-relevance-sensitive evaluation problem: LLMs should only be sensitive to the legally relevant change. We introduce a unified evaluation suite covering should-change and should-not-change evaluation across judicial fairness, robustness, and statute-confusion scenarios. Our evaluation shows that existing legal LLMs are systematically sensitive to legally irrelevant variations and often fail to distinguish related legal elements and statutory rules. To mitigate these failures, we present LexGuard, an adversarial multi-agent framework grounded in formal reasoning. LexGuard formalizes statutes into executable constraints, uses adversarial agents to extract competing fact-statute arguments, and invokes SMT solvers to verify legal satisfaction and logical consistency. Experiments show that LexGuard improves legal reasoning reliability by reducing vulnerability to manipulative framing, improving disambiguation among similar statutes, limiting the influence of legally irrelevant attributes, and increasing consistency under benign reformulations. We show that legal trustworthiness requires not only accuracy, but calibrated sensitivity to legally material changes.
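A rough sketch of what "formalizes statutes into executable constraints" and "verify legal satisfaction and logical consistency" could look like. It is not LexGuard's pipeline. The statute is a made-up three-element rule, the names statute, consistent_assignments, and entails are hypothetical, and exhaustive enumeration of truth assignments is swapped in for an SMT solver so that the example needs only the standard library.

```python
from itertools import product

# Propositions the hypothetical statute talks about.
VARIABLES = ("appropriation", "belongs_to_another", "dishonest", "theft")


def statute(a: dict) -> bool:
    """Invented rule: theft holds exactly when all three elements hold."""
    return a["theft"] == (
        a["appropriation"] and a["belongs_to_another"] and a["dishonest"]
    )


def consistent_assignments(facts: dict) -> list:
    """Every full truth assignment that agrees with the facts and the statute."""
    free = [v for v in VARIABLES if v not in facts]
    found = []
    for values in product((False, True), repeat=len(free)):
        assignment = {**facts, **dict(zip(free, values))}
        if statute(assignment):
            found.append(assignment)
    return found


def entails(facts: dict, conclusion: str, value: bool = True) -> bool:
    """True if the facts are consistent and force the conclusion to `value`."""
    found = consistent_assignments(facts)
    return bool(found) and all(m[conclusion] == value for m in found)


full = {"appropriation": True, "belongs_to_another": True, "dishonest": True}
partial = {"appropriation": True, "belongs_to_another": True}

# All elements established: the conclusion is forced.
all_elements = entails(full, "theft")
# Dishonesty unknown: neither theft nor its negation is forced.
undetermined = not entails(partial, "theft") and not entails(partial, "theft", False)
# An attribute the statute never mentions cannot change the result.
with_irrelevant = entails({**full, "defendant_is_foreign": True}, "theft")
# Claiming theft while denying dishonesty contradicts the statute.
contradiction = consistent_assignments({"theft": True, "dishonest": False}) == []

print(all_elements, undetermined, with_irrelevant, contradiction)
```

Because the rule only reads the propositions it names, an extra attribute such as the hypothetical defendant_is_foreign has no route to influence the outcome. That is one plausible reading of why solver grounding could limit the effect of legally irrelevant attributes. A real SMT solver would replace the enumeration once the constraints involve arithmetic or many variables.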