LCAM: A Framework for Diagnosing Interactional Alignment Failures in Conversational AI
Manuele Reani, Hongyu Tian
Why It Matters
What makes this one worth your time
Understanding and diagnosing interactional failures in conversational AI is crucial for improving user trust and safety, especially in sensitive applications like counseling.
LCAM offers a structured approach to identify and address interactional misalignments in conversational AI.
Summary
The paper introduces the Layered Cognitive Alignment Model (LCAM), a framework for diagnosing interactional alignment failures in conversational AI by defining alignment across five layers and identifying two diagnostic polarities of misalignment. It applies LCAM to a case study involving a language model used for counseling, highlighting potential harms in conversational AI interactions.
Key contributions
- Introduction of the Layered Cognitive Alignment Model (LCAM) for diagnosing interactional alignment failures.
- Application of LCAM to a published LLM counseling example, showing how an apparently supportive response can still cause harm.
Notable insights
- LCAM distinguishes between five layers of alignment: perceptual, semantic, affective, cognitive, and ethical.
- The framework introduces diagnostic polarities of misalignment: underfit and overreach.
Possible limitations
- Not stated in the abstract
Abstract
arXiv:2606.08131v1
Conversational AI is increasingly used for advice, interpretation, reassurance, and decision support in contexts where users may be vulnerable, uncertain, or dependent on the system's apparent competence. Existing alignment work often focuses on model objectives, preference optimization, or output correctness. Yet, many harms arise through interaction: how systems frame authority, express uncertainty, simulate empathy, support reasoning, and make boundaries legible. This paper introduces the Layered Cognitive Alignment Model (LCAM), a conceptual and normative framework for diagnosing interactional alignment failures in conversational AI. LCAM defines alignment as a calibrated fit among system behavior, user goals, task demands, and normative context. It distinguishes five layers of fit: perceptual, semantic, affective, cognitive, and ethical, and two diagnostic polarities of misalignment: underfit and overreach. We apply LCAM to a published LLM counseling example, showing how an apparently supportive response can reinforce harmful beliefs, simulate inappropriate care, and obscure role boundaries. By translating conversational failures into audit and governance questions concerning over-reliance, false intimacy, autonomy erosion, boundary confusion, and inappropriate trust, LCAM offers a theoretical and normative lens for evaluating conversational AI beyond accuracy, helpfulness, or trust.
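Illustrative sketch
The five layers and two polarities named in the abstract form a 5 x 2 grid, which an auditor could use to record observations about a single response. The Python sketch below shows one possible way to represent that grid. It is not the authors' implementation: the names `Layer`, `Polarity`, `Finding`, and `tally` are hypothetical, and the assignment of the three example observations to particular layers and polarities is an assumption made for illustration, not a mapping taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    # The five layers of fit listed in the abstract.
    PERCEPTUAL = "perceptual"
    SEMANTIC = "semantic"
    AFFECTIVE = "affective"
    COGNITIVE = "cognitive"
    ETHICAL = "ethical"


class Polarity(Enum):
    # The two diagnostic polarities of misalignment listed in the abstract.
    UNDERFIT = "underfit"
    OVERREACH = "overreach"


@dataclass(frozen=True)
class Finding:
    """One auditor observation about a response (hypothetical record type)."""
    layer: Layer
    polarity: Polarity
    note: str


def tally(findings):
    """Count findings per (layer, polarity) cell, including empty cells."""
    counts = Counter((f.layer, f.polarity) for f in findings)
    return {(layer, pol): counts[(layer, pol)] for layer in Layer for pol in Polarity}


# ASSUMPTION: the layer/polarity chosen for each note is illustrative only.
findings = [
    Finding(Layer.AFFECTIVE, Polarity.OVERREACH, "simulates inappropriate care"),
    Finding(Layer.COGNITIVE, Polarity.OVERREACH, "reinforces a harmful belief"),
    Finding(Layer.ETHICAL, Polarity.UNDERFIT, "role boundaries not made legible"),
]
grid = tally(findings)
```

Keeping empty cells in the grid makes it visible which layers an audit did not flag, which may be as informative as the cells that were flagged.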