AI, Take the Wheel: What Drives Delegation and Trust in Human-Computer Cooperative Question Answering?
Maharshi Gor, Yoo Yeon Sung, Yu Hou, Eve Fleisig, Irene Ying, Tianyi Zhou, Jordan Boyd-Graber
Why It Matters
Humans can err when deciding whether to trust an AI system over their own judgment, so designing better collaborative systems requires understanding when, why, and how people rely on AI.
This study examines two reliance decisions, delegation and adoption, with the same users in a realistic competitive setting.
Summary
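The paper investigates how humans decide whether to trust AI systems during collaborative question answering. It focuses on two reliance decisions, delegation (letting the AI act autonomously without seeing its output) and adoption (evaluating an AI suggestion and deciding how to use it), and reports empirical findings from a competitive game setting.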
Key contributions
- Empirical analysis of delegation and adoption choices in human-AI collaboration.
- Identification of specific biases affecting human reliance on AI suggestions.
- Recommendations for improving trust through calibrated confidence and evidence-grounded explanations.
Notable insights
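- Humans both under-rely on correct AI suggestions (3.9% of opportunities missed) and over-rely when the AI misleads them (1.7%); the abstract attributes part of the suboptimal reliance to confirmation bias.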
- The study's competitive game setting provides a realistic context for evaluating human-AI collaboration.
Possible limitations
- The sample is relatively small (24 matches), so the findings may not generalize to broader contexts.
- Potential biases among the expert human participants are not addressed.
Abstract
arXiv:2605.28255v1
AI systems are fallible, and humans can make mistakes in deciding whether to trust AI over their own judgment. Thus, improving human-AI collaboration requires understanding when, why, and how humans decide to rely on AI. We study two distinct reliance decisions: the delegation choice -- deciding when to let AI act autonomously without knowing its output, and the adoption choice -- evaluating AI suggestions and deciding how to use them. Both of these decoupled reliance patterns shape collaboration, but prior work rarely studies them together in realistic settings with the same users. We address this gap by studying collaborative human-AI teams competing in a question-answering game in which humans can choose when and how to work with AI agents to win. Our 24 matches pair 23 expert humans with 16 AI agents, capturing 387 delegation and 1440 adoption decisions. While human-AI collaboration performs better than either AI or humans alone, humans make suboptimal collaboration decisions, both under-relying on correct AI suggestions (3.9% of opportunities missed) and over-relying when AI misleads them (1.7%). Both parties contribute wrong answers: reported model confidence is near chance when humans and AI disagree, while confirmation bias drives higher under-reliance (64.5%) when an AI suggestion agrees with humans' initial incorrect answer. To close this gap, we recommend calibrated confidence, evidence-grounded explanations, and mechanisms that help users refine trust.
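To make the under-reliance and over-reliance figures more concrete, here is a minimal Python sketch of how such rates might be computed from a log of adoption decisions. It is not the authors' code or their exact metric: the record fields (human_initial_correct, ai_correct, final_correct), the function name reliance_rates, and the choice of denominator (all logged adoption decisions) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AdoptionDecision:
    """One hypothetical adoption record; field names are illustrative assumptions."""
    human_initial_correct: bool  # was the human's own first answer correct?
    ai_correct: bool             # was the AI suggestion correct?
    final_correct: bool          # was the team's final answer correct?


def reliance_rates(decisions: List[AdoptionDecision]) -> Dict[str, float]:
    """Return assumed under- and over-reliance rates over all decisions.

    Assumed definitions (not taken from the paper):
      under-reliance: the AI was correct, but the final answer was wrong.
      over-reliance:  the human started out correct, the AI was wrong,
                      and the final answer ended up wrong.
    """
    if not decisions:
        raise ValueError("need at least one decision")
    total = len(decisions)
    under = sum(1 for d in decisions if d.ai_correct and not d.final_correct)
    over = sum(
        1
        for d in decisions
        if d.human_initial_correct and not d.ai_correct and not d.final_correct
    )
    return {"under_reliance": under / total, "over_reliance": over / total}


# Toy data, invented purely to exercise the function.
toy_log = [
    AdoptionDecision(False, True, False),  # ignored a correct suggestion
    AdoptionDecision(True, False, False),  # misled by a wrong suggestion
    AdoptionDecision(True, True, True),    # both right, final right
    AdoptionDecision(False, True, True),   # adopted a correct suggestion
]
rates = reliance_rates(toy_log)
print(rates)
```

Using all adoption decisions as the denominator keeps the two rates directly comparable; a per-opportunity denominator (for example, only cases where the AI was correct) would be an equally plausible choice and would yield different numbers.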