MIND: Multi-rationale INtegrated Discriminative Reasoning Framework for Multi-modal Large Models
Chuang Yu, Jinmiao Zhao, Mingxuan Zhao, Yunpeng Liu, Xiujun Shu, Yuanhao Feng, Bo Wang, Xiangyu Yue
Why It Matters
What makes this one worth your time
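This work targets three weaknesses the authors identify in current multimodal large language models (MLLMs): limited multi-rationale semantic modeling, insufficient logical robustness, and susceptibility to misleading cues. Addressing them could make MLLM reasoning more reliable on complex tasks.
MIND aims to shift MLLM reasoning from passive imitation to active discrimination by integrating multiple rationales into training.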
Summary
The paper introduces the MIND framework, which aims to improve the logical robustness and semantic modeling of multimodal large language models (MLLMs). It combines a multi-rationale data paradigm (RAD), a two-stage correction learning strategy (P2CL), and a contrastive alignment optimization strategy (MCA).
Key contributions
- Introduction of the MIND framework for multi-rationale reasoning in MLLMs.
- Development of the Rationale Augmentation and Discrimination (RAD) paradigm.
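- Design of the Progressive Two-stage Correction Learning (P2CL) strategy.
- Proposal of the Multi-rationale Contrastive Alignment (MCA) optimization strategy to mitigate representation entanglement in the multi-rationale semantic space.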
Notable insights
- The authors describe the RAD paradigm as a unified and extensible data foundation for rationale integration.
- The P2CL strategy splits training into two phases: the first enhances multi-rationale positive learning, and the second enables active logic discrimination and correction.
Possible limitations
- Not stated in the abstract.
Abstract
arXiv:2512.05530v2
Recently, multimodal large language models (MLLMs) have been widely applied to reasoning tasks. However, they suffer from limited multi-rationale semantic modeling, insufficient logical robustness, and susceptibility to misleading cues. Therefore, we propose a Multi-rationale INtegrated Discriminative (MIND) reasoning framework, which is designed to endow MLLMs with human-like cognitive abilities of "Understand -> Rethink -> Correct", and achieves a paradigm evolution from passive imitation-based reasoning to active discriminative reasoning. Specifically, we introduce a Rationale Augmentation and Discrimination (RAD) paradigm, which provides a unified and extensible data foundation. Meanwhile, we design a Progressive Two-stage Correction Learning (P2CL) strategy. The first phase enhances multi-rationale positive learning, while the second phase enables active logic discrimination and correction. In addition, to mitigate representation entanglement in the multi-rationale semantic space, we propose a Multi-rationale Contrastive Alignment (MCA) optimization strategy. Extensive experiments show that our MIND achieves SOTA performance on multiple public datasets. Our data and code are available at https://github.com/YuChuang1205/MIND
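The abstract says MCA mitigates representation entanglement but does not give its objective. As a rough illustration of what contrastive alignment generally means, the sketch below implements a generic InfoNCE-style loss in plain Python. It is not the authors' formulation: the function names `cosine` and `info_nce`, the `temperature` value, and the choice to treat some rationale embeddings as positives and others as negatives are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positives, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (hypothetical stand-in, not the paper's MCA).

    The loss is small when the anchor is closer to the positive embeddings
    than to the negative ones, and large in the opposite case.
    """
    pos_logits = [cosine(anchor, p) / temperature for p in positives]
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]
    all_logits = pos_logits + neg_logits

    # Numerically stable log-sum-exp over all candidates.
    max_logit = max(all_logits)
    log_denom = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in all_logits)
    )

    # Average negative log-softmax probability of each positive.
    return sum(log_denom - x for x in pos_logits) / len(pos_logits)


# Toy 2-D embeddings (made up for illustration only).
question = [1.0, 0.0]
aligned_rationale = [0.9, 0.1]
misleading_rationale = [0.0, 1.0]

good = info_nce(question, [aligned_rationale], [misleading_rationale])
bad = info_nce(question, [misleading_rationale], [aligned_rationale])
print(good < bad)
```

In this toy setup the loss is lower when the aligned rationale is treated as the positive, which is the separation effect a contrastive objective is meant to encourage. For the actual MCA definition, see the paper and the linked repository.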