T2MM: An LLM Supported Architecture For Inquiry-Based Modeling

John Kos, Rudra Singh, Ashok Goel

Published Jun 11, 2026. Featured #2 in the daily list, Jun 12, 2026.


Daily score: 73.3

Editorial review: 7.5

Relevance: 0.461

Freshness: 0.722

Why It Matters

What makes this one worth your time

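This research addresses a gap in LLM-based educational tools: they typically lack the visual interactivity that some learning contexts require. T2MM generates interactive models that learners can still adjust by hand, which could benefit model construction in science education. The evaluation reported in the abstract covers technical feasibility, not learning outcomes.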

T2MM supports inquiry-based modeling by generating interactive, editable models rather than static images.

Summary

The paper presents T2MM, an architecture that integrates large language models with multimodal capabilities to enhance interactive model construction in inquiry-based learning environments, specifically within the VERA software.

Key contributions

Introduction of the T2MM architecture for interactive model construction.
Demonstration that T2MM outperforms a baseline architecture built on LLM-supported full code generation across all measured success metrics.
Integration of multimodal capabilities into inquiry-based learning tools.

Notable insights

The architecture accounts for the current state of the learner's model and produces interactive models that remain responsive to manual adjustment, rather than static images.
The evaluation uses a custom, procedurally generated dataset of natural language modeling requests paired with target models in VERA, an approach aimed at measuring technical feasibility in this specific educational context.
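
To make the idea of a procedurally generated evaluation set concrete, here is a minimal Python sketch. It is not the paper's dataset generator: the vocabulary, the request template, the triple-based model representation, and the exact-match metric are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
from dataclasses import dataclass

# Hypothetical vocabulary; the abstract does not list VERA's component or relation types.
SPECIES = ["wolf", "deer", "grass", "rabbit", "hawk"]
RELATIONS = ["consumes", "competes with"]


@dataclass(frozen=True)
class Example:
    request: str        # natural language learner request
    target: frozenset   # target model as (source, relation, destination) triples


def generate_example(rng: random.Random) -> Example:
    """Fill a fixed template with two distinct species and one relation."""
    source, dest = rng.sample(SPECIES, 2)
    relation = rng.choice(RELATIONS)
    request = f"Add a {source} that {relation} the {dest}."
    return Example(request, frozenset({(source, relation, dest)}))


def generate_dataset(n: int, seed: int = 0) -> list[Example]:
    """Seeded generation, so the same seed reproduces the same dataset."""
    rng = random.Random(seed)
    return [generate_example(rng) for _ in range(n)]


def exact_match_rate(predictions, dataset) -> float:
    """Fraction of predicted models that equal their target model exactly."""
    hits = sum(pred == ex.target for pred, ex in zip(predictions, dataset))
    return hits / len(dataset)
```

Because each request is generated together with its target model, any architecture that maps a request to a model can be scored automatically, without hand labeling.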

Possible limitations

Not stated in the abstract.

Abstract

arXiv:2606.11210v1

Model Construction is a foundational practice in science learning that relies on visualization and interactivity. Large Language Models, increasingly augmented with multimodal capabilities, have been integrated in education contexts to support learning. However, these tools lack visual interactivity that is required by some learning contexts. We introduce Text to Multimodal Model (T2MM), a robust, dynamic LLM-supported architecture that assists in model construction within the open inquiry ecology-based modeling software Virtual Experimental Research Assistant (VERA). T2MM accounts for the current context of the learner's model and creates interactive models, rather than static images, enabling the model to remain responsive to manual adjustment. To measure technical feasibility, we evaluate T2MM through a custom procedurally generated dataset of natural language learner modeling requests and target models within the VERA system. T2MM outperforms a baseline model generation architecture implemented through LLM-supported full code generation, common in the literature, across all measured success metrics. Our contribution not only outlines LLM integration into an inquiry-based learning modeling tool, but also describes a possible architecture through which more interactive multimodal LLM tools can be created.
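
The abstract does not describe T2MM's internals, so the following Python sketch only illustrates the general idea of a context-aware, editable result. It assumes a hypothetical design in which the LLM returns a short list of structured edit operations that are applied to the learner's current model. The `Model` class, the operation names, and the JSON layout are all invented for this example, and a hard-coded string stands in for the LLM response.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Model:
    components: set = field(default_factory=set)
    relations: set = field(default_factory=set)  # (source, kind, target) triples


def apply_edits(model: Model, edits_json: str) -> Model:
    """Apply a list of edit operations to a copy of the current model."""
    updated = Model(set(model.components), set(model.relations))
    for edit in json.loads(edits_json):
        op = edit["op"]
        if op == "add_component":
            updated.components.add(edit["name"])
        elif op == "add_relation":
            triple = (edit["source"], edit["kind"], edit["target"])
            # Reject relations that reference components missing from the model.
            if not {edit["source"], edit["target"]} <= updated.components:
                raise ValueError(f"unknown component in relation: {triple}")
            updated.relations.add(triple)
        elif op == "remove_component":
            name = edit["name"]
            updated.components.discard(name)
            # Drop every relation that touched the removed component.
            updated.relations = {
                r for r in updated.relations if name not in (r[0], r[2])
            }
        else:
            raise ValueError(f"unsupported operation: {op}")
    return updated


# The learner's current model, plus a stand-in for the LLM's structured output.
current = Model({"deer", "grass"}, {("deer", "consumes", "grass")})
llm_output = json.dumps([
    {"op": "add_component", "name": "wolf"},
    {"op": "add_relation", "source": "wolf", "kind": "consumes", "target": "deer"},
])
updated = apply_edits(current, llm_output)
```

In this sketch the result is ordinary model state, not an image, so a learner could keep editing it by hand. Unknown operations are also rejected before they reach the model, a check that is harder to make on freely generated code.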