Automatic Ontology Construction Using LLMs as an External Layer of Memory, Verification, and Planning for Hybrid Intelligent Systems

Pavel Salovskii (Partenit.io, San Francisco, CA, USA), Iuliia Gorshkova (Partenit.io, San Francisco, CA, USA)

Published Apr 23, 2026 | Featured #8 | In the daily list Apr 24, 2026


Daily score: 69.5

Editorial review: 7.5

Relevance: 0.462

Freshness: 0.722

Why It Matters

What makes this one worth your time

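This work targets three limitations of current LLM-based systems named in the abstract: lack of long-term memory, weak structural understanding, and limited reasoning capabilities. The authors position it as a foundation for agent-based systems, robotics applications, and enterprise AI solutions that need persistent knowledge, explainability, and reliable decision-making.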

The approach pairs LLMs with a structured ontological memory layer, an RDF/OWL knowledge graph, to support reasoning and decision-making.

Summary

The paper proposes a hybrid architecture that integrates large language models with an external ontological memory layer to enhance reasoning capabilities and automate ontology construction from diverse data sources.

Key contributions

Development of an automated pipeline for ontology construction from heterogeneous data sources.
Integration of graph-based reasoning with LLMs for improved multi-step reasoning performance.
Establishment of a generation-verification-correction pipeline that enhances output reliability.
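The normalization and triple-generation steps of such a pipeline can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `BASE` namespace, the `normalize` and `build_triples` helpers, and the sample extractions are all hypothetical, and the entity and relation extraction that would feed them is replaced by a hand-written list.

```python
import re

# Hypothetical namespace, chosen for illustration only (not from the paper).
BASE = "http://example.org/kg/"


def normalize(label: str) -> str:
    """Map a surface form to a canonical IRI.

    Trims and lowercases the label, then collapses every run of
    non-alphanumeric characters into a single underscore.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", label.strip().lower()).strip("_")
    return BASE + slug


def build_triples(extractions):
    """Turn (subject, relation, object) strings into a deduplicated set of IRI triples."""
    graph = set()
    for subj, rel, obj in extractions:
        graph.add((normalize(subj), normalize(rel), normalize(obj)))
    return graph


# Stand-in for the output of an entity/relation extractor.
raw = [
    ("Robot Arm", "located in", "Cell 3"),
    ("robot  arm", "Located-In", "cell 3"),  # same fact, different surface form
    ("Robot Arm", "has part", "Gripper"),
]

graph = build_triples(raw)
for triple in sorted(graph):
    print(triple)
```

Because the first two extractions normalize to the same IRIs, the resulting graph holds two triples rather than three. A production pipeline would likely need entity linking that goes well beyond string normalization.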

Notable insights

Representing knowledge as an RDF/OWL graph gives the LLM a persistent, verifiable, and semantically grounded memory to reason over, rather than relying only on parametric knowledge and vector-based retrieval.
The use of SHACL and OWL constraints for validation introduces a formal mechanism for ensuring the correctness of generated outputs.
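To make the idea of constraint-based validation concrete, here is a small pure-Python sketch that checks a set of triples against one shape-like rule. It is only a stand-in: the dictionary keys loosely mirror the SHACL terms `sh:targetClass`, `sh:path`, `sh:minCount`, `sh:maxCount`, and `sh:class`, but the `SHAPE` structure, the `validate` function, and the `ex:` identifiers are assumptions made for illustration. A real system would run an actual SHACL engine over an RDF graph.

```python
RDF_TYPE = "rdf:type"

# Hypothetical shape: every ex:Robot must have exactly one ex:locatedIn value,
# and that value must itself be typed as ex:Location.
SHAPE = {
    "target_class": "ex:Robot",
    "path": "ex:locatedIn",
    "min_count": 1,
    "max_count": 1,
    "value_class": "ex:Location",
}


def validate(triples, shape):
    """Return a list of violation messages; an empty list means the graph conforms."""
    # Index the declared classes of every node.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    violations = []
    focus_nodes = sorted(n for n, cls in types.items() if shape["target_class"] in cls)
    for node in focus_nodes:
        values = sorted(o for s, p, o in triples if s == node and p == shape["path"])
        if len(values) < shape["min_count"]:
            violations.append(f"{node}: fewer than {shape['min_count']} value(s) for {shape['path']}")
        if len(values) > shape["max_count"]:
            violations.append(f"{node}: more than {shape['max_count']} value(s) for {shape['path']}")
        for value in values:
            if shape["value_class"] not in types.get(value, set()):
                violations.append(f"{node}: {value} is not a {shape['value_class']}")
    return violations


triples = {
    ("ex:arm1", RDF_TYPE, "ex:Robot"),
    ("ex:arm1", "ex:locatedIn", "ex:cell3"),
    ("ex:cell3", RDF_TYPE, "ex:Location"),
    ("ex:arm2", RDF_TYPE, "ex:Robot"),           # no location at all
    ("ex:arm3", RDF_TYPE, "ex:Robot"),
    ("ex:arm3", "ex:locatedIn", "ex:gripper"),   # ex:gripper is not a Location
}

violations = validate(triples, SHAPE)
for message in violations:
    print(message)
```

Here `ex:arm1` conforms, while `ex:arm2` and `ex:arm3` each produce one violation. Messages of this kind are what a correction step could hand back to the generator.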

Possible limitations

Not stated in the abstract.

Abstract

arXiv:2604.20795v1

This paper presents a hybrid architecture for intelligent systems in which large language models (LLMs) are extended with an external ontological memory layer. Instead of relying solely on parametric knowledge and vector-based retrieval (RAG), the proposed approach constructs and maintains a structured knowledge graph using RDF/OWL representations, enabling persistent, verifiable, and semantically grounded reasoning. The core contribution is an automated pipeline for ontology construction from heterogeneous data sources, including documents, APIs, and dialogue logs. The system performs entity recognition, relation extraction, normalization, and triple generation, followed by validation using SHACL and OWL constraints, and continuous graph updates. During inference, LLMs operate over a combined context that integrates vector-based retrieval with graph-based reasoning and external tool interaction. Experimental observations on planning tasks, including the Tower of Hanoi benchmark, indicate that ontology augmentation improves performance in multi-step reasoning scenarios compared to baseline LLM systems. In addition, the ontology layer enables formal validation of generated outputs, transforming the system into a generation-verification-correction pipeline. The proposed architecture addresses key limitations of current LLM-based systems, including lack of long-term memory, weak structural understanding, and limited reasoning capabilities. It provides a foundation for building agent-based systems, robotics applications, and enterprise AI solutions that require persistent knowledge, explainability, and reliable decision-making.
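The generation-verification-correction loop can be illustrated on the Tower of Hanoi task the abstract mentions. The sketch below is an assumption-laden toy, not the paper's system: a plain Python simulation stands in for the ontology-based verifier, `stub_generator` stands in for an LLM, and the names `first_violation`, `solve`, and `generate_verify_correct` are hypothetical.

```python
def first_violation(plan, n_disks):
    """Simulate a plan on pegs A, B, C.

    Returns (move_index, reason) for the first illegal move, (len(plan), reason)
    if all moves are legal but the goal is not reached, or None if the plan
    legally moves every disk to peg C.
    """
    pegs = {"A": list(range(n_disks, 0, -1)), "B": [], "C": []}
    for i, (src, dst) in enumerate(plan):
        if not pegs[src]:
            return i, f"peg {src} is empty"
        if pegs[dst] and pegs[dst][-1] < pegs[src][-1]:
            return i, f"disk {pegs[src][-1]} cannot go on smaller disk {pegs[dst][-1]}"
        pegs[dst].append(pegs[src].pop())
    if len(pegs["C"]) != n_disks:
        return len(plan), "goal not reached"
    return None


def solve(n, src="A", aux="B", dst="C"):
    """Classic recursive solution, used here as a stand-in for a corrected generation."""
    if n == 0:
        return []
    return solve(n - 1, src, dst, aux) + [(src, dst)] + solve(n - 1, aux, src, dst)


def stub_generator(n_disks, feedback):
    """Stand-in for an LLM: the first attempt is flawed, later attempts use the feedback."""
    if feedback is None:
        return [("A", "C"), ("A", "C")]  # second move stacks a larger disk on a smaller one
    return solve(n_disks)


def generate_verify_correct(generate, n_disks, max_rounds=3):
    """Generate a plan, verify it, and feed any violation back until it passes."""
    feedback = None
    for _ in range(max_rounds):
        plan = generate(n_disks, feedback)
        feedback = first_violation(plan, n_disks)
        if feedback is None:
            return plan
    raise RuntimeError(f"no valid plan after {max_rounds} rounds: {feedback}")


plan = generate_verify_correct(stub_generator, 3)
print(len(plan), "moves:", plan)
```

The key design point is that the verifier is deterministic and independent of the generator, so an invalid plan is never returned as a result. It is either corrected within `max_rounds` or reported as a failure.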