LLM Agent Reasoning Safety Interpretability

Pedestrian-Aware LLM-Driven Behavioral Planning for Autonomous Vehicles

Aidana Baimbetova, Haruki Yonekura, Hamada Rizk, Hirozumi Yamaguchi

Published May 19, 2026

Editorial review: 7.2

Relevance: 0.477

Freshness: 0.000

Why It Matters

Anticipating pedestrian behavior is central to autonomous vehicle safety in urban settings. This approach offers a potentially more adaptable and interpretable alternative to reinforcement-learning controllers, which rely on handcrafted rewards and produce opaque decisions.

LLMs are used to improve autonomous vehicle decision-making by interpreting pedestrian behavior.

Summary

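The paper proposes a decision-making framework for autonomous vehicles that uses a large language model to interpret pedestrian behavior in urban environments. The framework converts structured scene observations into natural-language prompts; the LLM infers pedestrian intent and produces a tactical driving decision, which a motion planner then executes. In SUMO simulations covering jaywalking, turn-back crossing, hesitation, and bidirectional crossing, the zero-shot agent reaches a 68% collision-free success rate against 17.7% for deep reinforcement learning baselines, and few-shot episodic memory raises this to 96.0% in a single-pedestrian scenario against 82.0% for a custom DQN controller.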
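The observation-to-prompt step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Scene` and `Pedestrian` fields, the action vocabulary in `ACTIONS`, the prompt wording, and the conservative `STOP` fallback are all assumptions, and the call to the LLM itself is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Pedestrian:
    # All fields are hypothetical; the paper's observation schema is not given in the abstract.
    ped_id: str
    distance_m: float        # distance ahead of the ego vehicle
    lateral_offset_m: float  # offset from the ego lane center
    speed_mps: float
    heading: str             # e.g. "toward_lane", "away_from_lane", "stationary"


@dataclass
class Scene:
    ego_speed_mps: float
    pedestrians: list[Pedestrian]


# Assumed tactical action vocabulary, for illustration only.
ACTIONS = ("ACCELERATE", "MAINTAIN", "DECELERATE", "STOP")


def build_prompt(scene: Scene) -> str:
    """Render a structured scene as a natural-language reasoning prompt."""
    lines = [
        "You are the behavioral planner of an autonomous vehicle.",
        f"Ego speed: {scene.ego_speed_mps:.1f} m/s.",
    ]
    if not scene.pedestrians:
        lines.append("No pedestrians detected.")
    for p in scene.pedestrians:
        lines.append(
            f"Pedestrian {p.ped_id}: {p.distance_m:.1f} m ahead, "
            f"{p.lateral_offset_m:.1f} m from lane center, "
            f"moving at {p.speed_mps:.1f} m/s, heading {p.heading}."
        )
    lines.append("Infer each pedestrian's intent and assess the collision risk.")
    lines.append("Answer with one action from: " + ", ".join(ACTIONS) + ".")
    return "\n".join(lines)


def parse_action(reply: str) -> str:
    """Return the first allowed action word in the reply, else fall back to STOP."""
    for word in re.findall(r"[A-Z]+", reply.upper()):
        if word in ACTIONS:
            return word
    return "STOP"  # assumed conservative default when the reply is unusable
```

Restricting the reply to a small fixed vocabulary keeps the interface to the downstream motion planner simple, and falling back to the most cautious action is one reasonable way to handle replies that cannot be parsed.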

Key contributions

Introduction of an LLM-based framework for pedestrian-aware decision-making in AVs.
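Demonstration of higher collision-free success rates than deep RL baselines: 68% versus 17.7% zero-shot, and 96.0% versus 82.0% for a custom DQN controller with few-shot episodic memory.
Evaluation of cross-behavior transfer: memory derived from turn-back interactions reaches 82.0% success on unseen hesitation scenarios and 90.0% on bidirectional crossing.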

Notable insights

Using LLMs for scene interpretation allows for natural language reasoning in decision-making.
The framework demonstrates the potential of zero-shot and few-shot operation in dynamic pedestrian-interaction scenarios, evaluated in SUMO simulation rather than on real roads.
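One way few-shot episodic memory could feed the prompt is sketched below. The abstract does not describe how episodes are stored or retrieved, so everything here is an assumption: the `Episode` fields, the three-number feature vector, and the nearest-neighbor lookup over collision-free episodes are stand-ins chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Episode:
    # Hypothetical feature vector: (pedestrian distance m, pedestrian speed m/s, ego speed m/s).
    features: tuple[float, float, float]
    description: str      # natural-language summary of the situation
    action: str           # action the agent took
    collision_free: bool  # outcome of the episode


class EpisodicMemory:
    """Stores past episodes and retrieves the most similar successful ones."""

    def __init__(self) -> None:
        self._episodes: list[Episode] = []

    def add(self, episode: Episode) -> None:
        self._episodes.append(episode)

    def retrieve(self, query: tuple[float, float, float], k: int = 2) -> list[Episode]:
        # Keep only collision-free episodes, then rank by Euclidean distance to the query.
        successes = [e for e in self._episodes if e.collision_free]
        successes.sort(key=lambda e: math.dist(e.features, query))
        return successes[:k]


def few_shot_block(examples: list[Episode]) -> str:
    """Format retrieved episodes as numbered examples to prepend to a prompt."""
    return "\n".join(
        f"Example {i}: {e.description} -> {e.action}"
        for i, e in enumerate(examples, start=1)
    )
```

Under this sketch, cross-behavior transfer would correspond to filling the memory from one behavior type (for example turn-back crossings) and querying it in another (hesitation or bidirectional crossing).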

Possible limitations

Not stated in the abstract

Abstract

arXiv:2605.16858v1

Autonomous Vehicles (AVs) must make reliable decisions in dense urban environments where pedestrian behavior is variable, sometimes abnormal, and often unseen during training. Reinforcement learning (RL)-based AV control systems perform well in structured traffic but struggle to generalize to unpredictable pedestrian interactions and out-of-distribution scenarios. Their reliance on handcrafted rewards and opaque decisions further limits their suitability for safety-critical, pedestrian-rich environments. To address these limitations, we introduce a Large Language Model (LLM)-based decision-making framework for pedestrian-aware behavioral planning. The system converts structured scene observations into natural-language reasoning prompts, enabling the LLM to infer pedestrian intent, anticipate risk, and generate cautious tactical driving decisions. These decisions are executed by a motion planner that ensures smooth, kinematically feasible control. We evaluate the framework in SUMO across multiple pedestrian-interaction scenarios, including unexpected jaywalking, turn-back crossing, hesitation, and bidirectional crossing. In zero-shot evaluation, the LLM-based agent achieves a 68% collision-free success rate, substantially outperforming deep RL baselines (17.7%). With few-shot episodic memory in a single-pedestrian scenario, performance increases to 96.0%, exceeding a custom DQN controller (82.0%). Cross-behavior evaluation further shows that memory derived from turn-back interactions transfers to unseen hesitation and bidirectional crossing scenarios, achieving 82.0% and 90.0% success, respectively. The system consistently initiates earlier responses, maintains wider safety buffers, and produces interpretable, human-aligned decisions.