Autonomous Intelligent Agents for Natural-Language-Driven Web Execution with Integrated Security Assurance
Vinil Pasupuleti, Siva Rama Krishna Varma Bayyavarapu, Shrey Tyagi
Why It Matters
What makes this one worth your time
This work targets brittle web test suites, where UI refactors break locators and timing changes introduce race conditions, and pairs test automation with security validation. The practical payoff is less time spent authoring and repairing tests and more reliable software.
It presents a framework that automates web test generation and adds natural-language-driven security validation.
Summary
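The paper introduces an AI-driven autonomous testing framework that targets common web-test failure modes with five integrated strategies: navigation reliability, context-aware selector generation, post-generation validation, smart wait injection, and failure learning. Across four production applications and 176 scenarios, the authors report script generation success rising from 55% to 93%, an 8x reduction in navigation failures, and the elimination of 80% of timing-related race conditions. The framework also supports natural-language-driven security testing.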
Key contributions
- Development of an AI-driven autonomous testing framework with five integrated strategies for improved web testing.
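- Empirical results across four production applications and 176 scenarios: script generation success rises from 55% to 93%, navigation failures fall 8x, 80% of timing-related race conditions are eliminated, and test creation time drops by 75% compared to manual Selenium authoring.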
- Introduction of natural-language-driven security testing aligned with OWASP standards.
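To make "smart wait injection" concrete, here is a minimal sketch under stated assumptions. The abstract does not describe how the framework implements this strategy, so the step format (plain dictionaries with `action` and `selector` keys), the `inject_waits` function, and the `wait_for` action name are all hypothetical. The idea illustrated is only the general one: place an explicit wait in front of every interaction so the script does not race the page.

```python
from typing import Any, Dict, List

# Hypothetical step vocabulary; not taken from the paper.
INTERACTIVE_ACTIONS = {"click", "type", "select"}


def inject_waits(steps: List[Dict[str, Any]], timeout_s: int = 10) -> List[Dict[str, Any]]:
    """Return a copy of `steps` with an explicit wait before each interaction.

    A wait is skipped when the preceding step already waits for the same
    selector, so running the function twice does not add duplicates.
    """
    result: List[Dict[str, Any]] = []
    for step in steps:
        if step["action"] in INTERACTIVE_ACTIONS:
            previous = result[-1] if result else None
            already_waited = (
                previous is not None
                and previous.get("action") == "wait_for"
                and previous.get("selector") == step["selector"]
            )
            if not already_waited:
                result.append(
                    {"action": "wait_for", "selector": step["selector"], "timeout_s": timeout_s}
                )
        result.append(dict(step))  # copy so the input list is left untouched
    return result


script = [
    {"action": "goto", "url": "https://example.test/login"},
    {"action": "type", "selector": "#user", "text": "alice"},
    {"action": "click", "selector": "#submit"},
]
patched = inject_waits(script)
```

In a real Selenium script, each `wait_for` step would typically become an explicit wait on the element rather than a fixed sleep; that mapping is left out here to keep the sketch dependency-free.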
Notable insights
- Security tests start from plain-English attack scenarios (for example, "try accessing another user's invoice"), which the agent converts to OWASP Top 10-aligned browser probes, pairing user-friendly input with technical execution.
- The decoupling of orchestration from long-running browser execution in a containerized architecture allows for more efficient resource management.
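The decoupling idea can be illustrated with a small sketch. The paper uses containerised workers; the version below substitutes in-process threads and a standard-library queue purely for illustration, and the names `run_jobs`, `login_test`, and `checkout_test` are hypothetical. The point shown is that the orchestrator only enqueues jobs and collects outcomes, while the long-running work happens in workers.

```python
import queue
import threading
from typing import Callable, Dict


def run_jobs(jobs: Dict[str, Callable[[], str]], worker_count: int = 2) -> Dict[str, str]:
    """Hand jobs to a pool of workers and collect one outcome per job id."""
    pending = queue.Queue()  # items are (job_id, job) tuples, or None as a stop signal
    results: Dict[str, str] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            item = pending.get()
            if item is None:  # stop signal: this worker is done
                pending.task_done()
                return
            job_id, job = item
            try:
                outcome = job()
            except Exception as exc:  # a failing job must not take the worker down
                outcome = f"failed: {exc}"
            with lock:
                results[job_id] = outcome
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job_id, job in jobs.items():
        pending.put((job_id, job))
    for _ in threads:
        pending.put(None)  # one stop signal per worker
    pending.join()
    for thread in threads:
        thread.join()
    return results


def login_test() -> str:
    return "passed"


def checkout_test() -> str:
    raise RuntimeError("element not found")


results = run_jobs({"login": login_test, "checkout": checkout_test})
```

Here `run_jobs` blocks until every job finishes, which keeps the example short; an orchestrator that must stay responsive would return immediately and poll or subscribe for results instead.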
Possible limitations
- Not stated in the abstract.
Abstract
arXiv:2605.15281v1
Modern web test suites rot. A UI refactor breaks locators, a timing change causes race conditions, and within weeks developers abandon the suite entirely. This paper presents an AI-driven autonomous testing framework that addresses these failure modes through five integrated strategies - navigation reliability, context-aware selector generation, post-generation validation, smart wait injection, and failure learning - implemented over a containerised worker architecture that decouples orchestration from long-running browser execution. Evaluated across four production applications and 176 scenarios, the framework improves script generation success from 55% to 93%, achieves an 8x reduction in navigation failures, eliminates 80% of timing-related race conditions, and reduces test creation time by 75% compared to manual Selenium authoring. The framework extends naturally to security validation: testers describe attack scenarios in plain English - "try accessing another user's invoice" - which the agent converts to OWASP Top 10-aligned browser probes, detecting 85% of authentication bypass vulnerabilities and 95% of input validation flaws with false positive rates below 12%. Natural-language-driven security testing of this kind represents, to our knowledge, a novel contribution to the field.
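As a rough illustration of turning a plain-English attack scenario into a structured probe, the sketch below uses simple keyword matching. This is a deliberate stand-in: the abstract says an agent performs the conversion but does not say how, and nothing here reflects the authors' implementation. The `ProbePlan` type, the `RULES` table, and `plan_probe` are hypothetical, and the category labels assume the 2021 edition of the OWASP Top 10, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProbePlan:
    """Hypothetical structured form of a browser security probe."""
    owasp_category: str
    steps: Tuple[str, ...]
    expected_safe_outcome: str


# Keyword rules are an illustrative assumption, not the paper's method.
RULES = [
    (
        ("another user", "someone else"),
        ProbePlan(
            owasp_category="A01:2021 Broken Access Control",
            steps=(
                "log in as user A",
                "request a resource that belongs to user B",
            ),
            expected_safe_outcome="access is denied",
        ),
    ),
    (
        ("script tag", "sql", "inject"),
        ProbePlan(
            owasp_category="A03:2021 Injection",
            steps=(
                "submit a crafted payload in each input field",
                "inspect the response for unsanitised echo or errors",
            ),
            expected_safe_outcome="payload is rejected or safely encoded",
        ),
    ),
]


def plan_probe(scenario: str) -> Optional[ProbePlan]:
    """Return the first plan whose keywords appear in the scenario, else None."""
    text = scenario.lower()
    for keywords, plan in RULES:
        if any(keyword in text for keyword in keywords):
            return plan
    return None


plan = plan_probe("try accessing another user's invoice")
unmatched = plan_probe("check the page title")
```

Returning `None` for an unmatched scenario keeps the sketch honest about its limits: a keyword table cannot cover free-form language, which is presumably why the paper relies on an agent for this step.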