Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis
Hongbo Wen, Ying Li, Hanzhi Liu, Chaofan Shou, Yanju Chen, Yuan Tian, Yu Feng
Why It Matters
What makes this one worth your time
Agent skills can hand an LLM-driven agent high-impact capabilities such as executing shell commands or signing blockchain transactions, so being able to audit them before deployment matters to AI engineers and security researchers.
Semia offers a novel approach to auditing agent skills by synthesizing their representations into a Datalog fact base for security analysis.
Summary
The paper introduces Semia, a static auditor for agent skills that uses Constraint-Guided Representation Synthesis (CGRS) to convert each skill into a Datalog fact base, so that security properties can be checked as reachability queries. The authors evaluate Semia on 13,728 real-world skills from public marketplaces and report that more than half carry at least one critical semantic risk.
Key contributions
- Development of Semia, a static auditor for agent skills.
- Introduction of Constraint-Guided Representation Synthesis for synthesizing skill representations.
- Evaluation of Semia on 13,728 real-world skills from public marketplaces, plus a stratified sample of 541 expert-labeled skills on which it reaches 97.7% recall and an F1 of 90.6%.
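The abstract describes CGRS only as a propose-verify-evaluate loop that refines LLM candidates until convergence, so the following is a minimal sketch of that control flow and not the authors' implementation. Every name here (`synthesize`, `toy_verify`, `toy_evaluate`, `make_scripted_proposer`, the fact tuples) is hypothetical, and the proposer is a scripted stand-in for an LLM call.

```python
# Hypothetical sketch of a propose-verify-evaluate loop.
# None of these names or fact shapes come from the paper.

def synthesize(propose, verify, evaluate, max_rounds=5):
    """Ask for candidates until both checks return no feedback, or give up."""
    feedback = []
    for _ in range(max_rounds):
        candidate = propose(feedback)
        # Structural check first; the semantic check only runs on sound candidates.
        feedback = verify(candidate) or evaluate(candidate)
        if not feedback:
            return candidate
    return None  # did not converge within the round budget

# Assumed toy constraints: interfaces the skill declares, and facts the prose implies.
DECLARED = {"send_email", "read_inbox"}
EXPECTED = {("action", "send_email"), ("source", "read_inbox")}

def toy_verify(candidate):
    """Structural soundness: every fact must name a declared interface."""
    return [f"undeclared interface: {name}"
            for _, name in sorted(candidate) if name not in DECLARED]

def toy_evaluate(candidate):
    """Semantic faithfulness: every fact implied by the prose must be present."""
    return [f"missing fact: {fact}" for fact in sorted(EXPECTED - candidate)]

def make_scripted_proposer():
    """Stand-in for an LLM. A real proposer would condition on the feedback;
    this one simply replays three fixed drafts and ignores it."""
    drafts = iter([
        frozenset({("action", "send_mail")}),    # names an undeclared interface
        frozenset({("action", "send_email")}),   # sound, but omits the inbox source
        frozenset({("action", "send_email"), ("source", "read_inbox")}),
    ])
    def propose(feedback):
        return next(drafts)
    return propose

result = synthesize(make_scripted_proposer(), toy_verify, toy_evaluate)
```

In this toy run the first draft fails the structural check, the second fails the semantic check, and the third is accepted. Returning `None` when the round budget runs out is a design choice of the sketch; the abstract does not say how non-converging skills are handled.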
Notable insights
- Constraint-Guided Representation Synthesis refines LLM-proposed candidates in a propose-verify-evaluate loop until convergence, with the goal of a fact base that is both structurally sound and semantically faithful to the skill's prose.
- Reducing security properties to Datalog reachability queries provides a structured approach to auditing complex agent skills.
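To make the reachability idea concrete, here is a small sketch of a taint query evaluated as a fixpoint, written in Python for readability. The relation names (`sources`, `flows`, `sinks`, `checkpoints`) and the example skill are assumptions made for illustration; the abstract does not list SDL's actual predicates or rules.

```python
# Hypothetical fact base and query; these relation names are not from the paper.
# The rules being evaluated, in Datalog-like notation:
#   tainted(X)   :- source(X).
#   tainted(Y)   :- tainted(X), flow(X, Y), not checkpoint(Y).
#   violation(Y) :- tainted(Y), sink(Y).

def tainted_nodes(sources, flows, checkpoints):
    """Naive fixpoint: propagate taint along flows, stopping at checkpoints."""
    tainted = set(sources)
    changed = True
    while changed:
        changed = False
        for src, dst in flows:
            if src in tainted and dst not in checkpoints and dst not in tainted:
                tainted.add(dst)
                changed = True
    return tainted

def unguarded_sinks(sources, flows, sinks, checkpoints):
    """High-impact sinks that tainted data reaches without passing a checkpoint."""
    return tainted_nodes(sources, flows, checkpoints) & set(sinks)

# Toy skill: the email body is untrusted input; two sinks are high-impact.
sources = {"email_body"}
flows = {
    ("email_body", "summarize"),
    ("summarize", "run_shell"),            # no human approval on this path
    ("email_body", "confirm_with_user"),
    ("confirm_with_user", "sign_tx"),      # only reachable through the checkpoint
}
sinks = {"run_shell", "sign_tx"}
checkpoints = {"confirm_with_user"}        # human-in-the-loop step

violations = unguarded_sinks(sources, flows, sinks, checkpoints)
```

Here `violations` contains only `run_shell`, because the path to `sign_tx` passes through the human-in-the-loop checkpoint. A Datalog engine would evaluate the same rules declaratively; the explicit loop is used only to keep the sketch self-contained.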
Possible limitations
- Not stated in the abstract
Abstract
arXiv:2605.00314v1
An agent skill is a configuration package that equips an LLM-driven agent with a concrete capability, such as reading email, executing shell commands, or signing blockchain transactions. Each skill is a hybrid artifact: a structured half declares executable interfaces, while a prose half dictates when and how those interfaces fire, and the prose is reinterpreted probabilistically on every invocation. Conventional static analyzers parse the structured half but ignore the prose; LLM-based tools read the prose but cannot reproducibly prove that a tainted input reaches a high-impact sink. We present Semia, a static auditor for agent skills. Semia lifts each skill into the Skill Description Language (SDL), a Datalog fact base that captures LLM-triggered actions, prose-defined conditions, and human-in-the-loop checkpoints. Synthesizing a fact base that is both structurally sound and semantically faithful to the original prose is the central challenge; we address it with Constraint-Guided Representation Synthesis (CGRS), a propose-verify-evaluate loop that refines LLM candidates until convergence. Security properties (e.g., indirect injection, secret leakage, confused deputies, unguarded sinks, etc.) over an agent skill can then be reduced to Datalog reachability queries. We evaluate Semia on 13,728 real-world skills from public marketplaces. Semia renders all of them auditable and finds that more than half carry at least one critical semantic risk. On a stratified sample of 541 expert-labeled skills, Semia achieves 97.7% recall and an F1 of 90.6%, substantially outperforming signature-based scanners and LLM baselines.
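The abstract reports recall and F1 on the labeled sample but not precision. Assuming F1 is the usual harmonic mean of precision and recall computed on that same sample, the implied precision can be recovered with a little algebra; the helper name below is hypothetical, and the result is an estimate subject to rounding in the reported figures, not a number stated by the authors.

```python
def precision_from_recall_and_f1(recall, f1):
    """Solve F1 = 2*P*R / (P + R) for P, giving P = F1*R / (2*R - F1)."""
    return f1 * recall / (2 * recall - f1)

# Figures quoted in the abstract for the 541-skill labeled sample.
implied_precision = precision_from_recall_and_f1(recall=0.977, f1=0.906)
```

Under those assumptions `implied_precision` comes out at about 0.845, which would suggest the tool leans toward recall over precision on this sample.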