Controllable Black-Box Attacks on VLM-Powered Web Agents
The paper "**AD VWEB**" investigates security vulnerabilities in Vision-Language Models (VLMs) used for web automation tasks. The research introduces a novel *black-box attack framework* that can manipulate VLM-powered web agents without accessing their internal architecture. The authors demonstrate how **adversarial perturbations** to web page elements can deceive these agents into performing unintended actions while maintaining visual consistency for human users. The study explores various attack scenarios across *e-commerce*, *content management*, and *form-filling tasks*, showing how subtle modifications to HTML elements and visual layouts can significantly impact agent behavior. The research reveals critical **security implications** for automated web systems, particularly in sensitive applications like online banking and healthcare portals. The paper also proposes potential defensive strategies and highlights the importance of *robust testing* and *validation* for VLM-based web automation systems. This work contributes valuable insights to the ongoing discussion of AI safety and security in web automation.