Research & Methodology

Built on research. Designed for the real web.

Switch's approach to agent identification, behavioral analysis, and traffic management is informed by the latest academic research on the agentic web.

The research is clear: the web isn't ready for agents

A growing body of academic work confirms what site managers are experiencing firsthand. Autonomous AI agents are now visiting websites at scale — scraping content, executing multi-step tasks, and operating without the site owner's knowledge or consent. Current web architecture provides no native mechanism for detection, classification, or control.

33%+

of web traffic is non-human

Industry estimates, 2024–2025

~1s

agent task latency with VOIX vs. 10–21 min without

Schultze et al., 2025

96%

LLM accuracy in matching agent tasks to required scopes

El Helou et al., 2025

Research Themes

Three pillars of the agentic web

The research we track falls into three interconnected areas that together define the challenge of managing agent traffic.

Agent Identification

How do we detect agents that don’t identify themselves? Research on web agent architectures reveals the behavioral patterns and signals that distinguish autonomous agents from human visitors.

Agent Authorization

Once identified, how do we control what agents can access? Research on agent authorization highlights the risks of overly broad permissions and the need for task-aware, dynamic access control.

Agent Behavior

Agents are evolving beyond simple crawlers into multi-modal, multi-step autonomous systems. Research on agent behavior patterns informs how Switch’s detection models adapt to new agent types.

Cited Research

Papers that inform our approach

Each paper is analyzed for its direct implications on agent detection, classification, and traffic management.

Agent Identification · High relevance

Building the Web for Agents: A Declarative Framework for Agent–Web Interaction

Sven Schultze, Meike Kietzmann, Nils Lucas Schoenfeld, Ruth Stock-Homburg

Technical University of Darmstadt · 2025

Key Insight

Agents must reverse-engineer human UIs to operate on the web. Without explicit contracts, they scrape DOMs, parse screenshots, and bypass developer-intended workflows — creating brittle, insecure interactions that site owners can’t control.

Notable Findings

Today’s web is designed primarily for human consumption. Agents must infer available actions by scraping HTML, heuristically parsing Document Object Models (DOMs) or even analyzing rendered screenshots.

Describes the fundamental misalignment between agents and human-oriented websites

When an external agent scrapes a site, it bypasses the carefully crafted workflows and interaction patterns designed by the developer. The agent provider, not the site owner, unilaterally decides how to interpret and interact with the page’s functionality.

Highlights the loss of control site owners face from undeclared agent visits

Sensitive, personal, or proprietary information embedded in the web page, such as private messages, financial data, or user details, could be shared without the user’s explicit consent.

Identifies the privacy risk of uncontrolled agent access to web content

What This Means for Switch

This paper validates Switch’s core premise: agents are visiting websites at scale and site owners have no visibility or control. Switch’s identification engine detects these agents — whether they declare themselves or not — and gives site managers the control layer that the current web lacks. The paper’s latency benchmarks confirm that agents like Perplexity Comet and BrowserGym are actively browsing production websites.

Read the full paper on arXiv

Agent Authorization · High relevance

Delegated Authorization for Agents Constrained to Semantic Task-to-Scope Matching

Majed El Helou, Chiara Troiani, Benjamin Ryder, Jean Diaconu, Hervé Muyal, Marcelo Yannuzzi

Cisco Systems · 2025

Key Insight

Current authorization models grant agents overly broad permissions. When agents request access to tools and protected resources, they may operate far beyond the intended scope of their assigned task — creating attack vectors for both malicious and misconfigured agents.

Notable Findings

Authorizing Large Language Model driven agents to dynamically invoke tools and access protected resources introduces significant risks, since current methods for delegating authorization grant overly broad permissions and give access to tools allowing agents to operate beyond the intended task scope.

Establishes the over-scoping problem in current agent authorization

Agents might invoke tools that are technically within the allowed permissions, but operate outside the intended scope of the tasks they were asked to perform, thereby creating potential attack vectors for malicious actors.

Describes how agents can exploit permission gaps, even without malicious intent

What This Means for Switch

Switch’s journey system directly addresses the authorization gap this paper identifies. Rather than hoping agents respect static permissions, Switch dynamically identifies incoming agents, classifies their intent, and routes them through configurable journeys — challenging, throttling, or redirecting based on real-time behavior rather than pre-configured rules.

Read the full paper on arXiv

Agent Behavior · Supporting

Video-Browser: Towards Agentic Open-web Video Browsing

Zhengyang Liang, Yan Shu, Xiangrui Liu, Minghao Qin, Nicu Sebe, Zheng Liu, Lizi Liao

Singapore Management University, University of Trento, BAAI, Hong Kong Polytechnic University · 2026

Key Insight

Agents are evolving from simple text scrapers into sophisticated multi-modal systems that autonomously browse the open web, searching across multiple sources and performing multi-step reasoning. This represents a new class of web traffic that traditional bot detection cannot identify.

Notable Findings

The evolution of autonomous agents is redefining information seeking, transitioning from passive retrieval to proactive, open-ended web research.

Documents the shift from passive retrieval to autonomous, multi-step web browsing

This transition towards agentic web browsing has become a dominant trend in AI research.

Confirms that autonomous web browsing is now a primary AI research direction

What This Means for Switch

This research demonstrates that agents are becoming increasingly sophisticated — moving beyond simple crawlers into autonomous systems that navigate, search, and reason across multiple websites. Switch’s behavioral analysis engine is built to detect these next-generation agents, even when they don’t identify themselves via User-Agent strings.

Read the full paper on arXiv

Platform Standard · High relevance

WebMCP: Structured Agent–Website Interaction via the Model Context Protocol

André Cipriani Bandarra, Chrome Team

Google Chrome · 2026

Key Insight

Chrome is formalizing structured agent–website interaction through WebMCP, providing declarative and imperative APIs that let sites expose "tools" for agents to use — replacing brittle DOM scraping with sanctioned, structured channels. This creates a clear divide between compliant agents (using WebMCP) and rogue scrapers (bypassing it).

Notable Findings

By defining these tools, you tell agents how and where to interact with your site, whether it’s booking a flight, filing a support ticket, or navigating complex data. This direct communication channel eliminates ambiguity and allows for faster, more robust agent workflows.

Describes the core value proposition of structured agent interaction via WebMCP

Today’s web is designed primarily for human consumption. WebMCP aims to provide a standard way for exposing structured tools, ensuring AI agents can perform actions on your site with increased speed, reliability, and precision.

Confirms the same fundamental problem identified by the VOIX paper — now being addressed at the browser platform level

What This Means for Switch

WebMCP validates Switch’s approach and creates a powerful new detection signal. Agents using WebMCP’s structured tools are compliant — they use the sanctioned channel rather than scraping. Agents bypassing WebMCP to achieve the same actions are likely rogue. Switch now detects WebMCP protocol headers to distinguish compliant from non-compliant agent traffic, and the agent-policy meta tag declares WebMCP readiness to well-behaved agents. This is the platform-level standardization of the exact problem Switch was built to solve.

Read the full announcement

Platform Standard · High relevance

Markdown for Agents: Serving Structured Content to AI

Cloudflare Engineering

Cloudflare · 2025

Key Insight

AI agents are a "third audience" for web content — alongside humans and search engines. Serving clean, structured Markdown instead of bloated HTML is dramatically more token-efficient and enables cooperative agent management. The llms.txt standard provides agent discovery, while alternate Markdown links let agents opt into structured content.

Notable Findings

When an AI agent visits your site, it doesn’t need the navigation bar, the JavaScript animations, or the cookie banner. It needs the content — clean, structured, and semantic.

Explains why Markdown is superior to HTML for agent consumption

The llms.txt file acts as a machine-readable guide to your site. Agents that discover it are signaling cooperative intent — they’re following the standards rather than brute-force scraping.

Describes how llms.txt serves as both content discovery and an agent identification signal

What This Means for Switch

Switch now integrates Markdown for Agents as a journey action. Instead of only blocking or challenging detected agents, site owners can serve clean Markdown content or entirely custom replacement pages. This is cooperative agent management — give well-behaved agents what they actually need (structured content) while denying scrapers the raw HTML. The SDK injects an alternate Markdown link tag, and requests to llms.txt are logged as a cooperative-agent identification signal. Combined with WebMCP detection, this creates a full spectrum of agent interaction: from hostile (block) to neutral (challenge) to cooperative (serve Markdown).

Read the full announcement

Agent Identification · High relevance

Form Fill Patterns as AI Agent Detection Signals

Switch Research

Switch (internal research) · 2026

Key Insight

Observing ChatGPT’s browsing agent revealed a critical insight: AI agents can interact with inline form fields but consistently fail to engage with popup overlays. Form fill behavior provides powerful identification signals — typing cadence, fill order, focus-to-keystroke latency, and correction patterns all differ dramatically between humans and agents.

Notable Findings

When presented with a popup asking for verification, ChatGPT’s browsing agent simply ignored it. But when asked to identify itself via an inline form field, it responded honestly.

Direct observation of ChatGPT browsing agent behavior on a Switch-protected site

Bots fill forms instantly, sequentially, and without corrections. Humans pause, skip fields, backtrack, and make typos. The timing variance alone is a 95%+ accurate signal.

Analysis of form interaction patterns in Switch SDK telemetry

What This Means for Switch

Switch now monitors form fill patterns as identification signals: typing cadence variance, focus-to-keystroke latency, fill order vs DOM order, paste frequency, hidden field fills, and correction rates. Two new journey actions leverage this insight: "Inline Challenge" replaces popups with form-based verification that agents can actually interact with, and "Agent Self-ID" adds a non-intrusive field where compliant agents voluntarily identify themselves. This transforms form interactions from a blind spot into both a detection channel and an engagement mechanism.

Read the full announcement

From Research to Product

How Switch applies this research

Agent intent classification

Implemented

Directly implementing ASTRA’s semantic task-to-scope matching, Switch classifies not just whether a visitor is a bot, but what it’s trying to do — content scraping, search indexing, task automation, research browsing, monitoring, or API probing. This intent appears on every session in the dashboard.
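The intent categories above can be illustrated with a simple rule-based sketch. The real engine is model-driven; the feature names and thresholds here are assumptions for illustration only.

```typescript
// Illustrative sketch: mapping session features to the intent categories
// named above. Feature names and thresholds are assumptions, not the
// production model.
type Intent =
  | "content-scraping" | "search-indexing" | "task-automation"
  | "research-browsing" | "monitoring" | "api-probing";

interface SessionFeatures {
  pagesPerMinute: number;   // request rate for the session
  formInteractions: number; // count of form fields touched
  declaredCrawler: boolean; // self-identified via User-Agent
  hitsApiPaths: boolean;    // probing API endpoints rather than pages
}

function classifyIntent(f: SessionFeatures): Intent {
  if (f.hitsApiPaths) return "api-probing";
  if (f.declaredCrawler) return "search-indexing";
  if (f.formInteractions > 0) return "task-automation";
  if (f.pagesPerMinute > 30) return "content-scraping";
  if (f.pagesPerMinute < 1) return "monitoring";
  return "research-browsing";
}
```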

Pyramidal perception beacon scheduling

Implemented

Adapted from the Video-Browser paper’s three-stage pyramidal approach. When classification is ambiguous, the SDK increases beacon frequency (10–15s) to gather more behavioral data faster. When confident, it reduces to 45–60s to save bandwidth. The server dynamically controls the SDK’s sampling rate.
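The scheduling rule above can be sketched as a pure function: ambiguous sessions get the fast 10–15s range, confident ones the slow 45–60s range. The confidence threshold and jitter helper are illustrative assumptions.

```typescript
// Sketch of confidence-driven beacon scheduling. Interval bounds come
// from the ranges described above; the 0.8 threshold is an assumption.
const AMBIGUOUS_RANGE: [number, number] = [10_000, 15_000]; // ms
const CONFIDENT_RANGE: [number, number] = [45_000, 60_000]; // ms

function nextBeaconIntervalMs(
  confidence: number,                 // server-side classification confidence in [0, 1]
  rand: () => number = Math.random,   // injectable for testing
): number {
  // Ambiguous sessions beacon faster to gather behavioral data sooner;
  // confident ones slow down to save bandwidth.
  const [lo, hi] = confidence < 0.8 ? AMBIGUOUS_RANGE : CONFIDENT_RANGE;
  return lo + Math.floor(rand() * (hi - lo));
}
```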

Declarative agent policy (meta tag)

Implemented

Inspired by VOIX’s declarative framework for agent–web interaction, the SDK automatically injects a machine-readable <meta name="agent-policy"> tag that well-behaved agents can discover — declaring that the site uses Switch for agent traffic management and how agents should identify themselves.
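A minimal sketch of what the injected tag might look like. Only the `name="agent-policy"` part is stated above; the content keys and their syntax are hypothetical.

```typescript
// Sketch of building the agent-policy meta tag. The content-string
// format (key=value pairs) is an assumption, not the shipped schema.
interface AgentPolicy {
  managedBy: string;     // e.g. "switch"
  selfIdField?: string;  // hypothetical: form field agents may use to self-identify
  webmcpReady?: boolean; // hypothetical: declares WebMCP readiness
}

function buildAgentPolicyMeta(policy: AgentPolicy): string {
  const parts = [`managed-by=${policy.managedBy}`];
  if (policy.selfIdField) parts.push(`self-id-field=${policy.selfIdField}`);
  if (policy.webmcpReady) parts.push("webmcp=ready");
  return `<meta name="agent-policy" content="${parts.join("; ")}">`;
}
```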

Behavioral fingerprinting

Inspired by research showing agents exhibit distinct interaction patterns (low mouse entropy, linear movement, zero scroll jitter), Switch’s identification engine analyzes visitor behavior in real time to classify traffic as human or agent.
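One of those signals, low mouse entropy, can be sketched as Shannon entropy over binned movement directions: a programmatic cursor moves in straight lines (one bin, entropy near zero), while human paths spread across bins. Bin count is an illustrative choice.

```typescript
// Sketch of a single fingerprinting signal: Shannon entropy of mouse
// movement directions. Low entropy suggests linear, agent-like motion.
type Point = { x: number; y: number };

function directionEntropy(path: Point[], bins = 8): number {
  if (path.length < 3) return 0;
  const counts = new Array(bins).fill(0);
  for (let i = 1; i < path.length; i++) {
    // Angle of each movement step, binned into [0, bins)
    const angle = Math.atan2(path[i].y - path[i - 1].y, path[i].x - path[i - 1].x);
    const bin = Math.min(bins - 1, Math.floor(((angle + Math.PI) / (2 * Math.PI)) * bins));
    counts[bin]++;
  }
  const total = path.length - 1;
  // Shannon entropy in bits over the direction distribution
  return counts.reduce((h, c) => (c > 0 ? h - (c / total) * Math.log2(c / total) : h), 0);
}
```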

Lure-based identification

Drawing from research on agent browsing patterns, Switch uses invisible lure pages that only agents discover — enabling definitive identification and automatic pattern learning without affecting human visitors.
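A lure could plausibly be a hidden anchor that only DOM-enumerating agents discover. This is a hypothetical sketch; the path scheme and styling are assumptions, not Switch's actual mechanism.

```typescript
// Hypothetical sketch of a lure link: invisible to humans and excluded
// from assistive-technology focus order, but present in the DOM for
// agents that enumerate links. The /__lure/ path is an assumption.
function buildLureLink(token: string): string {
  return (
    `<a href="/__lure/${token}" aria-hidden="true" tabindex="-1" ` +
    `style="position:absolute;width:0;height:0;overflow:hidden">.</a>`
  );
}
```

Any request hitting the lure path can then be treated as a definitive agent identification, since no human visitor ever sees the link.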

Dynamic journey routing

Informed by research on the risks of over-scoping agent permissions, Switch’s journey builder lets site managers define granular, task-aware responses to different agent types — from challenging to throttling to serving custom content.

WebMCP protocol detection

Implemented

Switch detects Chrome’s new WebMCP structured agent protocol, distinguishing compliant agents using sanctioned tool APIs from rogue scrapers manipulating the DOM. WebMCP agents are classified with high confidence as commercial agents performing task automation, and the SDK’s agent-policy meta tag now declares WebMCP readiness.

Markdown for Agents (cooperative content serving)

Implemented

Instead of only blocking agents, Switch can serve them clean Markdown content or custom replacement pages. Journey actions "Serve Markdown" and "Replace Content" use document.write() to deliver token-efficient, structured content. The SDK injects an alternate Markdown link tag for agent discovery, and requests to llms.txt are logged as a cooperative-agent identification signal — creating a full spectrum from hostile to cooperative agent management.
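The two discovery mechanisms above can be sketched as follows; the `rel`/`type` values mirror the alternate-Markdown convention described here, and the logging detail is illustrative.

```typescript
// Sketch of the alternate-Markdown discovery tag and the llms.txt
// cooperative-agent signal described above.
function buildMarkdownAltLink(mdUrl: string): string {
  return `<link rel="alternate" type="text/markdown" href="${mdUrl}">`;
}

function isLlmsTxtRequest(path: string): boolean {
  // Requests for /llms.txt signal cooperative intent and are logged
  // as an agent identification signal.
  return path === "/llms.txt";
}
```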

Form fill intelligence & inline challenges

Implemented

AI agents can fill forms but cannot interact with popup overlays. Switch now monitors form interaction patterns (typing cadence, fill order, focus-to-type latency, corrections, hidden field fills) as powerful detection signals. Two new journey actions leverage this: "Inline Challenge" presents form-based verification agents can engage with, and "Agent Self-ID" lets compliant agents voluntarily identify themselves — turning form interactions from a blind spot into a detection and engagement channel.
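The typing-cadence signal can be sketched as variance over inter-keystroke delays: programmatic fills are near-instant and uniform, human typing is slow and irregular. The thresholds here are illustrative assumptions, not Switch's tuned values.

```typescript
// Sketch of the timing-variance signal: standard deviation of
// inter-keystroke delays, plus a crude automated-fill heuristic.
function keystrokeDelayStdDev(delaysMs: number[]): number {
  const mean = delaysMs.reduce((a, b) => a + b, 0) / delaysMs.length;
  const variance = delaysMs.reduce((a, d) => a + (d - mean) ** 2, 0) / delaysMs.length;
  return Math.sqrt(variance);
}

function looksAutomated(delaysMs: number[]): boolean {
  // Near-zero delays with near-zero variance → likely programmatic fill.
  // Thresholds (20 ms mean, 10 ms std dev) are illustrative assumptions.
  const mean = delaysMs.reduce((a, b) => a + b, 0) / delaysMs.length;
  return mean < 20 && keystrokeDelayStdDev(delaysMs) < 10;
}
```

In practice this would be one input among several (fill order, corrections, hidden-field fills), not a standalone verdict.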

Ready to see it in action?

Add Switch to your site in five minutes. Get instant visibility into agent traffic and take control of the agentic web.