How to Block Puppeteer

Complete guide to blocking Puppeteer (Google's browser-automation library) from accessing your website using robots.txt, server configuration, and Switch workflows.

Operated by Google · Browser Agents

Should You Block Puppeteer?

Puppeteer controls a real browser and interacts with your site like a human. By default, headless Chrome identifies itself with a "HeadlessChrome" user-agent token, but scripts can trivially override it, and Puppeteer does not fetch robots.txt on its own; user-agent and robots.txt rules therefore only stop unmodified or cooperative scripts.

Use behavioral detection through Switch to identify and manage browser agent traffic.

Blocking Methods

1. robots.txt

Low — only stops cooperative scripts

Add a Disallow rule for the HeadlessChrome user-agent token to your robots.txt file. This is the standard, cooperative method, but Puppeteer does not fetch robots.txt by default, so it deters only scripts whose operators choose to honor it.

2. Server-side UA filtering

Medium — defeated by user-agent spoofing

Configure your web server (nginx, Apache, Cloudflare) to reject requests whose User-Agent contains Puppeteer's default "HeadlessChrome" token. This blocks at the network level before your application processes the request, but only catches scripts that have not overridden the default user-agent.

3. Behavioral detection

Medium — requires specialized tooling

Puppeteer uses a real browser and can present any user-agent string it likes. Reliable detection requires analyzing automation flags (such as navigator.webdriver), interaction patterns, and JavaScript environment signals.
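For illustration, a few of those environment signals can be gathered client-side. The function below is a sketch under stated assumptions: it takes a navigator-like object (pass `window.navigator` in the browser) so it can be tested outside one, and each heuristic is weak on its own, so they should feed a score rather than trigger a block individually.

```javascript
// Sketch: collect common automation signals from a navigator-like object.
// Every signal here can be spoofed and can false-positive on real users,
// so combine them into a score instead of blocking on any single one.
function automationSignals(nav) {
  return {
    // true in CDP-driven browsers (Puppeteer, Playwright) unless patched
    webdriver: nav.webdriver === true,
    // headless profiles often expose no plugins or languages
    noPlugins: !nav.plugins || nav.plugins.length === 0,
    noLanguages: !nav.languages || nav.languages.length === 0,
  };
}

function automationScore(nav) {
  const signals = automationSignals(nav);
  return Object.values(signals).filter(Boolean).length;
}
// In the page: automationScore(window.navigator) — 0 suggests a normal
// browser, higher values suggest automation.
```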

4. Switch Content Gate

High for automated browsers

Switch's Content Gate uses document.write() to prevent headless browsers and automation frameworks from accessing your page content. Effective against Puppeteer, Playwright, and Selenium-based agents.

5. Switch Journey Workflows

Highest — granular, real-time control

Create a custom journey in Switch that detects Puppeteer and routes it to a block action, challenge, redirect, or modified content — without touching your server configuration.

robots.txt — Block Puppeteer

Add the following to your robots.txt file (at the root of your domain). Because Puppeteer does not fetch robots.txt by default, these rules deter only scripts written to honor them:

User-agent: HeadlessChrome
Disallow: /

User-agent: headlesschrome
Disallow: /

robots.txt — Allow with Restrictions

Alternatively, allow Puppeteer on most pages while blocking specific directories:

User-agent: HeadlessChrome
Disallow: /private/
Allow: /

User-agent: headlesschrome
Disallow: /private/
Allow: /

Puppeteer User-Agent Strings

Use these patterns to identify Puppeteer in your server logs or firewall rules:

HeadlessChrome
headlesschrome

Frequently Asked Questions

Does blocking Puppeteer affect my Google search rankings?

No. Blocking Puppeteer does not affect your Google search rankings. Only blocking Googlebot impacts Google Search visibility.

Does Puppeteer respect robots.txt?

Not by default. Puppeteer is a browser-automation library, not a crawler: it does not fetch or honor robots.txt unless the script driving it is written to do so. robots.txt rules are still worth publishing as a signal to cooperative operators, but enforcement requires server-side filtering or behavioral detection.

Can I allow Puppeteer on some pages but not others?

Yes. Use robots.txt to disallow specific directories, or use Switch journey workflows for granular page-level control with conditional logic.

Go beyond robots.txt

Switch detects Puppeteer in real-time and lets you build custom journey workflows — block, challenge, redirect, or serve modified content. No server changes required.

Get Started Free