How to Block Puppeteer
Complete guide to blocking Puppeteer (Google's browser automation library) from crawling your website using robots.txt, server configuration, and Switch workflows.
Should You Block Puppeteer?
Puppeteer controls a real browser and interacts with your site like a human. It cannot be reliably blocked via robots.txt because it does not fetch or honor that file, and although headless Chrome's default user-agent contains an identifiable HeadlessChrome token, scripts can trivially override it.
Use behavioral detection through Switch to identify and manage browser agent traffic.
Blocking Methods
1. robots.txt
Effectiveness: High for cooperative crawlers. Add a Disallow rule for Puppeteer's user-agent string in your robots.txt file. This is the standard, cooperative method that well-behaved crawlers respect.
2. Server-side UA filtering
Effectiveness: High. Configure your web server (nginx, Apache, Cloudflare) to reject requests matching Puppeteer's user-agent patterns. This rejects requests before your application processes them.
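As a sketch of the server-side approach, here is an nginx rule that rejects requests whose user-agent contains the default headless-Chrome token. The 403 status is a choice, not a requirement, and since Puppeteer scripts can override the user-agent, this only catches unmodified automation:

```nginx
# Inside a server { } block: reject the default headless-Chrome UA token.
# Case-insensitive match (~*) covers "HeadlessChrome" and "headlesschrome".
if ($http_user_agent ~* "headlesschrome") {
    return 403;
}
```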
3. Behavioral detection
Effectiveness: Medium; requires specialized tooling. Puppeteer drives a real browser, and its user-agent can be changed to look like an ordinary visitor, so detection requires analyzing automation flags (such as navigator.webdriver), interaction patterns, and JavaScript environment signals.
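A minimal sketch of the kind of signals such detection inspects, run as in-page JavaScript. The signal set, weights, and threshold here are illustrative assumptions, not Switch's actual logic, and real detection layers in many more signals (timing, input events, canvas fingerprints):

```javascript
// Illustrative only: score a navigator-like object for automation signals.
function automationScore(nav) {
  let score = 0;
  if (nav.webdriver) score += 2;                            // set to true by automated browsers
  if (/HeadlessChrome/.test(nav.userAgent || "")) score += 2; // default headless UA token
  if (!nav.languages || nav.languages.length === 0) score += 1; // headless often reports none
  if (nav.plugins && nav.plugins.length === 0) score += 1;  // empty plugin list is a weak hint
  return score;
}

function looksAutomated(nav) {
  return automationScore(nav) >= 2; // threshold is an arbitrary choice
}
```

In a browser you would call `looksAutomated(navigator)`; any single signal can be spoofed, which is why scores combine several.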
4. Switch Content Gate
Effectiveness: High for automated browsers. Switch's Content Gate uses document.write() to prevent headless browsers and automation frameworks from accessing your page content. Effective against Puppeteer, Playwright, and Selenium-based agents.
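Switch's implementation is not public; as a generic sketch of the document.write() gating idea, the page can ship without its real content and let a script write it only for browsers that do not look automated. The function name and checks below are hypothetical:

```javascript
// Hypothetical sketch of a document.write() content gate: write the real
// markup only when the browser does not look automated.
function gateContent(doc, nav, realHtml) {
  const automated = nav.webdriver || /HeadlessChrome/.test(nav.userAgent || "");
  if (automated) {
    doc.write("<p>Content unavailable.</p>"); // automated browsers get a stub
  } else {
    doc.write(realHtml);                      // everyone else gets the page
  }
}
```

In a page this would run as `gateContent(document, navigator, "<article>Hello</article>")` before the parser reaches the gated region.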
5. Switch Journey Workflows
Effectiveness: Highest; granular, real-time control. Create a custom journey in Switch that detects Puppeteer and routes it to a block action, challenge, redirect, or modified content, without touching your server configuration.
robots.txt — Block Puppeteer
Add the following to your robots.txt file (at the root of your domain) to block Puppeteer:
User-agent: HeadlessChrome
Disallow: /

User-agent: headlesschrome
Disallow: /
robots.txt — Allow with Restrictions
Alternatively, allow Puppeteer on most pages while blocking specific directories:
User-agent: HeadlessChrome
Disallow: /private/
Allow: /

User-agent: headlesschrome
Disallow: /private/
Allow: /
Puppeteer User-Agent Strings
Use these patterns to identify Puppeteer in your server logs or firewall rules:
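By default, the headless Chrome build that Puppeteer launches replaces the usual Chrome token with HeadlessChrome; the platform and version below are examples only:

```text
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36
```

Match on the HeadlessChrome substring rather than the full string. Note that a script can replace this value with Puppeteer's page.setUserAgent(), so its absence does not prove the traffic is human.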
Frequently Asked Questions
Does blocking Puppeteer affect my Google search rankings?
No. Blocking Puppeteer does not affect your Google search rankings. Only blocking Googlebot impacts Google Search visibility.
Does Puppeteer respect robots.txt?
No, not by itself. Puppeteer is a browser automation library, not a crawler: it does not fetch or honor robots.txt unless the script driving it is written to do so. A Disallow rule only deters cooperative crawlers that happen to be built on Puppeteer.
Can I allow Puppeteer on some pages but not others?
Yes. Use robots.txt to disallow specific directories (effective only against cooperative crawlers), or use Switch journey workflows for granular page-level control with conditional logic.
Go beyond robots.txt
Switch detects Puppeteer in real-time and lets you build custom journey workflows — block, challenge, redirect, or serve modified content. No server changes required.
Get Started Free