How to Block ICC-Crawler
Complete guide to blocking ICC-Crawler (NICT) from crawling your website using robots.txt, server configuration, and Switch workflows.
Should You Block ICC-Crawler?
ICC-Crawler collects data for AI model training. Blocking it prevents your content from being used in NICT's AI products without affecting your search visibility.
This is a common and recommended action for sites that want to control how their content is used in AI training.
Blocking Methods
1. robots.txt
Effectiveness: High for cooperative crawlers. Add a Disallow rule for ICC-Crawler's user-agent string in your robots.txt file. This is the standard, cooperative method that well-behaved crawlers respect.
2. Server-side UA filtering
Effectiveness: High. Configure your web server (nginx, Apache, Cloudflare) to reject requests matching ICC-Crawler's user-agent patterns. This blocks the request at the server edge, before your application processes it; see the nginx sketch after this list.
3. Switch Journey Workflows
Effectiveness: Highest (granular, real-time control). Create a custom journey in Switch that detects ICC-Crawler and routes it to a block action, challenge, redirect, or modified content, all without touching your server configuration.
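As a sketch of method 2, assuming nginx and a case-insensitive substring match on the ICC-Crawler token used throughout this guide (Apache and Cloudflare offer equivalent user-agent rules):

# Inside the relevant server { } block of your nginx configuration.
# Rejects any request whose User-Agent header contains "ICC-Crawler"
# (~* makes the match case-insensitive) with 403 Forbidden,
# before the request reaches your application.
if ($http_user_agent ~* "ICC-Crawler") {
    return 403;
}

The 403 status is an illustrative choice; returning 444 (nginx's drop-the-connection status) works the same way if you prefer not to send a response at all.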
robots.txt — Block ICC-Crawler
Add the following to your robots.txt file (at the root of your domain) to block ICC-Crawler:
User-agent: ICC-Crawler
Disallow: /
robots.txt — Allow with Restrictions
Alternatively, allow ICC-Crawler on most pages while blocking specific directories:
User-agent: ICC-Crawler
Disallow: /private/
Allow: /
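Under the Robots Exclusion Protocol (RFC 9309), the most specific (longest) matching rule wins, so /private/ and everything beneath it stays disallowed even though Allow: / matches every path.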
ICC-Crawler User-Agent Strings
Use these patterns to identify ICC-Crawler in your server logs or firewall rules:
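The stable identifier is the ICC-Crawler token itself, the same token used in the robots.txt rules above. Full user-agent strings reported by the crawler have historically also included a version suffix (for example ICC-Crawler/2.0) and a NICT reference URL, both of which may change between releases, so the safest pattern is a case-insensitive substring match on ICC-Crawler rather than an exact-string comparison.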
Frequently Asked Questions
Does blocking ICC-Crawler affect my Google search rankings?
No. Blocking ICC-Crawler does not affect your Google search rankings. Only blocking Googlebot impacts Google Search visibility.
Does ICC-Crawler respect robots.txt?
Yes, ICC-Crawler respects robots.txt directives. Adding a Disallow rule for its user-agent will prevent it from crawling blocked paths.
Can I allow ICC-Crawler on some pages but not others?
Yes. Use robots.txt to disallow specific directories, or use Switch journey workflows for granular page-level control with conditional logic.
Go beyond robots.txt
Switch detects ICC-Crawler in real time and lets you build custom journey workflows: block, challenge, redirect, or serve modified content. No server changes required.
Get Started Free