How to Block ICC-Crawler

Complete guide to blocking ICC-Crawler (NICT) from crawling your website using robots.txt, server configuration, and Switch workflows.

Operated by NICT · Commercial Crawlers

Should You Block ICC-Crawler?

ICC-Crawler collects data for AI model training. Blocking it prevents your content from being used in NICT's AI products, and it does not affect your search visibility.

This is a common and recommended action for sites that want to control how their content is used in AI training.

Blocking Methods

1. robots.txt

Effectiveness: High (for cooperative crawlers)

Add a Disallow rule for ICC-Crawler's user-agent string in your robots.txt file. This is the standard, cooperative method that well-behaved crawlers respect.

2. Server-side UA filtering

Effectiveness: High

Configure your web server (nginx, Apache, Cloudflare) to reject requests matching ICC-Crawler's user-agent patterns. This blocks requests at the server level, before your application processes them; a minimal nginx sketch follows this list.

3. Switch Journey Workflows

Effectiveness: Highest (granular, real-time control)

Create a custom journey in Switch that detects ICC-Crawler and routes it to a block action, challenge, redirect, or modified content — without touching your server configuration.
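
A minimal nginx sketch of the server-side approach (method 2), assuming requests can be matched on the user-agent pattern listed under "ICC-Crawler User-Agent Strings" below; place it inside the relevant server block. Apache and Cloudflare offer equivalent user-agent rules.

# Reject any request whose User-Agent header contains "ICC-Crawler" (case-insensitive)
if ($http_user_agent ~* "ICC-Crawler") {
    return 403;
}

Returning 403 keeps the block visible in your access logs; some operators prefer nginx's non-standard 444 status, which closes the connection without sending a response.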

robots.txt — Block ICC-Crawler

Add the following to your robots.txt file (at the root of your domain) to block ICC-Crawler:

User-agent: ICC-Crawler
Disallow: /

robots.txt — Allow with Restrictions

Alternatively, allow ICC-Crawler on most pages while blocking specific directories:

User-agent: ICC-Crawler
Disallow: /private/
Allow: /

ICC-Crawler User-Agent Strings

Use these patterns to identify ICC-Crawler in your server logs or firewall rules:

ICC-Crawler

Frequently Asked Questions

Does blocking ICC-Crawler affect my Google search rankings?

No. Blocking ICC-Crawler has no effect on your Google search rankings; Google Search visibility is only affected if you block Google's own crawlers, such as Googlebot.

Does ICC-Crawler respect robots.txt?

Yes, ICC-Crawler respects robots.txt directives. Adding a Disallow rule for its user-agent will prevent it from crawling blocked paths.

Can I allow ICC-Crawler on some pages but not others?

Yes. Use robots.txt to disallow specific directories, or use Switch journey workflows for granular page-level control with conditional logic.

Go beyond robots.txt

Switch detects ICC-Crawler in real time and lets you build custom journey workflows: block, challenge, redirect, or serve modified content. No server changes required.

Get Started Free