Allow CustomGPT to crawl a restricted site

If your site blocks external bots, CustomGPT's crawler cannot reach your sitemap. Resolving this requires whitelisting on both sides: your security settings and CustomGPT's crawler configuration.


Before you start

  • You have already tried adding your website as a knowledge source and the sitemap is not being indexed.
  • Your site uses a firewall, WAF, bot-blocking tool, or robots.txt rules that restrict external crawlers.

Steps

1. Contact CustomGPT support

Reach out to support and let them know you need to set up crawler access for a restricted site. Include:

  • Your domain (e.g., example.org)
  • A brief note that your site blocks external crawlers

2. Receive the crawler details

Support will share CustomGPT's crawler user-agent string and IP range with you.

3. Whitelist the crawler on your side

Add an exception in your firewall, WAF, or bot-blocking tool for CustomGPT's crawler using the user-agent and IP range provided. If your site uses robots.txt to restrict crawlers, add an exception there as well.

The exact steps depend on your security tooling. If you are unsure how to add the exception, share the crawler details with your IT or infrastructure team.
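If robots.txt is what blocks the crawler, the exception is a small edit there. The sketch below assumes a hypothetical user-agent token, `CustomGPT-Crawler` — substitute the exact string support gives you:

```
# Placeholder token — replace with the user-agent string from CustomGPT support
User-agent: CustomGPT-Crawler
Allow: /

# Example of an existing rule that blocks all other crawlers
User-agent: *
Disallow: /
```

Firewall and WAF exceptions work the same way in principle: match on the provided user-agent string or source IP range and allow the request through, using whatever rule syntax your tooling provides.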

4. Confirm with support

Once your side is configured, let support know. They will whitelist your domain on CustomGPT's end.

5. Add your sitemap

Once both sides are configured, go to your agent's Create page and add your website URL or sitemap.
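Before re-adding the sitemap, you can sanity-check your exception from a terminal. This is a template, not a verified command for your setup: `example.org` and `CustomGPT-Crawler` are placeholders for your domain and the user-agent string support provides.

```
# Request the sitemap headers while presenting the crawler's user-agent string
curl -I -A "CustomGPT-Crawler" https://example.org/sitemap.xml

# A 200 response suggests the crawler is no longer blocked;
# a 403 or 429 suggests the exception has not taken effect yet.
```

Note that this only tests the user-agent match. If your firewall filters by IP range, requests from your own machine may still be treated differently than requests from CustomGPT's crawler IPs.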

