How does CustomGPT.ai build an AI agent from my website?

This guide explains how to build an AI agent by crawling and indexing content directly from your website using CustomGPT.ai.

Create a new agent

  1. On your dashboard, click New Agent.
  2. Click Website.
  3. Enter a URL or a sitemap in the provided field.
  4. Click Create Agent to begin indexing.

Sitemap search

CustomGPT.ai first looks for a sitemap for the domain you provided. If it finds one, it crawls the pages listed in the sitemap and indexes their content for your AI agent.
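
To make "pages listed in the sitemap" concrete, here is a minimal sketch of how page URLs can be read from a standard sitemaps.org XML file. It is not CustomGPT.ai's implementation. The function name `urls_from_sitemap` and the example.com URLs are placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return the page URLs found in the <loc> elements of a sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Placeholder sitemap content (illustrative only).
example_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

print(urls_from_sitemap(example_sitemap))
```

If a page is missing from your agent, checking whether its URL appears in your sitemap's `<loc>` entries is a reasonable first step.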


If no sitemap is found

If a sitemap is not found, CustomGPT.ai will:

  • Attempt to find all links on your website's main page and crawl them.
  • Find links on the new pages discovered and crawl those as well.
  • Continue this process until it reaches your per-agent page limit or no new links are found.

🚧 Note: If no links are found, CustomGPT.ai will crawl and index just the single page you provided.
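
The fallback described above behaves like a breadth-first crawl that stops at a page limit. The sketch below illustrates that idea on a toy, in-memory link graph. It is an assumption about the general technique, not CustomGPT.ai's actual crawler. `crawl`, `get_links`, and `page_limit` are hypothetical names, and domain filtering is left out to keep the example short.

```python
from collections import deque


def crawl(start_url, get_links, page_limit):
    """Breadth-first crawl from start_url, visiting at most page_limit pages.

    get_links is a hypothetical callable that returns the links on a page.
    """
    seen = {start_url}
    queue = deque([start_url])
    crawled = []
    while queue and len(crawled) < page_limit:
        page = queue.popleft()
        crawled.append(page)
        for link in get_links(page):
            if link not in seen:  # skip pages already discovered
                seen.add(link)
                queue.append(link)
    return crawled


# Toy link graph standing in for a real website (illustrative data only).
site = {
    "https://example.com": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/c", "https://example.com"],
    "https://example.com/b": [],
    "https://example.com/c": [],
}

print(crawl("https://example.com", lambda url: site.get(url, []), page_limit=3))
```

When the starting page has no links, the loop visits only that page, which matches the single-page case in the note above.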


Same-domain crawling only

If no sitemap is found, CustomGPT.ai crawls only links on the same domain. For example, if your domain is https://customgpt.ai, it won’t crawl links like https://google.com or https://docs.customgpt.ai.
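
The same-domain rule can be pictured as a hostname comparison. The sketch below assumes an exact hostname match, which is inferred from the example above (a subdomain such as docs.customgpt.ai is treated as a different domain). The function name `is_same_domain` and the `/blog` path are hypothetical.

```python
from urllib.parse import urlparse


def is_same_domain(link: str, site_url: str) -> bool:
    """True only when both URLs have exactly the same hostname."""
    return urlparse(link).hostname == urlparse(site_url).hostname


site_url = "https://customgpt.ai"
links = [
    "https://customgpt.ai/blog",   # hypothetical page on the same domain
    "https://google.com",          # different domain
    "https://docs.customgpt.ai",   # subdomain, treated as a different domain
]

for link in links:
    print(link, is_same_domain(link, site_url))
```

If content you need lives on a subdomain, this behavior suggests adding that subdomain as its own source rather than expecting the crawl to follow the link.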



Why is CustomGPT.ai not finding all pages of my website?

CustomGPT.ai may not find all pages of your website, especially if:

  • The website does not have a sitemap, making it harder to locate internal pages.
  • Pages load slowly, causing timeouts during indexing.
  • The site relies heavily on JavaScript, and key content is loaded dynamically.

These limitations can result in only partial indexing; sometimes only the homepage is captured.
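
One way to check whether your site depends on JavaScript for navigation is to look at which links appear in the raw HTML before any script runs. The sketch below uses Python's standard `html.parser` to do that. It illustrates the general problem only and makes no claim about how CustomGPT.ai renders pages. `LinkCollector`, `static_links`, and both HTML snippets are made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def static_links(html: str) -> list[str]:
    """Return links visible in the HTML without executing any JavaScript."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Illustrative snippets: links present in the markup vs. injected by a script.
server_rendered = '<nav><a href="/about">About</a><a href="/blog">Blog</a></nav>'
js_rendered = '<div id="root"></div><script src="/app.js"></script>'

print(static_links(server_rendered))
print(static_links(js_rendered))
```

A page like the second snippet exposes no links until its script runs, which is one reason link discovery can stop at the homepage.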


How to fix it

You can improve indexing coverage by adjusting how your website is indexed. Learn how to:

  • Disable JavaScript during website indexing
  • Extend allowed time per page
  • Enable slow mode when adding a new source to an existing agent