Use Speed mode for your AI agent

This step-by-step guide explains how to enable Speed mode and customize your agent’s response behavior.

What is Speed mode?

Speed mode uses a lightweight and optimized version of GPT-4o, called GPT-4o mini, along with internal performance enhancements. Although its reasoning capabilities may be slightly reduced compared to larger models, it leverages Retrieval-Augmented Generation (RAG) to ensure responses are still based on accurate and relevant information from your data.
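
To illustrate the general idea behind RAG mentioned above, here is a minimal Python sketch. It is a toy example and not the product's actual implementation: the function names (`score`, `retrieve`, `build_prompt`), the sample documents, and the word-overlap scoring are all assumptions made for illustration. Production systems typically rank passages with embeddings rather than word overlap.

```python
# Toy illustration of retrieval-augmented generation (RAG).
# All names and data here are hypothetical; this is not the product's code.

def score(query: str, passage: str) -> int:
    """Count how many distinct query words also appear in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words)

def retrieve(query: str, passages: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k passages with the highest word overlap."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Place the retrieved passages ahead of the question for the model."""
    context = "\n".join(retrieve(query, passages))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical sample data standing in for "your data".
DOCS = [
    "Refunds are processed within 5 business days.",
    "Our support team is available on weekdays.",
]

prompt = build_prompt("how long do refunds take", DOCS)
print(prompt)
```

Because the model only sees the passages that were retrieved, even a smaller model can ground its answer in your content instead of relying on its own recall.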

This mode is ideal for use cases where speed matters—such as live chats, high-volume queries, or time-sensitive support.

How to enable Speed mode:

  1. Click Personalize.
  2. In Personalize, click the AI Intelligence tab.
Screenshot: Intelligence tab in Personalize section
  3. Select Speed to enable the mode.
Screenshot: Speed mode
  4. Under Select your AI model, choose the model for your agent:
  • GPT-4.1 mini
  • GPT-4o mini
  • Claude 4.5 Haiku
  • Gemini 2.5 Flash
  5. Click Save Settings at the bottom of the page to apply your changes.
Screenshot: Intelligence tab Save Settings button

Note for Enterprise users: Gemini 2.5 Flash is available for Speed mode. While faster than full-size models, it is slower than the other lightweight options (GPT-4o mini, GPT-4.1 mini, or Claude 4.5 Haiku) and has limited use cases.