Handle long-running operations and optimize tool performance for real-time conversations.
Consider a tool like queryCorpus, where we’d like to look up information while the agent is speaking and simply ignore the response if the agent is interrupted. Tools like this can be marked precomputable.
A tool marked precomputable will be speculatively invoked as soon as the model produces the tool call. When the model produces both an agent utterance and the tool call, the tool’s latency is masked by the agent speaking, but if the agent is interrupted there will be no record of the invocation.
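The behavior described above can be sketched with asyncio. This is only an illustration of the idea, not the platform's implementation: `query_corpus`, `speak`, `run_turn`, and `demo` are hypothetical names, and the sleeps stand in for real lookup and speech durations.

```python
import asyncio
import contextlib


async def query_corpus(query: str) -> str:
    """Hypothetical read-only lookup standing in for a precomputable tool."""
    await asyncio.sleep(0.05)  # simulated lookup latency
    return f"results for {query!r}"


async def speak(interrupted: asyncio.Event) -> bool:
    """Simulate the agent speaking; return True if it finished uninterrupted."""
    try:
        await asyncio.wait_for(interrupted.wait(), timeout=0.1)
        return False  # the interruption arrived before speech ended
    except asyncio.TimeoutError:
        return True


async def run_turn(query: str, interrupted: asyncio.Event, history: list):
    # Start the tool immediately so its latency overlaps with the speech.
    tool_task = asyncio.create_task(query_corpus(query))
    finished = await speak(interrupted)
    if not finished:
        # Interrupted: discard the speculative call and record nothing.
        tool_task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await tool_task
        return None
    result = await tool_task
    history.append(("tool_result", result))
    return result


async def demo(interrupt: bool):
    history = []
    interrupted = asyncio.Event()
    if interrupt:
        interrupted.set()
    result = await run_turn("refund policy", interrupted, history)
    return result, history
```

Because the interrupted call leaves no trace in `history`, this pattern is only appropriate for tools whose invocation is safe to throw away.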
A tool is marked precomputable using the corresponding field. For http tools, GET requests are usually safe while methods like POST are not. To be safely marked precomputable, a tool must be:
✅ Read-only: No state changes (GET requests are usually safe, POST requests are not).
✅ No Side Effects: No logging critical events, sending notifications, etc.
✅ Idempotent: Same input always produces same output, regardless of when or how many times it’s called.
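As a rough aid, the read-only criterion for http tools could be pre-screened with a small check like the one below. The dictionary shape (`http`, `httpMethod`) and the function name are assumptions made for this sketch, not a documented schema, and passing the check does not prove a tool is free of side effects.

```python
# HTTP methods defined as "safe" (read-only by intent) in the HTTP specification.
SAFE_HTTP_METHODS = {"GET", "HEAD"}


def may_be_precomputable(tool: dict) -> bool:
    """Heuristic pre-check: True only for http tools that use a safe method.

    The "http" and "httpMethod" keys are hypothetical field names.
    """
    http = tool.get("http")
    if not isinstance(http, dict):
        return False  # not an http tool; this heuristic cannot judge it
    method = str(http.get("httpMethod", "")).upper()
    return method in SAFE_HTTP_METHODS
```

A GET endpoint can still log critical events or trigger notifications, so the other two properties still need a manual review.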
Durations are written as a number of seconds followed by s, for example 5s for 5 seconds or 0.1s for 100 milliseconds.
Example:
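A minimal sketch of how duration strings in this format could be parsed, assuming only the seconds-with-`s`-suffix form shown above. The function name `parse_duration_seconds` is hypothetical.

```python
import re

# Accepts a non-negative decimal number followed by "s", e.g. "5s" or "0.1s".
_DURATION = re.compile(r"(\d+(?:\.\d+)?)s")


def parse_duration_seconds(text: str) -> float:
    """Convert a duration string such as "5s" into a number of seconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1))
```

For example, `parse_duration_seconds("5s")` returns `5.0`, and a value without the suffix, such as `"5"`, raises `ValueError`.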
This fits well with a dataConnection implementation, since data connections can also send input text messages (and the response is always deferred in that case). Keep in mind that the model will see whatever response you send back initially, so make it clear to the model what’s going on by initially responding with some text like “Tool started. The full response will be available soon.”
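The deferred pattern can be sketched as follows. Everything here is hypothetical scaffolding: `send_input_text` stands in for whatever mechanism delivers an input text message over the data connection, and `slow_lookup` stands in for the long-running work.

```python
import asyncio
from typing import Awaitable, Callable

INITIAL_RESPONSE = "Tool started. The full response will be available soon."


async def slow_lookup(query: str) -> str:
    """Hypothetical long-running operation."""
    await asyncio.sleep(0.05)
    return "report ready"


async def handle_tool_call(
    query: str,
    send_input_text: Callable[[str], Awaitable[None]],
    background: set,
) -> str:
    """Reply immediately, then deliver the real result later as input text."""

    async def finish() -> None:
        result = await slow_lookup(query)
        await send_input_text(f"Tool result for {query!r}: {result}")

    task = asyncio.create_task(finish())
    background.add(task)  # keep a reference so the task is not garbage collected
    task.add_done_callback(background.discard)
    return INITIAL_RESPONSE  # this is what the model sees first


async def demo():
    sent = []

    async def send_input_text(text: str) -> None:
        sent.append(text)  # a real implementation would write to the connection

    background = set()
    initial = await handle_tool_call("q3 report", send_input_text, background)
    await asyncio.gather(*background)  # wait for the deferred result
    return initial, sent
```

The initial reply tells the model that a result is pending, and the later message carries the actual content.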
Custom Timeout Considerations