Configuring Retries
Set the retries parameter on any @function() to automatically retry on failure. This is especially useful for LLM calls that return structured output. If the LLM returns malformed data, Pydantic validation fails and Tensorlake retries the entire call:
- Rate limit errors, timeouts, or exceptions trigger automatic retries
- Validation failures (e.g., Pydantic ValidationError) also trigger retries
- Tensorlake retries up to 3 times with exponential backoff
- Any nested function calls that already succeeded are served from checkpoints, not re-executed
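For example, a function that asks an LLM for structured output and validates it with Pydantic might look like the sketch below. The tensorlake import path, the model name, and the assumption that retries accepts an integer count are illustrative; check the SDK reference for the exact signature.

```python
from openai import OpenAI
from pydantic import BaseModel

# Import path is an assumption; adjust to match your Tensorlake SDK version.
from tensorlake import function


class Invoice(BaseModel):
    vendor: str
    total: float


# retries=3 (integer count assumed): if the OpenAI call raises, times out,
# or the response fails Pydantic validation, Tensorlake re-runs the whole
# function with exponential backoff.
@function(retries=3)
def extract_invoice(text: str) -> Invoice:
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "user",
                "content": f"Return JSON with 'vendor' and 'total' for this invoice:\n{text}",
            }
        ],
        response_format={"type": "json_object"},
    )
    # Raises pydantic.ValidationError on malformed output, which counts as a
    # failure and triggers a Tensorlake retry.
    return Invoice.model_validate_json(response.choices[0].message.content)
```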
Disable client-level retries (e.g., OpenAI's max_retries=0) when using Tensorlake retries. Layering both creates unpredictable behavior and inflated retry counts.
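With the official OpenAI Python client, for instance, built-in retries can be turned off when constructing the client or overridden per request; the model name below is illustrative:

```python
from openai import OpenAI

# Disable the OpenAI client's built-in retries so only Tensorlake's
# retry policy applies.
client = OpenAI(max_retries=0)

# Or override it for a single request:
response = client.with_options(max_retries=0).chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
)
```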
Rate Limiting External APIs

When calling external APIs with rate limits, you can control the total number of concurrent calls using the formula: Total concurrent calls = max_containers × concurrency
This allows you to respect API rate limits by capping the maximum number of parallel requests your function can make:
- Respect API quotas: if an API allows 100 requests/second, set max_containers=50 and concurrency=2 (see the sketch after this list)
- Control costs: limit concurrent LLM calls to manage token spend
- Prevent overload: cap requests to internal services that can't handle high concurrency
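As a sketch, the quota example above (50 × 2 = 100 concurrent calls) could be expressed as below. Whether max_containers and concurrency are passed directly to @function() or configured elsewhere depends on your SDK version, so treat the import path and parameter placement as assumptions; the endpoint is hypothetical.

```python
import httpx

# Import path and parameter placement are assumptions; consult the SDK docs.
from tensorlake import function


# Total concurrent calls = max_containers * concurrency = 50 * 2 = 100,
# which stays within a 100 requests/second API quota.
@function(max_containers=50, concurrency=2)
def call_rate_limited_api(payload: dict) -> dict:
    # Hypothetical endpoint used for illustration only.
    response = httpx.post("https://api.example.com/v1/score", json=payload, timeout=30)
    response.raise_for_status()
    return response.json()
```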
max_containers and request queuing.