# Retries & Rate Limits

> Handle LLM rate limits, transient failures, and structured output validation with durable retries

LLM providers return rate-limit errors, APIs time out, and web scrapes hit transient failures. Tensorlake handles retries at the platform level, and each retry is durable: nested function calls that already succeeded are served from checkpoints instead of re-executing. See [Durable Execution](/applications/durability) for how the checkpoint mechanism works and [Crash Recovery](/applications/crash-recovery) for the agent-loop walkthrough.

## Configuring Retries

Set the `retries` parameter on any `@function()` to retry it automatically on failure. This is especially useful for LLM calls that return structured output: if the LLM returns malformed data, Pydantic validation raises an exception and Tensorlake retries the entire call:

```python
from pydantic import BaseModel
from tensorlake.applications import function

class ResearchFindings(BaseModel):
    summary: str
    sources: list[str]
    confidence: float

@function(retries=3)
def extract_findings(text: str) -> ResearchFindings:
    from openai import OpenAI
    # Disable client-level retries to avoid unpredictable behavior
    response = OpenAI(max_retries=0).chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Extract research findings as JSON."},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
    )
    # If validation fails, Tensorlake retries the entire function
    return ResearchFindings.model_validate_json(response.choices[0].message.content)
```

**How retries work:**

* Any exception raised by the function, including rate-limit errors and timeouts, triggers an automatic retry
* Validation failures (e.g., Pydantic `ValidationError`) also trigger retries
* With `retries=3`, Tensorlake retries up to 3 times with exponential backoff
* Any nested function calls that already succeeded are served from checkpoints, not re-executed (see [Durable Execution](/applications/durability))

If retries are exhausted, the request fails and can be re-run later via the [Replay API](/applications/durability#request-replay-api). For broader exception-handling patterns (try/except, futures, fallbacks), see [Error Handling](/applications/error-handling). For controlling per-function deadlines, see [Timeouts](/applications/timeouts).
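
The retry behavior described above can be simulated in plain Python. This is only an illustrative sketch, not Tensorlake's implementation: `run_with_retries`, `checkpointed`, the in-memory `checkpoints` dict, the 1-second base delay, and the doubling factor are all hypothetical, and it assumes `retries=N` means up to N additional attempts after the first one.

```python
import time

def run_with_retries(fn, retries, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on any exception, retry up to `retries` more times.

    base_delay and the doubling factor are illustrative assumptions,
    not Tensorlake's actual backoff schedule.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: the failure propagates
            sleep(base_delay * 2 ** attempt)  # waits 1x, 2x, 4x, ... the base delay

checkpoints = {}  # stands in for durable checkpoint storage
calls = {"fetch": 0, "parse": 0}

def checkpointed(name, step):
    """Run step() once; later attempts reuse the stored result."""
    if name not in checkpoints:
        checkpoints[name] = step()
    return checkpoints[name]

def fetch():
    calls["fetch"] += 1
    return "raw text"

def parse(text):
    calls["parse"] += 1
    if calls["parse"] < 3:
        raise ValueError("malformed output")  # e.g., a validation failure
    return text.upper()

def pipeline():
    text = checkpointed("fetch", fetch)  # executes only on the first attempt
    return parse(text)                   # fails twice, then succeeds

delays = []  # record the backoff delays instead of actually sleeping
result = run_with_retries(pipeline, retries=3, sleep=delays.append)
```

In this run, `fetch` executes once even though `pipeline` is attempted three times, which mirrors how a successful nested call is served from a checkpoint on retry.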

<Note>
  Disable client-level retries (e.g., OpenAI's `max_retries=0`) when using Tensorlake retries. Layering both creates unpredictable behavior and inflated retry counts.
</Note>
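
To see why layered retries inflate attempt counts, consider the worst case where every attempt fails. The sketch below is simple arithmetic, not an API: `worst_case_attempts` is a hypothetical helper, it assumes each layer makes one initial attempt plus its configured retries, and the client value of 2 reflects the OpenAI Python client's default `max_retries` at the time of writing.

```python
def worst_case_attempts(platform_retries: int, client_retries: int) -> int:
    """Upper bound on provider requests for one logical call when both layers retry.

    Each platform-level attempt can trigger a full round of client-level attempts.
    """
    return (platform_retries + 1) * (client_retries + 1)

# Tensorlake retries=3 layered over a client that retries twice on its own
layered = worst_case_attempts(platform_retries=3, client_retries=2)

# Tensorlake retries=3 with client retries disabled (max_retries=0)
platform_only = worst_case_attempts(platform_retries=3, client_retries=0)
```

Under these assumptions, layering allows up to 12 provider requests for a single call, compared with 4 when only the platform retries.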

## Rate Limiting External APIs

When calling rate-limited external APIs, you can cap the number of calls in flight at any moment. The upper bound is:

**Maximum concurrent calls = `max_containers` × `concurrency`**

Capping parallel requests this way keeps your function within the API's limits:

```python
from tensorlake.applications import function

@function(
    retries=3,
    max_containers=5,  # Maximum 5 containers
    concurrency=2      # Each container handles 2 concurrent requests
)
def call_rate_limited_api(query: str) -> dict:
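    # Total concurrent calls: 5 × 2 = 10 requests max
    import requests
    response = requests.get(
        "https://api.example.com/search",
        params={"q": query},  # URL-encodes the query
        timeout=30,           # avoid hanging on a stalled connection
    )
    # Raise on HTTP errors (e.g., 429) so Tensorlake retries the call
    response.raise_for_status()
    return response.json()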
```

**Use cases:**

* **Respect API quotas**: If an API allows 100 concurrent requests, set `max_containers=50` and `concurrency=2`. This caps in-flight calls, not requests per second, so calls that finish in well under a second can still exceed a per-second quota
* **Control costs**: Limit concurrent LLM calls to manage token spend
* **Prevent overload**: Cap requests to internal services that can't handle high concurrency

See [Scale-Out & Queuing](/applications/scale-out-queuing) for more on `max_containers` and request queuing.
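
The sizing arithmetic can be sketched with a few helpers. These functions are hypothetical and not part of the Tensorlake SDK. The throughput estimate uses Little's law (throughput = in-flight calls / average latency) and assumes every slot stays busy, so treat it as a rough upper bound.

```python
def max_in_flight(max_containers: int, concurrency: int) -> int:
    """Upper bound on simultaneous calls across all containers."""
    return max_containers * concurrency

def containers_for_limit(concurrent_limit: int, concurrency: int) -> int:
    """Largest max_containers that keeps in-flight calls at or under the limit."""
    if concurrency > concurrent_limit:
        raise ValueError("a single container would already exceed the limit")
    return concurrent_limit // concurrency

def peak_requests_per_second(in_flight: int, avg_latency_s: float) -> float:
    """Little's law: throughput = in-flight calls / average latency."""
    return in_flight / avg_latency_s

cap = max_in_flight(max_containers=5, concurrency=2)
containers = containers_for_limit(concurrent_limit=100, concurrency=2)
rps = peak_requests_per_second(in_flight=cap, avg_latency_s=0.5)
```

With 10 calls in flight and an average latency of 0.5 seconds, the function can issue roughly 20 requests per second, so a per-second quota needs a lower cap than the concurrency limit alone suggests.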

## Related Guides

<CardGroup cols={2}>
  <Card title="Durable Execution" icon="clock-rotate-left" href="/applications/durability">
    How checkpointing makes every retry cheap.
  </Card>

  <Card title="Crash Recovery" icon="shield-check" href="/applications/crash-recovery">
    Surviving mid-loop crashes when retries run out.
  </Card>

  <Card title="Timeouts" icon="clock" href="/applications/timeouts">
    Per-function deadlines and progress-update heartbeats.
  </Card>

  <Card title="Error Handling" icon="triangle-exclamation" href="/applications/error-handling">
    Try/except, futures, and graceful degradation patterns.
  </Card>
</CardGroup>
