
LLM providers return rate-limit errors, APIs time out, and web scrapes hit transient failures. Tensorlake handles retries at the platform level — each retry is durable, meaning any nested function calls that already succeeded are served from checkpoints instead of re-executing. See Durable Execution for how the checkpoint mechanism works and Crash Recovery for the agent-loop walkthrough.
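The checkpoint idea can be illustrated with a plain-Python sketch. This is not Tensorlake's implementation — the in-memory `checkpoints` dict and the `checkpointed` helper are hypothetical stand-ins — but it shows the semantics: when a retry re-runs the outer flow, nested steps that already succeeded return their recorded results instead of executing again.

```python
# Illustrative sketch of checkpoint semantics (NOT Tensorlake's
# implementation): a retried outer call reuses recorded results of
# nested steps that already succeeded.
checkpoints = {}
fetch_calls = 0
attempts = 0

def checkpointed(key, fn, *args):
    """Run fn once; later calls with the same key reuse the result."""
    if key not in checkpoints:
        checkpoints[key] = fn(*args)  # recorded only on success
    return checkpoints[key]

def fetch_page(url):
    global fetch_calls
    fetch_calls += 1
    return f"<html>{url}</html>"

def research(url):
    global attempts
    attempts += 1
    page = checkpointed(("fetch", url), fetch_page, url)  # durable step
    if attempts == 1:
        raise TimeoutError("LLM timed out")  # first attempt fails downstream
    return page.upper()

# Platform-style retry loop: the second attempt reuses the checkpoint.
result = None
for _ in range(3):
    try:
        result = research("https://example.com")
        break
    except TimeoutError:
        continue

assert fetch_calls == 1  # nested call executed once, served from checkpoint
```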

Configuring Retries

Set the retries parameter on any @function() to automatically retry on failure. This is especially useful for LLM calls that return structured output — if the LLM returns malformed data, Pydantic validation fails and Tensorlake retries the entire call:
from pydantic import BaseModel
from tensorlake.applications import function

class ResearchFindings(BaseModel):
    summary: str
    sources: list[str]
    confidence: float

@function(retries=3)
def extract_findings(text: str) -> ResearchFindings:
    from openai import OpenAI
    # Disable client-level retries to avoid unpredictable behavior
    response = OpenAI(max_retries=0).chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Extract research findings as JSON."},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
    )
    # If validation fails, Tensorlake retries the entire function
    return ResearchFindings.model_validate_json(response.choices[0].message.content)
How retries work:
  • Rate limit errors, timeouts, or exceptions trigger automatic retries
  • Validation failures (e.g., Pydantic ValidationError) also trigger retries
  • Tensorlake retries up to the configured count (3 in this example) with exponential backoff
  • Any nested function calls that already succeeded are served from checkpoints, not re-executed (see Durable Execution)
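The retry loop above can be sketched in plain Python. The delays (1s, 2s, 4s, ...) and the exact loop shape are illustrative assumptions, not Tensorlake's documented schedule:

```python
import time

def retry_with_backoff(fn, retries=3, base_delay=1.0):
    """Retry fn with exponential backoff. Illustrative sketch only;
    Tensorlake's actual delay schedule is not specified here."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted -> surface the failure
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Example: a call that succeeds on the third attempt.
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("rate limited")
    return "ok"

result = retry_with_backoff(flaky, retries=3, base_delay=0)
```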
If retries are exhausted, the request fails and can be re-run later via the Replay API. For broader exception-handling patterns (try/except, futures, fallbacks), see Error Handling. For controlling per-function deadlines, see Timeouts.
Disable client-level retries (e.g., OpenAI’s max_retries=0) when using Tensorlake retries. Layering both creates unpredictable behavior and inflated retry counts.

Rate Limiting External APIs

When calling external APIs with rate limits, you can control the total number of concurrent calls using the formula:

Total concurrent calls = max_containers × concurrency

This allows you to respect API rate limits by capping the maximum number of parallel requests your function can make:
from tensorlake.applications import function

@function(
    retries=3,
    max_containers=5,  # Maximum 5 containers
    concurrency=2      # Each container handles 2 concurrent requests
)
def call_rate_limited_api(query: str) -> dict:
    # Total concurrent calls: 5 × 2 = 10 requests max
    import requests
    response = requests.get("https://api.example.com/search", params={"q": query})
    response.raise_for_status()  # surface HTTP errors (e.g. 429) so retries trigger
    return response.json()
Use cases:
  • Respect API quotas — If an API allows 100 requests/second, set max_containers=50 and concurrency=2
  • Control costs — Limit concurrent LLM calls to manage token spend
  • Prevent overload — Cap requests to internal services that can’t handle high concurrency
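The cap can be reasoned about locally with a semaphore sized to max_containers × concurrency. This is only a single-process analogue for checking the arithmetic, not how Tensorlake enforces the limit across containers:

```python
import threading
import time

MAX_CONTAINERS = 5   # mirrors max_containers in the example above
CONCURRENCY = 2      # mirrors concurrency
cap = MAX_CONTAINERS * CONCURRENCY  # 10 concurrent calls at most

sem = threading.Semaphore(cap)
lock = threading.Lock()
in_flight = 0
peak = 0

def call_api(i):
    global in_flight, peak
    with sem:  # blocks once `cap` calls are already in flight
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(0.001)  # stand-in for the actual HTTP request
        with lock:
            in_flight -= 1

threads = [threading.Thread(target=call_api, args=(i,)) for i in range(30)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert peak <= cap  # concurrency never exceeded 10
```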
See Scale-Out & Queuing for more on max_containers and request queuing.

Durable Execution

How checkpointing makes every retry cheap.

Crash Recovery

Surviving mid-loop crashes when retries run out.

Timeouts

Per-function deadlines and progress-update heartbeats.

Error Handling

Try/except, futures, and graceful degradation patterns.