Agentic applications interact with unreliable dependencies: LLMs, tools, and external APIs. This guide explains how errors propagate in Tensorlake Applications and covers common patterns for building resilient workflows on the Agentic Runtime.
How failures propagate
- A function can fail by raising an exception or timing out (see Timeouts).
- If an exception is not handled, it bubbles up to the caller and can fail the overall request. Failed requests can be re-run with the Replay API; previously successful nested calls are served from checkpoints instead of re-executing.
- Retries can be configured per-function or at the application level. See Retries & Rate Limits.
- Mid-loop crashes in long-running agents are covered in Crash Recovery.
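The bubbling behavior described above can be illustrated with plain Python (the function names are illustrative; locally, Tensorlake functions run as ordinary Python functions):

```python
def call_llm(prompt: str) -> str:
    # Simulate a flaky dependency failing with a timeout.
    raise TimeoutError("upstream LLM timed out")

def summarize(doc: str) -> str:
    # No try/except here, so the error bubbles up to the caller.
    return call_llm(f"Summarize: {doc}")

# The unhandled exception propagates out of the nested call
# and fails the overall request.
try:
    summarize("quarterly report")
except TimeoutError as e:
    print(f"request failed: {e}")
```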
Pattern: catch errors and continue
Use try/except inside your application to decide whether to fail the request or degrade gracefully.
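A minimal sketch of graceful degradation (the names `fetch_enrichment` and `process` are illustrative, not part of the Tensorlake API):

```python
def fetch_enrichment(record: dict) -> dict:
    # Flaky external call; may raise on timeouts or 5xx errors.
    raise ConnectionError("enrichment service unavailable")

def process(record: dict) -> dict:
    try:
        extra = fetch_enrichment(record)
    except ConnectionError:
        # Degrade gracefully: continue with a partial result
        # instead of failing the whole request.
        extra = {"enriched": False}
    return {**record, **extra}

print(process({"id": 1}))  # → {'id': 1, 'enriched': False}
```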
Pattern: retries for flaky dependencies
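To see the shape of the problem, here is a hand-rolled retry loop with exponential backoff for a transient 429-style error, in plain Python (illustrative only; in Tensorlake you would configure retries declaratively rather than writing this loop yourself):

```python
import time

def call_rate_limited_api(attempt_log: list) -> str:
    # Simulated flaky dependency: fails twice with a 429-style
    # error, then succeeds.
    attempt_log.append(1)
    if len(attempt_log) < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

def with_retries(fn, attempts: int = 3, base_delay: float = 0.01):
    for i in range(attempts):
        try:
            return fn()
        except RuntimeError:
            if i == attempts - 1:
                raise  # out of attempts: let the error propagate
            time.sleep(base_delay * 2**i)  # exponential backoff

log: list = []
print(with_retries(lambda: call_rate_limited_api(log)))  # → ok
```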
Retries are a good fit for transient failures (timeouts, 429s, temporary upstream errors). Configure them on the function, or set defaults on the application.
Futures: handling parallel failures
When using Futures, an error raised in a parallel call is re-raised when you call .result() on that Future:
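The same pattern shown with Python's stdlib concurrent.futures as a stand-in (the worker function `flaky` is invented for illustration; the point is that the exception surfaces at .result(), not at submission):

```python
from concurrent.futures import ThreadPoolExecutor

def flaky(i: int) -> int:
    if i == 2:
        raise ValueError(f"task {i} failed")
    return i * 10

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(flaky, i) for i in range(4)]
    results = []
    for f in futures:
        try:
            # The exception from the worker is re-raised here.
            results.append(f.result())
        except ValueError:
            results.append(None)  # record the failure and keep going

print(results)  # → [0, 10, None, 30]
```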
Debugging tips
- Start by reproducing locally: Tensorlake Applications run as normal Python functions locally. See Testing locally.
- Add structured logs: log inputs/outputs (excluding secrets) so you can diagnose failures.
- Make side effects idempotent: if a function can retry, avoid double-charging or double-writing.
Related Guides
Retries & Rate Limits
Configure auto-retries for transient failures and structured-output validation.
Crash Recovery
Resume long agent loops from the failed step instead of restarting.
Durable Execution
Replay API, adaptive vs. strict modes, and how checkpoints survive failures.
Timeouts
Per-function deadlines and progress-update heartbeats.