How failures propagate
- A function can fail by raising an exception or timing out (see Timeouts).
- If an exception is not handled, it bubbles up to the caller and can fail the overall request. Failed requests can be re-run with the Replay API, and previously successful nested calls are served from checkpoints instead of re-executing.
- Retries can be configured per-function or at the application level. See Retries & Rate Limits.
- Mid-loop crashes in long-running agents are covered in Crash Recovery.
Pattern: catch errors and continue
Use try/except inside your application to decide whether to fail the request or degrade gracefully.
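A minimal sketch of this pattern in plain Python, assuming an optional enrichment step that is allowed to fail. The names `EnrichmentError`, `enrich`, and `process` are hypothetical and are not part of the Tensorlake API; only the named error is caught, so anything unexpected still bubbles up and fails the request.

```python
import logging

logger = logging.getLogger(__name__)


class EnrichmentError(Exception):
    """Hypothetical error raised by an optional processing step."""


def enrich(record: dict) -> dict:
    # Hypothetical optional step: fails for records that have no "id".
    if "id" not in record:
        raise EnrichmentError("record has no id")
    return {**record, "enriched": True}


def process(record: dict) -> dict:
    try:
        return enrich(record)
    except EnrichmentError as exc:
        # Degrade gracefully: log the failure and return the record as-is.
        # Other exception types are not caught, so they still propagate.
        logger.warning("enrichment failed, continuing without it: %s", exc)
        return {**record, "enriched": False}
```

Catching a narrow exception type keeps the choice explicit: known, tolerable failures are degraded, and everything else fails the request.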
Pattern: retries for flaky dependencies
Retries are a good fit for transient failures (timeouts, 429s, temporary upstream errors). Configure them on the function (or set defaults on the application).
Futures: handling parallel failures
When using Futures, errors are raised when you call .result(), so that is where to handle them.
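The sketch below illustrates the same idea with the Python standard library's `concurrent.futures`, used here only as a stand-in: its futures also re-raise the worker's exception from `.result()`. It is not the Tensorlake Futures API, and `fetch` and `fetch_all` are hypothetical names. Each `.result()` call is wrapped separately so that one failed call does not discard the results of the others.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(item: int) -> int:
    # Hypothetical worker: fails for even items to simulate a partial failure.
    if item % 2 == 0:
        raise ValueError(f"cannot fetch item {item}")
    return item * 10


def fetch_all(items):
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Submitting does not raise, even if the call will fail later.
        futures = {item: pool.submit(fetch, item) for item in items}
        for item, future in futures.items():
            try:
                # The worker's exception surfaces here, on .result().
                results[item] = future.result()
            except ValueError as exc:
                errors[item] = str(exc)
    return results, errors
```

Collecting errors per item lets the caller decide afterwards whether a partial result is acceptable or the whole request should fail.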
Debugging tips
- Start by reproducing locally: Tensorlake Applications run as normal Python functions locally. See Testing locally.
- Add structured logs: log inputs/outputs (excluding secrets) so you can diagnose failures.
- Make side effects idempotent: if a function can retry, avoid double-charging or double-writing.
Related Guides
- Retries & Rate Limits: Configure auto-retries for transient failures and structured-output validation.
- Crash Recovery: Resume long agent loops from the failed step instead of restarting.
- Durable Execution: Replay API, adaptive vs. strict modes, and how checkpoints survive failures.
- Timeouts: Per-function deadlines and progress-update heartbeats.