Agentic applications and AI workflows are often long-running (seconds to hours) and interact with unreliable dependencies (LLMs, external APIs, tools). When a dependency call fails and exhausts its retries, the agent or workflow typically has to restart from scratch. This is costly and adds significant complexity and latency.

When running on Tensorlake, your application automatically saves the output of every Tensorlake function call in the current application request. If you replay a request after it failed, the outputs of previously successful Tensorlake function calls are available without re-execution. The same applies to automatic retries: when a retried Tensorlake function runs the same previously succeeded Tensorlake function calls again, it uses their saved outputs instead of re-executing them.

Storing the outputs of successful function calls in an application request, and reusing those outputs in the same request without re-executing the calls, is called durable execution. It works out of the box for all Tensorlake applications.
Durable execution is in technical preview mode. Please contact us on Slack if you’d like to ask a question or try it out.
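The durable execution mechanism described above can be sketched as a per-request store of successful call outputs. This is an illustrative toy, not Tensorlake's actual implementation; the names `saved_outputs`, `call_log`, and `durable_call` are invented for the example:

```python
# Illustrative sketch of durable execution (not Tensorlake internals):
# outputs of successful calls are saved per request, so a replay reuses
# them instead of re-executing the calls.

saved_outputs: dict[str, str] = {}  # per-request store of successful call outputs
call_log: list[str] = []            # tracks which calls actually executed

def durable_call(call_id: str, fn):
    """Return the saved output if this call already succeeded; otherwise run it."""
    if call_id in saved_outputs:
        return saved_outputs[call_id]
    result = fn()
    call_log.append(call_id)
    saved_outputs[call_id] = result
    return result

# Original run: both calls execute.
durable_call("step_1", lambda: "parsed document")
durable_call("step_2", lambda: "llm summary")

# Replay of the same request: saved outputs are reused, nothing re-executes.
call_log.clear()
replayed = durable_call("step_1", lambda: "parsed document")
assert call_log == []  # step_1 was not re-executed on replay
```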

Request Replay API

You can use the Request Replay API to restart a failed Tensorlake application request from where it failed, without re-executing the previously successful Tensorlake function calls in it.
curl \
"https://api.tensorlake.ai/applications/$APPLICATION_NAME/requests/$REQUEST_ID/replay" \
-H "Authorization: Bearer $TENSORLAKE_API_KEY" \
--json '{}'
When you replay a request, Tensorlake doesn’t create a new request. Instead, it re-runs the same request with the same request ID. The request runs again and the request output is updated when the replay completes.

Application code upgrade

When a request gets replayed, it runs the same application code version as the previous run. You can upgrade it to the latest application code version by passing --json '{ "upgrade_to_latest_version": true }' in the HTTP replay API call, or by calling request.replay(upgrade_to_latest_version=True) in Python. This is handy if you fixed a bug in your application code and want to re-run the request with the fix applied. If you replay with a code upgrade, make sure the latest application code can handle the original request inputs. This typically requires backward compatibility at the level of your application function parameters.
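One common way to keep upgraded code backward compatible at the parameter level is to give newly added parameters default values, so inputs recorded against the old signature still work. A minimal sketch (the `summarize` function and its parameters are hypothetical, and the Tensorlake `@function` decorator is omitted to keep the example self-contained):

```python
# Hypothetical example: keeping upgraded application code backward
# compatible with inputs from the original request.

# The original code version accepted only `text`:
#   def summarize(text: str) -> str: ...

# The upgraded version adds `max_words` with a default, so a replayed
# request that carries only the original input still works:
def summarize(text: str, max_words: int = 50) -> str:
    words = text.split()
    return " ".join(words[:max_words])

# A replayed request created against the old signature still succeeds:
print(summarize("durable execution reuses saved outputs"))
```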

Replay modes

Tensorlake detects when a replayed request follows a different execution path compared to the original request run or any of its past replays. For example, a replayed request may execute a new function call if the execution path depends on a random number generator:
import random
from tensorlake.applications import application, function

@function()
def foo():
    print("foo")

@function()
def bar():
    print("bar")

@application()
@function()
def my_workflow_app():
    # succeeded in the original request run,
    # skipped in the replayed run
    foo()
    if random.random() < 0.5:
        # never called in the original request run,
        # called in the replayed run, Tensorlake detects this
        bar()
    # ... more Tensorlake function calls
Other common causes of a replayed request following a different execution path:
  • Conditional execution of code depending on the current time, database state, values returned by external APIs, etc.
  • Changing order of Tensorlake function calls depending on the duration of external API calls, LLM calls, etc. (i.e., race conditions).
  • Changed Tensorlake function calls in upgraded application code.
For some applications, a replayed request following a different execution path is expected and acceptable; for others it is not. Tensorlake provides two replay modes to suit both types of applications: adaptive replay allows this scenario, while strict replay doesn't allow it and fails the replay if it happens.

Adaptive replay

By default, Tensorlake uses adaptive replay. In this mode, all new Tensorlake function calls are allowed to execute, even if the replayed request doesn't run some function calls that were executed in the original request run or in previous replayed runs. This mode is useful when you just want to re-run the request from where it failed, without being concerned about potential behavioral changes or non-determinism in your application code. To enable adaptive replay explicitly, pass --json '{ "mode": "adaptive" }' in the HTTP replay API call or call request.replay(mode=ReplayMode.ADAPTIVE) in Python; this is not necessary, since adaptive replay is the default mode.

Strict replay

In this mode, the request replay fails with a ReplayError if a new Tensorlake function call is detected during the replay and one or more Tensorlake function calls from the original request run (or from previous replayed runs) are not executed in the current replayed run. This mode is useful when you want to ensure that request behavior remains the same during replays, i.e. that all resources claimed during the original request run are reused during the replayed run without claiming more resources (for example, to avoid redoing cross-service transactions). To enable strict replay, pass --json '{ "mode": "strict" }' in the HTTP replay API call or call request.replay(mode=ReplayMode.STRICT) in Python.
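The difference between the two modes can be sketched as a check over the sets of recorded and replayed calls. This is an illustrative toy, not Tensorlake internals; `check_replay` and its arguments are invented for the example:

```python
# Illustrative sketch (not Tensorlake internals): how the two replay
# modes react when a replayed run's calls diverge from previously
# recorded calls.

class ReplayError(Exception):
    pass

def check_replay(previous_calls: set[str], replayed_calls: set[str], mode: str) -> str:
    new_calls = replayed_calls - previous_calls      # calls not seen before
    skipped_calls = previous_calls - replayed_calls  # recorded calls not re-run
    # Strict mode fails only when a new call appears AND a previously
    # recorded call is not executed; adaptive mode always proceeds.
    if mode == "strict" and new_calls and skipped_calls:
        raise ReplayError(
            f"execution path diverged: new={new_calls}, skipped={skipped_calls}"
        )
    return "ok"

# Adaptive replay tolerates a new bar() call appearing:
print(check_replay({"foo"}, {"foo", "bar"}, mode="adaptive"))
```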

How function calls are matched

Tensorlake computes a fingerprint for every Tensorlake function call made in an application request. It then compares the fingerprints of new function calls made during a request replay with the fingerprints of previously executed function calls in the same request, to determine whether a function call has been made before. A function call fingerprint includes:
  • Function call type (i.e. “function_call”, “map”, “reduce”).
  • Function name.
  • Parent function call fingerprint.
  • Function call sequence number in the parent function call.
  • Other information to ensure that changes in function call tree structures are detected.
Function parameters are not included in the function call fingerprint. Takeaways from this:
  • Changing function parameters in application code doesn’t affect replay behavior. A new function call with different parameters still matches the previous function call. This enables seamless application code upgrades without affecting replays.
  • Passing different values (e.g., random numbers, current time) as function parameters doesn’t affect replay behavior. A function call with a different random number passed into it still matches its previous function call where the random number was different.
  • If the sequence of function calls changed in the latest application code, the replayed function calls will not match the previous function calls. In this case, the replay behavior depends on the selected replay mode (adaptive or strict).
  • If function calls are started in an arbitrary order (e.g., with a random delay), the order of function calls will differ between the original request run and the replayed run even without application code changes. In this case, the replay behavior depends on the selected replay mode (adaptive or strict). Application code should avoid arbitrary function call ordering to ensure consistent behavior during request replays and reuse of previously completed work.
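The fingerprinting scheme above can be sketched as a hash over the call's position in the call tree. The hashing details here are assumptions for illustration only (the real scheme includes additional fields); the key point is that function parameters are deliberately excluded:

```python
import hashlib

# Illustrative sketch of call fingerprinting (hashing details are
# assumptions, not Tensorlake's actual scheme). Function parameters
# are deliberately NOT part of the fingerprint.

def fingerprint(call_type: str, function_name: str,
                parent_fingerprint: str, sequence_number: int) -> str:
    material = f"{call_type}|{function_name}|{parent_fingerprint}|{sequence_number}"
    return hashlib.sha256(material.encode()).hexdigest()[:16]

root = fingerprint("function_call", "my_workflow_app", "", 0)

# Same position in the call tree => same fingerprint, even if the
# function was passed different parameters (parameters aren't hashed):
a = fingerprint("function_call", "foo", root, 1)
b = fingerprint("function_call", "foo", root, 1)
assert a == b

# A different sequence number (position) => different fingerprint:
c = fingerprint("function_call", "foo", root, 2)
assert a != c
```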

Automatic retries

When a Tensorlake function call gets retried automatically, it uses the same durable execution mechanism to re-use outputs of previously successful Tensorlake function calls from the same request. In this case, adaptive replay mode is always used.

Disabling durable execution

Durable execution is enabled by default for all Tensorlake functions. You can disable it for a function by setting the durable attribute to False in the @function decorator.
from tensorlake.applications import application, function
from magic_llm import ask_llm

@function(durable=False)
def get_current_weather() -> str:
    return ask_llm("What's the weather now in San Francisco?")
Disabling durable execution for a function means that when its parent function calls are re-executed during request replays or automatic retries, the non-durable function calls are always re-executed and their outputs are never reused from previous executions. Disabling durability is useful for functions that must always run fresh (e.g., functions that return the current time, weather, or a stock price).

It’s not recommended to call other Tensorlake functions from a non-durable function, because all such function calls also become non-durable and are always re-executed. If the same functions are called from durable functions, their outputs are still saved and reused as normal, provided the called functions are durable.

With strict replay mode, no validation is done on non-durable function calls. A non-durable function call and all its child Tensorlake function calls simply get re-executed on each replay.

Best practices for durable Tensorlake applications

  • Wrap every external call (LLM, API, database, etc.) in a Tensorlake function to make these calls durable and avoid repeating work. If a framework is doing these calls then use framework customization points (e.g., callbacks, hooks, decorators, etc.) to wrap the calls in Tensorlake functions.
  • Design your application code to be deterministic to ensure that replays follow the same execution path and thus reuse previously finished work.
  • If your Tensorlake functions have external side effects (e.g., sending emails, modifying databases), ensure that these side effects are idempotent or can be safely retried without causing issues.
  • Disable durability for functions that must always run on request replay or retry.
  • If strict mode and upgrade to the latest code version are both used in a replay, the latest application code must be fully backward compatible with the original request’s code to avoid failing the replay.
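The idempotency recommendation above can be sketched with a deduplication key: if a side effect records a stable key before repeating itself, re-executing the call during a replay or retry is harmless. This is a hypothetical example; `send_email_idempotent` and the in-memory dedup store stand in for a real email sender and a durable store:

```python
# Hypothetical sketch: making an external side effect idempotent with a
# stable deduplication key, so re-execution on replay/retry is safe.

sent_keys: set[str] = set()          # stand-in for a durable dedup store
emails_delivered: list[str] = []     # stand-in for the real email service

def send_email_idempotent(dedup_key: str, recipient: str, body: str) -> None:
    if dedup_key in sent_keys:
        return  # already sent in a previous execution; do nothing
    emails_delivered.append(recipient)  # the actual send happens here
    sent_keys.add(dedup_key)

# The original execution sends the email...
send_email_idempotent("req-123:approval-notice", "user@example.com", "Approved!")
# ...and a re-executed call during replay or retry is a no-op:
send_email_idempotent("req-123:approval-notice", "user@example.com", "Approved!")

assert emails_delivered == ["user@example.com"]  # sent exactly once
```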

Human in the loop and external events

The replay API can be used for resuming requests that timed out while waiting for external inputs (e.g., human review, external event).
from tensorlake.applications import application, function, RequestError
from approval_system import wait_for_approval_request, get_approval_message, create_approval_request, ApprovalDeniedError

@function()
def create_approval(user_id: str, action: str) -> str:
    """Wraps create_approval_request call to make it durable."""
    return create_approval_request(user_id, action)


@function(timeout=300)  # 5 minute timeout
def wait_approval(approval_id: str) -> str:
    """Waits for an approval request to be completed and returns its approval message.

    Raises RequestError if the approval was denied. This fails the current request immediately.
    """
    try:
        wait_for_approval_request(approval_id)
    except ApprovalDeniedError as e:
        raise RequestError(f"Approval '{approval_id}' was denied: {e}") from e
    return get_approval_message(approval_id)


@application()
@function()
def user_authorization_workflow(user_id: str, action: str) -> str:
    approval_id: str = create_approval(user_id, action)
    approval_message: str = wait_approval(approval_id)
    return f"Approval received: {approval_message}"
The wait_approval function times out after waiting for 5 minutes. The wait can be resumed once the approval is granted by replaying the request using the Request Replay API. The replayed request skips the already completed create_approval function call and re-executes the wait_approval function call, which can now complete successfully.

Comparison with Temporal

Both Tensorlake and Temporal provide durable execution, but they achieve it through different architectures. Temporal relies on event history replay, whereas Tensorlake saves and retrieves function outputs and matches function calls using their fingerprints. This removes many of the constraints that Temporal imposes on application code.
| Feature | Tensorlake Applications | Temporal |
| --- | --- | --- |
| User Code Constraints | Adaptive. By default, a replayed request can change its function calls. | Strict Determinism. Workflow logic must be perfectly deterministic or replay crashes. |
| Handling Code Updates | Adaptive. By default, Tensorlake adapts to new code. New function calls execute normally, and removed function calls are ignored. No special versioning logic is required. | Complex. Requires explicit “Versioning” logic (workflow.patched()) or creating new task queues to prevent “Non-Determinism Errors” when replays encounter new code. |
| History Limits | Unlimited. There are no event history size limits. You can have infinite loops or long-running applications without resetting execution state. | Limited. Event history has hard size limits (typically 50K events). Large loops or long-running workflows must use “Continue-As-New” to truncate history. |
| Replay Behavior | Adaptive. By default, if the code execution path deviates, Tensorlake simply executes the new path while reusing cached outputs where possible. | Strict. If the code execution path deviates from the saved history, the workflow fails (block/retry loop). |
| Code Failures | Fails Fast. If a function fails and runs out of retries, the request fails immediately, allowing you to debug and replay it later when fixed. | Blocks and Retries. If a workflow task fails (e.g., a bug in logic), it blocks and retries indefinitely until fixed. |
| Code Structure | Flexible. You can structure your application code freely, using any programming constructs without worrying about replay constraints. | Constrained. You must split the code into workflows and activities and use them carefully to avoid non-determinism and ensure replayability. |
| Strict Replay Mode | Available. You can enable strict replay mode to enforce exact function call matching during replays to avoid non-determinism. | Available. Temporal always enforces strict determinism in workflow code. |
| Non-durable Functions | Supported. You can disable durability for specific functions that must always run fresh on replays or retries. | Not Supported. All external data must be recorded in history. Retrieving fresh data during replay is generally forbidden to prevent non-determinism. |
