Tensorlake Workflows let you compose functions into a Graph and execute them in parallel or serially. Below are the most common questions about how Workflows work.

What is durable execution?

Durable execution is a runtime model where the outputs of each step in a long-running program are checkpointed, so a crash, timeout, or retry can resume from the last completed step instead of restarting from scratch. It’s the foundation behind systems like Temporal, Inngest, and Restate, and is commonly used for AI agents, long-running data pipelines, multi-step orchestration, and workflows that span minutes, hours, or days. Tensorlake Workflows implement durable execution natively in Python: function outputs are checkpointed to object storage, and on failure the scheduler replays the call graph and skips already-completed steps. See Durable Execution for the full model.
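
As a mental model, the mechanism can be sketched in a few lines of plain Python. This toy checkpoint store is illustrative only, not the Tensorlake API; Tensorlake persists outputs to object storage rather than an in-memory dict:
checkpoints: dict[str, object] = {}  # stands in for durable object storage

def run_step(name: str, fn, *args):
    # Reuse the output of any step that completed in an earlier attempt.
    if name in checkpoints:
        return checkpoints[name]
    result = fn(*args)
    checkpoints[name] = result  # checkpoint the output before moving on
    return result

def workflow(x: int) -> int:
    a = run_step("increment", lambda v: v + 1, x)
    b = run_step("double", lambda v: v * 2, a)
    return b
If the process crashes between the two steps, re-running workflow replays the call graph but skips "increment", since its output is already checkpointed.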

What are Tensorlake Workflows?

Tensorlake Workflows automate and orchestrate complex, multi-step tasks. You define a series of functions that execute in parallel or sequentially, and Tensorlake handles distribution, persistence, and recovery.

What is a Graph in a Tensorlake Workflow?

A Graph connects multiple functions together into a workflow. It contains:
  • Node — a function that operates on data.
  • Start Node — the first function executed when the graph is invoked.
  • Edges — represent data flow between functions.
  • Conditional Edge — evaluates the data produced by the previous function and decides which edge(s) to take, like an if/else statement.
Graphs are workflows whose functions can be executed in parallel, while Pipelines are linear workflows that execute functions serially.

How do I define a function in a Tensorlake Workflow?

Functions are regular Python functions decorated with @tensorlake_function(). A function executes in a distributed manner and its output is stored, so if downstream functions fail they can resume from that output. The decorator accepts parameters to configure retry behavior, placement constraints, and more.
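
A minimal sketch of a decorated function (the retries parameter name below is hypothetical, shown only to illustrate that configuration is passed through the decorator; consult the SDK reference for the exact options):
from tensorlake import tensorlake_function

# `retries` is a hypothetical parameter name used for illustration;
# the decorator accepts configuration for retries, placement, and more.
@tensorlake_function(retries=3)
def clean_text(text: str) -> str:
    # The return value is checkpointed, so downstream steps can resume
    # from this output if they fail.
    return text.strip().lower()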

How do I run a sequential pipeline in Tensorlake?

Chain nodes with add_edge so each function transforms the output of the previous one until reaching the end node.
from tensorlake import Graph, tensorlake_function

@tensorlake_function()
def node1(input: int) -> int:
    return input + 1

@tensorlake_function()
def node2(input2: int) -> int:
    return input2 + 2

@tensorlake_function()
def node3(input3: int) -> int:
    return input3 + 3

graph = Graph(name="pipeline", start_node=node1)
graph.add_edge(node1, node2)
graph.add_edge(node2, node3)
Use case: Transforming a video into text by first extracting the audio, and then doing Automatic Speech Recognition (ASR) on the extracted audio.

How do I run workflow steps in parallel in Tensorlake?

Add multiple edges from one start node to different downstream functions. Each branch receives the same input and produces its own output in parallel.
from tensorlake import Graph, tensorlake_function

@tensorlake_function()
def start_node(input: int) -> int:
    return input + 1

@tensorlake_function()
def add_two(input: int) -> int:
    return input + 2

@tensorlake_function()
def is_even(input: int) -> bool:
    return input % 2 == 0

graph = Graph(name="fanout", start_node=start_node)
graph.add_edge(start_node, add_two)
graph.add_edge(start_node, is_even)
Use case: Extracting embeddings and structured data from the same unstructured data.

How do I parallelize a function across many items (map) in Tensorlake?

When an upstream function returns a sequence and the downstream function accepts a single element of that sequence, Tensorlake automatically parallelizes the downstream function — one invocation per element — across machines and worker processes.
import requests

from tensorlake import Graph, tensorlake_function

@tensorlake_function()
def fetch_urls() -> list[str]:
    return [
        'https://example.com/page1',
        'https://example.com/page2',
        'https://example.com/page3',
    ]

# scrape_page is called in parallel for every element returned by fetch_urls,
# across many machines in a cluster or many worker processes on a machine
@tensorlake_function()
def scrape_page(url: str) -> str:
    return requests.get(url).text

graph = Graph(name="scraper", start_node=fetch_urls)
graph.add_edge(fetch_urls, scrape_page)
Use case: Generating an embedding for every chunk of a document.

How do I aggregate results across many items (reduce) in Tensorlake?

Reduce functions aggregate outputs from one or more functions that return sequences. They have two key properties:
  • Lazy evaluation — reduce functions are invoked incrementally as elements become available, so they stream over large datasets efficiently.
  • Stateful aggregation — the aggregated value persists between invocations. Each call receives the current accumulated state along with the new element to process.
from pydantic import BaseModel

from tensorlake import Graph, tensorlake_function

@tensorlake_function()
def fetch_numbers() -> list[int]:
    return [1, 2, 3, 4, 5]

class Total(BaseModel):
    value: int = 0

# Each invocation receives the accumulated Total plus one new element.
@tensorlake_function(accumulate=Total)
def accumulate_total(total: Total, number: int) -> Total:
    total.value += number
    return total

graph = Graph(name="totals", start_node=fetch_numbers)
graph.add_edge(fetch_numbers, accumulate_total)
Use case: Aggregating a summary from hundreds of web pages.

How do I conditionally route data between functions in Tensorlake?

Use @tensorlake_router on a function that returns the list of downstream functions to invoke based on custom logic. The router decides at runtime which branch(es) to take.
from typing import List, Union

from tensorlake import tensorlake_function, tensorlake_router

@tensorlake_function()
def handle_error(text: str):
    # Logic to handle error messages
    pass

@tensorlake_function()
def handle_normal(text: str):
    # Logic to process normal text
    pass

# The router inspects its input and returns the downstream function(s)
# to invoke: handle_error for error text, handle_normal otherwise.
@tensorlake_router()
def analyze_text(text: str) -> List[Union[handle_error, handle_normal]]:
    if 'error' in text.lower():
        return [handle_error]
    return [handle_normal]
Use case: Processing outputs differently based on classification results.

How do Tensorlake Workflows compare to durable-execution systems like Temporal or Inngest?

Tensorlake Workflows are a durable-execution runtime in the same category as Temporal, Inngest, and Restate: function outputs are checkpointed, and on failure the scheduler replays the call graph from the last completed checkpoint instead of re-running everything from scratch. The differences lie in the authoring surface and integration:
  • Authored as plain Python. Functions are decorated with @tensorlake_function — no separate worker SDK or activity/workflow split.
  • One runtime for code and isolation. Workflows run on the same platform as Tensorlake Sandboxes, so the durable functions and the isolated environments they call into are managed by one scheduler.
  • Output storage built in. Function outputs are persisted to object storage by default, so you can pass large files between steps without external workarounds.
See Architecture and Durable Execution for how checkpointing and replay work.