
# Building Workflows

> Build multi-step data workflows with parallel execution and optimized resource usage

Data workflows involve multiple steps — fetching, transforming, validating, enriching, and loading data. Tensorlake lets you define these pipelines as composed functions that automatically run in parallel where possible, with built-in durability and resource optimization.

Your workflows are exposed as HTTP endpoints that can be called on demand. They scale up when called and scale down when idle.

## Your First Workflow

Workflows in Tensorlake use **futures** to define function calls without executing them immediately. This allows Tensorlake to optimize execution by running independent steps in parallel. When you return a future from a function (called a **tail call**), the function completes immediately without blocking, and Tensorlake orchestrates the remaining work.

Here's a simple workflow that fetches data from two sources in parallel and merges the results:

```python theme={null}
from tensorlake.applications import application, function

@application()
@function()
def enrich_record(record_id: str) -> dict:
    # Create futures - these don't run yet, just define the function calls
    profile = fetch_profile.future(record_id)
    history = fetch_history.future(record_id)

    # Return a tail call - enrich_record() completes immediately without blocking
    # Tensorlake then automatically:
    # 1. Runs fetch_profile() and fetch_history() in parallel (no dependencies between them)
    # 2. Once both complete, runs merge_data() with their results
    # 3. Uses merge_data()'s return value as enrich_record()'s final result
    return merge_data.future(profile, history)


@function()
def fetch_profile(record_id: str) -> dict:
    # Fetch from profile service
    return {"id": record_id, "name": "Example Corp", "tier": "enterprise"}


@function()
def fetch_history(record_id: str) -> list:
    # Fetch transaction history
    return [{"date": "2024-01-15", "amount": 5000}]


@function()
def merge_data(profile: dict, history: list) -> dict:
    return {"profile": profile, "transactions": history}
```

**What happens when you call this workflow:**

```bash theme={null}
curl https://api.tensorlake.ai/applications/enrich_record \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  --json '"rec_123"'
```

1. `enrich_record` starts and immediately returns (doesn't block)
2. `fetch_profile("rec_123")` and `fetch_history("rec_123")` run **in parallel**
3. When both complete, `merge_data` runs with both results
4. Final response contains the merged data
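
The same request can also be built from Python. The sketch below uses only the standard library and mirrors the `curl` flags above: `--json` sends a POST with a JSON body plus JSON `Content-Type` and `Accept` headers. The helper names `build_request` and `call_workflow` are illustrative and not part of any Tensorlake SDK, and the response body is assumed to be JSON.

```python
import json
import os
import urllib.request

API_URL = "https://api.tensorlake.ai/applications/enrich_record"


def build_request(record_id: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that the curl command above sends."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(record_id).encode("utf-8"),  # JSON string body, e.g. "rec_123"
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def call_workflow(record_id: str) -> object:
    """Send the request and decode the body (assumes a JSON response)."""
    request = build_request(record_id, os.environ["TENSORLAKE_API_KEY"])
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Calling `call_workflow("rec_123")` requires `TENSORLAKE_API_KEY` to be set in the environment, just as the `curl` command does.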

<Check>
  **Key benefits:**

  * **Parallel execution** where possible (lower latency)
  * **No blocking** — the orchestrator container is freed immediately
  * **Automatic dependency tracking** — no manual coordination needed
  * **Built-in durability** — failures resume from checkpoints
</Check>
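
To build intuition for the dependency tracking described above, here is a plain-Python analogy that does not use Tensorlake at all. `Call`, `resolve`, and `run_workflow` are hypothetical names invented for this sketch: a `Call` describes a function call without running it, and `resolve` starts every nested `Call` on a thread pool before running the step that depends on them. Tensorlake's actual scheduler is not shown here and may work quite differently.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Call:
    """A described-but-not-yet-run function call (a stand-in for a future)."""
    fn: Callable[..., Any]
    args: tuple = ()


def resolve(call: Call, pool: ThreadPoolExecutor) -> Any:
    """Run a Call once its arguments are ready; sibling Calls run concurrently."""
    # Start every nested Call right away so independent steps overlap.
    started = [
        pool.submit(resolve, arg, pool) if isinstance(arg, Call) else arg
        for arg in call.args
    ]
    # Wait for the nested results, then run this step with them.
    ready = [s.result() if isinstance(s, Future) else s for s in started]
    return call.fn(*ready)


def fetch_profile(record_id: str) -> dict:
    return {"id": record_id, "name": "Example Corp", "tier": "enterprise"}


def fetch_history(record_id: str) -> list:
    return [{"date": "2024-01-15", "amount": 5000}]


def merge_data(profile: dict, history: list) -> dict:
    return {"profile": profile, "transactions": history}


def enrich_record(record_id: str) -> Call:
    # Nothing runs here: the function only describes the work and returns.
    profile = Call(fetch_profile, (record_id,))
    history = Call(fetch_history, (record_id,))
    return Call(merge_data, (profile, history))


def run_workflow(record_id: str) -> dict:
    # Toy driver: a deeply nested graph could exhaust a small pool.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return resolve(enrich_record(record_id), pool)
```

In this sketch, `run_workflow("rec_123")` submits both fetches before waiting on either, then passes their results to `merge_data`, which mirrors steps 2 to 4 above.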

For a deep dive on futures and tail calls, see [Futures](/applications/futures#tail-calls).
See [async functions](/applications/async-functions) to learn how to build non-blocking workflows using Python async/await.

<Info>
  Each function in your workflow can be configured with retry policies. If a step fails, Tensorlake automatically retries it based on your [retry configuration](/applications/retries).
</Info>
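
As a conceptual illustration of what retrying a step means, the sketch below retries a failing callable with exponential backoff in plain Python. It is not Tensorlake's retry API: `with_retries`, `max_attempts`, and `base_delay` are hypothetical names, and the backoff schedule is an assumption made for the example. The retry configuration page linked above describes the options Tensorlake actually supports.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(step: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.01) -> T:
    """Call step() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


def make_flaky_step(failures: int) -> Callable[[], str]:
    """Return a step that raises `failures` times and then succeeds."""
    remaining = {"count": failures}

    def step() -> str:
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise ConnectionError("transient failure")
        return "ok"

    return step
```

With the defaults, `with_retries(make_flaky_step(2))` succeeds on the third attempt, while a step that fails three times re-raises its last error.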

## Best Practices

### Design for Parallelism

Identify steps that can run independently:

```python theme={null}
from tensorlake.applications import function

# step1, step2, and combine are other @function()-decorated steps (not shown)

# Sequential — slow
@function()
def slow_pipeline(data: str) -> str:
    result1 = step1(data)
    result2 = step2(data)  # Could have run in parallel
    return combine(result1, result2)

# Parallel — fast
@function()
def fast_pipeline(data: str) -> str:
    result1 = step1.future(data)
    result2 = step2.future(data)  # Runs in parallel with step1
    return combine.future(result1, result2)
```

### Use Tail Calls for Efficiency

Return futures instead of blocking. When you return a future as a tail call, the current function's container is freed immediately — you're not paying for idle containers waiting for downstream results.

```python theme={null}
from tensorlake.applications import function

# expensive_operation is another @function()-decorated step (not shown)

# Blocks container unnecessarily
@function()
def inefficient(data: str) -> str:
    result = expensive_operation(data)  # Container blocked here
    return result

# Frees container immediately
@function()
def efficient(data: str) -> str:
    return expensive_operation.future(data)  # Container freed right away
```
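
The difference between the two styles comes down to when the calling function finishes. The toy model below makes that ordering visible in plain Python. `Deferred`, `run`, and the `events` log are hypothetical constructs for this sketch and are not Tensorlake APIs: `run` plays the role of an orchestrator that keeps executing returned calls until it gets a plain value.

```python
from dataclasses import dataclass
from typing import Any, Callable

events: list[str] = []


@dataclass
class Deferred:
    """Hypothetical stand-in for a returned future: a call to run later."""
    fn: Callable[..., Any]
    args: tuple


def expensive_operation(data: str) -> str:
    events.append("expensive_operation ran")
    return data.upper()


def inefficient(data: str) -> str:
    result = expensive_operation(data)  # caller is still active while this runs
    events.append("inefficient returned")
    return result


def efficient(data: str) -> Deferred:
    events.append("efficient returned")  # logged just before handing off
    return Deferred(expensive_operation, (data,))


def run(fn: Callable[..., Any], *args: Any) -> Any:
    """Toy orchestrator: keep running deferred calls until a plain value appears."""
    result = fn(*args)
    while isinstance(result, Deferred):
        result = result.fn(*result.args)
    return result
```

Both `run(inefficient, "abc")` and `run(efficient, "abc")` produce the same value, but the `events` log differs: in the blocking version the expensive step runs before the caller returns, while in the tail-call version the caller has already returned by the time the expensive step starts.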

### Process Lists with Map-Reduce

For workflows that process collections of items, use map-reduce operations to parallelize the work:

```python theme={null}
from pydantic import BaseModel
from tensorlake.applications import application, function

class ProcessingResult(BaseModel):
    total_processed: int = 0
    total_value: float = 0.0

@application()
@function()
def process_batch(record_ids: list[str]) -> ProcessingResult:
    # Map: process each record in parallel
    results = process_record.future.map(record_ids)
    # Reduce: aggregate results as they complete
    return aggregate_results.future.reduce(results, ProcessingResult())

@function()
def process_record(record_id: str) -> dict:
    # Each record processed in its own container
    return {"id": record_id, "value": 100.0}

@function()
def aggregate_results(summary: ProcessingResult, record: dict) -> ProcessingResult:
    summary.total_processed += 1
    summary.total_value += record["value"]
    return summary
```

Map-reduce operations automatically run in parallel and scale to handle large datasets efficiently. See [Map-Reduce](/applications/map-reduce) for more details.
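
For readers who want to see the map-reduce pattern without the platform, here is a rough local analogy using Python's standard thread pool. It reuses the function names from the example above but swaps the Pydantic model for a dataclass and replaces Tensorlake's map and reduce calls with `ThreadPoolExecutor` and `as_completed`. It is a sketch of the pattern under those assumptions, not of how Tensorlake schedules containers.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass


@dataclass
class ProcessingResult:
    total_processed: int = 0
    total_value: float = 0.0


def process_record(record_id: str) -> dict:
    return {"id": record_id, "value": 100.0}


def aggregate_results(summary: ProcessingResult, record: dict) -> ProcessingResult:
    summary.total_processed += 1
    summary.total_value += record["value"]
    return summary


def process_batch(record_ids: list[str]) -> ProcessingResult:
    summary = ProcessingResult()
    with ThreadPoolExecutor(max_workers=8) as pool:
        # Map: submit one task per record so they can run concurrently.
        pending = [pool.submit(process_record, rid) for rid in record_ids]
        # Reduce: fold each result into the summary as soon as it is ready.
        for done in as_completed(pending):
            summary = aggregate_results(summary, done.result())
    return summary
```

Because results are folded in completion order rather than input order, this style of reduction suits order-independent aggregations such as counts and sums.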

## Learn More

<CardGroup cols={2}>
  <Card title="Futures" icon="shuffle" href="/applications/futures">
    Deep dive on futures, tail calls, and parallel execution.
  </Card>

  <Card title="Async Functions" icon="shuffle" href="/applications/async-functions">
    Async functions are another way to define workflows with parallel execution.
  </Card>

  <Card title="Durable Execution" icon="clock-rotate-left" href="/applications/durability">
    How workflows recover from failures.
  </Card>
</CardGroup>
