Workflows are not Generally Available yet. Please contact us for access if you are interested in building and running hassle-free, data-intensive workflows on GPUs.

Tensorlake Workflows orchestrate data ingestion and transformation. Workflows are distributed, so they can ingest and process large volumes of data in parallel, and they are durable, so every piece of ingested data is guaranteed to be processed.

Workflows are defined by writing Python functions that process data. The functions are connected to each other to form a workflow that captures the end-to-end data ingestion and transformation logic.

We use Graphs to represent workflows, so that disjoint parts of a workflow can be executed in parallel.

Functions

Functions are the building blocks of workflows. They are regular Python functions decorated with the @tensorlake_function() decorator.

Functions can be executed in a distributed manner, and their outputs are stored, so that if a downstream function fails, it can be resumed from the stored outputs of its upstream functions.

my_graph.py
from tensorlake import tensorlake_function

@tensorlake_function()
def multiply(a: int, b: int) -> int:
    return a * b
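The durability described above can be illustrated with a plain-Python sketch. The `stored` dict and `run_step` helper here are illustrative only, not part of the Tensorlake SDK; the Tensorlake runtime handles output storage and resumption for you:

```python
# Toy sketch of durable execution: each function's output is stored,
# so retries resume from stored upstream outputs instead of recomputing them.
stored = {}

def run_step(name, fn, *args):
    if name in stored:        # this step already succeeded: reuse its output
        return stored[name]
    stored[name] = fn(*args)
    return stored[name]

x = run_step("multiply", lambda a, b: a * b, 2, 3)  # computes 6 and stores it
y = run_step("square", lambda v: v * v, x)          # computes 36 from the stored 6
# If "square" had failed, a retry would reuse the stored "multiply" output.
```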

Graphs

Graphs define the data flow in a workflow across functions.

A Graph contains:

  • Start Function: The first function that is executed when the graph is invoked.
  • Edges: Connections between functions.
my_graph.py
from tensorlake import Graph, tensorlake_function

@tensorlake_function()
def square(a: int) -> int:
    return a * a

# multiply is defined earlier in my_graph.py
graph = Graph(name="my_graph", start_node=multiply)
graph.add_edge(multiply, square)
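To make the data flow concrete, here is a plain-Python sketch of how invoking this graph routes data along its edges. This is an illustration only; the `edges` dict and `invoke` helper are not SDK APIs, and the actual execution is distributed and handled by the Tensorlake runtime:

```python
def multiply(a: int, b: int) -> int:
    return a * b

def square(a: int) -> int:
    return a * a

# Edges route each function's output to its downstream functions.
edges = {multiply: [square]}

def invoke(start, **kwargs):
    result = start(**kwargs)            # run the start function first
    for downstream in edges.get(start, []):
        result = downstream(result)     # feed its output along each edge
    return result

print(invoke(multiply, a=2, b=3))  # multiply(2, 3) = 6, then square(6) = 36
```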

Testing and Deploying Graphs

Graphs and Functions can be tested locally like any other Python code.

Install the tensorlake package:

pip install tensorlake

Run the graphs locally:

my_graph.py
graph.run(a=2, b=3)

Deploying Graphs

export TENSORLAKE_API_KEY=<API_KEY>
tensorlake deploy my_graph.py

Calling Graphs

Once a Graph is deployed, it is exposed as an HTTP endpoint waiting for data to be sent to it. You can invoke the graph using the HTTP API or the Python SDK.

Using HTTP API
curl -X POST https://api.tensorlake.ai/v1/workflows/my_graph \
-H "Authorization: Bearer <API_KEY>" \
-H "Content-Type: application/json" \
-d '{"a": 2, "b": 3}'
Using Python SDK
from tensorlake import RemoteGraph

g = RemoteGraph(name="my_graph")
g.run(a=2, b=3)

Resource Usage Metering

Workflows are billed based on the number of seconds they are running any of their functions. We don’t charge you when functions are not running, or when data has been ingested but is not yet being processed by any function.