Overview
Tensorlake Workflows enable orchestrating data ingestion and transformation workflows. The workflows are distributed and can handle ingesting and processing large volumes of data paralelly. You can run inference on ingested data using AI models at any point of the workflow. They are durable so it’s guaranteed that all ingested data will be processed.
Overview of Workflows
Workflows are defined by writing Python functions which processes data. The functions are connected to each other to form a workflow which captures the end-to-end data ingestion and transformation logic.
We use Graphs to represent the workflow, to enable parallel execution of disjoint parts of the workflow, and dynamic routing of data across functions similar to calling functions with if-else statements on the input data.
Tensorlake Functions
Tensorlake functions are the building blocks of workflows. They are Python functions decorated with the @tensorlake_function
decorator.
You can write any Python code inside the function, and depend on any Python package. The functions are run inside a container
of an image built from the specification of the Image
object.
Graphs
You can string together multiple functions to form a workflow.
In the above example, my_function
is the start node of the workflow. The input to the workflow is passed to my_function
.
Workflows are exposed as HTTP endpoints, the body of the request will be passed to the start node of the workflow, in this case my_function
.
Outputs
Tensorlake workflows allow retrieving the outputs of any function in the workflow.
Automatic ScaleOuts
If a function returns a list, and its edges processes a single element of the list, Tensorlake automatically parallelizes the edges by bringing up many instances of the function.
In the above example, count_words
will be called in parallel for each chunk of the essay. If you are using tools like
Airflow, you would need to use something like Spark, Ray or Dask to parallelize the processing. Tensorlake does this automatically.
Get Started with Tensorlake Serverless
Deploy your first Tensorlake Serverless workflow
Key Concepts
Learn about the key concepts for building workflows
Was this page helpful?