Tensorlake functions are defined using the @tensorlake_function decorator.
The @tensorlake_function decorator allows you to specify the following attributes:
image - The image to use for the function container. A basic Debian based image by default. See Images.
input_encoder - The serializer to use for the input of the function. json by default. See Input and output serialization.
output_encoder - The serializer to use for the output of the function. json by default. See Input and output serialization.
secrets - The secrets available to the function in its environment variables. No secrets by default. See Secrets.
next - Functions called with the outputs of this function. This allows chaining functions together into a workflow graph. See Dynamic routing.
name - The name of the function in the workflow. By default, it is the name of the Python function.
description - A description of the function in the workflow. Visible when viewing workflow details.
retries - The retry policy for the function. No retries by default if the function failed. See Retries.
timeout - The timeout for the function in seconds. The default is 5 minutes. See Timeouts.
use_ctx - If True, then request context is passed to the function. False by default. See Request Context.
accumulate - If not None, turns the function into a reducer. None by default. See Map-Reduce.
cacheable - If True, reusing previous function outputs is allowed. False by default. See Caching.
cpu - The number of CPUs available to the function. The default is 1.0 CPU. See CPU.
memory - The memory in GB available to the function. The default is 1.0 GB. See Memory.
ephemeral_disk - The ephemeral /tmp disk space available to the function in GB. The default is 2.0 GB. See Ephemeral Disk.
gpu - The GPU model and count available to the function. The default is None (no GPU). Please contact support@tensorlake.ai
to enable GPU support.
Alternatively, you can define a function as a class that inherits from TensorlakeCompute
and use its __init__(self)
constructor to run any initialization code once on function container startup.
Under the hood, all functions defined using the @tensorlake_function() decorator get converted into TensorlakeCompute instances.
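A minimal sketch of both styles, assuming tensorlake_function and TensorlakeCompute are importable from the tensorlake package; the attribute values, the name/description class attributes, and the run method name are illustrative and should be checked against the SDK reference:

```python
from tensorlake import TensorlakeCompute, tensorlake_function


# Decorator style: attributes are passed as keyword arguments to the decorator.
@tensorlake_function(description="Doubles a number", cpu=1.0, memory=1.0)
def double(value: int) -> int:
    return value * 2


# Class style: __init__ runs once on function container startup, which is a
# good place for one-time setup such as loading a model or opening a client.
class Greeter(TensorlakeCompute):
    name = "greeter"
    description = "Greets the caller using a template built at startup"

    def __init__(self):
        super().__init__()
        self.template = "Hello, {}!"  # built once per container, reused on every call

    def run(self, who: str) -> str:
        return self.template.format(who)
```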
Use cloudpickle if you want to pass complex Python objects between functions, such as Pandas dataframes, PyTorch tensors, PIL images, etc.
cloudpickle requires the objects to be serialized and deserialized on the same Python version, so all function containers involved must use the same Python version.
The input_encoder and output_encoder attributes can be used to change the serialization format. Currently supported formats are:
json - JSON serialization
cloudpickle - Cloudpickle serialization
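As a sketch, a function that passes a Pandas DataFrame downstream could switch its encoders to cloudpickle (the DataFrame payload is just an example; the encoder names come from the list above):

```python
import pandas as pd

from tensorlake import tensorlake_function


# cloudpickle lets complex Python objects such as DataFrames cross function
# boundaries; both containers must run the same Python version.
@tensorlake_function(output_encoder="cloudpickle")
def load_table(csv_path: str) -> pd.DataFrame:
    return pd.read_csv(csv_path)


@tensorlake_function(input_encoder="cloudpickle", output_encoder="json")
def summarize(df: pd.DataFrame) -> dict:
    # Returns a plain dict so the default json encoding works for the output.
    return {"rows": len(df), "columns": list(df.columns)}
```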
The function timeout in seconds is set using the timeout attribute.
The default timeout is 300 seconds (5 minutes). The minimum is 1 second and the maximum is 172800 seconds (48 hours).
Progress updates can be sent by the function to extend the timeout. See Request Context.
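For example, a function that waits on a slow external API could raise its timeout above the default (the 600-second value and the URL handling are illustrative):

```python
import urllib.request

from tensorlake import tensorlake_function


# Allow up to 10 minutes instead of the default 5 for a slow upstream service.
@tensorlake_function(timeout=600)
def fetch_report(report_url: str) -> str:
    with urllib.request.urlopen(report_url) as response:
        return response.read().decode("utf-8")
```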
The retry policy for the function is set using the retries attribute. By default, a failed function is not retried.
If the use_ctx function attribute is True, then the function gets a request context as its first parameter, named ctx.
The context has information about the current request and provides access to Tensorlake APIs for the current request.
By default, the request context is not passed to the function.
For example, if a function with a 4 minute timeout calls ctx.update_progress after 2 minutes of execution,
then the timeout is reset to 4 minutes from that point, allowing the function to run for another 4 minutes.
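A sketch of a long-running function that keeps its timeout alive, assuming ctx.update_progress can be called without arguments (check the Request Context reference for the exact signature):

```python
import time

from tensorlake import tensorlake_function


# timeout=240 gives the function a 4 minute window; each ctx.update_progress
# call resets that window, so steady progress keeps the function alive.
@tensorlake_function(use_ctx=True, timeout=240)
def process_batches(ctx, batch_ids: list[str]) -> int:
    processed = 0
    for batch_id in batch_ids:
        time.sleep(1)  # stand-in for real per-batch work
        processed += 1
        ctx.update_progress()  # extend the timeout while work is still advancing
    return processed
```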
If the cacheable function attribute is True, then Tensorlake assumes that the function returns the same outputs for the same inputs.
This allows Tensorlake to cache the outputs of the function and reuse them when the function is called with the same inputs again.
When cached outputs are used, the function is not executed. This speeds up requests and makes them cheaper to run.
The size of the cache and the caching duration are controlled by the Tensorlake Platform.
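A sketch of a deterministic function marked as cacheable; whether a given call is actually served from cache is decided by the platform:

```python
import hashlib

from tensorlake import tensorlake_function


# The function is pure: the same document text always produces the same digest,
# so Tensorlake can safely reuse a previously computed output.
@tensorlake_function(cacheable=True)
def fingerprint(document_text: str) -> str:
    return hashlib.sha256(document_text.encode("utf-8")).hexdigest()
```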
The number of CPUs available to the function is set using the cpu attribute. The minimum is 1.0 and the maximum is 8.0.
The default is 1.0. This is usually sufficient for functions that only call external APIs and do simple data processing.
Adding more CPUs is recommended for functions that do complex data processing or work with large datasets.
If functions use large multi-gigabyte inputs or produce large multi-gigabyte outputs, then at least 3 CPUs are recommended.
This results in the fastest download and upload speeds for the data.
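A sketch of requesting extra CPU for a step that handles multi-gigabyte payloads (the 3.0 value follows the guidance above; the function body is illustrative):

```python
from tensorlake import tensorlake_function


# 3 CPUs keep the download/upload of large inputs and outputs fast and leave
# headroom for the transformation itself.
@tensorlake_function(cpu=3.0)
def normalize_rows(rows: list[dict]) -> list[dict]:
    return [{**row, "normalized": True} for row in rows]
```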
The memory in GB available to the function is set using the memory attribute. The minimum is 1.0 and the maximum is 32.0.
The default is 1.0 GB. This is usually sufficient for functions that only call external APIs and do simple data processing.
Adding more memory is recommended for functions that do complex data processing or work with large datasets.
It’s recommended to set memory to at least 2x the size of the largest inputs and outputs of the function.
This is because when the inputs/outputs are deserialized/serialized, both the serialized and deserialized representations are kept in memory.
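A sketch applying the 2x rule to a function whose largest input is roughly a 4 GB DataFrame (the 8.0 value is illustrative):

```python
import pandas as pd

from tensorlake import tensorlake_function


# A ~4 GB input needs ~8 GB of memory so the serialized bytes and the
# deserialized DataFrame can coexist during deserialization.
@tensorlake_function(input_encoder="cloudpickle", memory=8.0)
def aggregate(df: pd.DataFrame) -> dict:
    return df.groupby("category").size().to_dict()
```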
Ephemeral disk space is available to the function at the /tmp path. It gets erased when its function container gets terminated.
It’s optimal for storing temporary files that are not needed after the function execution is completed.
Ephemeral disks are backed by fast SSD drives. Using other filesystem paths like /home/ubuntu for storing temporary files will result in slower performance.
Temporary files created using Python modules like tempfile are stored in ephemeral disk space inside /tmp.
The number of GB of ephemeral disk space available to the function is set using the ephemeral_disk attribute. The minimum is 2.0 and the maximum is 50.0.
The default is 2.0 GB. This is usually sufficient for functions that only call external APIs and do simple data processing.
If the function needs to temporarily store large files or datasets on disk, then the ephemeral_disk attribute should be increased accordingly.
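A sketch of a function that stages a large download in /tmp before measuring it (the 20.0 value and the URL handling are illustrative):

```python
import os
import tempfile
import urllib.request

from tensorlake import tensorlake_function


# Request 20 GB of /tmp so a large archive can be staged on the fast SSD-backed
# ephemeral disk; everything written there disappears with the container.
@tensorlake_function(ephemeral_disk=20.0)
def stage_and_measure(archive_url: str) -> int:
    with tempfile.TemporaryDirectory(dir="/tmp") as workdir:
        local_path = os.path.join(workdir, "archive.bin")
        urllib.request.urlretrieve(archive_url, local_path)
        return os.path.getsize(local_path)
```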
my_function is the start node of the workflow. The input to the workflow is passed to my_function.
Workflows are exposed as HTTP endpoints; the body of the request is passed to the start node of the workflow, in this case my_function.
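A sketch of a two-step workflow where my_function is the start node and its output flows to a second function through the next attribute described above; next is shown here taking a list of downstream functions, and the function bodies are illustrative:

```python
from tensorlake import tensorlake_function


@tensorlake_function()
def publish_summary(summary: str) -> str:
    return f"published: {summary}"


# my_function is the start node: the HTTP request body arrives here, and its
# output is passed on to publish_summary via next.
@tensorlake_function(next=[publish_summary])
def my_function(text: str) -> str:
    return text[:100]
```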
The default retry policy for all functions in a workflow is set using the retries attribute of the workflow.
Each function can override the default retry policy by setting its own retries attribute.
See Retries.
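A sketch of a per-function override, assuming the SDK exposes a Retries policy object with a max_retries field; the exact type and field names should be taken from the Retries reference, and the attribute may instead accept a simple count:

```python
from tensorlake import Retries, tensorlake_function  # Retries type assumed; see the Retries docs


# Override the workflow-wide default for a call that is known to be flaky.
@tensorlake_function(retries=Retries(max_retries=3))  # max_retries is an assumed field name
def call_partner_api(payload: dict) -> dict:
    return {"status": "ok", "echo": payload}
```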