# Tensorlake skills
Source: https://docs.tensorlake.ai/agent-skills

Tensorlake skills teach your coding agents to build production workflows with TensorLake's Sandbox and Orchestration SDKs.

Instead of treating TensorLake as just another API, **Tensorlake Skills** teach agents how to use TensorLake as infrastructure: coordinate workflows with the Orchestration SDK, run tasks in isolated environments with the Sandbox SDK, and compose reliable agent systems for production use.

## What You Can Build

Use it when you want your coding agent to build:

* Multi-agent applications with an orchestrator and specialist sub-agents
* Sandboxed coding or execution workflows
* Agent teams with separate workspaces
* Long-running or stateful agent systems
* Production-ready orchestration patterns

## What the Skill Does

It guides agents to:

* Use the **Orchestration SDK** for workflow logic and multi-agent coordination
* Use the **Sandbox SDK** for isolated code execution and real agent workspaces
* Combine both SDKs to build production-style agent systems
* Choose TensorLake patterns that are better than a single-agent or stateless approach

Works with any LLM provider (OpenAI, Anthropic) and any agent framework (LangChain, CrewAI, LlamaIndex). TensorLake is the infrastructure layer; bring your own models and frameworks.

## Supported Agents

| Agent | File | How to Install |
| ------------------------------------------------------------- | ----------- | ---------------------------------------- |
| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | `SKILL.md` | [Claude Code installation](#claude-code) |
| [Google ADK](https://google.github.io/adk-docs/skills/) | `SKILL.md` | [Google ADK installation](#google-adk) |
| [OpenAI Codex](https://openai.com/index/codex/) | `AGENTS.md` | [Codex installation](#openai-codex) |

## Installation

### Any Agent

```bash
npx skills add tensorlakeai/tensorlake-skills
```

### Claude Code

Clone the repo and copy the skill into your project's `.claude/skills/` directory:

```bash
git clone https://github.com/tensorlakeai/tensorlake-skills /tmp/tensorlake-skills
mkdir -p .claude/skills/tensorlake
cp -r /tmp/tensorlake-skills/SKILL.md /tmp/tensorlake-skills/references .claude/skills/tensorlake/
rm -rf /tmp/tensorlake-skills
```

Or for global access across all projects:

```bash
git clone https://github.com/tensorlakeai/tensorlake-skills /tmp/tensorlake-skills
mkdir -p ~/.claude/skills/tensorlake
cp -r /tmp/tensorlake-skills/SKILL.md /tmp/tensorlake-skills/references ~/.claude/skills/tensorlake/
rm -rf /tmp/tensorlake-skills
```

### Google ADK

Install the skill by adding the `SKILL.md` file to your ADK agent's skill directory. See the [Google ADK skills documentation](https://google.github.io/adk-docs/skills/) for details.

### OpenAI Codex

Install the skill by adding the `AGENTS.md` file to your Codex agent configuration. See the [OpenAI Codex documentation](https://openai.com/index/codex/) for details.

Works with Claude Code, Cursor, Cline, GitHub Copilot, Windsurf, and more via [skills.sh](https://skills.sh).

## Setup

TensorLake requires a `TENSORLAKE_API_KEY` configured in the local environment.

1. Get an API key at [cloud.tensorlake.ai](https://cloud.tensorlake.ai)
2. Run `tensorlake login` or configure the variable through your shell profile, `.env` file, or secret manager

Do not paste API keys into chat, commit them to source control, or print them in terminal output.

## The Skill Triggers Automatically

The skill activates when you ask the agent to:

* Build agentic workflows or multi-agent pipelines
* Run LLM-generated code in a secure sandbox
* Orchestrate complex multi-step AI applications
* Integrate TensorLake with any LLM, framework, database, or API
* Answer questions about TensorLake APIs or documentation

## Source

The skill is open source and available on GitHub: [tensorlakeai/tensorlake-skills](https://github.com/tensorlakeai/tensorlake-skills)

# Root
Source: https://docs.tensorlake.ai/api-reference/root

get /

# Create Dataset
Source: https://docs.tensorlake.ai/api-reference/v2/datasets/create

post /documents/v2/datasets

Create an ingestion workflow for structured extraction or document parsing.

A dataset is a collection of settings that helps organize documents from the same domain and enables focused document intelligence. The dataset's name must be unique.

*Your data is **not** sent to a third-party service (OpenAI, Anthropic, etc.); we use our own models to parse the document.*

To read more about the configuration options, see the [Parse Documents](../parse/parse) endpoint.

# Data
Source: https://docs.tensorlake.ai/api-reference/v2/datasets/data

get /documents/v2/datasets/{dataset_id}/data

List all the parse jobs associated with a specific dataset.

This endpoint allows you to retrieve the status and metadata of each parse job that has been submitted under the specified dataset.

# Delete
Source: https://docs.tensorlake.ai/api-reference/v2/datasets/delete

delete /documents/v2/datasets/{dataset_id}

## Delete Dataset

Delete a dataset from your current project based on the API key you are using. Deleting a dataset removes every output generated by the dataset, but not the files associated with it.

# Get
Source: https://docs.tensorlake.ai/api-reference/v2/datasets/get

get /documents/v2/datasets/{dataset_id}

Get the details of a specific dataset associated with your project.

This endpoint allows you to retrieve information about the dataset, including its ID, name, description, and any associated metadata. The dataset's settings can be modified using the [Update Dataset](./update) endpoint.

The properties of the dataset include:

* `name`: The name given to the dataset.
* `dataset_id`: The unique identifier for the dataset.
* `description`: A brief description of the dataset, if provided.
* `status`: The current status of the dataset (e.g., `idle`, `processing`). If the dataset has at least one parse job in the `processing` or `pending` state, the dataset status will be `processing`.
* `created_at`: The timestamp when the dataset was created, formatted as an RFC 3339 string.
* `updated_at`: The timestamp when the dataset was last updated, formatted as an RFC 3339 string.

# List Datasets
Source: https://docs.tensorlake.ai/api-reference/v2/datasets/list

get /documents/v2/datasets

List all the datasets in your organization.

# Parse
Source: https://docs.tensorlake.ai/api-reference/v2/datasets/parse

post /documents/v2/datasets/{dataset_id}/parse

Use the Dataset's configuration to parse a document and get parsed results in the Dataset.

This endpoint allows you to submit a file for parsing using the settings defined in a specific dataset.

## Using a file

When submitting a parse job to the dataset, you can provide the content of the file in one of three ways:

1. `file_id`: The ID of a file that has been previously uploaded to the [Upload File](../../v2/files/upload) endpoint. This is the most common method.
2. `file_url`: A publicly accessible URL that points to the file you want to parse. The API will download the file from this URL. Redirects are also supported, but the URL and the `Location` header must point to a file that is publicly accessible.
3. `raw_text`: Raw text content, if you want to perform structured extraction from non-file sources, such as emails, HTML, CSV, XML, etc.

# Update a dataset's settings
Source: https://docs.tensorlake.ai/api-reference/v2/datasets/update

put /documents/v2/datasets/{dataset_id}

Change the settings or metadata for a dataset.

Changes to a dataset's settings are not retroactive. They only apply to new document parsing or structured extraction executions.

The unique name constraint is still enforced. If you change the name of the dataset, the new name must be unique within your organization.

# Edit Document
Source: https://docs.tensorlake.ai/api-reference/v2/edit

post /documents/v2/edit

Submit an uploaded file, an internet-reachable URL, or any kind of raw text for document editing. If you have configured a webhook, we will notify you when the job is complete, be it a success or a failure.

The API will edit the document based on the provided prompt and options. Once submitted, the API will return a job ID.

# Delete file
Source: https://docs.tensorlake.ai/api-reference/v2/files/delete

delete /documents/v2/files/{file_id}

This operation allows you to delete a file from the Tensorlake Cloud. The file will be removed from the project specified by the API key used in the request.

This operation is not reversible. Once a file is deleted, it cannot be recovered.

Deleting a file **does not** delete the parse jobs associated with it. If you want to delete the parse jobs, you need to do that separately.

# Get file metadata
Source: https://docs.tensorlake.ai/api-reference/v2/files/get-metadata

get /documents/v2/files/{file_id}/metadata

Get the metadata of a specific file in the Tensorlake Cloud.

This endpoint allows you to retrieve detailed information about a file, including its ID, name, size, type, and any associated labels.

# List Files
Source: https://docs.tensorlake.ai/api-reference/v2/files/list

get /documents/v2/files

This operation allows you to see every file that has been uploaded to the Project specified by the API key used in the request.

The response will include metadata about each file, such as the file ID, name, size, and type.

We use cursor-based pagination to return the files in pages. A page has the following fields:

* `items`: An array of file metadata, each containing the fields described below.
* `has_more`: A boolean indicating whether there are more files available beyond the current page.
* `next_cursor`: A base64-encoded cursor for the next page of results. If `has_more` is `false`, this field will be `null`.
* `prev_cursor`: A base64-encoded cursor for the previous page of results. If this is the first page, this field will be `null`.

# Upload File
Source: https://docs.tensorlake.ai/api-reference/v2/files/upload

put /documents/v2/files

This operation allows you to upload a file to the Tensorlake Cloud. The file will be associated with the project specified by the API key used in the request.

The file can be of any of the following types:

* PDF
* Word (DOCX)
* Spreadsheets (XLS, XLSX, XLSM, CSV)
* Presentations (PPTX, Apple Keynote)
* Images (PNG, JPG, JPEG)
* Raw text (plain text, HTML)

The file type is automatically detected based on the `Content-Type` header. If the `Content-Type` header is not provided, the file extension will be used to infer the type. If the file type cannot be determined, it will default to `application/octet-stream`.

We only keep one copy of the file, so uploading the same file multiple times will return the same `file_id`.

### Labels

Labels can be added to the file to help categorize the parse jobs associated with it. Labels are key-value pairs that can be used to filter and organize files. These should be provided in a `labels` text field in the multipart form data.
Labels are optional, but they can be very useful for organizing and managing parse jobs.

### Limits

There is an upload limit of 1 GB per file. If you need to upload larger files, please reach out to us at [support@tensorlake.ai](mailto:support@tensorlake.ai).

# Introduction
Source: https://docs.tensorlake.ai/api-reference/v2/introduction

Tensorlake API Reference

## Sandbox APIs

The Tensorlake Sandbox API lets you create isolated runtimes, inspect their state, update public ingress settings, snapshot running sandboxes, restore new sandboxes from snapshots, and suspend or resume them.

* Launch a sandbox for the current project.
* Capture the current filesystem and memory state of a running sandbox.
* Start a new sandbox from a previously created snapshot.

## Sandbox Runtime APIs

The sandbox proxy also exposes runtime endpoints for each running sandbox. These requests are routed through the sandbox proxy to the daemon inside the sandbox.

* Start processes, inspect status, send signals, write stdin, and stream output from a sandbox.
* Create interactive terminal sessions and attach over WebSocket.
* Read, write, delete, and list files through the sandbox proxy.

# Classify Document
Source: https://docs.tensorlake.ai/api-reference/v2/parse/classify

post /documents/v2/classify

Submit an uploaded file, an internet-reachable URL, or any kind of raw text for document parsing. If you have configured a webhook, we will notify you when the job is complete, be it a success or a failure.

Once submitted, the API will return a parse response with a `parse_id` field. You can query the status and results of the parse operation with the [Get Parse Result](./get) endpoint.

## Using page classes

For this operation, you must pass in an array of categories along with their descriptions to guide the classifier in the `page_classifications` field. The API will return the page class for each page of the document.
Each page class name must be unique within the document, and should be descriptive enough to convey the content of the page.

# Delete Parse Jobs
Source: https://docs.tensorlake.ai/api-reference/v2/parse/delete

delete /documents/v2/parse/{parse_id}

Delete a previously submitted parse job. This will remove the parse job and its associated settings from the system.

Deleting a parse job does not delete the original file used for parsing, nor does it affect any other parse jobs that may have been created from the same file.

# Extract Document
Source: https://docs.tensorlake.ai/api-reference/v2/parse/extract

post /documents/v2/extract

Submit an uploaded file, an internet-reachable URL, or any kind of raw text for document parsing. If you have configured a webhook, we will notify you when the job is complete, be it a success or a failure.

Once submitted, the API will return a parse response with a `parse_id` field. You can query the status and results of the parse operation with the [Get Parse Result](./get) endpoint.

## Using a schema

For this operation, you must provide one or more schemas to guide the extraction process. The schema must be in the form of a JSON Schema object.

The JSON Schema object can be provided in the `structured_extraction_options` array, which can contain multiple objects.

Known limitations include:

* The schema can only be at most 5 levels deep
* Root level fields must be objects

Page Classification labels can be combined with structured extraction to make the API perform structured extraction on a subset of pages.

## Accepted input types

This endpoint accepts three input types via the request body:

| Field | Description |
| ---------- | ---------------------------------------------------------------------------------- |
| `file_id` | ID of a file previously uploaded to Tensorlake |
| `file_url` | Publicly accessible URL of a file |
| `raw_text` | Raw text content (e.g. Markdown), submitted inline with `mime_type: text/markdown` |

If you're iterating on an extraction schema, you can run OCR once, upload the Markdown output as a file, and reuse the resulting `file_id` for every subsequent extraction call, paying only for extraction tokens, not document processing. See [Reusing OCR Output for Structured Extraction](/document-ingestion/parsing/structured-extraction#advanced-reusing-ocr-output-for-structured-extraction).

# Get Parse Result
Source: https://docs.tensorlake.ai/api-reference/v2/parse/get

get /documents/v2/parse/{parse_id}

Retrieve the results of a previously submitted parse job.

The response will include:

* Parsed content
* Markdown (chunked if a chunking strategy is specified)
* Pages
* Structured extraction results (if schemas are provided during the parse request)
* Page classification results (if page classifications are provided during the parse request)

## Response Structure

When the job finishes successfully, the response will contain a JSON object with the following fields:

### pages

The `pages` field contains a JSON representation of the chunks of the page/document. Each page is represented as an object with the following properties:

* `page_number`: The page number of the document.
* `page_fragments`: An array of document elements, each with:
  * `content`: The content of the fragment.
  * `fragment_type`: The type of the fragment (e.g., text, image, table).
  * `bbox`: The bounding box of the fragment, represented as an object with `x1`, `y1`, `x2`, and `y2` coordinates.

### chunks

The `chunks` field contains an array of text chunks extracted from the document. Each chunk is an object with a property called `content`, which is the text content of the chunk.

If a chunking strategy was specified during the parse request, the text will be chunked accordingly.

### structured\_data

The `structured_data` field contains a JSON object with every `schema_name` you provided in the parse request as a key.
Each entry represents a structured data item extracted from the document, adhering to the specified schema.

For example, if you provided the following schema for an invoice:

```json
{
  "title": "Invoice",
  "type": "object",
  "properties": {
    "invoice_number": { "type": "string" },
    "date": { "type": "string", "format": "date" },
    "total_amount": { "type": "number" },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "quantity": { "type": "number" },
          "price": { "type": "number" }
        }
      }
    }
  }
}
```

The `structured_data` field will contain objects that match that schema, such as:

```json
{
  "invoice_number": "12345",
  "date": "2023-10-01",
  "total_amount": 100.0,
  "items": [
    { "description": "Item 1", "quantity": 2, "price": 50.0 }
  ]
}
```

If our models were unable to find any text that complies with the schema, the `structured_data` field will be `null`. This can happen if the document does not contain any text that matches the schema you provided.

### Errors

If a parse job is marked as `failure`, the `errors` field will contain an object with details about the error.

## Lifecycle of a parse operation

The `status` field will indicate the current state of the parse job. Possible values are:

* `pending`: The job is waiting to be processed.
* `processing`: The job is currently being processed.
* `successful`: The job has been successfully completed and the results are available.
* `failure`: The job has failed, and the `errors` field will contain details about the error.

You can access the `structured_data`, `chunks`, and `pages` fields only when the job is in the `successful` state.

# List Parse Jobs
Source: https://docs.tensorlake.ai/api-reference/v2/parse/list

get /documents/v2/parse

Retrieve a list of all parse jobs that have been submitted.

This endpoint allows you to see the status and metadata of each parse job.

The endpoint is paginated.
A page has the following fields:

* `items`: An array of parse jobs, each containing the fields described below.
* `has_more`: A boolean indicating whether there are more parse jobs available beyond the current page.
* `next_cursor`: A base64-encoded cursor for the next page of results. If `has_more` is `false`, this field will be `null`.
* `prev_cursor`: A base64-encoded cursor for the previous page of results. If this is the first page, this field will be `null`.

The response will include a page of parse jobs, each containing the following fields:

* `parse_id`: The unique identifier for the parse job.
* `status`: The current status of the parse job (e.g., `pending`, `processing`, `successful`, `failure`).
* `created_at`: The RFC 3339 timestamp when the parse job was created.
* `finished_at`: The RFC 3339 timestamp when the parse job was completed or failed.
* `options`: The configuration options used for the parse job, such as the file ID, file URL, raw text, MIME type, and structured extraction options.

### Filters

You can filter the list of parse jobs by providing query parameters:

* `cursor`: A base64-encoded cursor for pagination. If not provided, the first page will be returned.
* `direction`: The direction of pagination. Can be `next` or `prev`. Defaults to `next`.
* `limit`: The maximum number of parse jobs to return per page. Defaults to 100, with a maximum of 1000.
* `filename`: Filter by the original filename of the file used for parsing. This is useful to find parse jobs related to a specific file.
* `status`: Filter by the status of the parse job. Can be `pending`, `processing`, `successful`, or `failure`.
* `id`: Filter by the unique identifier of the parse job. This is useful to retrieve a specific parse job, but it is preferable to use the [Get Parse Result](./get) endpoint for that purpose.
* `created_after`: Filter by the creation date of the parse job. Only parse jobs created after this date will be returned.
  The date should be in RFC 3339 format.
* `created_before`: Filter by the creation date of the parse job. Only parse jobs created before this date will be returned. The date should be in RFC 3339 format.
* `finished_after`: Filter by the completion date of the parse job. Only parse jobs completed after this date will be returned. The date should be in RFC 3339 format.
* `finished_before`: Filter by the completion date of the parse job. Only parse jobs completed before this date will be returned. The date should be in RFC 3339 format.

# Parse
Source: https://docs.tensorlake.ai/api-reference/v2/parse/parse

post /documents/v2/parse

Submit an uploaded file, an internet-reachable URL, or any kind of raw text for document parsing. If you have configured a webhook, we will notify you when the job is complete, be it a success or a failure.

This API is an advanced version of the [Extract Document](./extract), [Classify Document](./classify), and [Read Document](./read) endpoints. We recommend using this API for more advanced use cases, e.g., getting structured data and page classifications in a single request.

The API will convert the document into markdown, and provide document layout information. You can also classify pages into categories and perform structured extraction using JSON Schema.

Once submitted, the API will return a parse response with a `parse_id` field. You can query the status and results of the parse operation with the [Get Parse Result](./get) endpoint.

## Using a file

When submitting a parse job, you can provide the content of the file in one of three ways:

1. `file_id`: The ID of a file that has been previously uploaded to the [Upload File](../../v2/files/upload) endpoint. This is the most common method.
2. `file_url`: A publicly accessible URL that points to the file you want to parse. The API will download the file from this URL. Redirects are also supported, but the URL and the `Location` header must point to a file that is publicly accessible.
3. `raw_text`: Raw text content, if you want to perform structured extraction from non-file sources, such as emails, HTML, CSV, XML, etc.

The API will attempt to detect the mime-type automatically based on the file extension. You can provide a `mime_type` field to override the inferred mime-type. This is useful if you know the content type of the file and want to ensure the model interprets it correctly.

## Page classification

You can classify pages of a document into categories, or tags. Pass in an array of categories along with their descriptions to guide the classifier in the `page_classifications` field. The API will return the page class for each page of the document.

## Structured extraction

For structured extraction, you can provide one or more schemas to guide the extraction process. The schema must be in the form of a JSON Schema object.

The JSON Schema object can be provided in the `structured_extraction_options` array, which can contain multiple objects.

Known limitations include:

* The schema can only be at most 5 levels deep
* All fields must be required
* Root level fields must be objects

Page Classification labels can be combined with structured extraction to make the API perform structured extraction on a subset of pages.

# Read Document
Source: https://docs.tensorlake.ai/api-reference/v2/parse/read

post /documents/v2/read

Submit an uploaded file, an internet-reachable URL, or any kind of raw text for document parsing. If you have configured a webhook, we will notify you when the job is complete, be it a success or a failure.

The API will convert the document into markdown, and provide document layout information.

Once submitted, the API will return a parse response with a `parse_id` field. You can query the status and results of the parse operation with the [Get Parse Result](./get) endpoint.

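Because parse, read, extract, and classify jobs are asynchronous, a client typically polls Get Parse Result until the job leaves the `pending` and `processing` states. The following is a minimal sketch of that loop, not an official client. The `wait_for_parse` name, the `fetch` callable, the polling interval, and the stubbed responses (including the `parse_123` ID) are assumptions made for illustration; in real use, `fetch` would issue `GET /documents/v2/parse/{parse_id}` and return the decoded JSON body.

```python
import time
from typing import Any, Callable, Dict

# Terminal values of the documented `status` field.
TERMINAL_STATES = {"successful", "failure"}


def wait_for_parse(
    fetch: Callable[[str], Dict[str, Any]],
    parse_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `fetch` until the job is `successful` or `failure`, then return it."""
    for _ in range(max_attempts):
        result = fetch(parse_id)
        if result.get("status") in TERMINAL_STATES:
            return result
        sleep(interval_s)
    raise TimeoutError(f"parse job {parse_id} did not finish in {max_attempts} attempts")


# Offline demonstration: a stub stands in for the HTTP call (hypothetical data).
_responses = iter([
    {"parse_id": "parse_123", "status": "pending"},
    {"parse_id": "parse_123", "status": "processing"},
    {"parse_id": "parse_123", "status": "successful", "chunks": []},
])
final = wait_for_parse(lambda _id: next(_responses), "parse_123", sleep=lambda _s: None)
print(final["status"])
```

Injecting `fetch` and `sleep` keeps the loop independent of any HTTP library and lets it run in tests without waiting. If a webhook is configured, polling like this is only needed as a fallback.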
# Close Process Stdin
Source: https://docs.tensorlake.ai/api-reference/v2/processes/close-stdin

post /api/v1/processes/{pid}/stdin/close

Close a process stdin pipe and deliver EOF to the process.

Close a process's stdin pipe and deliver EOF. The process must be started with `stdin_mode: "pipe"` for this endpoint to work.

## Endpoint

```http
POST /api/v1/processes/{pid}/stdin/close
```

## Example Request

```bash
curl -X POST https://.sandbox.tensorlake.ai/api/v1/processes/42/stdin/close \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

## Response

Tensorlake returns `204 No Content` when stdin is closed.

If stdin was not opened in `pipe` mode, Tensorlake returns `400 Bad Request`. Missing processes return `404 Not Found`.

# Follow Process Output
Source: https://docs.tensorlake.ai/api-reference/v2/processes/follow-output

get /api/v1/processes/{pid}/output/follow

Replay captured output and follow live combined output over Server-Sent Events.

Replay captured output and follow new output over Server-Sent Events.

## Endpoint

```http
GET /api/v1/processes/{pid}/output/follow
```

Use `curl -N` or an SSE client so the connection stays open.

## Example Request

```bash
curl -N https://.sandbox.tensorlake.ai/api/v1/processes/42/output/follow \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

## SSE Events

Tensorlake first replays any captured output, then streams live events:

```text
event: output
data: {"line":"Processing item 1/10","timestamp":1710000000000,"stream":"stdout"}

event: output
data: {"line":"Processing item 2/10","timestamp":1710000001000,"stream":"stdout"}

event: eof
data: {}
```

`/output/follow` follows combined output and includes `stream` set to `stdout` or `stderr`. If you need a single stream, use [Follow Process Stdout](/api-reference/v2/processes/follow-stdout) or [Follow Process Stderr](/api-reference/v2/processes/follow-stderr).
When the process exits and the output stream closes, Tensorlake sends `event: eof`.

# Follow Process Stderr
Source: https://docs.tensorlake.ai/api-reference/v2/processes/follow-stderr

get /api/v1/processes/{pid}/stderr/follow

Replay captured stderr and follow live stderr over Server-Sent Events.

Replay captured stderr and follow new stderr over Server-Sent Events.

## Endpoint

```http
GET /api/v1/processes/{pid}/stderr/follow
```

Use `curl -N` or an SSE client so the connection stays open.

## Example Request

```bash
curl -N https://.sandbox.tensorlake.ai/api/v1/processes/42/stderr/follow \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

## SSE Events

Tensorlake first replays any captured stderr, then streams live events:

```text
event: output
data: {"line":"Traceback (most recent call last):","timestamp":0}

event: output
data: {"line":"ValueError: invalid input","timestamp":1710000001000}

event: eof
data: {}
```

Stderr-only events do not include a `stream` field. If you need merged stdout and stderr with stream tags, use [Follow Process Output](/api-reference/v2/processes/follow-output).

# Follow Process Stdout
Source: https://docs.tensorlake.ai/api-reference/v2/processes/follow-stdout

get /api/v1/processes/{pid}/stdout/follow

Replay captured stdout and follow live stdout over Server-Sent Events.

Replay captured stdout and follow new stdout over Server-Sent Events.

## Endpoint

```http
GET /api/v1/processes/{pid}/stdout/follow
```

Use `curl -N` or an SSE client so the connection stays open.

## Example Request

```bash
curl -N https://.sandbox.tensorlake.ai/api/v1/processes/42/stdout/follow \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

## SSE Events

Tensorlake first replays any captured stdout, then streams live events:

```text
event: output
data: {"line":"Processing item 1/10","timestamp":0}

event: output
data: {"line":"Processing item 2/10","timestamp":1710000001000}

event: eof
data: {}
```

Stdout-only events do not include a `stream` field. If you need merged stdout and stderr with stream tags, use [Follow Process Output](/api-reference/v2/processes/follow-output).

# Get Process
Source: https://docs.tensorlake.ai/api-reference/v2/processes/get

get /api/v1/processes/{pid}

Retrieve process metadata and current status for a sandbox process.

Get the current status and metadata for one process.

## Endpoint

```http
GET /api/v1/processes/{pid}
```

## Example Request

```bash
curl https://.sandbox.tensorlake.ai/api/v1/processes/42 \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

## Response

Tensorlake returns `200 OK` with the process metadata:

```json
{
  "pid": 42,
  "status": "running",
  "exit_code": null,
  "signal": null,
  "stdin_writable": false,
  "command": "python",
  "args": ["-m", "http.server", "8080"],
  "started_at": 1710000000000,
  "ended_at": null
}
```

If the process does not exist, Tensorlake returns `404 Not Found`.

# Sandbox Processes API Overview
Source: https://docs.tensorlake.ai/api-reference/v2/processes/introduction

Manage sandbox processes through the sandbox proxy.

The sandbox process API is exposed through each sandbox's management hostname, not `https://api.tensorlake.ai`.

```text
https://.sandbox.tensorlake.ai
```

These endpoints are proxied through the sandbox proxy to the daemon running inside the sandbox. Use them to start background processes, inspect status, send signals, write stdin, and retrieve or follow captured output.

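The follow endpoints stream Server-Sent Events as `event:` and `data:` lines. As a rough sketch, such a stream could be decoded as shown below. `parse_sse` is a hypothetical helper, not part of any Tensorlake SDK, and it assumes events are separated by blank lines, as in standard SSE framing. The sample lines are taken from the Follow Process Output example.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_name, decoded_data) pairs; stop after the `eof` event."""
    event, data_parts = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and data_parts:
            # A blank line terminates the current event.
            yield event, json.loads("\n".join(data_parts))
            if event == "eof":
                return
            event, data_parts = "message", []
    if data_parts:
        # Flush a final event that was not followed by a blank line.
        yield event, json.loads("\n".join(data_parts))


sample: List[str] = [
    "event: output",
    'data: {"line":"Processing item 1/10","timestamp":1710000000000,"stream":"stdout"}',
    "",
    "event: eof",
    "data: {}",
    "",
]
events = list(parse_sse(sample))
for name, data in events:
    print(name, data.get("line"))
```

In practice the `lines` iterable would come from a streaming HTTP response that is read line by line; the parser itself does not depend on how the bytes arrive.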
Include `Authorization: Bearer $TENSORLAKE_API_KEY` on requests to the sandbox proxy.

* Launch a new process inside a sandbox.
* Enumerate the processes tracked by the sandbox daemon.
* Inspect the current status and metadata for one process.
* Deliver a POSIX signal such as `SIGTERM` or `SIGKILL`.
* Write to stdin or close stdin for a running process.
* Read captured stdout, stderr, or combined output.
* Replay existing output and follow new output over SSE.
* Force-terminate a running process with `SIGKILL`.

# Kill Process
Source: https://docs.tensorlake.ai/api-reference/v2/processes/kill

delete /api/v1/processes/{pid}

Force-terminate a process with `SIGKILL`.

Kill a running process with `SIGKILL`.

## Endpoint

```http
DELETE /api/v1/processes/{pid}
```

## Example Request

```bash
curl -X DELETE https://.sandbox.tensorlake.ai/api/v1/processes/42 \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

## Response

Tensorlake returns `204 No Content` when the process is terminated.

If you want a graceful shutdown first, use [Send Signal](/api-reference/v2/processes/signal) with `15` before falling back to `SIGKILL`.

# List Processes
Source: https://docs.tensorlake.ai/api-reference/v2/processes/list

get /api/v1/processes

List the processes tracked inside a sandbox through the sandbox proxy.

List the processes tracked inside a sandbox.

## Endpoint

```http
GET /api/v1/processes
```

## Example Request

```bash
curl https://.sandbox.tensorlake.ai/api/v1/processes \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

## Response

Tensorlake returns `200 OK`:

```json
{
  "processes": [
    {
      "pid": 42,
      "status": "running",
      "exit_code": null,
      "signal": null,
      "stdin_writable": false,
      "command": "python",
      "args": ["-m", "http.server", "8080"],
      "started_at": 1710000000000,
      "ended_at": null
    }
  ]
}
```

`status` is one of `running`, `exited`, or `signaled`.

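As a small illustration of consuming the list response, the sketch below groups process IDs by the documented `status` values. The helper name `pids_by_status` and the second, exited process entry in the sample data are made up for this example; only the `processes`, `pid`, and `status` fields come from the response shown above.

```python
from collections import defaultdict
from typing import Any, Dict, List


def pids_by_status(list_response: Dict[str, Any]) -> Dict[str, List[int]]:
    """Group process IDs from a List Processes response by their `status`."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for proc in list_response.get("processes", []):
        grouped[proc["status"]].append(proc["pid"])
    return dict(grouped)


# Sample data shaped like the documented response.
sample = {
    "processes": [
        {"pid": 42, "status": "running", "exit_code": None},
        {"pid": 43, "status": "exited", "exit_code": 0},  # hypothetical entry
    ]
}
grouped = pids_by_status(sample)
print(grouped.get("running", []))
```

A grouping like this is a convenient starting point for cleanup logic, for example signalling only the processes that are still `running`.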
# Get Process Output
Source: https://docs.tensorlake.ai/api-reference/v2/processes/output

get /api/v1/processes/{pid}/output

Read the captured combined output for a process.

Read output that has already been captured for a process.

## Endpoint

```http
GET /api/v1/processes/{pid}/output
```

## Example Request

```bash
curl https://.sandbox.tensorlake.ai/api/v1/processes/42/output \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

## Response

Tensorlake returns `200 OK`:

```json
{
  "pid": 42,
  "lines": [
    "Serving HTTP on 0.0.0.0 port 8080",
    "127.0.0.1 - - [06/Apr/2026 22:20:00] \"GET / HTTP/1.1\" 200 -"
  ],
  "line_count": 2
}
```

`/output` returns the combined captured lines from stdout and stderr and does not include per-line stream tags.

If you need a single stream, use [Get Process Stdout](/api-reference/v2/processes/stdout) or [Get Process Stderr](/api-reference/v2/processes/stderr). If you need stream-tagged live events, use [Follow Process Output](/api-reference/v2/processes/follow-output).

# Send Signal
Source: https://docs.tensorlake.ai/api-reference/v2/processes/signal

post /api/v1/processes/{pid}/signal

Send a POSIX signal such as `SIGTERM` or `SIGKILL` to a running process.

Send a POSIX signal to a running process.

## Endpoint

```http
POST /api/v1/processes/{pid}/signal
```

## Example Request

```bash
curl -X POST https://.sandbox.tensorlake.ai/api/v1/processes/42/signal \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"signal": 15}'
```

## Request Body

```json
{
  "signal": 15
}
```

Common values include `15` for `SIGTERM` and `9` for `SIGKILL`.

## Response

Tensorlake returns `200 OK`:

```json
{
  "success": true
}
```

If the process does not exist, Tensorlake returns `404 Not Found`. Invalid signals or signals sent to non-running processes return `400 Bad Request`.

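The graceful-shutdown sequence suggested on the Kill Process page (send `15` first, then fall back to `SIGKILL`) might be sketched as follows. The `stop_process` name, the `transport` callable, and the grace and polling intervals are assumptions for illustration; a real `transport` would send authenticated HTTP requests to the sandbox proxy host and return the status code with the decoded JSON body. The demonstration uses a stub so it runs offline.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple

# transport(method, path, json_body) -> (http_status, decoded_json_or_None)
Transport = Callable[[str, str, Optional[Dict[str, Any]]], Tuple[int, Optional[Dict[str, Any]]]]


def stop_process(
    transport: Transport,
    pid: int,
    grace_s: float = 5.0,
    poll_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Send SIGTERM, wait up to `grace_s` for exit, then fall back to SIGKILL."""
    transport("POST", f"/api/v1/processes/{pid}/signal", {"signal": 15})
    waited = 0.0
    while waited < grace_s:
        _, info = transport("GET", f"/api/v1/processes/{pid}", None)
        if info is not None and info.get("status") != "running":
            return "terminated"
        sleep(poll_s)
        waited += poll_s
    # Grace period elapsed: DELETE force-terminates with SIGKILL.
    transport("DELETE", f"/api/v1/processes/{pid}", None)
    return "killed"


# Offline demonstration with a stub transport (hypothetical responses).
calls: List[Tuple[str, str, Optional[Dict[str, Any]]]] = []
_statuses = iter(["running", "signaled"])


def fake_transport(method: str, path: str, body: Optional[Dict[str, Any]]):
    calls.append((method, path, body))
    if method == "GET":
        return 200, {"pid": 42, "status": next(_statuses)}
    return (200, {"success": True}) if method == "POST" else (204, None)


outcome = stop_process(fake_transport, 42, sleep=lambda _s: None)
print(outcome)
```

Keeping the HTTP layer behind `transport` makes the escalation logic easy to test; error handling for `400` and `404` responses is deliberately left out of this sketch.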
# Start Process Source: https://docs.tensorlake.ai/api-reference/v2/processes/start post /api/v1/processes Start a new process through the sandbox proxy on `https://.sandbox.tensorlake.ai`. Start a new process inside a sandbox. ## Endpoint ```http theme={null} POST /api/v1/processes ``` Use this endpoint on the sandbox proxy host: ```text theme={null} https://.sandbox.tensorlake.ai/api/v1/processes ``` ## Example Request ```bash theme={null} curl -X POST https://.sandbox.tensorlake.ai/api/v1/processes \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "command": "python", "args": ["-m", "http.server", "8080"], "env": {"PORT": "8080"}, "working_dir": "/workspace", "stdin_mode": "pipe", "stdout_mode": "capture", "stderr_mode": "capture" }' ``` ## Request Body ```json theme={null} { "command": "python", "args": ["-m", "http.server", "8080"], "env": {"PORT": "8080"}, "working_dir": "/workspace", "stdin_mode": "pipe", "stdout_mode": "capture", "stderr_mode": "capture" } ``` * `command` is required. * `args` defaults to `[]`. * `env` defaults to `{}`. * `working_dir` is optional. * `stdin_mode` accepts `closed` or `pipe`. The default is `closed`. * `stdout_mode` and `stderr_mode` accept `capture` or `discard`. The default is `capture`. ## Response Tensorlake returns `201 Created` with the started process metadata: ```json theme={null} { "pid": 42, "status": "running", "exit_code": null, "signal": null, "stdin_writable": true, "command": "python", "args": ["-m", "http.server", "8080"], "started_at": 1710000000000, "ended_at": null } ``` # Get Process Stderr Source: https://docs.tensorlake.ai/api-reference/v2/processes/stderr get /api/v1/processes/{pid}/stderr Read the captured stderr lines for a process. Read stderr that has already been captured for a process. 
## Endpoint

```http theme={null}
GET /api/v1/processes/{pid}/stderr
```

## Example Request

```bash theme={null}
curl https://.sandbox.tensorlake.ai/api/v1/processes/42/stderr \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

## Response

Tensorlake returns `200 OK`:

```json theme={null}
{
  "pid": 42,
  "lines": [
    "Traceback (most recent call last):",
    "ValueError: invalid input"
  ],
  "line_count": 2
}
```

If you need stdout only, use [Get Process Stdout](/api-reference/v2/processes/stdout). If you need both streams merged, use [Get Process Output](/api-reference/v2/processes/output).

# Process Stdin
Source: https://docs.tensorlake.ai/api-reference/v2/processes/stdin

post /api/v1/processes/{pid}/stdin

Write raw bytes to a process whose stdin was opened in `pipe` mode.

Write raw bytes to a process's stdin. The process must be started with `stdin_mode: "pipe"` for this endpoint to work.

## Write to Stdin

```http theme={null}
POST /api/v1/processes/{pid}/stdin
```

```bash theme={null}
# $'...' (ANSI-C quoting) sends a real trailing newline; inside "..." the \n
# would reach the process as two literal characters.
curl -X POST https://.sandbox.tensorlake.ai/api/v1/processes/42/stdin \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  -H "Content-Type: application/octet-stream" \
  --data-binary $'print(\'hello\')\n'
```

Tensorlake returns `204 No Content` when the bytes are accepted.

If stdin was not opened in `pipe` mode, Tensorlake returns `400 Bad Request`. Missing processes return `404 Not Found`.

To deliver EOF without killing the process, use [Close Process Stdin](/api-reference/v2/processes/close-stdin).

# Get Process Stdout
Source: https://docs.tensorlake.ai/api-reference/v2/processes/stdout

get /api/v1/processes/{pid}/stdout

Read the captured stdout lines for a process.

Read stdout that has already been captured for a process.
## Endpoint ```http theme={null} GET /api/v1/processes/{pid}/stdout ``` ## Example Request ```bash theme={null} curl https://.sandbox.tensorlake.ai/api/v1/processes/42/stdout \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" ``` ## Response Tensorlake returns `200 OK`: ```json theme={null} { "pid": 42, "lines": [ "Serving HTTP on 0.0.0.0 port 8080", "127.0.0.1 - - [06/Apr/2026 22:20:00] \"GET / HTTP/1.1\" 200 -" ], "line_count": 2 } ``` If you need stderr only, use [Get Process Stderr](/api-reference/v2/processes/stderr). If you need both streams merged, use [Get Process Output](/api-reference/v2/processes/output). # Create PTY Session Source: https://docs.tensorlake.ai/api-reference/v2/pty/create post /api/v1/pty Create a PTY-backed interactive terminal session through the sandbox proxy. Create a PTY session for interactive terminal access. ## Endpoint ```http theme={null} POST /api/v1/pty ``` Use this endpoint on the sandbox proxy host: ```text theme={null} https://.sandbox.tensorlake.ai/api/v1/pty ``` ## Example Request ```bash theme={null} curl -X POST https://.sandbox.tensorlake.ai/api/v1/pty \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "command": "/bin/bash", "args": ["-l"], "env": {"TERM": "xterm-256color"}, "working_dir": "/workspace", "rows": 24, "cols": 80 }' ``` ## Request Body ```json theme={null} { "command": "/bin/bash", "args": ["-l"], "env": {"TERM": "xterm-256color"}, "working_dir": "/workspace", "rows": 24, "cols": 80 } ``` * `command` is required. * `args` is optional. * `env` is optional. * `working_dir` is optional. * `rows` and `cols` are optional and default to `24` and `80`. * Tensorlake clamps `rows` to `1..500` and `cols` to `1..1000`. ## Response Tensorlake returns `201 Created`: ```json theme={null} { "session_id": "LYtJOrxE9Kz3bphPUDzuX", "token": "" } ``` Use `session_id` with the other PTY endpoints. Use `token` when connecting to the PTY WebSocket. 
If the sandbox already has 64 PTY sessions, Tensorlake returns `429 Too Many Requests` with code `TOO_MANY_SESSIONS`. # Get PTY Session Source: https://docs.tensorlake.ai/api-reference/v2/pty/get get /api/v1/pty/{session_id} Retrieve metadata for a single PTY session. Get metadata for one PTY session. ## Endpoint ```http theme={null} GET /api/v1/pty/{session_id} ``` ## Example Request ```bash theme={null} curl https://.sandbox.tensorlake.ai/api/v1/pty/ \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" ``` ## Response Tensorlake returns `200 OK`: ```json theme={null} { "session_id": "LYtJOrxE9Kz3bphPUDzuX", "pid": 42, "command": "/bin/bash", "args": ["-l"], "rows": 24, "cols": 80, "created_at": 1710000000000, "ended_at": null, "exit_code": null, "is_alive": true } ``` The PTY token is not returned from this endpoint. If the session does not exist, Tensorlake returns `404 Not Found`. # Sandbox PTY API Overview Source: https://docs.tensorlake.ai/api-reference/v2/pty/introduction Create and manage interactive PTY sessions through the sandbox proxy. The sandbox PTY API is exposed through each sandbox's management hostname, not `https://api.tensorlake.ai`. ```text theme={null} https://.sandbox.tensorlake.ai ``` Use PTY sessions when you need an interactive terminal, shell, or full-screen TUI inside a sandbox. Include `Authorization: Bearer $TENSORLAKE_API_KEY` on requests to the sandbox proxy. The WebSocket attach endpoint also requires the per-session PTY token returned from session creation. Start a new PTY-backed interactive session. Enumerate the PTY sessions tracked by the sandbox daemon. Inspect session metadata, terminal size, and liveness. Change the terminal rows and columns for an active session. Connect to a session over WebSocket and exchange terminal bytes. Terminate a PTY session. # Kill PTY Session Source: https://docs.tensorlake.ai/api-reference/v2/pty/kill delete /api/v1/pty/{session_id} Terminate a PTY session. Terminate a PTY session. 
## Endpoint ```http theme={null} DELETE /api/v1/pty/{session_id} ``` ## Example Request ```bash theme={null} curl -X DELETE https://.sandbox.tensorlake.ai/api/v1/pty/ \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" ``` ## Response Tensorlake returns `204 No Content`. The daemon first sends `SIGHUP` to the PTY session and, if it is still alive after a short grace period, follows up with `SIGKILL`. If the session does not exist, Tensorlake returns `404 Not Found`. # List PTY Sessions Source: https://docs.tensorlake.ai/api-reference/v2/pty/list get /api/v1/pty List the PTY sessions tracked inside a sandbox. List the PTY sessions tracked inside a sandbox. ## Endpoint ```http theme={null} GET /api/v1/pty ``` ## Example Request ```bash theme={null} curl https://.sandbox.tensorlake.ai/api/v1/pty \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" ``` ## Response Tensorlake returns `200 OK`: ```json theme={null} { "sessions": [ { "session_id": "LYtJOrxE9Kz3bphPUDzuX", "pid": 42, "command": "/bin/bash", "args": ["-l"], "rows": 24, "cols": 80, "created_at": 1710000000000, "ended_at": null, "exit_code": null, "is_alive": true } ] } ``` The PTY token is not included in list responses. # Resize PTY Session Source: https://docs.tensorlake.ai/api-reference/v2/pty/resize post /api/v1/pty/{session_id}/resize Resize the terminal dimensions for a PTY session. Resize a PTY session. ## Endpoint ```http theme={null} POST /api/v1/pty/{session_id}/resize ``` ## Example Request ```bash theme={null} curl -X POST https://.sandbox.tensorlake.ai/api/v1/pty//resize \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{"rows": 40, "cols": 120}' ``` ## Request Body ```json theme={null} { "rows": 40, "cols": 120 } ``` Tensorlake clamps `rows` to `1..500` and `cols` to `1..1000`. ## Response Tensorlake returns `204 No Content` when the resize is applied. If the session does not exist, Tensorlake returns `404 Not Found`. 
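Because the server clamps `rows` and `cols`, a client that tracks terminal size locally may want to apply the same limits before sending a resize, so its own view matches what the session ends up with. The Python sketch below mirrors the documented ranges; `clamp` and `resize_body` are hypothetical helper names, and whether the server clamps in exactly this way at the boundaries is an assumption.

```python
import json

ROWS_RANGE = (1, 500)   # documented clamp range for rows
COLS_RANGE = (1, 1000)  # documented clamp range for cols


def clamp(value: int, low: int, high: int) -> int:
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def resize_body(rows: int, cols: int) -> str:
    """Return a JSON resize body with rows/cols limited to the documented ranges."""
    return json.dumps(
        {"rows": clamp(rows, *ROWS_RANGE), "cols": clamp(cols, *COLS_RANGE)}
    )


print(resize_body(40, 120))  # {"rows": 40, "cols": 120}
```

The returned string can be used as the body of `POST /api/v1/pty/{session_id}/resize`.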
# Attach PTY WebSocket Source: https://docs.tensorlake.ai/api-reference/v2/pty/websocket get /api/v1/pty/{session_id}/ws Upgrade to a WebSocket connection for an interactive PTY session. Attach to a PTY session over WebSocket. ## Endpoint ```http theme={null} GET /api/v1/pty/{session_id}/ws ``` Use the WebSocket endpoint on the sandbox proxy host: ```text theme={null} wss://.sandbox.tensorlake.ai/api/v1/pty//ws ``` ## Authentication You must provide the PTY token returned from [Create PTY Session](/api-reference/v2/pty/create). Tensorlake accepts the token in either place: * Preferred: `X-PTY-Token: ` * Backward-compatible fallback: `?token=` The header form is preferred because query parameters are more likely to appear in access logs. ## WebSocket Protocol After connecting, send a binary `READY` frame immediately so Tensorlake can flush any buffered output. For an end-to-end example that creates the session, sends `READY`, runs a command, reads output, and closes cleanly, see [PTY Sessions](/sandboxes/pty-sessions). ### Client-to-server opcodes | Opcode | Meaning | Payload | | ------ | ------- | ----------------------------------------------------------- | | `0x00` | Data | Raw terminal input bytes | | `0x01` | Resize | `cols` as big-endian `u16`, then `rows` as big-endian `u16` | | `0x02` | Ready | No payload | ### Server-to-client opcodes | Opcode | Meaning | Payload | | ------ | ------- | ----------------------------- | | `0x00` | Data | Raw terminal output bytes | | `0x03` | Exit | Exit code as big-endian `i32` | ## Example Connection Header-based token: ```bash theme={null} wscat -H "X-PTY-Token: " -c "wss://.sandbox.tensorlake.ai/api/v1/pty//ws" ``` Query-string token: ```bash theme={null} wscat -c "wss://.sandbox.tensorlake.ai/api/v1/pty//ws?token=" ``` ## Connection Semantics * If the token is invalid, Tensorlake returns `403 Forbidden` with code `INVALID_TOKEN`. 
* If the session does not exist, Tensorlake returns `404 Not Found` with code `SESSION_NOT_FOUND`. * When the process exits, Tensorlake sends the `0x03` exit frame and then closes the WebSocket with reason `exit:`. * If the PTY session is terminated while the socket is open, Tensorlake closes the WebSocket with code `1001` and reason `session terminated`. * If you do not send `READY`, Tensorlake buffers output up to 1 MB before disconnecting the client. # Delete File Source: https://docs.tensorlake.ai/api-reference/v2/sandbox-files/delete delete /api/v1/files Delete a file through the sandbox proxy. Delete a file from a sandbox path. ## Endpoint ```http theme={null} DELETE /api/v1/files?path= ``` ## Example Request ```bash theme={null} curl -X DELETE "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/temp.txt" \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" ``` ## Response Tensorlake returns `204 No Content` when the file is deleted. If the file does not exist, Tensorlake returns `404 Not Found`. Paths containing `..` are rejected with `403 Forbidden`. # Sandbox Files API Overview Source: https://docs.tensorlake.ai/api-reference/v2/sandbox-files/introduction Read, write, delete, and list files through the sandbox proxy. The sandbox file API is exposed through each sandbox's management hostname, not `https://api.tensorlake.ai`. ```text theme={null} https://.sandbox.tensorlake.ai ``` Use these endpoints to read files, upload content, delete files, and list directory contents inside a sandbox. Include `Authorization: Bearer $TENSORLAKE_API_KEY` on requests to the sandbox proxy. Download file contents from a sandbox path. Upload bytes to a sandbox path. Remove a file from a sandbox. List files and directories at a sandbox path. # List Directory Source: https://docs.tensorlake.ai/api-reference/v2/sandbox-files/list get /api/v1/files/list List directory contents through the sandbox proxy. List the contents of a sandbox directory. 
## Endpoint ```http theme={null} GET /api/v1/files/list?path= ``` ## Example Request ```bash theme={null} curl "https://.sandbox.tensorlake.ai/api/v1/files/list?path=/workspace" \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" ``` ## Response Tensorlake returns `200 OK`: ```json theme={null} { "path": "/workspace", "entries": [ { "name": "src", "is_dir": true, "size": null, "modified_at": 1710000000000 }, { "name": "data.csv", "is_dir": false, "size": 24, "modified_at": 1710000001000 } ] } ``` Entries are sorted with directories first and then alphabetically. If the path is not a directory, Tensorlake returns `400 Bad Request`. Missing paths return `404 Not Found`. # Read File Source: https://docs.tensorlake.ai/api-reference/v2/sandbox-files/read get /api/v1/files Read a file through the sandbox proxy. Read a file from a sandbox path. ## Endpoint ```http theme={null} GET /api/v1/files?path= ``` ## Example Request ```bash theme={null} curl "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/data.csv" \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" ``` ## Response Tensorlake returns `200 OK` with `Content-Type: application/octet-stream` and the raw file bytes. If the path points to a directory, Tensorlake returns `400 Bad Request`. If the file does not exist, Tensorlake returns `404 Not Found`. Paths containing `..` are rejected with `403 Forbidden`. # Write File Source: https://docs.tensorlake.ai/api-reference/v2/sandbox-files/write put /api/v1/files Write raw bytes to a sandbox file path through the sandbox proxy. Write bytes to a sandbox path. ## Endpoint ```http theme={null} PUT /api/v1/files?path= ``` ## Example Request ```bash theme={null} curl -X PUT "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/config.json" \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/octet-stream" \ --data-binary '{"debug": true, "port": 8080}' ``` ## Response Tensorlake returns `204 No Content` when the write succeeds. 
Parent directories are created automatically if needed. Paths containing `..` are rejected with `403 Forbidden`. # Create Sandbox Source: https://docs.tensorlake.ai/api-reference/v2/sandboxes/create post /sandboxes Create an ephemeral or named sandbox. Launch an ephemeral or named sandbox. * Omit `name` to create an ephemeral sandbox. * Set `name` to create a named sandbox that supports suspend and resume. * Set `snapshot_id` to restore from a snapshot, or `image` to boot from a registered Sandbox Image. * For fresh creates, if `resources.disk_mb` is omitted, the sandbox uses the default 10 GB root disk (`10240` MiB). * With `image`, `resources.disk_mb` can be used to grow the root disk at create time (growth-only). * With `snapshot_id` from a filesystem snapshot, `resources.disk_mb` can be used to grow the root disk at create time (growth-only). # Delete Sandbox Source: https://docs.tensorlake.ai/api-reference/v2/sandboxes/delete delete /sandboxes/{sandbox_id} Terminate a sandbox. This operation is idempotent and returns success if the sandbox was already terminated. Terminate a sandbox. This call is idempotent. If the sandbox is already terminated, Tensorlake still returns success. # Get Sandbox Source: https://docs.tensorlake.ai/api-reference/v2/sandboxes/get get /sandboxes/{sandbox_id} Retrieve metadata for a sandbox in the current project, including its management `sandbox_url` when available. Retrieve metadata for a single sandbox. The response includes `sandbox_url` when the sandbox management API is reachable through public ingress. # List Sandboxes Source: https://docs.tensorlake.ai/api-reference/v2/sandboxes/list get /sandboxes List sandboxes. List sandboxes for the current project. Use `status=running` to query only sandboxes that are currently live in memory. Omitting the filter returns full sandbox history, including terminated sandboxes. 
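To make the `status=running` filter concrete, here is a minimal Python sketch that builds the list request with and without the filter. It assumes the endpoint lives on `https://api.tensorlake.ai`, the host used by the other sandbox lifecycle curl examples in this reference; `list_sandboxes_request` is a hypothetical helper name, and the request is constructed but not sent.

```python
import os
import urllib.parse
import urllib.request
from typing import Optional

API_BASE = "https://api.tensorlake.ai"  # assumed host for the sandbox lifecycle API


def list_sandboxes_request(status: Optional[str] = None) -> urllib.request.Request:
    """Build a GET /sandboxes request, optionally filtered by status."""
    query = "?" + urllib.parse.urlencode({"status": status}) if status else ""
    return urllib.request.Request(
        url=f"{API_BASE}/sandboxes{query}",
        method="GET",
        headers={"Authorization": f"Bearer {os.environ.get('TENSORLAKE_API_KEY', '')}"},
    )


live_only = list_sandboxes_request("running")  # only sandboxes currently live
full_history = list_sandboxes_request()        # includes terminated sandboxes
print(live_only.full_url)  # https://api.tensorlake.ai/sandboxes?status=running
```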
# Restore Sandbox Source: https://docs.tensorlake.ai/api-reference/v2/sandboxes/restore post /sandboxes Restore a sandbox from a previously created snapshot by calling the create endpoint with a `snapshot_id`. Restore a new sandbox from a previously created snapshot. ## Endpoint ```http theme={null} POST /sandboxes ``` To restore from a snapshot, call the standard create sandbox endpoint and include `snapshot_id` in the request body. * If the snapshot type is filesystem (default), the new sandbox restores the captured filesystem. You can override launch settings (including resources). * If the snapshot type is memory, the new sandbox restores filesystem, memory, and running processes exactly as they were. Image, resources (CPUs, memory), entrypoint, and secrets come from the snapshot and cannot be changed at restore time. ## Example Request ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "snapshot_id": "" }' ``` For filesystem snapshots, `resources.disk_mb` can be used at restore time to grow the root disk (growth-only). For the full request and response schema of `POST /sandboxes`, see [Create Sandbox](/api-reference/v2/sandboxes/create). For the end-to-end snapshot workflow, see [Snapshots](/sandboxes/snapshots). # Resume Sandbox Source: https://docs.tensorlake.ai/api-reference/v2/sandboxes/resume post /sandboxes/{sandbox_id}/resume Resume a suspended named sandbox from its suspend snapshot. Returns `202 Accepted` when resume begins and `200 OK` when the sandbox is already running. Resume a suspended named sandbox from its suspend snapshot. * This path accepts either the sandbox ID or the sandbox name. * Tensorlake returns `202 Accepted` when resume starts. * Tensorlake returns `200 OK` when the sandbox is already running. * If the sandbox is not suspended, or its suspend snapshot is missing or not ready, Tensorlake returns `400 Bad Request`. 
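The status codes listed above can be folded into a small lookup so calling code branches on an outcome rather than on raw numbers. This Python sketch is illustrative only: `resume_outcome` and the outcome labels are hypothetical, and any status code not listed on this page is reported as `unexpected`.

```python
RESUME_OUTCOMES = {
    202: "resuming",         # resume has started
    200: "already_running",  # nothing to do
    400: "not_resumable",    # not suspended, or suspend snapshot missing / not ready
}


def resume_outcome(status_code: int) -> str:
    """Map a Resume Sandbox HTTP status code to a coarse outcome label."""
    return RESUME_OUTCOMES.get(status_code, "unexpected")


print(resume_outcome(202))  # resuming
```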
Most sandbox-proxy requests to a suspended named sandbox also resume it automatically, so this endpoint is mainly useful when you want to wake the sandbox proactively before sending traffic. # Snapshot Sandbox Source: https://docs.tensorlake.ai/api-reference/v2/sandboxes/snapshot post /sandboxes/{sandbox_id}/snapshot Create a snapshot of a running sandbox so you can restore the same filesystem and memory state later. Create a snapshot of a running sandbox so you can restore the same filesystem and memory state later. You can optionally pass `snapshot_type` in the request body: * `filesystem` (default): captures filesystem state only and restores with a cold boot. * `memory`: captures filesystem, memory, and running process state and restores with a warm start. ## Endpoint ```http theme={null} POST /sandboxes/{sandbox_id}/snapshot ``` ## Example Request ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes//snapshot \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{"snapshot_type":"memory"}' ``` Use the created snapshot with [Restore Sandbox](/api-reference/v2/sandboxes/restore) when you want to boot a new sandbox from that saved state. For the broader snapshot lifecycle, including listing and deleting snapshots, see [Snapshots](/sandboxes/snapshots). # Suspend Sandbox Source: https://docs.tensorlake.ai/api-reference/v2/sandboxes/suspend post /sandboxes/{sandbox_id}/suspend Suspend a named running sandbox by snapshotting it and terminating the live container. Returns `202 Accepted` when suspension begins or is already in progress, and `200 OK` when the sandbox is already suspended. Suspend a named running sandbox by snapshotting it and terminating the live container. * This path accepts either the sandbox ID or the sandbox name. * Only named sandboxes can be suspended. Ephemeral sandboxes return `400 Bad Request`. * Tensorlake returns `202 Accepted` when suspension starts or is already in progress. 
* Tensorlake returns `200 OK` when the sandbox is already suspended.

# Update Sandbox
Source: https://docs.tensorlake.ai/api-reference/v2/sandboxes/update

patch /sandboxes/{sandbox_id}

Update proxy-visible sandbox settings such as public exposed ports and whether ingress can skip authentication checks.

Update public ingress settings for a sandbox.

This endpoint controls the sandbox proxy allowlist, including `exposed_ports` and `allow_unauthenticated_access`.

# Architecture
Source: https://docs.tensorlake.ai/applications/architecture

How Tensorlake's Application Runtime runs your code under the hood

This page describes the architecture of Tensorlake's Application Runtime. Tensorlake is a complex system with many moving parts. To help users build a mental model of how it works, this page documents the system architecture.

**Advanced topic.** You do not need to understand these details to effectively use Tensorlake. The details are documented here for those who wish to learn about them without having to go spelunking through the source code.

## High-Level Overview

When a request hits your application, the runtime creates a new sandbox in milliseconds and your agent starts in an isolated environment with its own filesystem. Every function decorated with `@function()` can run in its own remote sandbox with dedicated resources. From your code it looks like a normal function call, but under the hood the runtime is scheduling containers, managing state, and handling failures.

At a high level, the system looks like this:

```mermaid theme={null}
graph TD
    Client["Client (SDK / HTTP)"] --> Server

    subgraph Server["Server (Control Plane)"]
        direction LR
        API["HTTP API"]
        AppSched["Application<br/>Scheduler"]
        ContSched["Container<br/>Scheduler"]
        StateDB["State Store"]
    end

    Server -- "gRPC stream" --> DPA
    Server -- "gRPC stream" --> DPB

    subgraph DPA["Dataplane A"]
        LR1["Language Runtime"]
        LR2["Language Runtime"]
    end

    subgraph DPB["Dataplane B"]
        LR3["Language Runtime"]
        LR4["Language Runtime"]
    end
```

The **server** is the control plane. It receives requests from clients, persists all state, and runs two schedulers. The **application scheduler** manages the lifecycle of function calls: it builds the execution graph for each request, creates allocations, checkpoints outputs, and handles replay on failure. The **container scheduler** manages the infrastructure layer: it tracks resources across all dataplanes, places containers on worker nodes, manages warm pools, and scales containers up and down based on demand.

A **dataplane** manages containers on a pool of worker nodes. You can think of it as a regional cluster of compute capacity. Multiple dataplanes can run in parallel, and the server distributes work across them. Each dataplane maintains a persistent bidirectional gRPC stream with the server: it reports its current state (running containers, resource usage, allocation results) and receives new work assignments in return.

A **language runtime** is the sandbox that runs your code. Every `@function()` call runs in its own isolated container with its own filesystem, dependencies, and resource limits. When a function calls another function, the child runs in a separate sandbox, so a lightweight orchestrator can dispatch work to GPU-equipped containers without needing GPU resources itself. From your code, this is invisible.

## Why a Custom Scheduler

A common question is why Tensorlake built its own container scheduler instead of using Kubernetes. The short answer is that Kubernetes was designed for long-running services, not for workloads that create a new container for every request and need it running in milliseconds.

Tensorlake's execution model is fundamentally different from what Kubernetes expects. When a request arrives, the runtime creates a fresh sandbox — an isolated container with its own filesystem — in single-digit milliseconds. At peak load, the scheduler creates hundreds of these per second. In Kubernetes, creating a pod involves writing to etcd, passing through admission controllers, waiting for the kubelet to sync, and pulling images. This takes seconds at best, often longer. Creating a pod per request at this rate would overwhelm the Kubernetes control plane. Beyond raw speed, the container scheduler is tightly integrated with the application scheduler in ways that a general-purpose orchestrator can't be. It understands function-level container pools — warm pools, minimum counts, and buffer sizes per function — and uses this to make smarter placement decisions. Its eviction algorithm knows which containers have active allocations and never evicts them, prioritizing containers above pool buffers first. It tracks container affinity per function so it can route work to dataplanes that already have warm containers, avoiding cold starts entirely. None of these concepts exist in Kubernetes scheduling. The desired state model is also purpose-built for this workload. The server pushes desired state to dataplanes over a persistent gRPC stream, and dataplanes reconcile in real-time. Kubernetes uses a watch/list model over etcd that works well for long-running services but adds latency when you need the scheduling loop to react in milliseconds — for example, when a function completes and the next step in a workflow needs to start immediately. Finally, there is an operational argument. Deploying agents on Kubernetes means writing YAMLs, configuring Horizontal Pod Autoscalers, managing image pull policies, setting up KEDA or Knative for scale-to-zero, and running a separate durable execution server for crash recovery. Tensorlake collapses all of that into a single runtime. 
You deploy your Python code, and the scheduler handles the rest.

## The Server

The server is the single control plane for the entire system. It exposes the HTTP API that receives requests from clients and the SDK, persists all state to a durable store, and runs the two schedulers that coordinate all work. Every request, every function call, every allocation, and every container decision flows through the server.

When a request arrives, the server creates a **request context**, a record that tracks the full state of the request, including the function call graph, all function runs, and the final outcome. The request context is persisted immediately. From this point, the application scheduler and container scheduler work together to execute the request.

### Container Scheduler

The container scheduler is responsible for the infrastructure layer: deciding which containers run on which machines, and managing their lifecycle.

```mermaid theme={null}
graph LR
    subgraph CS["Container Scheduler"]
        direction TB
        RT["Resource Tracker<br/>CPU, memory, GPU per executor"]
        CP["Container Pools<br/>min, max, buffer per function"]
        PL["Placement Engine<br/>constraints, affinity, eviction"]
    end

    CS -- "create / terminate" --> DPA["Dataplane A"]
    CS -- "create / terminate" --> DPB["Dataplane B"]
```

The container scheduler maintains a real-time view of every executor (worker node) in the system: its total and free resources (CPU, memory, GPU), and every container running on it. It also tracks **container pools**, which group containers by function. Each pool has configurable minimums, maximums, and buffer sizes that control scaling behavior.

When the application scheduler needs a container for a function, the container scheduler first checks whether a **warm container** already exists in the function's pool, that is, a pre-initialized container with no active work. If one is available, it claims it immediately, avoiding cold-start latency entirely.

If no warm container is available, the scheduler runs its **placement engine**. It finds candidate executors that satisfy the function's resource requirements and constraints, then selects one. If no executor has enough free resources, the scheduler runs a **vacuum pass**: it looks for lower-priority containers that can be evicted to free up space. Eviction follows a priority order: containers above the pool's buffer count are evicted first, then those above the minimum, and only as a last resort those at or below the minimum. Containers with active allocations are never evicted.

The container scheduler communicates with dataplanes through a **desired state model**. Rather than issuing imperative commands, it declares the desired state of each container (running or terminated) and the dataplane reconciles its actual state to match. This makes the system resilient to transient failures: if a message is lost, the next reconciliation cycle corrects the drift.

Scaling is driven by demand. When requests arrive, new containers are created. When traffic drops, idle containers are terminated. Functions with no traffic have no running containers and incur no cost.
You can configure **warm pools** to keep a buffer of pre-initialized containers ready for latency-sensitive functions, or set **concurrency caps** to limit the total number of concurrent instances.

### Application Scheduler

The application scheduler manages the execution of your code: the function call graph, allocations, checkpointing, and replay.

```mermaid theme={null}
graph TD
    Req["Incoming Request"] --> RC["Request Context<br/>function call graph, runs, outcome"]
    RC --> FC["Function Calls<br/>nodes in the execution DAG"]
    FC --> FR["Function Runs<br/>execution instances with checkpointed outputs"]
    FR --> AL["Allocations<br/>unit of work assigned to a container"]
    AL --> CS["Container Scheduler<br/>finds or creates a container"]
```

When a request arrives, the application scheduler creates an initial **function call**, a node in the execution graph that represents a function to invoke with specific inputs. For each function call, it creates a **function run**, an execution instance that tracks status (pending, running, completed) and stores the checkpointed output when the function finishes.

To actually execute a function run, the application scheduler creates an **allocation**, a unit of work that binds a function run to a specific container. It first checks whether an existing container has capacity (based on the function's `max_concurrency` setting). If not, it asks the container scheduler to create a new one. The allocation is persisted and pushed to the dataplane through the desired state stream.

When a function calls another function, the language runtime reports the child function call back to the server. The application scheduler adds a new node to the execution graph, creates a function run for it, and the cycle repeats. This is how Tensorlake builds the full DAG of function calls for each request: the graph grows dynamically as your code executes.

When a function run completes, the application scheduler checkpoints its output. The output data is stored in object storage (not in the database), so your agents can pass large files between functions without workarounds. The scheduler then propagates the output to any downstream function calls that depend on it, creating new function runs as inputs become available.

**Replay** is how the system recovers from failures. When a request is replayed, the application scheduler walks the function call graph from the beginning. Function runs that already have checkpointed outputs return their results instantly without re-executing. The replay fast-forwards through completed work until it reaches the function that failed, then starts running it again from scratch.
From the application's perspective, it picks up right where it left off. **Retries** handle individual function failures. When a function run fails — whether from an exception, a container crash, or a timeout — the application scheduler checks the function's retry policy. If retries are available, it creates a new allocation and runs the function again with the same inputs. ## Dataplanes A dataplane manages the containers on a pool of worker nodes. It is the bridge between the server's scheduling decisions and actual code execution. ```mermaid theme={null} graph LR subgraph DP["Dataplane"] direction TB SR["State Reconciler"] HR["Heartbeat Reporter"] FEC1["Language Runtime Controller"] FEC2["Language Runtime Controller"] end Server -- "desired state
(gRPC stream)" --> DP DP -- "heartbeats
(every 5s)" --> Server ``` Each dataplane maintains two communication channels with the server. A **bidirectional gRPC stream** carries the desired state — the server pushes container specifications and allocations to the dataplane, and the dataplane acknowledges receipt. A **heartbeat** fires every five seconds, reporting the dataplane's current state: which containers are running, their resource usage, and the results of completed allocations. When the dataplane receives a new desired state, its **state reconciler** compares it against the actual state of the local system. If a container should exist but doesn't, it creates one. If a container should be terminated, it shuts it down. If an allocation needs to be executed, it routes it to the appropriate language runtime controller. Allocation results flow back to the server through the heartbeat channel. The dataplane buffers results and fragments large payloads across multiple heartbeats (with a 10MB limit per message) to avoid overwhelming the connection. Results are only removed from the buffer after the server acknowledges receipt, ensuring nothing is lost in transit. If the gRPC stream disconnects, the dataplane reconnects automatically. If heartbeats fail, it uses exponential backoff. The desired state model means that temporary disconnections don't cause inconsistency — the next successful sync brings everything back in line. ## Language Runtimes A language runtime is the sandbox that runs your function code. It is a container with its own filesystem, dependencies, and resource limits, managed by a controller on the dataplane. ```mermaid theme={null} graph LR AL["Allocation"] --> P["Preparing
download inputs,
presign URLs
"] P --> R["Running
execute function,
stream state
"] R --> F["Finalizing
upload outputs,
clean up
"] ``` When a language runtime receives an allocation, it processes it through a three-phase pipeline. In the **preparing** phase, the runtime downloads input data and presigns blob URLs for outputs. This phase does not occupy a concurrency slot, so the container can prepare multiple allocations in parallel while running others. In the **running** phase, the function code executes. The language runtime streams state updates back to the dataplane controller in real-time: progress updates, output blob requests, child function calls, and the final result. If the function calls another `@function()`-decorated function, the language runtime reports the child call to the server, which creates a new allocation for it. For **blocking calls**, the language runtime registers a watcher and pauses until the child function's result arrives from the server. In the **finalizing** phase, the runtime completes any multipart uploads, cleans up blob handles, and releases the concurrency slot. Each language runtime has a configurable `max_concurrency` that limits how many allocations it can execute simultaneously in the running phase. The application scheduler respects this limit when placing allocations — if all slots are full, it either queues the work or asks the container scheduler for a new container. ## Getting in Depth This has been a high-level overview of the Application Runtime architecture. The [durable execution model](/applications/durability), [crash recovery behavior](/applications/crash-recovery), [scaling configuration](/applications/scaling-agents), and [queuing behavior](/applications/scale-out-queuing) are all documented in more detail. How checkpointing and replay work to make your functions resilient to failures. How the server detects failures and re-schedules only the work that needs to re-run. Configure scaling behavior, warm pools, and concurrency limits for your functions. Functions, applications, decorators, and resource configuration. 
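The replay behavior described earlier on this page can be approximated in a few lines of plain Python. This is only a sketch under stated assumptions, not Tensorlake's implementation: the in-memory `checkpoints` dict stands in for checkpointed outputs in object storage, and `run_step`, `fetch`, `summarize`, and `run_with_replay` are hypothetical names invented for the illustration.

```python
from typing import Any, Callable

# Hypothetical stand-in for checkpointed function-run outputs.
checkpoints: dict[str, Any] = {}
# Records which steps actually executed (as opposed to being replayed).
executions: list[str] = []
attempts = {"summarize": 0}


def run_step(key: str, fn: Callable[..., Any], *args: Any) -> Any:
    """Return the checkpointed output if present, otherwise run fn and checkpoint it."""
    if key in checkpoints:
        return checkpoints[key]
    result = fn(*args)
    executions.append(key)
    checkpoints[key] = result
    return result


def fetch(doc_id: str) -> str:
    return f"contents of {doc_id}"


def summarize(text: str) -> str:
    # Fails on the first attempt to simulate a crash.
    attempts["summarize"] += 1
    if attempts["summarize"] == 1:
        raise RuntimeError("simulated container crash")
    return text.upper()


def request(doc_id: str) -> str:
    text = run_step("fetch", fetch, doc_id)
    return run_step("summarize", summarize, text)


def run_with_replay(doc_id: str) -> str:
    try:
        return request(doc_id)
    except RuntimeError:
        # Replay: walk the same steps from the beginning. Completed steps
        # return their checkpointed output; the failed step runs again.
        return request(doc_id)


result = run_with_replay("doc-1")
```

In this sketch `fetch` executes once even though the request is walked twice, which mirrors the idea that replay fast-forwards through completed work and only re-runs the function that failed.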
# Async Functions

Source: https://docs.tensorlake.ai/applications/async-functions

Use Python async/await with Tensorlake async functions. Run them concurrently to optimize resource usage and reduce latency.

An `async` Tensorlake function behaves like a regular Python `async` function. Calling it returns a coroutine that doesn't run until it's awaited or started with `asyncio.create_task()` or other `asyncio` module functions.

```python theme={null}
from tensorlake.applications import application, function


@function()
async def capitalize(text: str) -> str:
    return text.upper()


@application()
@function()
async def greet(name: str) -> str:
    # Calling the async Tensorlake function `capitalize` returns a coroutine.
    # `await` is available inside async Tensorlake functions such as `greet`.
    # `await` starts the `capitalize` coroutine and waits for it to complete, returning the result.
    capitalized: str = await capitalize(name)
    return f"Hello, {capitalized}!"
```

Coroutines returned by async Tensorlake functions behave almost the same way as [Futures](/applications/futures) used with sync Tensorlake functions.

### asyncio.create\_task

Use `asyncio.create_task()` to run a coroutine in the background without blocking on it. This returns an `asyncio.Task` that can be awaited later to get the result.

```python theme={null}
import asyncio

from tensorlake.applications import application, function


@function()
async def double(x: int) -> int:
    return x * 2


@application()
@function()
async def my_app(x: int) -> int:
    coroutine = double(x)
    # Starts the coroutine in the background and returns an asyncio.Task.
    task: asyncio.Task = asyncio.create_task(coroutine)
    # Do something else and then await the task to get the result.
    return await task
```

### Running coroutines in parallel with asyncio.gather

Use `asyncio.gather()` to run multiple coroutines in parallel and collect their results. This is the standard Python way to run async functions concurrently.
```python theme={null}
import asyncio

from tensorlake.applications import application, function


@function()
async def capitalize(text: str) -> str:
    return text.upper()


@function()
async def make_joke(name: str) -> str:
    return f"Why did {name} cross the road? To get to the other side!"


@application()
@function()
async def greet(name: str) -> str:
    # Start both function calls in parallel.
    capitalized, joke = await asyncio.gather(
        capitalize(name),
        make_joke(name),
    )
    return f"Hello, {capitalized}! {joke}"
```

### Non-blocking map and reduce operations

Calling `function.map(...)` or `function.reduce(...)` on an async function returns a coroutine.

```python theme={null}
from tensorlake.applications import application, function


@function()
async def double(x: int) -> int:
    return x * 2


@function()
async def add(a: int, b: int) -> int:
    return a + b


@application()
@function()
async def process_numbers(numbers: list[int]) -> int:
    # Calling .map() on an async function returns a coroutine.
    # `await` runs the map operation and blocks until all items are processed.
    doubled: list[int] = await double.map(numbers)
    # Calling .reduce() on an async function also returns a coroutine.
    total: int = await add.reduce(doubled)
    return total
```

The coroutines returned by `function.map()` or `function.reduce()` behave exactly the same as coroutines returned by async `function(...)` calls.

### Passing coroutines and asyncio.Tasks as inputs

Coroutines returned from async Tensorlake functions and `asyncio.Task` objects created with `asyncio.create_task()` from such coroutines can be passed as arguments to other function calls. Tensorlake automatically runs the coroutines or `asyncio.Task` objects, waits for them to complete, and uses their results as the argument values. This works exactly like [passing Futures as inputs](/applications/futures#passing-futures-as-inputs).
```python theme={null}
from tensorlake.applications import application, function


@function()
async def double(x: int) -> int:
    return x * 2


@function()
async def add(a: int, b: int) -> int:
    return a + b


@application()
@function()
async def my_app(x: int) -> int:
    a = double(x)
    b = double(x + 1)
    # Pass coroutines as function call arguments. Tensorlake runs both in parallel,
    # waits for them to complete, and uses their results as the arguments for `add`.
    return await add(a, b)
```

All input coroutines that don't depend on each other run in parallel, allowing Tensorlake to optimize resource usage and reduce overall application latency. A function call or map-reduce operation is blocked only while its input coroutines are running. Once all input coroutines complete, Tensorlake automatically runs the function call or the map-reduce operation.

### Wrapping coroutines and asyncio.Tasks into Python objects is not allowed

When passing Tensorlake coroutines or `asyncio.Task` objects created from them as arguments to function calls, or returning them as tail calls, they cannot be wrapped into other Python objects. For example, returning a list with a coroutine inside is not allowed. Tensorlake will not recognize the coroutine wrapped into the list. This is the same restriction as with [Futures](/applications/futures#wrapping-futures-into-python-objects-is-not-allowed).

Map and reduce operations accept a Future/coroutine/`asyncio.Task` or a list as input items. If a list is passed, the Futures/coroutines/`asyncio.Task` objects in the list are recognized by Tensorlake and run automatically.

### Tail calls

Returning a Tensorlake function coroutine or its `asyncio.Task` makes a [tail call](/applications/futures#tail-calls). The returning function completes immediately and its function container is freed to process the next request. Tensorlake runs the returned coroutine or task and uses its result as the function's return value.
This works exactly like returning a Future as a tail call.

```python theme={null}
from tensorlake.applications import application, function


@function()
async def double(x: int) -> int:
    return x * 2


@application()
@function()
async def my_app(x: int) -> int:
    # Returns a coroutine as a tail call. The function completes immediately
    # and Tensorlake runs the coroutine in the background.
    return double(x)
```

Futures can also be returned as tail calls from async functions.

```python theme={null}
from tensorlake.applications import application, function


@function()
def double(x: int) -> int:
    return x * 2


@application()
@function()
async def my_app(x: int) -> int:
    return double.future(x)
```

### Calling sync functions from async functions

Sync Tensorlake functions can be called directly from async functions. The call blocks the asyncio event loop until the sync function completes. No other asyncio tasks can run while the asyncio event loop is blocked. Because of this, calling sync Tensorlake functions directly is an anti-pattern and should be avoided.

Use `function.future()` to call sync functions without blocking the event loop. Call `future.run()` to start the Future in the background. Use `await future` to wait for the Future to complete and get its result. If this doesn't fit the use case, use `future.coroutine()` to convert the Future into a coroutine that can be used the same way as any coroutine returned by an async Tensorlake function.

```python theme={null}
import asyncio

from tensorlake.applications import application, function, Future


@function()
def sync_double(x: int) -> int:
    return x * 2


@application()
@function()
async def my_app(x: int) -> int:
    # Simple await of the sync function call.
    return await sync_double.future(x)


@application()
@function()
async def my_app_tail_call(x: int) -> int:
    # Tail call.
    return sync_double.future(x)


@application()
@function()
async def my_app_background_task(x: int) -> int:
    # Convert the Future into a coroutine so it can be scheduled as an asyncio.Task.
    double_task: asyncio.Task = asyncio.create_task(sync_double.future(x).coroutine())
    return await double_task
```

### Calling async functions from sync functions

Sync functions cannot `await` coroutines. To call an async Tensorlake function from a sync function, use `function.future()` to create a Future and call `.result()` to block until it completes.

```python theme={null}
from tensorlake.applications import application, function


@function()
async def async_double(x: int) -> int:
    return x * 2


@function()
async def async_add(a: int, b: int) -> int:
    return a + b


@application()
@function()
def my_app(x: int) -> int:
    doubled: int = async_double.future(x).result()
    return async_add.future(x, doubled).result()
```

## See Also

* Use Futures for parallel execution and tail calls.
* Parallel processing over lists of data.

# Building Workflows

Source: https://docs.tensorlake.ai/applications/building-workflows

Build multi-step data workflows with parallel execution and optimized resource usage.

Data workflows involve multiple steps: fetching, transforming, validating, enriching, and loading data. Tensorlake lets you define these pipelines as composed functions that automatically run in parallel where possible, with built-in durability and resource optimization.

Your workflows are exposed as HTTP endpoints that can be called on demand. They scale up when they are called and scale down when they are idle.

## Your First Workflow

Workflows in Tensorlake use **futures** to define function calls without executing them immediately. This allows Tensorlake to optimize execution by running independent steps in parallel. When you return a future from a function (called a **tail call**), the function completes immediately without blocking, and Tensorlake orchestrates the remaining work.
Here's a simple workflow that processes and formats data from multiple sources: ```python theme={null} from tensorlake.applications import application, function @application() @function() def enrich_record(record_id: str) -> dict: # Create futures - these don't run yet, just define the function calls profile = fetch_profile.future(record_id) history = fetch_history.future(record_id) # Return a tail call - enrich_record() completes immediately without blocking # Tensorlake then automatically: # 1. Runs fetch_profile() and fetch_history() in parallel (no dependencies between them) # 2. Once both complete, runs merge_data() with their results # 3. Uses merge_data()'s return value as enrich_record()'s final result return merge_data.future(profile, history) @function() def fetch_profile(record_id: str) -> dict: # Fetch from profile service return {"id": record_id, "name": "Example Corp", "tier": "enterprise"} @function() def fetch_history(record_id: str) -> list: # Fetch transaction history return [{"date": "2024-01-15", "amount": 5000}] @function() def merge_data(profile: dict, history: list) -> dict: return {"profile": profile, "transactions": history} ``` **What happens when you call this workflow:** ```bash theme={null} curl https://api.tensorlake.ai/applications/enrich_record \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ --json '"rec_123"' ``` 1. `enrich_record` starts and immediately returns (doesn't block) 2. `fetch_profile("rec_123")` and `fetch_history("rec_123")` run **in parallel** 3. When both complete, `merge_data` runs with both results 4. Final response contains the merged data **Key benefits:** * **Parallel execution** where possible (lower latency) * **No blocking** — the orchestrator container is freed immediately * **Automatic dependency tracking** — no manual coordination needed * **Built-in durability** — failures resume from checkpoints For a deep dive on futures and tail calls, see [Futures](/applications/futures#tail-calls). 
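The execution order listed above can be mimicked locally with the standard library. This sketch does not use Tensorlake at all; it relies on `concurrent.futures` threads purely to illustrate that the two fetch steps have no dependency on each other and that the merge step waits for both. The function bodies are copied from the example, and `run_enrich_locally` is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_profile(record_id: str) -> dict:
    return {"id": record_id, "name": "Example Corp", "tier": "enterprise"}


def fetch_history(record_id: str) -> list:
    return [{"date": "2024-01-15", "amount": 5000}]


def merge_data(profile: dict, history: list) -> dict:
    return {"profile": profile, "transactions": history}


def run_enrich_locally(record_id: str) -> dict:
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Submit both independent steps; neither waits for the other.
        profile_future = pool.submit(fetch_profile, record_id)
        history_future = pool.submit(fetch_history, record_id)
        # The merge step runs only once both inputs are available.
        return merge_data(profile_future.result(), history_future.result())


merged = run_enrich_locally("rec_123")
```

One difference is worth keeping in mind: in this local sketch the calling thread blocks on `.result()`, whereas the tail call in the Tensorlake example lets `enrich_record` complete immediately while the remaining work is orchestrated for it.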
See [async functions](/applications/async-functions) on how to build non-blocking workflows using Python async/await. Each function in your workflow can be configured with retry policies. If a step fails, Tensorlake automatically retries it based on your [retry configuration](/applications/retries). ## Best Practices ### Design for Parallelism Identify steps that can run independently: ```python theme={null} # Sequential — slow @function() def slow_pipeline(data: str) -> str: result1 = step1(data) result2 = step2(data) # Could have run in parallel return combine(result1, result2) # Parallel — fast @function() def fast_pipeline(data: str) -> str: result1 = step1.future(data) result2 = step2.future(data) # Runs in parallel with step1 return combine.future(result1, result2) ``` ### Use Tail Calls for Efficiency Return futures instead of blocking. When you return a future as a tail call, the current function's container is freed immediately — you're not paying for idle containers waiting for downstream results. 
```python theme={null} # Blocks container unnecessarily @function() def inefficient(data: str) -> str: result = expensive_operation(data) # Container blocked here return result # Frees container immediately @function() def efficient(data: str) -> str: return expensive_operation.future(data) # Container freed right away ``` ### Process Lists with Map-Reduce For workflows that process collections of items, use map-reduce operations to parallelize the work: ```python theme={null} from pydantic import BaseModel class ProcessingResult(BaseModel): total_processed: int = 0 total_value: float = 0.0 @application() @function() def process_batch(record_ids: list[str]) -> ProcessingResult: # Map: process each record in parallel results = process_record.future.map(record_ids) # Reduce: aggregate results as they complete return aggregate_results.future.reduce(results, ProcessingResult()) @function() def process_record(record_id: str) -> dict: # Each record processed in its own container return {"id": record_id, "value": 100.0} @function() def aggregate_results(summary: ProcessingResult, record: dict) -> ProcessingResult: summary.total_processed += 1 summary.total_value += record["value"] return summary ``` Map-reduce operations automatically run in parallel and scale to handle large datasets efficiently. See [Map-Reduce](/applications/map-reduce) for more details. ## Learn More Deep dive on futures, tail calls, and parallel execution. Async functions are another way to define workflows with parallel execution. How workflows recover from failures. # SDK Reference Source: https://docs.tensorlake.ai/applications/concepts Functions, applications, decorators, request context, and lifecycle reference ## Applications Applications are the top-level decorators that define entry points for your applications. You can define as many applications as you want in your project. Each one of them will be assigned a unique HTTP entry point based on the name of the Python function. 
```python theme={null}
from tensorlake.applications import application, function


# This application's name will be `hello_world`.
@application()
@function()
def hello_world():
    print("Hello, world!")


# This application's name will be `hola_mundo`.
@application()
@function()
def hola_mundo():
    print("Hola, mundo!")
```

### Configuring Applications

The `@application` decorator allows you to specify the following attributes:

1. `tags` - dict of tags to categorize the application.
2. `retries` - Retry policy for every function in the application unless a function specifies its own retry policy. By default, a failed function is not retried. See [Retries](/applications/concepts#retries).
3. `region` - The region where every function in the application will be deployed unless a function specifies its own region. Either `us-east-1` or `eu-west-1`. The default is any of the regions.

The following code snippet shows an example of all the application attributes set to custom values.

```python theme={null}
from tensorlake.applications import application, function, Retries


@application(
    tags={"language": "python"},
    retries=Retries(max_retries=3),
    region="us-east-1",
)
@function()
def hello_world():
    print("Hello, world!")
```

### Application inputs and output

Application functions take zero or more arguments, which are the current request inputs. The inputs are deserialized from their JSON representation into the Python objects specified by the arguments' type hints. The reverse happens for the request output: the object returned from the application function is JSON serialized according to the function's return type hint. The resulting JSON is returned as the HTTP response body of the application request.
For example if your application function takes a single `str` argument and returns a `str`, then the request input and output should be JSON strings: ```python theme={null} from tensorlake.applications import function, application @application() @function() def greet(data: str) -> str: return data + " from greet!" ``` ```json request input theme={null} "Hello, world!" ``` ```json request output theme={null} "Hello, world! from greet!" ``` If you want to use multiple application request inputs with complex data structures, you can add more arguments and use type hints with your Pydantic model classes. Each type hint needs to be [supported in Pydantic JSON mode](https://docs.pydantic.dev/latest/concepts/serialization/#json-mode). All basic type hints like `str`, `int`, `float`, `bool`, `list`, `dict`, `set`, `tuple`, `None`, `Any`, `|`, Pydantic model classes and more are supported. If a type hint is a union of multiple Python types, like `str | int`, then the request JSON input can match any of the types in the union. ```python theme={null} from pydantic import BaseModel from tensorlake.applications import application, function class PersonSearchQuery(BaseModel): name: str age: int class PersonSearchResult(BaseModel): matches: list[dict] @application() @function() def process_data(query: PersonSearchQuery | list[PersonSearchQuery], limit: int | None = None) -> PersonSearchResult: if isinstance(query, list): return PersonSearchResult( matches=[ {"name": q.name, "age": q.age, "id": i+1} for i, q in enumerate(query) if limit is None or i < limit ] ) else: return PersonSearchResult(matches=[{"name": query.name, "age": query.age, "id": 1}]) ``` ```json request input theme={null} {"name": "John", "age": 30} ``` ```json response output theme={null} {"matches":[{"name":"John","age":30,"id":1}]} ``` The limit argument is optional so we can omit it from the request input. 
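To make the optional-argument behavior concrete, here is a rough standard-library sketch of how named JSON request inputs could be bound to a function signature. It is an illustration only: `bind_json_inputs` is a hypothetical helper, it performs no type validation, and the real deserialization relies on Pydantic JSON mode as noted above. The `process_data` stub here is a simplified stand-in for the example function.

```python
import inspect
import json
from typing import Any, Callable, Optional


def bind_json_inputs(fn: Callable[..., Any], raw_inputs: dict[str, str]) -> dict[str, Any]:
    """Decode each JSON input and keep only the parameters that fn declares."""
    bound: dict[str, Any] = {}
    for name, param in inspect.signature(fn).parameters.items():
        if name in raw_inputs:
            bound[name] = json.loads(raw_inputs[name])
        elif param.default is inspect.Parameter.empty:
            raise TypeError(f"missing required input: {name}")
        # Parameters with a default value may be omitted from the request.
    return bound


def process_data(query: dict, limit: Optional[int] = None) -> dict:
    # Simplified stand-in that echoes its inputs.
    return {"query": query, "limit": limit}


# Only `query` is supplied; `limit` falls back to its default.
kwargs = bind_json_inputs(process_data, {"query": '{"name": "John", "age": 30}'})
result = process_data(**kwargs)
```

Because the helper iterates over the declared parameters rather than the supplied inputs, any input whose name is not in the signature is never looked at.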
If an argument type hint is `Any`, then the corresponding request input can be any valid JSON value (string, number, object, array, boolean, null). The JSON value gets deserialized into the corresponding Python object (`str`, `int`/`float`, `dict`, `list`, `bool`, `None`). The same applies to an `Any` return type hint (for example, a Python dict gets serialized as a JSON object, a Python list as a JSON array, and so on). If type hints are not provided, they are treated as `Any`.

### Calling Applications

You can call applications remotely using any HTTP client or the Tensorlake Python SDK. Use an empty POST request body if the application takes no arguments. Pass a JSON-serialized request body if the application takes one argument. Use a multipart/form-data request body if the application takes multiple arguments or one or more files (see [uploading and downloading files](/applications/concepts#uploading-and-downloading-files)).

For example, to call the `hello_world` application defined above (takes no arguments):

```bash bash theme={null}
curl \
  https://api.tensorlake.ai/applications/hello_world \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  --json ''
```

```python python theme={null}
from tensorlake.applications import run_remote_application

run_remote_application("hello_world")
```

To call the `greet` application defined above (takes a single string argument):

```bash bash theme={null}
curl \
  https://api.tensorlake.ai/applications/greet \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  --json '"Hello, John"'
```

```python python theme={null}
from tensorlake.applications import run_remote_application

run_remote_application("greet", "Hello, John")
# Or:
# run_remote_application("greet", data="Hello, John")
```

To call the `process_data` application defined above (with multiple arguments):

```bash bash theme={null}
query_value='[{"name": "Alice", "age": 25}, {"name": "Bob", "age": 24}]'
limit_value='10'
curl \
  https://api.tensorlake.ai/applications/process_data \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  -H "Accept: application/json" \
  -F "query=$query_value;type=application/json" \
  -F "limit=$limit_value;type=application/json"
```

```python python theme={null}
from tensorlake.applications import run_remote_application

# PersonSearchQuery is the Pydantic model defined in the `process_data` example above.
run_remote_application(
    "process_data",
    query=[
        PersonSearchQuery(name="Alice", age=25),
        PersonSearchQuery(name="Bob", age=24)
    ],
    limit=10
)
```

If you pass an argument that the application function doesn't define, it is simply ignored. If an argument has a default value, you can omit it from the request input. Both of these features make it easy to update application code without breaking existing clients.

### Uploading and downloading files

Application functions can receive files as current request inputs and return a file as the current request output. This makes it easy to build applications which process input files or produce an output file. File sizes of up to 5 TB are supported.

The file input type is represented by the `File` class in the Tensorlake SDK with the following interface:

```python theme={null}
class File:
    content: bytes  # Raw bytes of the file
    content_type: str  # MIME content type of the file
```

When an argument has a `File` type hint, the Tensorlake SDK doesn't attempt to deserialize the input from JSON and instead passes a `File` object with the original request input binary content and content type. When the return type hint is `File`, the Tensorlake SDK doesn't JSON serialize the returned `File` object. It instead sets the request output content type to the `File.content_type` and the HTTP response body to the raw bytes in `File.content`.
```python theme={null} from tensorlake.applications import function, application, File @application() @function() def process_file(input: File) -> File: print( "Got file content type:", input.content_type, "size:", len(input.content), "bytes" ) return File( # HTTP response body is the raw bytes of the input file. content=input.content, # HTTP response content type is the same as input file content type. content_type=input.content_type ) ``` `File.content` field holds the raw bytes of the file uploaded to the application endpoint. At the moment, the SDK doesn't support lazy loading large files, so the entire file is loaded into memory when the function is called. To pass a local file at `/foo/bar/file_name.txt` path to an application, you need to use a multipart/form-data HTTP request or just use Python SDK. ```bash bash theme={null} input_value='@/foo/bar/file_name.txt' curl \ https://api.tensorlake.ai/applications/process_file \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Accept: application/json" \ -F "input=$input_value" ``` ```python python theme={null} from tensorlake.applications import run_remote_application, File # Note: File object from Tensorlake SDK is not the same as File object from Python standard library. with open("/foo/bar/file_name.txt", "rb") as local_file: local_file_content: bytes = local_file.read() run_remote_application( "process_file", File( content=local_file_content, content_type="text/plain" ) ) ``` ## Functions Functions are the building blocks of applications. They are Python functions decorated with the `@function` decorator. Tensorlake functions can call other Tensorlake functions. The function call blocks until the called function returns its output to the calling function. 
For example, a simple application function which calls another function to process its input: ```python theme={null} from tensorlake.applications import application, function # Define an application function which is an HTTP entry point for the application. @application() @function() def greet(name: str) -> str: if name.startswith("A"): return "Hello, A-name!" else: # Call another function to perform a specific processing of non-A names. return process_non_a_name(name) + " from greet!" @function() def process_non_a_name(name: str) -> str: return "A" + name[1:] ``` Every Tensorlake function call: * is executed in its own function container, * supports durable execution, * can run in parallel with other function calls, * can be retried independently if it fails, * has its own resource limits (CPU, memory, disk, GPU, timeout), * has its logs available in Tensorlake logging tools, * has its execution timeline available in Tensorlake tracing tools, * can report progress updates to extend its timeout, * can share state with other function calls of the same application request. Every Python function decorated with `@function` becomes a Tensorlake function and thus gets all these capabilities automatically. ### Configuring Tensorlake functions The `@function` decorator allows you to set the following attributes: 1. `description` - A description of the function's purpose and behavior. 2. `cpu` - The number of CPUs available to the function. The default is `1.0` CPU. See [CPU](/applications/concepts#cpu). 3. `memory` - The memory GB available to the function. The default is `1.0` GB. See [Memory](/applications/concepts#memory). 4. `ephemeral_disk` - The ephemeral `/tmp` disk space available to the function in GB. The default is `2.0` GB. See [Ephemeral Disk](/applications/concepts#ephemeral-disk). 5. `gpu` - The GPU model available to the function. The default is `None` (no GPU). Please contact `support@tensorlake.ai` to enable GPU support. 6. 
`timeout` - The timeout for the function in seconds. The default is 5 minutes. See [Timeouts](/applications/concepts#timeouts). 7. `image` - The image to use for the function container. A basic Debian based image by default. See [Images](/applications/images). 8. `secrets` - The secrets available to the function in its environment variables. No secrets by default. See [Secrets](/applications/secrets). 9. `retries` - Retry policy for the function. No retries by default if function failed. See [Retries](/applications/concepts#retries). 10. `region` - The region where the function will be deployed. Either `us-east-1` or `eu-west-1`. The default is any of the regions. The following code snippet shows an example of all the function attributes set to custom values. ```python theme={null} from tensorlake.applications import function, Image, Retries @function( # Use Ubuntu as a base image instead of the default Debian image=Image(base_image="ubuntu:latest"), # Make my_secret available to the function as an environment variable secrets=["my_secret"], # Description of the function in the workflow description="Measures the string using its length", # Retry the function twice if it fails retries=Retries(max_retries=2), # Function fails if it was running for more than 30 seconds and didn't report any progress timeout=30, # 2 CPUs are available to the function cpu=2, # 4 GB of memory is available to the function memory=4, # 2 GB of ephemeral /tmp disk space is available to the function ephemeral_disk=2, # Run the function in a container with GPU support gpu="H100", # Run the function in us-east-1 region only region="us-east-1", ) def string_length(s: str) -> int: return len(s) ``` ### Function inputs and output Tensorlake functions are not exposed as HTTP entry points unlike application functions. Because of this Tensorlake functions have minimal limitations on their signatures. Arguments and return value don't require any type hints but have to be picklable. 
Most Python objects are picklable, except special cases like processes, threads, database connections, etc. If a Tensorlake function argument or return value is a Tensorlake SDK `File` object, it bypasses pickling and is passed as-is to and from Tensorlake functions.

### Application functions and Tensorlake Functions

Every application function decorated with the `@application()` decorator is also a Tensorlake function. This is why every application function needs to be decorated with the `@function()` decorator in addition to `@application()`. Because application functions are HTTP entry points into Tensorlake applications, they differ from regular Tensorlake functions in a few ways. Application functions:

* Require JSON serializable type hints for all arguments and the return value.
* Don't support `/` and `*` in function arguments.
* Don't support `*args` and `**kwargs`.
* Ignore function call arguments that are not defined in the function signature. This simplifies code migrations, e.g. when an HTTP client sends extra arguments that the application function no longer takes after a code update.

Application functions can be called from regular Tensorlake functions. The call is executed in the current application request without creating a new one, so it behaves like a regular Tensorlake function call inside an application.

### Classes

Sometimes a function needs one-time initialization, like loading a large model into memory. This is achieved by defining a class using the `@cls` decorator. Classes use their `__init__(self)` constructor to run any initialization code once on function container startup. The constructor cannot have any arguments other than `self`. Any number of class methods can be decorated with `@function`.
```python theme={null} from large_model import load_large_model from tensorlake.applications import application, cls, function, run_remote_application @cls() class MyCompute: def __init__(self): # Run initialization code once on function container startup self.model = load_large_model() @application() @function(cpu=4, memory=16) def run(self, data: str) -> int: return self.model.run(data) if __name__ == "__main__": run_remote_application("MyCompute.run", data="some input data") ``` ### Timeouts When a function runs longer than its timeout, it is terminated and marked as failed. The timeout in seconds is set using the `timeout` attribute. The default timeout is `300` (5 minutes). Minimum is `1`, maximum is `172800` (48 hours). Progress updates can be sent by the function to extend the timeout. See [Request Context](/applications/concepts#request-context). ```python theme={null} from tensorlake.applications import function # Set a 30 minute timeout for long-running agent tasks @function(timeout=1800) def deep_research(prompt: str) -> str: ... ``` ### Retries When a function fails by raising an exception or timing out, it gets retried according to its retry policy. The default retry policy is to not retry the function call. You can specify a custom retry policy using the `retries` attribute. If you allow retries, it's typically a best practice to ensure that the function is idempotent unless this is not required for your use case. ```python theme={null} from tensorlake.applications import function, Retries # Retry the function once if it failed @function(retries=Retries(max_retries=1)) def my_function() -> int: raise Exception("Something went wrong") ``` You can set default retry policy for all the functions in the application decorator. See the [Configuring Applications](/applications/concepts#configuring-applications) guide. ### Request Context Functions can use a request context to share state between function calls of the same request. 
The context has information about the current request and provides access to APIs for the current request. You can access the request context directly from the `RequestContext` class. ```python theme={null} from tensorlake.applications import RequestContext, function @function() def my_function(data: str) -> int: ctx: RequestContext = RequestContext.get() print(f"Request ID: {ctx.request_id}") ... ``` #### Request ID Each request has a unique identifier accessible via `ctx.request_id`. This is useful for logging and debugging. ```python theme={null} from tensorlake.applications import RequestContext, function @function() def my_function(data: str) -> str: ctx = RequestContext.get() print(f"Processing request: {ctx.request_id}") return ctx.request_id ``` #### Request State The state API allows you to set and get key-value pairs scoped per request. Each new request starts with an empty state. Values can be any picklable object. | Method | Description | | ----------------------------------------------------------------- | ------------------------------------------------ | | `state.set(key: str, value: Any) -> None` | Set a key-value pair | | `state.get(key: str, default: Any \| None = None) -> Any \| None` | Get a value by key, returns default if not found | ```python theme={null} from tensorlake.applications import RequestContext, function @function() def first_function(data: str) -> int: ctx = RequestContext.get() ctx.state.set("user_data", data) ctx.state.set("processed", True) return second_function() @function() def second_function() -> int: ctx = RequestContext.get() data = ctx.state.get("user_data") return len(data) ``` #### Streaming Progress Updates The progress API allows you to stream execution progress from your functions. This is useful for monitoring long-running tasks and providing real-time feedback to users. 
```python theme={null} from tensorlake.applications import RequestContext, function @function() def process_items(items: list) -> dict: ctx = RequestContext.get() results = [] for i, item in enumerate(items): ctx.progress.update(i + 1, len(items), f"Processing item {i + 1}") results.append(process(item)) return {"results": results} ``` **Key features:** * **Automatic timeout reset** - Each progress update resets the function timeout, allowing long-running agents to run indefinitely * **Real-time streaming** - Stream updates to frontends via Server-Sent Events (SSE) * **HTTP API access** - Query progress updates programmatically for custom dashboards and monitoring See the [Streaming Progress guide](/applications/guides/streaming-progress) for detailed API reference, frontend integration examples, and best practices. #### Request Metrics The metrics API allows you to record custom metrics for monitoring and debugging. | Method | Description | | ------------------------------------------------------- | ------------------------------------------------------------------ | | `metrics.timer(name: str, value: int \| float) -> None` | Record a duration metric (in seconds) | | `metrics.counter(name: str, value: int = 1) -> None` | Increment a counter by the given value. Every counter starts at 0. | ```python theme={null} import time from tensorlake.applications import RequestContext, function @function() def my_function(data: str) -> int: ctx = RequestContext.get() start_time = time.monotonic() result = len(data) # Record metrics ctx.metrics.timer("processing_time", time.monotonic() - start_time) ctx.metrics.counter("items_processed", result) return result ``` ### CPU The number of CPUs available to the function is set using the `cpu` attribute. Minimum is `1.0`, maximum is `8.0`. The default is `1.0`. This is usually sufficient for functions that only call external APIs and do simple data processing. 
Adding more CPUs is recommended for functions that do complex data processing or work with large datasets. If functions use large multi-gigabyte inputs or produce large multi-gigabyte outputs, then at least 3 CPUs are recommended. This results in the fastest download and upload speeds for the data. ```python theme={null} from tensorlake.applications import function # Allocate 4 CPUs for data processing @function(cpu=4) def process_data(data: list) -> dict: ... ``` ### Memory GB memory available to the function is set using the `memory` attribute. Minimum is `1.0`, maximum is `32.0`. The default is `1.0`. This is usually sufficient for functions that only call external APIs and do simple data processing. Adding more memory is recommended for functions that do complex data processing or work with large datasets. It's recommended to set `memory` to at least 2x the size of the largest inputs and outputs of the function. This is because when the inputs/outputs are deserialized/serialized both serialized and deserialized representations are kept in memory. ```python theme={null} from tensorlake.applications import function # Allocate 8 GB memory for loading large models @function(memory=8) def run_model(data: str) -> str: ... ``` ### Ephemeral disk Ephemeral disk space is a temporary storage space available to functions at `/tmp` path. It gets erased when its function container gets terminated. It's optimal for storing temporary files that are not needed after the function execution is completed. Ephemeral disks are backed by fast SSD drives. Using other filesystem paths like `/home/ubuntu` for storing temporary files will result in slower performance. Temporary files created using Python modules like `tempfile` are stored in ephemeral disk space inside `/tmp`. GB of ephemeral disk space available to the function is set using `ephemeral_disk` attribute. Minimum is `2.0`, maximum is `50.0`. The default is `2.0` GB. 
This is usually sufficient for functions that only call external APIs and do simple data processing. If the function needs to temporarily store large files or datasets on disk, then the `ephemeral_disk` attribute should be increased accordingly. ```python theme={null} from tensorlake.applications import function # Allocate 20 GB disk for downloading and processing large files @function(ephemeral_disk=20) def process_video(url: str) -> str: # Download video to /tmp # Process and return results ... ``` # Crash Recovery Source: https://docs.tensorlake.ai/applications/crash-recovery How agents survive failures and resume without losing work **Durable Execution is in Technical Preview** This feature is currently in technical preview and under active development. Please contact us on Slack if you'd like to ask a question or try it out. Agents call LLMs, scrape websites, query databases, and invoke external APIs. Any of these can fail — rate limits, timeouts, transient network errors, OOM kills. Without durability, a failure means restarting the entire agent from scratch, repeating every LLM call and API request. Tensorlake checkpoints every `@function()` call automatically. When a request fails, you [replay](/applications/durability#request-replay-api) it and only the failed step re-executes. Everything before it is served from the checkpoint. This page covers the recovery patterns. For automatic retries on transient failures (rate limits, validation errors), see [Retries & Rate Limits](/applications/retries). For long-running functions that need to extend their deadline as they make progress, see [Timeouts](/applications/timeouts). For try/except patterns and graceful degradation, see [Error Handling](/applications/error-handling). ## Why LLM Calls Must Be Durable LLM calls are unlike normal API calls. They are **non-deterministic** — the same prompt can produce a different response on every invocation. This makes re-execution dangerous, not just wasteful. 
Consider a travel agent that plans a trip. On the first run, the LLM decides on flights to Whistler. The agent books the flights, then crashes while searching for hotels. Without durable execution, the agent restarts from scratch. This time the LLM decides on Japan instead. Now the user has unwanted Whistler flights and a completely different trip plan. Making LLM calls durable solves three problems at once: * **Consistency** — Prior LLM decisions are preserved on replay. The agent resumes searching for Whistler hotels, not re-planning the entire trip. * **Cost** — LLM inference is expensive. Re-executing 14 successful tool-calling iterations because the 15th failed wastes tokens and money. * **Rate limits** — Agentic applications multiply downstream calls by an order of magnitude. Re-executing all of them increases the chance of hitting rate limits again. On Tensorlake, every `@function()` call is automatically checkpointed. When a request is replayed, previously successful LLM calls return their recorded outputs — the model is not called again. ## Durable Tool Calls The most common agent pattern is a loop: the LLM decides which tool to call, the tool runs, the result feeds back into the LLM. Each iteration is an expensive operation — an LLM inference plus a tool execution. 
Wrap each tool in its own `@function()` to make every tool call a checkpoint:

```python theme={null}
import json

from tensorlake.applications import application, function, Image

llm_image = Image().run("pip install openai")


@function()
def search_web(query: str) -> list[dict]:
    import requests

    response = requests.get("https://api.search.com/v1/search", params={"q": query})
    return response.json()["results"]


@function()
def read_document(url: str) -> str:
    import requests

    return requests.get(url).text


@function(image=llm_image)
def call_llm(messages: list[dict]) -> dict:
    from openai import OpenAI

    response = OpenAI().chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=[
            {"type": "function", "function": {"name": "search_web", "parameters": {"type": "object", "properties": {"query": {"type": "string"}}}}},
            {"type": "function", "function": {"name": "read_document", "parameters": {"type": "object", "properties": {"url": {"type": "string"}}}}},
        ],
    )
    # The SDK returns a message object, not a dict. Convert it to a plain dict
    # so the caller can index into it and append it to the conversation.
    message = response.choices[0].message
    result = {"role": "assistant", "content": message.content}
    if message.tool_calls:
        result["tool_calls"] = [tc.model_dump() for tc in message.tool_calls]
    return result


@application()
@function(image=llm_image, timeout=1800)
def research_agent(topic: str) -> str:
    tools = {"search_web": search_web, "read_document": read_document}
    messages = [{"role": "user", "content": f"Research this topic: {topic}"}]

    for _ in range(20):  # max iterations
        response = call_llm(messages)  # checkpointed
        messages.append(response)

        if not response.get("tool_calls"):
            return response["content"]

        for tool_call in response["tool_calls"]:
            fn = tools[tool_call["function"]["name"]]
            # Tool arguments arrive as a JSON-encoded string and must be parsed.
            args = json.loads(tool_call["function"]["arguments"])
            result = fn(**args)  # checkpointed
            messages.append({"role": "tool", "content": str(result), "tool_call_id": tool_call["id"]})

    return messages[-1].get("content", "Max iterations reached")
```

If the agent crashes on iteration 15, a replay skips the first 14 iterations entirely. The LLM calls, web searches, and document reads from those iterations are all served from checkpoints. The agent resumes from iteration 15 with the full conversation history intact.
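The replay behavior described above can be pictured with a toy model in plain Python. This is only an illustrative sketch, not Tensorlake's implementation: `CheckpointStore`, `checkpointed`, and `flaky_step` are hypothetical names, and saved outputs are keyed by call order alone, which is a simplification of the real fingerprinting.

```python
from typing import Any, Callable, Optional


class CheckpointStore:
    """Hypothetical in-memory stand-in for saved function outputs."""

    def __init__(self) -> None:
        self.outputs: dict[int, Any] = {}  # call sequence number -> saved output
        self.executed: list[int] = []      # sequence numbers that really ran in this run
        self._seq = 0

    def start_run(self) -> None:
        # Every run (original or replay) counts its calls from zero again.
        self._seq = 0
        self.executed = []

    def checkpointed(self, fn: Callable[..., Any], *args: Any) -> Any:
        seq = self._seq
        self._seq += 1
        if seq in self.outputs:
            return self.outputs[seq]  # served from the checkpoint, fn is not called
        result = fn(*args)            # may raise; nothing is saved in that case
        self.outputs[seq] = result
        self.executed.append(seq)
        return result


def flaky_step(i: int, fail_on: Optional[int]) -> str:
    if i == fail_on:
        raise RuntimeError(f"step {i} crashed")
    return f"result-{i}"


def agent(store: CheckpointStore, fail_on: Optional[int]) -> list[str]:
    store.start_run()
    return [store.checkpointed(flaky_step, i, fail_on) for i in range(20)]


store = CheckpointStore()
try:
    agent(store, fail_on=14)          # original run crashes on the 15th iteration
except RuntimeError:
    pass
saved_before_replay = len(store.outputs)

results = agent(store, fail_on=None)  # replay of the same request
reexecuted = store.executed           # only the calls without a saved output ran
```

In this model the replay re-runs only the calls that never produced a saved output, which is the property the agent loop relies on.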
## Surviving Partial Failures in Fan-Out When you process a batch of items in parallel using [map](/applications/map-reduce), each item is an independent function call with its own checkpoint. If 3 out of 1,000 items fail, replay only re-processes those 3. ```python theme={null} from tensorlake.applications import application, function @function(timeout=120) def process_document(doc_url: str) -> dict: """Parse a single document. Each call is independently checkpointed.""" content = fetch_and_parse(doc_url) extracted = extract_fields(content) return extracted @function() def aggregate_results(results: list[dict], acc: dict) -> dict: """Combine results as they arrive.""" acc["documents"].append(results) return acc @application() @function() def batch_processor(doc_urls: list[str]) -> dict: results = process_document.map(doc_urls) summary = results.reduce(aggregate_results, {"documents": []}) return summary ``` This is the pattern behind durable data ingestion pipelines. Whether you're processing SEC filings, insurance forms, or research papers — partial failures don't lose the work already completed. ## Idempotent Side Effects When a function sends an email, charges a credit card, or writes to an external database, you don't want that action repeated on replay. Wrap the side effect in its own `@function()` — since the function's output is checkpointed, replay skips it entirely. ```python theme={null} @function() def send_notification(user_id: str, message: str) -> str: """Send once, skip on replay.""" response = email_api.send(user_id, message) return response.message_id @application() @function() def onboarding_agent(user_id: str) -> str: profile = build_profile(user_id) send_notification(user_id, f"Welcome, {profile['name']}!") # sent once return setup_account(profile) ``` ## Functions That Must Always Run Fresh Some function calls should never be replayed from a checkpoint — they need live data every time. 
Mark them with `durable=False`:

```python theme={null}
from tensorlake.applications import function

# `stock_api` stands in for your own market-data client.

@function(durable=False)
def get_current_price(ticker: str) -> float:
    """Always fetches the latest price, even on replay."""
    return stock_api.get_price(ticker)

@function()
def get_historical_data(ticker: str) -> list[dict]:
    """Historical data doesn't change, so it is safe to checkpoint."""
    return stock_api.get_history(ticker, days=30)
```

See [Disabling durable execution](/applications/durability#disabling-durable-execution) for the full implications.

## When to Use Durable Execution

| Scenario                                        | Benefit                                       |
| ----------------------------------------------- | --------------------------------------------- |
| Agent loops with 10+ tool calls                 | Crash on call #N resumes from #N, not #1      |
| Batch processing 100s-1000s of documents        | Partial failures only re-process failed items |
| Pipelines with expensive LLM calls              | No repeated inference costs on retry          |
| Multi-step workflows with external side effects | Emails, payments, API calls aren't duplicated |

For the technical details of how checkpointing, fingerprinting, and replay modes work, see [Durable Execution](/applications/durability).

## Related Guides

* [Durable Execution](/applications/durability): Replay API, adaptive vs. strict modes, and fingerprinting internals.
* [Retries & Rate Limits](/applications/retries): Auto-retry on rate limits, validation errors, and transient failures.
* [Timeouts](/applications/timeouts): Heartbeat-based timeouts so long agent loops don't fail prematurely.
* [Error Handling](/applications/error-handling): Try/except, futures, and degrading gracefully when a step fails.

# Cron Scheduler

Source: https://docs.tensorlake.ai/applications/cron-scheduler

Schedule recurring invocations of your Orchestration endpoints.

Cron schedules trigger your deployed orchestration endpoints on a recurring basis. You can manage schedules programmatically via the API or through the Applications UI.

## Creating a Schedule

Send a `POST` request to create a cron schedule for a deployed application. The schedule starts immediately after creation.
```
POST /applications/{application}/cron-schedules
```

```python Python theme={null}
import requests, base64, json

application = "my-app"
payload = {"cron_expression": "0 * * * *"}  # Every hour

# Optional: pass input data to each invocation
input_data = json.dumps({"report_type": "daily"}).encode()
payload["input_base64"] = base64.b64encode(input_data).decode()

response = requests.post(
    f"https://api.tensorlake.ai/applications/{application}/cron-schedules",
    json=payload,
    # Replace TENSORLAKE_API_KEY with your actual API key.
    headers={"Authorization": "Bearer TENSORLAKE_API_KEY"},
)
response.raise_for_status()

schedule_id = response.json()["schedule_id"]
print(f"Created schedule: {schedule_id}")
```

```typescript TypeScript theme={null}
async function createCronSchedule(
  application: string,
  cronExpression: string,
  inputBytes?: Uint8Array,
) {
  const body: Record<string, string> = { cron_expression: cronExpression };
  if (inputBytes) {
    body.input_base64 = btoa(String.fromCharCode(...inputBytes));
  }

  const res = await fetch(
    `/applications/${application}/cron-schedules`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  );

  if (!res.ok) {
    const err = await res.json();
    throw new Error(err.error ?? `HTTP ${res.status}`);
  }

  const { schedule_id } = await res.json();
  return schedule_id as string;
}
```

### Request fields

| Field             | Type   | Required | Description                                                                      |
| ----------------- | ------ | -------- | -------------------------------------------------------------------------------- |
| `cron_expression` | string | Yes      | A valid 5-field cron expression                                                  |
| `input_base64`    | string | No       | Base64-encoded bytes passed as input on every invocation. Maximum 1 MiB decoded. |

The response returns a `schedule_id`. Save this value, because it is required to delete the schedule later.

The minimum allowed interval is 60 seconds. `* * * * *` (every minute) is the fastest supported expression. Sub-minute expressions are rejected with a `400` error.
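As a small sketch of the request fields and limits above, the helper below builds the JSON body and rejects inputs that would exceed the documented 1 MiB decoded limit before any request is sent. The name `build_cron_payload` is hypothetical, and the 5-field check is only a shallow client-side sanity check, not the server's validation.

```python
import base64
import json
from typing import Any, Optional

MAX_INPUT_BYTES = 1024 * 1024  # 1 MiB, measured on the bytes before base64 encoding


def build_cron_payload(cron_expression: str, input_obj: Optional[Any] = None) -> dict:
    """Hypothetical helper: build the body for the create-schedule request."""
    if len(cron_expression.split()) != 5:
        raise ValueError("cron_expression must have exactly 5 fields")
    payload = {"cron_expression": cron_expression}
    if input_obj is not None:
        raw = json.dumps(input_obj).encode()
        if len(raw) > MAX_INPUT_BYTES:
            raise ValueError("input exceeds 1 MiB before base64 encoding")
        payload["input_base64"] = base64.b64encode(raw).decode()
    return payload


payload = build_cron_payload("0 * * * *", {"report_type": "daily"})
# Round-trip the input to confirm what each invocation would receive.
decoded = json.loads(base64.b64decode(payload["input_base64"]))
```

The resulting `payload` dictionary can be passed as the `json=` argument of the `requests.post` call shown earlier.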
## Listing Schedules Retrieve all cron schedules for an application: ``` GET /v1/namespaces/{namespace}/applications/{application}/cron-schedules ``` ```python Python theme={null} response = requests.get( f"https://api.tensorlake.ai/applications/{application}/cron-schedules", headers={"Authorization": "Bearer TENSORLAKE_API_KEY"}, ) response.raise_for_status() for schedule in response.json()["schedules"]: print(schedule["id"], schedule["cron_expression"], schedule["next_fire_time_ms"]) ``` ```typescript TypeScript theme={null} interface CronSchedule { id: string; application_name: string; cron_expression: string; next_fire_time_ms: number; last_fired_at_ms: number | null; created_at: number; enabled: boolean; } async function listCronSchedules(application: string) { const res = await fetch( `/applications/${application}/cron-schedules`, ); if (!res.ok) throw new Error(`HTTP ${res.status}`); const { schedules } = await res.json(); return schedules as CronSchedule[]; } ``` ### Response fields | Field | Type | Description | | ------------------- | -------------- | ------------------------------------------------------------------------------------- | | `id` | string | Unique ID for this schedule | | `application_name` | string | The application this schedule belongs to | | `cron_expression` | string | The schedule expression as stored | | `next_fire_time_ms` | number | Unix timestamp (ms) of the next scheduled invocation | | `last_fired_at_ms` | number \| null | Unix timestamp (ms) of the last invocation. `null` if the schedule has never fired. | | `created_at` | number | Monotonic counter for ordering — not a wall-clock timestamp, do not display as a date | | `enabled` | boolean | Always `true` — reserved for future use | `next_fire_time_ms` and `last_fired_at_ms` are standard Unix millisecond timestamps. In JavaScript: `new Date(next_fire_time_ms)`. 
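For Python clients, the millisecond timestamps can be converted along these lines. This is a sketch: `ms_to_datetime` and `describe_schedule` are hypothetical helpers, and the sample dictionary is made-up data shaped like the response fields above.

```python
from datetime import datetime, timezone
from typing import Optional


def ms_to_datetime(ms: Optional[int]) -> Optional[datetime]:
    """Convert a Unix millisecond timestamp to a timezone-aware UTC datetime."""
    if ms is None:
        return None
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def describe_schedule(schedule: dict) -> str:
    next_fire = ms_to_datetime(schedule["next_fire_time_ms"])
    last_fired = ms_to_datetime(schedule["last_fired_at_ms"])
    last_text = last_fired.isoformat() if last_fired else "never"
    # created_at is an ordering counter, so it is deliberately not shown as a date.
    return f"{schedule['id']}: next={next_fire.isoformat()} last={last_text}"


sample = {  # made-up values for illustration
    "id": "sched-1",
    "cron_expression": "0 * * * *",
    "next_fire_time_ms": 1_700_000_000_000,
    "last_fired_at_ms": None,
    "created_at": 42,
}
summary = describe_schedule(sample)
```

Handling `None` explicitly matters because `last_fired_at_ms` is `null` until the schedule fires for the first time.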
## Deleting a Schedule ``` DELETE /applications/{application}/cron-schedules/{schedule_id} ``` ```python Python theme={null} response = requests.delete( f"https://api.tensorlake.ai/applications/{application}/cron-schedules/{schedule_id}", headers={"Authorization": "Bearer TENSORLAKE_API_KEY"}, ) response.raise_for_status() ``` ```typescript TypeScript theme={null} async function deleteCronSchedule( application: string, scheduleId: string, ) { const res = await fetch( `/applications/${application}/cron-schedules/${scheduleId}`, { method: "DELETE" }, ); if (!res.ok) throw new Error(`HTTP ${res.status}`); } ``` Deletion is permanent. To modify a schedule, delete it and recreate it — you can reuse the `cron_expression` from the list response to pre-populate the new request. ## Limits | Limit | Value | | --------------------------------- | --------------- | | Minimum interval | 60 seconds | | Maximum schedules per application | 100 | | Maximum input payload | 1 MiB (decoded) | ## Related Monitor scheduled invocations alongside the rest of your application activity. Configure automatic retries for functions triggered by the scheduler. Pass secrets securely to functions that run on a schedule. # Durable Execution Source: https://docs.tensorlake.ai/applications/durability Tensorlake automatically persists function outputs so retries and replays skip already-succeeded work — avoiding costly restarts of long-running agent workflows. Agentic applications and AI workflows are often **long-running** (seconds to hours) and interact with **unreliable dependencies** (LLMs, external APIs, tools). A failure in a dependency call requires implementing retry logic and restarting the agent or workflow from scratch when out of retries. This can be costly and adds significant complexity and latency. When running on Tensorlake, your application automatically saves outputs of every Tensorlake function call in the current application request. 
This means that outputs of successful Tensorlake function calls will be available without re-execution when you [replay](#request-replay-api) an application request after it failed. The same applies to [automatic retries](/applications/retries). When a Tensorlake function is retried and makes the same previously succeeded Tensorlake function calls again, it uses their saved outputs instead of re-executing them.

For a worked example of how durable execution lets agents survive crashes mid-loop, see [Crash Recovery](/applications/crash-recovery). For tuning retry counts, rate limits, and validation-driven retries on a single function, see [Retries & Rate Limits](/applications/retries). For functions that should keep running across long agent loops without tripping the timeout, see [Timeouts](/applications/timeouts).

Storing outputs of successful function calls in an application request and re-using the outputs in the same request without re-executing the same function calls again is called **durable execution**. It works out-of-the-box for all Tensorlake applications.

**Durable execution** is in technical preview mode. Please [contact us on Slack](https://join.slack.com/t/tensorlakecloud/shared_invite/zt-32fq4nmib-gO0OM5RIar3zLOBm~ZGqKg) if you'd like to ask a question or try it out.

## Request Replay API

You can use the Request Replay API to restart a failed Tensorlake application request from where it failed without re-executing the previously successful Tensorlake function calls in it.
```bash bash theme={null}
curl \
  "https://api.tensorlake.ai/applications/$APPLICATION_NAME/requests/$REQUEST_ID/replay" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  --json '{}'
```

```python python theme={null}
from tensorlake.applications import Request, get_remote_request

application_name: str = "my_durable_application"
request_id: str = "abc123def456ghi789"

request: Request = get_remote_request(application_name, request_id)
request.replay()
# Blocks until the request replay completes and prints request output.
print(request.output())
```

When you replay a request, Tensorlake doesn't create a new request. Instead, it re-runs the same request with the same request ID. The request runs again and the request output is updated when the replay completes.

### Application code upgrade

When a request is replayed, it runs the same application code version as in the previous run. You can upgrade it to the latest application code version by passing `--json '{ "upgrade_to_latest_version": true }'` in the HTTP replay API call or by calling `request.replay(upgrade_to_latest_version=True)` in Python. This is handy if you fixed a bug in your application code and want to re-run the request with the fix applied.

If you replay with a code upgrade, please ensure that the latest application code can handle the original request inputs. This typically requires backward compatibility at the level of your application function parameters.

### Replay modes

Tensorlake detects when a replayed request follows a different execution path compared with the original request run or any of its past replays.
For example, a replayed request may execute a new function call if it uses a random number generator to decide whether to make it:

```python theme={null}
import random
from tensorlake.applications import application, function

@function()
def foo():
    print("foo")

@function()
def bar():
    print("bar")

@application()
@function()
def my_workflow_app():
    # succeeded in the original request run,
    # skipped in the replayed run
    foo()
    if random.random() < 0.5:
        # never called in the original request run,
        # called in the replayed run, Tensorlake detects this
        bar()
    # ... more Tensorlake function calls
```

Other common causes of a replayed request following a different execution path:

* Conditional execution of code depending on current time, database state, values returned by external APIs, etc.
* A changed order of Tensorlake function calls that depends on the duration of external API calls, LLM calls, etc. (race conditions).
* Changed Tensorlake function calls in [upgraded application code](#application-code-upgrade).

For some applications, a replayed request following a different execution path is expected and acceptable, and for others it is not. Tensorlake provides two replay modes to suit the needs of both types of applications. [Adaptive replay](#adaptive-replay) allows this scenario, while [Strict replay](#strict-replay) doesn't allow it and fails the replay if it happens.

#### Adaptive replay

By default, Tensorlake uses **adaptive replay**. In this mode, all new Tensorlake function calls are allowed to execute, even if the replayed request doesn't run some function calls that were executed in the original request run or in previous replayed runs.

This mode is useful when the user just wants to re-run the request from where it failed without being concerned about potential behavioral changes or non-determinism in their application code.
To explicitly enable adaptive replay, pass `--json '{ "mode": "adaptive" }'` in the HTTP replay API call or call `request.replay(mode=ReplayMode.ADAPTIVE)` in Python. This is not necessary since adaptive replay is the default mode.

#### Strict replay

In this mode, if a new Tensorlake function call is detected during the request replay and one or more Tensorlake function calls from the original request run or from previous replayed runs are not executed in the current replayed run, then the request replay fails with a `ReplayError`.

This mode is useful when the user wants to ensure that the request behavior remains the same during replays, i.e. that all the resources claimed during the original request run are reused during the replayed run without claiming more resources (for example, to avoid redoing cross-service transactions).

To enable strict replay, pass `--json '{ "mode": "strict" }'` in the HTTP replay API call or call `request.replay(mode=ReplayMode.STRICT)` in Python.

### How function calls are matched

Tensorlake makes a fingerprint of every Tensorlake function call made in an application request. It then compares fingerprints of new function calls made during a request replay with fingerprints of previously executed function calls in the same request to determine whether the function call has been made previously.

A function call fingerprint includes:

* Function call type (i.e. "function\_call", "map", "reduce").
* Function name.
* Parent function call fingerprint.
* Function call sequence number in the parent function call.
* Other information to ensure that changes in function call tree structures are detected.

Function parameters are not included in the function call fingerprint.

Takeaways from this:

* Changing function parameters in application code doesn't affect replay behavior. A new function call with different parameters still matches the previous function call. This enables seamless application code upgrades without affecting replays.
* Passing different values (e.g., random numbers, current time) as function parameters doesn't affect replay behavior. A function call with a different random number passed into it still matches its previous function call where the random number was different.
* If the sequence of function calls changed in the latest application code, then the replayed function calls will not match the previous function calls. In this case the replay behavior depends on the selected replay mode (adaptive or strict).
* If function calls are started in an arbitrary order (e.g. with a random delay), then the order of function calls can differ between the original request run and the replayed run even without application code changes. In this case the replay behavior depends on the selected replay mode (adaptive or strict). Application code should avoid arbitrary function call ordering to ensure consistent behavior during request replays and reuse of previously completed work.

## Automatic retries

When a Tensorlake function call gets [retried automatically](/applications/retries), it uses the same durable execution mechanism to re-use outputs of previously successful Tensorlake function calls from the same request. In this case, adaptive replay mode is always used.

See [Retries & Rate Limits](/applications/retries) for the `retries=` parameter, validation-driven retries, and `max_containers` × `concurrency` for capping concurrent calls. For broader exception-handling patterns (try/except, futures, graceful degradation), see [Error Handling](/applications/error-handling).

## Disabling durable execution

Durable execution is enabled by default for all Tensorlake functions. You can disable it for a function by setting the `durable` attribute to `False` in the `@function` decorator.
```python theme={null}
from tensorlake.applications import application, function
from magic_llm import ask_llm

@function(durable=False)
def get_current_weather() -> str:
    return ask_llm("What's the weather now in San Francisco?")
```

Disabling durable execution for a function means that when its parent function calls are re-executed during request replays or automatic retries, the non-durable function calls are always re-executed and their outputs are never reused from previous executions.

Disabling durability is useful for functions that must always run fresh (e.g., functions that return the current time, current weather, or a stock price).

It's not recommended to call other Tensorlake functions from a non-durable function because all such function calls will also be non-durable and will always be re-executed. If the same functions are called from durable functions, their outputs are still saved and reused as normal, provided the called functions are durable.

With strict replay mode, no validation is done on non-durable function calls. A non-durable function call and all its child Tensorlake function calls just get re-executed on each replay.

## Best practices for durable Tensorlake applications

* Wrap every external call (LLM, API, database, etc.) in a Tensorlake function to make these calls durable and avoid repeating work. If a framework makes these calls, use framework customization points (e.g., callbacks, hooks, decorators) to wrap the calls in Tensorlake functions.
* Design your application code to be deterministic to ensure that replays follow the same execution path and thus reuse previously finished work.
* If your Tensorlake functions have external side effects (e.g., sending emails, modifying databases), ensure that these side effects are idempotent or can be safely retried without causing issues.
* [Disable durability](#disabling-durable-execution) for functions that must always run on request replay or retry.
* If a replay uses strict mode together with an upgrade to the latest code, the latest application code needs to be fully backward compatible with the original request code to avoid failing the replay.

## Human in the loop and external events

The replay API can be used for resuming requests that timed out while waiting for external inputs (e.g., human review, external event).

```python theme={null}
from tensorlake.applications import application, function, RequestError
from approval_system import wait_for_approval_request, get_approval_message, create_approval_request, ApprovalDeniedError

@function()
def create_approval(user_id: str, action: str) -> str:
    """Wraps create_approval_request call to make it durable."""
    return create_approval_request(user_id, action)

@function(timeout=300) # 5 minute timeout
def wait_approval(approval_id: str) -> str:
    """Waits for an approval request to be completed and returns its approval message.

    Raises RequestError if the approval was denied. This fails the current request immediately.
    """
    try:
        wait_for_approval_request(approval_id)
    except ApprovalDeniedError as e:
        raise RequestError(f"Approval '{approval_id}' was denied: {e}") from e
    return get_approval_message(approval_id)

@application()
@function()
def user_authorization_workflow(user_id: str, action: str) -> str:
    approval_id: str = create_approval(user_id, action)
    approval_message: str = wait_approval(approval_id)
    return f"Approval received: {approval_message}"
```

The `wait_approval` function times out after waiting for 5 minutes. The wait can be resumed once the approval is granted by replaying the request using the Request Replay API. The replayed request skips the already completed `create_approval` function call and re-executes the `wait_approval` function call, which can now complete successfully.

## Comparison with Temporal

Both Tensorlake and Temporal provide durable execution, but they achieve it through different architectures.
Temporal relies on event history replay, whereas Tensorlake saves and retrieves function outputs and matches function calls using their fingerprints. This removes many constraints that Temporal imposes on application code.

| Feature | Tensorlake Applications | Temporal |
| :------------------------ | :---------------------- | :------- |
| **User Code Constraints** | **Adaptive.** By default, a replayed request can change its function calls. | **Strict Determinism.** Workflow logic must be perfectly deterministic or replay crashes. |
| **Handling Code Updates** | **Adaptive.** By default, Tensorlake adapts to new code. New function calls execute normally, and removed function calls are ignored. No special versioning logic is required. | **Complex.** Requires explicit "Versioning" logic (`workflow.patched()`) or creating new task queues to prevent "Non-Determinism Errors" when replays encounter new code. |
| **History Limits** | **Unlimited.** There are no event history size limits. You can have infinite loops or long-running applications without resetting execution state. | **Limited.** Event history has hard size limits (typically 50K events). Large loops or long-running workflows must use "Continue-As-New" to truncate history. |
| **Replay Behavior** | **Adaptive.** By default, if the code execution path deviates, Tensorlake simply executes the new path while reusing cached outputs where possible. | **Strict.** If the code execution path deviates from the saved history, the workflow fails (Block/Retry loop). |
| **Code Failures** | **Fails Fast.** If a function fails and runs out of retries, the request fails immediately, allowing you to debug and [Replay](#request-replay-api) it later when fixed. | **Blocks and Retries.** If a workflow task fails (e.g., a bug in logic), it blocks and retries indefinitely until fixed. |
| **Code Structure** | **Flexible.** You can structure your application code freely, using any programming constructs without worrying about replay constraints. | **Constrained.** You must split the code into workflows and activities and use them carefully to avoid non-determinism and ensure replayability. |
| **Strict Replay Mode** | **Available.** You can enable strict replay mode to enforce exact function calls matching during replays to avoid non-determinism. | **Available.** Temporal always enforces strict determinism in workflow code. |
| **Non-durable Functions** | **Supported.** You can disable durability for specific functions that must always run fresh on replays or retries. | **Not Supported.** All external data must be recorded in history. Retrieving fresh data during replay is generally forbidden to prevent non-determinism. |

## Next

* Walkthrough: how durable execution survives mid-agent crashes and partial fan-out failures.
* Configure retry counts, validation-driven retries, and concurrency caps.
* Bounded function timeouts with progress-update heartbeats.
* Try/except patterns, future failures, and graceful degradation.

# Error handling

Source: https://docs.tensorlake.ai/applications/error-handling

How errors propagate in Tensorlake Applications: function exceptions, timeouts, retries, and patterns for building resilient agentic workflows.

Agentic applications interact with unreliable dependencies (LLMs, tools, external APIs). This guide explains how errors propagate in Tensorlake Applications and common patterns for building resilient workflows on the Agentic Runtime.
## How failures propagate

* **A function can fail** by raising an exception or timing out (see [Timeouts](/applications/timeouts)).
* **If an exception is not handled**, it bubbles up to the caller and can fail the overall request. Failed requests can be re-run with the [Replay API](/applications/durability#request-replay-api), and previously successful nested calls are served from [checkpoints](/applications/durability) instead of re-executing.
* **Retries** can be configured per-function or at the application level. See [Retries & Rate Limits](/applications/retries).
* **Mid-loop crashes** in long-running agents are covered in [Crash Recovery](/applications/crash-recovery).

## Pattern: catch errors and continue

Use `try/except` inside your application to decide whether to fail the request or degrade gracefully.

```python theme={null}
from tensorlake.applications import application, function

@function()
def call_tool(x: str) -> str:
    # e.g., LLM/tool/API call that can fail
    raise RuntimeError("tool failed")

@application()
@function()
def workflow(user_input: str) -> dict:
    try:
        tool_output = call_tool(user_input)
        return {"status": "ok", "tool_output": tool_output}
    except Exception as e:
        # Decide how your agent/workflow should behave on failure
        return {"status": "degraded", "error": str(e)}
```

## Pattern: retries for flaky dependencies

Retries are a good fit for transient failures (timeouts, 429s, temporary upstream errors). Configure them on the function (or set defaults on the application).

```python theme={null}
from tensorlake.applications import function, Retries

@function(retries=Retries(max_retries=3))
def flaky_step() -> str:
    ...
```

## Futures: handling parallel failures

When using [Futures](/applications/futures), errors are raised when you call `.result()`:

```python theme={null}
from tensorlake.applications import application, function, Future

@function()
def maybe_fails() -> str:
    raise RuntimeError("boom")

@application()
@function()
def parallel_work() -> str:
    fut: Future = maybe_fails.future().run()
    try:
        return fut.result()
    except Exception as e:
        return f"handled: {e}"
```

## Debugging tips

* **Start by reproducing locally**: Tensorlake Applications run as normal Python functions locally. See [Testing locally](/applications/quickstart#testing-locally).
* **Add structured logs**: log inputs/outputs (excluding secrets) so you can diagnose failures.
* **Make side effects idempotent**: if a function can retry, avoid double-charging or double-writing.

## Related Guides

* Configure auto-retries for transient failures and structured-output validation.
* Resume long agent loops from the failed step instead of restarting.
* Replay API, adaptive vs. strict modes, and how checkpoints survive failures.
* Per-function deadlines and progress-update heartbeats.

# Futures

Source: https://docs.tensorlake.ai/applications/futures

Use Futures to run multiple function calls in parallel to optimize resource usage and reduce latency.

A Future object defines, runs, and tracks the execution of a function call or another operation like map or reduce. It is created using the `function.future` factory, i.e. calling `my_function.future(1, 2, 3)` returns a Future object for the `my_function(1, 2, 3)` function call. The Future doesn't start running until it's started with its `.run()` or `.result()` methods, used as a function call argument, or returned from a function. The `.result()` method blocks until the Future completes and returns the value returned by the function call or raises an exception on failure.
```python theme={null}
from tensorlake.applications import application, function, Future

@application()
@function()
def my_application(name: str) -> str:
    # Creates a Future object for the `capitalize(name)` function call and runs it immediately.
    # `Future.run()` blocks the calling function to start the function call, not to finish it.
    capitalized_name_future: Future = capitalize.future(name).run()
    # `Future.result()` blocks until the `capitalize` function call completes.
    # It returns the value returned by the function call or raises an exception on failure.
    capitalized_name: str = capitalized_name_future.result()
    return f"Hello, {capitalized_name}!"

@function()
def capitalize(text: str) -> str:
    return text.upper()
```

The main purpose of Futures is to allow running multiple function calls in parallel and getting their results later. This allows building applications that can process multiple independent tasks concurrently, reducing overall latency. The class method `Future.wait(futures: Iterable[Future])` can be used to wait for multiple Futures to complete. See more details at [waiting for multiple Futures to complete](#waiting-for-multiple-futures-to-complete).

### Example: Running multiple function calls in parallel

```python theme={null}
from tensorlake.applications import application, function, Future, RETURN_WHEN

@function()
def capitalize(text: str) -> str:
    return text.upper()

@application()
@function()
def greet(name: str) -> str:
    # Start two function calls in parallel.
    capitalized_name: Future = capitalize.future(name).run()
    joke: Future = make_joke.future(name).run()
    # Wait for both function calls to complete.
    Future.wait([capitalized_name, joke], return_when=RETURN_WHEN.ALL_COMPLETED)
    # Call `say_hello_and_say_joke` with the values returned by both function calls.
    # Block until `say_hello_and_say_joke` completes and return its return value.
    return say_hello_and_say_joke(capitalized_name.result(), joke=joke.result())

@function()
def say_hello_and_say_joke(name: str, joke: str) -> str:
    return f"Hello, {name}! Here's a joke for you: {joke}"

@function()
def make_joke(name: str) -> str:
    return f"Why did {name} cross the road? To get to the other side!"
```

### Example: Non-blocking map and reduce operations

Use `function.future.map(...)` and `function.future.reduce(...)` to create Futures for map and reduce operations. The arguments of these methods are the same as for `function.map(...)` and `function.reduce(...)`, described on the [Map-Reduce](/applications/map-reduce) page.

```python theme={null}
from tensorlake.applications import application, function, Future

@application()
@function()
def process_numbers(numbers: list[int]) -> int:
    # Start a map operation to double the numbers in parallel with another function call.
    doubled_numbers: Future = double_number.future.map(numbers).run()
    # Start another function call in parallel and keep its Future.
    log_future: Future = log_processing.future(len(numbers)).run()
    # Wait for the map operation to complete and get the doubled numbers.
    doubled_numbers_result: list[int] = doubled_numbers.result()
    # Make sure that log_processing call is completed.
    log_future.result()
    # Start a reduce operation to sum the doubled numbers and return its result.
    return sum.future.reduce(doubled_numbers_result).result()

@function()
def double_number(number: int) -> int:
    return number * 2

@function()
def sum(a: int, b: int) -> int:
    return a + b

@function()
def log_processing(count: int) -> None:
    print(f"Processing {count} numbers")
```

### Waiting for multiple Futures to complete

The `Future.wait` class method can be used to wait for multiple Futures to complete. This class method is inspired by the standard `concurrent.futures.wait` in Python.
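For readers who have not used the standard-library function, the following minimal sketch shows how `concurrent.futures.wait` behaves. It uses only the Python standard library, not the Tensorlake SDK; the `slow_square` helper and the delays are made up for illustration, and Tensorlake's `Future.wait` may differ in details.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def slow_square(x: int, delay: float) -> int:
    """Hypothetical helper: sleeps for `delay` seconds, then returns x squared."""
    time.sleep(delay)
    return x * x


def first_finished() -> tuple[int, int, int]:
    """Runs two tasks and waits only until the first one finishes."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        fast = pool.submit(slow_square, 3, 0.01)
        slow = pool.submit(slow_square, 4, 0.5)
        # Returns as soon as at least one future has finished,
        # splitting the inputs into finished and unfinished sets.
        done, not_done = wait([fast, slow], return_when=FIRST_COMPLETED)
        return fast.result(), len(done), len(not_done)
```

The standard-library call returns the two groups as sets; the design point to take away is the split into finished and unfinished futures controlled by `return_when`.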
The full signature of `Future.wait` is:

```python theme={null}
from tensorlake.applications import Future, RETURN_WHEN

Future.wait(
    futures: Iterable[Future],
    timeout: float|None = None,
    return_when=RETURN_WHEN.ALL_COMPLETED
) -> tuple[list[Future], list[Future]]
```

* `futures`: An iterable of Future objects to wait for.
* `timeout`: An optional timeout in seconds. If specified, the method will return after the timeout even if not all Futures have completed.
* `return_when`: A flag indicating when to return. It can be one of the following values from the `RETURN_WHEN` enum:
  * `RETURN_WHEN.ALL_COMPLETED`: Wait until all Futures have completed.
  * `RETURN_WHEN.FIRST_COMPLETED`: Wait until at least one Future has completed.
  * `RETURN_WHEN.FIRST_EXCEPTION`: Wait until at least one Future has raised an exception or all have completed.

The method returns a tuple of two lists: `(done, not_done)`, where `done` is a list of Futures that have completed, and `not_done` is a list of Futures that have not completed yet. If a Future is not running yet, it's started automatically when passed to `Future.wait`.

### Future object

A Future object has the following methods and properties:

* `exception -> TensorlakeError|None`: If the function call or another operation associated with this Future failed, this property returns the exception associated with the failure. Otherwise, it returns `None`. If the operation is not yet complete, this property also returns `None`.
* `result(timeout: float|None = None) -> Any`: Blocks until the operation completes and returns the result of the operation (i.e. the value returned by the function call). If the operation fails, a `FunctionError` is raised. See more about [error handling](/applications/error-handling). An optional timeout in seconds can be specified. If the timeout is reached before the Future completes, a `TimeoutError` is raised.
* `done() -> bool`: Returns `True` if the operation has completed (either successfully or with an exception), otherwise returns `False`.
* `run() -> Future`: Starts the Future's operation. Returns the same Future object for chaining. A Future that hasn't been started with `.run()` will be started automatically when passed as another operation input or returned as a [tail call](#tail-calls).
* `__await__() -> Generator[Any]`: Allows awaiting the Future in async functions. This is equivalent to calling `.result()`, but the call will not block the async event loop.
* `coroutine() -> Coroutine`: Converts the Future into a coroutine that can be used the same way as any coroutine returned by an async Tensorlake function. Returns the same coroutine object if called multiple times on the same Future. Can only be called before a Future is started with `.run()`.

### Passing Futures as inputs

Futures can be passed as arguments to function calls. When a Future gets passed this way, Tensorlake automatically runs it if not running, waits for the Future to complete and uses its result as the function call argument value. This allows building applications that can run multiple function calls in parallel without blocking on their results until it's necessary.

```python theme={null}
from tensorlake.applications import application, function, Future

@function()
def double(x: int) -> int:
    return x * 2

@function()
def add(a: int, b: int) -> int:
    return a + b

@application()
@function()
async def my_app(x: int) -> int:
    a: Future = double.future(x)
    b: Future = double.future(x + 1)
    # Pass Futures as function call arguments. Tensorlake runs both Futures in parallel,
    # waits for them to complete, and uses their results as the arguments for `add`.
    return add(a, b)
```

All input futures that don't depend on each other run in parallel, allowing Tensorlake to optimize resource usage and reduce overall application latency.
A function call or a map-reduce operation is only blocked while its input Futures are running. Once all input Futures complete, Tensorlake automatically runs the function call or the map-reduce operation.

#### Wrapping Futures into Python objects is not allowed

When passing Futures as arguments to function calls, or returning them as tail calls, the Futures cannot be wrapped into other Python objects. For example:

```python theme={null}
from tensorlake.applications import application, function, Future

@function()
def capitalize(text: str) -> str:
    return text.upper()

@application()
@function()
def my_application(name: str) -> list[str]:
    capitalized_name: Future = capitalize.future(name)
    names: list[str | Future] = [capitalized_name, name]
    # Passing Python list with a Future as an argument here is not allowed.
    # Tensorlake will not recognize the Future wrapped into the list
    # and will not run it or wait for it to complete.
    return concat(names)

@function()
def concat(strings: list[str]) -> str:
    return "".join(strings)
```

Map and reduce operations accept a Future or a list as input items. If a list is passed then the Futures in the list are recognized by Tensorlake and run automatically.

### Tail calls

When a Tensorlake function calls another Tensorlake function or calls `future.result()`, the calling function blocks until the function call or the future completes and returns its result. Applications that make many such calls can face multiple challenges:

1. **Wasted Resources**: While waiting for the result, the calling function container cannot perform other tasks while still consuming its compute resources.
2. **Higher Resource Usage**: More function containers are required to handle the same number of concurrent application requests if each request blocks multiple function containers.
3. **Higher Latency**: Sequential blocking function calls or `future.result()` calls can lead to increased overall latency, especially when multiple function calls are involved.

To address these challenges, Tensorlake introduced **Tail Calls**. A function makes a tail call when it returns a Future. The result of the Future, when available, becomes the return value of the function. Once the Future is returned, it immediately starts running and frees the calling function container to process the next tasks. This allows building applications that can run multiple function calls in parallel without blocking on their results until it's necessary, significantly reducing overall latency and resource usage.

With tail calls the example `greet(...)` application doesn't have to wait for completion of any of its function calls. `greet(...)` returns almost immediately after telling Tensorlake what it needs to do for the request. `greet(...)` then frees its container to process another request while Tensorlake orchestrates the execution in the most efficient way possible. Once all function calls complete, Tensorlake returns the final result to the user.

```python theme={null}
from tensorlake.applications import application, function, Future

@function()
def capitalize(text: str) -> str:
    return text.upper()

@function()
def make_joke(name: str) -> str:
    return f"Why did {name} cross the road? To get to the other side!"

@function()
def say_hello_and_say_joke(name: str, joke: str) -> str:
    return f"Hello, {name}! Here's a joke for you: {joke}"

@application()
@function()
def greet(name: str) -> str:
    # Returns a future for `say_hello_and_say_joke(capitalize(name), make_joke(name))` function call.
    # This is a tail call. `greet` doesn't block waiting for any of the function calls to complete.
    # Once returned Tensorlake will run the Future and use `say_hello_and_say_joke` return value as the
    # return value of `greet`. The `say_hello_and_say_joke` function call will run as soon as both its
    # arguments are available. Both arguments are computed in parallel because they don't depend on each other.
    capitalized_name: Future = capitalize.future(name)
    joke: Future = make_joke.future(name)
    return say_hello_and_say_joke.future(capitalized_name, joke=joke)
```

Same as with input futures, wrapping a Future returned from a function into another Python object is not allowed. For example, returning a list with a Future inside is not allowed. Tensorlake will not recognize the Future wrapped into the list.

## See Also

* Learn how to use async functions in Tensorlake applications.
* Learn how to use map-reduce operations to run function calls in parallel and aggregate results.

# Logging

Source: https://docs.tensorlake.ai/applications/guides/logging

Emit logs from Tensorlake applications with `print`, the built-in application logger, or structlog for structured JSON logs that are easier to analyze and visualize.

You can print logs in your Tensorlake application to help you debug and monitor your application's behavior. Logs can be printed using the `print` function or by using a logging library such as `logging` or `structlog`.

We recommend using structured logs for better analysis and visualization. They are usually JSON objects that contain key-value pairs, making them easier to parse. The following guide will help you configure `structlog` to take full advantage of structured logs in Tensorlake.

## Adding Structured Logs to Your Application

### Using Tensorlake's built-in application logger

The Tensorlake SDK provides a built-in application logger that outputs messages in a predefined JSON format. This logger is designed to be easy to use and provides a simple way to log messages with structured data.
To initialize it, you need to import the `Logger` class from the `tensorlake.applications` module and use the `get_logger` method:

```python theme={null}
from tensorlake.applications import Logger

logger = Logger.get_logger(module="my_app")
```

Then you can use it to log messages with structured data:

```python theme={null}
@application()
@function(description="An example of logging in Tensorlake")
def logging_example(name: str) -> str:
    logger.info("User logged in", user_id=123)
    return f"Hello, {name}. This is a logging example!"
```

You can also log exceptions as structured data by using the `exc_info=True` parameter:

```python theme={null}
@application()
@function(description="An example of logging in Tensorlake")
def logging_example(name: str) -> str:
    try:
        # some code that may raise an exception
        ...
    except Exception:
        logger.error("An error occurred", exc_info=True)
    return f"Hello, {name}. This is a logging example!"
```

Finally, if you need to bind additional context to your logs, you can use the `bind()` method:

```python theme={null}
@application()
@function(description="An example of logging in Tensorlake")
def logging_example(name: str) -> str:
    # Use a new local name: reassigning `logger` here would shadow the module-level logger.
    log = logger.bind(user_id=123)
    log.info("User logged in")
    log.debug("Debug message")
    return f"Hello, {name}. This is a logging example!"
```

### Using a custom StructLog configuration

If you don't want to use Tensorlake's built-in application logger, you can use [structlog](https://www.structlog.org/en/stable/) to add structured logs to your application. Structlog is a Python library that provides a simple and flexible way to create structured logs.
To configure structlog to print JSON logs, including stack traces, we recommend using the following code:

```python theme={null}
import structlog

structlog.configure(
    processors=[
        structlog.stdlib.add_log_level,  # Add log level
        structlog.processors.TimeStamper(fmt="iso", key="timestamp", utc=True),  # Add timestamp in RFC3339 format
        structlog.processors.StackInfoRenderer(),  # Add stack info for exceptions
        structlog.processors.dict_tracebacks,  # Formats exception info
        structlog.processors.JSONRenderer(),  # Render the log entry as JSON
    ],
    cache_logger_on_first_use=True,
)
```

Before you start printing any logs in your library, you need to initialize the logger with the previous configuration. You can do this by calling the `structlog.get_logger()` function:

```python theme={null}
logger = structlog.get_logger("my-tensorlake-application")
```

After initializing the logger, you can start printing logs using the `logger` object inside your application. The next example puts all the code together:

```python theme={null}
import structlog
from tensorlake.applications import (
    application,
    function,
)

# Configure structlog to output in JSON format
structlog.configure(
    processors=[
        structlog.stdlib.add_log_level,  # Add log level
        structlog.processors.TimeStamper(fmt="iso", key="timestamp", utc=True),  # Add timestamp in RFC3339 format
        structlog.processors.StackInfoRenderer(),  # Add stack info for exceptions
        structlog.processors.dict_tracebacks,  # Formats exception info
        structlog.processors.JSONRenderer(),  # Render the log entry as JSON
    ],
    cache_logger_on_first_use=True,
)

# Create a logger instance
logger = structlog.get_logger("logging_example")

@application()
@function(description="An example of logging in Tensorlake")
def logging_example(name: str) -> str:
    logger.info("Logging example started", status="started")
    logger.debug("Debugging the payload", name=name)
    logger.warning("The application is about to crash")
    try:
        1 / 0
    except ZeroDivisionError:
        logger.error("Division by zero error", exc_info=True)
    return f"Hello, {name}. This is a logging example!"
```

### Setting levels for your logs

By default, when you print any information with `print` in your application, we assign the level `INFO` to those logs. Tensorlake supports the 5 standard levels of logging: `TRACE`, `DEBUG`, `INFO`, `WARNING`, and `ERROR`. These levels are represented with numbers from Trace(1) to Error(5).

Our built-in application logger, as well as Structlog, provides helpers that will set the log level for you directly, like `logger.debug` and `logger.warning`.

To set the logging level manually, you have to print JSON objects that include a `level` attribute. We take the string representation of these levels from the JSON objects and transform them into our internal representation:

```python theme={null}
@application()
@function(description="An example of logging in Tensorlake")
def manual_log_level_example(_name: str):
    print('{"level": "DEBUG", "message": "Debugging the payload"}')
```

## Log retention

By default, all application logs are retained for 7 days. This retention period can be increased to 30 days or 1 year maximum. If you want to increase the retention period contact Tensorlake support at `support@tensorlake.ai`.

## Visualizing the logs in Tensorlake's Dashboard

The logs that you print in your applications can be visualized in each application page of the [Tensorlake's Dashboard](https://cloud.tensorlake.ai). That page allows you to filter logs by different parameters, like request IDs, function names, and logging levels.

## Get Application logs via API

Application logs are also accessible via the Tensorlake API. You can use `curl` or any other HTTP client to retrieve logs for your application.
The following section explains how to do that:

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

**Response:**

```json theme={null}
{
  "logs": [
    {
      "timestamp": 1717171717171717171,
      "uuid": "550e8400-e29b-41d4-a716-446655440000",
      "namespace": "my-namespace",
      "application": "my-application",
      "body": "Processing started for item 1",
      "level": 3,
      "logAttributes": "{\"level\": \"info\"}"
    }
  ],
  "nextToken": "1717171717171717172.550e8400-e29b-41d4-a716-446655440000"
}
```

### Filtering Logs

You can filter logs using query parameters to narrow down results:

**Filter by Request ID**

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?requestId={request_id}" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

**Filter by Function Name**

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?function={function_name}" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

**Filter system events out**

By default, we add system and application events to the logs, so you can keep track of the lifecycle of your requests. Use `events` if you want to filter out system events:

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?events=3" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

**Filter by log levels**

Log levels are identified by numbers from Trace(1) to Error(5). By default, we show logs for all levels. These are all the possible values for the different levels:

1. Trace
2. Debug
3. Info
4. Warning
5. Error

If you want to learn how to set these log levels, check out our [Logging reference](/applications/guides/logging).
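When building such requests in code, the numeric levels listed above can be looked up by name. The following is a minimal sketch using only the Python standard library; the `LOG_LEVELS` mapping and the `build_logs_url` helper are hypothetical names introduced here for illustration, while the endpoint and the repeatable `level` query parameter come from this page.

```python
from urllib.parse import quote, urlencode

# Level numbers as listed on this page: Trace(1) through Error(5).
LOG_LEVELS = {"TRACE": 1, "DEBUG": 2, "INFO": 3, "WARNING": 4, "ERROR": 5}

API_BASE = "https://api.tensorlake.ai"


def build_logs_url(application: str, level_names: list[str]) -> str:
    """Hypothetical helper: builds a logs URL filtered by one or more levels."""
    levels = [LOG_LEVELS[name.upper()] for name in level_names]
    # doseq=True repeats the parameter once per value, e.g. level=2&level=3.
    query = urlencode({"level": levels}, doseq=True)
    return f"{API_BASE}/applications/{quote(application, safe='')}/logs?{query}"
```

Calling `build_logs_url("my-application", ["debug", "info"])` yields a URL ending in `logs?level=2&level=3`. The request itself still needs the `Authorization: Bearer` header used in the curl examples.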
```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?level={level}" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

**Combine Filters**

Use the `gate` parameter to combine multiple filters with AND (default) or OR logic:

```bash theme={null}
# Get logs matching BOTH request ID AND function name
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?requestId={request_id}&function={function_name}&gate=and" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"

# Get logs matching EITHER request ID OR function name
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?requestId={request_id}&function={function_name}&gate=or" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

### Pagination and Ordering

**Get Most Recent Logs (Default)**

By default, logs are returned in descending order (newest first). Use `tail` to specify the number of logs:

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?tail=50" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

**Get Oldest Logs First**

Use `head` to get logs in ascending order (oldest first):

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?head=50" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

**Paginate Through Logs**

Use the `nextToken` from the response to fetch the next page:

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?nextToken={next_token}" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

### Query Parameters Reference

| Parameter          | Type          | Description                                                 |
| ------------------ | ------------- | ----------------------------------------------------------- |
| `requestId`        | String        | Filter logs for specific request IDs                        |
| `function`         | String        | Filter logs for specific function names                     |
| `functionExecutor` | String        | Filter logs for specific function executor containers       |
| `functionRunId`    | String        | Filter logs for specific function runs                      |
| `allocationId`     | String        | Filter logs for specific allocations                        |
| `level`            | Integer       | Filter logs for specific log levels                         |
| `events`           | Integer       | Filter system and application events                        |
| `gate`             | `and` \| `or` | Logic for combining multiple filters (default: `and`)       |
| `head`             | Integer       | Number of logs to return in ascending order (default: 100)  |
| `tail`             | Integer       | Number of logs to return in descending order (default: 100) |
| `nextToken`        | String        | Pagination token from previous response                     |

The parameters that filter logs (`requestId`, `function`, `functionExecutor`, `functionRunId`, `allocationId`, and `level`) can each be repeated. If you add more than one parameter with the same name, Tensorlake searches for all of the given values, using the `gate` parameter as the connector. For example, filtering DEBUG and INFO logs:

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?level=2&level=3" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

# Progress Updates

Source: https://docs.tensorlake.ai/applications/guides/streaming-progress

Stream real-time progress updates from functions

## Progress API

### Getting the Request Context

First, get access to the request context in your function:

```python theme={null}
from tensorlake.applications import RequestContext, function

@function()
def my_function(data: str) -> str:
    # Get the current request context
    ctx = RequestContext.get()
    # Now you can use ctx.progress.update()
    ctx.progress.update(1, 10, "Starting processing...")
    return "done"
```

### Method: `progress.update()`

Stream progress updates to monitoring systems and frontends.
```python theme={null} ctx.progress.update( current: int | float, total: int | float, message: str | None = None, attributes: dict[str, str] | None = None ) ``` **Parameters:** | Parameter | Type | Required | Description | | ------------ | ------------------------ | -------- | ------------------------------------- | | `current` | `int \| float` | ✅ Yes | Current step or percentage complete | | `total` | `int \| float` | ✅ Yes | Total steps or 100 for percentage | | `message` | `str \| None` | ❌ No | Human-readable progress message | | `attributes` | `dict[str, str] \| None` | ❌ No | Additional metadata (key-value pairs) | ### Basic Usage **Simple progress tracking:** ```python theme={null} @function() def process_items(items: list) -> dict: ctx = RequestContext.get() for i, item in enumerate(items): # Update progress: current step, total steps ctx.progress.update(i + 1, len(items)) process(item) return {"processed": len(items)} ``` **With a message:** ```python theme={null} @function() def multi_step_workflow() -> str: ctx = RequestContext.get() ctx.progress.update(1, 3, "Fetching data from API...") data = fetch_data() ctx.progress.update(2, 3, "Processing data...") processed = process_data(data) ctx.progress.update(3, 3, "Storing results...") store_results(processed) return "complete" ``` **With additional metadata:** ```python theme={null} @function() def batch_processor(items: list) -> dict: ctx = RequestContext.get() errors = 0 for i, item in enumerate(items): try: process(item) except Exception: errors += 1 # Include metadata about the processing ctx.progress.update( current=i + 1, total=len(items), message=f"Processing item {i + 1}", attributes={ "error_count": str(errors), "success_rate": f"{((i + 1 - errors) / (i + 1) * 100):.1f}%" } ) return {"total": len(items), "errors": errors} ``` ### Using Percentages You can use percentages instead of step counts: ```python theme={null} @function() def long_operation() -> str: ctx = RequestContext.get() # 0-100 scale 
ctx.progress.update(0, 100, "Starting...") # 25% complete ctx.progress.update(25, 100, "Quarter way through...") # 50% complete ctx.progress.update(50, 100, "Halfway done...") # 100% complete ctx.progress.update(100, 100, "Finished!") return "done" ``` ## Consuming Progress Streams Progress updates are available through the Tensorlake API in real-time. ### Polling for Progress Updates ```bash theme={null} # Get progress updates for a specific request curl -X GET \ "https://api.tensorlake.ai/applications/{application}/requests/{request_id}/progress" \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" ``` **Response:** ```json theme={null} { "current": 45, "total": 100, "message": "Processing batch 3 of 10", "attributes": { "batch_id": "batch_003", "records_processed": "4500" }, "timestamp": 1704067200000 } ``` ## Learn More Full context API reference. # Container Images Source: https://docs.tensorlake.ai/applications/images Define per-function container images declaratively with Tensorlake's `Image` API — set the base image, install Python and system packages, and customize per-deploy. Tensorlake functions run in function containers. To install dependencies in the containers, we use container images that are built when you deploy an application. Functions can use any Python or system packages installed into their container images. Tensorlake provides a declarative API to define function container images with their dependencies. ## Defining Images An image is defined using an `Image` object. You can modify the base image, run commands to install dependencies at build time, and modify other image attributes, like its name. 
```python theme={null}
from tensorlake.applications import Image

image = (
    Image(
        name="my-pdf-parser-image",
        base_image="ubuntu:24.04",
    )
    .run("apt update")
    .run("pip install langchain")
)
```

```python theme={null}
from tensorlake.applications import function

@function(image=image)
def parse_pdf(pdf_path: str) -> str:
    import langchain
    # All the packages installed in the image are available inside the function.
    # They need to be imported here because they might not be available
    # in the Python environment used to deploy the application.
    ...
```

#### Default Base Image

We use a Debian-based image, `python:{LOCAL_PYTHON_VERSION}-slim-bookworm`, as the default. `LOCAL_PYTHON_VERSION` represents the Python version in your current Python environment.

## See Also

End-to-end example of using custom images for structured extraction from images.

# Introduction

Source: https://docs.tensorlake.ai/applications/introduction

Add serverless orchestration to any agent

Orchestrate is a serverless runtime for adding data orchestration capabilities to agents. You can build orchestration APIs without deploying containers, workers, or queues. Functions start running when they are called and scale down to zero after finishing work.

Some use cases are:

1. Creating multi-stage tools that need to be retried until they complete.
2. Data ingestion workflow APIs.
3. Scaling out processing using distributed map and reduce.

```python theme={null}
from tensorlake.applications import application, function

@function()
def summarize(doc: str) -> str:
    # call_llm is a placeholder for your own LLM client call.
    summary = call_llm(doc)
    return summary

@application()
@function()
def summarize_files(docs: list[str]) -> list[str]:
    # Map summarize over the documents in parallel.
    summaries = summarize.map(docs)
    return summaries
```

1. Tensorlake functions are units of compute that are executed in a sandbox and retried based on a user-provided retry policy.
2. Functions decorated with `@application()` become callable from external systems and are exposed as HTTP APIs.
3. Function calls are automatically and durably queued when there is not enough compute to handle the requests.
4. Each function's inputs and outputs are durably checkpointed so they can be retried.
5. Every function can have different resource requirements, making it possible to allocate more resources to functions that are more compute- or memory-intensive.

### Quickstart

Let's build a simple application that greets a user by name.

```bash theme={null}
pip install tensorlake
```

You can get an [API key](/platform/authentication#api-keys) from the Tensorlake Dashboard.

```bash theme={null}
export TENSORLAKE_API_KEY=
```

Applications are defined by Python functions. Let's start with a template that greets a user by name.

```bash theme={null}
tl new hello_world
```

This creates a file named `hello_world/hello_world.py` with the following content:

```python hello_world.py theme={null}
from tensorlake.applications import application, function

@application()
@function()
def greet(name: str) -> str:
    return f"Hello, {name}!"
```

Deploy your application by referencing its source file.

```bash theme={null}
tl deploy hello_world/hello_world.py
```

## Invoke Orchestrate Functions

Orchestrate endpoints can be invoked using HTTP requests or the Python SDK.

### HTTP Endpoint

```
https://api.tensorlake.ai/applications/
```

```bash bash theme={null}
curl https://api.tensorlake.ai/applications/hello_world \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  --json '"John"'
# {"request_id":"beae8736ece31ef9"}
```

```python python theme={null}
from tensorlake.applications import run_remote_application, Request

request: Request = run_remote_application(greet, 'John')
print(request.id)
# "beae8736ece31ef9"
```

This returns a request ID that you can use to track the progress of your request. Requests may run for seconds to hours, depending on your workload.

```bash bash theme={null}
curl -X GET https://api.tensorlake.ai/applications/hello_world/requests/{request_id} \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
# {
#   "id":"B0IwzHibTTfn5mCXHPGsu",
#   "outcome":"success",
#   "failure_reason":null,
#   "request_error":null,
#   .... other fields ...
#}
```

```python python theme={null}
# You don't need to poll for request completion. Retrieving the output will wait for the request to complete.
```

The `outcome` field will be `success` or `failure` depending on whether the request completed successfully. It will be null if the request is still in progress.

```bash bash theme={null}
curl -X GET https://api.tensorlake.ai/applications/hello_world/requests/{request_id}/output \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
# "Hello, John!"
```

```python python theme={null}
from tensorlake.applications import run_remote_application, Request

request: Request = run_remote_application(greet, 'John')
output: str = request.output()
print(output)
# "Hello, John!"
```

## Testing Locally

Tensorlake Applications can run locally on your laptop. You can run them like regular Python scripts.

```python hello_world.py theme={null}
# At the end of the file
from tensorlake.applications import run_local_application, Request

if __name__ == "__main__":
    request: Request = run_local_application(greet, 'John')
    output: str = request.output()
    print(output)
    # "Hello, John!"
```

Deploying agents that start complex workflows or multi-stage tool calls requires building complex distributed systems with queues, workers, or orchestration engines. This takes time away from building the agentic logic. Orchestrate solves this problem by letting you write orchestration endpoints while it handles the coordination of functions, retries, and autoscaling.

## Examples

Multi-agent research pipeline with parallel web search and report synthesis using OpenAI Agents SDK.

Execute LLM-generated code safely in isolated containers with data science libraries.
Claude agentic loop that chains tool calls, each running in its own isolated container.

Parse bank statements, categorize transactions, and answer spending questions with Claude.

Serverless web crawler that scrapes websites N levels deep using headless Chrome.

Conversational weather agent powered by Claude, deployed as an HTTP API.

## Next Steps

Follow our quick start guide to build and deploy a serverless agentic code interpreter in under 5 minutes.

# Map-Reduce

Source: https://docs.tensorlake.ai/applications/map-reduce

Use map-reduce patterns in Tensorlake Applications to parallelize large-scale ETL — apply a function across items and aggregate the results.

Tensorlake Applications support *Map-Reduce* for large-scale ETL. **Map** is the process of applying a function to each item of a list in parallel. **Reduce** is the process of aggregating the results of the map phase. The example below visualizes mapping a list of numbers to their squares and reducing the results by summing the squares:

```mermaid theme={null}
flowchart TD
    inputs(1, 2, 3, 4, 5)
    map(square.map)
    inputs --> map
    square1(1)
    map --> square1
    square2(4)
    map --> square2
    square3(9)
    map --> square3
    square4(16)
    map --> square4
    square5(25)
    map --> square5
    reduce1("sum(1, 4)")
    square1 --> reduce1
    square2 --> reduce1
    reduce2("sum(5, 9)")
    reduce1 --> reduce2
    square3 --> reduce2
    reduce3("sum(14, 16)")
    reduce2 --> reduce3
    square4 --> reduce3
    reduce4("sum(30, 25)")
    reduce3 --> reduce4
    square5 --> reduce4
    final_result(55)
    reduce4 --> final_result
```

Tensorlake automatically parallelizes function calls across multiple function containers when you map a function to a list. The reducer function is applied to each pair of mapped values sequentially, in their original order in the list. Tensorlake runs each reduce function call as soon as its input values are available.
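
To make the reduce ordering concrete, here is a minimal sketch in plain Python of the same square-then-sum computation. It uses only the standard library (`map`, `functools.reduce`, `itertools.accumulate`) and does not involve Tensorlake or any parallelism. The names `square` and `sum_total` are illustrative, and the accumulator is assumed to be a plain `int`. It mirrors only the sequential, in-order reduce semantics described above, not the distributed execution.

```python
from functools import reduce
from itertools import accumulate


def square(number: int) -> int:
    return number ** 2


def sum_total(total: int, number: int) -> int:
    # The accumulated value comes first, the next mapped value second.
    return total + number


numbers = [1, 2, 3, 4, 5]

# Map: apply square to every item, keeping the original order.
squares = list(map(square, numbers))

# Reduce: fold the mapped values pairwise, left to right.
total = reduce(sum_total, squares)

# accumulate exposes the intermediate result of every reduce step.
running = list(accumulate(squares, sum_total))

print(squares)  # [1, 4, 9, 16, 25]
print(running)  # [1, 5, 14, 30, 55]
print(total)    # 55
```

The intermediate values 5, 14, 30, and 55 correspond to the `sum(...)` nodes in the diagram.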
## Blocking Map-Reduce In the following code example, we calculate the square of each number and once we have all the squares, we sum them. ```python theme={null} from pydantic import BaseModel from tensorlake.applications import application, function class TotalSum(BaseModel): value: int = 0 @application() @function() def sum_squares(total_numbers: int) -> TotalSum: # Blocks until all map calls complete. # The behavior and signature of function.map is very similar to Python's built-in map except it's distributed and parallel. squares: List[int] = square.map([i for i in range(total_numbers)]) # Blocks until all reduce calls complete. # The behavior and signature of function.reduce is very similar to Python's functools.reduce except it's distributed. total: TotalSum = sum_total.reduce(squares, TotalSum(value=0)) return total @function() def square(number: int) -> int: return number ** 2 @function() def sum_total(total: TotalSum, number: int) -> TotalSum: total.value += number # This value will be passed to the next sum_total call as the first argument. # Unless this is the last call, in which case it will be returned as the final # result of the reduce operation. return total ``` ## Non-blocking Map-Reduce In the following code example, we calculate the square of each number and as soon as each square is available, we sum them. This is achieved using [futures and tail calls](/applications/futures#tail-calls). This reduces the overall duration of the Map-Reduce operation. The reduce function is still called sequentially in the original order of the list. ```python theme={null} from pydantic import BaseModel from tensorlake.applications import application, function, Future class TotalSum(BaseModel): value: int = 0 @application() @function() def sum_squares(total_numbers: int) -> TotalSum: # Defines map function calls but doesn't run them. 
squares: Future = square.future.map([i for i in range(total_numbers)]) # Defines reduce function calls that will run as soon as each mapped value is available. # Returns the reduce operation definition as a tail call. Tensorlake will take care of running it. # The final value of the reduce operation will be assigned as the request output like if this function # returns it here. return sum_total.future.reduce(squares, TotalSum(value=0)) @function() def square(number: int) -> int: return number ** 2 @function() def sum_total(total: TotalSum, number: int) -> TotalSum: total.value += number # This value will be passed to the next sum_total call as the first argument. # Unless this is the last call, in which case it will be returned as the final # result of the reduce operation. return total ``` ## Inputs ### List Both map and reduce operations accept a list as operation inputs. Each item in the list can be a value, a Future, a Tensorlake coroutine, or an `asyncio.Task` object. Tensorlake recognizes these Futures/coroutines/`asyncio.Task` objects, runs them automatically, and uses their results as the input values for the operation. ```python theme={null} from tensorlake.applications import application, function, Future @function() def double(number: int) -> int: return number * 2 @function() def sum(a: int, b: int) -> int: return a + b @application() @function() def sum_doubled(numbers: list[int]) -> int: # Reduce operation input is a list of Futures. doubled: list[Future] = [double.future(number) for number in numbers] # sum is called on each pair of doubled values as soon as they are available. return sum.reduce(doubled, 0) ``` ### Future / Coroutine / Task Map and reduce operations accept a single [Future](/applications/futures)/[coroutine/`asyncio.Task`](/applications/async-functions.mdx) object as their input. The Future/coroutine/`asyncio.Task` object has to resolve to a list of items. 
Tensorlake automatically waits for it to complete and uses the returned list as the operation input. This is useful when the input list is produced by another Tensorlake function.

```python theme={null}
from tensorlake.applications import application, function, Future

@function()
def generate_numbers(count: int) -> list[int]:
    return list(range(count))

@function()
def square(number: int) -> int:
    return number ** 2

@function()
def sum(a: int, b: int) -> int:
    return a + b

@application()
@function()
def sum_of_squares(count: int) -> int:
    # generate_numbers returns a list[int] when it completes.
    numbers_future: Future = generate_numbers.future(count)
    # Pass the Future as input to map. Tensorlake waits for generate_numbers
    # to complete and maps square over the returned list.
    squares: Future = square.future.map(numbers_future)
    # Pass the map Future as input to reduce operation.
    return sum.reduce(squares)
```

## Tail calls

Map and reduce operation Futures can be returned from functions as [tail calls](/applications/futures#tail-calls). The returning function completes immediately and frees its container while Tensorlake orchestrates the map-reduce operation. This allows Tensorlake to optimize resource usage and reduce overall application latency.

Learn how to use async functions in Tensorlake applications.

Use Futures for parallel execution and tail calls.

# Observability

Source: https://docs.tensorlake.ai/applications/observability

Built-in tracing, execution timelines and monitoring

Every Tensorlake function call is automatically traced. You get execution timelines, logs, metrics, and error details without configuring any observability infrastructure.

## Execution Timelines

When a request flows through your application, Tensorlake records every function call in an execution timeline. You can see:

* **Function call sequence** — which functions ran and in what order
* **Timing** — how long each function took, including cold start time
* **Dependencies** — which function calls ran in parallel vs. sequentially
* **Status** — success, failure, or retry for each function call

This is available in the [Tensorlake Dashboard](https://cloud.tensorlake.ai/applications) for every application request.

## Structured Logging

Use Python's standard `print()` or `logging` module inside your functions. Logs are captured automatically and associated with the specific function call and request.

```python theme={null}
from tensorlake.applications import function
import logging

logger = logging.getLogger(__name__)

@function()
def process_data(data: str) -> str:
    logger.info(f"Processing {len(data)} characters")
    result = transform(data)
    logger.info(f"Transformation complete, output size: {len(result)}")
    return result
```

Logs are available in the dashboard. See the [Logging guide](/applications/guides/logging) for configuration details.

## Learn More

Structured logging configuration.

# Programming Agents

Source: https://docs.tensorlake.ai/applications/overview

Core concepts and common patterns for running agents on Tensorlake

Tensorlake is a **compute platform for agents** — it runs your agents, it doesn't replace your agent framework. You bring the agent logic (OpenAI Agents SDK, LangGraph, Claude SDK, or plain Python), and Tensorlake provides the infrastructure: serverless containers, durable execution, sandboxes, and observability.

## Patterns

## Agent Loop in a Single Function

The simplest pattern: your entire agent loop runs inside one `@function()`. Tensorlake handles deployment, scaling, and durability.
```python theme={null} from tensorlake.applications import application, function @application() @function(timeout=3600) def research_agent(topic: str) -> str: from agents import Agent, Runner, WebSearchTool agent = Agent( name="ResearchAgent", instructions="Thoroughly research the given topic using web search.", tools=[WebSearchTool()] ) result = Runner.run_sync(agent, topic) return result.final_output ``` This works well for agents that: * Run a single loop with tool calls * Don't need to fan out work to other agents * Have predictable resource requirements ## Sandboxing Functions When your agent calls tools with different resource needs (CPU, memory, GPU, dependencies), wrap each tool in its own `@function()`. Each function runs in its own container with its own resource limits and dependencies. ```python theme={null} from tensorlake.applications import application, function, Image heavy_image = Image().run("pip install torch transformers") @function(image=heavy_image, memory=8, gpu="T4") def classify_image(image_url: str) -> str: """Runs in a GPU container with 8GB memory.""" from transformers import pipeline classifier = pipeline("image-classification") return classifier(image_url)[0]["label"] @function() def search_web(query: str) -> list[str]: """Runs in a lightweight container.""" import requests # Call search API return ["result1", "result2"] @application() @function(timeout=1800) def research_agent(topic: str) -> dict: # Agent loop calls tools that run in separate containers image_label = classify_image("https://example.com/photo.jpg") web_results = search_web(topic) return {"image": image_label, "web": web_results} ``` Each `@function()`: * Runs in its own isolated container * Has its own dependencies, CPU, memory, and GPU allocation * Is independently retryable and durable * Scales independently based on demand ## Harness Pattern: Agent as Orchestrator For complex agents, separate the **harness** (orchestration logic) from the **work** (tool execution). 
The harness is a lightweight function that coordinates heavier worker functions. ```python theme={null} from tensorlake.applications import application, function, Image worker_image = Image().run("pip install openai langchain") @application() @function(timeout=3600) def analyst_agent(query: str) -> dict: """Lightweight harness that orchestrates worker functions.""" from openai import OpenAI client = OpenAI() # Agent decides what to do plan = client.chat.completions.create( model="gpt-4", messages=[{"role": "user", "content": f"Plan research for: {query}"}] ).choices[0].message.content # Dispatch to worker functions data = fetch_data(query) analysis = analyze_data(data) return {"plan": plan, "analysis": analysis} @function(image=worker_image, cpu=4, memory=8) def fetch_data(query: str) -> dict: """Heavy data fetching in a dedicated container.""" ... @function(image=worker_image, cpu=2, memory=4) def analyze_data(data: dict) -> str: """Analysis with different resource needs.""" ... ``` ## Running Agent Frameworks on Tensorlake ### OpenAI Agents SDK ```python theme={null} from tensorlake.applications import application, function @application() @function(timeout=1800) def openai_agent(prompt: str) -> str: from agents import Agent, Runner, WebSearchTool agent = Agent( name="Assistant", instructions="You are a helpful assistant.", tools=[WebSearchTool()] ) result = Runner.run_sync(agent, prompt) return result.final_output ``` ### LangGraph ```python theme={null} from tensorlake.applications import application, function, Image image = Image().run("pip install langgraph langchain-openai") @application() @function(image=image, timeout=1800) def langgraph_agent(query: str) -> str: from langgraph.prebuilt import create_react_agent from langchain_openai import ChatOpenAI model = ChatOpenAI(model="gpt-4") agent = create_react_agent(model, tools=[]) result = agent.invoke({"messages": [("human", query)]}) return result["messages"][-1].content ``` ### Claude SDK ```python 
theme={null} from tensorlake.applications import application, function @application() @function(timeout=3600, ephemeral_disk=4) def claude_agent(prompt: str) -> str: import asyncio from claude_agent_sdk import query, ClaudeAgentOptions async def run(): options = ClaudeAgentOptions( system_prompt="You are an expert developer.", permission_mode="acceptEdits", cwd="/tmp/workspace" ) result = "" async for message in query(prompt=prompt, options=options): result = str(message) return result return asyncio.run(run()) ``` ## Parallel Sub-Agents When your workflow involves multiple specialist agents, fan them out using [futures](/applications/futures) or [async functions](/applications/async-functions) so they run in parallel: ```python theme={null} @application() @function() def analyze_proposal(text: str) -> dict: financial = financial_agent.future(text) legal = legal_agent.future(text) technical = technical_agent.future(text) return synthesize.future(financial, legal, technical) ``` See [Parallel Sub-Agents](/applications/parallel-sub-agents) for detailed patterns. ## Core Concepts **Building blocks of applications.** Functions are Python functions that run in isolated containers with their own dependencies, compute, and storage. **HTTP-triggered entry points.** Applications are functions exposed as HTTP endpoints that receive requests and orchestrate work across multiple functions. **Resume from failures, not restart.** Checkpoints are automatically created so retries continue from the last successful step instead of starting over. **Run untrusted code safely.** Every function runs in an isolated sandbox with configurable resource limits and network restrictions. **Parallel data processing.** Fan out work across a list in parallel, then aggregate results—no queue setup required. **Built-in tracing and logging.** Every function call is automatically traced with timing, logs, and execution timelines. 
# Parallel Sub-Agents Source: https://docs.tensorlake.ai/applications/parallel-sub-agents Fan out work to specialist agents that run in parallel Every agent framework has converged on the same pattern: break a complex task into independent subtasks, run specialist agents on each subtask in parallel, and synthesize the results. LangGraph does this with `Send` and `@task` futures. OpenAI Agents SDK uses `asyncio.gather` and `agent.as_tool()`. Claude Agent SDK spawns subagents via the `Task` tool. Deep Agents dispatches parallel `task` tool calls. On Tensorlake, you get the same fan-out/fan-in pattern — but each sub-agent runs in its own container with dedicated resources, independent retries, and durable checkpointing. No `asyncio` plumbing, no graph DSL, no shared memory coordination. ## Basic Pattern: Fan-Out and Combine Define each sub-agent as a `@function()`, create futures for each, and pass them to a combiner function as a tail call: ```python theme={null} from tensorlake.applications import application, function, Image research_image = Image().run("pip install openai requests") @application() @function() def analyze_company(company_name: str) -> dict: # Fan out to specialist agents — all run in parallel financials = financial_agent.future(company_name) market = market_agent.future(company_name) sentiment = sentiment_agent.future(company_name) # Combine results — runs after all agents complete return compile_report.future(financials, market, sentiment, company_name) @function(image=research_image, timeout=600) def financial_agent(company: str) -> dict: """Analyze financial data for a company.""" from openai import OpenAI client = OpenAI() response = client.chat.completions.create( model="gpt-4", messages=[{"role": "user", "content": f"Analyze financials for {company}"}] ) return {"analysis": response.choices[0].message.content} @function(image=research_image, timeout=600) def market_agent(company: str) -> dict: """Analyze market position and competitors.""" 
from openai import OpenAI client = OpenAI() response = client.chat.completions.create( model="gpt-4", messages=[{"role": "user", "content": f"Analyze market position for {company}"}] ) return {"analysis": response.choices[0].message.content} @function(image=research_image, timeout=600) def sentiment_agent(company: str) -> dict: """Analyze public sentiment.""" from openai import OpenAI client = OpenAI() response = client.chat.completions.create( model="gpt-4", messages=[{"role": "user", "content": f"Analyze sentiment for {company}"}] ) return {"analysis": response.choices[0].message.content} @function() def compile_report(financials: dict, market: dict, sentiment: dict, company: str) -> dict: return { "company": company, "financials": financials, "market": market, "sentiment": sentiment } ``` **Execution flow:** ```mermaid theme={null} graph LR A["analyze_company()"] --> B["financial_agent()"] A --> C["market_agent()"] A --> D["sentiment_agent()"] B --> E["compile_report()"] C --> E D --> E E --> F["result"] ``` ## How It Works 1. The orchestrator function creates futures for each sub-agent — this defines the calls without running them 2. Futures are passed as arguments to the combiner function, which is returned as a **tail call** 3. Tensorlake detects that the future arguments have no dependencies on each other and runs all sub-agents **in parallel** 4. When all sub-agents complete, the combiner runs with their results 5. The orchestrator's container is freed immediately after returning the tail call The orchestrator's container is freed immediately after returning the tail call. You're not paying for an idle container while sub-agents work. ## Real-World Patterns These patterns are inspired by what teams are building in production with LangGraph, OpenAI Agents SDK, Claude Agent SDK, and Deep Agents — reimplemented on Tensorlake with container isolation, independent scaling, and durable execution. 
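
Before the Tensorlake versions, it may help to see the shape these patterns share in plain Python. The following is a minimal local sketch of fan-out and fan-in using `asyncio.gather` in a single process. It is not Tensorlake code: `specialist`, `combine`, and `analyze` are hypothetical names, and the sub-agent body is a stand-in for a real LLM call.

```python
import asyncio


async def specialist(name: str, company: str) -> dict:
    # Hypothetical stand-in for an LLM-backed sub-agent.
    await asyncio.sleep(0.01)
    return {"agent": name, "analysis": f"{name} analysis for {company}"}


def combine(results: list[dict], company: str) -> dict:
    # Fan-in: merge the independent results into one report.
    sections = {r["agent"]: r["analysis"] for r in results}
    return {"company": company, "sections": sections}


async def analyze(company: str) -> dict:
    names = ["financial", "market", "sentiment"]
    # Fan-out: start all sub-agents concurrently and wait for every result.
    # gather returns results in the order the awaitables were passed in.
    results = await asyncio.gather(*(specialist(n, company) for n in names))
    return combine(list(results), company)


report = asyncio.run(analyze("Acme"))
print(report["sections"]["market"])  # market analysis for Acme
```

In this sketch all sub-agents share one process and one event loop, so a crash or a slow call affects the whole run. The patterns that follow keep the same shape but place each sub-agent in its own `@function()`.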
### Parallel Research with Synthesis The most common multi-agent pattern across every framework: decompose a research question into subtopics, investigate each in parallel, and synthesize the findings. This is the pattern behind GPT Researcher, Exa's web research system, and Anthropic's multi-agent research system. ```python theme={null} from tensorlake.applications import application, function, Image research_image = Image().run("pip install openai requests beautifulsoup4") @function(image=research_image, timeout=900, retries=2) def research_subtopic(topic: str, subtopic: str) -> dict: """Each researcher runs in its own container, searches the web, reads sources, and produces a structured summary.""" from openai import OpenAI client = OpenAI(max_retries=0) # Step 1: Generate search queries for this subtopic queries = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": f"Generate 3 search queries to research '{subtopic}' in the context of '{topic}'."}], ).choices[0].message.content # Step 2: Search and gather sources sources = search_and_read(queries) # Step 3: Analyze and summarize analysis = client.chat.completions.create( model="gpt-4o", messages=[ {"role": "system", "content": "Summarize research findings with citations."}, {"role": "user", "content": f"Topic: {subtopic}\n\nSources:\n{sources}"}, ], ).choices[0].message.content return {"subtopic": subtopic, "analysis": analysis, "source_count": len(sources)} @function(image=research_image, timeout=300) def synthesize_research(results: list[dict], topic: str) -> dict: """Combine all parallel research into a cohesive report.""" from openai import OpenAI combined = "\n\n---\n\n".join( f"## {r['subtopic']}\n{r['analysis']}" for r in results ) report = OpenAI(max_retries=0).chat.completions.create( model="gpt-4o", messages=[ {"role": "system", "content": "Synthesize research findings into a cohesive report. 
Resolve contradictions and highlight consensus."}, {"role": "user", "content": f"Topic: {topic}\n\nFindings:\n{combined}"}, ], ).choices[0].message.content return {"topic": topic, "report": report, "sections": len(results)} @application() @function(image=research_image, timeout=120) def deep_research(topic: str) -> dict: """Orchestrator: decompose, fan out, synthesize.""" from openai import OpenAI import json # Plan the research plan = OpenAI(max_retries=0).chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": f"Break this topic into 3-5 independent research subtopics: {topic}"}], response_format={"type": "json_object"}, ).choices[0].message.content subtopics = json.loads(plan)["subtopics"] # Fan out — each subtopic researched in parallel findings = [research_subtopic.future(topic, sub) for sub in subtopics] # Synthesize — runs after all research completes return synthesize_research.future(findings, topic) ``` Each researcher runs in its own container with its own 15-minute timeout and 2 retries. If one subtopic's research fails (rate limit, network error), only that subtopic is retried — the other researchers' work is preserved. ### Multi-Perspective Analysis Multiple specialist agents examine the same input from different analytical perspectives — a pattern used in production for investment analysis, proposal review, and compliance checks. 
```python theme={null} from pydantic import BaseModel from tensorlake.applications import application, function, Image analyst_image = Image().run("pip install anthropic") class AnalystReport(BaseModel): perspective: str assessment: str risk_score: float key_findings: list[str] @function(image=analyst_image, timeout=600, retries=2) def growth_analyst(company_data: dict) -> AnalystReport: """Evaluate revenue growth, market expansion, and competitive moats.""" import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=2000, messages=[{"role": "user", "content": f"As a growth analyst, evaluate:\n{company_data}"}], ) return parse_report("growth", response.content[0].text) @function(image=analyst_image, timeout=600, retries=2) def value_analyst(company_data: dict) -> AnalystReport: """Evaluate cash flow, margins, and intrinsic value.""" import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=2000, messages=[{"role": "user", "content": f"As a value analyst, evaluate:\n{company_data}"}], ) return parse_report("value", response.content[0].text) @function(image=analyst_image, timeout=600, retries=2) def risk_analyst(company_data: dict) -> AnalystReport: """Evaluate regulatory risk, market volatility, and operational risk.""" import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=2000, messages=[{"role": "user", "content": f"As a risk analyst, evaluate:\n{company_data}"}], ) return parse_report("risk", response.content[0].text) @function(image=analyst_image, timeout=300) def investment_committee(growth: AnalystReport, value: AnalystReport, risk: AnalystReport) -> dict: """Weigh all perspectives and produce a final recommendation.""" import anthropic client = anthropic.Anthropic() combined = f"Growth: {growth.model_dump()}\nValue: {value.model_dump()}\nRisk: 
{risk.model_dump()}" response = client.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=2000, messages=[{"role": "user", "content": f"As an investment committee, synthesize these analyst reports into a buy/hold/sell recommendation:\n{combined}"}], ) return {"recommendation": response.content[0].text, "analyst_reports": [growth.model_dump(), value.model_dump(), risk.model_dump()]} @application() @function() def analyze_investment(company_data: dict) -> dict: growth = growth_analyst.future(company_data) value = value_analyst.future(company_data) risk = risk_analyst.future(company_data) return investment_committee.future(growth, value, risk) ``` This mirrors the multi-agent portfolio collaboration pattern from the OpenAI Agents SDK cookbook — but each analyst runs in an isolated container with its own timeout and retry policy. ### Document Processing Pipeline Process a batch of documents through parallel specialist agents — a common pattern for intake automation in insurance, legal, and financial services. 
```python theme={null} from tensorlake.applications import application, function, Image ocr_image = Image().run("pip install pytesseract pillow pdf2image") llm_image = Image().run("pip install openai") @function(image=ocr_image, cpu=2, memory=4, timeout=120) def extract_text(doc_url: str) -> dict: """OCR and text extraction — needs CPU for image processing.""" content = download_and_ocr(doc_url) return {"url": doc_url, "text": content} @function(image=llm_image, timeout=300, retries=2) def classify_document(doc: dict) -> dict: """Determine document type and extract key fields.""" from openai import OpenAI response = OpenAI(max_retries=0).chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": f"Classify this document and extract key fields:\n{doc['text'][:4000]}"}], response_format={"type": "json_object"}, ) return {**doc, "classification": response.choices[0].message.content} @function(image=llm_image, timeout=300, retries=2) def check_compliance(doc: dict) -> dict: """Check for missing signatures, dates, required fields.""" from openai import OpenAI response = OpenAI(max_retries=0).chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": f"Check this document for compliance issues:\n{doc['text'][:4000]}"}], response_format={"type": "json_object"}, ) return {**doc, "compliance": response.choices[0].message.content} @function(timeout=60) def merge_results(classified: dict, compliance: dict) -> dict: return { "url": classified["url"], "classification": classified["classification"], "compliance": compliance["compliance"], } @application() @function() def process_document(doc_url: str) -> dict: extracted = extract_text.future(doc_url) classified = classify_document.future(extracted) compliance = check_compliance.future(extracted) return merge_results.future(classified, compliance) ``` ```mermaid theme={null} graph LR A["process_document()"] --> B["extract_text()"] B --> C["classify_document()"] B --> D["check_compliance()"] 
C --> E["merge_results()"] D --> E E --> F["result"] ``` After extraction, classification and compliance checking run in parallel — they both depend on the extracted text but not on each other. If the compliance check hits a rate limit, it retries independently without re-running OCR or classification. ## Use Any Agent Framework Each sub-agent can use whatever framework you want internally. The `@function()` boundary is a container boundary — what runs inside is up to you. Define each specialist as a focused function using its framework, then fan them out with `.future()`. ```python theme={null} from tensorlake.applications import application, function, Image # Each framework gets its own container image with its own dependencies langgraph_image = Image().run("pip install langgraph langchain-openai tavily-python") openai_image = Image().run("pip install openai-agents") claude_image = Image().run("pip install claude-agent-sdk") deep_image = Image().run("pip install deepagents langchain-openai") @function(image=langgraph_image, timeout=600) def market_researcher(company: str) -> str: """Market research using a LangGraph ReAct agent with web search.""" from langgraph.prebuilt import create_react_agent from langchain_openai import ChatOpenAI from langchain_community.tools import TavilySearchResults agent = create_react_agent( ChatOpenAI(model="gpt-4o"), tools=[TavilySearchResults(max_results=5)], ) result = agent.invoke({"messages": [ ("human", f"Research the market position, competitors, and recent news for {company}.") ]}) return result["messages"][-1].content @function(image=openai_image, timeout=600) def financial_analyst(company: str) -> str: """Financial analysis using an OpenAI Agents SDK agent with tool use.""" from agents import Agent, Runner, WebSearchTool agent = Agent( name="FinancialAnalyst", instructions=( "You are a financial analyst. Analyze revenue, margins, cash flow, " "and valuation metrics. Use web search to find the latest filings." 
), tools=[WebSearchTool()], ) result = Runner.run_sync(agent, f"Analyze the financials for {company}") return result.final_output @function(image=claude_image, timeout=900, ephemeral_disk=4) def risk_assessor(company: str) -> str: """Risk assessment using a Claude agent with deep reasoning.""" import asyncio from claude_agent_sdk import query, ClaudeAgentOptions async def run(): result = "" async for message in query( prompt=f"Assess regulatory, operational, and market risks for {company}.", options=ClaudeAgentOptions( system_prompt="You are a risk analyst. Identify and score key risks.", permission_mode="acceptEdits", cwd="/tmp/workspace", ), ): result = str(message) return result return asyncio.run(run()) @function(image=deep_image, timeout=900) def technical_reviewer(company: str) -> str: """Technical deep-dive using a Deep Agent with planning and web search.""" from deepagents import create_deep_agent agent = create_deep_agent( model="openai:gpt-4o", system_prompt="Evaluate the company's technology stack, patents, and engineering culture.", ) result = agent.invoke({ "messages": [{"role": "user", "content": f"Technical review of {company}"}] }) return result["messages"][-1].content @function(timeout=300) def compile_analysis(market: str, financials: str, risks: str, technical: str, company: str) -> dict: """Combine all analyst reports into a final recommendation.""" return { "company": company, "market_research": market, "financial_analysis": financials, "risk_assessment": risks, "technical_review": technical, } @application() @function() def analyze_company(company: str) -> dict: # Four frameworks, four containers, all running in parallel market = market_researcher.future(company) financials = financial_analyst.future(company) risks = risk_assessor.future(company) technical = technical_reviewer.future(company) return compile_analysis.future(market, financials, risks, technical, company) ``` ```mermaid theme={null} graph LR A["analyze_company()"] --> 
B["market_researcher()\nLangGraph"] A --> C["financial_analyst()\nOpenAI Agents SDK"] A --> D["risk_assessor()\nClaude Agent SDK"] A --> E["technical_reviewer()\nDeep Agents"] B --> F["compile_analysis()"] C --> F D --> F E --> F F --> G["result"] ``` Each agent runs in its own container with its own dependencies — no version conflicts, no shared memory, no `asyncio` event loop contention. If the risk assessment takes longer than the others, the completed agents' results are checkpointed and preserved. ## Different Resources Per Agent Each sub-agent can have its own container configuration: ```python theme={null} gpu_image = Image().run("pip install torch transformers") @function(cpu=1, memory=2, timeout=300) def text_agent(prompt: str) -> str: """Lightweight text analysis.""" ... @function(image=gpu_image, cpu=4, memory=16, gpu="T4", timeout=600) def vision_agent(image_url: str) -> dict: """GPU-heavy image analysis.""" ... @function(cpu=2, memory=4, timeout=900) def data_agent(query: str) -> list: """Medium resources for data fetching.""" ... @application() @function() def multimodal_analysis(prompt: str, image_url: str) -> dict: text_result = text_agent.future(prompt) vision_result = vision_agent.future(image_url) data_result = data_agent.future(prompt) return combine_results.future(text_result, vision_result, data_result) ``` ## Chaining Parallel Stages You can chain stages where each stage fans out in parallel: ```python theme={null} @application() @function() def pipeline(query: str) -> dict: # Stage 1: Gather data in parallel web = search_web.future(query) papers = search_papers.future(query) news = search_news.future(query) # Stage 2: Analyze each source (runs after stage 1) analysis = analyze_sources.future(web, papers, news) # Stage 3: Generate final output return generate_report.future(analysis, query) ``` Each stage waits for its dependencies automatically. Stages without dependencies run in parallel. 
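
To make the dependency rule concrete, here is a local sketch that uses only Python's standard library, not the Tensorlake runtime. It mimics the same shape: three independent stage-1 calls run concurrently, and the later stages start only once their inputs exist. The function bodies are hypothetical placeholders, and `run_pipeline` is a name chosen for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the search functions; real ones would call external services.
def search_web(query: str) -> str:
    return f"web:{query}"

def search_papers(query: str) -> str:
    return f"papers:{query}"

def search_news(query: str) -> str:
    return f"news:{query}"

def analyze_sources(web: str, papers: str, news: str) -> list[str]:
    # Stage 2 needs all three stage-1 results before it can run.
    return sorted([web, papers, news])

def generate_report(analysis: list[str], query: str) -> dict:
    # Stage 3 depends only on the stage-2 output and the original query.
    return {"query": query, "sources": analysis}

def run_pipeline(query: str) -> dict:
    with ThreadPoolExecutor(max_workers=3) as pool:
        # Stage 1: the three searches share no inputs, so they can run concurrently.
        stage_one = [pool.submit(fn, query) for fn in (search_web, search_papers, search_news)]
        # result() blocks until each dependency has finished.
        web, papers, news = (f.result() for f in stage_one)
    analysis = analyze_sources(web, papers, news)
    return generate_report(analysis, query)
```

In the Tensorlake version you do not write the waiting logic yourself: passing a future as an argument is what declares the dependency.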
## Using Futures for More Control

When you need to do work in the orchestrator while sub-agents run, use [Futures](/applications/futures) instead of tail calls:

```python theme={null}
from tensorlake.applications import application, function, Future, RETURN_WHEN

# agent_a_work, agent_b_work, and prepare_context are assumed to be defined elsewhere

@application()
@function(timeout=1800)
def interactive_analysis(query: str) -> dict:
    # Start sub-agents
    agent_a: Future = agent_a_work.future(query).run()
    agent_b: Future = agent_b_work.future(query).run()

    # Do local work while agents run
    local_context = prepare_context(query)

    # Wait for both agents
    Future.wait([agent_a, agent_b], return_when=RETURN_WHEN.ALL_COMPLETED)

    return {
        "context": local_context,
        "agent_a": agent_a.result(),
        "agent_b": agent_b.result()
    }
```

## Learn More

* Deep dive on futures, tail calls, and parallel execution.
* Use Python async/await for parallel workflows.
* Parallel processing over lists of data.

# Troubleshooting

Source: https://docs.tensorlake.ai/applications/production/troubleshooting

Common issues building Tensorlake applications and how to debug them

## Common Issues

### Function Timeout

If your function is timing out, consider:

1. **Increase the timeout** - Set a higher `timeout` value in your `@function` decorator
2. **Report progress** - Use `ctx.progress.update()` to reset the timeout. See [Streaming Progress Updates](/applications/concepts#streaming-progress-updates)
3. **Check the logs** - Use the Logs API (see Request Failed below) to see what your function was doing before it timed out

### Request Failed

To investigate a failed request:

1. **Check request state** - Get the full request state including failure reason:

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/requests/{request_id}" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

2. 
**Review logs** - Filter logs by the request ID to see what happened:

```bash theme={null}
curl -X GET \
  "https://api.tensorlake.ai/applications/{application}/logs?requestId={request_id}" \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

### Out of Memory

If your function is running out of memory:

1. **Check current allocation** - Review the `memory` setting in your `@function` decorator
2. **Increase memory** - Set `memory` to a higher value (up to 32 GB). See [Memory](/applications/concepts#memory)
3. **Process in batches** - Break large datasets into smaller chunks

### Debugging Tips

* Add `print()` statements in your code to log intermediate values
* Use `ctx.request_id` to correlate logs across function calls. See [Request ID](/applications/concepts#request-id)
* Check that your function has sufficient CPU, memory, and disk resources
* Review retry settings if functions are failing intermittently. See [Retries](/applications/concepts#retries)

# Applications Quickstart

Source: https://docs.tensorlake.ai/applications/quickstart

Write, deploy, and call your first Tensorlake Application: a serverless agentic web scraper with the Claude Agent SDK, in under five minutes.

This guide walks you through writing, deploying, and calling Tensorlake Applications. You will learn how to build a serverless agentic web scraper with Anthropic's Claude Agent SDK in under 5 minutes.

Let's start with a simple "Hello, World!" application to make sure your environment is set up correctly.

```bash theme={null}
pip install tensorlake
```

You can get an [API key](/platform/authentication#api-keys) from the Tensorlake Dashboard.

```bash theme={null}
export TENSORLAKE_API_KEY=
```

Applications are defined by Python functions. Let's start with a template that greets a user by name.
```bash theme={null} tl new hello_world ``` This creates a file named `hello_world/hello_world.py` with the following content: ```python hello_world.py theme={null} from tensorlake.applications import application, function @application() @function() def greet(name: str) -> str: return f"Hello, {name}!" ``` Deploy your application referencing your application's source file. ```bash theme={null} tl deploy hello_world/hello_world.py ``` That's it — you now have a distributed app running in the cloud. ## Call Applications Tensorlake gives you an HTTP endpoint, for calling your application remotely. ``` https://api.tensorlake.ai/applications/ ``` Fetch a key from the [Tensorlake Dashboard](/platform/authentication#api-keys) and export it as an environment variable: ```bash theme={null} export TENSORLAKE_API_KEY= ``` ```bash bash theme={null} curl https://api.tensorlake.ai/applications/hello_world \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ --json '"John"' # {"request_id":"beae8736ece31ef9"} ``` ```python python theme={null} from tensorlake.applications import run_remote_application, Request request: Request = run_remote_application(greet, 'John') print(request.id) # "beae8736ece31ef9" ``` This will return a request ID that you can use to track the progress of your request. Requests may run seconds to hours depending on your workload. ```bash bash theme={null} curl -X GET https://api.tensorlake.ai/applications/hello_world/requests/{request_id} \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" # { # "id":"B0IwzHibTTfn5mCXHPGsu", # "outcome":"success", # "failure_reason":null, # "request_error":null, # .... other fields ... #} ``` ```python python theme={null} # You don't need to poll for request completion. Retrieving the output will wait for the request to complete. ``` The `outcome` field will be `success` or `failure` depending on whether the request completed successfully. It will be null if the request is still in progress. 
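
If you call the HTTP API directly rather than the Python SDK, you need to poll the request state until `outcome` is set. Below is a minimal sketch, assuming the request-state JSON carries the `outcome` field as shown in the curl response above. `fetch_request_state` and `wait_for_outcome` are hypothetical helper names for this illustration and are not part of the Tensorlake SDK; the polling interval and attempt limit are arbitrary.

```python
import json
import os
import time
import urllib.request
from typing import Callable

API_BASE = "https://api.tensorlake.ai/applications"

def fetch_request_state(application: str, request_id: str) -> dict:
    """GET the request state, using the endpoint shown in the curl example."""
    req = urllib.request.Request(
        f"{API_BASE}/{application}/requests/{request_id}",
        headers={"Authorization": f"Bearer {os.environ['TENSORLAKE_API_KEY']}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def wait_for_outcome(fetch: Callable[[], dict], interval: float = 2.0, max_polls: int = 30) -> str:
    """Poll until `outcome` is non-null, then return it ("success" or "failure")."""
    for _ in range(max_polls):
        outcome = fetch().get("outcome")
        if outcome is not None:
            return outcome
        time.sleep(interval)  # still in progress: wait before asking again
    raise TimeoutError("request still in progress after polling")
```

A call might look like `wait_for_outcome(lambda: fetch_request_state("hello_world", request_id))`. Passing the fetcher in as a callable keeps the polling loop easy to test without network access.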
```bash bash theme={null} curl -X GET https://api.tensorlake.ai/applications/hello_world/requests/{request_id}/output \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" # "Hello, John!" ``` ```python python theme={null} from tensorlake.applications import run_remote_application, Request request: Request = run_remote_application(greet, 'John') output: str = request.output() print(output) # "Hello, John!" ``` ## Testing Locally Tensorlake Applications can run locally on your laptop. You can run them like regular python scripts. ```python hello_world.py theme={null} # At the end of the file from tensorlake.applications import run_local_application, Request if __name__ == "__main__": request: Request = run_local_application(greet, 'John') output: str = request.output() print(output) # "Hello, John!" ``` ## Building an Agentic Code Interpreter Now let's build a real agentic application. We will build a code interpreter agent with OpenAI Agent SDK. The tensorlake application function will be the main agentic loop, and we will use a Tensorlake function to execute code, and pass it as a tool to the agent. Whenever the agent needs to execute code, it will call the Tensorlake function and pass the code as a tool call. The Tensorlake function will execute the code in an isolated container and return the output to the agent. The agent needs access to the OpenAI API. Add your API key as a secret using the Tensorlake CLI: ```bash theme={null} tl secrets set OPENAI_API_KEY= ``` This securely stores your API key so it can be injected into your application at runtime. The secret is referenced in the function decorator which uses the OpenAI Agent SDK and will be available as an environment variable. 
```python code_interpreter.py theme={null} import sys from io import StringIO from tensorlake.applications import application, function, Image # Image for the code execution container - has data science libraries code_exec_image = ( Image(name="python:3.11-slim") .run("pip install numpy pandas matplotlib") ) # Image for the agent container - has the OpenAI Agent SDK agent_image = ( Image(name="python:3.11-slim") .run("pip install openai-agents") ) @function(image=code_exec_image, cpu=2, memory=4) def execute_code(code: str) -> str: """Execute Python code in a secure sandbox and return the output.""" stdout_capture = StringIO() old_stdout = sys.stdout try: sys.stdout = stdout_capture exec_globals = {"__builtins__": __builtins__} exec(code, exec_globals) sys.stdout = old_stdout return stdout_capture.getvalue() except Exception as e: sys.stdout = old_stdout return f"Error: {e}\nOutput: {stdout_capture.getvalue()}" @application() @function(image=agent_image, secrets=["OPENAI_API_KEY"]) def code_interpreter_agent(user_request: str) -> str: """Run the agentic loop and return the final answer.""" from agents import Agent, Runner, function_tool @function_tool def execute_python(code: str) -> str: """Execute Python code in a secure sandbox. 
Use this for calculations or data analysis.""" return execute_code(code) agent = Agent( name="Code interpreter", model="gpt-4o", instructions="You are a helpful assistant that can execute Python code to solve problems.", tools=[execute_python], ) result = Runner.run_sync(agent, user_request) return result.final_output ``` Deploy your application and call it: ```bash theme={null} tl deploy code_interpreter.py ``` ```bash theme={null} curl https://api.tensorlake.ai/applications/code_interpreter_agent \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ --json '"What is the square root of 273 * 312821 plus 1782?"' ``` On Lambda or Vercel, running arbitrary code execution would require complex sandboxing, security policies, and resource management — all in the same container as your main application. With Tensorlake, the `execute_code` function runs in a completely isolated container with its own CPU, memory, and dependencies. If code execution needs heavy compute or specialized libraries, it scales independently from your agent logic. You get secure, isolated code execution without managing infrastructure. Tensorlake handles the infrastructure complexity so you can focus on building powerful AI tools. ## Next Steps Here are some of the next things to learn about: Learn key concepts and APIs to program applications. Learn how to add dependencies for your applications. Learn how to manage secrets that your applications access. Learn how to use map-reduce to process large datasets. Learn how to build multi-step workflows with parallel execution and optimized resource usage. Learn how to run multiple function calls in parallel using Futures. Learn how to use Python async/await with Tensorlake functions. # Retries & Rate Limits Source: https://docs.tensorlake.ai/applications/retries Handle LLM rate limits, transient failures, and structured output validation with durable retries LLM providers return rate-limit errors, APIs time out, and web scrapes hit transient failures. 
Tensorlake handles retries at the platform level — each retry is durable, meaning any nested function calls that already succeeded are served from checkpoints instead of re-executing. See [Durable Execution](/applications/durability) for how the checkpoint mechanism works and [Crash Recovery](/applications/crash-recovery) for the agent-loop walkthrough. ## Configuring Retries Set the `retries` parameter on any `@function()` to automatically retry on failure. This is especially useful for LLM calls that return structured output — if the LLM returns malformed data, Pydantic validation fails and Tensorlake retries the entire call: ```python theme={null} from pydantic import BaseModel from tensorlake.applications import function class ResearchFindings(BaseModel): summary: str sources: list[str] confidence: float @function(retries=3) def extract_findings(text: str) -> ResearchFindings: from openai import OpenAI # Disable client-level retries to avoid unpredictable behavior response = OpenAI(max_retries=0).chat.completions.create( model="gpt-4o", messages=[ {"role": "system", "content": "Extract research findings as JSON."}, {"role": "user", "content": text}, ], response_format={"type": "json_object"}, ) # If validation fails, Tensorlake retries the entire function return ResearchFindings.model_validate_json(response.choices[0].message.content) ``` **How retries work:** * Rate limit errors, timeouts, or exceptions trigger automatic retries * Validation failures (e.g., Pydantic `ValidationError`) also trigger retries * Tensorlake retries up to 3 times with exponential backoff * Any nested function calls that already succeeded are served from checkpoints, not re-executed (see [Durable Execution](/applications/durability)) If retries are exhausted, the request fails and can be re-run later via the [Replay API](/applications/durability#request-replay-api). 
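
To build intuition for what "retries with exponential backoff" means, here is a plain-Python sketch of the general technique. It is an illustration only, not Tensorlake's implementation: the one-second base delay, the doubling factor, and the `retry_with_backoff` name are assumptions made for this example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def retry_with_backoff(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn once, then retry up to `retries` times, doubling the wait each time."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s with the defaults
```

Any exception counts as a failure here, which matches the behavior described above where a Pydantic `ValidationError` triggers a retry just like a rate-limit error does.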
For broader exception-handling patterns (try/except, futures, fallbacks), see [Error Handling](/applications/error-handling). For controlling per-function deadlines, see [Timeouts](/applications/timeouts).

Disable client-level retries (e.g., OpenAI's `max_retries=0`) when using Tensorlake retries. Layering both creates unpredictable behavior and inflated retry counts.

## Rate Limiting External APIs

When calling external APIs with rate limits, you can control the total number of concurrent calls using the formula:

**Total concurrent calls = `max_containers` × `concurrency`**

This lets you respect API rate limits by capping the maximum number of parallel requests your function can make:

```python theme={null}
from tensorlake.applications import function

@function(
    retries=3,
    max_containers=5,  # Maximum 5 containers
    concurrency=2      # Each container handles 2 concurrent requests
)
def call_rate_limited_api(query: str) -> dict:
    # Total concurrent calls: 5 × 2 = 10 requests max
    import requests

    # params= URL-encodes the query; timeout= avoids hanging on a stalled connection
    response = requests.get("https://api.example.com/search", params={"q": query}, timeout=30)
    # Raise on HTTP errors (including 429) so the failure triggers a retry
    response.raise_for_status()
    return response.json()
```

**Use cases:**

* **Respect API quotas** - If an API allows 100 requests/second, `max_containers=50` and `concurrency=2` cap you at 100 in-flight calls; use lower values if individual calls finish in well under a second
* **Control costs** - Limit concurrent LLM calls to manage token spend
* **Prevent overload** - Cap requests to internal services that can't handle high concurrency

See [Scale-Out & Queuing](/applications/scale-out-queuing) for more on `max_containers` and request queuing.

## Related Guides

* How checkpointing makes every retry cheap.
* Surviving mid-loop crashes when retries run out.
* Per-function deadlines and progress-update heartbeats.
* Try/except, futures, and graceful degradation patterns.

# Sandboxes

Source: https://docs.tensorlake.ai/applications/sandboxes

Two patterns for running agents with isolated code execution

Agents that generate and execute code need a workspace: a computer where they can run code, install packages, and access files.
That workspace needs to be isolated so the agent can't access your credentials, files, or network. Sandboxes provide this isolation.

The question isn't whether to use sandboxes but how to integrate them with your agent. There are two architectural patterns, based on where the agent runs: inside the sandbox or outside of it.

## Pattern 1: Agent in Sandbox

The agent runs inside an isolated container. Your application communicates with it over the network.

```mermaid theme={null}
graph LR
    A["Your Application"] -- "HTTP" --> B["Sandbox Container"]
    subgraph B["Sandbox Container"]
        C["Agent Code"]
        D["Filesystem"]
        E["Packages"]
    end
```

This is what Tensorlake's `@function()` does. When you deploy a function, your agent code runs inside an isolated container with its own filesystem, dependencies, and resource limits. The agent has direct access to its environment: it can read and write files, install packages, and execute code, all within the container boundary.

```python theme={null}
from tensorlake.applications import application, function, Image

agent_image = Image().run("pip install openai")

# Tool schema so the model can request code execution
RUN_CODE_TOOL = {"type": "function", "function": {
    "name": "run_code",
    "description": "Run Python code and return its output.",
    "parameters": {"type": "object", "properties": {"code": {"type": "string"}}, "required": ["code"]},
}}

@application()
@function(image=agent_image, timeout=1800, memory=4, ephemeral_disk=10)
def coding_agent(task: str) -> str:
    """Agent runs inside the container with full filesystem access."""
    import json
    import subprocess
    from openai import OpenAI

    client = OpenAI()
    messages = [{"role": "user", "content": task}]

    for _ in range(20):
        response = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=[RUN_CODE_TOOL]
        )
        reply = response.choices[0].message
        if not reply.tool_calls:
            return reply.content
        # Keep the assistant turn so each tool result has a matching call
        messages.append(reply)
        for tool_call in reply.tool_calls:
            if tool_call.function.name == "run_code":
                # Code executes directly: the agent is already in the sandbox
                code = json.loads(tool_call.function.arguments)["code"]
                result = subprocess.run(
                    ["python", "-c", code],
                    capture_output=True, text=True, timeout=30
                )
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": result.stdout or result.stderr,
                })
    return "Stopped after 20 steps without a final answer."
```

**When to use this pattern:**

* The agent and execution environment are
tightly coupled * The agent needs persistent filesystem access across tool calls * You want production to mirror local development — same code, same environment **Trade-offs:** * API keys must live inside the container for the agent to make inference calls * Updating agent logic requires redeploying the function With Tensorlake, every `@function()` is already a sandbox. You get process isolation, resource limits, timeout enforcement, and dependency isolation without any extra setup. ## Pattern 2: Sandbox as Tool The agent runs in a Tensorlake function and gets sandboxes as tools it can use for code execution. When the agent needs to run untrusted or LLM-generated code, it creates a sandbox on demand, executes code there, and reads the results back. ```mermaid theme={null} graph LR A["Agent Function"] -- "Create" --> B["Sandbox 1"] A -- "Create" --> C["Sandbox 2"] A -- "Create" --> D["Sandbox 3"] ``` Tensorlake's [Sandbox API](/sandboxes/introduction) provides this pattern. Your agent logic runs in a `@function()`, and when it needs to execute code, it creates a sandbox with the `Sandbox` SDK and uses it as a tool. 
```python theme={null} from tensorlake.applications import application, function, Image agent_image = Image().run("pip install openai tensorlake") @application() @function(image=agent_image, timeout=1800) def coding_agent(task: str) -> str: """Agent uses a sandbox as a tool for code execution.""" from openai import OpenAI from tensorlake.sandbox import Sandbox client = OpenAI() # Create an on-demand sandbox for code execution sandbox = Sandbox.create( image="tensorlake/ubuntu-minimal", cpus=1.0, memory_mb=1024, timeout_secs=60, ) try: messages = [{"role": "user", "content": task}] for _ in range(20): response = client.chat.completions.create(model="gpt-4o", messages=messages) reply = response.choices[0].message if not reply.tool_calls: return reply.content for tool_call in reply.tool_calls: if tool_call.function.name == "run_code": # Code executes in the remote sandbox, not here result = sandbox.execute(tool_call.function.arguments) messages.append({"role": "tool", "content": result.output}) finally: sandbox_client.delete(sandbox.sandbox_id) ``` **When to use this pattern:** * You need to execute untrusted or LLM-generated code * API keys should stay outside the code execution environment * You want to spin up multiple sandboxes in parallel for concurrent code execution * The agent needs to create, inspect, and tear down environments dynamically **Trade-offs:** * Network latency on each execution call * Two layers of containers (agent function + sandbox) ## Learn More Install Tensorlake and create your first sandbox. Sandbox states, resources, timeouts, and lifecycle operations. Control internet access and blocked destinations. # Scale-Out & Queuing Source: https://docs.tensorlake.ai/applications/scale-out-queuing Workflows scale automatically as endpoints are called, with configurable scaling per function Workflows scale out automatically as their endpoints are called. 
When you invoke a workflow, Tensorlake spins up containers for each function as needed, processes the request, and scales back down when idle. Each function in your workflow can have its own scaling configuration. You control scaling behavior with two parameters: `warm_containers` and `max_containers`. **Example workflow with scaling:** ```python theme={null} from tensorlake.applications import application, function @function(warm_containers=2, max_containers=10) def enrich_data(record_id: str) -> dict: # 2 containers always warm, scales up to 10 ... @function() def transform(data: dict) -> dict: # Transform the enriched data ... @application() @function() def process_workflow(record_id: str) -> dict: enriched = enrich_data.future(record_id) return transform.future(enriched) ``` When you call `POST /applications/process_workflow`, the workflow endpoint scales automatically, and each function scales based on its configuration. ## Scaling Parameters Configure scaling in the `@function()` decorator: ```python theme={null} from tensorlake.applications import function @function( warm_containers=2, max_containers=10 ) def process_data(data: str) -> str: ... ``` ### `warm_containers` Number of pre-warmed containers to keep ready. Warm containers have your code and dependencies loaded, eliminating cold start latency for incoming requests. ```python theme={null} @function(warm_containers=3) def classify_document(content: str) -> str: """3 containers are always warm and ready to handle requests.""" # Critical first step in workflow - needs low latency ... ``` Use warm containers when: * You need low-latency responses * Cold starts are unacceptable for your use case * You have predictable baseline traffic ### `max_containers` Maximum number of containers. Once this limit is reached, additional requests are automatically queued and processed in FIFO order as containers become available. 
```python theme={null} @function(max_containers=5) def bounded_processing(data: str) -> str: """No more than 5 containers will run simultaneously.""" ... ``` ## Automatic Queuing When all containers for a function are busy and `max_containers` has been reached, Tensorlake automatically queues incoming requests. No configuration is needed — queuing is built into the platform. * Requests are processed in **FIFO order** * Queued requests begin processing as soon as a container becomes available * No separate queue infrastructure (Redis, SQS, RabbitMQ) is required ```python theme={null} @function(max_containers=3) def process_with_llm(data: str) -> str: """At most 3 concurrent LLM calls. Additional requests are queued.""" # Expensive workflow step - limit concurrency to control costs ... ``` ## Combined Behaviors Combine parameters for fine-grained control: ### Low-latency with bounded scale ```python theme={null} @function( warm_containers=2, # 2 containers ready for instant response max_containers=10 # Scale up to 10, then queue ) def extract_entities(document: str) -> dict: # First step in document workflow - needs low latency ... ``` ### High-throughput with bounded cost ```python theme={null} @function( warm_containers=4, # 4 containers pre-warmed max_containers=50 # Scale up to 50, then queue ) def enrich_from_api(record_id: str) -> dict: # High-volume workflow step with bounded scale ... ``` ## Scaling in Workflows Each function in your workflow scales independently. This allows different workflow steps to have different scaling profiles based on their resource requirements and latency needs: ```python theme={null} @function(warm_containers=5, max_containers=50) def fetch_data(record_id: str) -> dict: """High-throughput data fetching with low latency.""" ... @function(max_containers=3) def analyze_with_llm(data: dict) -> dict: """Expensive LLM analysis, bounded concurrency to control costs.""" ... 
@application() @function() def process_record(record_id: str) -> dict: # fetch_data can handle 50 concurrent requests data = fetch_data.future(record_id) # analyze_with_llm is limited to 3 concurrent executions return analyze_with_llm.future(data) ``` In this workflow, `fetch_data` can scale to 50 containers for high throughput, while `analyze_with_llm` is capped at 3 to control costs. When you call the `process_record` endpoint, both functions scale independently based on their configuration. ## Default Behavior Without any scaling parameters, workflow functions scale dynamically: * Containers scale from zero based on demand when the workflow endpoint is called * There is no upper bound on container count * Cold starts occur for the first request after an idle period * No automatic queuing (unlimited scaling) ## Learn More Full @function() decorator reference. Structuring agents for scale. Multi-step data workflows. # Autoscaling Source: https://docs.tensorlake.ai/applications/scaling-agents Autoscaling guide for Orchestration endpoints Tensorlake scales your `@function()` sandboxes automatically. In most cases, you do not need to configure anything. Start with defaults, then tune only if you have a specific latency or cost goal. ## Default Behavior With just `@function()`, Tensorlake does this automatically: * Creates containers when requests arrive * Scales to zero when idle * Adds more containers as traffic grows ```python theme={null} from tensorlake.applications import function @function() def agent(prompt: str) -> str: ... ``` This is the simplest and most cost-efficient setup for many async and internal workloads. 
## Scaling Settings Use these only when default on-demand scaling is not enough: | Setting | What it controls | What happens | | ----------------- | --------------------- | --------------------------------------------------------------- | | `warm_containers` | Ready-to-serve buffer | Keeps extra pre-started containers ready so bursts start faster | | `max_containers` | Capacity ceiling | Caps total containers so scale and cost stay bounded | How they work together: * `warm_containers` adds ready capacity above current demand. * `max_containers` limits the final upper bound. * If demand exceeds `max_containers`, requests wait in queue. ## Practical Examples ### 1) Reduce cold starts If this is a user-facing endpoint and startup delay is noticeable: ```python theme={null} @function(warm_containers=2) def agent(prompt: str) -> str: ... ``` ### 2) Cap spend or protect downstream APIs If you need to bound scale: ```python theme={null} @function(max_containers=10) def agent(prompt: str) -> str: ... ``` When all 10 are busy, new requests wait in queue. ### 3) Balance low latency with bounded scale If you want faster startup plus bounded scaling: ```python theme={null} @function(warm_containers=2, max_containers=20) def agent(prompt: str) -> str: ... ``` Result: * 2 warm containers are ready for faster responses * Scale is still capped at 20 containers ### 4) High-throughput with a safety ceiling ```python theme={null} @function( warm_containers=4, max_containers=50, ) def agent(prompt: str) -> str: ... ``` ## How to Choose Values Start with `@function()` and add knobs only for a specific goal: * Lower first-request latency: set `warm_containers=1`, then increase gradually. * Budget or downstream protection: set `max_containers` to a safe upper limit. * Stable setup: add a small `warm_containers` buffer, then cap with `max_containers`. * Keep changes incremental: update one knob, test, then adjust. 
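The way `max_containers` and queuing interact can be pictured with a small local simulation. This is a toy model only, not Tensorlake's scheduler: the `simulate_fifo` function, its arguments, and the fixed per-request duration are assumptions made for illustration. It shows requests starting immediately while capacity is free and waiting in FIFO order once the ceiling is reached.

```python
import heapq


def simulate_fifo(arrivals, duration, max_containers):
    """Toy model: return the time at which each request starts running.

    arrivals       -- request arrival times, sorted ascending
    duration       -- time each request keeps a container busy (assumed constant)
    max_containers -- concurrency ceiling (stand-in for the decorator setting)
    """
    free_at = []  # min-heap of times at which busy containers become free
    starts = []
    for arrival in arrivals:
        if len(free_at) < max_containers:
            # Capacity is still available: start right away.
            start = arrival
        else:
            # Ceiling reached: wait for the container that frees up first.
            start = max(arrival, heapq.heappop(free_at))
        heapq.heappush(free_at, start + duration)
        starts.append(start)
    return starts


if __name__ == "__main__":
    # Four requests, each taking 5 time units, with a ceiling of 2 containers.
    print(simulate_fifo([0, 0, 0, 1], duration=5, max_containers=2))
```

With a ceiling of 2, the third and fourth requests start only at time 5, when the first two finish. Raising the ceiling in this model shortens the wait, at the price of more containers running at once.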
## Learn More

* How queueing works when demand exceeds available capacity
* Pattern for handling transient API failures safely

# Secrets

Source: https://docs.tensorlake.ai/applications/secrets

Providing secrets to Tensorlake functions

Secrets let you pass sensitive values to your Tensorlake functions securely, without putting them in your code.

## Storing secrets

You can store secrets on Tensorlake Cloud using the CLI:

```bash theme={null}
tl secrets set AWS_ACCESS_KEY=MY_AWS_ACCESS_KEY
tl secrets set OPENAI_API_KEY=MY_OPENAI_API_KEY
```

## Using secrets

Stored secrets are available as environment variables within your Tensorlake functions:

```python theme={null}
import os

from tensorlake.applications import application, function


@application()
@function(secrets=["AWS_ACCESS_KEY", "OPENAI_API_KEY"])
def my_function() -> str:
    # Secrets listed in the decorator are exposed as environment variables
    aws_access_key = os.environ["AWS_ACCESS_KEY"]
    openai_api_key = os.environ["OPENAI_API_KEY"]
    ...
```

### Secrets and application deployment

When you add or update a secret used by an already deployed application, the application must be redeployed for the new secret values to take effect.

## CLI Commands

### List Secrets

List secrets that have been previously set. Values are not shown for security reasons.

```bash theme={null}
$ tl secrets list
| Name        | Created At |
| ----------- | ---------- |
| SECRET_NAME | Date       |
```

### Set a Secret

Setting a secret creates it if it does not exist and updates it otherwise.

```bash theme={null}
$ tl secrets set = [=]
```

### Unset a Secret

```bash theme={null}
tl secrets unset []
```

## Security

Secrets use envelope encryption with AES-256-GCM, providing strong confidentiality and integrity. Each project has a dedicated Data Encryption Key (DEK) wrapped by a root Key Encryption Key (KEK) managed by AWS KMS, creating strict isolation boundaries. Secrets remain encrypted at rest and are only decrypted in-memory on dataplane machines running workflows that require those secrets, with all communication secured through mutual TLS (mTLS).
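Since secrets reach a function as environment variables, it can help to check for them once at the top of the function instead of failing halfway through on a missing key. The sketch below is one possible way to do that in plain Python; `require_secrets` is a hypothetical helper, not part of the Tensorlake SDK.

```python
import os


def require_secrets(names):
    """Return {name: value} for every name, or raise if any are unset or empty.

    Hypothetical helper: it only reads os.environ, which is where the
    secrets declared on the function are expected to appear.
    """
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing secrets: {', '.join(missing)}")
    return {name: os.environ[name] for name in names}
```

Calling `require_secrets(["AWS_ACCESS_KEY", "OPENAI_API_KEY"])` as the first statement of a function would surface a forgotten `tl secrets set`, or a missing redeploy, as a single clear error.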
## See Also Tutorial that demonstrates using secrets with the OpenAI API in a deployed workflow. # Timeouts Source: https://docs.tensorlake.ai/applications/timeouts How function timeouts work and how progress updates reset them Agents can take an unpredictable amount of time to do their work — an LLM tool-calling loop might finish in seconds or run for hours depending on the task. Tensorlake handles this by letting functions run indefinitely as long as they keep sending heartbeats in the form of progress updates. A timeout only kicks in if a function stops making progress — it doesn't finish within the allotted time *and* it doesn't send any progress updates to reset the clock. ## Setting Timeouts Set the `timeout` attribute on the `@function()` decorator to control how long a function can run before it is terminated and marked as failed. ```python theme={null} from tensorlake.applications import function @function(timeout=1800) # 30 minutes def deep_research(prompt: str) -> str: ... ``` | | Value | | ----------- | ------------------- | | **Default** | `300` (5 minutes) | | **Minimum** | `1` second | | **Maximum** | `172800` (48 hours) | When a function times out, it is terminated and marked as failed. If the function has a [retry policy](/applications/retries), it will be retried according to that policy. Already-completed nested calls are served from checkpoints on retry — see [Durable Execution](/applications/durability). ## Automatic Timeout Reset When a function reports progress via `ctx.progress.update()`, its timeout automatically resets. This allows functions to run indefinitely as long as they continue making progress. ```python theme={null} from tensorlake.applications import function, RequestContext @function(timeout=300) # 5 minute timeout def long_running_task(items: list) -> dict: ctx = RequestContext.get() # After 3 minutes of processing... 
ctx.progress.update(50, 100, "Halfway done") # Timeout just reset to 5 minutes from NOW # Function can run another 5 minutes before next update # Continue processing... ``` **Timeline:** 1. Function starts with 5 minute timeout 2. At 3 minutes: `progress.update()` called 3. Timeout resets to 5 minutes from this point 4. Function can now run until minute 8 (3 + 5) 5. Next `progress.update()` resets timeout again Set a short timeout (e.g., 5 minutes) and rely on progress updates to extend it. This way, if a function gets stuck and stops reporting progress, it fails fast instead of running silently for hours. ## Examples ### Agent Loops An agent that runs hundreds of iterations can use a short timeout per iteration. Each progress update resets the clock: ```python theme={null} @function(timeout=300) # 5 minute timeout per iteration def persistent_agent(task: str) -> str: ctx = RequestContext.get() for iteration in range(1000): ctx.progress.update(iteration, 1000) # resets timeout result = agent_iteration(task) if is_complete(result): return result ``` ### Batch Processing Process an unbounded stream of items. 
The function runs as long as items keep arriving:

```python theme={null}
from tensorlake.applications import function, RequestContext


# stream_items() and process() are placeholders for your own code
@function(timeout=600)  # 10 minute timeout
def process_stream(stream_url: str) -> dict:
    ctx = RequestContext.get()
    count = 0
    for item in stream_items(stream_url):
        ctx.progress.update(count, count + 1, f"Processed {count} items")
        process(item)
        count += 1
    return {"total": count}
```

### Video/Audio Processing

Report progress every N frames to keep the timeout from firing during long media processing:

```python theme={null}
from tensorlake.applications import function, RequestContext


# extract_frames() and process_frame() are placeholders for your own code
@function(timeout=300)
def process_video(video_url: str) -> str:
    ctx = RequestContext.get()
    frames = extract_frames(video_url)
    total_frames = len(frames)
    for i, frame in enumerate(frames):
        if i % 100 == 0:
            ctx.progress.update(i, total_frames, f"Processing frame {i}/{total_frames}")
        process_frame(frame)
    return "complete"
```

## Learn More

* The full progress API, frontend integration, and SSE streaming.
* What happens after a timeout fires: auto-retry with checkpoint reuse.
* Resume a request from where it timed out instead of restarting.
* How nested completed calls are reused across timeout-triggered retries.
* Catch timeouts in caller code and degrade gracefully.
* Functions, retries, resource limits, and request context.

# Create Datasets

Source: https://docs.tensorlake.ai/document-ingestion/datasets/create

Datasets let you apply the same parsing configuration to many documents, which is useful for versioning schemas and OCR settings across workflows.

Datasets allow you to apply the same document parsing configuration to multiple documents. This makes it easy to version schemas and OCR configurations and to apply them to new documents in your workflows.

All the code examples on this page use the [Tensorlake Python SDK](https://github.com/tensorlakeai/tensorlake). For other languages, please consult our [API Reference](/api-reference/v2/datasets/).
## Quick Start The example can be run in a [Google Colab notebook](https://colab.research.google.com/drive/1Bz6wFrJd64RY9cslpwmJ4nncCTbwV6rL?usp=sharing). ```python Python theme={null} from tensorlake.documentai.client import DocumentAI doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_API_KEY") # Create a dataset. The only required argument for a dataset is its name. # A dataset name may only contain alphanumeric characters, hyphens or underscores. # # Not specifying parsing, or extraction options will create a dataset used for parsing # documents with our recommended defaults. dataset = doc_ai.create_dataset( name="your_dataset_name" ) ``` ```bash curl theme={null} curl --request POST \ --url https://api.tensorlake.ai/documents/v2/datasets \ --header 'Authorization: Bearer ${TENSORLAKE_API_KEY}' \ --header 'Content-Type: application/json' \ --data '{ "name": "your_dataset_name" }' # Response returns the dataset ID, which you can use to reference the dataset in future API calls. # { # "dataset_id": "dataset_xxxxx", # } ``` ```python Python theme={null} # Use a publicly accessible URL or upload a file to Tensorlake and use the file ID. file_url = "https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/real-estate-purchase-all-signed.pdf" parse_id = doc_ai.parse_dataset_file( dataset=dataset, file=file_url ) ``` ```bash curl theme={null} curl --request POST \ --url https://api.tensorlake.ai/documents/v2/datasets/{dataset_id}/parse \ --header 'Authorization: Bearer ${TENSORLAKE_API_KEY}' \ --header 'Content-Type: application/json' \ --data '{ "file_url": "https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/real-estate-purchase-all-signed.pdf" }' ``` ```python python theme={null} # Retrieve the outputs of the parsing job. result = doc_ai.wait_for_completion(parse_id) # The result contains the parsed document and any extracted data. 
print(result) ``` ```bash curl theme={null} curl --request GET \ --url https://api.tensorlake.ai/documents/v2/parse/{parse_id} \ --header 'Authorization: Bearer ${TENSORLAKE_API_KEY}' \ --header 'Content-Type: application/json' # Result will only be available after the parsing job is complete. ``` The Python SDK `wait_for_completion` method will block until the parsing job is complete and return the result. With datasets, you can ingest as many files as you want, and the parsing configuration will be applied to all of them. You can also create a dataset with structured extraction options, which will allow you to extract structured data from related documents. # Retrieve Dataset Data Source: https://docs.tensorlake.ai/document-ingestion/datasets/data Retrieve the data stored in a Tensorlake Dataset. A Tensorlake Dataset is a collection of parsed results from documents that were parsed using the options defined by the Dataset. You can retrieve the parsed result data stored in a Dataset using the `/datasets/{dataset_id}/data` endpoint. ```python Python theme={null} from tensorlake.documentai.client import ( DocumentAI, ParseStatus, ) doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_API_KEY") # If you don't know your dataset ID, you can go to https://cloud.tensorlake.ai # and find it in the Datasets section. 
dataset = doc_ai.get_dataset("your_dataset_id")
dataset_data = doc_ai.get_dataset_data(dataset)

# Each item is the result of one parse job started through the dataset
for parsed_result in dataset_data.items:
    print(f"Parse ID: {parsed_result.parse_id}")
    if parsed_result.status == ParseStatus.SUCCESSFUL:
        print("Parsed Document:")
        print(parsed_result.document)
    else:
        print(f"Parse Status: {parsed_result.status}")
```

```bash curl theme={null}
curl --request GET \
  --url https://api.tensorlake.ai/documents/v2/datasets/{dataset_id}/data \
  --header 'Authorization: Bearer ${TENSORLAKE_API_KEY}' \
  --header 'Content-Type: application/json'
```

Dataset data is returned as a paginated list of results from [parse jobs initiated via the Dataset](/document-ingestion/datasets/create). Each item in the list follows the same structure as the [parse results from regular parse jobs](/document-ingestion/parsing/read#understand-the-parsing-output).

Both the API and the Python SDK use cursor-based pagination to retrieve the Dataset data. The response includes a `next_cursor` field that you can use to retrieve the next page of results.

### Filtering Dataset Data

The [`/datasets/{dataset_id}/data`](/api-reference/v2/datasets/data) endpoint supports filtering the Dataset data by the following parameters:

* `status`: Filter by the status of the parse job (e.g., `Pending`, `Processing`, `Successful`, `Failure`).
* `file_name`: Filter by the name of the file that was parsed. This may not be available if the parsed file was not uploaded to Tensorlake (e.g. if you used a `file_url` or `raw_text`).
* `created_after`: Filter by an inclusive date after which the parse job was created. Date should be in RFC 3339 format (e.g., `2023-10-01T00:00:00Z`).
* `created_before`: Filter by an inclusive date before which the parse job was created. Date should be in RFC 3339 format (e.g., `2023-10-01T00:00:00Z`).
* `finished_after`: Filter by an inclusive date after which the parse job was finished.
Date should be in RFC 3339 format (e.g., `2023-10-01T00:00:00Z`). * `finished_before`: Filter by an inclusive date before which the parse job was finished. Date should be in RFC 3339 format (e.g., `2023-10-01T00:00:00Z`). # Managing Files Source: https://docs.tensorlake.ai/document-ingestion/file-management/overview Upload, list, and delete files in Tensorlake — or skip the upload step by passing a pre-signed or publicly accessible URL to the parse endpoint. You can upload files to Tensorlake for parsing without relying on any external storage like S3. The file upload API returns a unique `file_id` that can be used with the [parse](/api-reference/v2/parse/parse) endpoint to parse the file. We also support pre-signed URLs or any publicly accessible URLs for files. You can skip the upload step and directly use the [parse](/api-reference/v2/parse/parse) endpoint with file URLs. ## Upload Files Files uploaded are scoped to a specific project in your account. The API key provided with the API calls is used to determine the project to which the file is uploaded. This is used to secure the files uploaded and isolate them from other projects in your account. The file upload API is not intended for files larger than 1 GB. If you have files larger than 1 GB, please reach out to us at [support@tensorlake.ai](mailto:support@tensorlake.ai). ```python theme={null} from tensorlake.documentai import DocumentAI doc_ai = DocumentAI(api_key="xxxx") file_id = doc_ai.upload(path="/path/to/file.pdf") ``` ```bash theme={null} curl -X POST https://api.tensorlake.ai/documents/v2/files \ -H "Authorization: Bearer " \ -F "file=@/path/to/file.pdf" ``` ## List Files Using the API key for a specific Tensorlake project, you can list all of the files that are a part of that project. 
```python theme={null}
from tensorlake.documentai import DocumentAI

doc_ai = DocumentAI(api_key="xxxx")
files_page = doc_ai.files()
```

```bash theme={null}
curl -X GET https://api.tensorlake.ai/documents/v2/files \
  -H "Authorization: Bearer "
```

## Delete Files

If you have documents you want to remove from Tensorlake Cloud, you can delete them by passing in the `file_id`.

```python theme={null}
from tensorlake.documentai import DocumentAI

doc_ai = DocumentAI(api_key="xxxx")
doc_ai.delete_file(file_id="file_unique_id")
```

```bash theme={null}
curl -X DELETE https://api.tensorlake.ai/documents/v2/files/file_unique_id \
  -H "Authorization: Bearer "
```

# Document Ingestion Overview

Source: https://docs.tensorlake.ai/document-ingestion/overview

Tensorlake Document Ingestion turns unstructured documents into structured data, built as an application on Orchestrate.

Document AI is built as an application on [Orchestrate](/applications/introduction). You can build your own document ingestion application on TensorLake using any OCR model or VLM. We plan to open-source Document AI soon. Once available, you'll be able to install it on your own TensorLake account for a private deployment.

Tensorlake’s Document Ingestion API turns unstructured documents into **agent-ready inputs**: layout-aware Markdown chunks and schema-validated structured data.

Tensorlake’s Document Ingestion API provides a comprehensive set of tools for converting documents into Markdown or structured data. It’s backed by our state-of-the-art OCR and VLM models, and is designed to be used in conjunction with our Agentic Runtime or separately for document processing workflows.

The Document Ingestion API has the following capabilities:

* [Read](/document-ingestion/parsing/read): Converts any document or image to Markdown and provides layout information.
* [Edit](/document-ingestion/parsing/edit): Allows filling forms and modifying documents using a prompt.
* [Extract](/document-ingestion/parsing/structured-extraction): Extracts structured data from documents using JSON Schemas. * [Classify](/document-ingestion/parsing/page-classification): Classifies pages into categories. * [Summarize](/document-ingestion/parsing/read#table-and-figure-summarization): Summarizes tables, figures, and charts in documents. * [Signature Detection](/document-ingestion/parsing/signature): Detects signatures in documents. * [Barcode Detection](/document-ingestion/parsing/barcode): Detects and reads barcodes in documents. * [Cross-page Header Correction](/document-ingestion/parsing/header-correction): Fixes headers that span or repeat across pages to improve document structure. * [Table Merging](/document-ingestion/parsing/table-merging): Merges tables that span multiple pages into a single unified table. * [Chart Extraction](/document-ingestion/parsing/chart-extraction): Extracts and processes charts as distinct elements separate from figures. * [Key-Value Extraction](/document-ingestion/parsing/key-value-extraction): Extracts key-value pairs from forms such as loan applications, insurance claims, and tax documents. ## How it works 1. Upload — Send a PDF, image, spreadsheet, or 15+ supported formats 2. Process — We have multiple Document AI endpoints that can run OCR, detect layout, extract tables, summarize figures/tables/charts, extract structured data, detect signatures, read bar codes, and more depending on your use case. 3. Receive — Get clean JSON with text, tables, bounding boxes and structured output. ## Integration with Your Existing Workflows Document Ingestion API is a standalone API that can be used independently of the Agentic Runtime. You can call the APIs directly from your existing workflows. We also support sending webhooks when a document parsing job is completed. See [Webhooks](/webhooks/overview). Get started with the Document Ingestion API. Read documents and get Markdown and layout information. 
Understand the output of the document ingestion job. ## Supported File Types Tensorlake supports the following file types: * PDF * Images (PNG, JPG, TIFF) * Presentations (PPTX, Keynote) * Raw Text (plain text, HTML, Markdown) * Spreadsheets (XLSX, XLSM, XLS, CSV) * Word Documents (DOC, DOCX) * RTF * P7M # Barcode Detection Source: https://docs.tensorlake.ai/document-ingestion/parsing/barcode Detecting and decoding barcodes from document pages, returning type, value, and bounding boxes as structured output. ## Overview Barcode detection identifies and decodes barcodes found in document pages. Each detected barcode is returned as a structured page fragment with its decoded value, barcode type, and bounding box — allowing downstream processing to handle barcodes separately from text, tables, and other content. Barcode detection is available in the `model03` OCR model and is enabled via a flag in `ParsingOptions`. ## Enabling Barcode Detection Set `barcode_detection="true"` in your `ParsingOptions` along with `ocr_model="model03"`: Barcode detection requires Tensorlake Python SDK version `0.2.91` or later. Run `pip install --upgrade tensorlake` before using this feature. ```python theme={null} from tensorlake import DocumentAI, ParsingOptions doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_CLOUD_API_KEY") file_id = doc_ai.upload(path="barcode_file.pdf") parsing_options = ParsingOptions( ocr_model="model03", barcode_detection="true", ) parse_id = doc_ai.read( file_id=file_id, parsing_options=parsing_options, ) result = doc_ai.wait_for_completion(parse_id) ``` ## How It Works When barcode detection is enabled, the pipeline: 1. Parses each page into fragments (text, tables, barcodes, etc.) 2. Runs a barcode detector and decoder over the page image 3. Emits `fragment_type: "barcode"` entries alongside other page fragments 4. 
Includes bounding boxes and page dimensions so barcodes can be positioned or highlighted in a viewer ## Fragment Output Each detected barcode is returned as a fragment with the following structure: ```json theme={null} { "fragment_type": "barcode", "content": { "content": "PDF417: 4QGDkVjpF7nuGhQiOgLHwc", "html": null }, "reading_order": 9, "bbox": { "x1": 2, "y1": 444, "x2": 207, "y2": 475 } } ``` | Field | Description | | ----------------- | --------------------------------------------------------------------------- | | `fragment_type` | Always `"barcode"` for barcode fragments | | `content.content` | The barcode type and decoded value, formatted as `": "` | | `reading_order` | Position of this fragment in reading order relative to other page fragments | | `bbox` | Bounding box coordinates `(x1, y1, x2, y2)` in page pixels | ## Common Use Cases Barcodes appear across many operational document types: * **Shipping labels and packing slips** — tracking numbers and carrier codes * **Lab reports and sample labels** — specimen and sample IDs * **Insurance documents** — claim IDs and policy references * **Utility bills, tickets, and receipts** — account numbers and confirmation codes With barcodes returned as structured fragments alongside text and tables, you can: * Match barcode values to internal IDs (shipment, claim, order, patient) * Validate that a barcode value matches a printed text label on the same page * Flag documents where an expected barcode is missing or unreadable * Render barcode overlays in a document viewer using the provided bounding boxes ## Related * [Parsing Overview](/document-ingestion/parsing/read) * [Parse Output](/document-ingestion/parsing/parse-output) * [Sample Notebook: Barcode Detection](https://tlake.link/notebooks/barcode-detection) # Chart Extraction Source: https://docs.tensorlake.ai/document-ingestion/parsing/chart-extraction Extract structured, plottable data from charts embedded in documents — bar, line, scatter, and pie charts 
output as standardized JSON. ## Overview Charts in PDFs and documents are static images. Traditional parsers either skip them entirely or return a generic figure fragment with no underlying data. Tensorlake's Agentic Chart Extraction transforms those images into structured, usable JSON — detecting the chart type, extracting data series and axis information, and producing output that can be fed directly into analytics, BI tools, or plotted programmatically. Enable it with `chart_extraction=True` in your `EnrichmentOptions`. ## Enabling Chart Extraction Set `chart_extraction=True` in your `EnrichmentOptions`: ```python Python SDK theme={null} from tensorlake.documentai import DocumentAI from tensorlake.documentai.models.options import EnrichmentOptions doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_CLOUD_API_KEY") file_id = doc_ai.upload(path="document.pdf") enrichment_options = EnrichmentOptions( chart_extraction=True, ) parse_id = doc_ai.read( file_id=file_id, enrichment_options=enrichment_options, ) result = doc_ai.wait_for_completion(parse_id) ``` ```bash curl theme={null} curl --request POST \ --url https://api.tensorlake.ai/documents/v2/parse \ --header 'Authorization: Bearer ${TENSORLAKE_API_KEY}' \ --header 'Content-Type: application/json' \ --data '{ "file_id": "file_XXX", "enrichment_options": { "chart_extraction": true } }' ``` ## How It Works For each chart detected in the document, the system: 1. Identifies the chart type (bar, line, scatter, or pie) 2. Extracts axis definitions, series names, data points, and rendering hints (colors, markers, legend position) 3. Outputs a standardized JSON object conforming to the schema for that chart type All predictions conform to predefined schemas, so a single parser can consume every chart JSON produced without per-chart ad-hoc handling. Each JSON includes numeric arrays and rendering hints, making it directly plottable without any additional transformation. 
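Because every chart follows a predefined schema, consuming the output can take only a few lines. The sketch below assumes a bar chart object with the `type`, `x_axis.categories`, and `series[].name` / `series[].data` fields that appear in this page's bar chart example. The `bar_chart_rows` helper and the sample values are hypothetical, and how the chart JSON is pulled out of a parse result is not shown here.

```python
import json


def bar_chart_rows(chart):
    """Flatten a bar-chart object into (category, series_name, value) rows."""
    if chart.get("type") != "bar_chart":
        raise ValueError(f"expected a bar_chart, got {chart.get('type')!r}")
    categories = chart["x_axis"]["categories"]
    rows = []
    for series in chart["series"]:
        # Each series carries one value per category, in category order.
        for category, value in zip(categories, series["data"]):
            rows.append((category, series["name"], value))
    return rows


# Made-up sample in the same shape as the bar chart output on this page.
sample = json.loads("""
{
  "type": "bar_chart",
  "x_axis": {"categories": ["North America", "Europe"]},
  "series": [
    {"name": "Solar", "data": [120, 150]},
    {"name": "Wind", "data": [180, 220]}
  ]
}
""")

for row in bar_chart_rows(sample):
    print(row)
```

Rows in this long format load directly into a spreadsheet, a dataframe, or a SQL table. The other chart types would need their own small adapters, since scatter plots use `x_data`/`y_data` and pie charts are slice-based.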
## Supported Chart Types | Chart type | Schema highlights | | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------- | | **Bar** | `orientation` (vertical/horizontal), named `series` for grouped/stacked bars, `x_axis.categories`, optional axis bounds and per-bar display flags | | **Line** | x/y axis definitions, explicit `values` arrays (numeric or categorical), multiple `series` with `color`, `line_style`, and `marker` styling | | **Scatter** | Per-series `x_data`/`y_data` arrays, marker styling (`size`, `alpha`, `edge_color`), and axis bounds | | **Pie** | Slice-centric schema with `label`, `value`, optional `percentage`, `colors`, and display flags | ## Output Examples ### Bar chart ```json theme={null} { "type": "bar_chart", "title": "Annual Energy Consumption by Source (TWh)", "orientation": "vertical", "x_axis": { "label": "REGION", "categories": ["North America", "Europe", "Asia", "Africa"] }, "y_axis": { "label": "TWh", "min": 0, "max": 1000, "format": "number" }, "series": [ { "name": "Solar", "data": [120, 150, 200, 80], "color": "#FFD700" }, { "name": "Wind", "data": [180, 220, 300, 60], "color": "#00BFFF" }, { "name": "Hydro", "data": [250, 180, 400, 120], "color": "#32CD32" }, { "name": "Nuclear", "data": [300, 450, 280, 20], "color": "#FF4500" }, { "name": "Fossil Fuels", "data": [500, 400, 800, 350], "color": "#B0B0B0" } ], "bar_style": "grouped", "grid": true } ``` ### Scatter plot ```json theme={null} { "type": "scatter_plot", "title": "Urban vs Rural: Income vs Spending", "x_axis": { "label": "Annual Income ($k)", "min": 10, "max": 150, "scale": "linear" }, "y_axis": { "label": "Annual Spending ($k)", "min": 0, "max": 100, "scale": "linear" }, "series": [ { "name": "Urban", "x_data": [24, 32, 35, 42, 67, 80, 91, 110, 125], "y_data": [4, 11, 22, 23, 27, 46, 59, 57, 71], "color": "#5da5da", "marker": "o", "alpha": 0.75 }, { "name": "Rural", 
"x_data": [25, 28, 38, 47, 63, 77, 98, 115, 129], "y_data": [7, 16, 17, 29, 27, 44, 63, 78, 81], "color": "#faa43a", "marker": "s", "alpha": 0.75 } ], "legend_position": "upper right", "grid": true } ``` ### Line chart ```json theme={null} { "type": "line_chart", "title": "Uncorrelated Remote Sensor Readings", "x_axis": { "label": "Observation Minute", "values": [0, 2, 4, 6, 8, 10], "scale": "linear" }, "y_axis": { "label": "Value", "min": 15, "max": 90, "scale": "linear" }, "series": [ { "name": "Room A (Stable)", "data": [56, 54, 48, 55, 50, 54], "color": "#F472B6", "line_style": "-" }, { "name": "Room B (Cooling)", "data": [81, 80, 76, 80, 79, 71], "color": "#9CA3AF", "line_style": "-" }, { "name": "Room C (Cyclic)", "data": [31, 32, 34, 33, 38, 39], "color": "#FDE047", "line_style": "-" }, { "name": "Outdoor (Variable)","data": [40, 41, 41, 39, 41, 40], "color": "#9CD9D3", "line_style": "-" } ], "legend_position": "upper right", "grid": true } ``` ## Common Use Cases * **Financial reports** — extract revenue, cost, and margin trends from bar and line charts without manual transcription * **Scientific papers** — recover experimental data points from scatter plots for further analysis or comparison * **Business presentations** — pull KPI charts into structured data for dashboards and reporting pipelines * **RAG pipelines** — surface chart data as structured context so LLMs can answer quantitative questions about visuals * **BI and analytics** — re-plot or aggregate extracted series directly using the output JSON without rebuilding data manually ## Related * [Parsing Overview](/document-ingestion/parsing/read) * [Parse Output](/document-ingestion/parsing/parse-output) * [Table Merging](/document-ingestion/parsing/table-merging) # Docx Parsing with Tracked Changes Source: https://docs.tensorlake.ai/document-ingestion/parsing/docx-parsing Learn how to parse Docx documents, including tracked changes and comments. 
When parsing DOCX files that contain tracked changes or comments, Tensorlake preserves this collaboration metadata in the HTML output. This enables workflows that need to process document revisions, review comments, or extract specific change history.

Tracked changes and comments are preserved using semantic HTML markup:

**Tracked Changes:**

* **Insertions**: `<ins>inserted text</ins>` - Text that was added to the document
* **Deletions**: `<del>deleted text</del>` - Text that was removed or struck through

**Comments:**

* **Comment ranges**: `<span class="comment" data-note="comment text">highlighted text</span>` - Comments anchored to selected text
* **Comment references**: `` - Comments at cursor positions without highlighted text

#### Example Output

```html Markdown theme={null}

Initial damage estimates suggest total losses between $2.8M and $3.4M, based on preliminary contractor assessments, which falls within policy limits though a complete forensic analysis is pending.

```

#### Extracting Change Data Programmatically

Use these HTML patterns to extract specific content types:

```python Python theme={null}
from bs4 import BeautifulSoup

# html_content holds the HTML string produced by the parse job
soup = BeautifulSoup(html_content, 'html.parser')

# Extract all comments (anchored text plus the comment stored in data-note)
comments = []
for span in soup.find_all('span', class_='comment'):
    comments.append({
        'text': span.get_text(strip=True),
        'comment': span.get('data-note', '')
    })

# Extract all deletions
deletions = [del_tag.get_text() for del_tag in soup.find_all('del')]
for deletion in deletions:
    print(f"Deleted: {deletion}")

# Extract all insertions
insertions = [ins_tag.get_text() for ins_tag in soup.find_all('ins')]
for insertion in insertions:
    print(f"Inserted: {insertion}")

# Print all comments
for comment in comments:
    print(f"Comment: {comment['text']} - {comment['comment']}")
```

Tracked changes are only preserved when parsing DOCX files that contain Microsoft Word's revision history. Regular text formatting (bold, italic) is handled separately through standard HTML markup.

# Edit Documents
Source: https://docs.tensorlake.ai/document-ingestion/parsing/edit

Fill forms and modify documents

The Edit API allows you to programmatically edit documents, such as filling forms using AI.

## API Usage Guide

Calling the [edit](/api-reference/v2/edit) endpoint initiates a document editing job.
```python theme={null}
from tensorlake.documentai import DocumentAI, FormFillingOptions

doc_ai = DocumentAI(api_key="YOUR_API_KEY")

form_filling = FormFillingOptions(
    fill_prompt="Fill the form for John Doe, born 01/01/1980.",
    ignore_source_values=True
)

# Returns a job ID
job_id = doc_ai.edit(
    file_id="file_XXX",
    form_filling=form_filling
)
```

```bash theme={null}
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/edit \
  --header 'Authorization: Bearer ' \
  --header 'Content-Type: application/json' \
  --data '{
    "file_id": "file_XXX",
    "form_filling": {
      "fill_prompt": "Fill the form for John Doe, born 01/01/1980.",
      "ignore_source_values": true
    }
  }'
```

## Form Filling

The `form_filling` object configures how the document should be filled.

| Parameter              | Description                                                                                                  | Default Value |
| ---------------------- | ------------------------------------------------------------------------------------------------------------ | ------------- |
| `fill_prompt`          | A custom prompt to use for form filling. This provides context or data to the AI model for filling the form. | `None`        |
| `ignore_source_values` | If `true`, the model will ignore existing values in the form fields and overwrite them.                      | `false`       |
| `no_acroform`          | If `true`, the model will not use AcroForm detection (standard PDF form fields).                             | `false`       |
| `no_widget_detection`  | If `true`, the model will not perform widget detection using visual analysis.                                | `false`       |

## Output

The output of the edit operation includes the modified document and metadata.

* `filled_pdf_base64`: The base64-encoded string of the filled PDF document.
* `form_filling_metadata`: A dictionary containing metadata about the form filling process, such as fields identified and filled.

# Cross-page Header Correction
Source: https://docs.tensorlake.ai/document-ingestion/parsing/header-correction

Automatically detect and correct document header hierarchy across pages, even when OCR misidentifies header levels.
## Overview

Cross-page header correction analyzes header patterns across an entire document and corrects their hierarchy. OCR engines frequently misidentify header depth: a subsection labeled "2.2" might be emitted as a second-level header (`##`) instead of a third-level one (`###`). This feature resolves those inconsistencies and detects headers that span page breaks without fragmentation.

Each corrected `section_header` fragment includes a `level` attribute that accurately reflects its depth in the document hierarchy (0 for `#`, 1 for `##`, 2 for `###`, etc.).

## Enabling Header Correction

Set `cross_page_header_detection=True` in your `ParsingOptions`:

```python Python SDK theme={null}
from tensorlake.documentai import DocumentAI, ParsingOptions

doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_CLOUD_API_KEY")

file_id = doc_ai.upload(path="document.pdf")

parsing_options = ParsingOptions(
    cross_page_header_detection=True,
)

parse_id = doc_ai.read(
    file_id=file_id,
    parsing_options=parsing_options,
)

result = doc_ai.wait_for_completion(parse_id)
```

```bash curl theme={null}
# The Authorization header uses double quotes so the shell expands the variable
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/parse \
  --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "file_id": "file_XXX",
    "parsing_options": {
      "cross_page_header_detection": true
    }
  }'
```

## How It Works

When enabled, the pipeline:

1. Parses all pages and collects every `section_header` fragment across the document
2. Analyzes numbering patterns (e.g. `1.`, `1.1.`, `1.1.1.`) and visual structure to infer correct depth
3. Assigns accurate `level` values to each header, overriding what the OCR engine reported
4. Detects headers that span page breaks and merges them into a single fragment

For example, a document with an incorrectly leveled subsection:

```markdown theme={null}
# Effectiveness of ω-3 Polyunsaturated Fatty Acids...
## 1. Introduction
## 2. Materials and Methods
### 2.1. Subjects
## 2.2. Statistical Analysis  ← Wrong level (should be ###)
## 3. Results
```

becomes:

```markdown theme={null}
# Effectiveness of ω-3 Polyunsaturated Fatty Acids...
## 1. Introduction
## 2. Materials and Methods
### 2.1. Subjects
### 2.2. Statistical Analysis  ← Corrected
## 3. Results
```

## Fragment Output

Corrected headers are returned as `section_header` page fragments. Each fragment includes:

| Field             | Description                                                                 |
| ----------------- | --------------------------------------------------------------------------- |
| `fragment_type`   | Always `"section_header"` for header fragments                              |
| `content.level`   | Integer representing header depth (0 = `#`, 1 = `##`, 2 = `###`, etc.)      |
| `content.content` | Clean header text without markdown formatting                               |
| `reading_order`   | Position of this fragment in reading order relative to other page fragments |
| `bbox`            | Bounding box coordinates `(x1, y1, x2, y2)` in page pixels                  |

### Example fragment

```json theme={null}
{
  "fragment_type": "section_header",
  "content": {
    "level": 2,
    "content": "2.2. Statistical Analysis"
  },
  "reading_order": 5,
  "bbox": { "x1": 72, "y1": 310, "x2": 540, "y2": 328 }
}
```

## Accessing Corrected Headers

```python theme={null}
for page in result.outputs.document.pages:
    for fragment in page.page_fragments:
        if fragment.fragment_type == "section_header":
            print(f"Level {fragment.content.level}: {fragment.content.content}")
```

Example output:

```
Level 0: Effectiveness of ω-3 Polyunsaturated Fatty Acids...
Level 1: 1. Introduction
Level 1: 2. Materials and Methods
Level 2: 2.1. Subjects
Level 2: 2.2. Statistical Analysis
Level 1: 3. Results
```

## Building a Document Outline

Use the `level` attribute to render a nested outline of the document:

```python theme={null}
for page in result.outputs.document.pages:
    for fragment in page.page_fragments:
        if fragment.fragment_type == "section_header":
            # Indent two spaces per header level
            indent = "  " * fragment.content.level
            print(f"{indent}• {fragment.content.content}")
```

Example output:

```
• Effectiveness of ω-3 Polyunsaturated Fatty Acids...
  • 1. Introduction
  • 2. Materials and Methods
    • 2.1. Subjects
    • 2.2. Statistical Analysis
  • 3. Results
```

## Common Use Cases

* **RAG pipelines** — accurate header boundaries improve chunking quality and context preservation for retrieval
* **Document outlines** — build navigable tables of contents programmatically from any document
* **Knowledge graphs** — construct accurate document trees with correct parent-child header relationships
* **Research paper processing** — parse structured academic documents with multi-level section hierarchies

## Related

* [Parsing Overview](/document-ingestion/parsing/read)
* [Parse Output](/document-ingestion/parsing/parse-output)
* [Sample Notebook: Header Detection](https://tlake.link/notebooks/header-correction)

# Key-Value Extraction
Source: https://docs.tensorlake.ai/document-ingestion/parsing/key-value-extraction

Template-free extraction of structured field data from forms — text inputs, checkboxes, radio buttons, dropdowns, and signature lines.

## Overview

Forms are everywhere in enterprise documents — loan applications, insurance claims, medical surveys, compliance questionnaires. But processing them at scale is hard: layouts vary, fields shift position, and content is often mixed with tables, text, and illustrations on the same page.
Tensorlake's Agentic Key-Value Extraction solves this with a two-stage pipeline: it first detects whether a page component is actually a form (skipping expensive vision models on non-form content), then extracts every field into structured JSON with its name, type, value, and an optional box ID. No templates, no coordinate mapping, no per-form configuration. Enable it with `key_value_extraction=True` in your `EnrichmentOptions`.

## Enabling Key-Value Extraction

Set `key_value_extraction=True` in your `EnrichmentOptions`:

```python Python SDK theme={null}
from tensorlake.documentai import DocumentAI
from tensorlake.documentai.models.options import EnrichmentOptions

doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_CLOUD_API_KEY")

file_id = doc_ai.upload(path="form.pdf")

enrichment_options = EnrichmentOptions(
    key_value_extraction=True,
)

parse_id = doc_ai.read(
    file_id=file_id,
    enrichment_options=enrichment_options,
)

result = doc_ai.wait_for_completion(parse_id)
```

```bash curl theme={null}
# The Authorization header uses double quotes so the shell expands the variable
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/parse \
  --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "file_id": "file_XXX",
    "enrichment_options": {
      "key_value_extraction": true
    }
  }'
```

## How It Works

### Stage 1 — Form Detection

When Tensorlake encounters a layout component, a lightweight vision model first determines whether it is actually a form. Non-form content (tables, text blocks, illustrations) is skipped immediately, so expensive extraction models are only invoked on pages or regions that contain real form fields. This keeps costs low and processing fast.
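
The gating idea behind Stage 1 can be sketched in a few lines. This is only an illustration of the pattern, not Tensorlake's implementation: the `Component` type, `looks_like_form`, `extract_fields`, and `run_pipeline` are hypothetical names invented for this example, and the "detector" here is a trivial check standing in for a vision model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Component:
    """A layout component on a page (hypothetical type for this sketch)."""
    kind: str  # e.g. "text", "table", "form"
    text: str


def looks_like_form(component: Component) -> bool:
    """Cheap stand-in for the lightweight form detector (assumption)."""
    return component.kind == "form"


def extract_fields(component: Component) -> List[Dict[str, str]]:
    """Stand-in for the expensive extraction step (assumption).

    Treats every "label: value" line as a text field.
    """
    fields = []
    for line in component.text.splitlines():
        if ":" in line:
            name, value = line.split(":", 1)
            fields.append(
                {"field_name": name.strip(), "type": "text", "value": value.strip()}
            )
    return fields


def run_pipeline(
    components: List[Component],
    detector: Callable[[Component], bool] = looks_like_form,
    extractor: Callable[[Component], List[Dict[str, str]]] = extract_fields,
) -> Tuple[List[List[Dict[str, str]]], int]:
    """Call the extractor only on components the detector accepts."""
    results = []
    expensive_calls = 0
    for component in components:
        if not detector(component):
            continue  # Stage 1: non-form content is skipped immediately
        expensive_calls += 1
        results.append(extractor(component))  # Stage 2: field extraction
    return results, expensive_calls


components = [
    Component("text", "Introductory paragraph"),
    Component("form", "Name: Jane Doe\nPolicy number: 12345"),
    Component("table", "a | b"),
]
results, calls = run_pipeline(components)
print(calls, results)
```

Of the three components, only the one flagged as a form reaches the extractor, which is the cost-saving property the stage is designed around.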
### Stage 2 — Agentic Field Extraction Once a form is identified, the agent extracts its fields by reasoning about: * **Multi-field patterns** — grouping related fields such as address components or checkbox groups * **Context** — inferring field purpose from surrounding text and document structure * **Visual cues** — recognizing checkboxes, radio buttons, and text boxes by appearance * **Spatial relationships** — resolving which labels correspond to which input fields ## Supported Field Types | Type | Description | | -------------- | ------------------------------------- | | `text` | Free-text input fields | | `checkbox` | Boolean tick boxes (`true` / `false`) | | `radio button` | Single-select option groups | | `dropdown` | Select menus with a chosen value | | `signature` | Signature line fields | ## Output Each extracted form produces a JSON array of field objects: | Field | Description | | ------------ | ------------------------------------------------------------------------- | | `box_id` | Optional reference ID linking the field back to a labeled box in the form | | `field_name` | Label or purpose of the field (e.g. `"Federal income tax withheld"`) | | `type` | Input type (e.g. `"text"`, `"checkbox"`) | | `value` | Current content of the field | ### Example — W-2 form ```json theme={null} [ { "box_id": "a", "field_name": "Employee's social security number", "type": "text", "value": "123-45-6789" }, { "box_id": "b", "field_name": "Employer identification number (EIN)", "type": "text", "value": "98-7654321" }, { "box_id": "1", "field_name": "Wages, tips, other compensation", "type": "text", "value": "85,000.00" }, { "box_id": "2", "field_name": "Federal income tax withheld", "type": "text", "value": "12,750.00" }, { "box_id": "c", "field_name": "Employer's name, address, and ZIP code", "type": "text", "value": "ABC Technologies Inc. 
1234 Innovation Drive San Francisco, CA 94105" } ]
```

The same output is also available as readable Markdown:

```
[a] **Employee's social security number** (text): 123-45-6789
[b] **Employer identification number (EIN)** (text): 98-7654321
[1] **Wages, tips, other compensation** (text): 85,000.00
[2] **Federal income tax withheld** (text): 12,750.00
[c] **Employer's name, address, and ZIP code** (text): ABC Technologies Inc. ...
```

## Common Use Cases

* **Loan and mortgage applications** — extract applicant data, income fields, and declaration checkboxes without per-lender templates
* **Insurance claims** — pull policy numbers, claimant details, and coverage selections from variable claim form layouts
* **Medical surveys and intake forms** — capture patient responses, checkbox selections, and consent signatures
* **Tax documents** — extract labeled box values from W-2s, 1099s, and other structured government forms
* **Compliance questionnaires** — process due-diligence and KYC forms across counterparties with different layouts

## Related

* [Parsing Overview](/document-ingestion/parsing/read)
* [Parse Output](/document-ingestion/parsing/parse-output)
* [Structured Data Extraction](/document-ingestion/parsing/structured-extraction)
* [Chart Extraction](/document-ingestion/parsing/chart-extraction)

# On-premise deployment
Source: https://docs.tensorlake.ai/document-ingestion/parsing/on-prem

Run Tensorlake's Document Ingestion API in your own AWS infrastructure — OCR models, API services, and the open-source Compute Engine.

We are starting to support running Tensorlake's Document Ingestion API in your own infrastructure. The main components are our OCR models, API services, and our open-source [Compute Engine](https://github.com/tensorlakeai/indexify), which runs the models. Please contact us at [support@tensorlake.ai](mailto:support@tensorlake.ai) to get access.
At the moment, we only support running in AWS infrastructure, but if you have other requirements, please contact us and we will be happy to discuss.

## Compute and Storage Requirements

1. 2 x 40GB A100 GPU - For Layout Understanding, OCR, Summarization and Table Extraction
2. 1 x OpenAI API key - For leveraging OpenAI models for structured output extraction - optional
3. 1 x RDS or equivalent Postgres database - For managing user data and document metadata.
4. 2 x S3 buckets - For storing your documents and outputs.

## Onboarding

To get an on-premise version of Document Ingestion, please contact us at [support@tensorlake.ai](mailto:support@tensorlake.ai). We will provide you with the installation package, which includes all the necessary components and instructions to set up the system in your infrastructure.

Our IAM policies are based on AWS STS roles, so you will need to provide a role with the necessary permissions to access the S3 buckets and RDS database.

## Installation

Once you have access to the on-premise version of Document Ingestion, you will receive a link to download the installation package. Follow the instructions in the package to install the necessary components.

The components that comprise the Document Ingestion API are:

1. Document AI Server - This is the entrypoint for all document processing requests.
2. Document AI Worker - This component handles asynchronous document processing tasks.
3. Indexify - The workflow orchestration engine, built and run by [Tensorlake](https://github.com/tensorlakeai/indexify).
4. Executors - These are the individual processing units that run the document processing tasks.
5. File Normalization Executor - This executor handles downloading files from the document storage bucket and normalizing their format for further processing.
6. OCR Executor - This executor runs the OCR models to extract text from documents. It uses in-house models for layout understanding and text extraction.
7. Structured Extraction Executor - This executor runs the structured output extraction models. You can use our private structured extraction model on an H100 or use OpenAI's models.
8. Output Formatter Executor - This executor handles the finalization of the document processing workflow, including formatting the output and storing it in the appropriate location.

### Deploying the Document Ingestion Workflow

The installation package includes a deployment script that automates the setup of the Indexify workflow and its executors. The script will guide you through the configuration process and ensure that all components are properly connected.

The script will need to be run on the machine where Indexify is installed, and it will require access to the S3 buckets.

#### Installation script

```bash theme={null}
#!/bin/bash
set -e

# check if aws cli is installed
if ! command -v aws &> /dev/null
then
    echo "aws cli could not be found, please install it first"
    exit 1
fi

# check if indexify-cli is installed, if not install it as a python package in a virtual environment
if ! command -v indexify-cli &> /dev/null
then
    echo "indexify-cli could not be found, installing it now"
    python3 -m venv venv
    source venv/bin/activate
    pip install indexify
fi

# read the version number from the arguments
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <version>"
    exit 1
fi

VERSION=$1

# get the indexify URL from environment variable
if [ -z "$INDEXIFY_URL" ]; then
    echo "Please set the INDEXIFY_URL environment variable"
    exit 1
fi

# Download everything to a temporary directory. The s3 uri looks like this:
# s3://tensorlake-document-ingestion-workflows-prod/workflows/onprem/$VERSION/
TEMP_DIR=$(mktemp -d)
echo "Downloading workflows to temporary directory: $TEMP_DIR"

# test the download directly: with `set -e`, a separate `$?` check would never run on failure
if ! aws s3 sync "s3://tensorlake-document-ingestion-workflows-prod/workflows/onprem/$VERSION/" "$TEMP_DIR/"; then
    echo "Failed to download workflows from s3"
    exit 1
fi

echo "Deploying workflows from version: $VERSION to Indexify instance at $INDEXIFY_URL"

# deploy the workflows defined in onprem_workflows.py using indexify-cli
indexify-cli deploy "$TEMP_DIR/onprem_workflows.py"
```

This script will only work if:

1. The instance running it uses the STS-authorized role to access the S3 bucket.
2. The script has access to the Indexify instance via the `INDEXIFY_URL` environment variable.
3. A valid version number is passed as the argument.

### AWS Credentials

Every service in the Document Ingestion API uses the official AWS SDK to access S3 buckets. You will need to provide your AWS credentials via environment variables; **configuration files are not supported**. Apart from that restriction, the system supports the other kinds of AWS credential providers, including instance profiles and ECS task roles.

### Docker support

We provide Docker images for all components of the Document Ingestion API. You can use these images to run the services in a containerized environment.

To get started with Docker, Tensorlake will provide you with a `docker-compose.yml` file that defines the services and their configurations. You can then use Docker Compose to start and manage the services.

## Configuration

### Organization, projects and API Keys

By default, the on-premise installation of the Document Ingestion API comes with a `default` organization and project. You can configure additional organizations and projects using environment variables or a YAML configuration file.

When there are no API keys configured, the system will allow unauthenticated access to the API. However, it is recommended to configure API keys for better security and access control.

Once organizations, projects, and API keys are configured, the following headers will need to be provided with every request:

1. `X-Tensorlake-Organization-Id`: The ID of the organization making the request.
2. `X-Tensorlake-Project-Id`: The ID of the project belonging to the organization making the request.
3. `Authorization`: Bearer token for the API key associated with the project.

The system does not provide a built-in user interface for managing organizations, projects, and API keys. You will need to manage these configurations manually. Every time you create a new organization or project, you will need to update the configuration files or environment variables accordingly, and restart the services for the changes to take effect.

#### Example configuration

```yaml theme={null}
listen_addr: 0.0.0.0:8700
on_prem_enabled: true
on_prem:
  organizations:
    - id: tensorlake_onprem_prg_1
      projects:
        - id: tensorlake_onprem_project_1
          api_keys:
            - tensorlake_onprem_project_apikey_123456789
            - tensorlake_onprem_project_apikey_987654321
        - id: tensorlake_onprem_project_2
          api_keys:
            - tensorlake_onprem_project2_apikey_123456789
```

With this configuration, a request is valid only if its headers match a configured organization, project, and API key, for example:

```
X-Tensorlake-Organization-Id: tensorlake_onprem_prg_1
X-Tensorlake-Project-Id: tensorlake_onprem_project_1
Authorization: Bearer tensorlake_onprem_project_apikey_123456789
```

### Executors networking

Each executor needs to be able to connect to the Indexify server. By default, the executors will try to connect to `http://indexify-server:8900`, which is the default hostname and port used in the provided `docker-compose.yml` file.

If the executors run on a different host or network, you will need to configure the `INDEXIFY_SERVER_HOST` environment variable to point to the correct URL. The executors need to be able to reach the HTTP port (8900 by default) and the gRPC port (8901 by default).

# Page Classification
Source: https://docs.tensorlake.ai/document-ingestion/parsing/page-classification

Classify pages using semantic descriptions in natural language

Page Classification enables you to automatically categorize pages within documents based on their content.
This allows you to label pages to filter them for downstream use-cases like structured extraction and OCR.

Try this out using this [Colab Notebook](https://tlake.link/docs/page-classifications).

## How Page Classification works

Page Classifications work by analyzing each page of a document and assigning it to one or more predefined categories that you specify. When you initiate a parse job, you can provide a list of page classification configurations as part of your request.

Each page classification configuration consists of:

* **Name**: A unique identifier for the page class
* **Description**: A detailed description that guides the AI model in identifying pages that belong to this category

If you have specified page classes in your parse request, Tensorlake analyzes each page of the document and assigns it to one or more categories that you specify.

## Classification Example

```python Python theme={null}
from tensorlake.documentai import (
    DocumentAI,
    PageClassConfig,
)

doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_API_KEY")

# Define page classifications
page_classifications = [
    PageClassConfig(
        name="signature_page",
        description="Pages containing signatures, signature lines, or signature blocks"
    ),
    PageClassConfig(
        name="terms_and_conditions",
        description="If the page has Terms and Conditions as a section header, classify as terms_and_conditions"
    )
]

parse_id = doc_ai.classify(
    file_id="https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/Fake_Terms_Conditions.pdf",
    page_classifications=page_classifications
)
print(f"Parse job submitted with ID: {parse_id}")

# Get the result
result = doc_ai.wait_for_completion(parse_id=parse_id)

for page_classification in result.page_classes:
    print(f"Classification: {page_classification.page_class}")
    print(f"Pages: {page_classification.page_numbers}")
```

```bash curl theme={null}
# The Authorization header uses double quotes so the shell expands the variable
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/parse \
  --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "file_id": "file_xxxxx",
    "page_classifications": [
      {
        "name": "signature_page",
        "description": "Pages containing signatures, signature lines, or signature blocks"
      },
      {
        "name": "terms_and_conditions",
        "description": "If the page has Terms and Conditions as a section header, classify as terms_and_conditions"
      }
    ]
  }'
```

## Classification Results

When you use page classification, the parse results include a `page_classes` field that contains the classification results. In the JSON response it is keyed by class name:

```json theme={null}
{
  "parse_id": "parse_xxxxx",
  "status": "successful",
  "page_classes": {
    "terms_and_conditions": {
      "page_class": "terms_and_conditions",
      "page_numbers": [ 1 ]
    },
    "signature_page": {
      "page_class": "signature_page",
      "page_numbers": [ 2 ]
    }
  },
  // ... other parse results
}
```

Each classification result includes:

* **page\_class**: The classification name you provided. This will match the `name` field in your `PageClassConfig`.
* **page\_numbers**: An array of page numbers (1-indexed) that match this classification.

## Combining with Structured Extraction

You can combine page classification with structured data extraction to only extract data from specific page types. This speeds up structured extraction on long documents and often improves extraction accuracy.

Check out [this Colab Notebook](https://colab.research.google.com/drive/1Z3fuY1N-PUJGhtOHbGc670PojIfCbsEK?usp=sharing) to see an example of combining Page Classification with Structured Extraction.

## Best Practices

### Writing Effective Descriptions

The quality of your page classifications depends heavily on the descriptions you provide.
Here are some tips: **Be specific and descriptive** ```python theme={null} # Good PageClassConfig( name="financial_summary", description="Pages containing financial summaries, balance sheets, income statements, or tables with monetary values and financial metrics" ) # Less effective PageClassConfig( name="financial_summary", description="Financial pages" ) ``` **Include visual and content cues** ```python theme={null} PageClassConfig( name="signature_page", description="Pages with signature lines, signature blocks, 'Sign here' text, or actual handwritten signatures. May include date fields next to signatures." ) ``` **Mention common patterns** ```python theme={null} PageClassConfig( name="form_page", description="Pages with form fields, checkboxes, fill-in-the-blank sections, or structured input areas for data entry" ) ``` ## Common Use Cases ### Insurance Claims Processing ```python theme={null} page_classifications = [ PageClassConfig( name="claim_form", description="Insurance claim forms with policy numbers, incident details, and claimant information" ), PageClassConfig( name="supporting_documents", description="Supporting documentation like police reports, medical records, or receipts" ), PageClassConfig( name="photos_evidence", description="Pages containing photographs, images, or visual evidence of damages" ) ] ``` ### Legal Document Processing ```python theme={null} page_classifications = [ PageClassConfig( name="contract_terms", description="Main contract pages with terms, conditions, and legal clauses" ), PageClassConfig( name="signature_pages", description="Pages requiring signatures from parties, with signature lines and date fields" ), PageClassConfig( name="exhibits_attachments", description="Exhibits, attachments, or addendums referenced in the main contract" ) ] ``` ### Financial Document Analysis ```python theme={null} page_classifications = [ PageClassConfig( name="executive_summary", description="Executive summary or overview pages with key 
financial highlights" ), PageClassConfig( name="financial_statements", description="Balance sheets, income statements, cash flow statements with numerical financial data" ), PageClassConfig( name="notes_disclosures", description="Footnotes, accounting policies, or disclosure pages explaining financial data" ) ] ``` Page classification works with all supported document types including PDFs, Word documents, images, and more. The AI model analyzes both textual content and visual layout to make classification decisions. # Parsed Document Reference Source: https://docs.tensorlake.ai/document-ingestion/parsing/parse-output Understand the output from calling the Parse API. Parsed document output can be retrieved using the [`/parse/{parse_id}`](/api-reference/v2/parse/get) endpoint, or using the `get_parsed_result` SDK function. ```python Python theme={null} result = doc_ai.get_parsed_result(parse_id) ``` ```bash curl theme={null} curl -X GET "https://api.tensorlake.ai/documents/v2/parse/parse_XXX" \ -H "Authorization: Bearer " \ -H "Content-Type: application/json" ``` The response is a JSON object if you are using the REST API, and a `ParseResult` object if you are using the Python SDK. ```python Python theme={null} class ParseResult(BaseModel): # Parsed document specific fields chunks: Optional[List[Chunk]] = Field( default=None, description="Chunks of layout text extracted from the document. This is a vector of `Chunk` objects, each containing a piece of text extracted from the document. The chunks are typically used for further processing, such as indexing or searching. The value will vary depending on the chunking strategy used during parsing.", ) pages: Optional[List[Page]] = Field( default=None, description="The layout of the document. This is a JSON object that contains the layout information of the document. 
It can be used to understand the structure of the document, such as the position of text, tables, figures, etc.", ) page_classes: Optional[List[PageClass]] = Field( default=None, description="Page classes extracted from the document. This is a list of `PageClass` objects containing the class name and page numbers where each page class appears.", ) structured_data: Optional[List[StructuredData]] = Field( default=None, description="Structured data extracted from the document. The structured data is a list of `StructuredData` objects containing the structured data extracted from the document; formatted according to the schema. This is used to extract structured information from the document, such as tables, forms, or other structured content.", ) merged_tables: Optional[List[MergedTable]] = Field( default=None, description="Merged tables extracted from the document. This is a list of `MergedTable` objects containing the merged tables extracted from the document. Tables are merged if they are part of the same logical table.", ) # Parse details parse_id: str = Field(description="The unique identifier for the parse job") parsed_pages_count: int = Field( description="The number of pages that were parsed successfully.", ge=0 ) status: ParseStatus = Field(description="The status of the parse job.") created_at: str = Field( description="The date and time when the parse job was created in RFC 3339 format." ) error: Optional[str] = Field( default=None, description="Error occurred during any part of the parse execution.", ) finished_at: Optional[str] = Field( default=None, description="The date and time when the parse job was finished in RFC 3339 format.", ) labels: Optional[dict] = Field( default=None, description="Labels associated with the parse job.", ) ``` ```json JSON theme={null} { "chunks": [ { "page_number": 1, "content": "", }, ... 
], "pages": [ { "page_number": 1, "page_fragments": [ { "fragment_type": "", "content": { "content": "", "summary": null }, "reading_order": 0, "bbox": { "x2": 518.0, "x1": 93.0, "y2": 88.0, "y1": 74.0 } }, ... ], "dimensions": [ 1584, 1224 ] }, ... ], "page_classes": null, "structured_data": null, "parse_id": "parse_", "parsed_pages_count": 2, "total_pages": 10, "status": "successful", "created_at": "2025-07-04T05:20:52.285044+00:00", "options": { ... }, "errors": null, "finished_at": "2025-07-04T05:21:10.036248+00:00", "labels": null, "tasks_completed_count": null, "tasks_total_count": null } ``` ## Output Response Fields The response contains the following fields, which describe the parse job and the parsed document: * **parse\_id**: The unique identifier for the parse job. * **parsed\_pages\_count**: An integer representing the number of pages that were parsed successfully. * **total\_pages**: An integer representing the total number of pages in the document. * **status**: The status of the parse job. * **created\_at**: The date and time when the parse job was created in RFC 3339 format. * **finished\_at**: The date and time when the parse job was finished in RFC 3339 format. * **error**: Any errors encountered while parsing the document. * **labels**: Labels associated with the parse job. * **chunks**: An array of objects that contain the markdown content for each chunk. The number of chunks depends on the chunking strategy you chose. See [more below](/document-ingestion/parsing/read#markdown-chunks). * **pages**: A comprehensive JSON representation of the document's visual structure, including page dimensions, bounding boxes for each element, and reading order. See [more below](/document-ingestion/parsing/read#document-layout-and-bounding-boxes). * **page\_classes**: A map where the keys are page class names provided in the parse request, and the values are PageClass objects containing class names and page numbers where each page class appears. 
See [more below](/document-ingestion/parsing/read#page-classifications). * **structured\_data**: A map where the keys are the names of the JSON schema provided in the parse request, and the values are StructuredData objects containing the extracted structured data. See [more below](/document-ingestion/parsing/read#structured-extraction). * **merged\_tables**: A list of merged tables. This is a list of `MergedTable` objects containing the merged tables extracted from the document. Tables are merged if they are part of the same logical table. * **usage**: An object containing usage statistics for the parse job, including token counts and parsed page counts for various extraction tasks. The Outputs class has been documented in the [Python SDK](https://github.com/tensorlakeai/tensorlake/blob/main/src/tensorlake/documentai/models/results.py#L59) and in the [REST API](/api-reference/v2/parse/get). ### Markdown Chunks The markdown content of the document is available in the `chunks` attribute of the JSON response. The number of chunks depends on the chunking strategy you chose. **Chunking Strategy Options** * **None** - The whole document is returned as a single chunk. This allows you to use your own chunking logic. * **Page** - Each page is returned as a separate chunk. You should receive as many chunks as the number of pages in the document. * **Section** - The document is split into chunks based on the section headers detected in the document. * **Fragment** - Every *page fragment* (e.g. table, figure, paragraph) is returned as a separate chunk. You will most likely have to merge these chunks based on your use-case. ### Document Layout and Bounding Boxes The entire document layout is available in the `pages` attribute of the JSON response. This object has a list of Pages, each encoded as a JSON object. Each `pages[x]` contains the following attributes: * **`page_number`** - The page number of the page. * **`dimensions`** - The width and height of the page in pixels. 
* **`page_fragments`** - The list of objects on the page. Each page fragment has the following attributes: * **`fragment_type`** - The type of the object: `section_header, title, text, table, figure, formula, form, key_value_region, document_index, list_item, table_caption, figure_caption, formula_caption, page_footer, page_header, page_number, signature, strikethrough` * **`reading_order`** - The position of the fragment in the reading order, i.e. the order in which a human would read the fragments on the page. * **`bbox`** - The bounding box of the page fragment, as an object with the keys `x1`, `y1`, `x2`, and `y2`. * **`content`** - The actual content that is found on that fragment of the page. ### Page Classifications Page classifications are returned as a list of Page Class objects, which contain the following attributes: * **`page_class`**: The classification name you provided. This will match the `name` field in your `PageClassConfig`. * **`page_numbers`**: An array of page numbers (1-indexed) that match this classification. See [Page Classification](/document-ingestion/parsing/page-classification#understanding-the-results) for more details. ### Structured Extraction Structured Data is returned as a list whose length depends on the partition strategy (e.g. one Structured Data object for each partition of the document). Each object contains: * **`data`**: The JSON object representing the data extracted that matches the input schema. * **`page_numbers`**: A list of page numbers where the structured data was searched for. * **`schema_name`**: The name of the schema provided by the user. See [Structured Extraction](/document-ingestion/parsing/structured-extraction#structured-extraction-response) for more details. ### Usage The `usage` attribute contains usage statistics for the parse job, including token counts and parsed page counts for various extraction tasks. The fields include: * **`pages_parsed`**: The number of pages that were parsed. 
* **`signature_detected_pages`**: The number of pages where signatures were detected. This is only applicable if signature detection was enabled. * **`strikethrough_detected_pages`**: The number of pages where strikethroughs were detected. This is only applicable if strikethrough line detection was enabled. * **`ocr_input_tokens_used`**: The number of input tokens used for OCR processing. * **`ocr_output_tokens_used`**: The number of output tokens generated from OCR processing. * **`extraction_input_tokens_used`**: The number of input tokens used for text extraction. This is only applicable if structured extraction options were enabled. * **`extraction_output_tokens_used`**: The number of output tokens generated from text extraction. This is only applicable if structured extraction options were enabled. * **`summarization_input_tokens_used`**: The number of input tokens used for summarization. * **`summarization_output_tokens_used`**: The number of output tokens generated from summarization. # Read Documents Source: https://docs.tensorlake.ai/document-ingestion/parsing/read Convert documents to Markdown with spatial page layouts: tables, figures, bounding boxes, and reading-order fragments returned by the Read API. The Read API converts Documents to Markdown and provides spatial layouts of pages. The response of the Read API contains: * Markdown representation of pages, with the elements in each page ordered by their natural reading order * Tables encoded as Markdown or HTML * Summaries of tables and figures guided by custom prompts * Bounding boxes for each page element (e.g. signature, key-value pair, figure) Read the [Overview](/document-ingestion/overview) to understand how to integrate Document Parsing into your existing workflows. ## API Usage Guide Calling the [read](/api-reference/v2/parse/read) endpoint will create a new document parsing job, starting in the `pending` state. 
It will transition to the `processing` state and then, once the document has been parsed, to the `successful` state. If you are using the Python SDK, all the configuration options described below are expressed through the `ParsingOptions` class. ```python theme={null} from tensorlake.documentai import ( DocumentAI, ParsingOptions, ChunkingStrategy, TableOutputMode, TableParsingFormat, ) doc_ai = DocumentAI(api_key="xxxx") file_id = "file_xxxx" parsing_options = ParsingOptions( chunking_strategy=ChunkingStrategy.FRAGMENT, table_output_mode=TableOutputMode.MARKDOWN ) parse_id = doc_ai.read(file_id=file_id, page_range="1-2", parsing_options=parsing_options) ``` The HTTP API for parsing is documented [here](/api-reference/v2/parse/parse). Here is an example of how to initiate a parsing job: ```bash theme={null} curl --request POST \ --url https://api.tensorlake.ai/documents/v2/parse \ --header 'Authorization: Bearer ' \ --header 'Content-Type: application/json' \ --data '{ "file_id": "", "page_range": "1-2", "parsing_options": { "chunking_strategy": "fragment", "table_output_mode": "markdown" } }' ``` ## Options for Parsing Documents Document Parsing can be customized by providing the `parsing_options` and `enrichment_options` in your request. | Parameter | Description | | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | | `parsing_options` | Customizes the OCR and table parsing process and chunking strategies. See [Parsing Options](/document-ingestion/parsing/read#parsing-options). | | `enrichment_options` | Enables and configures table and figure summarization. See [Summarization](/document-ingestion/parsing/read#table-and-figure-summarization). | Get a full list of the configuration options in the [`/parse` section of the API reference](/api-reference/v2/parse/parse). 
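As a rough illustration of how the two option groups fit into a single request, the sketch below assembles a `/parse` request body in plain Python. The helper name `build_parse_request` is hypothetical and not part of the Tensorlake SDK. The field names and values come from the curl example above and from the `table_summarization` enrichment option documented on this page; `file_XXX` is a placeholder.

```python
import json
from typing import Any, Dict, Optional


def build_parse_request(
    file_id: str,
    page_range: Optional[str] = None,
    parsing_options: Optional[Dict[str, Any]] = None,
    enrichment_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a request body for the parse endpoint (hypothetical helper)."""
    body: Dict[str, Any] = {"file_id": file_id}
    if page_range is not None:
        body["page_range"] = page_range
    # Only include the option groups that were provided. This assumes that
    # omitted options fall back to the documented default values.
    if parsing_options:
        body["parsing_options"] = parsing_options
    if enrichment_options:
        body["enrichment_options"] = enrichment_options
    return body


payload = build_parse_request(
    file_id="file_XXX",  # placeholder file ID
    page_range="1-2",
    parsing_options={"chunking_strategy": "fragment", "table_output_mode": "markdown"},
    enrichment_options={"table_summarization": True},
)
print(json.dumps(payload, indent=2))
```

The printed JSON is the kind of body that would be passed to `--data` in the curl call; keeping the two option groups as separate dictionaries mirrors the table above.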
### Parsing Options | Parameter | Description | Default Value | | ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- | | `chunking_strategy` | Choose between None, Page, Section, or Fragment. | `None` | | `table_output_mode` | Choose between Markdown, HTML. | `HTML` | | `ocr_model` | Choose between `model01`, `model02`, `model03`, and `gemini3` | `model03` | | `disable_layout_detection` | Boolean flag to skip layout detection and directly extract text. Useful for documents with many tables or images. | `false` | | `skew_detection` | Detect and correct skewed or rotated pages. Please note this can increase the processing time. | `false` | | `signature_detection` | Detect signatures in the document. Please note this can increase the processing time, and incurs additional costs. | `false` | | `remove_strikethrough_lines` | Remove strikethrough lines from the document. Please note this can increase the processing time, and incurs additional costs. | `false` | | `ignore_sections` | A set of document fragments to ignore during parsing. This can be useful for excluding irrelevant sections from the output. Potential values include: `section_header`, `title`, `text`, `table`, `figure`, `chart`, `formula`, `form`, `key_value_region`, `document_index`, `list_item`, `table_caption`, `figure_caption`, `formula_caption`, `page_footer`, `page_header`, `page_number`, `signature`, `strikethrough`, `tracked_changes`, `comments`, `barcode`. 
| `[]` | | `cross_page_header_detection` | A boolean flag to enable header hierarchy detection across pages. This can improve the accuracy of header extraction in multi-page documents. | `false` | | `barcode_detection` | A boolean flag to enable barcode detection and reading across pages. This is currently supported only with the `model03` OCR model. | `false` | | `merge_tables` | A boolean flag to enable the merging of adjacent tables that are part of the same logical table. | `false` | ## OCR Models Tensorlake has a few different OCR models, with different strengths and weaknesses. We recommend experimenting with the models on your documents and using the best model for your use case. 1. `model03` - Our best model in terms of accuracy for business documents. It has the ability to read and describe complex tables and figures. Supports large scale ingestion of documents. 2. `model01` - Fast but could have lower accuracy on complex tables. 3. `model02` - Slower but could have higher accuracy on complex tables. 4. `gemini3` - Uses Google's Gemini3 for OCR processing. A key difference between Model03 and Model01/02 is that Model01/02 provide bounding boxes of the table cells while Model03 does not. Gemini3 doesn't provide any bounding boxes. ## Retrieve Output The parsed document output can be retrieved using the [`/parse/{parse_id}`](/api-reference/v2/parse/get) endpoint, or using the `get_parsed_result` SDK function. ```python Python SDK theme={null} result = doc_ai.get_parsed_result(parse_id) ``` ```bash REST API theme={null} curl -X GET "https://api.tensorlake.ai/documents/v2/parse/parse_XXX" \ -H "Authorization: Bearer " \ -H "Content-Type: application/json" ``` ## Markdown Chunks Leveraging the markdown chunks is a common next step after parsing documents. ```python Python SDK theme={null} for chunk in result.chunks: print(f"## Page Number: {chunk.page_number}\n") print(f"## Content: {chunk.content}\n") ``` ```json JSON theme={null} { ... 
"chunks": [ { "content": "....", "page_number": 0 }, { "content": "....", "page_number": 1 }, ... ], ... } ``` See [Parse Output](/document-ingestion/parsing/parse-output) for more details about the output. ## Bounding Boxes Each page fragment includes bounding box coordinates that specify the exact location of the content on the page. This is useful for creating citations, highlighting source content in a UI, or debugging extraction quality. [Google Colab Notebook](https://tlake.link/notebooks/bounding-boxes) ### Accessing Bounding Boxes ```python theme={null} result = doc_ai.parse_and_wait(file_id) for page in result.pages: for fragment in page.page_fragments: bbox = fragment.bbox print(f"Fragment type: {fragment.fragment_type}") print(f"Top-left: ({bbox['x1']}, {bbox['y1']})") print(f"Bottom-right: ({bbox['x2']}, {bbox['y2']})") ``` ### Coordinate System Bounding boxes use the following coordinate system: * **x1, y1**: Top-left corner of the bounding box * **x2, y2**: Bottom-right corner of the bounding box * **Origin (0,0)**: Top-left corner of the page * **Units**: Pixels All fragment types include bounding box coordinates. ## Table and Figure Summarization Document Ingestion API can be used to summarize tables and figures in documents. | Parameter | Description | Default Value | | ----------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- | | `table_cell_grounding` | Grounding of table cells, providing the bounding box of the cells. This will create a list of cells with their reference id `ref_id`, the bounding box and the cell text. | `false` | | `table_summarization` | Enable summarization of tables present in the document. This will generate a summary of the table content, including key insights and trends. 
| `false` | | `figure_summarization` | Enable summarization of figures present in the document. This will generate a summary of the figure content, including key insights and trends. | `false` | | `table_summarization_prompt` | A custom prompt to use for table summarization. This can be used to provide additional context or instructions to the LLM. If not specified, the default prompt will be used. | - | | `figure_summarization_prompt` | A custom prompt to use for figure summarization. This can be used to provide additional context or instructions to the LLM. If not specified, the default prompt will be used. | - | | `include_full_page_image` | Include the full page image as additional context when summarizing tables and figures, which can improve accuracy by capturing surrounding headers, captions, and related content. | `false` | | `chart_extraction` | Extraction of chart type and structured data series from images, delivered as clean JSON suitable for analytics and ingestion. | `false` | | `key_value_extraction` | Extraction of key-value pairs from forms as clean JSON. | `false` | ### Tables Tables can be summarized by setting `table_summarization` to `true` in the `enrichment_options` JSON object when calling the `parse` API. [Google Colab Notebook](https://tlake.link/notebooks/table-summaries) ```python Python SDK theme={null} from tensorlake.documentai import DocumentAI from tensorlake.documentai.models.options import ( EnrichmentOptions, ) enrichment_options = EnrichmentOptions( table_summarization=True, table_summarization_prompt="Summarize the table in a concise manner.", ) doc_ai = DocumentAI(api_key=API_KEY) parse_id = doc_ai.read( file_id="file_XXX", # Replace with your file ID or URL enrichment_options=enrichment_options, ) ``` ```json REST API theme={null} { "enrichment_options": { "table_summarization": true, "table_summarization_prompt": "Summarize the table in a way that is easy to understand and use for answering questions." 
} } ``` ### Figures Figures can be summarized by setting `figure_summarization` to `true` in the `enrichment_options` JSON object when calling the `parse` API. [Google Colab Notebook](https://tlake.link/notebooks/figure-summaries) ```python Python SDK theme={null} from tensorlake.documentai import ( DocumentAI, EnrichmentOptions, ) doc_ai = DocumentAI(api_key=API_KEY) enrichment_options = EnrichmentOptions( figure_summarization=True, figure_summarization_prompt="Summarize the figure in a way that is easy to understand and use for answering questions.", ) parse_id = doc_ai.read( file_id="file_XXX", # Replace with your file ID or URL enrichment_options=enrichment_options, ) ``` ```json REST API theme={null} { "enrichment_options": { "figure_summarization": true, "figure_summarization_prompt": "Summarize the figure in a way that is easy to understand and use for answering questions." } } ``` ### Full Page Image Context When summarizing tables and figures, you can optionally include the full page image as additional context. This helps the model better understand the surrounding content, headers, footers, and relationships between elements on the page. [Google Colab Notebook](https://tlake.link/notebooks/full-page-summary) ```python theme={null} enrichment_options = EnrichmentOptions( table_summarization=True, figure_summarization=True, include_full_page_image=True ) result = doc_ai.parse_and_wait( file_id, enrichment_options=enrichment_options ) ``` ### Charts Structured information about charts can be extracted by setting `chart_extraction` to `true` in the `enrichment_options` JSON object when calling the `parse` API. 
```python Python SDK theme={null} from tensorlake.documentai import DocumentAI from tensorlake.documentai.models.options import ( EnrichmentOptions, ) enrichment_options = EnrichmentOptions( chart_extraction=True, ) doc_ai = DocumentAI(api_key=API_KEY) parse_id = doc_ai.read( file_id="file_XXX", # Replace with your file ID or URL enrichment_options=enrichment_options, ) ``` ```json REST API theme={null} { "enrichment_options": { "chart_extraction": true } } ``` ### Key/Value Pairs Extraction of key-value pairs from forms can be done by setting `key_value_extraction` to `true` in the `enrichment_options` JSON object when calling the `parse` API. ```python Python SDK theme={null} from tensorlake.documentai import DocumentAI from tensorlake.documentai.models.options import ( EnrichmentOptions, ) enrichment_options = EnrichmentOptions( key_value_extraction=True, ) doc_ai = DocumentAI(api_key=API_KEY) parse_id = doc_ai.read( file_id="file_XXX", # Replace with your file ID or URL enrichment_options=enrichment_options, ) ``` ```json REST API theme={null} { "enrichment_options": { "key_value_extraction": true } } ``` # Signature Detection Source: https://docs.tensorlake.ai/document-ingestion/parsing/signature Detect signatures in documents Document Ingestion API can be used to detect signatures and return their bounding boxes. Signature detection incurs additional costs, so please refer to the [pricing page](https://tensorlake.ai/pricing) for more details. ## Detecting Signatures Bounding boxes of signatures can be detected by setting `signature_detection` to `true` in the `parsing_options` JSON object when calling the `parse` API. 
```python Python SDK theme={null} from tensorlake.documentai import ( DocumentAI, ParsingOptions, ) doc_ai = DocumentAI(api_key="YOUR_API_KEY") parsing_options = ParsingOptions( signature_detection=True, ) parse_id = doc_ai.read( file_id="file_XXX", # Replace with your file ID or URL parsing_options=parsing_options, ) ``` ```bash curl theme={null} curl --request POST \ --url https://api.tensorlake.ai/documents/v2/parse \ --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \ --header 'Content-Type: application/json' \ --data '{ "file_id": "file_XXX", "parsing_options": { "signature_detection": true } }' ``` ## Response The bounding boxes of signatures are present in the Document object returned by the `parse` API. ```python Python SDK theme={null} parsed_result = doc_ai.wait_for_completion(parse_id=parse_id) # There is a signature on page 10 of this document signature_fragment = parsed_result.pages[10].page_fragments[0] # PageFragment(fragment_type=, content=Text(content='Signature detected'), reading_order=-1, page_number=None, bbox={'x1': 79.0, 'x2': 200.0, 'y1': 812.0, 'y2': 855.0}) ``` ```bash curl theme={null} curl --request GET \ --url https://api.tensorlake.ai/documents/v2/parse/parse_XXX \ --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \ --header 'Content-Type: application/json' # Response will contain the Document object with page fragments # that include the signature bounding boxes. { "page_fragments": [ { "bbox": { "x1": 97, "x2": 212, "y1": 621, "y2": 661 }, "content": { "content": "Signature detected" }, "fragment_type": "signature", "reading_order": -1 } ] } ``` # Structured Data Extraction Source: https://docs.tensorlake.ai/document-ingestion/parsing/structured-extraction Extract structured fields from documents using one or more JSON Schemas, with no field limits and multiple schemas in a single API call. Tensorlake can extract structured data from documents. 
This enables pulling out specific fields from documents. Some key features of structured extraction are: * No limits on the number of fields you can extract. * Extraction is guided by the JSON Schema you provide (or Pydantic models with the Python SDK). * You can submit multiple schemas in a single API call. Try this out using this [Colab Notebook](https://tlake.link/parse-bank-statements). ## Structured Extraction Request Structured Outputs from Documents can be generated by specifying one or more JSON Schemas in the `structured_extraction_options` parameter in the `parse` endpoint. ```python Python theme={null} from pydantic import BaseModel, Field from tensorlake.documentai import ( DocumentAI, StructuredExtractionOptions, ) doc_ai = DocumentAI(api_key="YOUR_API_KEY") file_id = "file_XXX" # Replace with your file ID or URL class DriverLicense(BaseModel): first_name: str = Field(description="Name next to FN") last_name: str = Field(description="Name next to LN") id: str = Field(description="ID number") address: str = Field(description="Address of the ID holder") dob: str = Field(description="Date of birth of the ID holder") driver_license_extraction = StructuredExtractionOptions( schema_name="DriverLicense", json_schema=DriverLicense ) parse_id = doc_ai.extract( file_id=file_id, structured_extraction_options=[driver_license_extraction] ) parsed_result = doc_ai.wait_for_completion(parse_id=parse_id) ``` ```bash curl theme={null} curl --request POST \ --url https://api.tensorlake.ai/documents/v2/parse \ --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \ --header 'Content-Type: application/json' \ --data '{ "file_url": "https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/california_id.jpg", "structured_extraction_options": [ { "schema_name": "DriverLicense", "json_schema": { "type": "object", "properties": { "first_name": { "type": "string", "description": "Name next to FN" }, "last_name": { "type": "string", "description": "Name next to LN" }, "id": { "type": 
"string", "description": "ID number" }, "address": { "type": "string", "description": "Address of the ID holder" }, "dob": { "type": "string", "description": "Date of birth of the ID holder" } } } } ] }' ``` ```javascript Node.js theme={null} async function parseFile(fileUrl, tensorlakeApiKey) { const driversSchema = { title: "DriverLicense", type: "object", properties: { first_name: { type: "string", description: "Name next to FN" }, last_name: { type: "string", description: "Name next to LN" }, id: { type: "string", description: "ID number" }, address: { type: "string", description: "Address of the ID holder" }, dob: { type: "string", description: "Date of birth of the ID holder" }, }, }; const driversExtractionOptions = { schema_name: "DriverLicense", json_schema: driversSchema, }; const body = { file_url: fileUrl, structured_extraction_options: [driversExtractionOptions], }; const options = { method: "POST", headers: { Authorization: `Bearer ${tensorlakeApiKey}`, "Content-Type": "application/json", }, body: JSON.stringify(body), }; const response = await fetch( "https://api.tensorlake.ai/documents/v2/parse", options ); const result = await response.json(); console.log("result:", result); return result.jobId; } const fileId = "https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/california_id.jpg"; const tensorlakeApiKey = "your-tensorlake-api-key"; const jobId = await parseFile(fileId, tensorlakeApiKey); ``` The `structured_extraction_options` parameter is an array of objects, where each object contains the schema name and the JSON Schema to use for structured extraction. ## Structured Extraction Response Structured Data extracted from the document is returned in the `structured_data` field of the [Get Parse Job](/api-reference/v2/parse/get) endpoint response. The `structured_data` field is an array of objects, where each object contains the extracted data, the page numbers from which the data was extracted, and the schema name used for extraction. 
For example: ```json JSON theme={null} { // ... other fields ... "structured_data": [ { "data": { "first_name": "John", "last_name": "Doe", "id": "D1234567", "address": "123 Main St, Springfield, IL 62701", "dob": "1990-01-01" }, "page_numbers": [1, 2, 3], "schema_name": "DriverLicense" } ] } ``` ## JSON Schema for Structured Extraction Both the Python SDK and HTTP API support JSON Schema for structured extraction. If you are using the Python SDK, you can pass in a Python dictionary, or a JSON schema encoded as a string. ```python Python theme={null} from tensorlake.documentai import StructuredExtractionOptions schema = { "type": "object", "properties": { "first_name": { "type": "string", "description": "Name next to FN" }, "last_name": { "type": "string", "description": "Name next to LN" }, } } # or schema = json.dumps(schema) driver_license_extraction = StructuredExtractionOptions( schema_name="DriverLicense", json_schema=schema ) parse_id = doc_ai.extract( file_id=file_id, structured_extraction_options=[driver_license_extraction] ) parsed_result = doc_ai.wait_for_completion(parse_id=parse_id) ``` The HTTP API accepts a JSON schema as well. Please make sure the schema is a valid JSON **object**, and not encoded as a JSON string. ## Pydantic Models for Structured Extraction Pydantic models are supported only in the Python SDK. In many cases, we transform the Pydantic model to make sure the resulting schema is compatible with our LLM. 
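Returning to the note above that the HTTP API expects the schema as a JSON object rather than a JSON-encoded string, the following sketch shows one way to normalize a schema before building a request body. `as_schema_object` and `extraction_option` are hypothetical helper names, not SDK functions, and only the `schema_name` and `json_schema` fields shown earlier are used.

```python
import json
from typing import Any, Dict, Union


def as_schema_object(schema: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return the schema as a dict, decoding it first if it is a JSON string."""
    if isinstance(schema, str):
        schema = json.loads(schema)
    if not isinstance(schema, dict):
        raise TypeError("JSON Schema must be a JSON object")
    return schema


def extraction_option(schema_name: str, schema: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Build one entry for the `structured_extraction_options` array."""
    return {"schema_name": schema_name, "json_schema": as_schema_object(schema)}


schema = {
    "type": "object",
    "properties": {
        "first_name": {"type": "string", "description": "Name next to FN"},
        "last_name": {"type": "string", "description": "Name next to LN"},
    },
}

# Both calls produce the same entry: the string form is decoded before use.
from_dict = extraction_option("DriverLicense", schema)
from_string = extraction_option("DriverLicense", json.dumps(schema))

# "file_XXX" is a placeholder file ID.
body = {"file_id": "file_XXX", "structured_extraction_options": [from_dict]}
print(json.dumps(body, indent=2))
```

Serializing `body` once at the end keeps the schema nested as an object in the final JSON, which is the shape the HTTP API asks for.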
## All Structured Extraction Options The Structured Extraction Options parameter is a list of objects, where each object contains: | Parameter | Description | Optional | Default Value | | -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | ------------- | | `schema_name` | The name of the schema to use for structured data extraction. This will be used as the key in the `structured_data` field of the response. | No | - | | `json_schema` | The JSON Schema to use for structured data extraction. This schema will define the structure of the data to be extracted from the document. It should be a valid JSON Schema object. The schema can be used to extract structured data from the document, such as tables, forms, or other structured content. | No | - | | `partition_strategy` | The strategy to use for partitioning the document for structured data extraction. This can be `none`, `page`, or `fragment`. If not specified, the default is `none`. This will determine how the document is partitioned for structured data extraction. For example, if `page` is specified, structured data will be extracted from every page of the document. If `fragment` is specified, structured data will be extracted from every fragment of the document. This is useful for documents with multiple sections or tables. | Yes | `none` | | `page_classes` | An array of page class names to limit the structured data extraction to specific page types. 
This is useful for documents where structured data is only present on certain pages, such as signature pages or form pages. If not specified, structured data will be extracted from all pages of the document. | Yes | - | | `skip_ocr` | A boolean flag to skip OCR processing for the structured data extraction. This is useful for documents that are already in a machine-readable format, such as PDFs with embedded text. If set to `true`, the API will not perform OCR on the document and will only extract structured data from the text present in the document. | Yes | `false` | | `prompt` | A custom prompt to use for structured data extraction. This can be used to provide additional context or instructions to the AI model for extracting structured data from the document. If not specified, the default prompt will be used. This is useful for documents with complex structures or specific extraction requirements. | Yes | - | | `model-provider` | Structured Extraction is performed by using an LLM. At the moment, the following models are supported: `tensorlake` - Proprietary model specifically trained for structured data extraction, `gpt_4o_mini` - OpenAI model for structured extraction, `sonnet` - Anthropic model for structured extraction, `gemini3` - Google Gemini-3 model for structured extraction. | Yes | `tensorlake` | ## Partitioning the Document You can extract structured data from the whole document at once, or from partitions of it, such as individual pages or sections. Each object in the `structured_extraction_options` parameter can specify how the document should be partitioned for structured data extraction. To do so, set the `partition_strategy` parameter of the structured extraction request object. This is not to be confused with the `chunking_strategy` parameter in `parsing_options`, which controls how the document is chunked for markdown generation. * `none` (*Default*) - Extract structured data from the whole document at once. 
* `page` - Extract structured data from every page of the document.
* `section` - Extract structured data from each section of the document.
* `PatternPartitionStrategy` - Extract structured data from within each block specified by the start and end pattern.

### Pattern-Based Partitioning

Pattern-based partitioning uses regex patterns to partition documents for structured extraction, regardless of page boundaries or document layout. This is ideal when target data is consistently marked by text patterns but appears in different locations across similar documents.

Instead of processing entire documents or fixed page ranges, you define start and end patterns that isolate extraction zones. For example, extract financial data between the "Property Summary" and "Total Property" sections, or contract clauses between the "Section 4.2" and "Section 4.3" headers.

```python Python theme={null}
from pydantic import BaseModel, Field
from tensorlake.documentai import DocumentAI, StructuredExtractionOptions

doc_ai = DocumentAI(api_key="YOUR_API_KEY")
file_id = "file_XXX"  # Replace with your file ID

class FinancialSummary(BaseModel):
    property_value: str = Field(description="Total property value")
    assessment_date: str = Field(description="Date of assessment")

extraction_options = StructuredExtractionOptions(
    schema_name="FinancialSummary",
    json_schema=FinancialSummary,
    partition_strategy={
        "strategy": "patterns",
        "patterns": {
            # Raw strings keep \b and \s as regex escapes, not Python string escapes
            "start_patterns": [r"\bProperty\s+Summary\b"],
            "end_patterns": [r"\bTotal\s+Property\b"]
        }
    }
)

parse_id = doc_ai.extract(
    file_id=file_id,
    structured_extraction_options=[extraction_options]
)
```

```bash curl theme={null}
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/parse \
  --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "file_url": "https://example.com/financial-report.pdf",
    "structured_extraction_options": [
      {
        "schema_name": "FinancialSummary",
        "partition_strategy": {
          "strategy": "patterns",
          "patterns": {
            "start_patterns": ["\\bProperty\\s+Summary\\b"],
            "end_patterns": ["\\bTotal\\s+Property\\b"]
          }
        },
        "json_schema": {
          "type": "object",
          "properties": {
            "property_value": { "type": "string", "description": "Total property value" },
            "assessment_date": { "type": "string", "description": "Date of assessment" }
          }
        }
      }
    ]
  }'
```

Pattern Configuration:

* `start_patterns`: Array of regex patterns that mark the beginning of extraction zones
* `end_patterns`: Array of regex patterns that mark the end of extraction zones
* Use both to extract data between markers, or use only `start_patterns` to extract from marker to document end
* Patterns are case-sensitive regular expressions; use `\b` for word boundaries and `\s+` for flexible whitespace matching (written as `\\b` and `\\s+` inside JSON strings)

This approach eliminates brittle page-based extraction and focuses on content structure, making your extraction pipeline resilient to document layout variations.

## Field‑Level Citations

You can ask Tensorlake to return per‑field citations for structured outputs. When enabled, each extracted field includes citation data pointing to where that value came from (page numbers and, when available, bounding boxes you can use to highlight the source region in your UI).

[Google Colab Notebook](https://tlake.link/notebooks/citations)

Enable this by setting `provide_citations=True` on each `StructuredExtractionOptions` object (`"provide_citations": true` in the JSON request).

Citations add a small latency and payload overhead. We recommend enabling them for review flows, compliance use cases, and any UI where you highlight “where this came from.”

### Working with Citations

* **Render highlights**: Use `page_number` + bbox to draw overlays in a document viewer so reviewers can verify values quickly.
* **Store provenance**: Persist citation anchors alongside your extracted JSON so downstream systems can trace and audit how a value was produced.
* **Disambiguate fields**: If a field appears multiple times (e.g., “Total” on multiple pages), citations help confirm which instance was used.
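As a minimal sketch of the "render highlights" and "store provenance" ideas above, the helper below pairs each extracted field with its citation boxes. It assumes the response shape shown in the example output later in this section, where a field `foo` is accompanied by a `foo_citation` list of `page_number`/`x1`/`x2`/`y1`/`y2` entries. `collect_citations` is a hypothetical helper name, not part of the Tensorlake SDK.

```python
from typing import Any, Dict, List

CITATION_SUFFIX = "_citation"


def collect_citations(data: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Pair each extracted field with the bounding boxes it was cited from.

    Assumption: citations sit next to each field under "<field>_citation".
    """
    paired: Dict[str, Dict[str, Any]] = {}
    for key, value in data.items():
        if key.endswith(CITATION_SUFFIX):
            continue  # citation lists are attached to their field below
        boxes: List[Dict[str, int]] = data.get(key + CITATION_SUFFIX, [])
        paired[key] = {"value": value, "boxes": boxes}
    return paired


# Sample trimmed from the driver's license example output in this section.
sample = {
    "dob": "01/07/1973",
    "dob_citation": [
        {"page_number": 1, "x1": 337, "x2": 714, "y1": 144, "y2": 371}
    ],
    "first_name": "ANDREW",  # no citation present: boxes will be empty
}

for field, info in collect_citations(sample).items():
    pages = sorted({box["page_number"] for box in info["boxes"]})
    print(f"{field}: {info['value']!r} cited on pages {pages}")
```

Keeping the value and its boxes together in one record makes it straightforward to persist provenance with the extracted JSON or to hand the boxes to a viewer overlay.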
```python Python theme={null}
from pydantic import BaseModel, Field
from tensorlake.documentai import DocumentAI, StructuredExtractionOptions

doc_ai = DocumentAI(api_key="YOUR_API_KEY")

file_id = "https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/dl_pen.jpeg"

class DriverLicense(BaseModel):
    first_name: str = Field(description="Name next to FN")
    last_name: str = Field(description="Name next to LN")
    id: str = Field(description="ID number")
    address: str = Field(description="Address of the ID holder")
    dob: str = Field(description="Date of birth of the ID holder")

driver_license_extraction = StructuredExtractionOptions(
    schema_name="DriverLicense",
    json_schema=DriverLicense,
    provide_citations=True,
)

parse_id = doc_ai.extract(
    file_id=file_id,
    structured_extraction_options=[driver_license_extraction]
)

result = doc_ai.wait_for_completion(parse_id=parse_id)
```

```bash curl theme={null}
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/parse \
  --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "file_url": "https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/dl_pen.jpeg",
    "structured_extraction_options": [
      {
        "schema_name": "DriverLicense",
        "provide_citations": true,
        "json_schema": {
          "type": "object",
          "properties": {
            "first_name": { "type": "string", "description": "Name next to FN" },
            "last_name": { "type": "string", "description": "Name next to LN" },
            "id": { "type": "string", "description": "ID number" },
            "address": { "type": "string", "description": "Address of the ID holder" },
            "dob": { "type": "string", "description": "Date of birth of the ID holder" }
          }
        }
      }
    ]
  }'
```

Running this on [this driver's license](https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/dl_pen.jpeg), for example, would yield these results:

```json JSON theme={null}
[
  {
    "data": {
      "address": "123 MAIN STREET APT. 1 HARRISBURG, PA 17101-0000",
      "address_citation": [
        { "page_number": 1, "x1": 337, "x2": 714, "y1": 144, "y2": 371 }
      ],
      "dob": "01/07/1973",
      "dob_citation": [
        { "page_number": 1, "x1": 337, "x2": 714, "y1": 144, "y2": 371 }
      ],
      "first_name": "ANDREW",
      "first_name_citation": [
        { "page_number": 1, "x1": 337, "x2": 714, "y1": 144, "y2": 371 }
      ],
      "id": "99 999 999",
      "id_citation": [
        { "page_number": 1, "x1": 337, "x2": 714, "y1": 144, "y2": 371 }
      ],
      "last_name": "SAMPLE",
      "last_name_citation": [
        { "page_number": 1, "x1": 337, "x2": 714, "y1": 144, "y2": 371 }
      ]
    },
    "page_numbers": [
      1
    ],
    "schema_name": "summary"
  }
]
```

## Filtering By Page Classes

You can specify a subset of pages to extract structured data from by using the `page_classes` parameter in each structured data extraction request object.

The top-level `page_range` will limit all parsing, classification, and data extraction capabilities to only those pages.

```bash curl theme={null}
# Replace file_XXX with your file ID
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/parse \
  --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "page_range": "1-3",
    "file_id": "file_XXX",
    "page_classifications": [
      {
        "name": "front_of_dl",
        "description": "Pages that have a photo of a person."
      },
      {
        "name": "back_of_dl",
        "description": "Pages that have a barcode."
      }
    ],
    "structured_extraction_options": [
      {
        "schema_name": "DriverLicense",
        "page_classes": [
          "front_of_dl"
        ],
        "json_schema": {
          "title": "DriverLicense",
          "type": "object",
          "properties": {
            "name": { "type": "string", "description": "Name of the ID holder" },
            "age": { "type": "integer", "description": "Age of the ID holder" },
            "address": { "type": "string", "description": "Address of the ID holder" },
            "dob": { "type": "string", "description": "Date of birth of the ID holder" }
          }
        }
      }
    ]
  }'
```

```python Python theme={null}
from pydantic import BaseModel, Field
from tensorlake.documentai import (
    DocumentAI,
    PageClassConfig,
    StructuredExtractionOptions,
)

doc_ai = DocumentAI(api_key="YOUR_API_KEY")

file_id = "tensorlake-XXX"  # Replace with your file ID or URL

page_classifications = [
    PageClassConfig(
        name="front_of_dl",
        description="Pages that have a photo of a person."
    ),
    PageClassConfig(
        name="back_of_dl",
        description="Pages that have a barcode."
    ),
]

class DriverLicense(BaseModel):
    first_name: str = Field(description="Name next to FN")
    last_name: str = Field(description="Name next to LN")
    id: str = Field(description="ID number")
    address: str = Field(description="Address of the ID holder")
    dob: str = Field(description="Date of birth of the ID holder")

# Only pages classified as front_of_dl are used for this schema
driver_license_extraction = StructuredExtractionOptions(
    schema_name="DriverLicense",
    json_schema=DriverLicense,
    page_classes=["front_of_dl"]
)

parse_id = doc_ai.parse(
    file=file_id,
    page_range="1-3,9-10",
    page_classifications=page_classifications,
    structured_extraction_options=[driver_license_extraction])
```

## Advanced: Reusing OCR Output for Structured Extraction

If you're iterating on an extraction schema, you don't need to re-run OCR every time. The `read` and `extract` APIs are independent steps: `extract` operates on text, not the original PDF. By uploading your Markdown output as a file once, you get a `file_id` you can pass to any number of `extract` calls.
```python Python theme={null}
from tensorlake.documentai import (
    DocumentAI,
    ParsingOptions,
    StructuredExtractionOptions,
)
from pydantic import BaseModel, Field

doc_ai = DocumentAI(api_key="YOUR_API_KEY")

# Step 1: Run OCR once
parse_id = doc_ai.read(file_id="file_XXX")
result = doc_ai.wait_for_completion(parse_id=parse_id)

# Step 2: Save the Markdown output and upload it
markdown = "\n\n".join(chunk.content for chunk in result.chunks)
with open("output.md", "w") as f:
    f.write(markdown)

markdown_file_id = doc_ai.upload(path="output.md")

# Step 3: Iterate on your schema (no OCR cost on subsequent runs)
class Invoice(BaseModel):
    vendor_name: str = Field(description="Name of the vendor")
    total_amount: str = Field(description="Total amount due")

extract_id = doc_ai.extract(
    file_id=markdown_file_id,
    structured_extraction_options=[
        StructuredExtractionOptions(
            schema_name="Invoice",
            json_schema=Invoice,
        )
    ],
)
extraction_result = doc_ai.wait_for_completion(parse_id=extract_id)
```

```bash curl theme={null}
# Step 1: Run OCR once and save the markdown from the response

# Step 2: Upload the Markdown output as a reusable file
curl -X POST https://api.tensorlake.ai/documents/v2/files \
  -H "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
  -F "file=@output.md"
# Returns: { "file_id": "file_XXX" }

# Step 3: Run extraction against the Markdown file
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/extract \
  --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "file_id": "file_XXX",
    "structured_extraction_options": [
      {
        "schema_name": "Invoice",
        "json_schema": {
          "type": "object",
          "properties": {
            "vendor_name": { "type": "string", "description": "Name of the vendor" },
            "total_amount": { "type": "string", "description": "Total amount due" }
          }
        }
      }
    ]
  }'
```

When extracting from a Markdown file, `page`-based partitioning is not available since page boundaries are not preserved in the text.
You can still use the default `none` strategy (whole document) or [pattern-based partitioning](#pattern-based-partitioning).

## Tips

#### Skip OCR

Sometimes document parsing doesn't work well on certain documents, which can lead to poor structured data extraction. If you only care about structured data extraction, we recommend skipping the OCR step by setting `skip_ocr` to `true`. Extraction then uses a Vision Language Model trained to extract JSON from document images. Try this if you are seeing poor accuracy in structured data extraction.

#### Describe the Fields

Adding descriptions to the fields in the schema always improves the accuracy of the structured data extraction. Help the model understand the context of the fields you are extracting, and if possible mention what text or visual cues to look for in the document for each field.

#### Don't compute new data in the schema

We don't recommend making the LLM derive new information while performing structured extraction. For example, if you ask the model to sum up all the rows in a table and return the total in a new field, the model will likely hallucinate. We recommend doing this in your application code in a downstream task.

# Summarization

Source: https://docs.tensorlake.ai/document-ingestion/parsing/summarization

Summarize Tables, Figures and Charts in Documents

The Document Ingestion API can be used to summarize tables, figures and charts in documents.

| Parameter                     | Description                                                                                                                                                                                                                 | Default Value |
| ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- |
| `table_summarization`         | Enable summarization of tables present in the document. This will generate a summary of the table content, including key insights and trends.                                                                               | `false`       |
| `figure_summarization`        | Enable summarization of figures present in the document. This will generate a summary of the figure content, including key insights and trends.                                                                             | `false`       |
| `table_summarization_prompt`  | A custom prompt to use for table summarization. This can be used to provide additional context or instructions to the AI model for summarizing tables in the document. If not specified, the default prompt will be used.   | -             |
| `figure_summarization_prompt` | A custom prompt to use for figure summarization. This can be used to provide additional context or instructions to the AI model for summarizing figures in the document. If not specified, the default prompt will be used. | -             |
| `chart_extraction`            | Extraction of chart type and structured data series from images, delivered as clean JSON suitable for analytics and ingestion.                                                                                              | `false`       |

#### Why would you want to summarize tables, figures and charts?

* Even though LLMs have long context, embedding models often don't. In such cases, summarizing tables, embedding the summaries, and storing each table's image alongside its summary can help retrieve the right table or figure when the LLM needs it to answer questions.
* Figures often encode complex information which can't be converted to Markdown or HTML. Summarizing and indexing them can help retrieve the right figure when relevant questions are asked.

## Summarizing Tables

Tables can be summarized by setting `table_summarization` to `true` in the `enrichment_options` JSON object when calling the `parse` API.

```json JSON Request theme={null}
{
  "enrichment_options": {
    "table_summarization": true,
    "table_summarization_prompt": "Summarize the table in a way that is easy to understand and use for answering questions."
  }
}
```

```python Python SDK theme={null}
from tensorlake.documentai import DocumentAI
from tensorlake.documentai.models.options import (
    EnrichmentOptions,
)

enrichment_options = EnrichmentOptions(
    table_summarization=True,
    table_summarization_prompt="Summarize the table in a concise manner.",
)

doc_ai = DocumentAI(api_key="YOUR_API_KEY")

parse_id = doc_ai.read(
    file_id="file_XXX",  # Replace with your file ID or URL
    enrichment_options=enrichment_options,
)
```

The table summary prompt is optional. If not provided, a default prompt will be used.

## Summarizing Figures and Charts

Figures can be summarized by setting `figure_summarization` to `true` in the `enrichment_options` JSON object when calling the `parse` API.

```json JSON Request theme={null}
{
  "enrichment_options": {
    "figure_summarization": true,
    "figure_summarization_prompt": "Summarize the figure in a way that is easy to understand and use for answering questions."
  }
}
```

```python Python SDK theme={null}
from tensorlake.documentai import (
    DocumentAI,
    EnrichmentOptions,
)

doc_ai = DocumentAI(api_key="YOUR_API_KEY")

enrichment_options = EnrichmentOptions(
    figure_summarization=True,
    figure_summarization_prompt="Summarize the figure in a way that is easy to understand and use for answering questions.",
)

parse_id = doc_ai.read(
    file_id="file_XXX",  # Replace with your file ID or URL
    enrichment_options=enrichment_options,
)
```

The figure summary prompt is optional. If not provided, a default prompt will be used.

### Charts

Structured information about charts can be extracted by setting `chart_extraction` to `true` in the `enrichment_options` JSON object when calling the `parse` API.
```python Python SDK theme={null}
from tensorlake.documentai import DocumentAI
from tensorlake.documentai.models.options import (
    EnrichmentOptions,
)

enrichment_options = EnrichmentOptions(
    chart_extraction=True,
)

doc_ai = DocumentAI(api_key="YOUR_API_KEY")

parse_id = doc_ai.read(
    file_id="file_XXX",  # Replace with your file ID or URL
    enrichment_options=enrichment_options,
)
```

```json REST API theme={null}
{
  "enrichment_options": {
    "chart_extraction": true
  }
}
```

# Table Merging

Source: https://docs.tensorlake.ai/document-ingestion/parsing/table-merging

Automatically merge table fragments that span multiple pages or columns into a single unified table for LLM-ready output.

## Overview

PDFs are designed for printing, not data extraction. When a logical table spans multiple pages or is split across columns on a single page, most parsers output disconnected fragments, breaking the semantic integrity of the data and making it difficult for downstream LLMs and RAG pipelines to reason over.

Tensorlake's Agentic Table Merging reconstructs these fragments into a single coherent table by reasoning over content and context, not just geometry. Enable it with `table_merging=True` in your `ParsingOptions`.
## Enabling Table Merging

Set `table_merging=True` in your `ParsingOptions`:

```python Python SDK theme={null}
from tensorlake.documentai import DocumentAI
from tensorlake.documentai.models import ParsingOptions

doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_CLOUD_API_KEY")

file_id = doc_ai.upload(path="document.pdf")

parsing_options = ParsingOptions(
    table_merging=True,
)

parse_id = doc_ai.read(
    file_id=file_id,
    parsing_options=parsing_options,
)

result = doc_ai.wait_for_completion(parse_id)
```

```bash curl theme={null}
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/parse \
  --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
  --header 'Content-Type: application/json' \
  --data '{
    "file_id": "file_XXX",
    "parsing_options": {
      "table_merging": true
    }
  }'
```

## How It Works

Rather than relying on geometric position alone, an agent analyzes the content and context around each table fragment to decide whether it is a continuation of the previous one. For each candidate pair, the agent examines:

* The end of the previous table fragment
* The text in the gap between them (e.g. `"Page 14 of 92"`, `"(continued)"`, boilerplate disclaimers)
* The start of the next table fragment
* Whether column structures are compatible (same number of columns, matching or repeated headers)

This allows the agent to ignore irrelevant footer noise while correctly identifying continuation cues.

Two merge scenarios are handled:

* **Cross-page merges**: tables that continue across one or more page breaks, often with repeated or noisy headers and footers
* **Same-page merges**: tables split into multiple columns on a single page (e.g. an alphabetical list split left/right) that logically belong together

## Output

When table merging is enabled, the parse result includes a `merged_tables` array.
Each entry in the array represents a reconstructed table:

| Field               | Description                                                          |
| ------------------- | -------------------------------------------------------------------- |
| `merged_table_id`   | Unique identifier for the merged table (e.g. `cross_page_merge_1_3`) |
| `merged_table_html` | Full HTML representation of the unified table                        |
| `start_page`        | Page number where the first fragment was found                       |
| `end_page`          | Page number where the last fragment was found                        |
| `pages_merged`      | Number of pages spanned by the merged table                          |
| `summary`           | Human-readable summary of the merged table's content                 |
| `merge_actions`     | Details on the pages involved and target column count                |
| `merged_at`         | ISO 8601 timestamp of when the merge was performed                   |

### Example: cross-page merge

A financial table spanning three pages is merged into a single entry:

```json theme={null}
{
  "merged_table_id": "cross_page_merge_1_3",
  "merged_table_html": "...",
  "start_page": 1,
  "end_page": 3,
  "pages_merged": 3,
  "summary": "Financial results for the quarter and nine months ended September 30, 2025...",
  "merge_actions": {
    "pages": [1, 2, 3],
    "target_columns": 10
  },
  "merged_at": "2026-01-10T03:12:10.785866+00:00"
}
```

### Example: same-page column merge

A holdings table split into two columns on one page is unified into a single continuous structure:

```json theme={null}
{
  "merged_table_id": "same_page_merge_2_3",
  "merged_table_html": "...",
  "start_page": 2,
  "end_page": 2,
  "pages_merged": 1,
  "summary": "Both tables share the same column structure (Security, Shares, Value) and represent a continuous alphabetical list of stock holdings...",
  "merge_actions": {
    "pages": [2],
    "target_columns": null
  }
}
```

## Common Use Cases

* **Financial documents**: reconstruct multi-page income statements, balance sheets, and loan tables for accurate numeric reasoning
* **Research papers**: unify results tables that span pages so LLMs can compare rows and compute aggregates
* **Portfolio and fund reports**: merge holdings tables split across columns for reliable sector aggregation and exposure calculations
* **RAG pipelines**: produce coherent table chunks that improve retrieval quality and reduce hallucinations on questions that depend on full table context

## Related

* [Parsing Overview](/document-ingestion/parsing/read)
* [Parse Output](/document-ingestion/parsing/parse-output)
* [Cross-page Header Correction](/document-ingestion/parsing/header-correction)

# Document Parsing Benchmarks

Source: https://docs.tensorlake.ai/document-ingestion/production/benchmarks

How Tensorlake benchmarks document parsing accuracy against leading solutions: structural preservation (TEDS) and downstream usability of extracted data.

Tensorlake's Document AI delivers industry-leading accuracy on document parsing. We measure what matters: **structural preservation** of document layout and **downstream usability** of extracted data from documents, which is what often breaks in production.

This page presents our comprehensive benchmarking methodology and results comparing Tensorlake against leading document parsing solutions.

## Our Evaluation Framework

### Two-Stage Methodology

We mirror real-world workflows with a two-stage evaluation process:

**Stage 1: Document Reading Abilities (OCR and Structural Preservation)**

* Models generate Markdown/HTML output
* Evaluated using **TEDS (Tree Edit Distance Similarity)**.
* Compares predicted vs. ground-truth Markdown, capturing structural fidelity in tables and complex layouts
* **Answers: "Is this table still a table?"** Not just "Is the text similar?"

**Stage 2: Structured JSON Extraction (Downstream Usability)**

* Markdown is passed through a standardized LLM (GPT-4o) with predefined schemas
* Evaluated using **JSON F1 (Field-Level Precision and Recall)**
* Isolates how OCR quality impacts real extraction workflows
* Precision measures the correctness of extracted fields; recall measures how completely required fields are captured.
* The F1 score combines both metrics (as their harmonic mean) for a holistic view.
* **Answers: "Can automation use this data?"** Not just "Is text present?"

This methodology ensures fair comparisons by varying only the OCR models while keeping extraction constant.

## Document Reading Benchmark Results

### Datasets

* OCRBench v2: 400 diverse document images (invoices, contracts, forms), measuring overall structural and text accuracy. The data was audited to ensure consistency in ground truth.
* OmniDocBench: 512 document images with complex tables, focusing on table parsing capabilities. We use the v1.5 evaluation code from the official repository.

**Key Finding:** Tensorlake achieves the highest TEDS score, indicating superior structural preservation while maintaining competitive text accuracy. The gap between open-source and production-grade systems is substantial.

### Table Parsing

Evaluated on 512 document images with tables from OmniDocBench (CVPR-accepted benchmark):

¹ Marker's number is from the officially published OmniDocBench repository.

**Key Finding:** On complex, multi-page tables, Tensorlake leads with 86.79% TEDS. Open-source solutions struggle to preserve table structure (sub-70% TEDS).

## Structured Extraction Benchmark Results

#### Datasets used:

* We collected 100 document pages of proprietary data spanning banking, retail, and insurance sectors. This represents actual production workloads: invoices with water damage, scanned contracts with skewed text, bank statements with multi-level tables.
* Ground truth schemas were generated using Gemini Pro 2.5 and audited by human reviewers to ensure accuracy.

**Key Findings:**

* Tensorlake achieves 91.7% F1, demonstrating that superior OCR quality feeds better extraction
* The gap between 91.7% and 68.9% F1 is **massive**: it’s roughly **5 extra** fields correctly extracted out of every 20
* In production pipelines processing thousands of documents daily, this accuracy gap compounds into significant error reduction

### Production Impact Example

For an insurance claims processor handling 10,000 documents per month:

* **At 85% F1:** 1,500 documents require manual review
* **At 90% F1:** 1,000 documents require manual review
* **At 91.7% F1 (Tensorlake):** 830 documents require manual review

**Result:** Tensorlake cuts monthly manual reviews from 1,500 to 830 (roughly a **45% reduction** vs the 85% baseline).

## Cost & Performance Comparison

Accuracy without affordability isn't practical. Here's the complete picture:

| **Provider**                | **Cost/1,000 Pages** | **TEDS**  | **JSON F1** |
| --------------------------- | -------------------- | --------- | ----------- |
| Docling (open-source)       | Free\*               | 63.3%     | 68.9%       |
| Marker (open-source)        | Free\*               | 71.1%     | 71.2%       |
| Azure Document Intelligence | \$10                 | 78.6%     | 88.1%       |
| AWS Textract                | \$15                 | 81.0%     | 88.4%       |
| **Tensorlake**              | **\$10**             | **84.1%** | **91.7%**   |

\*Free but requires self-hosting infrastructure

## Visual Comparison: Where Competitors Fail

### Example: Contact Information Extraction

When parsing Section 21 (NOTICES) of a real estate contract:

* **Azure:** Missing opening parenthesis in phone number. Two-column layout collapsed into a confusing single column.
* **AWS Textract:** Completely wrong phone number in buyer field (shows seller’s phone). Buyer’s phone (123)456-7890 entirely missing.
* **Tensorlake:** Perfect extraction of both phone numbers: (123)456-7890 and (456)789-1234. Two-column structure preserved with clear buyer/seller separation. All contact fields accurately captured.

In legal documents, phone numbers are critical contact information. Errors like these cause compliance issues and workflow failures.

## Reproducibility

To reproduce our table results:

1. Generate Markdown outputs using the models listed above
2. Run the evaluation from the [OmniDocBench repository](https://github.com/opendatalab/OmniDocBench)
3. Use the document data with tables (512 images) with the v1.5 code version

### Deep Dive: Full Benchmark Analysis

Read our comprehensive blog post: [The Document Parsing Benchmark That Actually Matters](https://tensorlake.ai/blog/benchmarks)

The blog includes:

* Detailed failure mode analysis
* Additional benchmark datasets
* Technical methodology deep-dive
* Production deployment case studies
* Code examples and reproducibility guides

*Benchmarks conducted using OCRBench v2 (400 images), OmniDocBench (512 table images), and proprietary enterprise dataset (100 pages) in October 2024. All results are reproducible using public datasets and standardized evaluation frameworks.*

# Integration Guide

Source: https://docs.tensorlake.ai/document-ingestion/production/integration

End-to-end guide to integrating Tensorlake Document Ingestion with your existing workflows via the HTTP APIs: upload, parse, and retrieve results.

You can integrate Tensorlake Document Ingestion with your existing workflows by using the HTTP APIs.

The [parse](/api-reference/v2/parse/parse) endpoint will create a parse job with the following request payload:

* A file source, which can be:
  * A `file_id` returned from [uploading a file to Tensorlake Cloud](/api-reference/v2/files/upload).
  * A `file_url` that points to a publicly accessible file.
* Options for parsing. See the [parse settings](/document-ingestion/parsing/read#explore-main-configuration-options).
* `page_range`: The range of pages to parse, e.g. `1-2` or `1,3,5`. By default, all pages will be parsed.
* `labels`: Metadata to identify the parse request. The labels are returned along with the parse response.

The endpoint will return:

* `parse_id`: The unique ID Tensorlake uses to reference the specific parsing job. This ID can be used to get the output when the parsing job is completed and to revisit previously used settings.

The [`/parse/{parse_id}`](/api-reference/v2/parse/get) endpoint will return:

* `status`: The status of the parsing job. This can be `failure`, `pending`, `processing`, or `successful`.
* If the parsing job is `pending` or `processing`, you should wait a few seconds and then check again by re-calling the endpoint.

When the parsing job is `successful`, you can retrieve the parsed result by calling the [`/parse/{parse_id}`](/api-reference/v2/parse/get) endpoint. The response payload will include a [`Response` object](/api-reference/v2/parse/get):

* `chunks`: An array of objects that contain a chunk number (specified by the chunk strategy) and the markdown content for that chunk.
* `pages`: A JSON representation of each page’s visual structure, including page dimensions, bounding boxes for each element (text, tables, figures, signatures), and the reading order.
* `labels`: Labels associated with the parse job.

The complete upload, parse, and get results flow

The APIs to support this workflow are:

* File Management endpoints to upload, list, and delete files.
* Parse endpoints to parse uploaded Documents or any remote file.

## Webhooks

You can also use the [Webhooks API](/webhooks/overview) to receive notifications when a parse job is completed. This is an alternative to polling for parse responses.

# Document Ingestion Quickstart

Source: https://docs.tensorlake.ai/document-ingestion/quickstart

Parse your first document with the Tensorlake Document Ingestion API and inspect the structured output.
The most basic use-cases of Document Ingestion API are: * Convert the Document to Markdown for feeding into an LLM. * Extract structured data from the document specified by a [JSON schema](https://json-schema.org/overview/what-is-jsonschema). You will learn how to convert a [rental agreement](https://tlake.link/docs/real-estate-agreement) document to markdown chunks, and extract structured data from the document specified by a schema. [Google Colab notebook](https://colab.research.google.com/drive/1LjD9euQOXMHRsNTOczZlvvO-isaszFUE?usp=sharing) ### Prerequisites * A Tensorlake [API key](/platform/authentication#api-keys) * \[Optional] Tensorlake SDK for Python ## Convert to Markdown ```bash theme={null} pip install tensorlake ``` Export the variable and the SDK will reference your environment variables, looking for `TENSORLAKE_API_KEY`: ```bash theme={null} export TENSORLAKE_API_KEY=your-api-key-here ``` ```python quickstart.py theme={null} from tensorlake.documentai import ( DocumentAI, ParsingOptions, ChunkingStrategy, ) doc_ai = DocumentAI() # Use a publicly accessible URL or upload a file to Tensorlake and use the file ID. file_url = "https://tlake.link/docs/real-estate-agreement" # In this example, we are using the PAGE chunking strategy, which means that each page of the document will be a separate chunk. 
parsing_options = ParsingOptions( chunking_strategy=ChunkingStrategy.PAGE, ) # Submit the parse operation and wait for the job to complete parse_id = doc_ai.read( file_url=file_url, page_range="1-3", parsing_options=parsing_options, ) ``` ```python quickstart.py theme={null} result = doc_ai.result(parse_id) ``` ```python quickstart.py theme={null} for chunk in result.chunks: print(f"## Page {chunk.page_number}\n\n") print(f"{chunk.content}\n\n") ``` ```javascript parseFileUrl.js theme={null} async function parseFileUrl(fileUrl, tensorlakeApiKey) { const parsingOptions = { chunking_strategy: "page", }; const body = { file_url: fileUrl, page_range: "1-3", parsing_options: parsingOptions, }; const options = { method: 'POST', headers: { Authorization: `Bearer ${tensorlakeApiKey}`, 'Content-Type': 'application/json', }, body: JSON.stringify(body), }; const response = await fetch( 'https://api.tensorlake.ai/documents/v2/read', options ); const result = await response.json(); console.log('result:', JSON.stringify(result, null, 2)); return result.parse_id; } const fileUrl = 'https://tlake.link/docs/real-estate-agreement'; const tensorlakeApiKey = 'your-tensorlake-api-key-here'; const parseId = await parseFileUrl(fileUrl, tensorlakeApiKey); ``` ```javascript getResults.js theme={null} function writeParseResults(jobResult) { let markdownContent = ''; jobResult.chunks.forEach((chunk) => { markdownContent += `## PAGE NUMBER ${chunk.page_number}\n\n`; markdownContent += `${chunk.content}\n\n`; }); console.log(markdownContent); } async function getParseResults(parseId, tensorlakeApiKey) { while (true) { const response = await fetch( `https://api.tensorlake.ai/documents/v2/parse/${parseId}`, { method: 'GET', headers: { Authorization: `Bearer ${tensorlakeApiKey}`, 'Content-Type': 'application/json', }, } ); if (!response.ok) { console.error(`Error fetching job: ${response.statusText}`); return; } const result = await response.json(); if (result.status === 'pending' || 
result.status === 'processing') { console.log('waiting 5s...'); await new Promise((resolve) => setTimeout(resolve, 5000)); console.log(`job status: ${result.status}`); } else { if (result.status === 'successful') { console.log(result); writeParseResults(result); return result; } else { console.error(`Job finished with status: ${result.status}`); return result; } } } } const parseId = 'your-parse-id-here'; const tensorlakeApiKey = 'your-tensorlake-api-key-here'; await getParseResults(parseId, tensorlakeApiKey); ``` ### Output When the parsing is complete, you will see - ```md Markdown Chunks expandable theme={null} ## Page 9 relationships in accordance with any agreement(s) made with licensed real estate agent(s). Seller has read and acknowledges receipt of a copy of this Agreement and authorizes any licensed real estate agent(s) to deliver a signed copy to the Buyer. Delivery may be in any of the following: (i) hand delivery; (ii) email under the condition that the Party transmitting the email receives electronic confirmation that the email was received to the intended recipient; and (iii) by facsimile to the other Party or the other Party’s licensee, but only if the transmitting fax machine prints a confirmation that the transmission was successful. XXX. LICENSED REAL ESTATE AGENT(S). If Buyer or Seller have hired the services of licensed real estate agent(s) to perform representation on their behalf, he/she/they shall be entitled to payment for their services as outlined in their separate written agreement. XXXI. DISCLOSURES. It is acknowledged by the Parties that: (check one) - There are no attached addendums or disclosures to this Agreement. - The following addendums or disclosures are attached to this Agreement: (check all that apply) - Lead-Based Paint Disclosure Form [ ] - [ ] - [ ] - [ ] - - - XXXII. ADDITIONAL TERMS AND CONDITIONS. None XXXIII. ENTIRE AGREEMENT. 
This Agreement together with any attached addendums or disclosures shall supersede any and all other prior understandings and agreements, either oral or in writing, between the Parties with respect to the subject matter hereof and shall constitute the sole and only agreements between the Parties with respect to the said Property. All prior negotiations and agreements between the Parties with respect to the Property hereof are merged into this Agreement. Each Party to this Agreement acknowledges that no representations, inducements, promises, or agreements, orally or otherwise, have been made by any Party or by anyone acting on behalf of any Party, which are not embodied in this Agreement and that any agreement, statement or promise that is not contained in this Agreement shall not be valid or binding or of any force or effect.

e Buyer's Initials NE - Seller's Initials JV.

Page 9 of 10

## CHUNK NUMBER 1

## Page 10

XXXIV. EXECUTION.

| | |
|----------------------------------------------------------------------------------|--------------------|
| Buyer Signature: Nova Ellison Date: Print Name: Nova Ellison | September 10, 2025 |
| Buyer Signature: Date: Print Name: | |
| Seller Signature: Juno Vegi Date: Print Name: J uno Vega | September 10, 2025 |
| Seller Signature: Date: Print Name: | |
| Agent Signature: Aster Polaris Date: Print Name: Aster Polaris Polaris Group LLC | September 10, 2025 |
| Agent Signature: Date: Print Name: | |

e Page 10 of 10
```

The chunks contain the document in Markdown format. All page elements, including text, tables, and figures, are available in the chunks. They appear in natural reading order, which keeps the text coherent when the chunks feed a document pre-processing pipeline. The output also includes the bounding box of every element in the document. Learn more about the output in detail [here](/document-ingestion/parsing/parse-output).
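
As a rough illustration of working with page chunks, the sketch below joins them back into a single Markdown string. It assumes each chunk is a plain dict with `page_number` and `content` keys, mirroring the JSON fields read in the JavaScript example above. `chunks_to_markdown` is a hypothetical helper, not part of the Tensorlake SDK, and the sample data is made up.

```python
from typing import Iterable


def chunks_to_markdown(chunks: Iterable[dict]) -> str:
    """Join page chunks into one Markdown string, in page order."""
    # sorted() is stable, so chunks on the same page keep their reading order.
    ordered = sorted(chunks, key=lambda c: c["page_number"])
    sections = [
        f"## Page {c['page_number']}\n\n{c['content'].strip()}" for c in ordered
    ]
    return "\n\n".join(sections) + "\n"


if __name__ == "__main__":
    # Made-up sample standing in for the `chunks` array of a parse result.
    sample = [
        {"page_number": 10, "content": "XXXIV. EXECUTION.\n"},
        {"page_number": 9, "content": "XXXIII. ENTIRE AGREEMENT.\n"},
    ]
    print(chunks_to_markdown(sample))
```

Sorting by page number is only a safeguard here; if the chunks already arrive in reading order, the sort leaves them unchanged.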
*** ## Extract Structured Data ```python quickstart.py theme={null} import json import os from typing import Optional from pydantic import BaseModel, Field from tensorlake.documentai import ( DocumentAI, ParsingOptions, StructuredExtractionOptions, ChunkingStrategy, ) doc_ai = DocumentAI() # Use a publicly accessible URL or upload a file to Tensorlake and use the file ID. file_url = "https://tlake.link/docs/real-estate-agreement" # Define a JSON schema using Pydantic # Our structured extraction model will identify the properties we want to extract from the document. # In this case, we are extracting the names and signature dates of the buyer and seller. class Signers(BaseModel): buyer_name: Optional[str] = Field( default=None, description="The name of the buyer, do not extract initials" ) buyer_signature_date: Optional[str] = Field( default=None, description="Date and time that the buyer signed." ) seller_name: Optional[str] = Field( default=None, description="The name of the seller, do not extract initials" ) seller_signature_date: Optional[str] = Field( default=None, description="Date and time that the seller signed." ) # Create a structured extraction options object with the schema # # You can send as many schemas as you want, and the API will return structured data for each schema # indexed by the schema name. 
real_estate_agreement_extraction_options = StructuredExtractionOptions( schema_name="Signers", json_schema=Signers, ) # Submit the parse operation and wait for the job to complete parse_id = doc_ai.extract( file_url=file_url, page_range="9-10", structured_extraction_options=[real_estate_agreement_extraction_options], ) ``` ```python quickstart.py theme={null} result = doc_ai.wait_for_completion(parse_id) ``` ```python quickstart.py theme={null} print(json.dumps(result.structured_data[0].data, indent=4)) ``` ### Prerequisites * A Tensorlake [API key](/platform/authentication#api-keys) ```javascript parseFileUrl.js theme={null} async function parseFileUrl(fileUrl, tensorlakeApiKey) { const signersSchema = { title: "Signers", type: "object", properties: { buyerName: { type: "string", description: "The name of the buyer, do not extract initials", title: "Buyer Name" }, buyerSignatureDate: { type: "string", description: "Date and time that the buyer signed.", title: "Buyer Signature Date" }, sellerName: { type: "string", description: "The name of the seller, do not extract initials", title: "Seller Name" }, sellerSignatureDate: { type: "string", description: "Date and time that the seller signed.", title: "Seller Signature Date" } } }; const realEstateAgreementExtractionOptions = { schema_name: "Signers", json_schema: signersSchema, }; const body = { file_url: fileUrl, page_range: "9-10", structured_extraction_options: [realEstateAgreementExtractionOptions], }; const options = { method: 'POST', headers: { Authorization: `Bearer ${tensorlakeApiKey}`, 'Content-Type': 'application/json', }, body: JSON.stringify(body), }; const response = await fetch( 'https://api.tensorlake.ai/documents/v2/extract', options ); const result = await response.json(); console.log('result:', JSON.stringify(result, null, 2)); return result.parse_id; } const fileUrl = 'https://tlake.link/docs/real-estate-agreement'; const tensorlakeApiKey = 'your-tensorlake-api-key-here'; const parseId = await 
parseFileUrl(fileUrl, tensorlakeApiKey); ``` ```javascript getResults.js theme={null} import { writeFileSync } from 'fs'; function writeParseResults(jobResult) { const structuredData = jobResult.structured_data; console.log(structuredData); } async function getParseResults(parseId, tensorlakeApiKey) { while (true) { const response = await fetch( `https://api.tensorlake.ai/documents/v2/parse/${parseId}`, { method: 'GET', headers: { Authorization: `Bearer ${tensorlakeApiKey}`, 'Content-Type': 'application/json', }, } ); if (!response.ok) { console.error(`Error fetching job: ${response.statusText}`); return; } const result = await response.json(); if (result.status === 'pending' || result.status === 'processing') { console.log('waiting 5s...'); await new Promise((resolve) => setTimeout(resolve, 5000)); console.log(`job status: ${result.status}`); } else { if (result.status === 'successful') { console.log(result); writeParseResults(result); return result; } else { console.error(`Job finished with status: ${result.status}`); return result; } } } } const parseId = 'your-parse-id-here'; const tensorlakeApiKey = 'your-tensorlake-api-key-here'; await getParseResults(parseId, tensorlakeApiKey); ``` ### Output When the parsing is complete, you will see the structured data in the console. ```json theme={null} { "Signers": [ { "data": { "buyer_name": "Nova Ellison", "buyer_signature_date": "September 10, 2025", "seller_name": "Juno Vega", "seller_signature_date": "September 10, 2025" }, "page_numbers": [ 9, 10 ], "schema_name": "Signers" } ] } ``` *** ## Next Steps # Agent with Tool Calling Source: https://docs.tensorlake.ai/examples/agentic-applications/agent-with-tools Build a Claude agent that orchestrates complex workflows using tool calls. Check out the full source code for this example on GitHub. This tutorial demonstrates how to build an **Agent with Tool Calling** using Tensorlake and the Anthropic API. 
This agent orchestrates a multi-step workflow where Claude decides which tools to call, and in what order, to answer user queries effectively.

## Overview

The Agent with Tool Calling follows this pattern:

1. **User Query**: The user asks a question that requires external data or actions (e.g., "What's the weather like at my current location?").
2. **Tool Selection**: Claude analyzes the query and selects the appropriate tool(s) to call (e.g., `get_ip_address`, `get_location_info`).
3. **Tool Execution**: The selected tool functions are executed within Tensorlake's isolated environment.
4. **Information Synthesis**: The tool outputs are fed back to Claude, which then synthesizes a final answer or decides to call more tools.

## Prerequisites

* **Python 3.11+**
* **Tensorlake Account** and CLI installed.
* **Anthropic API Key**

## Implementation (`app.py`)

Here is the complete implementation for the Agent with Tool Calling.

```python theme={null}
import os
from typing import Dict, Any

from anthropic import Anthropic
from pydantic import BaseModel, Field
from tensorlake import application, function, Image

# Define the runtime environment
image = Image(name="agent-with-tools").run("pip install anthropic requests tensorlake pydantic")


class ToolInput(BaseModel):
    name: str
    arguments: Dict[str, Any]


@application()
@function(image=image, secrets=["ANTHROPIC_API_KEY"])
def agent_loop(query: str) -> str:
    """
    Main entry point for the Agent with Tool Calling.
    Orchestrates the conversation loop between Claude and the tools.
    """
    client = Anthropic()
    messages = [{"role": "user", "content": query}]

    # Define available tools for Claude
    tools = [
        {
            "name": "get_ip_address",
            "description": "Get the public IP address of the current execution environment.",
            "input_schema": {"type": "object", "properties": {}}
        },
        {
            "name": "get_location_info",
            "description": "Get location information based on an IP address.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "ip_address": {"type": "string", "description": "The IP address to lookup."}
                },
                "required": ["ip_address"]
            }
        },
        {
            "name": "get_weather_alerts",
            "description": "Get current weather alerts for a location.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City or location name."}
                },
                "required": ["location"]
            }
        }
    ]

    while True:
        # Ask Claude for the next step
        response = client.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

        # Any stop reason other than "tool_use" (e.g. "end_turn" or "max_tokens")
        # ends the loop, so the function cannot spin forever.
        if response.stop_reason != "tool_use":
            return "".join(block.text for block in response.content if block.type == "text")

        # Claude may request several tools in one turn. Every tool_use block
        # needs a matching tool_result in the next user message.
        tool_results = []
        for tool_use in response.content:
            if tool_use.type != "tool_use":
                continue
            tool_name = tool_use.name
            tool_input = tool_use.input
            print(f"Tool Call: {tool_name} with input: {tool_input}")

            # Execute the requested tool
            if tool_name == "get_ip_address":
                tool_result = get_ip_address()
            elif tool_name == "get_location_info":
                tool_result = get_location_info(tool_input["ip_address"])
            elif tool_name == "get_weather_alerts":
                tool_result = get_weather_alerts(tool_input["location"])
            else:
                tool_result = f"Error: Tool {tool_name} not found."

            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": str(tool_result)
            })

        # Add the assistant turn and all tool results to the conversation history
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})


@function(image=image)
def get_ip_address() -> str:
    """Simulates getting the public IP address."""
    # In a real scenario, you might use `requests.get('https://api.ipify.org').text`
    return "203.0.113.1"


@function(image=image)
def get_location_info(ip_address: str) -> str:
    """Simulates getting location info for an IP."""
    return f"Location for {ip_address}: San Francisco, CA"


@function(image=image)
def get_weather_alerts(location: str) -> str:
    """Simulates getting weather alerts."""
    return f"No active weather alerts for {location}."
```

## Running Locally

To test the agent locally, add this code block to the end of `app.py`:

```python theme={null}
if __name__ == "__main__":
    from tensorlake.applications import run_local_application

    # Run the application locally
    result = run_local_application(agent_loop, "What are the current weather alerts for my location?")
    print(f"Agent Output: {result}")
```

Then run the script:

```bash theme={null}
export ANTHROPIC_API_KEY=your_key_here
python app.py
```

## Deploying to Tensorlake

Deploy your agent to the cloud for production use:

```bash theme={null}
tl secrets set ANTHROPIC_API_KEY=your_key_here
tl deploy app.py
```

Your agent is now live! It can autonomously chain tool calls to solve complex user requests, all running within secure, scalable Tensorlake functions.

# Code Interpreter Agent
Source: https://docs.tensorlake.ai/examples/agentic-applications/code-interpreter

Build a secure code execution environment using Tensorlake and OpenAI.

Check out the full source code for this example on GitHub.

This tutorial demonstrates how to build a **Code Interpreter Agent** that can safely execute Python code generated by an LLM.
By leveraging Tensorlake's isolated sandboxing, you can run arbitrary code without compromising your local environment or production servers. ## Overview The Code Interpreter Agent follows this workflow: 1. **User Request**: The user asks a question that requires code execution (e.g., "Calculate the Fibonacci sequence up to 100"). 2. **Code Generation**: An OpenAI agent interprets the request and generates the necessary Python code. 3. **Secure Execution**: The code is sent to a Tensorlake function running in a secure, isolated container. 4. **Result Retrieval**: The execution output (stdout, stderr) is captured and returned to the agent. 5. **Final Answer**: The agent formulates a final response based on the code output. ## Prerequisites * **Python 3.11+** * **Tensorlake Account** and CLI installed. * **OpenAI API Key** ## Implementation (`app.py`) Here is the complete implementation for the Code Interpreter Agent. ```python theme={null} import sys import io import contextlib from typing import Optional from openai import OpenAI from pydantic import BaseModel, Field from tensorlake import application, function, Image # Define the execution environment # We install pandas and numpy to support data analysis tasks image = Image(name="code-interpreter").run("pip install openai pandas numpy") class ExecutionResult(BaseModel): stdout: str stderr: str result: Optional[str] = None @application() @function(image=image, secrets=["OPENAI_API_KEY"]) def code_interpreter(prompt: str) -> str: """ Main entry point for the Code Interpreter Agent. """ client = OpenAI() # Step 1: Generate code based on user prompt completion = client.chat.completions.create( model="gpt-4o", messages=[ {"role": "system", "content": "You are a Python code generator. Output only valid Python code to solve the user's problem. 
Do not include markdown blocks."}, {"role": "user", "content": prompt} ] ) code = completion.choices[0].message.content # Step 2: Execute the generated code securely print(f"Executing generated code: {code}") execution_result = execute_python_code(code) # Step 3: formulate final answer final_response = client.chat.completions.create( model="gpt-4o", messages=[ {"role": "system", "content": "Answer the user's question based on the code execution result."}, {"role": "user", "content": f"Question: {prompt} Code Output: {execution_result.stdout} Errors: {execution_result.stderr}"} ] ) return final_response.choices[0].message.content @function(image=image) def execute_python_code(code: str) -> ExecutionResult: """ Executes Python code in a secure sandbox and captures output. """ stdout = io.StringIO() stderr = io.StringIO() try: # Redirect stdout and stderr to capture print statements with contextlib.redirect_stdout(stdout), contextlib.redirect_stderr(stderr): exec(code, {"__name__": "__main__"}) except Exception as e: print(f"Execution error: {e}", file=stderr) return ExecutionResult( stdout=stdout.getvalue(), stderr=stderr.getvalue() ) ``` ## Running Locally To test the interpreter locally, add this code block to the end of `app.py`: ```python theme={null} if __name__ == "__main__": from tensorlake.applications import run_local_application # Run the application locally result = run_local_application(code_interpreter, "Calculate the sum of the first 50 prime numbers.") print(f"Code Interpreter Output:\n{result}") ``` Then run the script: ```bash theme={null} export OPENAI_API_KEY=your_key_here python app.py ``` ## Deploying to Tensorlake Deploy your secure code interpreter to the cloud with a single command: ```bash theme={null} tl secrets set OPENAI_API_KEY=your_key_here tl deploy app.py ``` This deployment creates a dedicated, isolated environment for every execution request, ensuring complete safety and scalability for your code interpretation tasks. 
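
The capture step inside `execute_python_code` can be tried on its own, outside Tensorlake, using only the standard library. The sketch below is an illustration under stated assumptions: `run_snippet` is a hypothetical helper, not part of the Tensorlake SDK, and unlike the tutorial's version it records the full traceback. Note that `exec` by itself provides no isolation; in the tutorial, isolation comes from the container the function runs in.

```python
import contextlib
import io
import traceback


def run_snippet(code: str) -> dict:
    """Run `code` with exec() and return the captured stdout and stderr."""
    stdout, stderr = io.StringIO(), io.StringIO()
    try:
        with contextlib.redirect_stdout(stdout), contextlib.redirect_stderr(stderr):
            # A fresh globals dict keeps the snippet away from this module's names.
            exec(code, {"__name__": "__main__"})
    except Exception:
        # Record the traceback instead of letting the caller crash.
        stderr.write(traceback.format_exc())
    return {"stdout": stdout.getvalue(), "stderr": stderr.getvalue()}


if __name__ == "__main__":
    print(run_snippet("print(sum(range(5)))"))
    print(run_snippet("1 / 0")["stderr"])
```

Returning the error text rather than raising lets the calling agent pass failures back to the model, which matches how the tutorial feeds `stderr` into the final prompt.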
# Deep Research Agent Source: https://docs.tensorlake.ai/examples/agentic-applications/deep-research Build a multi-agent deep research pipeline with Tensorlake and OpenAI. Check out the full source code for this example on GitHub. This tutorial demonstrates how to build a **Deep Research Agent** using Tensorlake and the OpenAI Agents SDK. This application orchestrates multiple agents to plan, search, and write comprehensive research reports on any given topic. ## Overview The Deep Research Agent consists of three specialized agents that work together in a pipeline: 1. **Planner Agent**: Breaks down the user's research topic into specific search queries and steps. 2. **Search Agent**: Executes the planned search queries in parallel, retrieving and summarizing relevant information from the web. 3. **Writer Agent**: Synthesizes the gathered information into a structured, comprehensive markdown report. Each agent runs as an isolated, serverless function on Tensorlake, ensuring scalability and fault tolerance. ## Prerequisites * **Python 3.11+** * **Tensorlake Account** and CLI installed. * **OpenAI API Key** ## Project Structure Your project should look like this: ```text theme={null} deep-research/ ├── app.py # Main application logic and Tensorlake functions ├── models.py # Pydantic data models for structured inputs/outputs ├── prompts.py # System prompts for the agents └── requirements.txt ``` ## Implementation ### 1. Define Data Models (`models.py`) First, we define the data structures that our agents will use to communicate. This ensures type safety and clear interfaces between the agents. 
```python theme={null} from pydantic import BaseModel, Field from typing import List class SearchQuery(BaseModel): query: str = Field(..., description="A specific search query to execute.") rationale: str = Field(..., description="Why this query is important.") class ResearchPlan(BaseModel): topic: str search_queries: List[SearchQuery] class SearchResult(BaseModel): url: str title: str content: str summary: str class ResearchReport(BaseModel): topic: str markdown_content: str references: List[str] ``` ### 2. Create the Agents (`app.py`) In `app.py`, we define our Tensorlake functions. Each function represents a stage in the pipeline and utilizes an OpenAI agent. ```python theme={null} import os from typing import List from openai import OpenAI from tensorlake import application, function, Image from pydantic import BaseModel from models import ResearchPlan, SearchResult, ResearchReport # Import your prompts here # from prompts import PLANNER_PROMPT, SEARCH_PROMPT, WRITER_PROMPT # Define the runtime image image = Image(name="deep-research-agent").run("pip install openai tensorlake pydantic") @application() @function(image=image, secrets=["OPENAI_API_KEY"]) def deep_research_pipeline(topic: str) -> ResearchReport: """ Orchestrates the deep research pipeline. 
""" print(f"Starting deep research on: {topic}") # Phase 1: Planning plan = create_research_plan(topic) print(f"Plan created with {len(plan.search_queries)} queries.") # Phase 2: Searching (Parallel Execution) # We map the search function over the queries to run them in parallel search_results = execute_search.map(plan.search_queries) print(f"Completed {len(search_results)} searches.") # Phase 3: Writing report = write_report(topic, search_results) print("Report generation complete.") return report @function(image=image, secrets=["OPENAI_API_KEY"]) def create_research_plan(topic: str) -> ResearchPlan: client = OpenAI() completion = client.beta.chat.completions.parse( model="gpt-4o", messages=[ {"role": "system", "content": "You are an expert research planner."}, {"role": "user", "content": f"Create a research plan for: {topic}"} ], response_format=ResearchPlan ) return completion.choices[0].message.parsed @function(image=image, secrets=["OPENAI_API_KEY"]) def execute_search(query_obj) -> SearchResult: # In a real implementation, you would use a search tool or API here. # For this example, we'll simulate a search result. # You could use tools like Tavily, Serper, or a custom scraper. client = OpenAI() # Simulate processing the query summary = f"Simulated search results for: {query_obj.query}" return SearchResult( url="https://example.com", title=f"Results for {query_obj.query}", content="Full content would go here...", summary=summary ) @function(image=image, secrets=["OPENAI_API_KEY"]) def write_report(topic: str, results: List[SearchResult]) -> ResearchReport: client = OpenAI() # Compile context from search results context = " ".join([f"Source: {r.url} Summary: {r.summary}" for r in results]) completion = client.beta.chat.completions.parse( model="gpt-4o", messages=[ {"role": "system", "content": "You are an expert research writer. 
Write a comprehensive report based on the provided context."}, {"role": "user", "content": f"Topic: {topic} Context: {context}"} ], response_format=ResearchReport ) return completion.choices[0].message.parsed ``` ## Running Locally To test your pipeline locally, add this code block to the end of `app.py` and run it with python. ```python theme={null} if __name__ == "__main__": from tensorlake.applications import run_local_application # Run the application locally run_local_application(deep_research_pipeline, "The Future of Quantum Computing") ``` Then run the script: ```bash theme={null} export OPENAI_API_KEY=your_key_here python app.py ``` ## Deploying to Tensorlake When you're ready to deploy, use the `tl deploy` command. ```bash theme={null} tl secrets set OPENAI_API_KEY=your_key_here tl deploy app.py ``` Your deep research agent is now live and scalable! You can invoke it via the provided HTTP endpoint or the Tensorlake SDK. # Personal Finance Manager Source: https://docs.tensorlake.ai/examples/agentic-applications/personal-finance-manager Build an AI-powered finance manager that parses statements and answers spending questions. Check out the full source code for this example on GitHub. This tutorial demonstrates how to build a **Personal Finance Manager** using Tensorlake and Claude. This application can parse PDF bank statements, categorize transactions using LLMs, store them in a PostgreSQL database, and answer natural language questions about your spending. ## Overview The Personal Finance Manager consists of two main agents: 1. **Finance Analyzer Agent**: Parses PDF statements, extracts transactions, categorizes them using Claude, and stores them in a database. 2. **Finance Query Agent**: Translates natural language questions (e.g., "How much did I spend on groceries?") into SQL queries, executes them, and visualizes the results. ## Prerequisites * **Python 3.11+** * **Tensorlake Account** and CLI installed. 
* **Anthropic API Key** * **PostgreSQL Database** (e.g., Neon, Supabase, or local) ## Project Structure ```text theme={null} personal-finance/ ├── app.py # Main application logic and agents ├── models.py # Pydantic models for transactions and queries ├── config.py # Configuration and prompts └── requirements.txt ``` ## Implementation (`app.py`) Here is a simplified view of the core logic for the Finance Analyzer Agent. ```python theme={null} import os from typing import List from anthropic import Anthropic from tensorlake import Agent, Assistant, File, User, configure_logging, get_logger, Image, application, function # import your models and config here # Define the runtime environment with necessary dependencies image = Image(name="finance-manager").run("pip install anthropic pandas psycopg2-binary tensorlake pydantic") class FinanceAnalyzerAgent(Agent): """ Parses PDF statements and stores categorized transactions. """ def __init__(self, name: str, system_prompt: str, tools: List): super().__init__(name, system_prompt, tools) # Initialize Anthropic client and DB connection self.anthropic_client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY")) def call(self, user_message: User, files: List[File]) -> Assistant: if not files: return Assistant(content="Please upload a PDF statement.") pdf_file = files[0] # 1. Extract text from PDF (using a helper or library) pdf_text = self._extract_text(pdf_file) # 2. Extract transactions using Claude transactions = self._extract_transactions_from_text(pdf_text) # 3. Categorize transactions using Claude categorized = self._categorize_transactions(transactions) # 4. Store in Database self._insert_into_db(categorized) return Assistant(content=f"Processed statement. Added {len(categorized)} transactions.") # Helper methods implementation... ``` And the Finance Query Agent: ```python theme={null} class FinanceQueryAgent(Agent): """ Answers questions about financial data using SQL. 
""" def call(self, user_message: User, files: List) -> Assistant: # 1. Get Database Schema schema = self._get_db_schema() # 2. Generate SQL query using Claude based on user question and schema sql_query = self._generate_sql(user_message.content, schema) # 3. Execute SQL results = self._execute_sql(sql_query) # 4. Formulate answer answer = self._generate_answer(user_message.content, results) return Assistant(content=answer) ``` ## Running Locally 1. Set up your environment variables: ```bash theme={null} export ANTHROPIC_API_KEY=your_key export DATABASE_URL=postgresql://user:password@host:port/dbname ``` 2. Run the application: ```bash theme={null} python app.py path/to/statement.pdf ``` ## Deploying to Tensorlake Deploy your finance manager to the cloud securely. ```bash theme={null} tl secrets set ANTHROPIC_API_KEY=your_key tl secrets set DATABASE_URL=your_db_url tl deploy app.py ``` Your personal finance assistant is now ready to help you track your spending! # Weather Agent Source: https://docs.tensorlake.ai/examples/agentic-applications/weather-agent Build a conversational weather agent using Tensorlake and OpenWeatherMap. Check out the full source code for this example on GitHub. This tutorial demonstrates how to build a **Weather Agent** using Tensorlake and the OpenWeatherMap API. This agent can fetch real-time weather data for any location and answer natural language questions about it. ## Overview The Weather Agent consists of: 1. **Weather Tool**: A Python function that calls the OpenWeatherMap API. 2. **Weather Agent**: An LLM-powered agent that understands user queries (e.g., "Will I need an umbrella in London today?") and decides when to call the weather tool. ## Prerequisites * **Python 3.11+** * **Tensorlake Account** and CLI installed. 
* **OpenWeatherMap API Key**
* **Anthropic API Key**

## Project Structure

```text theme={null}
weather-app/
├── agent.py              # Weather Agent logic and API calls
├── tensorlake_app.py     # Main application entry point
├── config.py             # Configuration and prompts
└── requirements.txt
```

## Implementation (`agent.py`)

Here is the core logic for the Weather Agent.

```python theme={null}
import os
import requests
from typing import List

from anthropic import Anthropic
from pydantic import BaseModel, Field
from tensorlake import Agent, Assistant, Function, FunctionTool, User, configure_logging, get_logger, Image, function

# Define the runtime environment
image = Image(name="weather-agent").run("pip install anthropic requests tensorlake pydantic")


class GetCurrentWeatherInput(BaseModel):
    location: str = Field(..., description="The city and country, e.g., 'London, UK' or 'New York, USA'.")


@function(image=image, secrets=["OPENWEATHER_API_KEY"])
def get_current_weather(location: str) -> str:
    """
    Fetches current weather data for a specified location using the OpenWeatherMap API.
    """
    api_key = os.getenv("OPENWEATHER_API_KEY")
    # Use HTTPS so the API key is not sent in cleartext
    base_url = "https://api.openweathermap.org/data/2.5/weather"
    params = {
        "q": location,
        "appid": api_key,
        "units": "metric",
    }

    try:
        # A timeout keeps the tool from hanging on a stalled connection
        response = requests.get(base_url, params=params, timeout=10)
    except requests.RequestException:
        return f"Could not fetch weather for {location}."

    if response.status_code == 200:
        data = response.json()
        desc = data['weather'][0]['description']
        temp = data['main']['temp']
        return f"Weather in {location}: {desc}, {temp}°C"
    else:
        return f"Could not fetch weather for {location}."
class WeatherAgent(Agent): def __init__(self, name: str, system_prompt: str, tools: List[FunctionTool]): super().__init__(name, system_prompt, tools) self.anthropic_client = Anthropic() def call(self, user_message: User, files: List) -> Assistant: messages = [{"role": "user", "content": user_message.content}] response = self.anthropic_client.messages.create( model="claude-3-opus-20240229", max_tokens=1024, system=self.system_prompt, messages=messages, tools=self.tools, ) if response.stop_reason == "tool_use": tool_use = next(block for block in response.content if block.type == "tool_use") if tool_use.name == "get_current_weather": location = tool_use.input["location"] weather_info = get_current_weather(location) messages.append({"role": "assistant", "content": response.content}) messages.append({ "role": "user", "content": [ { "type": "tool_result", "tool_use_id": tool_use.id, "content": weather_info } ] }) final_response = self.anthropic_client.messages.create( model="claude-3-opus-20240229", max_tokens=1024, system=self.system_prompt, messages=messages, tools=self.tools, ) return Assistant(content=final_response.content) return Assistant(content=response.content) # Initialize the agent weather_agent = WeatherAgent( name="weather_agent", system_prompt="You are a helpful weather assistant.", tools=[ FunctionTool( name="get_current_weather", description="Get current weather for a location.", function=Function( name="get_current_weather", description="Get current weather for a location.", input_schema=GetCurrentWeatherInput.model_json_schema(), ), ) ] ) ``` ## Running Locally 1. Set up environment variables: ```bash theme={null} export OPENWEATHER_API_KEY=your_openweather_key export ANTHROPIC_API_KEY=your_anthropic_key ``` 2. Run the agent: ```bash theme={null} python agent.py ``` ## Deploying to Tensorlake Deploy your weather agent to the cloud. 
```bash theme={null}
tl secrets set OPENWEATHER_API_KEY=your_openweather_key
tl secrets set ANTHROPIC_API_KEY=your_anthropic_key
tl deploy agent.py
```

Your weather agent is now live and ready to answer queries!

# Web Scraper to MongoDB Atlas

Source: https://docs.tensorlake.ai/examples/agentic-applications/web-scraper

Build a scalable web scraper that stores vector embeddings in MongoDB Atlas.

Check out the full source code for this example on GitHub.

This tutorial demonstrates how to build a production-grade **Web Scraper** that crawls websites, processes content into clean Markdown, generates embeddings using Voyage AI, and stores them in MongoDB Atlas Vector Search.

## Overview

This application showcases the power of Tensorlake's parallel processing capabilities:

1. **Parallel Crawling**: Uses Breadth-First Search (BFS) with Tensorlake's `.map()` to fetch multiple pages concurrently at each depth level.
2. **Headless Browsing**: Utilizes **PyDoll** (based on Chromium) to render JavaScript-heavy websites.
3. **Content Cleaning**: Converts HTML and PDFs to clean Markdown, automatically removing boilerplate like headers, footers, and ads.
4. **Vector Embeddings**: Generates high-quality embeddings for document chunks using **Voyage AI**.
5. **Vector Search**: Stores the processed chunks and embeddings directly into **MongoDB Atlas** for RAG applications.

## Prerequisites

* **Python 3.11+**
* **Tensorlake Account** and CLI installed.
* **MongoDB Atlas** cluster URI.
* **Voyage AI** API Key.

## Implementation

The application is defined in a single file, `scraper_to_atlas.py`. It defines two custom runtime images: one for scraping (with Chromium) and one for embedding (lightweight).

### 1. Define Dependencies and Images

```python theme={null}
from tensorlake.applications import Image

# Image with Chromium, pydoll, and dependencies for web scraping
scraper_image = (
    Image(name="scraper-to-atlas-image", base_image="python:3.11.0")
    .env("DEBIAN_FRONTEND", "noninteractive")
    .run("apt-get update && apt-get install -y chromium ...")  # System deps
    .run("pip install pydoll-python tensorlake beautifulsoup4 markdownify pymupdf4llm")
)

# Image for embedding and MongoDB operations
embedding_image = (
    Image(name="embedding-image", base_image="python:3.11.0")
    .run("pip install tensorlake voyageai pymongo")
)
```

### 2. Main Scraping Logic (`scraper_to_atlas.py`)

The `@application` entry point orchestrates the crawling process. It manages the BFS queue and dispatches parallel tasks using `fetch_and_convert.map()`.

```python theme={null}
@application()
@function(secrets=["VOYAGE_API_KEY", "MONGO_URI"])
def scrape_and_embed(input: ScrapeAndEmbedInput) -> dict:
    # ... setup BFS ...

    # Phase 1: Parallel BFS
    for depth in range(max_depth + 1):
        # ... deduce URLs to fetch ...

        # Parallel fetch all URLs at this depth level using map()
        results = fetch_and_convert.map(urls_to_fetch)

        # ... process results and collect new links ...

    # Phase 2: Process PDFs in parallel
    if pdf_urls:
        pdf_results = fetch_and_convert_pdf.map(list(pdf_urls))

    # Phase 3: Generate embeddings and store
    embed_and_store(all_documents, ...)
```

### 3. Page Fetching and Conversion

The `fetch_and_convert` function runs in the `scraper_image` and uses PyDoll to render pages.

```python theme={null}
@function(image=scraper_image, timeout=120, memory=4)
def fetch_and_convert(url: str) -> dict:
    return asyncio.run(_fetch_and_convert_async(url))

async def _fetch_and_convert_async(url: str) -> dict:
    async with Chrome() as browser:
        page = await browser.new_page()
        await page.goto(url)
        html = await page.content()

        # ... extract title and links ...
        # Convert to clean markdown
        markdown = _html_to_markdown(html)
        chunks = _chunk_text(markdown)
        return {"url": url, "chunks": chunks, ...}
```

### 4. Embedding and Storage

The `embed_and_store` function runs in the `embedding_image` and handles interaction with Voyage AI and MongoDB.

```python theme={null}
@function(image=embedding_image, secrets=["VOYAGE_API_KEY", "MONGO_URI"])
def embed_and_store(documents, mongo_uri, voyage_api_key, ...):
    # Initialize Voyage AI
    vo = voyageai.Client(api_key=voyage_api_key)

    # Generate embeddings
    embeddings = vo.embed(texts=[d["text"] for d in documents], model="voyage-4-large")

    # Store in MongoDB
    client = pymongo.MongoClient(mongo_uri)
    collection = client[db_name][col_name]
    collection.insert_many([{...} for ...])
```

## Running Locally

1. Set your environment variables:

   ```bash theme={null}
   export MONGO_URI="mongodb+srv://..."
   export VOYAGE_API_KEY="voyage-..."
   ```

2. Run the application:

   ```bash theme={null}
   python scraper_to_atlas.py
   ```

## Deploying to Tensorlake

Deploy your scalable scraper to the cloud.

```bash theme={null}
tl secrets set MONGO_URI="mongodb+srv://..."
tl secrets set VOYAGE_API_KEY="voyage-..."
tl deploy scraper_to_atlas.py
```

Your scraper will now run in the cloud, automatically scaling to handle hundreds of pages in parallel!

# Bounding Boxes

Source: https://docs.tensorlake.ai/examples/code-snippets/bounding-boxes

Access per-fragment bounding box coordinates from Tensorlake parse results to highlight or position extracted content on source pages.
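Highlighting a fragment in a document viewer usually means mapping its box from page coordinates to the size at which the page is rendered. The sketch below is a minimal illustration, not part of the Tensorlake SDK: it assumes a bounding box behaves like a dict with `x1`, `y1`, `x2`, `y2` keys measured from the top-left corner, as the snippets on this page use. The helper names `bbox_size` and `scale_bbox`, and the page and viewer sizes, are hypothetical.

```python
from typing import Dict, Tuple

# Assumption: a bounding box is a mapping with x1/y1 (top-left) and x2/y2 (bottom-right).
BBox = Dict[str, float]


def bbox_size(bbox: BBox) -> Tuple[float, float]:
    """Return (width, height) of a bounding box."""
    return bbox["x2"] - bbox["x1"], bbox["y2"] - bbox["y1"]


def scale_bbox(
    bbox: BBox,
    page_size: Tuple[float, float],
    view_size: Tuple[float, float],
) -> BBox:
    """Map a box from page coordinates to viewer coordinates.

    page_size and view_size are (width, height) pairs; both are hypothetical
    inputs that the caller has to supply from its own rendering code.
    """
    scale_x = view_size[0] / page_size[0]
    scale_y = view_size[1] / page_size[1]
    return {
        "x1": bbox["x1"] * scale_x,
        "y1": bbox["y1"] * scale_y,
        "x2": bbox["x2"] * scale_x,
        "y2": bbox["y2"] * scale_y,
    }


example = {"x1": 10, "y1": 20, "x2": 110, "y2": 70}
print(bbox_size(example))
print(scale_bbox(example, page_size=(200, 100), view_size=(400, 200)))
```

Keeping the scaling in one helper means the same fragment data can drive highlights at any zoom level, provided the page dimensions used for parsing are known.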
**Accessing Bounding Boxes**:

```python theme={null}
result = doc_ai.parse_and_wait(file_id)

for page in result.pages:
    for fragment in page.page_fragments:
        bbox = fragment.bbox
        print(f"Fragment type: {fragment.fragment_type}")
        print(f"Top-left: ({bbox['x1']}, {bbox['y1']})")
        print(f"Bottom-right: ({bbox['x2']}, {bbox['y2']})")
```

**Citation with source location**:

```python theme={null}
# Show users exactly where information came from
for fragment in page.page_fragments:
    if fragment.fragment_type == "text":
        print(f"Content: {fragment.content.text}")
        print(f"Found on page {page.page_number} at ({fragment.bbox['x1']}, {fragment.bbox['y1']})")
```

**Calculate dimensions**:

```python theme={null}
# Get width and height of a fragment
for fragment in page.page_fragments:
    width = fragment.bbox['x2'] - fragment.bbox['x1']
    height = fragment.bbox['y2'] - fragment.bbox['y1']
    print(f"{fragment.fragment_type}: {width}x{height} pixels")
```

**Filter by location**:

```python theme={null}
# Extract only content from a specific region (e.g., ignore headers/footers)
main_content = [
    fragment for fragment in page.page_fragments
    if 50 < fragment.bbox['y1'] < 700  # Exclude top/bottom margins
]
```

**Visual debugging**:

```python theme={null}
# Identify which fragments might have extraction issues
for fragment in page.page_fragments:
    width = fragment.bbox['x2'] - fragment.bbox['x1']
    height = fragment.bbox['y2'] - fragment.bbox['y1']

    if width < 10 or height < 10:
        print(f"Warning: Very small fragment at ({fragment.bbox['x1']}, {fragment.bbox['y1']})")
```

**Build Clickable Citations**:

```python theme={null}
# Create data structure for highlighting content in a document viewer
citations = []
for fragment in page.page_fragments:
    if "key information" in fragment.content.text.lower():
        citations.append({
            "page": page.page_number,
            "bbox": {
                "x1": fragment.bbox['x1'],
                "y1": fragment.bbox['y1'],
                "x2": fragment.bbox['x2'],
                "y2": fragment.bbox['y2']
            },
            "text": fragment.content.text
        })
```

# Build Smart Document Understanding Agents with Tensorlake and OpenAI Agents SDK

Source: https://docs.tensorlake.ai/examples/cookbooks/build-smarter-agents-with-doc-understanding

Turn documents into action-ready inputs for agents by combining Tensorlake's structured extraction with the OpenAI Agents SDK.

Agentic applications rely on accurate, structured inputs to make decisions. Tensorlake enables this by extracting structured document data that agents can reason over, with no guesswork or prompting required. This turns your documents into action-ready inputs for any AI agent framework.

Parse research papers and ask natural language questions about them with OpenAI and Tensorlake using this notebook:

Here is an example of how to use the Tensorlake Python SDK to parse a research paper and extract key information. With this information, you can build AI agents that answer natural language queries from more detailed and accurate data.

1. Get your [Tensorlake API key](https://docs.tensorlake.ai/platform/authentication#api-keys)
2. Install the Tensorlake SDK with `pip install tensorlake`

```python theme={null}
import json
from typing import List, Optional

from pydantic import BaseModel, Field
from tensorlake.documentai import (
    ChunkingStrategy,
    DocumentAI,
    ParseStatus,
    ParsingOptions,
    StructuredExtractionOptions,
)

# Initialize the client and add your API Key
doc_ai = DocumentAI()

# Document to parse (a URL in this example)
file_path = "https://tlake.link/docs/research-paper"
```

Define a Pydantic model to extract relevant information from the research paper.
```python theme={null}
class ResearchPaperSchema(BaseModel):
    """Schema focusing on the most critical information from the research papers"""
    title: str = Field(description="Title of the research paper")
    authors: List[str] = Field(description="List of author names")
    abstract: str = Field(description="Abstract of the paper")
    research_problem: str = Field(description="What problem does this paper solve?")
    main_approach: str = Field(description="What is the main approach or method used?")
    key_contributions: List[str] = Field(description="What are the 3-5 most important contributions?")
    methodology_summary: str = Field(description="Brief summary of the research methodology")
    datasets_used: Optional[List[str]] = Field(description="Datasets mentioned in the paper", default=None)
    evaluation_metrics: Optional[List[str]] = Field(description="How do they measure success?", default=None)
    related_work_summary: Optional[str] = Field(description="Brief summary of how this relates to existing work", default=None)
    limitations: Optional[List[str]] = Field(description="What limitations do the authors acknowledge?", default=None)
```

```python theme={null}
# Configure parsing options for academic papers
parsing_options = ParsingOptions(
    chunking_strategy=ChunkingStrategy.PAGE
)

# Configure structured extraction
structured_extraction_options = StructuredExtractionOptions(
    schema_name="Research Paper Analysis",
    json_schema=ResearchPaperSchema
)

# Parse the document with the specified extraction options
parse_id = doc_ai.parse(
    file_path,
    parsing_options=parsing_options,
    structured_extraction_options=[structured_extraction_options],
)
print(f"Parse job submitted with ID: {parse_id}")

# Wait for completion
result = doc_ai.wait_for_completion(parse_id)
```

The result will include the extracted data, all of the markdown chunks, and the entire document layout.
```python theme={null}
# Print the structured data extracted
print(json.dumps(result.structured_data[0].data, indent=2))

# Get all of the markdown chunks (by page)
for index, chunk in enumerate(result.chunks):
    print(f"Chunk {index}:")
    print(chunk.content)
```

The output will be:

```json Structured Data theme={null}
{
  "abstract": "A crucial component in many deep learning applications, such as Frequently Asked Questions (FAQ) and Retrieval-Augmented Generation (RAG), is dense retrieval. In this process, embedding models transform raw text into numerical vectors. However, the embedding models that currently excel on text embedding benchmarks, like the Massive Text Embedding Benchmark (MTEB), often have numerous parameters and high vector dimensionality. This poses challenges for their application in real-world scenarios. To address this issue, we propose a novel multi-stage distillation framework that enables a smaller student embedding model to distill multiple larger teacher embedding models through three carefully designed losses. Meanwhile, we utilize Matryoshka Representation Learning (MRL) to reduce the vector dimensionality of the student embedding model effectively. Our student model named Jasper with 2 billion parameters, built upon the Stella embedding model, obtained the No.3 position on the MTEB leaderboard (as of December 24, 2024), achieving an average 71.54 score across 56 datasets. We have released the model and data on the Hugging Face Hub, and the training codes are available in this project repository.",
  "authors": [
    "Dun Zhang",
    "Jiacheng Li",
    "Ziyang Zeng",
    "Fulong Wang"
  ],
  "datasets_used": [
    "sentence-transformers/embedding-training-data",
    "BAAI/Infinity-MM"
  ],
  "evaluation_metrics": [
    "average score on MTEB leaderboard across 56 datasets"
  ],
  "key_contributions": [
    "Propose a novel multi-stage distillation framework for reducing model size without significantly losing performance.",
    "Develop the Jasper model with 2 billion parameters that perform comparably to models with 7 billion parameters.",
    "Use Matryoshka Representation Learning (MRL) to reduce vector dimensionality efficiently.",
    "Publication of three tailored loss functions to enhance distillation learning.",
    "Release of model and data on Hugging Face Hub."
  ],
  "limitations": [
    "The paper does not conduct experiments to evaluate the proposed approach for self-distillation in detail.",
    "Stage 4 only achieves preliminary alignment between text and image modalities, indicating room for improvement."
  ],
  "main_approach": "The main approach is a multi-stage distillation framework that involves distilling information from larger teacher models to a smaller student model using three specific loss functions, combined with Matryoshka Representation Learning (MRL) for dimensionality reduction.",
  "methodology_summary": "The methodology involves a four-stage distillation process where a smaller student model distills information from larger teacher models using specifically designed loss functions to learn effective text representations while employing MRL for dimensionality reduction. Subsequent stages focus on enhanced dimension reduction and unlocking multimodal potential through incorporating vision encodings.",
  "related_work_summary": "The paper builds on existing dense retrieval and knowledge distillation methodologies, emphasizing enhanced retrieval training efficiency and effectiveness, with references to prior works on knowledge distillation and representation learning.",
  "research_problem": "The paper addresses the challenge of deploying high-performing dense retrieval models with large parameters and vector dimensions in practical applications by proposing a distillation framework to reduce model size while maintaining performance.",
  "title": "Jasper and Stella: distillation of SOTA embedding models"
}
```

```md markdown_chunks.json expandable theme={null}
Chunk 0:
arXiv:2412.19048v2 [cs.IR] 23 Jan 2025

## Jasper and Stella: distillation of SOTA embedding models

Dun Zhang1, Jiacheng Li1; Ziyang Zeng1,2, Fulong Wang1 1 NovaSearch Team 2Beijing University of Posts and Telecommunications infgrad@163.com jcli.nlp@gmail.com ziyang1060@bupt.edu.cn wangfl1989@163.com

## Abstract

A crucial component in many deep learning applications, such as Frequently Asked Ques- tions (FAQ) and Retrieval-Augmented Gener- ation (RAG), is dense retrieval. In this pro- cess, embedding models transform raw text into numerical vectors. However, the embed- ding models that currently excel on text embed- ding benchmarks, like the Massive Text Embed- ding Benchmark (MTEB), often have numer- ous parameters and high vector dimensionality. This poses challenges for their application in real-world scenarios. To address this issue, we propose a novel multi-stage distillation frame- work that enables a smaller student embedding model to distill multiple larger teacher embed- ding models through three carefully designed losses. Meanwhile, we utilize Matryoshka Rep- resentation Learning (MRL) to reduce the vec- tor dimensionality of the student embedding model effectively.
Our student model named Jasper with 2 billion parameters, built upon the Stella embedding model, obtained the No.3 po- sition on the MTEB leaderboard (as of Decem- ber 24, 2024), achieving average 71.54 score across 56 datasets. We have released the model and data on the Hugging Face Hub 1 2, and the training codes are available in this project repository 3. ## 1 Introduction With the rapid development of natural language pro- cessing technologies, text embedding models play a crucial role in text representation (Kashyap et al., 2024), information retrieval (Zhao et al., 2024a), and text generation tasks (Gao et al., 2023). By mapping words, sentences, or documents into a high-dimensional continuous space, these models bring similar texts closer together in their vector representations, thereby not only enhancing the manipulability of textual data but also significantly improving the performance of various downstream tasks (Agarwal et al., 2024; Wang et al., 2024; Zhou et al., 2024). However, embedding models that demonstrate excellent performance on the METB leaderboard4 (Muennighoff et al., 2023) usually contain a large number of parameters and high vector dimensions. For instance, both NV-Embed-v2 (Lee et al., 2024; Moreira et al., 2024) and bge-en-icl (Xiao et al., 2023; Li et al., 2024) have 7 billion parameters and 4096-dimensional vector representations. These characteristics lead to slow inference speeds and high storage costs, posing a significant challenge to their direct practical application. To address the aforementioned challenges, we propose a novel multi-stage knowledge distillation framework for embedding models. Knowledge dis- tillation is widely recognized for enhancing the effectiveness of dense retrieval training (Hofstätter et al., 2021; Lin et al., 2021). In our framework, we introduce three carefully designed loss func- tions to distill knowledge from the teacher model to the student model. 
These loss functions shift from a specific constraint to a broader constraint. The first, cosine loss, calculates the absolute dif- ference in text representations between the student and teacher models. The pointwise signal derived from a single text is straightforward, yet its lim- ited optimization direction tends to readily lead to overfitting on the training data. Thus, we introduce the similarity loss, which measures the semantic discrepancies between the student and teacher mod- els from a text-pair perspective. Additionally, we design the relative similarity distillation loss to fur- ther leverage relative ranking information. This *Dun Zhang and Jiacheng Li make equal contributions to this work. 1https://huggingface.co/infgrad/jaspe r_en_vision_language_v1 2https://huggingface.co/datasets/infg rad/ jasper_text_distill_dataset 3https : //github. com/NLPJCL/RAG-Retriev al 4https://huggingface.co/spaces/mteb/l eaderboard Chunk 1: ensures that the student model learns the teacher's ranking preferences across all potential positive and negative text pairs within the batch, thereby improving the robustness of embedding learning. To further improve the performance of the stu- dent model, we utilize multiple powerful large em- bedding models as teachers. Specifically, we con- catenate the vectors produced by all teacher models to create the final ground truth, which inevitably leads to an increase in the student model's vector dimension. To address this issue, we adopt a Ma- tryoshka Representation Learning (MRL)-based training method (Kusupati et al., 2024) to effec- tively compress the student model's vector rep- resentation. Additionally, to develop the multi- modal retrieval capability of our student model, we integrate a vision encoder and introduce a self- distillation mechanism to align the visual embed- dings with the textual embeddings. 
In terms of the overall training process, we employ a 4-stage dis- tillation approach to progressively transfer knowl- edge from the teacher models to the student model. Each stage focuses on specific aspects, combining three loss functions and fine-tuning different pa- rameters of the student model to ensure a smooth and effective distillation process. Experimental results on the MTEB leaderboard demonstrate that our student model named Jasper with 2 billion (2B) parameters, primarily built upon the foundation of the Stella embedding model, de- livers excellent performance (average 71.54 score across 56 datasets) comparable to other embedding models with 7 billion (7B) parameters, and sig- nificantly outperforms models with fewer than 2B parameters. The main contributions of this paper can be sum- marized as follows: (1) We propose a novel multi-stage distillation framework, which enables a smaller student embedding model to effectively distill knowl- edge from multiple larger teacher embedding models through three carefully designed loss functions. (2) Our 2B Jasper model obtained the No.3 posi- tion on the MTEB leaderboard (as of Decem- ber 24, 2024), producing results comparable to other top-ranked 7B embedding models and significantly outperforming other models with less than 2B parameters. ## 2 Methods ## 2.1 Definitions For a more comprehensive introduction of our model and distillation framework, we make the following definitions: · Student Model: The text embedding model that is the subject of training, tasked with learning to produce effective vector represen- tations. · Teacher Model: The state-of-the-art (SOTA) embedding model serving as a teacher, guid- ing the student model in generating effective vectors. Notably, the teacher model will not be trained. · Sx: The normalized vector representation of a text x produced by the student model. 
· tx: The vector representation of the same text x, first normalized, then concatenated, and normalized again, produced by multiple teacher models. · Sx: A matrix of normalized vector represen- tations for a batch of text X produced by the student model. · Tx: A corresponding matrix of vector rep- resentations for the same batch of text X, first normalized, then concatenated, and subse- quently normalized again, generated by multi- ple teacher models. ## 2.2 Model Architecture Our student model architecture follows the sim- ple and standard design of combining a language model with a vision encoder. As shown in Figure 1, it consists of four components: 1. A encoder-based language model that gener- ates text embeddings through mean pooling. 2. A vision encoder that independently maps im- ages into vision token embeddings. 3. A pooler that maps vision token embed- dings to the same dimension as the language model's input textual embeddings, while re- ducing the length of visual token sequences. 4. Several fully connected (FC) layers that project the embeddings to a specific dimen- sion for the final output. Chunk 2: Figure 1: The model architecture of Jasper model. ### Figure 12288 dim vector 1024 dim vector 512 dim vector 256 dim vector FC1 FC2 FC3 FC4 Mean Polling ... Stella Encoder AvgPool2d Siglip Vision Encoder Stella Input Embedding Image Text ## 2.3 Stage 1&2: Distillation from Multiple Teachers In the first two stages of distillation, we use a fully connected layer to map the vectors of the student model onto the dimensions of the teacher mod- els. Specifically, we employ NV-Embed-v25 and stella_en_1.5B_v56 as teacher models, which have vector dimensions of 4096 and 8192, respectively. After the mapping process, the student model's vector dimension is adjusted to 12288, equal to the combined vector dimensions of two teacher models (4096 + 8192). 
The objective of the first two stages is to enable the student model to effectively learn text represen- tations from multiple teacher models by aligning its output vectors with the corresponding teacher vectors. To achieve this goal, we carefully design three loss functions that progress from a specific to a broader perspective. The first loss function is cosine loss, which is formulated as follows: Lcosine = 2 x 1 - Sx . tr. (1) The Lcosine is designed to minimize the angular difference between student and teacher vectors in the high-dimensional space, with the aim of align- ing their absolute text representations. However, the Lcosine value generally does not converge to zero, suggesting a persistent angular discrepancy between the student and the teachers. Meanwhile, the pointwise signal derived from a single text has a limited optimization direction, which can easily lead to overfitting on the training data. Lsim = MSE(SxSk,TxTÆ)) (2) To complement the limitations of Lcosine, we in- troduce the second loss function, similarity loss, as defined in (2), which models the semantic matching differences between the student and teacher mod- els from a text-pair perspective. This loss function ensures a relatively consistent judgment of simi- larity between the student model and the teacher models, without enforcing an absolute fit between the student model and the teacher model. N 1 Cresim = > MAX(0, ti-tj>tm.tn Sm . Sn - Si Sj + margin) (3) To further leverage relative comparison signals, inspired by CoSENT loss7, we propose the third loss function, relative similarity distillation loss, as defined in (3). For each batch of text data, we em- ploy teacher models to automatically generate soft labels for all text pairs, thereby identifying poten- tial positive and negative samples. 
Subsequently, the student model is trained to ensure that the simi- larity between positive pairs exceeds that between negative pairs, with the margin hyperparameter controlling the degree of this difference. If the batch size is m, the total number of text pairs (i.e., N) is given by C22 . C = \1Lcosine + 12Lsim + AzLresim (4) The final loss £ is a weighted sum of the afore- mentioned three loss functions. where X1,12, and 13 are hyperparameters. The biggest advantage of distillation vectors is that we do not need any supervised data. Without considering resource con- straints, we can use trillions of unsupervised texts 5https: //huggingface. co/nvidia/NV-Emb ed-v2 'https://huggingface.co/dunzhang/stel la_en_1.5B_v5 "https://spaces.ac.cn/archives/8847 Chunk 3: for distillation training to achieve extreme perfor- mance for a given model size. Notably, the main difference between stage 1 and stage 2 lies in the trained parameters. In stage 1, only the fully connected layer (FC1) is trained, whereas in stage 2, both the fully connected layer (FC1) and the last three encoder layers of the stu- dent model are trained. ## 2.4 Stage 3: Dimension Reduction In the first two stages, the student model is trained by learning from the teacher models. Specifically, we concatenate the vectors produced by the two teacher models, resulting in a student model vector with a dimensionality of 12,288 (4,096 + 8,192), which is impractically large. Inspired by MRL (Kusupati et al., 2024), we introduce three addi- tional, independent fully connected layers (FC2, FC3, and FC4) to generate low-dimensionality vec- tors, each achieving a different level of dimension reduction. For instance, by incorporating the fully connected layer FC3 with a shape of (15368, 512), we obtain a more manageable 512-dimensional vec- tor space. 
For the three FC layers, since the dimensions of the reduced vectors do not align with those of the concatenated teacher vector, the Lcosine is omitted and only the Lsim and Cresim are utilized. To en- sure the accuracy of the vectors generated from the FC1 layer (i.e., the 12288-dimensional vec- tors), they continue to be trained using all three loss functions. During this stage, all parameters of the student model are trained. In addition to the previously mentioned dimen- sion reduction method, we present a potentially promising approach to self-distillation, where the aligned vectors from an earlier stage of the student model's training serve as teacher vectors. Specifi- cally, we propose to utilize the 12288-dimensional vectors output from the FC1 layer to serve as teach- ers for the shorter vectors generated by the other three FC layers. This approach provides a unique advantage by enabling the reduction of the dimen- sionality of any embedding model, utilizing only unsupervised data and the model itself. Given that this paper primarily focuses on introducing the training methods of the Stella and Jasper mod- els, we did not conduct experiments to evaluate the specific merits of this proposed approach. ## 2.5 Stage 4: Unlock Multimodal Potential In stage 4, we leverage image-caption pairs as the training dataset, focusing exclusively on training the visual encoder while keeping the other compo- nents frozen. The training process is based on self- distillation, where the caption's vector representa- tion serves as the teacher vector, and the image's vector representation acts as the student vector. All fully connected layers introduced in previous stages are employed to generate multiple pairs of student and teacher vectors. For each pair, we calculate three losses, which are then averaged to obtain the final loss. 
It is important to note that this stage achieves only a preliminary alignment between the text and image modalities, leaving significant room for im- provement. In future work, we aim to further ex- plore and refine the modality alignment process. ## 3 Experiments ## 3.1 Implementation details Our model named Jasper is initialized from stella_en_1.5B_v5 and google/siglip-so400m- patch14-384 (Zhai et al., 2023; Alabdulmohsin et al., 2024). stella_en_1.5B_v5 and NV-Embed-v2 are our teacher models. The total number of parameters in our Jasper model is 1.9B (stella 1.5B parameters and siglip 400M parameters). For hyperparameters, we set X1 = 10, 12 = 200, 13 = 20, margin = 0.015. In all four stages, the model is trained using 8 x RTX A6000 GPUs, with a maximum input length of 512 tokens, mixed precision training (BF16), DeepSpeed ZERO-stage-2, and the AdamW opti- mizer. During stage 1 (distillation training), the batch size is set to 128, the learning rate is 1e-4 per step, and the model checkpoint at step 4000 is selected as the final model. In the case of stage 2 (also distillation training), the batch size remains 128, the learning rate drops to 8e-5 per step, and the final model is the checkpoint at step 7000. For stage 3 (dimension reduction training), the batch size is again 128, the learning rate is adjusted to 7e- 5 per step, and the checkpoint at step 2200 serves as the final model. Lastly, in stage 4 (multimodal training), the batch size is reduced to 90, the learn- ing rate returns to 1e-4 per step, and the final model is chosen from the checkpoint at step 3500. 8This refers to the dimensionality of the encoder layer's hidden state. Chunk 4: Table 1: MTEB Results as of December 24, 2024. We use the original model names on the leaderboard for clarity.

| Model | Model Size | Average(56 datasets) | Classification | Clustering | PairClassification | Reranking | Retrieval | STS | Summarization |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NV-Embed-v2 | 7851M | 72.31 | 90.37 | 58.46 | 88.67 | 60.65 | 62.65 | 84.31 | 30.7 |
| bge-en-icl | 7111M | 71.67 | 88.95 | 57.89 | 88.14 | 59.86 | 62.16 | 84.24 | 30.77 |
| Stella_en_1.5B_v5 | 1543M | 71.19 | 87.63 | 57.69 | 88.07 | 61.21 | 61.01 | 84.51 | 31.49 |
| SFR-Embedding-2_R | 7111M | 70.31 | 89.05 | 56.17 | 88.07 | 60.14 | 60.18 | 81.26 | 30.71 |
| gte-Qwen2-1.5B-instruct | 1776M | 67.16 | 82.47 | 48.75 | 87.51 | 59.98 | 58.29 | 82.73 | 31.17 |
| voyage-lite-02-instruct | 1220M | 67.13 | 79.25 | 52.42 | 86.87 | 58.24 | 56.60 | 85.79 | 31.01 |
| Jasper (our model) | 1543M+400M | 71.54 | 88.49 | 58.04 | 88.07 | 60.91 | 61.33 | 84.67 | 31.42 |

## 3.2 Datasets In stage 1, stage 2 and stage 3, we use fineweb-edu (Lozhkov et al., 2024) as our main text training dataset, which accounts for 80% of the full text data. The remaining 20% of the text data comes from sentence-transformers/embedding-training- data9. The reason we choose the sentence- transformers/embedding-training-data is that the majority of the fineweb-edu data consists of pas- sages. However, in addition to passages, we also require questions to enhance the diversity of our training data. The total amount of text training data is 8 million. For the documents in our dataset, we perform the following actions: 1. We randomly select 30% of the documents and divide them into short texts, each consist- ing of 1 to 10 sentences. 2. We randomly select 0.08% of the text and shuffle the words within it. In stage 4, we use the caption data of BAAI/Infinity-MM (Gu et al., 2024) as our vision training data. ## 3.3 Results We evaluate the proposed Jasper and Stella models on the full MTEB benchmark, which encompasses 15 retrieval datasets, 4 reranking datasets, 12 clas- sification datasets, 11 clustering datasets, 3 pair classification datasets, 10 semantic textual similar- ity datasets, and 1 summarization dataset. Table 1 presents the average score of our Jasper model across the overall performance and seven subcategory tasks of the METB benchmark. We compare our model with other frontier models on the MTEB leaderboard, as well as those with fewer than 2B parameters. Experimental results demon- strate that our Jasper model significantly outper- forms other models with fewer than 2B parameters. Furthermore, despite having only 2B parameters, our model produces results that are comparable to those of models with 7B parameters. ## 4 Discussion ## 4.1 Instruction Robustness Instruction-based embedding models require an in- struction to be prepended to a query or passage dur- ing text encoding. 
Currently, many state-of-the-art text embedding models use instructions to prompt the model and obtain better embeddings. Similar to the usage of large language models (Zhao et al., 2024b), different tasks necessitate different instruc- tions, which is both logical and intuitive. Therefore, the ability to understand instructions is crucial for these text embedding models. Jasper is also an instruction-based embedding model. To demonstrate the impact of different prompts on the Jasper model, we conducted a sim- ple experiment. Specifically, we evaluated Jasper on some short evaluation tasks using similar in- structions generated by GPT-4o. Table 2 lists all the original and modified instructions. Based on the results shown in Table 3, we conclude that our Jasper model is robust to instructions and can accu- rately understand different instructions. ## 4.2 Possible Improvements for Vision Encoding Due to time and resource constraints, we were only able to equip the Jasper model with a basic image encoding capability. Initially, stage 4 was envi- sioned as a fundamental visual-language alignment training phase, with a potential stage 5 involving contrastive learning utilizing a Visual Question An- swering (VQA) dataset. Additionally, we observed oscillatory behavior in our loss function during stage 4. Overall, there is considerable room for enhancement in the multimodal training. ## 5 Conclusion In this paper, we present the distillation-based train- ing procedure for the Jasper model. We have 9https://huggingface.co/datasets/sent ence-transformers/embedding-training-dat a Chunk 5: Table 2: Original instructions and corresponding synonyms.
Original Instruction Synonym of Original Instruction
Classify the sentiment expressed in the given movie review text from the IMDB dataset Determine the sentiment conveyed in the provided movie review text from the IMDB dataset.
Identify the topic or theme of StackExchange posts based on the titles Determine the subject or theme of StackExchange posts based on the titles.
Given a news summary, retrieve other semantically similar summaries Given a news summary, find other summaries with similar meanings.
Retrieve duplicate questions from StackOverflow forum Find duplicate questions on the StackOverflow forum.
Given a title of a scientific paper, retrieve the titles of other relevant papers Given the title of a scientific paper, find the titles of other related papers.
Classify the sentiment of a given tweet as either positive, negative, or neutral Determine the sentiment of a given tweet as positive, negative, or neutral.
Given a claim, find documents that refute the claim Given a claim, locate documents that contradict the claim.
Given a question, retrieve relevant documents that best answer the question Given a question, find relevant documents that best answer it.
Retrieve tweets that are semantically similar to the given tweet Find tweets that have similar meanings to the given tweet.
Retrieve semantically similar text. Find text with similar meanings.
Identify the main category of Medrxiv papers based on the titles Determine the primary category of Medrxiv papers based on the titles.
Retrieve duplicate questions from AskUbuntu forum Find duplicate questions on the AskUbuntu forum.
Given a question, retrieve detailed question descriptions from Stackexchange that are duplicates to the given question Given a question, find detailed question descriptions from Stackexchange that are duplicates.
Identify the main category of Biorxiv papers based on the titles and abstracts Determine the primary category of Biorxiv papers based on the titles and abstracts.
Given a financial question, retrieve user replies that best answer the question Given a financial question, find user replies that best answer it.
Given a online banking query, find the corresponding intents Given an online banking query, identify the corresponding intents.
Identify the topic or theme of the given news articles Determine the subject or theme of the given news articles.
Classify the emotion expressed in the given Twitter message into one of the six emotions: anger, fear, joy, love, sadness, and surprise Given a user utterance as query, find the user intents Determine the emotion expressed in the given Twitter message as one of six emotions: anger, fear, joy, love, sadness, and surprise. Given a user utterance as a query, identify the user intents.
Identify the main category of Biorxiv papers based on the titles Determine the primary category of Biorxiv papers based on the titles.
Classify the given Amazon review into its appropriate rating category Classify the given Amazon review into its appropriate rating category.
Given a scientific claim, retrieve documents that support or refute the claim Given a scientific claim, find documents that support or contradict the claim.
Identify the topic or theme of StackExchange posts based on the given paragraphs Determine the subject or theme of StackExchange posts based on the given paragraphs.
Given a scientific paper title, retrieve paper abstracts that are cited by the given paper Given a scientific paper title, find paper abstracts that are cited by the given paper.
Classify the given comments as either toxic or not toxic Classify the given comments as toxic or non-toxic.
Classify the intent domain of the given utterance in task-oriented conversation Determine the intent domain of the given utterance in task-oriented conversation.
Retrieve duplicate questions from Sprint forum Find duplicate questions on the Sprint forum.
Given a user utterance as query, find the user scenarios Given a user utterance as a query, identify the user scenarios.
Classify the intent of the given utterance in task-oriented conversation Determine the intent of the given utterance in task-oriented conversation.
Classify a given Amazon customer review text as either counterfactual or not-counterfactual Classify a given Amazon customer review text as counterfactual or non-counterfactual.
Identify the main category of Medrxiv papers based on the titles and abstracts Determine the primary category of Medrxiv papers based on the titles and abstracts.
Given a query on COVID-19, retrieve documents that answer the query Given a query on COVID-19, find documents that answer the query.
Table 3: MTEB Results on different instructions.
Task Type Task Name Original Score Score with Modified Instructions
Classification MTOPDomainClassification 0.992 0.992
Classification AmazonCounterfactual Classification 0.958 0.957
Classification TweetSentimentExtractionClassification 0.773 0.776
Classification EmotionClassification 0.877 0.859
Classification MassiveIntentClassification 0.853 0.854
Classification AmazonReviewsClassification 0.629 0.630
Classification MassiveScenarioClassification 0.912 0.912
Classification Banking77Classification 0.873 0.875
Classification ImdbClassification 0.971 0.971
Classification ToxicConversations Classification 0.913 0.910
Classification MTOPIntentClassification 0.915 0.912
Clustering MedrxivClusteringS2S 0.448 0.448
Clustering StackExchangeClusteringP2P 0.494 0.492
Clustering StackExchangeClustering 0.800 0.795
Clustering TwentyNewsgroupsClustering 0.630 0.625
Clustering MedrxivClustering P2P 0.470 0.468
Clustering BiorxivClusteringS2S 0.476 0.475
Clustering BiorxivClusteringP2P 0.520 0.518
PairClassification TwitterURLCorpus 0.877 0.877
PairClassification SprintDuplicateQuestions 0.964 0.964
PairClassification TwitterSemEval2015 0.803 0.801
Reranking StackOverflowDupQuestions 0.546 0.548
Reranking SeiDocsRR 0.891 0.890
Reranking AskUbuntuDupQuestions 0.674 0.676
Retrieval CQADupstackMathematicaRetrieval 0.369 0.370
Retrieval CQADupstackStatsRetrieval 0.413 0.413
Retrieval CQADupstack TexRetrieval 0.362 0.362
Retrieval SCIDOCS 0.247 0.247
Retrieval CQADupstackEnglishRetrieval 0.543 0.543
Retrieval ArguAna 0.653 0.652
Retrieval TRECCOVID 0.865 0.866
Retrieval CQADupstackUnixRetrieval 0.482 0.482
Retrieval CQADupstackGamingRetrieval 0.632 0.633
Retrieval CQADupstackGisRetrieval 0.444 0.448
Retrieval CQADupstack WordpressRetrieval 0.388 0.386
Retrieval FIQA2018 0.601 0.601
Retrieval SeiFact 0.805 0.805
Retrieval CQADupstackPhysicsRetrieval 0.549 0.548
Retrieval NFCorpus 0.431 0.431
Retrieval CQADupstackProgrammersRetrieval 0.505 0.505
Retrieval CQADupstackAndroidRetrieval 0.571 0.571
Retrieval CQADupstack WebmastersRetrieval 0.464 0.464
STS BIOSSES 0.848 0.854
STS STS13 0.897 0.888
STS STS12 0.803 0.804
STS STSBenchmark 0.888 0.886
STS STS15 0.902 0.900
STS STS14 0.853 0.851
STS STS16 0.864 0.869
STS STS22 0.672 0.748
STS SICK-R 0.822 0.823
STS STS17 0.911 0.908
Summarization SummEval 0.313 0.314
Average Score 0.686 0.687
designed three loss functions to distill multiple large teacher embedding models into a student em- bedding model from diverse perspectives. Subse- quently, we utilized a MRL-based training method to reduce the vector dimensionality of the student model. Experimental results on the MTEB demon- strate that our Jasper model achieves state-of-the- art performance at the 2B parameter scale and ex- hibits comparable results to other top-ranked em- bedding models with 7B parameters. Future work will further explore the alignment between multiple modalities. ## References Prabhat Agarwal, Minhazul Islam SK, Nikil Pancha, Kurchi Subhra Hazra, Jiajing Xu, and Chuck Rosen- berg. 2024. Omnisearchsage: Multi-task multi-entity embeddings for pinterest search. In Companion Pro- ceedings of the ACM on Web Conference 2024, WWW 2024, Singapore, Singapore, May 13-17, 2024, pages 121-130. ACM. Ibrahim Alabdulmohsin, Xiaohua Zhai, Alexander Kolesnikov, and Lucas Beyer. 2024. Getting vit in shape: Scaling laws for compute-optimal model de- sign. Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Qianyu Guo, Meng Wang, and Haofen Wang. 2023. Retrieval- augmented generation for large language models: A survey. CoRR, abs/2312.10997. Shuhao Gu, Jialing Zhang, Siyuan Zhou, Kevin Yu, Zhaohu Xing, Liangdong Wang, Zhou Cao, Jintao Jia, Zhuoyi Zhang, Yixuan Wang, Zhenchong Hu, Bo-Wen Zhang, Jijie Li, Dong Liang, Yingli Zhao, Yulong Ao, Yaoqi Liu, Fangxiang Feng, and Guang Liu. 2024. Infinity-mm: Scaling multimodal perfor- mance with large-scale and high-quality instruction data. Sebastian Hofstätter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin, and Allan Hanbury. 2021. Effi- ciently teaching an effective dense retriever with bal- anced topic aware sampling. In SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11-15, 2021, pages 113-122. ACM. 
Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Vik- tor Schlegel, Stefan Winkler, See-Kiong Ng, and Soujanya Poria. 2024. A comprehensive survey of sentence representations: From the BERT epoch to the CHATGPT era and beyond. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL Chunk 6: 2024 - Volume 1: Long Papers, St. Julian's, Malta, March 17-22, 2024, pages 1738-1751. Association for Computational Linguistics. Aditya Kusupati, Gantavya Bhatt, Aniket Rege, Matthew Wallingford, Aditya Sinha, Vivek Ramanu- jan, William Howard-Snyder, Kaifeng Chen, Sham Kakade, Prateek Jain, and Ali Farhadi. 2024. Ma- tryoshka representation learning. Chankyu Lee, Rajarshi Roy, Mengyao Xu, Jonathan Raiman, Mohammad Shoeybi, Bryan Catanzaro, and Wei Ping. 2024. Nv-embed: Improved techniques for training llms as generalist embedding models. arXiv preprint arXiv:2405.17428. Chaofan Li, MingHao Qin, Shitao Xiao, Jianlyu Chen, Kun Luo, Yingxia Shao, Defu Lian, and Zheng Liu. 2024. Making text embedders few-shot learners. Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin. 2021. In-batch negatives for knowledge distillation with tightly-coupled teachers for dense retrieval. In Proceedings of the 6th Workshop on Representation Learning for NLP, RepLANLP@ACL-IJCNLP 2021, Online, August 6, 2021, pages 163-173. Association for Computational Linguistics. Anton Lozhkov, Loubna Ben Allal, Leandro von Werra, and Thomas Wolf. 2024. Fineweb-edu: the finest collection of educational content. Gabriel de Souza P Moreira, Radek Osmulski, Mengyao Xu, Ronay Ak, Benedikt Schifferer, and Even Oldridge. 2024. Nv-retriever: Improving text em- bedding models with effective hard-negative mining. arXiv preprint arXiv:2407.15831. Niklas Muennighoff, Nouamane Tazi, Loïc Magne, and Nils Reimers. 2023. MTEB: massive text embedding benchmark. 
In Proceedings of the 17th Conference of the European Chapter of the Association for Compu- tational Linguistics, EACL 2023, Dubrovnik, Croatia, May 2-6, 2023, pages 2006-2029. Association for Computational Linguistics. Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, and Xuanjing Huang. 2024. Searching for best practices in retrieval-augmented generation. In Proceedings of the 2024 Conference on Empirical Methods in Natu- ral Language Processing, EMNLP 2024, Miami, FL, USA, November 12-16, 2024, pages 17716-17736. Association for Computational Linguistics. Shitao Xiao, Zheng Liu, Peitian Zhang, and Niklas Muennighoff. 2023. C-pack: Packaged resources to advance general chinese embedding. Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, and Lucas Beyer. 2023. Sigmoid loss for language image pre-training. Wayne Xin Zhao, Jing Liu, Ruiyang Ren, and Ji-Rong Wen. 2024a. Dense text retrieval based on pretrained language models: A survey. ACM Trans. Inf. Syst., 42(4):89:1-89:60. Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, Yifan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, and Ji-Rong Wen. 2024b. A survey of large language models. Junjie Zhou, Zheng Liu, Shitao Xiao, Bo Zhao, and Yongping Xiong. 2024. VISTA: visualized text em- bedding for universal multi-modal retrieval. In Pro- ceedings of the 62nd Annual Meeting of the Associa- tion for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11- 16, 2024, pages 3185-3200. Association for Compu- tational Linguistics. ```
With Tensorlake parse output you have accurate, detailed, and precise data that is much more useful for AI Agents compared to only providing the agent with a PDF.

# Chonkie

Source: https://docs.tensorlake.ai/examples/cookbooks/chonkie

Build semantic chunking pipelines that preserve context and respect document structure

[Chonkie](https://github.com/bhavnicksm/chonkie) is a fast, lightweight chunking library that uses embeddings to detect natural topic boundaries. When combined with Tensorlake's document parsing, you get intelligent chunking that respects semantic structure. Perfect for research papers, technical documentation, and dense content.

Combining Chonkie and Tensorlake eliminates broken context in RAG systems where naive chunking splits thoughts mid-sentence or separates tables from explanations.

## Why Use Tensorlake + Chonkie?

**The Problem:**

* Fixed-size chunking breaks sentences mid-thought and splits tables from context
* Token-based splitting ignores document structure (sections, subsections, figures)
* Chunks lose hierarchical meaning, leading to poor retrieval in RAG
* Dense technical documents need semantic boundaries, not arbitrary character limits

**The Solution:**

Tensorlake preserves document structure during parsing. Chonkie uses embeddings to detect where topics naturally shift. Together, they produce context-preserving chunks that align with the author's intent.
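
To make the idea of embedding-based topic boundaries concrete, here is a minimal, self-contained sketch. It is not Chonkie's implementation: the `embed`, `cosine`, and `split_on_topic_shift` helpers are hypothetical names, and a bag-of-words count vector stands in for a real embedding model. It only illustrates the rule that a new chunk starts when the similarity between neighboring sentences falls below a threshold.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_on_topic_shift(sentences: list[str], threshold: float) -> list[list[str]]:
    """Start a new chunk whenever adjacent sentences fall below the threshold."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, curr in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(curr)) < threshold:
            chunks.append([curr])    # similarity dropped: treat as a topic boundary
        else:
            chunks[-1].append(curr)  # same topic: extend the current chunk
    return chunks


sentences = [
    "The model is trained on text data.",
    "The text data is filtered before the model is trained.",
    "Signatures appear on every page of the contract.",
    "The contract requires signatures from buyer and seller.",
]
chunks = split_on_topic_shift(sentences, threshold=0.3)
for chunk in chunks:
    print(" ".join(chunk))
```

With this toy input, the two sentences about model training stay together and the two sentences about contracts form a second chunk. A real pipeline would swap the count vectors for dense embeddings and add size constraints on top of the boundary rule.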
**Key Benefits:**

* **Semantic boundaries** - Chunks align with natural topic transitions, not arbitrary limits
* **Structure preservation** - Respect sections, subsections, and hierarchical organization
* **Better retrieval** - Embeddings capture complete thoughts instead of fragmented text
* **Production-ready** - Handle research papers, contracts, and technical docs with confidence

## Installation

```bash theme={null}
pip install tensorlake "chonkie[model2vec]"
```

## Quick Start

### Step 1: Parse Documents with Tensorlake

Tensorlake extracts structured data, tables, and figures while preserving reading order:

```python theme={null}
from tensorlake.documentai import (
    DocumentAI,
    ParsingOptions,
    StructuredExtractionOptions,
    EnrichmentOptions,
    ChunkingStrategy,
    PageFragmentType
)

# Initialize client
doc_ai = DocumentAI()

# Define schema for structured extraction
research_paper_schema = {
    "title": "ResearchPaper",
    "type": "object",
    "properties": {
        "title": {"type": "string", "description": "Paper title"},
        "authors": {"type": "array", "items": {"type": "string"}},
        "abstract": {"type": "string", "description": "Paper abstract"},
        "keywords": {"type": "array", "items": {"type": "string"}},
        "sections": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "heading": {"type": "string"},
                    "level": {"type": "integer", "description": "Heading level (1-6)"}
                }
            }
        }
    }
}

# Configure parsing
parsing_options = ParsingOptions(
    chunking_strategy=ChunkingStrategy.NONE,  # Let Chonkie handle chunking
    cross_page_header_detection=True
)

structured_extraction = StructuredExtractionOptions(
    schema_name="Research Paper Analysis",
    json_schema=research_paper_schema
)

enrichment_options = EnrichmentOptions(
    figure_summarization=True,
    figure_summarization_prompt="Summarize this figure in the context of the research paper.",
    table_summarization=True,
    table_summarization_prompt="Summarize this table's data and significance."
)

# Parse document
file_path = "https://tlake.link/docs/sota-research-paper"
parse_id = doc_ai.parse(
    file_path,
    parsing_options=parsing_options,
    structured_extraction_options=[structured_extraction],
    enrichment_options=enrichment_options
)
result = doc_ai.wait_for_completion(parse_id)
```

### Step 2: (Optional) Understand Structured Data and Content

Review metadata, tables, and figures extracted by Tensorlake:

```python theme={null}
# Extract metadata
paper_metadata = result.structured_data[0].data if result.structured_data else {}
print(f"Title: {paper_metadata.get('title')}")
print(f"Authors: {', '.join(paper_metadata.get('authors', []))}")

# Get full markdown with preserved structure
full_markdown = result.chunks[0].content if result.chunks else ""

# Extract table summaries
print("\nTable Summaries:")
for page in result.pages:
    for i, fragment in enumerate(page.page_fragments):
        if fragment.fragment_type == PageFragmentType.TABLE:
            print(f"Table {i} (Page {page.page_number}): {fragment.content.summary}")

# Extract figure summaries
print("\nFigure Summaries:")
for page in result.pages:
    for i, fragment in enumerate(page.page_fragments):
        if fragment.fragment_type == PageFragmentType.FIGURE:
            print(f"Figure {i} (Page {page.page_number}): {fragment.content.summary}")
```

**Output Example:**

```
Title: State of the Art Research Paper
Authors: John Doe, Jane Smith

Table Summaries:
Table 0 (Page 3): Comparison of model performance metrics across three datasets
Table 1 (Page 5): Hyperparameter configurations used in experiments

Figure Summaries:
Figure 0 (Page 2): Training loss curve showing convergence after 50 epochs
Figure 1 (Page 4): Confusion matrix demonstrating high classification accuracy
```

### Step 3: Semantic Chunking with Chonkie

Use Chonkie's semantic chunker to create context-preserving chunks:

```python theme={null}
from chonkie import SemanticChunker

# Initialize semantic chunker
chunker = SemanticChunker(
    embedding_model="minishlab/potion-base-8M",
    threshold=0.5,      # Similarity threshold for detecting topic boundaries
    chunk_size=1024,    # Target chunk size in tokens
    min_sentences=2,    # Minimum sentences per chunk
    mode="window"       # Use sliding window for boundary detection
)

# Chunk the markdown
semantic_chunks = []
for chunk in chunker.chunk(full_markdown):
    if chunk.text.strip():
        semantic_chunks.append({
            "text": chunk.text,
            "token_count": chunk.token_count
        })

print(f"Created {len(semantic_chunks)} semantic chunks")
```

**Output:**

```
Created 17 semantic chunks
```

### Step 4: Review Chunk Quality

Inspect chunks to verify they preserve semantic meaning:

```python theme={null}
# Example chunk
print(semantic_chunks[7]["text"])
```

**Example Chunk:**

```markdown theme={null}
## 4. Experimental Results

We evaluated our approach on three benchmark datasets: ImageNet, COCO, and Pascal VOC. Table 1 shows the performance metrics across all datasets. Our method achieves state-of-the-art results, outperforming previous approaches by 3-5% on average.

The key insight is that combining attention mechanisms with residual connections enables the model to focus on relevant features while maintaining gradient flow. Figure 2 illustrates the attention maps learned by our model, showing clear focus on discriminative regions.
```

Notice how the chunk:

* Includes complete thoughts from start to finish
* Keeps tables and figures with their explanations
* Respects section boundaries
* Contains enough context for standalone understanding

## How Semantic Chunking Works

Traditional chunking uses fixed token limits or recursive splitting. This breaks semantic units arbitrarily.

**Semantic chunking changes the approach:**

1. **During parsing**: Tensorlake extracts the full document with preserved structure
2. **Embedding**: Chonkie embeds sentences using a lightweight model (model2vec)
3. **Boundary detection**: Compares embeddings in a sliding window to find where topics shift
4.
**Threshold-based splitting**: When similarity drops below threshold, a new chunk begins
5. **Size constraints**: Respects minimum and maximum chunk sizes while honoring boundaries

The key insight: **Chunk boundaries align with topic transitions**, not arbitrary character counts. This produces embeddings that capture complete ideas.

## Use Cases

### Research Paper Analysis

Parse academic papers with complex sections, tables, and figures. Semantic chunks keep methodology descriptions intact and don't split results from their interpretation.

### Technical Documentation

Process API docs, manuals, and specifications where hierarchical structure matters. Chunks respect code examples, parameter descriptions, and related content.

### Legal Document Processing

Handle contracts and legal briefs where clauses must stay together. Semantic boundaries prevent splitting provisions mid-thought.

### Financial Reports

Parse earnings reports and regulatory filings with dense tables and analysis sections. Keep financial data with its explanatory context.

### Medical Literature

Process clinical studies where methods, results, and conclusions need separate chunks but internal coherence matters.

## Best Practices

### 1. Choose the Right Threshold

Lower thresholds (0.3-0.5) create more chunks with tighter topic focus. Higher thresholds (0.6-0.8) create longer chunks spanning related topics.

```python theme={null}
# Tight topic focus - more chunks
chunker = SemanticChunker(threshold=0.4)

# Broader context - fewer chunks
chunker = SemanticChunker(threshold=0.7)
```

### 2. Balance Chunk Size and Semantics

Set `chunk_size` to match your embedding model's context window. Use `min_sentences` to prevent tiny fragments.

```python theme={null}
chunker = SemanticChunker(
    chunk_size=512,   # For models like text-embedding-3-small
    min_sentences=3,  # Prevent single-sentence chunks
)
```

### 3.
Leverage Tensorlake's Structure

Use Tensorlake's section detection and table summaries to enrich chunks:

```python theme={null}
# Add table summaries to chunks
table_summaries = {}
for page in result.pages:
    for frag in page.page_fragments:
        if frag.fragment_type == PageFragmentType.TABLE:
            table_summaries[page.page_number] = frag.content.summary

# Include summaries when embedding
for chunk in semantic_chunks:
    chunk["metadata"] = {
        "has_table": any(f"Page {p}" in chunk["text"] for p in table_summaries)
    }
```

### 4. Validate Chunk Quality

Check that chunks are semantically complete:

```python theme={null}
def validate_chunk_quality(chunks):
    """Ensure chunks have complete sentences and reasonable length."""
    for i, chunk in enumerate(chunks):
        text = chunk["text"]

        # Check for incomplete sentences
        if not text.strip().endswith((".", "!", "?", '"')):
            print(f"Warning: Chunk {i} may be incomplete")

        # Check token count range
        if chunk["token_count"] < 50:
            print(f"Warning: Chunk {i} is very short ({chunk['token_count']} tokens)")

validate_chunk_quality(semantic_chunks)
```

### 5. Store Chunks with Metadata

Include source information for retrieval:

```python theme={null}
enriched_chunks = []
for i, chunk in enumerate(semantic_chunks):
    enriched_chunks.append({
        "id": f"chunk_{i}",
        "text": chunk["text"],
        "token_count": chunk["token_count"],
        "metadata": {
            "source": paper_metadata.get("title"),
            "authors": paper_metadata.get("authors"),
            "chunk_index": i
        }
    })
```

## Complete Example

Try the full working example with research paper analysis:

Complete code walkthrough with quality validation and embedding examples

## What's Next?
**Use these chunks in a RAG system:**

* [Qdrant Integration](/integrations/qdrant) - Store semantic chunks with metadata
* [ChromaDB Integration](/integrations/chroma) - Add citation tracking to chunks

**Learn more about chunking strategies:**

* [Blog: Fix Broken Context in RAG](https://tlake.link/blog/chonkie) - Deep dive into semantic chunking

## Resources

* [Chonkie Documentation](https://github.com/bhavnicksm/chonkie)
* [Tensorlake Blog: Semantic Chunking](https://tlake.link/blog/chonkie)
* [API Reference](https://tlake.link/docs)

## Need Help?

Join our community to discuss semantic chunking strategies:

* [Slack Community](https://tlake.link/slack)

# Detect Buyer and Seller Signatures with Tensorlake SDK

Source: https://docs.tensorlake.ai/examples/cookbooks/detect-buyer-and-seller-signatures-sdk

Detect and extract buyer and seller signature status from contracts using the Tensorlake Python SDK with structured JSON output.

Since Tensorlake is built with a code-first philosophy, the data you extract is structured and can be easily integrated into your existing workflows.

Here is an example of how to use the Tensorlake Python SDK to parse a real estate purchase agreement and extract the buyer and seller signatures, names, and signature dates.

1. Get your [Tensorlake API key](https://docs.tensorlake.ai/platform/authentication#api-keys)
2. Install the Tensorlake SDK with `pip install tensorlake`

```python signature-detection.py theme={null}
import json

from pydantic import BaseModel
from tensorlake.documentai import (
    DocumentAI,
    ParseStatus,
    ParsingOptions,
    StructuredExtractionOptions
)

# Initialize the client and add your API Key
doc_ai = DocumentAI()

# Upload the document
file_path = "https://tlake.link/lease-agreement"
```

The Python snippet below defines the extraction schema as Pydantic models.
```python signature-detection.py theme={null}
# Create a schema
class Buyer(BaseModel):
    buyer_name: str
    buyer_signature_date: str
    buyer_signed: bool

class Seller(BaseModel):
    seller_name: str
    seller_signature_date: str
    seller_signed: bool

class RealEstateSchema(BaseModel):
    buyer: Buyer
    seller: Seller
```

```python signature-detection.py theme={null}
real_estate_extraction = StructuredExtractionOptions(
    schema_name="real-estate-schema",
    json_schema=RealEstateSchema
)

# Configure parsing options
parsing_options = ParsingOptions(
    signature_detection=True
)

parse_id = doc_ai.parse(
    file=file_path,
    parsing_options=parsing_options,
    structured_extraction_options=[real_estate_extraction],
)
print(f"Parse job submitted with ID: {parse_id}")

# Get the result
result = doc_ai.wait_for_completion(parse_id)

if result.status == ParseStatus.FAILURE:
    print("Parse job failed!")
    exit(1)

print("Successfully parsed the document!")
```

The result will include the extracted data, all of the markdown chunks, and the entire document layout.

```python signature-detection.py theme={null}
print("========Structured Data========")
print(json.dumps(result.structured_data[0].data, indent=2))
print("\n\n")
print("========Chunks========")
for chunk in result.chunks:
    print(chunk.content)
```

The output will be:

```json Structured Data theme={null}
{
  "buyer": {
    "buyer_name": "Nova Ellison",
    "buyer_signature_date": "September 10, 2025",
    "buyer_signed": true
  },
  "seller": {
    "seller_name": "Juno Vega",
    "seller_signature_date": "September 10, 2025",
    "seller_signed": true
  }
}
```

```markdown Markdown Chunks expandable theme={null}
## RESIDENTIAL REAL ESTATE PURCHASE AGREEMENT

I. THE PARTIES.
This Real Estate Purchase Agreement ("Agreement") made on September 20 20 25 , ("Effective Date") between: Buyer: Nova Ellison with a mailing address of 123 Tensor Rd, San Francisco, CA 99999 ("Buyer"), who agrees to buy, and: Seller: Juno Vega with a mailing address of 456 Lake Rd, San Francisco, CA 99999 ("Seller"), who agrees to sell and convey real and personal property as described in Sections II & III. Buyer and Seller are each referred to herein as a "Party" and, collectively, as the "Parties." II. LEGAL DESCRIPTION. The real property is a: (check one) [x] - Single-Family Home [ ] - Condominium [ ] - Planned Unit Development (PUD) [ ] - Duplex [ ] - Triplex [ ] - Fourplex [ ] - Other: Street Address: 789 Solution Ln, San Francisco, CA 99999 Tax Parcel Information: TX-PL-0987-6543 Other Description: 2-story home, 3 bed / 2.5 bath, built 2022 III. PERSONAL PROPERTY. In addition to the real property described in Section II, the Seller shall include the following personal property: None. The described real property in Section II and personal property in Section III shall be collectively known as the "Property." IV. EARNEST MONEY. After acceptance by all Parties, the Buyer agrees to make a payment in the amount of $ 10,000 as consideration by September 15 20 25 ,at 10 :00 [x] AM [ ] PM ("Earnest Money"). The Earnest Money shall be applied to the Purchase Price at Closing e Buyer's Initials TE Seller's Initials IV. - Page 1 of 10 Signature detected Signature detected and subject to the Buyer's ability to perform under the terms of this Agreement. Any Earnest Money accepted [x] is [ ] is not required to be placed in a separate trust or escrow account in accordance with Governing Law. V. PURCHASE PRICE & TERMS. The Buyer agrees to purchase the Property by payment of One Hundred Fifty Thousand US Dollars ($ 150,000 as follows: (check one) [ ] AM [ ] PM. Seller shall [ ] - All Cash Offer. No loan or financing of any kind is required in order to purchase the Property. 
Buyer shall provide Seller written third (3rd) party documentation verifying sufficient funds to close no later than 20 at : ) have three (3) business days after the receipt of such documentation to notify Buyer, in writing, if the verification of funds is not acceptable. If Buyer fails to provide such documentation, or if Seller finds such verification of funds is not acceptable, Seller may terminate this Agreement. Failure of Seller to provide Buyer written notice of objection to such verification shall be considered acceptance of verification of funds. - Bank Financing. The Buyer's ability to purchase the Property is contingent upon the Buyer's ability to obtain financing under the following conditions: (check one) [x] - Conventional Loan [ ] - FHA Loan (Attach Required Addendums) [x] [ ] - VA Loan (Attach Required Addendums) [ ] - Other: . a.) In addition, Buyer agrees, within a reasonable time, to make a good faith loan application with a credible financial institution; b.) If Buyer does not reveal a fact of contingency to the lender and this purchase does not record because of such nondisclosure after initial application, the Buyer shall be in default; c.) On or before September 20 20 25 the Buyer will provide the Seller a letter from a credible financial institution verifying a satisfactory credit report, acceptable income, source of down payment, availability of funds to close, and that the loan approval [ ] is [ ] is not contingent on the lease, sale, or recording of another property; d.) In the event the Buyer fails to produce the aforementioned letter or other acceptable verification by the date above in Section V(c), this Agreement may be terminated at the election of the Seller with written notice provided to the Buyer within 3 _days from said date; e Buyer's Initials ME - Seller's Initials J. Page 2 of 10 Signature detected Signature detected e.) 
Buyer must obtain Seller's approval, in writing, to any change to the letter described in Section V(c) regarding the financial institution, type of financing, or allocation of closing costs; and f.) Buyer agrees to pay all fees and satisfy all conditions, in a timely manner, required by the financial institution for processing of the loan application. Buyer agrees the interest rate offered by lender or the availability of any financing program is not a contingency of this Agreement, so long as Buyer qualifies for the financing herein agreed. Availability of any financing program may change at any time. Any licensed real estate agent hired by either Party is not responsible for representations or guarantees as to the availability of any loans, project and/or property approvals or interest rates. [ ] - Seller Financing. Seller agrees to provide financing to the Buyer under the following terms and conditions: a.) Loan Amount: $ b.) Down Payment: $ c.) Interest Rate (per annum): % d.) Term: [ ] Months [ ] Years e.) Documents: The Buyer shall be required to produce documentation, as required by the Seller, verifying the Buyer's ability to purchase according to the Purchase Price and the terms of the Seller Financing. Therefore, such Seller Financing is contingent upon the Seller's approval of the requested documentation to be provided on or before 20 .The Seller shall have until 20 to approve the Buyer's documentation. In the event Buyer fails to obtain Seller's approval, this Agreement shall be terminated with the Buyer's Earnest Money being returned within five (5) business days. VI. SALE OF ANOTHER PROPERTY. Buyer's performance under this Agreement: (check one) [x] - Shall not be contingent upon selling another property. [ ] - Shall be contingent upon selling another property with a mailing address of , within days from the Effective Date. VII. CLOSING COSTS. 
The costs attributed to the Closing of the Property shall be the responsibility of [x] Buyer [ ] Seller [ ] Both Parties. The fees and costs related to the Closing shall include but not be limited to a title search (including the abstract and any owner's title policy), preparation of the deed, transfer e Buyer's Initials ME - Seller's Initials J. Page 3 of 10 Signature detected Signature detected taxes, recording fees, and any other costs by the title company that is in standard procedure with conducting the sale of a property. VIII. FUNDS AT CLOSING. Buyer and Seller agree that before the recording can take place, funds provided shall be in one (1) of the following forms: cash, interbank electronic transfer, money order, certified check or cashier's check drawn on a financial institution located in the state of Governing Law, or any above combination that permits the Seller to convert the deposit to cash no later than the next business day. IX. CLOSING DATE. This transaction shall close on October 15 , 2025 ,at 1 :00 [ ] AM [x] PM or earlier at the office of a title company to be agreed upon by the Parties ("Closing"). Any extension of the Closing must be agreed upon, in writing, by Buyer and Seller. Real estate taxes, rents, dues, fees, and expenses relating to the Property for the year in which the sale is closed shall be prorated as of the Closing. Taxes due for prior years shall be paid by Seller. X. SURVEY. Buyer may obtain a survey of the Property before the Closing to assure that there are no defects, encroachments, overlaps, boundary line or acreage disputes, or other such matters, that would be disclosed by a survey ("Survey Problems"). The cost of the survey shall be paid by the Buyer. Not later than 10 business days prior to the Closing, Buyer shall notify Seller of any Survey Problems which shall be deemed to be a defect in the title to the Property. Seller shall be required to remedy such defects within 5_business days and prior to the Closing. 
If Seller does not or cannot remedy any such defect(s), Buyer shall have the option of canceling this Agreement, in which case the Earnest Money shall be returned to Buyer. XI. MINERAL RIGHTS. It is agreed and understood that all rights under the soil, including but not limited to water, gas, oil, and mineral rights shall be transferred by the Seller to the Buyer at Closing. XII. TITLE. Seller shall convey title to the property by warranty deed or equivalent. The Property may be subject to restrictions contained on the plat, deed, covenants, conditions, and restrictions, or other documents noted in a Title Search Report. Upon execution of this Agreement by the Parties, Seller will order a Title Search Report, and have it delivered to the Buyer. Upon receipt of the Title Search Report, the Buyer shall have 5 business days to notify the Seller, in writing, of any matters disclosed in the report which are unacceptable to Buyer. Buyer's failure to timely object to the report shall constitute acceptance of the Title Search Report. e Buyer's Initials ME - Seller's Initials, JU. Page 4 of 10 Signature detected Signature detected If any objections are made by Buyer regarding the Title Search Report, mortgage loan inspection, or other information that discloses a material defect, the Seller shall have 3 business days from the date the objections were received to correct said matters. If Seller does not remedy any defect discovered by the Title Search Report, Buyer shall have the option of canceling this Agreement, in which case the Earnest Money shall be returned to Buyer. After Closing, Buyer shall receive an owner's standard form policy of title insurance insuring marketable title in the Property to Buyer in the amount of the Purchase Price, free and clear of the objections and all other title exceptions agreed to be removed as part of this transaction. XIII. PROPERTY CONDITION. 
Seller agrees to maintain the Property in its current condition, subject to ordinary wear and tear, from the time this Agreement comes into effect until the Closing. Buyer recognizes that the Seller, along with any licensed real estate agent(s) involved in this transaction, make no claims as to the validity of any property disclosure information. Buyer is required to perform their own inspections, tests, and investigations to verify any information provided by the Seller. Afterward, the Buyer shall submit copies of all tests and reports to the Seller at no cost. Therefore, Buyer shall hold the right to hire licensed contractors, or other qualified professionals, to further inspect and investigate the Property until October 5 [ ] AM [x] PM. , 2025 ,at 1 :00 After all inspections are completed, Buyer shall have until October 10 20 25 , at 1 : 00 [ ] AM [x] , PM to present any new property disclosures to the Seller in writing. The Buyer and Seller shall have 3 _business days to reach an agreement over any new property disclosures found by the Buyer. If the Parties cannot come to an agreement, this Agreement shall be terminated with the Earnest Money being returned to the Buyer. If the Buyer fails to have the Property inspected or does not provide the Seller with written notice of the new disclosures on the Property, in accordance with this Agreement, Buyer hereby accepts the Property in its current condition and as described in any disclosure forms presented by the Seller. In the event improvements on the Property are destroyed, compromised, or materially damaged prior to Closing, the Agreement may be terminated at Buyer's option. XIV. SELLER'S INDEMNIFICATION. Except as otherwise stated in this Agreement, after recording, the Buyer shall accept the Property AS IS, WHERE IS, with all defects, latent or otherwise. Neither Seller nor their e Buyer's Initials VE - Seller's Initials JU. 
- Page 5 of 10 Signature detected Signature detected licensed real estate agent(s) or any other agent(s) of the Seller, shall be bound to any representation or warranty of any kind relating in any way to the Property or its condition, quality or quantity, except as specifically set forth in this Agreement or any property disclosure, which contains representations of the Seller only, and which is based upon the best of the Seller's personal knowledge. XV. APPRAISAL. Buyer's performance under this Agreement: (check one) [x] - Shall not be contingent upon the appraisal of the Property being equal to or greater than the agreed upon Purchase Price. [ ] - Shall be contingent upon the appraisal of the Property being equal to or greater than the agreed upon Purchase Price. If the Property does not appraise to at least the amount of the Purchase Price, or if the appraisal discovers lender-required repairs, the Parties shall have business days to re-negotiate this Agreement ("Negotiation Period"). In such event the Parties cannot come to an agreement during the Negotiation Period, this Agreement shall terminate with the Earnest Money being returned to the Buyer. XVI. REQUIRED DOCUMENTS. Prior to the Closing, the Parties agree to authorize all necessary documents, in good faith, in order to record the transaction under the conditions required by the recorder, title company, lender, or any other public or private entity. XVII. TERMINATION. In the event this Agreement is terminated, as provided in this Agreement, absent of default, any Earnest Money shall be returned to the Buyer, in-full, within 3 _business days with all parties being relieved of their obligations as set forth herein. XVIII. SEX OFFENDERS. Section 2250 of Title 18, United States Code, makes it a federal offense for sex offenders required to register pursuant to the Sex Offender Registration and Notification Act (SORNA), to knowingly fail to register or update a registration as required. 
State convicted sex offenders may also be prosecuted under this statute if the sex offender knowingly fails to register or update a registration as required, and engages in interstate travel, foreign travel, or enters, leaves, or resides on an Indian reservation. A sex offender who fails to properly register may face fines and up to ten (10) years in prison. Furthermore, if a sex offender knowingly fails to update or register as required and commits a violent federal crime, he or she may face up to thirty (30) years in prison under this statute. The Buyer may seek more information online by visiting https://www.nsopw.gov/. e Buyer's Initials VE - Seller's Initials JU. - Page 6 of 10 Signature detected Signature detected XIX. TIME. Time is of the essence. All understandings between the Parties are incorporated in this Agreement. Its terms are intended by the Parties as a final, complete and exclusive expression of their Agreement with respect to its subject matter and they may not be contradicted by evidence of any prior agreement or contemporaneous oral agreement. XX. BUYER'S DEFAULT. Seller's remedies shall be limited to liquidated damages in the amount of the Earnest Money set forth in Section IV. It is agreed that such payments and things of value are liquidated damages and are Seller's sole and only remedy for Buyer's failure to perform the obligations of this Agreement. The Parties agree that Seller's actual damages in the event of Buyer's default would be difficult to measure, and the amount of the liquidated damages herein provided for is a reasonable estimate of such damages. XXI. SELLER'S DEFAULT. Buyer may elect to treat this Agreement as cancelled, in which case all Earnest Money paid by Buyer hereunder shall be returned and Buyer may recover such damages as may be proper, or Buyer may elect to treat this Agreement as being in full force and effect and Buyer shall have the right to specific performance or damages, or both. XXII. EARNEST MONEY DISPUTE. 
Notwithstanding any termination of this Agreement, the Parties agree that in the event of any controversy regarding the release of the Earnest Money that the matter shall be submitted to mediation as provided in Section XXIII. XXIII. DISPUTE RESOLUTION. Buyer and Seller agree to mediate any dispute or claim arising out of this Agreement, or in any resulting transaction, before resorting to arbitration or court action. a.) Mediation. If a dispute arises, between or among the Parties, and it is not resolved prior to or after recording, the Parties shall first proceed in good faith to submit the matter to mediation. Costs related to mediation shall be mutually shared between or among the Parties. Unless otherwise agreed in mediation, the Parties retain their rights to proceed to arbitration or litigation. b.) Arbitration. The Parties agree that any dispute or claim in law or equity arising between them out of this Agreement or any resulting transaction, which is not settled through mediation, shall be decided by neutral, binding arbitration. The arbitrator is required to be a retired judge or justice, or an attorney with at least five (5) years of residential real estate law experience unless the Parties mutually agree to a different arbitrator. Under arbitration, the Parties shall have the right to discovery in accordance with Governing Law. Judgment upon the award of the arbitrator(s) may be entered into any court having jurisdiction. Enforcement of this Agreement to arbitrate shall be governed by the Federal Arbitration Act. e Buyer's Initials ne - Seller's Initials JU. - Page 7 of 10 Signature detected Signature detected c.) Exclusions. 
The following matters shall be excluded from the mediation and arbitration: (i) a judicial or non-judicial foreclosure or other action or proceeding to enforce a deed, mortgage or installment land sale contract as defined in accordance with Governing Law; (ii) an unlawful detainer action, forcible entry detainer, eviction action, or equivalent; (iii) the filing or enforcement of a mechanic's lien; and (iv) any matter that is within the jurisdiction of a probate, small claims or bankruptcy court. The filing of a court action to enable the recording of a notice of pending action, for order of attachment, receivership, injunction, or other provisional remedies, shall not constitute a waiver or violation of the mediation and arbitration provisions of this Section. XXIV. GOVERNING LAW. This Agreement shall be interpreted in accordance with the laws in the state of California ("Governing Law"). XXV. TERMS AND CONDITIONS OF OFFER. This is an offer to purchase the Property in accordance with the above stated terms and conditions of this Agreement. If at least one, but not all, of the Parties initial such pages, a counteroffer is required until an agreement is reached. Seller has the right to continue to offer the Property for sale and to accept any other offer at any time prior to notification of acceptance. If this offer is accepted and Buyer subsequently defaults, Buyer may be responsible for payment of licensed real estate agent(s) compensation. This Agreement and any supplement, addendum or modification, including any copy, may be signed in two or more counterparts, all of which shall constitute one and the same writing. XXVI. BINDING EFFECT. This Agreement shall be for the benefit of, and be binding upon, the Parties, their heirs, successors, legal representatives, and assigns, which therefore, constitutes the entire agreement between the Parties. No modification of this Agreement shall be binding unless signed by both Buyer and Seller. XXVII. SEVERABILITY. 
In the event any provision or part of this Agreement is found to be invalid or unenforceable, only that particular provision or part so found, and not the entire Agreement, will be inoperative. XXVIII. OFFER EXPIRATION. This offer to purchase the Property as outlined in this Agreement shall be deemed revoked and the Earnest Money shall be returned unless this Agreement is signed by Seller and a copy of this Agreement is personally given to the Buyer by September 10 , 2025 ,at 6 :00 [ ] AM [x] PM. XXIX. ACCEPTANCE. Seller warrants that Seller is the owner of the Property or has the authority to execute this Agreement. Therefore, by the Seller's authorization below, he/she/they accepts the above offer and agrees to sell the Property on the above terms and conditions and agrees to the agency e Buyer's Initials 18. - Seller's Initials JU. - Page 8 of 10 Signature detected Signature detected relationships in accordance with any agreement(s) made with licensed real estate agent(s). Seller has read and acknowledges receipt of a copy of this Agreement and authorizes any licensed real estate agent(s) to deliver a signed copy to the Buyer. Delivery may be in any of the following: (i) hand delivery; (ii) email under the condition that the Party transmitting the email receives electronic confirmation that the email was received to the intended recipient; and (iii) by facsimile to the other Party or the other Party's licensee, but only if the transmitting fax machine prints a confirmation that the transmission was successful. XXX. LICENSED REAL ESTATE AGENT(S). If Buyer or Seller have hired the services of licensed real estate agent(s) to perform representation on their behalf, he/she/they shall be entitled to payment for their services as outlined in their separate written agreement. XXXI. DISCLOSURES. It is acknowledged by the Parties that: (check one) [x] - There are no attached addendums or disclosures to this Agreement. 
[ ] - The following addendums or disclosures are attached to this Agreement: (check all that apply) [ ] - Lead-Based Paint Disclosure Form [ ] - . [ ] - [ ] - . ## XXXII. ADDITIONAL TERMS AND CONDITIONS. ### Figure None XXXIII. ENTIRE AGREEMENT. This Agreement together with any attached addendums or disclosures shall supersede any and all other prior understandings and agreements, either oral or in writing, between the Parties with respect to the subject matter hereof and shall constitute the sole and only agreements between the Parties with respect to the said Property. All prior negotiations and agreements between the Parties with respect to the Property hereof are merged into this Agreement. Each Party to this Agreement acknowledges that no representations, inducements, promises, or agreements, orally or otherwise, have been made by any Party or by anyone acting on behalf of any Party, which are not embodied in this Agreement and that any agreement, statement or promise that is not contained in this Agreement shall not be valid or binding or of any force or effect. e Buyer's Initials VE - Seller's Initials JU. - Page 9 of 10 Signature detected Signature detected XXXIV. EXECUTION. Buyer Signature: Kova Ellison Date: e: September 10, 2025 Print Name: Nova Ellison Buyer Signature: Date: Print Name: Seller Signature: Juno Vego Date: September 10, 2025 Print Name: Juno Vega Seller Signature: Date: Print Name: Agent Signature: Aster Polaris Date: September 10, 2025 Print Name: Aster Polaris - Polaris Group LLC Agent Signature: Date: Print Name: e Page 10 of 10 Signature detected Signature detected Signature detected
```

With this output you get precision and automation with very little code, and the pipeline can be extended with webhooks, LangGraph agents, or downstream CRM updates.
# Outlines
Source: https://docs.tensorlake.ai/examples/cookbooks/outlines

Build schema-enforced document extraction pipelines with guaranteed valid outputs

[Outlines](https://outlines-dev.github.io/outlines/) is a library that enables structured text generation with language models by constraining outputs to match JSON schemas or Pydantic models. When combined with Tensorlake's document parsing, you get a complete pipeline: Tensorlake extracts structured data from complex documents, and Outlines guarantees that LLM-generated outputs always match your schema.

This integration is particularly powerful for production document AI pipelines where schema violations break downstream systems.

## Why Use Tensorlake + Outlines?

**The Problem:**

* LLMs return malformed JSON, mix up date formats, and hallucinate values
* Traditional solutions (regex cleanup, validation scripts, retry loops) don't scale
* Production pipelines break when outputs don't match expected schemas

**The Solution:** Outlines enforces schema constraints **during generation**, not after. Instead of hoping a model follows instructions, constrained decoding guarantees every output matches your JSON Schema or Pydantic model.
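To make "during generation, not after" concrete, here is a toy sketch of the idea behind constrained decoding: at each step, candidate tokens that would make the output invalid are masked before one is picked. This is not Outlines' implementation. The vocabulary, the scores, and the helper names (`is_valid_prefix`, `constrained_decode`) are hypothetical, and the "schema" is just a regular expression for a decimal number.

```python
import re

# Hypothetical "schema": the output must be a decimal number such as 3000.0.
# A string is a valid prefix if a valid number could still start with it.
VALID_PREFIX = re.compile(r"\d+(\.\d*)?")


def is_valid_prefix(text: str) -> bool:
    return VALID_PREFIX.fullmatch(text) is not None


def constrained_decode(step_scores: list[dict[str, float]]) -> str:
    """Greedy decoding that masks tokens violating the constraint.

    step_scores stands in for a model: one {token: score} dict per step.
    """
    output = ""
    for scores in step_scores:
        # Mask: keep only tokens that leave the output a valid prefix.
        allowed = {
            tok: score
            for tok, score in scores.items()
            if is_valid_prefix(output + tok)
        }
        if not allowed:
            break  # no valid continuation in this toy vocabulary
        output += max(allowed, key=allowed.get)
    return output


# The highest-scoring token at the first three steps ("$", ",", "USD")
# is invalid for a number and gets masked.
fake_model = [
    {"$": 0.9, "3": 0.8, "three": 0.5},
    {",": 0.7, "000": 0.6},
    {"USD": 0.9, ".": 0.4},
    {"0": 0.9, "-": 0.2},
]
result = constrained_decode(fake_model)
print(result)  # 3000.0
```

Without the mask, greedy decoding over the same scores would start with `$` and produce a value that fails numeric parsing downstream.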
**Key Benefits:**

* **Guaranteed valid JSON** on every run - no parsing failures
* **Type-safe outputs** that match your Pydantic models exactly
* **No post-processing** - outputs are ready for downstream systems
* **Production-ready** - eliminates an entire class of pipeline failures

## Installation

The Quick Start below also imports the `openai` client, so install it alongside the two libraries:

```bash theme={null}
pip install tensorlake outlines openai
```

## Quick Start

### Step 1: Parse Documents with Tensorlake

Tensorlake converts your documents into structured fragments with metadata:

```python theme={null}
from tensorlake.documentai import DocumentAI, ParseStatus

doc_ai = DocumentAI()
file_id = doc_ai.upload("invoice.pdf")
result = doc_ai.parse_and_wait(file_id)
assert result.status == ParseStatus.SUCCESSFUL

# Combine parsed chunks into structured text
document_text = '\n'.join([frag.content for frag in result.chunks])
```

### Step 2: Define Your Schema

Use Pydantic to describe the fields you expect:

```python theme={null}
from pydantic import BaseModel, Field

class Invoice(BaseModel):
    invoice_number: str = Field(description="Invoice number on the invoice")
    issue_date: str = Field(description="Date when invoice was issued")
    due_date: str = Field(description="Payment due date")
    vendor_name: str = Field(description="Name of the vendor/seller")
    total_amount: float = Field(description="Total amount to be paid")
```

This schema becomes the contract for your pipeline. Downstream code can rely on this shape without extra validation.
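As a sketch of what relying on this shape can look like, the hypothetical helper below computes payment terms from an extracted invoice without defensive checks. It assumes the record has already been validated against `Invoice` and that both dates are ISO 8601 strings (`YYYY-MM-DD`). The schema itself only declares them as `str`, so that format is an assumption you would need to enforce in your prompt or schema. The sample values are made up.

```python
from datetime import date


def payment_terms_days(invoice: dict) -> int:
    """Days between issue and due date; assumes ISO 8601 date strings."""
    issued = date.fromisoformat(invoice["issue_date"])
    due = date.fromisoformat(invoice["due_date"])
    return (due - issued).days


# Hypothetical record in the shape of Invoice.model_dump()
record = {
    "invoice_number": "INV-0001",
    "issue_date": "2025-05-10",
    "due_date": "2025-06-10",
    "vendor_name": "Example Vendor",
    "total_amount": 3000.0,
}
terms = payment_terms_days(record)
print(f"{record['invoice_number']}: net {terms} days, total {record['total_amount']:.2f}")
```

The helper indexes fields directly because the schema guarantees they are present; a date in another format would still raise `ValueError`, which is why the format assumption matters.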
### Step 3: Create Few-Shot Examples

Help the model understand the extraction pattern:

```python theme={null}
from outlines import Template

examples = [
    {
        "document": "Invoice #: INV-1234\nDate: 2025-01-15\nDue: 2025-02-15\nVendor: Tech Solutions Inc.\nTotal: $5,250.00",
        "json": '{"invoice_number": "INV-1234", "issue_date": "2025-01-15", "due_date": "2025-02-15", "vendor_name": "Tech Solutions Inc.", "total_amount": 5250.00}'
    },
    {
        "document": "Invoice Number: 2024-0567\nIssued: March 10, 2024\nPayment Due: April 10, 2024\nFrom: Office Supplies Co.\nAmount Due: 1,875.50",
        "json": '{"invoice_number": "2024-0567", "issue_date": "2024-03-10", "due_date": "2024-04-10", "vendor_name": "Office Supplies Co.", "total_amount": 1875.50}'
    }
]

# Create the extraction template
invoice_extraction_prompt = Template.from_string(
    """
{% for example in examples %}
DOCUMENT:
{{example.document}}
JSON:
{{example.json}}
{% endfor %}
DOCUMENT:
{{document}}
JSON:"""
)
```

### Step 4: Extract with Schema Enforcement

```python theme={null}
import outlines
import openai

# Generate the prompt with your document
prompt = invoice_extraction_prompt(document=document_text, examples=examples)

# Use Outlines with OpenAI
model = outlines.from_openai(openai.OpenAI(), "gpt-4o-mini")

# Pass the Pydantic model as the output type so generation is constrained to the schema
answer = model(prompt, Invoice)

# Parse the result
invoice = Invoice.model_validate_json(answer)
print(invoice.model_dump())
```

## Output

The result is clean, type-safe, and ready for downstream systems:

```json theme={null}
{
  "invoice_number": "INV-2387",
  "issue_date": "2025-05-10",
  "due_date": "2025-06-10",
  "vendor_name": "Acme Supplies Ltd.",
  "total_amount": 3000.0
}
```

No regex cleanup, no retries, no manual corrections.

## How Outlines Enforces Schema Constraints

Language models normally work by outputting probability distributions over the next token at each step. Nothing stops the model from outputting invalid JSON or incorrect types.

**Outlines changes the decoding loop:**

1. Builds a finite state machine (FSM) that represents all valid outputs for your schema
2. At each decoding step, masks out any tokens that would violate the schema
3. Only allows tokens that keep the output valid

For example:

* If your schema says `"total_amount"` must be a number, Outlines prunes away every token that isn't a digit, decimal point, or valid number continuation
* If your schema requires valid JSON, the FSM ensures braces, commas, and quotes are placed correctly - preventing `{,]` or unclosed strings

The constraint happens **during generation**, not after. That's why every output from Outlines is guaranteed to be valid.

## Use Cases

### Financial Document Processing

Extract structured data from invoices, receipts, and financial statements with guaranteed field presence and type safety.

### Contract Analysis

Parse contracts with complex schemas where missing fields or type mismatches break downstream workflows.

### Insurance Claims Processing

Extract claim data with validated amounts, dates, and classifications in the form downstream systems expect.

### Legal Document Review

Structure legal documents into typed objects that can be safely stored in databases and processed by analytics pipelines.

## Best Practices

### 1. Design Schemas Carefully

Keep schemas as simple as possible with low nesting levels. Experiment with different schema keys and descriptions.

### 2. Filter Before Extraction

Use Tensorlake's page classification and fragment typing to discard irrelevant sections (like footers or signatures) before passing text to the model. This reduces noise and improves accuracy.

### 3. Validate Twice

Outlines guarantees schema validity during decoding, but validate again downstream with Pydantic as an extra safety net before writing to databases.

### 4. Handle Missing Values Explicitly

Instead of letting models hallucinate, define optional fields in your schema so the absence of data is captured cleanly:

```python theme={null}
from typing import Optional

from pydantic import BaseModel

class Invoice(BaseModel):
    invoice_number: str
    issue_date: str
    due_date: Optional[str] = None  # Optional field
    vendor_name: str
    total_amount: float
```

### 5. Benchmark Cost and Latency

Constrained decoding has overhead, especially for large schemas. Measure the trade-offs between schema complexity and generation speed.

## Complete Example

Try the full working example in our Colab notebook, a complete code walkthrough with an invoice extraction example.

## Resources

* [Outlines Documentation](https://outlines-dev.github.io/outlines/)
* [Outlines GitHub](https://github.com/outlines-dev/outlines)
* [Tensorlake Blog: Schema-Enforced Pipelines](https://tlake.link/blog/outlines)
* [API Reference](https://tlake.link/docs)

## Need Help?

Join our community to discuss schema-enforced document pipelines:

* [Slack Community](https://tlake.link/slack)

# Parse Resumes with Tensorlake
Source: https://docs.tensorlake.ai/examples/cookbooks/resume-parsing

Extract skills, experience, and education from resumes into structured data with Tensorlake. Useful for ATS, candidate screening, and talent databases.

Resume parsing is crucial in modern hiring workflows where recruiters deal with hundreds or thousands of resumes. Automating the extraction of key information (skills, experience, education) saves time and enables efficient candidate screening. It also powers recommendation engines and applicant tracking systems (ATS), and helps maintain structured databases of talent profiles.

Here is an example of how to use the Tensorlake Python SDK to extract structured data when parsing candidate resumes.

1. Get your [Tensorlake API key](https://docs.tensorlake.ai/platform/authentication#api-keys)
2.
Install the Tensorlake SDK with `pip install tensorlake`

```python theme={null}
# Import libraries
from tensorlake.documentai import DocumentAI
from tensorlake.documentai.models import (
    ParsingOptions,
    StructuredExtractionOptions,
    ParseStatus
)
from tensorlake.documentai.models.enums import ChunkingStrategy
import time
import json

# Create a Tensorlake Client
doc_ai = DocumentAI()

# Reference to a resume that you want to parse
file_path = 'https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/jakes-resume.pdf'
```

Define a JSON schema to extract relevant information from the resume.

```python theme={null}
structured_schema = {
    "title": "ResumeInfo",
    "type": "object",
    "properties": {
        "candidateName": { "type": "string" },
        "email": { "type": "string" },
        "phone": { "type": "string" },
        "address": { "type": "string" },
        "professionalSummary": { "type": "string" },
        "skills": {
            "type": "array",
            "items": { "type": "string" }
        },
        "workExperience": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "jobTitle": { "type": "string" },
                    "companyName": { "type": "string" },
                    "location": { "type": "string" },
                    "startDate": { "type": "string" },
                    "endDate": { "type": "string" },
                    "description": { "type": "string" }
                }
            }
        },
        "education": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "degree": { "type": "string" },
                    "fieldOfStudy": { "type": "string" },
                    "institution": { "type": "string" },
                    "location": { "type": "string" },
                    "graduationDate": { "type": "string" }
                }
            }
        }
    }
}
```

```python theme={null}
# Configure parsing with structured schema
parsing_options = ParsingOptions(
    chunking_strategy=ChunkingStrategy.PAGE
)

structured_extraction_options = StructuredExtractionOptions(
    schema_name="Candidate Resume",
    json_schema=structured_schema  # schema for structured extraction
)

# Parse the document with the specified extraction options for structured data
parse_id = doc_ai.parse(
    file_path,
    parsing_options=parsing_options,
    structured_extraction_options=[structured_extraction_options],
)
print(f"Parse job submitted with ID: {parse_id}")

# Wait for completion
result = doc_ai.wait_for_completion(parse_id)
```

The result will include the extracted data, all of the markdown chunks, and the entire document layout.

```python theme={null}
# Print the structured data output
print(json.dumps(result.structured_data[0].data, indent=2))

# Get the markdown from extracted data
for index, chunk in enumerate(result.chunks):
    print(f"Chunk {index}:")
    print(chunk.content)
```

The output will be:

```json Structured Data Outputs theme={null}
{ "data": { "address": null, "candidateName": "Jake Ryan", "education": [ { "degree": "Bachelor of Arts", "fieldOfStudy": "Computer Science, Minor in Business", "graduationDate": "May 2021", "institution": "Southwestern University", "location": "Georgetown, TX" }, { "degree": "Associate's in Liberal Arts", "fieldOfStudy": null, "graduationDate": "May 2018", "institution": "Blinn College", "location": "Bryan, TX" } ], "email": "jake@su.edu", "phone": "123-456-7890", "professionalSummary": null, "skills": [ "Java", "Python", "C/C++", "SQL (Postgres)", "JavaScript", "HTML/CSS", "R", "React", "Node.js", "Flask", "JUnit", "WordPress", "Material-UI", "FastAPI", "Git", "Docker", "TravisCI", "Google Cloud Platform", "VS Code", "Visual Studio", "PyCharm", "IntelliJ", "Eclipse", "pandas", "NumPy", "Matplotlib" ], "workExperience": [ { "companyName": "Texas A&M University", "description": "• Developed a REST API using FastAPI and PostgreSQL to store data from learning management systems • Developed a full-stack web application using Flask, React, PostgreSQL and Docker to analyze GitHub data • Explored ways to visualize GitHub collaboration in a classroom setting", "endDate": null, "jobTitle": "Undergraduate Research Assistant", "location": "College Station, TX", "startDate": "June 2020" }, { "companyName": null, "description": "• Explored methods to generate video game dungeons
based off of The Legend of Zelda Georgetown, TX • Developed a game in Java to test the generated dungeons • Contributed 50K+ lines of code to an established codebase via Git • Conducted a human subject study to determine which video game dungeon generation technique is enjoyable • Wrote an 8-page paper and gave multiple presentations on-campus • Presented virtually to the World Conference on Computational Intelligence", "endDate": "July 2019", "jobTitle": "Artificial Intelligence Research Assistant", "location": "Georgetown, TX", "startDate": "May 2019" }, { "companyName": "Georgetown, TX", "description": "• Communicate with managers to set up campus computers used on campus • Assess and troubleshoot computer problems brought by students, faculty and staff • Maintain upkeep of computers, classroom equipment, and 200 printers across campus", "endDate": null, "jobTitle": "Information Technology Support Specialist", "location": "Georgetown, TX", "startDate": "Sep. 2018" } ] }, "page_numbers": [1], "schema_name": "Candidate Resume" } ``` ```md Markdown Chunks expandable theme={null} Chunk 0: ## Jake Ryan 123-456-7890 | \underline{jake@su.edu} | \underline{linkedin.com/in/jake} | \underline{github.com/jake} ## EDUCATION Southwestern University Bachelor of Arts in Computer Science, Minor in Business Blinn College Associate's in Liberal Arts Georgetown, TX Aug. 2018 – May 2021 Bryan, TX Aug. 2014 – May 2018 ## EXPERIENCE ## Undergraduate Research Assistant ## Texas A&M University June 2020 – Present College Station, TX • Developed a REST API using FastAPI and PostgreSQL to store data from learning management systems • Developed a full-stack web application using Flask, React, PostgreSQL and Docker to analyze GitHub data • Explored ways to visualize GitHub collaboration in a classroom setting ## Information Technology Support Specialist Sep. 
2018 – Present Georgetown, TX • Communicate with managers to set up campus computers used on campus • Assess and troubleshoot computer problems brought by students, faculty and staff • Maintain upkeep of computers, classroom equipment, and 200 printers across campus ## Artificial Intelligence Research Assistant May 2019 – July 2019 • Explored methods to generate video game dungeons based off of The Legend of Zelda Georgetown, TX • Developed a game in Java to test the generated dungeons • Contributed 50K+ lines of code to an established codebase via Git • Conducted a human subject study to determine which video game dungeon generation technique is enjoyable • Wrote an 8-page paper and gave multiple presentations on-campus • Presented virtually to the World Conference on Computational Intelligence ## PROJECTS Gitlytics | Python, Flask, React, PostgreSQL, Docker June 2020 – Present • Developed a full-stack web application using with Flask serving a REST API with React as the frontend • Implemented GitHub OAuth to get data from user’s repositories • Visualized GitHub data to show collaboration • Used Celery and Redis for asynchronous tasks Simple Paintball | Spigot API, Java, Maven, TravisCI, Git May 2018 – May 2020 • Developed a Minecraft server plugin to entertain kids during free time for a previous job • Published plugin to websites gaining 2K+ downloads and an average 4.5/5-star review • Implemented continuous delivery using TravisCI to build the plugin upon new a release • Collaborated with Minecraft server administrators to suggest features and get feedback about the plugin ## TECHNICAL SKILLS Languages: Java, Python, C/C++, SQL (Postgres), JavaScript, HTML/CSS, R Frameworks: React, Node.js, Flask, JUnit, WordPress, Material-UI, FastAPI Developer Tools: Git, Docker, TravisCI, Google Cloud Platform, VS Code, Visual Studio, PyCharm, IntelliJ, Eclipse Libraries: pandas, NumPy, Matplotlib ``` With Tensorlake parse output you have accurate, detailed, and precise data 
that is reliable for quick filtering of candidates. # Alert Slack if the Buyer Has Not Signed Source: https://docs.tensorlake.ai/examples/cookbooks/signature-detection/alert-slack-buyer-not-signed Recipe — send a Slack alert when Tensorlake's signature detection shows the buyer's signature is missing from a contract. ## Goal Send a Slack alert if the buyer's signature is missing from the document. ## Input Structured JSON output from Tensorlake (with signature detection enabled). ## Output Slack message to notify a team. ```python theme={null} import json import os import requests def notify_slack(msg): webhook_url = os.getenv("SLACK_WEBHOOK_URL") requests.post(webhook_url, json={"text": msg}) with open("output/signature-check.json") as f: data = json.load(f) buyer = data["outputs"]["structured_data"]["pages"][0]["data"].get("buyer", {}) if not buyer.get("buyer_signed"): name = buyer.get("buyer_name", "Unknown") notify_slack(f"🚨 Buyer {name} has not signed the document.") ``` # Detect Buyer and Seller Signatures with the Tensorlake Playground Source: https://docs.tensorlake.ai/examples/cookbooks/signature-detection/detect-buyer-and-seller-signatures-playground Extract buyer/seller names, signature status, and signature dates from a real estate purchase agreement without writing any code, using the Tensorlake Playground. The [Tensorlake Playground](https://cloud.tensorlake.ai) is the easiest way to explore and understand how our document parsing framework works, even before writing any code. For this example, we'll upload a real estate purchase agreement and configure a parsing pipeline that extracts key information like the buyer and seller names, signature status, and signature dates. Try it out yourself in the [Tensorlake Playground](https://cloud.tensorlake.ai) by following these steps: 1. **Document Upload**: Upload the PDF document you want to parse. 
If you don't have a document with signatures to test, you can [download a sample agreement here](https://drive.google.com/uc?export=download\&id=1YuMjScDloX6DFjUewuY0JyL8v4gzrSXX).
2. **Structured Extraction**: Define what you want to extract using the schema builder or JSON. This includes specifying the fields for buyer and seller names, signature status, and signature dates. Here is the schema for this example:

   ```json real-estate-schema.json theme={null}
   {
     "properties": {
       "buyer": {
         "properties": {
           "buyer_name": { "type": "string" },
           "buyer_signature_date": { "type": "string" },
           "buyer_signed": { "type": "boolean" }
         },
         "type": "object"
       },
       "seller": {
         "properties": {
           "seller_name": { "type": "string" },
           "seller_signature_date": { "type": "string" },
           "seller_signed": { "type": "boolean" }
         },
         "type": "object"
       }
     },
     "title": "real_estate_purchase_agreement",
     "type": "object"
   }
   ```

3. **Extraction and Parsing Options**: Configure options like page range, chunking strategy, table summarization, and signature detection. For this example, we skip OCR because we want to detect the handwritten signature rather than just convert it to text. We also parse all pages and enable signature detection.
4. As output, you will get a markdown version of your document, a markdown preview, a JSON file with all of the extracted data, and a JSON file with the structured data you defined.
   For this example, we focus on the JSON file with the extracted data, including the buyer and seller names, signature dates, and whether they signed the agreement:

   ```json output.json theme={null}
   {
     "pages": [
       {
         "page_number": 0,
         "json_result": {
           "buyer": {
             "buyer_name": "Nova Ellison",
             "buyer_signature_date": "September 10, 2025",
             "buyer_signed": true
           },
           "seller": {
             "seller_name": "Juno Vega",
             "seller_signature_date": "September 10, 2025",
             "seller_signed": true
           }
         }
       }
     ]
   }
   ```

This output gives you precise, automatable results with very little code, and it can be extended with webhooks, LangGraph agents, or even downstream CRM updates.

# Route to Manual Review if the Seller Has Not Signed
Source: https://docs.tensorlake.ai/examples/cookbooks/signature-detection/route-seller-not-signed

Recipe: branch on Tensorlake's structured parsing output to trigger a human review queue when the seller signature is missing.

## Goal

Branch on the parsing output and route the document to manual review if the seller's signature is missing.

## Input

Tensorlake structured parsing output.

## Output

A printed message or a webhook to trigger a human review queue.

```python theme={null}
import json

def route_to_manual_review():
    print("Routing document to human review queue...")

# Load the saved parse output (same file as the Slack alert recipe; adjust the path as needed)
with open("output/signature-check.json") as f:
    data = json.load(f)

seller = data["outputs"]["structured_data"]["pages"][0]["data"].get("seller", {})
if not seller.get("seller_signed"):
    route_to_manual_review()
```

# Extract Structured Data from Images
Source: https://docs.tensorlake.ai/examples/cookbooks/structured-extraction-from-images

Build an application that extracts structured data from driver's license images using OpenAI's vision model

This example demonstrates how to build a Tensorlake application that extracts structured data from driver's license images using OpenAI's vision model.
**What you'll learn:** * Processing multimodal data (images) in Tensorlake * Using custom dependencies with Docker images * Managing API secrets securely * Extracting structured data with Pydantic models ```python structured_extraction.py theme={null} import os import base64 import requests from pydantic import BaseModel from tensorlake.applications import application, function, Image, RequestError # Install dependencies for your application image = Image().run("pip install openai pydantic requests") # List of secrets required by the application. # The application expects to find these secrets in the environment. secrets = ["OPENAI_API_KEY"] class DrivingLicense(BaseModel): name: str date_of_birth: str address: str license_number: str license_expiration_date: str @application() @function(image=image, secrets=secrets) def extract_driving_license_data(url: str) -> DrivingLicense: from openai import OpenAI # Download image from URL http_response = requests.get(url) http_response.raise_for_status() # Encode image as base64 image_base64 = base64.b64encode(http_response.content).decode("utf-8") # Determine image format from content type or URL content_type = http_response.headers.get("content-type", "") if "jpeg" in content_type or "jpg" in content_type: image_format = "jpeg" elif "png" in content_type: image_format = "png" else: # Default to jpeg if can't determine image_format = "jpeg" # Extract structured data using OpenAI's vision model openai = OpenAI(api_key=os.getenv("OPENAI_API_KEY")) completion = openai.beta.chat.completions.parse( model="gpt-4o-mini", messages=[ { "role": "system", "content": "Extract the personal information from the driving license image.", }, { "role": "user", "content": [ { "type": "image_url", "image_url": { "url": f"data:image/{image_format};base64,{image_base64}" }, } ], }, ], response_format=DrivingLicense, ) license_data: DrivingLicense = completion.choices[0].message.parsed return license_data ``` Building custom images allows you to 
install pretty much anything you want in your function's environment. Take a look at the [Dependency management](/applications/images) guide to learn more about it. Before we deploy this application on Tensorlake, we need to make sure the function can access the secret api key. You can do this by running the `tl secrets` command in your terminal: ```bash theme={null} tl secrets set OPENAI_API_KEY= ``` If you want to learn more about how we manage secrets, take a look at the [Secrets management](/applications/secrets) guide. Now that you've defined your custom image, and set the secret api key, you can deploy the application: ```bash theme={null} tl deploy structured_extraction.py ``` You should see the tensorlake stream build logs as your image is being built. Once the image is built, you can invoke the application as before. ```bash theme={null} curl -N -X POST https://api.tensorlake.ai/applications/extract_driving_license_data \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ --json '"https://tlake.link/dl"' ``` The response will contain the extracted structured data: ```json theme={null} { "name": "John Doe", "date_of_birth": "1990-01-15", "address": "123 Main St, City, State 12345", "license_number": "D1234567", "license_expiration_date": "2025-01-15" } ``` # Examples Source: https://docs.tensorlake.ai/examples/overview Tutorials, cookbooks, and use cases for building with Tensorlake. Explore hands-on examples for building agentic applications and document processing pipelines on Tensorlake. Want an example added? [Let us know in Slack](https://tlake.link/slack). *** ## Agentic Applications Multi-agent research pipeline with parallel web search and report synthesis. Execute LLM-generated code in isolated containers with data science libraries. Claude agentic loop that chains tool calls, each in its own container. Parse bank statements and answer spending questions with Claude. Serverless web crawler that scrapes websites N levels deep using headless Chrome. 
Conversational weather agent powered by Claude, deployed as an HTTP API. *** ## Document Processing Extract structured data from images using Tensorlake Applications. Detect, validate, and route based on signatures. Parse resumes and extract structured candidate data. Query parsed SEC filings with Databricks. Ask natural language questions about contracts using LangGraph. # Product Scraper Source: https://docs.tensorlake.ai/examples/tutorials/product-scraper Learn how to leverage secrets and images on Tensorlake Serverless by building a product-scraping workflow. This tutorial uses the legacy Graph API (`Graph`, `tensorlake_function`). For new projects, use the current `@application()` / `@function()` API — see the [Quickstart](/applications/quickstart). In this tutorial, we will: 1. Create a Tensorlake Graph 2. Test Locally 3. Define Dependencies and Secrets 4. Deploy to Tensorlake Serverless 5. Invoke the Graph Remotely 6. Troubleshoot Remote Executions Let's create a simple workflow that scrapes an e-commerce product page, summarizes the product details, and extracts some structured information about the product. ## Prerequisites Before proceeding, ensure you have the following: * **Python Environment**: Python 3.9 or higher installed. * **Tensorlake Account**: Sign up at [Tensorlake](https://cloud.tensorlake.ai/). * **API Key**: After creating your account, generate an API key for the Tensorlake CLI and set it as an environment variable: ```bash theme={null} export TENSORLAKE_API_KEY= ``` * **Tensorlake SDK**: Install the Tensorlake SDK using pip: ```bash theme={null} pip install tensorlake ``` * **OpenAI API Key**: Can be created at [OpenAI](https://platform.openai.com/api-keys). ## Step 1: Writing the Graph In `workflow.py`, we write three functions: * `scrape_website` will leverage [https://jina.ai/reader/](https://jina.ai/reader/) to parse websites into text. 
* `summarize_text` will use an OpenAI chat model to summarize the text returned by `scrape_website`.
* `extract_structured_data` will use an OpenAI chat model to extract structured data, defined as a Python class, from the text returned by `scrape_website`.

The Graph `product-scraper` executes the `scrape_website` function, and then runs `summarize_text` and `extract_structured_data` in parallel on the scraper's output.

```python theme={null}
from tensorlake import tensorlake_function, Graph, Image
from pydantic import BaseModel
from openai import OpenAI
import requests


@tensorlake_function()
def scrape_website(url: str) -> str:
    # Jina Reader converts the page at `url` into plain text
    return requests.get(f"http://r.jina.ai/{url}").text


@tensorlake_function()
def summarize_text(text: str) -> str:
    completion = OpenAI().chat.completions.create(
        model="gpt-4o-mini-2024-07-18",
        messages=[
            {"role": "system", "content": "You are a helpful assistant. Generate a summary of this website"},
            {"role": "user", "content": text},
        ],
    )
    return completion.choices[0].message.content


class Product(BaseModel):
    name: str
    description: str
    price: float


@tensorlake_function()
def extract_structured_data(text: str) -> Product:
    client = OpenAI()
    # Parse the model response directly into the Product schema
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "Extract the product information."},
            {"role": "user", "content": text},
        ],
        response_format=Product,
    )
    return completion.choices[0].message.parsed


# scrape_website fans out to both downstream functions
graph = Graph(name="product-scraper", start_node=scrape_website)
graph.add_edge(scrape_website, summarize_text)
graph.add_edge(scrape_website, extract_structured_data)
```

## Step 2: Test Locally

Before running the code locally, make sure all of the graph's dependencies are available locally. For this graph, run `pip install openai` to install the OpenAI SDK.
Additionally, the OpenAI SDK requires the `OPENAI_API_KEY` environment variable: ```bash theme={null} export OPENAI_API_KEY= ``` Once the dependencies and secrets are available, add the following code to enable running the graph locally: ```python theme={null} if __name__ == "__main__": invocation = graph.local().queue("https://onyxcoffeelab.com/products/blend-box") outputs = invocation.outputs("summarize_text") print(outputs) outputs = invocation.outputs("extract_structured_data") print(outputs) ``` Running `python workflow.py` will execute the workflow locally and print the outputs. There are two print statements for this graph: one for the text summarization text and one for the structured extraction. ## Step 3: Define Dependencies and Secrets The current version of the graph requires some Python dependencies and some environment variables containing secrets. Tensorlake Serverless provides Images and Secrets to define what a `tensorlake_function` requires when running on the Tensorlake Cloud. ### Dependencies With Tensorlake Serverless, every function runs in its own sandbox defined via images. We define two images that we associate with the function that requires them. ```python theme={null} from tensorlake import tensorlake_function, Graph, Image from pydantic import BaseModel scrape_image = Image().run("pip install requests") openai_image = Image().run("pip install openai") @tensorlake_function(image=scrape_image) def scrape_website(url: str) -> str: import requests return requests.get(f"http://r.jina.ai/{url}").text @tensorlake_function(image=openai_image) def summarize_text(text: str) -> str: from openai import OpenAI completion = OpenAI().chat.completions.create( model="gpt-4o-mini-2024-07-18", messages=[ {"role": "system", "content": "You are a helpful assistant. 
Generate a summary of this website"}, {"role": "user", "content": text}, ], ) return completion.choices[0].message.content class Product(BaseModel): name: str description: str price: float @tensorlake_function(image=openai_image) def extract_structured_data(text: str) -> Product: from openai import OpenAI completion = OpenAI().beta.chat.completions.parse( model="gpt-4o-2024-08-06", messages=[ {"role": "system", "content": "Extract the product information."}, {"role": "user", "content": text}, ], response_format=Product, ) return completion.choices[0].message.parsed ``` As part of adding an image attribute to the `tensorlake_function` decorator, we also moved imports within each function. This allows creating smaller per-function images without needing to have all the dependencies in all images therefore reducing cold-start when big dependencies are needed like AI models. ### Secrets The graph requires the presence of the `OPENAI_API_KEY` environment variable containing a sensitive value. Tensorlake Serverless provides the concept of secrets that are injected at runtime into functions depending on them. Secrets are encrypted and only decrypted to be injected into functions. Create the tensorlake secret using the Tensorlake CLI: ```bash theme={null} tl secrets set OPENAI_API_KEY= ``` Change the function requiring the OpenAI API Key so that Tensorlake Serverless can inject the value at runtime: ```python theme={null} @tensorlake_function(image=openai_image, secrets=["OPENAI_API_KEY"]) def summarize_text(text: str) -> str: from openai import OpenAI completion = OpenAI().chat.completions.create( model="gpt-4o-mini-2024-07-18", messages=[ {"role": "system", "content": "You are a helpful assistant. 
Generate a summary of this website"}, {"role": "user", "content": text}, ], ) return completion.choices[0].message.content class Product(BaseModel): name: str description: str price: float @tensorlake_function(image=openai_image, secrets=["OPENAI_API_KEY"]) def extract_structured_data(text: str) -> Product: from openai import OpenAI completion = OpenAI().beta.chat.completions.parse( model="gpt-4o-2024-08-06", messages=[ {"role": "system", "content": "Extract the product information."}, {"role": "user", "content": text}, ], response_format=Product, ) return completion.choices[0].message.parsed ``` Every remote invocation will now use the value of the secret we created when running the `summarize_text` and `extract_structured_data` functions. ## Step 4: Deploying the Graph The graph can be deployed as a remote API on Tensorlake Cloud, and can be called from any application on-demand. ```bash theme={null} tl deploy workflow.py ``` This process will create a new image capable of running your functions, and deploy the graph as a remote API. 
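Before running `tl deploy`, it can help to confirm that the environment variables this tutorial relies on are actually set. The check below is a minimal sketch in plain Python and is not part of the Tensorlake SDK; the helper name `missing_env_vars` is hypothetical. `TENSORLAKE_API_KEY` is the variable exported in the prerequisites, and `OPENAI_API_KEY` only matters for local runs, since deployed functions receive it from the secret.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(
    names: Iterable[str], environ: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names that are unset or empty in the given environment.

    Hypothetical helper for illustration; defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Example with an explicit mapping so the result does not depend on your shell.
example_env = {"TENSORLAKE_API_KEY": "tl_example", "OPENAI_API_KEY": ""}
print(missing_env_vars(["TENSORLAKE_API_KEY", "OPENAI_API_KEY"], example_env))
# prints ['OPENAI_API_KEY']
```

Calling `missing_env_vars(["TENSORLAKE_API_KEY"])` with no mapping checks the real process environment instead.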
## Step 5: Invoking the Graph Remotely

Once the graph is deployed, you can invoke it remotely by modifying the main code:

```python theme={null}
if __name__ == "__main__":
    # Without .local(), the invocation runs on Tensorlake Cloud
    invocation = graph.queue("https://onyxcoffeelab.com/products/blend-box")
    outputs = invocation.outputs("summarize_text")
    print(outputs)
    outputs = invocation.outputs("extract_structured_data")
    print(outputs)
```

Alternatively, you can obtain a reference to the deployed graph and invoke it:

```python theme={null}
from tensorlake import TensorlakeClient

if __name__ == "__main__":
    client = TensorlakeClient()
    # The name must match the one passed to Graph(name=...) in workflow.py
    graph = client.get_graph("product-scraper")
    invocation = graph.queue("https://onyxcoffeelab.com/products/blend-box")
    outputs = invocation.outputs("summarize_text")
    print(outputs[0])
    outputs = invocation.outputs("extract_structured_data")
    print(outputs[0])
```

The Graph is called with the input of its starting node, in this case `scrape_website`, so the input to the graph is the `url` parameter.

The result of calling a graph is an `Invocation`. Since data applications can take a long time to complete, calling `outputs` on an invocation waits for the invocation to complete.

In either case, the results of the individual functions can be retrieved using the invocation id and the name of the function.

## Step 6: Monitoring and Troubleshooting

Monitor your graph's invocations and logs using the Tensorlake CLI:

```bash theme={null}
tl invocations list
tl invocations logs --function-name
```

These commands help you track executions and diagnose any issues that may arise during remote invocations.

# Query SEC Filings Stored in Databricks
Source: https://docs.tensorlake.ai/examples/tutorials/query-sec-filings-databricks

Track how AI risk disclosures evolved across major tech companies from 2021-2025 by parsing SEC filings, extracting structured risk data, and running SQL analytics in Databricks.
# Analyzing AI Risk Disclosures in SEC Filings with Tensorlake & Databricks Track how AI risk disclosures evolved across major tech companies from 2021-2025 by parsing SEC filings, extracting structured risk data, and running SQL analytics in Databricks Data Intelligence Platform. ## Track AI Risk Evolution Across Tech Companies Let's set the context for this example: you'll build a document analytics pipeline that processes SEC filings from major tech companies to track how AI risk disclosures have evolved from 2021-2025. You'll learn how to: * Use Tensorlake's [Page Classification](https://docs.tensorlake.ai/document-ingestion/parsing/page-classification) to identify risk factor pages with VLMs * Extract [structured data](https://docs.tensorlake.ai/document-ingestion/parsing/structured-extraction) from only relevant pages using Pydantic schemas * Deploy serverless applications on Tensorlake's platform to run your entire pipeline * Load parsed document data into [Databricks](https://www.databricks.com/) for SQL analytics * Query trends, compare companies, and discover emerging risk patterns ### The Challenge Major tech companies file lengthy SEC reports (100-200+ pages) quarterly. AI-related risk disclosures are scattered throughout these documents, making manual analysis time-consuming and prone to missing critical information. ### Our Solution We'll analyze 3 SEC filings from Microsoft, Google, and Meta spanning 2024-2025 to: 1. Use VLMs to identify pages containing AI risk factors (reducing processing from \~200 pages to \~20 per document) 2. Extract structured risk data from only relevant pages 3. Deploy the entire pipeline as serverless applications on Tensorlake 4. Store and analyze trends in Databricks SQL Warehouse 5. 
Uncover emerging AI risk patterns and regulatory concerns ## Prerequisites * Python 3.11+ * A [Tensorlake API key](https://docs.tensorlake.ai/platform/authentication#api-keys) * Databricks SQL Warehouse credentials: * Server Hostname * HTTP Path * Access Token * \[Optional] A [virtual Python environment](https://docs.python.org/3/library/venv.html) to keep dependencies isolated ## Getting Started ### Databricks Setup You need access to a Databricks SQL Warehouse. Find your connection details in the Databricks workspace under **SQL Warehouses → Connection Details**. ### Local Testing #### 1. Install Dependencies ```bash theme={null} pip install --upgrade tensorlake databricks-sql-connector pandas pyarrow ``` #### 2. Set Environment Variables ```bash theme={null} export TENSORLAKE_API_KEY=YOUR_TENSORLAKE_API_KEY export DATABRICKS_SERVER_HOSTNAME=YOUR_DATABRICKS_SERVER_HOSTNAME export DATABRICKS_HTTP_PATH=YOUR_DATABRICKS_HTTP_PATH export DATABRICKS_ACCESS_TOKEN=YOUR_DATABRICKS_ACCESS_TOKEN ``` Or create a `.env` file with these values. ## Build Your Document Processing Application We'll create a Tensorlake application that extracts AI risk data from SEC filings and stores it in Databricks. This application demonstrates a complete document processing pipeline using Tensorlake Applications with parallel processing via `.map()`. 
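To make the `.map()` pattern more concrete, here is a rough local analogy that uses only the Python standard library. It is not how Tensorlake implements `.map()`; `parallel_map` and `parse_length` are hypothetical names, and a thread pool stands in for distributed workers. The sketch applies a function to each item concurrently, keeps results in input order, and records `None` for items that fail so the rest of the batch can continue.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def parallel_map(
    fn: Callable[[T], R], items: Iterable[T], max_workers: int = 4
) -> List[Optional[R]]:
    """Apply fn to every item concurrently; failed items become None."""

    def safe_call(item: T) -> Optional[R]:
        try:
            return fn(item)
        except Exception as exc:
            # One bad item should not abort the whole batch.
            print(f"Skipping {item!r}: {exc}")
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(safe_call, items))


def parse_length(url: str) -> int:
    """Stand-in for per-document work; rejects anything that is not an https URL."""
    if not url.startswith("https://"):
        raise ValueError("not an https URL")
    return len(url)


results = parallel_map(parse_length, ["https://a.example/10k.pdf", "bad-input"])
print(results)
```

Downstream steps then only need to skip the `None` entries, which keeps a single failed document from blocking the others.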
### Pipeline Architecture The application follows this flow: ``` document_ingestion (entry point) └──> classify_pages - Classifies pages in SEC filings └──> extract_structured_data.map() - Extracts data from classified pages IN PARALLEL └──> initialize_databricks_table - Sets up database schema └──> write_to_databricks.map() - Writes results to Databricks IN PARALLEL ``` Key Tensorlake concepts used: * `@application()`: Marks the entry point of your application * `@function()`: Makes functions distributed and executable in the cloud or locally * `.map()`: Enables parallel execution across multiple items * `Image`: Defines the Docker container environment with dependencies * `secrets`: Securely injects environment variables at runtime ### Define Your Extraction Schemas First, define the Pydantic models that describe the data structure you want to extract: ```python theme={null} from pydantic import BaseModel, Field from typing import List, Optional class AIRiskMention(BaseModel): """Individual AI-related risk mention""" risk_category: str = Field( description="Category: Operational, Regulatory, Competitive, Ethical, Security, Liability" ) risk_description: str = Field(description="Description of the AI risk") severity_indicator: Optional[str] = Field(None, description="Severity level if mentioned") citation: str = Field(description="Page reference") class AIRiskExtraction(BaseModel): """Complete AI risk data from a filing""" company_name: str ticker: str filing_type: str filing_date: str fiscal_year: str fiscal_quarter: Optional[str] = None ai_risk_mentioned: bool ai_risk_mentions: List[AIRiskMention] = [] num_ai_risk_mentions: int = 0 ai_strategy_mentioned: bool = False ai_investment_mentioned: bool = False ai_competition_mentioned: bool = False regulatory_ai_risk: bool = False ``` ### Create the Document Processing Application Create a file called `process-sec.py`: ```python theme={null} import os import json from typing import List, Optional, Tuple, Any from 
pydantic import BaseModel, Field from databricks import sql from tensorlake.applications import Image, application, function, cls from tensorlake.documentai import ( DocumentAI, PageClassConfig, StructuredExtractionOptions, ParseResult ) # TENSORLAKE APPLICATIONS: Define a custom runtime environment # Image defines the Docker container environment where your functions will run. # You can specify dependencies, system packages, and environment configuration. # All @function decorators can reference this image to ensure consistent execution. image = ( Image(base_image="python:3.11-slim", name="databricks-sec") .run("pip install databricks-sql-connector pandas pyarrow") ) # [Include the Pydantic models from above] # TENSORLAKE APPLICATIONS: Application Entry Point # @application() marks this function as the main entry point for your Tensorlake application. # @function() makes this a distributed function that can run in the cloud or locally. # # Key concepts: # - secrets: List of environment variable names that will be securely injected at runtime # - image: The runtime environment (Docker container) where this function executes # - Functions decorated with @function can call other @function decorated functions # - You can use .map() on @function decorated functions for parallel execution @application() @function( secrets=[ "TENSORLAKE_API_KEY" ], image=image ) def document_ingestion(document_urls: List[str]) -> None: """Main entry point for document processing pipeline""" print(f"Starting document ingestion for {len(document_urls)} documents.") # Step 1: Classify pages in all documents parse_ids = classify_pages(document_urls) print(f"Classification complete. 
Parse IDs: {parse_ids}")

    # Step 2: Extract structured data with parallel execution
    # .map() calls extract_structured_data once for each item in parse_ids.items()
    # Each call runs in parallel, making this very efficient for processing multiple documents
    # Returns a list of results (tuples in this case) from all parallel executions
    results = extract_structured_data.map(parse_ids.items())
    print(f"Extraction complete. Results: {results}")

    # Step 3: Initialize database schema
    initialize_databricks_table()
    print("Databricks table initialized.")

    # Step 4: Write data to Databricks in parallel
    # .map() again enables parallel processing - each result tuple is written to Databricks
    # in parallel, significantly speeding up the data ingestion process
    print("Writing results to Databricks.")
    write_to_databricks.map(results)
    print("Document ingestion process completed.")


@function(
    secrets=[
        "TENSORLAKE_API_KEY"
    ],
    image=image
)
def classify_pages(document_urls: List[str]) -> dict[str, str]:
    """Classify pages in SEC filings to identify AI risk factors

    Returns:
        Mapping of file_url to parse_id for every document that was classified
    """
    doc_ai = DocumentAI(api_key=os.getenv("TENSORLAKE_API_KEY"))

    page_classifications = [
        PageClassConfig(
            name="risk_factors",
            description="Pages that contain risk factors related to AI."
        ),
    ]

    parse_ids = {}
    for file_url in document_urls:
        try:
            parse_id = doc_ai.classify(
                file_url=file_url,
                page_classifications=page_classifications
            )
            parse_ids[file_url] = parse_id
            print(f"Successfully classified {file_url}: {parse_id}")
        except Exception as e:
            # Documents that fail classification are left out of the mapping
            print(f"Failed to classify document {file_url}: {e}")

    return parse_ids


# TENSORLAKE APPLICATIONS: Distributed Function for Parallel Processing
# This function is designed to be called via .map() for parallel execution.
# When called with .map(), this function runs once for each item in the input list,
# with all executions happening in parallel across multiple workers.
#
# Error Handling: Always wrap .map() functions in try-except to return None on failure.
# This allows the pipeline to continue processing other items even if one fails. @function( image=image, secrets=[ "TENSORLAKE_API_KEY" ] ) def extract_structured_data(url_parse_id_pair: Tuple[str, str]) -> Optional[Tuple[str, str]]: """Extract structured data from classified pages Args: url_parse_id_pair: Tuple of (file_url, parse_id) from the classification step Returns: Tuple of (extract_result_id, file_url) or None if processing fails """ print(f"Processing: {url_parse_id_pair}") try: doc_ai = DocumentAI(api_key=os.getenv("TENSORLAKE_API_KEY")) result = doc_ai.wait_for_completion(parse_id=url_parse_id_pair[1]) page_numbers = [] for page_class in result.page_classes: if page_class.page_class == "risk_factors": page_numbers.extend(page_class.page_numbers) if not page_numbers: print(f"No risk factor pages found for {url_parse_id_pair[0]}") return None page_number_str_list = ",".join(str(i) for i in page_numbers) print(f"Extracting from pages: {page_number_str_list}") extract_result = doc_ai.extract( file_url=url_parse_id_pair[0], page_range=page_number_str_list, structured_extraction_options=[ StructuredExtractionOptions( schema_name="AIRiskExtraction", json_schema=AIRiskExtraction ) ] ) print(f"Extraction result: {extract_result}") return (extract_result, url_parse_id_pair[0]) except Exception as e: print(f"Error processing {url_parse_id_pair[0]}: {e}") return None @function( image=image, secrets=[ "DATABRICKS_SERVER_HOSTNAME", "DATABRICKS_HTTP_PATH", "DATABRICKS_ACCESS_TOKEN" ] ) def initialize_databricks_table() -> None: """Initialize the Databricks table with the required schema""" connection = sql.connect( server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"), http_path=os.getenv("DATABRICKS_HTTP_PATH"), access_token=os.getenv("DATABRICKS_ACCESS_TOKEN"), _tls_no_verify=True, ) cursor = connection.cursor() create_ai_risk_factors_sql = """ CREATE TABLE IF NOT EXISTS ai_risk_filings ( company_name STRING, ticker STRING, filing_type STRING, filing_date STRING, 
fiscal_year STRING, fiscal_quarter STRING, ai_risk_mentioned BOOLEAN, ai_risk_mentions STRING, num_ai_risk_mentions INT, ai_strategy_mentioned BOOLEAN, ai_investment_mentioned BOOLEAN, ai_competition_mentioned BOOLEAN, regulatory_ai_risk BOOLEAN ) """ cursor.execute(create_ai_risk_factors_sql) create_ai_risk_mentions_sql = """ CREATE TABLE IF NOT EXISTS ai_risks ( company_name STRING, ticker STRING, fiscal_year STRING, fiscal_quarter STRING, source_file STRING, risk_category STRING, risk_description STRING, severity_indicator STRING, citation STRING ) """ cursor.execute(create_ai_risk_mentions_sql) connection.commit() connection.close() # TENSORLAKE APPLICATIONS: Parallel Database Write Function # This function is called via .map() to write results to Databricks in parallel. # Each execution processes one result tuple from the extraction step. # # Data Flow: extract_structured_data returns tuples -> .map() collects them into a list # -> write_to_databricks.map() processes each tuple in parallel # # Secrets: Multiple secrets can be specified. Each will be available as an environment # variable inside the function. Secrets are never logged or exposed in code. 
@function( image=image, secrets=[ "TENSORLAKE_API_KEY", "DATABRICKS_SERVER_HOSTNAME", "DATABRICKS_HTTP_PATH", "DATABRICKS_ACCESS_TOKEN" ] ) def write_to_databricks(result_tuple: Tuple[Any, str]) -> None: """Write structured data to Databricks tables Args: result_tuple: Tuple of (extract_result_id, file_url) from extract_structured_data """ # Handle None values - functions called via .map() should gracefully skip failed items if result_tuple is None: return extract_result, file_url = result_tuple if extract_result is None: return doc_ai = DocumentAI(api_key=os.getenv("TENSORLAKE_API_KEY")) result: ParseResult = doc_ai.wait_for_completion(extract_result) if not result.structured_data: return raw = result.structured_data[0].data record = raw if isinstance(raw, dict) else (raw[0] if isinstance(raw, list) and raw else {}) data = dict(record) mentions = data.pop("ai_risk_mentions", []) or [] # Add source file reference source_file = os.path.basename(file_url) connection = sql.connect( server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"), http_path=os.getenv("DATABRICKS_HTTP_PATH"), access_token=os.getenv("DATABRICKS_ACCESS_TOKEN"), _tls_no_verify=True, ) cursor = connection.cursor() # Serialize mentions for STRING column storage ai_risk_mentions_json = json.dumps(mentions) if mentions else None # Insert the single record into ai_risk_filings insert_sql = """ INSERT INTO ai_risk_filings ( company_name, ticker, filing_type, filing_date, fiscal_year, fiscal_quarter, ai_risk_mentioned, ai_risk_mentions, num_ai_risk_mentions, ai_strategy_mentioned, ai_investment_mentioned, ai_competition_mentioned, regulatory_ai_risk ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
""" # Execute the insert with positional parameters cursor.execute(insert_sql, ( data.get('company_name'), data.get('ticker'), data.get('filing_type'), data.get('filing_date'), data.get('fiscal_year'), data.get('fiscal_quarter'), data.get('ai_risk_mentioned', False), ai_risk_mentions_json, data.get('num_ai_risk_mentions', 0), data.get('ai_strategy_mentioned', False), data.get('ai_investment_mentioned', False), data.get('ai_competition_mentioned', False), data.get('regulatory_ai_risk', False) )) # Insert into ai_risks table if mentions: insert_mentions_sql = """ INSERT INTO ai_risks ( company_name, ticker, fiscal_year, fiscal_quarter, source_file, risk_category, risk_description, severity_indicator, citation ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?) """ for mention in mentions: cursor.execute(insert_mentions_sql, ( data.get('company_name'), data.get('ticker'), data.get('fiscal_year'), data.get('fiscal_quarter'), source_file, mention.get('risk_category'), mention.get('risk_description'), mention.get('severity_indicator'), mention.get('citation') )) connection.commit() connection.close() if __name__ == "__main__": from tensorlake.applications import run_local_application # TENSORLAKE APPLICATIONS: Local Development # run_local_application() executes your application locally for testing and development. # Pass the entry point function (decorated with @application()) and its arguments. # # For production deployment: # 1. Use Tensorlake CLI to deploy: `tl deploy` # 2. Your application will run in the cloud with automatic scaling # 3. All @function decorated functions will execute in their specified container environments # # Secrets: When running locally, secrets are read from environment variables. # In production, secrets are managed securely through the Tensorlake platform. 
# Example usage with a single document test_urls = [ "https://investors.confluent.io/static-files/95299e90-a988-42c5-b9b5-7da387691f6a" ] response = run_local_application( document_ingestion, test_urls ) print(response.output()) ``` ### Test Locally Run the processing script to extract data from a test SEC filing: ```bash theme={null} python process-sec.py ``` This will: 1. Classify pages to find AI risk factors using VLMs 2. Extract structured data from those pages in parallel via `.map()` 3. Initialize the Databricks table schema 4. Load the extracted data into your Databricks tables in parallel via `.map()` ## Build Your Query Application Now create a separate application for querying the extracted data. Create a file called `query-sec.py`: ```python theme={null} import os import json from databricks import sql from tensorlake.applications import Image, application, function image = ( Image(base_image="python:3.11-slim", name="databricks-sec") .run("pip install databricks-sql-connector pandas pyarrow") ) @application() @function(image=image) def query_sec(query_choice: str) -> str: """Query AI risk data from Databricks""" # Default query: Risk category distribution query = """ SELECT risk_category, COUNT(*) as total_mentions, COUNT(DISTINCT company_name) as companies_mentioning FROM ai_risks WHERE risk_category IS NOT NULL GROUP BY risk_category ORDER BY total_mentions DESC """ # Select query based on user choice match query_choice: case "operational-risks": query = """ WITH ranked_risks AS ( SELECT company_name, ticker, risk_description, citation, LENGTH(risk_description) as description_length, ROW_NUMBER() OVER ( PARTITION BY company_name ORDER BY LENGTH(risk_description) DESC ) as rn FROM ai_risks WHERE risk_category = 'Operational' ) SELECT company_name, ticker, risk_description, citation, description_length FROM ranked_risks WHERE rn = 1 ORDER BY company_name """ case "risk-evolution": query = """ SELECT company_name, ticker, fiscal_year, fiscal_quarter, 
risk_category, risk_description, citation FROM ai_risks WHERE fiscal_year = '2025' ORDER BY company_name, fiscal_quarter """ case "risk-timeline": query = """ SELECT fiscal_year, fiscal_quarter, COUNT(DISTINCT company_name) as num_companies, SUM(num_ai_risk_mentions) as total_risk_mentions, AVG(num_ai_risk_mentions) as avg_risk_mentions_per_filing, SUM(CASE WHEN regulatory_ai_risk THEN 1 ELSE 0 END) as filings_with_regulatory_risk FROM ai_risk_filings GROUP BY fiscal_year, fiscal_quarter ORDER BY fiscal_year, fiscal_quarter """ case "risk-profiles": query = """ SELECT company_name, ticker, risk_category, COUNT(*) as frequency FROM ai_risks WHERE risk_category IS NOT NULL GROUP BY company_name, ticker, risk_category ORDER BY company_name, frequency DESC """ case "company-summary": query = """ SELECT company_name, ticker, COUNT(*) as total_filings, AVG(num_ai_risk_mentions) as avg_risk_mentions, SUM(CASE WHEN regulatory_ai_risk THEN 1 ELSE 0 END) as filings_with_regulatory_risk, SUM(CASE WHEN ai_competition_mentioned THEN 1 ELSE 0 END) as filings_mentioning_competition, SUM(CASE WHEN ai_investment_mentioned THEN 1 ELSE 0 END) as filings_mentioning_investment FROM ai_risk_filings GROUP BY company_name, ticker ORDER BY avg_risk_mentions DESC """ return make_query(query) @function( image=image, secrets=[ "DATABRICKS_SERVER_HOSTNAME", "DATABRICKS_HTTP_PATH", "DATABRICKS_ACCESS_TOKEN" ] ) def make_query(query: str) -> str: """Execute query against Databricks and return JSON results""" import pandas as pd from databricks import sql try: connection = sql.connect( server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"), http_path=os.getenv("DATABRICKS_HTTP_PATH"), access_token=os.getenv("DATABRICKS_ACCESS_TOKEN"), _tls_no_verify=True, ) cursor = connection.cursor() cursor.execute(query) # Fetch results as pandas DataFrame results = cursor.fetchall() columns = [desc[0] for desc in cursor.description] df = pd.DataFrame(results, columns=columns) cursor.close() 
connection.close() return df.to_json(orient='records') except Exception as e: raise e if __name__ == "__main__": from tensorlake.applications import run_local_application import sys queries = [ "risk-distribution", "operational-risks", "risk-evolution", "risk-timeline", "risk-profiles", "company-summary" ] query = queries[0] if len(sys.argv) > 1: query = queries[int(sys.argv[1])] response = run_local_application(query_sec, query) pretty_json = json.loads(response.output()) print(json.dumps(pretty_json, indent=4)) ``` ### Test Queries Locally Query the extracted data (replace `5` with any query number 0-5): ```bash theme={null} python query-sec.py 5 ``` Available queries: * `0` - Risk category distribution * `1` - Operational AI risks (most detailed per company) * `2` - Emerging risks in 2025 * `3` - Risk timeline analysis * `4` - Company risk profiles * `5` - Company summary statistics ## Deploy to Tensorlake Cloud Now that you've tested locally, deploy your applications to run as serverless functions in the cloud. ### 1. Verify Tensorlake Connection ```bash theme={null} tl whoami ``` ### 2. Set Secrets Store your credentials securely in Tensorlake: ```bash theme={null} tl secrets set DATABRICKS_SERVER_HOSTNAME='YOUR_DATABRICKS_SERVER_HOSTNAME' tl secrets set DATABRICKS_HTTP_PATH='YOUR_DATABRICKS_HTTP_PATH' tl secrets set DATABRICKS_ACCESS_TOKEN='YOUR_DATABRICKS_ACCESS_TOKEN' tl secrets set TENSORLAKE_API_KEY='YOUR_TENSORLAKE_API_KEY' ``` ### 3. Verify Secrets ```bash theme={null} tl secrets list ``` ### 4. Deploy Applications Deploy the processing application: ```bash theme={null} tl deploy process-sec.py ``` Deploy the query application: ```bash theme={null} tl deploy query-sec.py ``` Once deployed, you'll see both applications in your dashboard at [cloud.tensorlake.ai](https://cloud.tensorlake.ai). ### 5. 
Run the Full Pipeline Create a script called `process-sec-remote.py` to process all SEC filings using your deployed application: ```python theme={null} from tensorlake.applications import run_remote_application, Request # SEC Filings to process sec_filings = [ 'https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/goog-10k-december-24.pdf', 'https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/msft-10k-june-25.pdf', 'https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/meta-10k-december-24.pdf' ] request: Request = run_remote_application('document_ingestion', sec_filings) ``` Run it: ```bash theme={null} python process-sec-remote.py ``` ### 6. Query from the Deployed Application Create a script called `query-sec-remote.py`: ```python theme={null} from tensorlake.applications import run_remote_application, Request import json import sys # Available queries queries = [ "risk-distribution", "operational-risks", "risk-evolution", "risk-timeline", "risk-profiles", "company-summary" ] # Choose a query (default: risk-distribution) query = queries[0] if len(sys.argv) > 1: query = queries[int(sys.argv[1])] request: Request = run_remote_application('query_sec', query) output = request.output() pretty_json = json.loads(output) print(json.dumps(pretty_json, indent=4)) ``` Run a specific query: ```bash theme={null} python query-sec-remote.py 2 ``` ## Analyze Your Results Let's examine what insights we can extract from the data. 
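The query scripts print raw JSON. As a minimal sketch, assuming each query returns a JSON array of flat records (the shape `query_sec` produces with `df.to_json(orient='records')`), a small helper can render those records as an aligned text table for quicker reading. The name `format_records` is hypothetical and not part of the Tensorlake SDK, and the sample payload only mirrors the shape of the risk-distribution output.

```python
import json


def format_records(records):
    """Render a list of flat dicts as an aligned, plain-text table."""
    if not records:
        return "(no rows)"
    columns = list(records[0].keys())
    # Column width = longest of the header and every cell in that column.
    widths = {
        col: max(len(col), *(len(str(row.get(col, ""))) for row in records))
        for col in columns
    }
    header = "  ".join(col.ljust(widths[col]) for col in columns)
    divider = "  ".join("-" * widths[col] for col in columns)
    body = [
        "  ".join(str(row.get(col, "")).ljust(widths[col]) for col in columns)
        for row in records
    ]
    return "\n".join([header, divider, *body])


# Sample payload in the shape the risk-distribution query returns.
sample_output = json.dumps([
    {"risk_category": "Operational", "total_mentions": 15, "companies_mentioning": 3},
    {"risk_category": "Regulatory", "total_mentions": 12, "companies_mentioning": 3},
])

print(format_records(json.loads(sample_output)))
```

In `query-sec-remote.py`, a call such as `print(format_records(json.loads(output)))` could stand in for the final `json.dumps(...)` print if you prefer a table over indented JSON.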
### Query 1: Risk Category Distribution

See which types of AI risks are most common:

```bash theme={null}
python query-sec-remote.py 0
```

**Expected Output:**

```json theme={null}
[
    {
        "risk_category": "Operational",
        "total_mentions": 15,
        "companies_mentioning": 3
    },
    {
        "risk_category": "Regulatory",
        "total_mentions": 12,
        "companies_mentioning": 3
    },
    {
        "risk_category": "Ethical",
        "total_mentions": 10,
        "companies_mentioning": 3
    }
]
```

### Query 2: Most Detailed Operational Risks

Find the most comprehensive operational risk description from each company:

```bash theme={null}
python query-sec-remote.py 1
```

This returns the longest (most detailed) operational risk disclosure per company, helping you understand each company's primary operational concerns.

### Query 3: Timeline Analysis

Track how risk mentions evolved over time:

```bash theme={null}
python query-sec-remote.py 3
```

This shows trends in risk disclosure volume and helps identify when companies started taking AI risks more seriously.

### Query 4: Company Risk Profiles

Compare risk category frequencies across companies:

```bash theme={null}
python query-sec-remote.py 4
```

This shows which companies focus on which types of risks.

## Key Insights

Through this analysis pipeline, you can uncover:

1. **Risk Category Trends**: Operational and regulatory risks dominate across all companies
2. **Disclosure Evolution**: Risk mention frequency increases in more recent filings
3. **Company Differences**: Each company emphasizes different risk categories based on its AI strategy
4.
**Emerging Patterns**: New risk categories appear over time (liability, IP concerns, energy dependencies) ## Architecture Benefits This Tensorlake + Databricks integration provides: * **Serverless Execution**: No infrastructure to manage, applications scale automatically * **Parallel Processing**: Multiple documents processed simultaneously via `.map()` at both extraction and database write stages * **Separation of Concerns**: Document processing and querying are independent applications * **Reusable Components**: Each function can be called independently or composed into larger pipelines * **Secret Management**: Credentials stored securely and injected at runtime * **Fault Tolerance**: Functions wrapped in try-except ensure pipeline continues even if individual items fail ## Adapt This Pipeline This pipeline can be adapted for any document analysis use case: * **ESG Disclosures**: Track sustainability commitments across annual reports * **Financial Metrics Tracking**: Extract KPIs from earnings reports over time * **Competitive Intelligence**: Monitor competitor product launches and strategies * **Regulatory Compliance**: Alert on new compliance requirements in legal documents * **Contract Analysis**: Extract key terms and obligations from agreements ## Clean Up When you're done with this example: ```bash theme={null} # Deactivate virtual environment deactivate # Optional: Delete deployed applications tl applications delete document_ingestion tl applications delete query_sec ``` ## Next Steps Now that you have the basics down, explore these resources: * [Python SDK and API Docs](https://docs.tensorlake.ai) * [Applications Documentation](https://docs.tensorlake.ai/applications/quickstart) * [Page Classification Guide](https://docs.tensorlake.ai/document-ingestion/parsing/page-classification) * [Structured Extraction Guide](https://docs.tensorlake.ai/document-ingestion/parsing/structured-extraction) * [Blog](https://tensorlake.ai/blog) * [Community 
Slack](https://tlake.link/slack)

Try building your own document intelligence pipeline with Tensorlake and Databricks today!

# Query SEC Filings Stored in MotherDuck

Source: https://docs.tensorlake.ai/examples/tutorials/query-sec-filings-motherduck

Track how AI risk disclosures evolved across Microsoft, Google, and Meta from 2021-2025 by parsing 40 SEC filings, extracting structured risk data, and running SQL analytics in MotherDuck.

# Analyzing AI Risk Disclosures in SEC Filings with Tensorlake & MotherDuck

Try out this example using this [Colab Notebook](https://tlake.link/notebooks/motherduck-sec-filings)

## Track AI Risk Evolution Across Tech Companies

Let's set the context for this example: you'll build a document analytics pipeline that processes SEC filings from major tech companies to track how AI risk disclosures have evolved from 2021-2025.

You'll learn how to:

* Use Tensorlake's [Page Classification](https://docs.tensorlake.ai/document-ingestion/parsing/page-classification) to identify risk factor pages with VLMs
* Extract [structured data](https://docs.tensorlake.ai/document-ingestion/parsing/structured-extraction) from only relevant pages using Pydantic schemas
* Load parsed document data into [MotherDuck](https://motherduck.com/) for SQL analytics
* Query trends, compare companies, and discover emerging risk patterns

### The Challenge

Major tech companies file lengthy SEC reports (100-200+ pages) quarterly. AI-related risk disclosures are scattered throughout these documents, making manual analysis time-consuming and prone to missing critical information.

### Our Solution

We'll analyze 40 SEC filings from Microsoft, Google, and Meta spanning 2021-2025 to:

1.
Use VLMs to identify pages containing AI risk factors (reducing processing from \~200 pages to \~20 per document)
2. Extract structured risk data from only relevant pages
3. Store and analyze trends in MotherDuck's cloud data warehouse
4. Uncover emerging AI risk patterns and regulatory concerns

## Prerequisites

* Python 3.10+
* A [Tensorlake API key](https://docs.tensorlake.ai/platform/authentication#api-keys)
* A [MotherDuck token](https://motherduck.com/docs/authenticating-to-motherduck/)
* SEC filing PDFs (we provide sample URLs)
* \[Optional] A [virtual Python environment](https://docs.python.org/3/library/venv.html) to keep dependencies isolated

## Build Your Document Analytics Pipeline

### Set up your environment

The `tensorlake` package includes DocumentAI for parsing, while `duckdb` provides the MotherDuck connector. The code below also imports `pandas`, which DuckDB needs in order to return query results as DataFrames.

```bash theme={null}
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

### Install necessary packages

```bash theme={null}
pip install --upgrade tensorlake duckdb==1.3.2 pandas
```

### Configure your API keys

Set environment variables for authentication:

```bash theme={null}
export TENSORLAKE_API_KEY="your_tensorlake_api_key"
export MOTHERDUCK_TOKEN="your_motherduck_token"
```

Or create a `.env` file:

```
TENSORLAKE_API_KEY=your_tensorlake_api_key
MOTHERDUCK_TOKEN=your_motherduck_token
```

### Prepare your imports

```python theme={null}
import os
import json
import duckdb
import pandas as pd
from tensorlake.documentai import DocumentAI, PageClassConfig, StructuredExtractionOptions
from pydantic import BaseModel, Field
from typing import List, Optional
```

### Configure target documents

We'll analyze SEC filings from three AI leaders.
These URLs point to 10-Ks (annual) and 10-Qs (quarterly) filings: ```python theme={null} # Company configurations AI_COMPANIES = { "MSFT": "Microsoft", "GOOGL": "Alphabet", "META": "Meta Platforms" } BASE_URL = "https://pub-226479de18b2493f96b64c6674705dd8.r2.dev" FILING_URLS = { "GOOGL": [ f"{BASE_URL}/goog-10q-june-25.pdf", f"{BASE_URL}/goog-10q-march-25.pdf", f"{BASE_URL}/goog-10k-december-24.pdf", f"{BASE_URL}/goog-10q-september-24.pdf", f"{BASE_URL}/goog-10q-june-24.pdf", # ... more filings ], "MSFT": [ f"{BASE_URL}/msft-10k-june-25.pdf", f"{BASE_URL}/msft-10q-march-25.pdf", # ... more filings ], "META": [ f"{BASE_URL}/meta-10q-june-25.pdf", f"{BASE_URL}/meta-10q-march-25.pdf", # ... more filings ] } ``` ## Step 1: Classify Risk Factor Pages with VLMs Using Tensorlake's Vision Language Models, we'll scan all filings to identify pages containing AI-related risk factors. This typically reduces processing from \~200 pages to \~20-30 relevant pages per document: ```python theme={null} doc_ai = DocumentAI() # Define what pages to identify page_classifications = [ PageClassConfig( name="risk_factors", description="Pages that contain risk factors related to AI." 
) ] # Store classified pages for each document document_ai_risk_pages = {} # Classify all filings for company in AI_COMPANIES: for file_url in FILING_URLS[company]: print(f"Classifying: {file_url}") # Classify pages parse_id = doc_ai.classify( file_url=file_url, page_classifications=page_classifications ) # Wait for completion result = doc_ai.wait_for_completion(parse_id=parse_id) # Extract risk factor page numbers for page_class in result.page_classes: if page_class.page_class == "risk_factors": document_ai_risk_pages[file_url] = page_class.page_numbers print(f" Found risk pages: {page_class.page_numbers}") ``` ### Review classification results Let's examine which pages were identified as containing AI risk factors: ```python theme={null} for file_url, page_numbers in document_ai_risk_pages.items(): filename = os.path.basename(file_url) print(f"{filename}: Pages {page_numbers}") ``` You'll notice variation across companies and time periods—newer filings often have more pages dedicated to AI risks. 
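To make the page reduction concrete, here is a minimal sketch that summarizes a classification result. It assumes a mapping shaped like `document_ai_risk_pages` (file URL to list of page numbers) plus a hypothetical `total_pages` mapping that you would fill in yourself, since the classification code above does not record page counts. The URL and numbers below are illustrative placeholders, not real results.

```python
import os


def summarize_page_reduction(risk_pages, total_pages):
    """Return (filename, kept, total, fraction_kept) for each classified file."""
    summary = []
    for file_url, page_numbers in risk_pages.items():
        total = total_pages.get(file_url)
        if not total:
            continue  # skip files whose page count is unknown
        kept = len(set(page_numbers))
        summary.append((os.path.basename(file_url), kept, total, kept / total))
    return summary


# Illustrative placeholder values only (hypothetical, not actual output).
example_risk_pages = {"https://example.com/filing-a.pdf": [12, 13, 14, 15, 40]}
example_total_pages = {"https://example.com/filing-a.pdf": 100}

for name, kept, total, fraction in summarize_page_reduction(
    example_risk_pages, example_total_pages
):
    print(f"{name}: {kept}/{total} pages kept ({fraction:.0%})")
```

A summary like this can help you spot filings where the classifier kept unusually few or unusually many pages before you spend time on extraction.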
## Step 2: Define Extraction Schema We'll extract structured data about AI risks including categories (Operational, Regulatory, Competitive, etc.), descriptions, and severity indicators: ```python theme={null} class AIRiskMention(BaseModel): """Individual AI-related risk mention""" risk_category: str = Field( description="Category: Operational, Regulatory, Competitive, Ethical, Security, Liability" ) risk_description: str = Field(description="Description of the AI risk") severity_indicator: Optional[str] = Field(None, description="Severity level if mentioned") citation: str = Field(description="Page reference") class AIRiskExtraction(BaseModel): """Complete AI risk data from a filing""" company_name: str ticker: str filing_type: str filing_date: str fiscal_year: str fiscal_quarter: Optional[str] = None ai_risk_mentioned: bool ai_risk_mentions: List[AIRiskMention] = [] num_ai_risk_mentions: int = 0 ai_strategy_mentioned: bool = False ai_investment_mentioned: bool = False ai_competition_mentioned: bool = False regulatory_ai_risk: bool = False ``` ## Step 3: Extract Structured Risk Data Now we extract detailed AI risk information from only the classified pages. 
This targeted approach sends only the pages flagged during classification to extraction, roughly 10-15% of each filing based on the page counts above, rather than the whole document:

```python theme={null}
results = {}

for file_url, page_numbers in document_ai_risk_pages.items():
    print(f"Extracting from: {file_url}")

    # Convert page numbers to comma-separated string
    page_range = ",".join(str(i) for i in page_numbers)
    print(f" Processing pages: {page_range}")

    # Parse and extract structured data
    result = doc_ai.parse_and_wait(
        file=file_url,
        page_range=page_range,
        structured_extraction_options=[
            StructuredExtractionOptions(
                schema_name="AIRiskExtraction",
                json_schema=AIRiskExtraction
            )
        ]
    )

    results[file_url] = result

    # Guard against filings where no structured data came back
    if result.structured_data:
        mentions = result.structured_data[0].data.get('ai_risk_mentions', [])
        print(f" ✓ Extracted {len(mentions)} risk mentions")
    else:
        print(" No structured data returned")
```

## Step 4: Save Extracted Data to JSON

Export each filing's risk data to JSON files for loading into MotherDuck:

```python theme={null}
json_files = []

for file_url, result in results.items():
    if result.structured_data:
        # Extract filename from URL
        filename = os.path.basename(file_url)  # "msft-10k-june-25.pdf"
        json_filename = filename.replace('.pdf', '.json')

        # Write to file
        with open(json_filename, 'w') as f:
            json.dump(result.structured_data[0].data, f, indent=2, default=str)

        json_files.append(json_filename)
        print(f"✓ Saved to {json_filename}")
```

## Step 5: Load Data into MotherDuck

Create a cloud-based data warehouse table in MotherDuck to enable fast SQL analytics across all filings:

```python theme={null}
# Connect to MotherDuck
con = duckdb.connect('md:ai_risk_factors')

# Drop existing table if present
con.execute("DROP TABLE IF EXISTS ai_risk_filings")

# Prepare rows for loading
rows = []
for filename in json_files:
    with open(filename, 'r') as f:
        data = json.load(f)

    # Convert nested array to JSON string for storage
    data['ai_risk_mentions'] = json.dumps(data.get('ai_risk_mentions', []))
    rows.append(data)

# Convert to DataFrame and create table
df = pd.DataFrame(rows)
con.execute("CREATE TABLE ai_risk_filings AS SELECT * FROM
df") # Verify the data result = con.execute("SELECT * FROM ai_risk_filings").fetchdf() print(f"Loaded {len(result)} filings into MotherDuck") print(result.head()) ``` ## Step 6: Analyze Risk Trends with SQL Now the real power emerges—run SQL analytics on your document data to uncover insights. ### Query 1: Risk Category Distribution Understand the breakdown of AI risk categories across all companies: ```python theme={null} # Extract all risk mentions from JSON column all_risks = [] for _, row in con.execute("SELECT company_name, ai_risk_mentions FROM ai_risk_filings").fetchdf().iterrows(): risks = json.loads(row['ai_risk_mentions']) if not risks: continue for risk in risks: all_risks.append({ 'company_name': row['company_name'], 'risk_category': risk.get('risk_category'), 'risk_description': risk.get('risk_description') }) risks_df = pd.DataFrame(all_risks) # Count by category risk_categories = risks_df.groupby('risk_category').agg( total_mentions=('risk_category', 'count'), companies_mentioning=('company_name', 'nunique') ).reset_index().sort_values('total_mentions', ascending=False) print("Risk Category Distribution:") print(risk_categories) ``` **Expected Output:** ``` risk_category total_mentions companies_mentioning 0 Operational 37 3 1 Ethical 28 3 2 Regulatory 26 3 3 Competitive 7 2 4 Security 4 2 5 Liability 3 1 ``` ### Query 2: Deep Dive into Operational Risks Extract the most detailed operational risk descriptions from each company: ```python theme={null} operational_risks_df = con.execute(""" SELECT company_name, ticker, json_extract(risk.value, '$.risk_description') as risk_description, json_extract(risk.value, '$.citation') as citation FROM ai_risk_filings, json_each(ai_risk_mentions) as risk WHERE ai_risk_mentions IS NOT NULL AND ai_risk_mentions != '[]' AND json_extract(risk.value, '$.risk_category') = 'Operational' """).fetchdf() # Get longest (most detailed) operational risk per company operational_risks_df['description_length'] = 
operational_risks_df['risk_description'].apply( lambda x: len(x) if pd.notna(x) else 0 ) top_operational = ( operational_risks_df .sort_values('description_length', ascending=False) .groupby('company_name') .head(1) .reset_index(drop=True) ) # Print results for _, row in top_operational.iterrows(): print(f"\n{row['company_name']} ({row['ticker']}):") print("-" * 100) print(f"Citation: {row['citation']}") print(row['risk_description'][:500]) # First 500 chars print() ``` ## Key Insights Discovered Through this analysis pipeline, we've: * Processed 40 SEC filings (\~6,000+ total pages) * Identified and extracted AI risk disclosures from relevant pages * Built a queryable database of AI risk evolution from 2022-2025 ### Emerging Trends: 1. **Operational risks dominate** (37 mentions) - All three companies express concerns about AI infrastructure costs, development challenges, and potential misuse of AI systems 2. **Ethical considerations intensifying** (28 mentions) - Growing focus on bias, harmful content, and societal impact, particularly around generative AI 3. **Regulatory landscape evolving rapidly** - 2025 filings show increased mentions of specific regulations (EU AI Act, US AI Executive Order) 4. **New risk categories emerging in 2025**: * **Liability risks** - Meta explicitly discussing third-party misuse of open-source AI * **Intellectual property concerns** - Copyright and training data issues becoming prominent * **Energy dependencies** - Companies highlighting reliance on computing power 5. 
**Risk disclosure volume increasing** - Average risk mentions per filing grew from 2.0 in 2022 to 7.0 in 2024 ### Company-Specific Patterns: * **Microsoft**: Most comprehensive risk disclosures (55 total mentions), heavy focus on operational (19) and ethical (17) risks * **Meta**: Balanced concern across operational (16) and regulatory (16) risks, unique focus on open-source AI liability * **Alphabet**: More measured disclosures (10 total), but showing acceleration in 2025 ## Adapt This Pipeline for Your Use Case This pipeline can be adapted for any document analysis need: * **ESG disclosures** - Track sustainability commitments and progress * **Financial metrics tracking** - Extract KPIs across earnings reports * **Competitive intelligence** - Monitor competitor product launches and strategies * **Regulatory compliance monitoring** - Alert on new compliance requirements The combination of Tensorlake's intelligent document processing and MotherDuck's cloud analytics provides a scalable solution for turning unstructured documents into actionable insights. ## Clean Up When you're done with this example: ```bash theme={null} # Deactivate virtual environment deactivate # Optional: Remove local JSON files rm *.json ``` ## Next Steps Now that you have the basics down, explore these resources: * [Python SDK and API Docs](https://docs.tensorlake.ai) * [Page Classification Guide](https://docs.tensorlake.ai/document-ingestion/parsing/page-classification) * [Structured Extraction Guide](https://docs.tensorlake.ai/document-ingestion/parsing/structured-extraction) * [Blog](https://tensorlake.ai/blog) * [Community Slack](https://tlake.link/slack) Try building your own document intelligence pipeline with Tensorlake and MotherDuck today! # Real Estate Agent with LangGraph (using CLI) Source: https://docs.tensorlake.ai/examples/tutorials/real-estate-agent-with-langgraph-cli Build a real estate agent using LangGraph to interact with purchase agreements and answer agent queries. 
In this tutorial you will extract contextual information from documents containing signatures using Tensorlake, LangChain, and OpenAI.

Try out this example using this [Colab Notebook](https://tlake.link/langchain-tool-docs)

A full, runnable example of an already built agent using both the CLI and a Streamlit app is available in the [Tensorlake GitHub repository](https://github.com/tensorlakeai/tensorlake/tree/main/examples/signature-detection/).

## Closing Deals Faster with Signature Detection and LangGraph

Let's set the context for this example: you will build a LangGraph agent for a real estate company that tracks who has signed property documents, when they signed, and who still needs to sign.

You'll learn how to:

* Use Tensorlake's [Signature Detection SDK](https://docs.tensorlake.ai/document-ingestion/parsing/signature)
* Extract and summarize signature status per property
* Create a [LangGraph agent](https://langchain-ai.github.io/langgraph/concepts/why-langgraph/) that uses the structured data to answer questions like:
  * How many signatures were detected in this document and who are the parties involved?
  * What contextual information can you extract about any signatures?
  * Are there any missing signatures on any pages?

This is well suited to automating due diligence and compliance tracking across large sets of signature-heavy documents.

## Prerequisites

* Python 3.10+
* An [OpenAI API key](https://platform.openai.com/api-keys)
* A [Tensorlake API key](https://docs.tensorlake.ai/platform/authentication#api-keys)
* Some [sample real estate documents](https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/real-estate-purchase-all-signed.pdf)
* \[Optional] A [virtual Python environment](https://docs.python.org/3/library/venv.html) to keep dependencies isolated

## Build and test your LangGraph Agent

Installing the `langchain-tensorlake` package pulls in all of the relevant Tensorlake and LangChain packages.
For this tutorial, we're using a `.env` file for our OpenAI and Tensorlake API keys, so you also need to install `python-dotenv` (the package that provides the `dotenv` module). Tensorlake and LangGraph both read the necessary keys from environment variables, so you don't have to pass them explicitly.

```bash theme={null}
pip install langchain-tensorlake python-dotenv
```

In `.env`, set your API keys:

```
OPENAI_API_KEY=your_openai_api_key
TENSORLAKE_API_KEY=your_tensorlake_api_key
```

At the top of the script, import the necessary Tensorlake, LangGraph, LangChain, and helper packages. Then load your environment variables from `.env`:

```python signature-detection-agent.py theme={null}
# helper packages
import os
from dotenv import load_dotenv

# LangGraph and LangChain imports
from langchain_tensorlake import document_markdown_tool
from langgraph.prebuilt import create_react_agent

# 1. Load environment variables
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TENSORLAKE_API_KEY = os.getenv("TENSORLAKE_API_KEY")
```

In this example, we're including the file and questions in the code. You could imagine this as input from the user instead. Make sure the file is at a publicly accessible URL.

```python signature-detection-agent.py theme={null}
# 2. Define the document path
path = "https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/real-estate-purchase-all-signed.pdf"

# 3. Define the questions to be asked
questions = (
    f"1. How many signatures were detected in the document found at {path} and who are the parties involved?\n"
    f"2. What contextual information can you extract about any signatures in the document found at {path}?\n"
    f"3. Are there any missing signatures on any pages in the document found at {path}?"
)
```

Our goal is to create an agent that uses the Tensorlake LangChain tool so that users can ask natural-language questions about complex contractual documents.
This agent will:

* Use the `document_markdown_tool` to:
  * Extract signature data from the document using [Tensorlake's Contextual Signature Detection](https://docs.tensorlake.ai/document-ingestion/parsing/signature)
  * Parse the documents into markdown chunks that are easily consumable by the LLM of our choice (in this case, OpenAI's `gpt-4o-mini`)
* Interpret user questions (e.g. "Which pages are missing signatures?")
* Return structured, accurate answers

```python signature-detection-agent.py theme={null}
# 4. Create the agent
agent = create_react_agent(
    model="openai:gpt-4o-mini",
    tools=[document_markdown_tool],
    prompt=(
        """You are a helpful assistant that answers questions about documents with signature detection data.

        Your responsibilities:
        1. Answer questions based on that loaded data
        2. Help users understand the signature analysis results

        You can answer questions like:
        - How many signatures were found?
        - Which pages contain signatures?
        - Who signed the document?
        - What does the content say around signatures?
        - What type of document is this?
        - Who are the parties involved?
        - What is the date of the signature?
        - Did each party sign the document?
        - Are there any missing signatures on any pages?
        - Which property is missing signatures?
        - Who is the agent for the properties missing signatures?

        Please analyze the above parsed output and answer the questions provided by the user.
        """
    ),
    name="real-estate-agent"
)
```

With the document, questions, and agent defined, you can invoke the agent and print the results.

```python signature-detection-agent.py theme={null}
# 5. Invoke the agent with the document path and questions
print("Processing document with signature detection...")
result = agent.invoke({"messages": [{"role": "user", "content": questions}]})

# 6. Print the result
print("\nAnalysis Results:")
print(result["messages"][-1].content)
```

Finally, run the script to see the agent in action.
It will:

* Parse the document using Tensorlake's signature detection
* Use the LangGraph agent to answer questions about the signatures

```bash theme={null}
(venv) % python3 signature_detection_langgraph_agent.py
Processing document with signature detection...

Analysis Results:
Here are the answers to your questions about the document:

1. **How many signatures were detected in the document and who are the parties involved?**
   - **Number of Signatures:** 8 signatures were detected.
   - **Parties Involved:**
     - **Buyer:** Nova Ellison, with mailing address 123 Tensor Rd, San Francisco, CA 99999.
     - **Seller:** Juno Vega, with mailing address 456 Lake Rd, San Francisco, CA 99999.
     - Additional signatures from Aster Polaris representing the Polaris Group LLC.

2. **What contextual information can you extract about any signatures in the document?**
   - The signatures indicate acceptance of various sections of the agreement, suggesting participation in contractual obligations regarding the purchase of the property described. The document outlines critical details:
     - The agreement was made on September 20, 2025.
     - The purchase price of the property is specified at $150,000.
     - Earnest money of $10,000 is due as part of the agreement by September 15, 2025.
     - Both the buyer and seller affirm their understanding and acceptance of the terms as indicated by their initials next to key sections.

3. **Are there any missing signatures on any pages?**
   - No, all required signatures seem to be present for the buyers, sellers, and the agent. Each page that includes signature spaces has signatures completed as necessary for the agreement to be valid.

If you need any more information or further clarification, feel free to ask!
```

Don't forget to run `deactivate` to leave the virtual environment when you're done testing the agent.
## Next Steps: Build a Tensorlake-backed LangGraph agent yourself

You can start using Signature Detection today in the [Tensorlake Playground](https://cloud.tensorlake.ai/) or via our [Python SDK](https://pypi.org/project/tensorlake/). When you sign up, you get **100 free credits**, enough to process about 100 pages.

If you want to run an already built agent, you can check out this full example in the [Tensorlake GitHub repository](https://github.com/tensorlakeai/tensorlake/tree/main/examples/signature-detection/).

We built Tensorlake to empower developers and product teams to do more with documents, faster and with less complexity. We'd love to see what you build with this. You can share it with us or give us feedback in our [Slack Community](https://join.slack.com/t/tensorlakecloud/shared_invite/zt-32fq4nmib-gO0OM5RIar3zLOBm~ZGqKg).

# Workflow Tutorial

Source: https://docs.tensorlake.ai/examples/tutorials/workflow-tutorial

Running a Hello World workflow on Tensorlake Serverless.

This tutorial uses the legacy Graph API (`Graph`, `tensorlake_function`). For new projects, use the current `@application()` / `@function()` API; see the [Quickstart](/applications/quickstart).

In this introductory tutorial, we will:

1. Create a Tensorlake Graph
2. Test Locally
3. Deploy to Tensorlake Serverless
4. Invoke the Graph Remotely
5. Troubleshoot Remote Executions

## Prerequisites

Before proceeding, ensure you have the following:

* **Python Environment**: Python 3.9 or higher installed.
* **Tensorlake Account**: Sign up at [Tensorlake](https://cloud.tensorlake.ai/).
* **API Key**: After creating your account, generate an API key for the Tensorlake CLI and set it as an environment variable:

  ```bash theme={null}
  export TENSORLAKE_API_KEY=
  ```
* **Tensorlake SDK**: Install the Tensorlake SDK using pip:

  ```bash theme={null}
  pip install tensorlake
  ```

## Step 1: Create the Graph

We'll create a graph that takes a name as input and returns a personalized greeting.
### Define the Functions

In `workflow.py`, start by importing the necessary components and defining the functions:

```python theme={null}
from tensorlake import Graph, tensorlake_function

@tensorlake_function()
def hello_name(name: str) -> str:
    return f"Hello {name}!"

@tensorlake_function()
def hello_world(sentence: str) -> str:
    return f"{sentence} Hello world!"
```

### Construct the Graph

Next, construct the graph by specifying the nodes and their connections:

```python theme={null}
graph = Graph(name="hello-world", start_node=hello_name)
graph.add_edge(hello_name, hello_world)
```

Here, the graph consists of two nodes: `hello_name` and `hello_world`, with `hello_name` as the starting node and an edge directing the flow to `hello_world`.

## Step 2: Test Locally

To test the graph locally, add the following code:

```python theme={null}
if __name__ == "__main__":
    invocation = graph.local().queue("Tensorlake")
    outputs = invocation.outputs("hello_world")
    print(outputs[0])
```

Running `python workflow.py` will execute the workflow locally and print the output.

## Step 3: Deploying the Graph

Deploy the graph to the Tensorlake Serverless platform using the following command:

```bash theme={null}
tl deploy workflow.py
```

This command uploads your code to the Tensorlake cloud, making it ready for remote invocations distributed across multiple machines.
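As a rough mental model of what this two-node graph computes, the edge from `hello_name` to `hello_world` behaves like function composition. The sketch below is plain Python with no Tensorlake decorators, so it illustrates only the data flow, not deployment, distribution, or the SDK's invocation API. `run_pipeline` is a hypothetical helper introduced here for illustration.

```python
def hello_name(name: str) -> str:
    return f"Hello {name}!"


def hello_world(sentence: str) -> str:
    return f"{sentence} Hello world!"


def run_pipeline(name: str) -> str:
    """Hypothetical helper: pass the start node's output to the next node."""
    return hello_world(hello_name(name))


if __name__ == "__main__":
    print(run_pipeline("Tensorlake"))  # Hello Tensorlake! Hello world!
```

Checking the composed result this way can be a quick sanity test of the function logic before involving the platform.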
## Step 4: Invoking the Graph Remotely

Once the graph is deployed, you can invoke it remotely by modifying the main code:

```python theme={null}
if __name__ == "__main__":
    invocation = graph.queue("Tensorlake")
    outputs = invocation.outputs("hello_world")
    print(outputs[0])
```

Alternatively, you can obtain a reference to the deployed graph and invoke it:

```python theme={null}
from tensorlake import TensorlakeClient

if __name__ == "__main__":
    client = TensorlakeClient()
    graph = client.get_graph("hello-world")
    invocation = graph.queue("Tensorlake")
    outputs = invocation.outputs("hello_world")
    print(outputs[0])
```

The Graph is called with the input of the starting node of the graph, in this case `hello_name`, so the input to the graph is the `name` parameter.

The result of calling a graph is an `Invocation`. Since data applications can take a long time to complete, calling `outputs` on an invocation will wait for the invocation to be complete.

In either case, the result of the individual functions can be retrieved using the invocation id and the name of the function.

## Step 5: Monitoring and Troubleshooting

Monitor your graph's invocations and logs using the Tensorlake CLI:

```bash theme={null}
tl invocations list
tl invocations logs --function-name
```

These commands help you track executions and diagnose any issues that may arise during remote invocations.

## Conclusion

By following this tutorial, you've successfully created, deployed, and invoked a simple "Hello World" workflow using Tensorlake Serverless. This foundation enables you to build more complex data-intensive AI workflows with ease.

# How much does Tensorlake cost and what are the limits?

Source: https://docs.tensorlake.ai/faqs/pricing-limits-faq

Tensorlake pricing: free tier, On-Demand, Pro, and Enterprise plans, per-second sandbox billing, CPU and RAM rates, plan limits, and resource defaults.
Common questions about Tensorlake's pricing model, plan tiers, and the resource limits that apply to sandboxes and image builds. For the live pricing page, see [tensorlake.ai/pricing](https://www.tensorlake.ai/pricing). ## How is sandbox-as-a-service pricing typically structured? Sandbox-as-a-service products are usually billed on usage rather than a flat seat fee. The common dimensions are: * **Compute time** — CPU-seconds or vCPU-hours while the sandbox is running. * **Memory-time** — GB-seconds or GB-hours of allocated memory. * **Storage** — disk for snapshots, suspended-state preservation, and persisted volumes. * **Network egress** — outbound bandwidth. * **Build minutes** — time spent building custom images, sometimes metered separately. Tensorlake Cloud uses **per-second sandbox billing** with separate CPU and RAM rates, plus per-GB-hour storage for suspended snapshots. ## Does Tensorlake have a free tier? Yes. The **Free tier is \$0 forever with no credit card required.** It includes: * 2 concurrent sandboxes * 1 core / 1 GB RAM / 10 GB disk per sandbox * Unmetered sessions * Self-serve docs and community Slack support * SOC 2 Type 2 compliance The Free tier's 1 core / 1 GB RAM / 10 GB disk allowance matches the [SDK defaults](#what-are-the-default-resources-for-a-tensorlake-sandbox), so most starter scripts will run unchanged. ## How much does a Tensorlake Sandbox cost? | | On-Demand | Pro | | -------------------- | -------------------- | -------------------------------- | | Base fee | \$0 | \$250 / month | | CPU | \$0.05 / hr per core | \$0.03 / hr per core *(40% off)* | | RAM | \$0.015 / hr per GB | 40% off metered usage | | Concurrent sandboxes | up to 100 | 1,000 | Sandbox runtime is **billed by the second**. One credit is worth \$0.01. Snapshots are billed **per GB-hour while suspended**. ## What are the Tensorlake plan tiers? 
| Plan | Price | Concurrent sandboxes | Highlights | | -------------- | --------------------- | -------------------- | ----------------------------------------------------------------------------------- | | **Free** | \$0 forever (no card) | 2 | 1 core / 1 GB RAM / 10 GB disk per sandbox; unmetered sessions; SOC 2 Type 2 | | **On-Demand** | \$0 base + usage | up to 100 | $0.05/hr per core, $0.015/hr per GB RAM; best-effort support | | **Pro** | \$250 / month + usage | 1,000 | \$0.03/hr per core (40% off); snapshot + resume; 24×7 Slack/email; 24h P1 SLA | | **Enterprise** | Custom quote | Unlimited | HIPAA + SOC 2 Type 2; SSO/SAML; RBAC; DPA; in-VPC / on-prem; 1h P1 SLA; resident SA | See [tensorlake.ai/pricing](https://www.tensorlake.ai/pricing) for current rates. ## Is sandbox billing per-second or per-hour in Tensorlake? **Per-second.** Rates are quoted per hour for clarity ($0.05/hr per core on On-Demand, $0.03/hr per core on Pro), but you only pay for the seconds the sandbox is running. This is why suspending a named sandbox between agent turns is meaningfully cheaper than leaving it running idle — see [suspend/resume](/sandboxes/lifecycle). ## Do I pay for a suspended Tensorlake Sandbox? A suspended sandbox **consumes no compute**, so CPU and RAM time are not billed while suspended. Snapshots are billed **per GB-hour** while suspended. ## What are the default resources for a Tensorlake Sandbox? By default, `Sandbox.create()` allocates: | Resource | Default | How to override | | --------- | -------------------- | ---------------------------------- | | CPUs | `1.0` | `cpus=` (SDK) or `--cpus` (CLI) | | Memory | `1024 MB` | `memory_mb=` / `--memory` | | Root disk | `10240 MiB` (10 GiB) | `disk_mb=` / `--disk_mb` | | Timeout | none | `timeout_secs=` / `--timeout-secs` | ## What are the memory and disk limits for a Tensorlake Sandbox? * **Memory:** between `1024` and `8192` MB per CPU core. * **Disk:** between `10240` and `102400` MiB (10–100 GiB). 
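The limits above can be checked client-side before a sandbox is requested. The following is a minimal sketch, assuming the ranges quoted in this FAQ and reading "per CPU core" as scaling both memory bounds by the number of cores. `validate_resources` is a hypothetical helper, not part of the Tensorlake SDK, and the platform remains the source of truth for what it accepts.

```python
# Limits as quoted in this FAQ (assumed to apply as stated).
MIN_MEMORY_MB_PER_CPU = 1024
MAX_MEMORY_MB_PER_CPU = 8192
MIN_DISK_MB = 10240
MAX_DISK_MB = 102400


def validate_resources(cpus: float, memory_mb: int, disk_mb: int) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means OK)."""
    problems = []

    # Memory bounds are assumed to scale linearly with the core count.
    min_mem = MIN_MEMORY_MB_PER_CPU * cpus
    max_mem = MAX_MEMORY_MB_PER_CPU * cpus
    if not min_mem <= memory_mb <= max_mem:
        problems.append(
            f"memory_mb={memory_mb} is outside "
            f"[{min_mem:.0f}, {max_mem:.0f}] for cpus={cpus}"
        )

    # Disk bounds do not depend on the core count.
    if not MIN_DISK_MB <= disk_mb <= MAX_DISK_MB:
        problems.append(
            f"disk_mb={disk_mb} is outside [{MIN_DISK_MB}, {MAX_DISK_MB}]"
        )
    return problems


if __name__ == "__main__":
    print(validate_resources(cpus=1.0, memory_mb=1024, disk_mb=10240))
    print(validate_resources(cpus=4, memory_mb=2048, disk_mb=5000))
```

Returning a list of messages rather than raising on the first failure lets a caller report every out-of-range value at once.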
Disk size is **growth-only**: when creating a sandbox from an `image`, `disk_mb` can grow the root disk at create time. When restoring from a filesystem snapshot, `disk_mb` can grow the root disk at restore time. You cannot shrink the disk below the source. See the [SDK reference](/sandboxes/sdk-reference) and [Lifecycle: resources](/sandboxes/lifecycle) for the full parameter table. ## How many concurrent sandboxes can I run on each Tensorlake plan? | Plan | Concurrent sandboxes | | ---------- | -------------------- | | Free | 2 | | On-Demand | up to 100 | | Pro | 1,000 | | Enterprise | Unlimited | ## What is the default sandbox timeout in Tensorlake? The default `timeout_secs` is `600` (10 minutes). The maximum allowed value depends on your plan: | Plan | Max `timeout_secs` | | ------------------------- | -------------------------------------------------------------- | | Free (unverified) | 3600 (1 hour) | | Free (verified) | 7200 (2 hours) | | On-Demand (pay-as-you-go) | 86400 (24 hours) | | Pro / Enterprise | See [tensorlake.ai/pricing](https://www.tensorlake.ai/pricing) | Verify your free account by adding a credit card — it's used for identity verification only. Setting `timeout_secs=0` requests the plan maximum. When `timeout_secs` elapses: * **Named sandboxes** auto-**suspend** — state is preserved and you can `resume`. * **Ephemeral sandboxes** auto-**terminate** — the state is gone. ```python theme={null} # Auto-suspend a named sandbox after 30 minutes of inactivity sandbox = Sandbox.create(name="my-env", timeout_secs=1800) ``` ## What are the build-time defaults for a Tensorlake sandbox image? Build-time defaults are `cpus=2.0`, `memory=4096 MB`, and `disk=10 GiB`. Override them with `--cpus`, `--memory`, and `--disk_mb` in the CLI, or `cpus`, `memoryMb`, and `diskMb` in SDK options. See [Sandbox Images](/sandboxes/images) for full image-build parameters. # How do sandbox images work in Tensorlake? 
Source: https://docs.tensorlake.ai/faqs/sandbox-images-faq Frequently asked questions about Tensorlake sandbox images — base images, custom images, and building from Python, TypeScript, or a Dockerfile. Sandbox images let you prebuild dependencies, files, and environment setup once, then launch fresh sandboxes from that prepared state. Below are the most common questions about images. ## What is a sandbox image and how is it different from a Docker image? A sandbox image is a prebuilt, named environment used to launch isolated VMs or sandboxes. It packages dependencies, files, and environment setup so each new sandbox starts from the same prepared state. The shape is similar to a Docker image — base layer, build operations like `run`, `copy`, `env`, `workdir` — but: * The runtime target is a **MicroVM**, not a container. * The build artifact is a **sandbox snapshot**, not an OCI image. * The image is **scoped to a project**, not a registry. Tensorlake supports defining sandbox images in Python, TypeScript, or a raw Dockerfile. See [Sandbox Images](/sandboxes/images). ## What sandbox images does Tensorlake provide? Tensorlake ships several base images: * **`ubuntu-minimal`** *(default)* — minimal Ubuntu without systemd. Boots in a few hundred milliseconds. Best for fast cold boot. * **`ubuntu-systemd`** — Ubuntu with a full systemd init system. Use when you need to install packages like Docker or Kubernetes inside the sandbox. * **`debian-minimal`** — minimal Debian 13. * **`ubuntu-vnc`** — desktop-enabled Ubuntu (based on `ubuntu-systemd`) with XFCE, TigerVNC, and Firefox preinstalled. Use for browser automation and [computer-use](/sandboxes/computer-use) workloads. ## Which sandbox image should I use? 
| Workload | Recommended image | | ------------------------------------------- | ----------------- | | Fastest cold boot | `ubuntu-minimal` | | Need systemd, Docker, or Kubernetes | `ubuntu-systemd` | | Debian-based environment | `debian-minimal` | | Browser automation / desktop / computer use | `ubuntu-vnc` | ## How do I build a custom sandbox image? You can build a custom image on top of a base image from Python, TypeScript, or a raw Dockerfile. When you build an image, Tensorlake: 1. Parses the image definition DSL and local source context. 2. Starts a temporary sandbox from the selected base image. 3. Translates supported build operations such as `run`, `copy`, `add`, `env`, and `workdir` into sandbox build steps and executes them there. 4. Creates a snapshot of the prepared sandbox. 5. Registers the image name for that snapshot in your current project. Then create new sandboxes with: ```bash theme={null} tl sbx new --image ``` See [Sandbox Images](/sandboxes/images) for full examples. ## Are Tensorlake sandbox images project-scoped? Yes. Custom images are scoped to the project selected in the CLI. Before registering one: ```bash theme={null} pip install tensorlake tl login ``` ```bash theme={null} npm install tensorlake npx tl login ``` Programmatic image registration from the TypeScript SDK additionally needs `TENSORLAKE_API_KEY`, `TENSORLAKE_ORGANIZATION_ID`, and `TENSORLAKE_PROJECT_ID`. ## Why does `pip install` require `--break-system-packages` in a Tensorlake sandbox? The base Ubuntu and Debian images ship a PEP 668–managed system Python, so `pip install` requires `--break-system-packages` (or an explicit virtualenv). Without the flag, `pip` exits with `error: externally-managed-environment`. The flag is a requirement, not a stylistic choice. ## Can I use my existing Dockerfile with Tensorlake? Yes. Tensorlake can build a sandbox image from a raw Dockerfile alongside the Python and TypeScript image DSLs. 
Tensorlake parses the Dockerfile, starts a temporary sandbox from the selected base image, runs supported build operations (`run`, `copy`, `add`, `env`, `workdir`), captures a snapshot, and registers the resulting image name in your project. After that, launch sandboxes with `tl sbx new --image `. # How does the sandbox lifecycle work? Source: https://docs.tensorlake.ai/faqs/sandbox-lifecycle-faq Frequently asked questions about Tensorlake Sandbox lifecycle — ephemeral vs named sandboxes, suspend, resume, terminate, and timeouts. ## How do you keep an AI agent's environment alive between turns or sessions? There are two general approaches: * **Pause-in-place** — suspend the live VM so its filesystem, memory, and running processes are preserved, then resume under the same identifier. Useful for keeping an agent's working memory and open processes alive between turns. * **Snapshot-and-restore** — capture a reusable artifact you can boot into a fresh VM later. Useful for retrying from a checkpoint or branching experiments. In Tensorlake, these map to [suspend/resume](#how-do-i-suspend-a-tensorlake-sandbox) on named sandboxes and [snapshot/restore](/sandboxes/snapshots) via filesystem or memory snapshots. ## What's the difference between ephemeral and named sandboxes? | | Ephemeral | Named | | -------------------- | ------------------------------------ | ---------------------------------------- | | **Created with** | `tl sbx create` | `tl sbx create ` | | **Suspend / Resume** | Not supported | Supported | | **Reference by** | ID only | ID **or** name | | **Use when** | Short-lived tasks, one-off execution | Multi-step work, persistent environments | Ephemeral sandboxes have no name and run until you terminate them or they time out. Named sandboxes support suspend/resume so you can pause between tasks and pick up where you left off. ## What states does a Tensorlake Sandbox move through? Every sandbox moves through these states. 
Create starts the sandbox in `Pending`; from `Running`, you can suspend (named only), snapshot, or terminate. Ephemeral sandboxes skip `Suspending`/`Suspended`. | State | What it means | | ---------------- | ------------------------------------------------------------------------------- | | **Pending** | Sandbox is being scheduled and booted. Transitions to `Running` automatically. | | **Running** | Sandbox is live and accepting commands, file operations, and process execution. | | **Snapshotting** | A reusable snapshot artifact is being captured. Returns to `Running` when done. | | **Suspending** | Named sandbox is being paused — manually or via `timeout_secs`. | | **Suspended** | Named sandbox is paused. Consumes no compute; state preserved. | | **Terminated** | Sandbox has stopped. Final state; cannot be reversed. | For the full state diagram, see [Lifecycle](/sandboxes/lifecycle). ## How do I suspend a Tensorlake Sandbox? Call `suspend` on a named sandbox. The sandbox transitions to `Suspended`, consumes no compute, and preserves its state. Call `resume` on the same sandbox ID to bring it back to `Running`. Suspend is not supported for ephemeral sandboxes. ## When should I use suspend vs. snapshot? Both preserve sandbox state, but they serve different purposes: * **Suspend** pauses *this* sandbox so you can resume it later under the same ID. * **Snapshot** captures a reusable artifact you can restore into a *new* sandbox. | Scenario | Use Suspend | Use Snapshot | | ------------------------------- | ----------- | ------------ | | Pause and resume later | ✅ | ❌ | | Save cost when idle | ✅ | ❌ | | Keep agent memory alive | ✅ | ❌ | | Retry from a checkpoint | ❌ | ✅ | | Run experiments from same state | ❌ | ✅ | | Clone environment | ❌ | ✅ | See [Snapshots](/sandboxes/snapshots) for save-and-restore. ## What happens when a sandbox times out? Ephemeral sandboxes are **terminated** when their timeout elapses. 
Named sandboxes transition to **Suspending** then **Suspended** when `timeout_secs` elapses, preserving state for later resume rather than terminating. ## Can I reverse a terminated sandbox? No. `Terminated` is the final state. To restore prior state into a new sandbox, capture a snapshot before termination and create a new sandbox from that snapshot. **Coming soon:** a `restart` operation that re-boots a terminated sandbox from its last snapshot. ## How is suspend/resume different from stopping and restarting a Docker container? Stopping a Docker container ends its processes and discards memory state — restarting goes through a cold boot and re-initializes the application. Tensorlake `suspend` pauses a named sandbox in place: filesystem, memory, and running processes are preserved. `resume` brings it back to `Running` under the same sandbox ID with the same in-memory state, so an agent harness can pick up exactly where it left off. # What are Tensorlake Sandboxes? Source: https://docs.tensorlake.ai/faqs/sandboxes-faq Frequently asked questions about Tensorlake Sandboxes — isolated MicroVMs for AI agents, tool calls, builds, and IDEs. ## What is a MicroVM sandbox? A MicroVM sandbox is a lightweight virtual machine — typically backed by [Firecracker](https://firecracker-microvm.github.io/) or CloudHypervisor — designed to start in milliseconds and run a single workload in hardware-isolated form. Unlike containers, each MicroVM has its own kernel, which makes them safer for running untrusted or AI-generated code. They're commonly used for AI agents, code execution, serverless functions, and CI/build workloads. Tensorlake Sandboxes are MicroVMs built on Firecracker and CloudHypervisor. ## What are Tensorlake Sandboxes? Tensorlake Sandboxes are isolated MicroVMs that boot in hundreds of milliseconds, with memory and filesystem preserved across suspend and resume. You can use them to run agent harnesses, execute tool calls, or as VMs for coding agents, builds, and IDEs. 
## How are Tensorlake Sandboxes isolated? Each sandbox is a MicroVM backed by [Firecracker](https://firecracker-microvm.github.io/) and CloudHypervisor. Sandboxes provide hardware-level isolation rather than container-level isolation, so untrusted or AI-generated code can run safely without sharing a kernel with other workloads. ## How fast does a Tensorlake Sandbox start? Tensorlake creates a fresh sandbox in **single-digit milliseconds**; OS boot then completes in a few hundred milliseconds for the default `tensorlake/ubuntu-minimal` image. `tensorlake/ubuntu-systemd`, which includes a full init system and additional tooling (like Docker and Kubernetes support), takes around one second to boot. At peak load, the scheduler creates hundreds of sandboxes per second — see [Architecture](/applications/architecture) for how this differs from Kubernetes pod creation. ## How do I create a Tensorlake Sandbox? Create one on demand from the CLI or the SDK. Pass `image`, `cpus`, and memory to control the runtime. ```bash cli theme={null} tl sbx create ``` ```python sandbox.py theme={null} from tensorlake.sandbox import Sandbox resp = Sandbox.create( image="tensorlake/ubuntu-minimal", cpus=4, memory_mb=8192, ) ``` ```typescript sandbox.ts theme={null} import { Sandbox } from "tensorlake"; const resp = await Sandbox.create({ image: "tensorlake/ubuntu-minimal", cpus: 4, memoryMb: 8192, }); ``` See the [Quickstart](/sandboxes/quickstart) for a full walkthrough. ## What can I run inside a Tensorlake Sandbox? Anything the OS supports. Common workloads include: * Agent harnesses and [tool calls](/sandboxes/tool-calls) * LLM-generated or untrusted code * Browser automation and [computer use](/sandboxes/computer-use) * Builds, tests, and CI workloads * Long-running processes and [PTY sessions](/sandboxes/pty-sessions) * [Networking](/sandboxes/networking) and [tunnels](/sandboxes/tunnels) ## Is Tensorlake compliant with HIPAA and SOC 2? Yes. 
Tensorlake is HIPAA and SOC 2 Type II compliant, supports EU data residency, and offers zero data retention. ## How are Tensorlake Sandboxes different from Docker containers? Tensorlake Sandboxes are MicroVMs backed by Firecracker and CloudHypervisor, which means each sandbox has its own kernel and hardware-level isolation. Docker containers share the host kernel — faster to start, but weaker isolation for running untrusted or AI-generated code. Tensorlake also provides built-in [suspend/resume](/sandboxes/lifecycle) and [snapshots](/sandboxes/snapshots), which aren't part of the standard Docker runtime. If you have an existing Dockerfile, Tensorlake can build a sandbox image from it — see [Sandbox Images](/sandboxes/images). # What are Tensorlake Workflows? Source: https://docs.tensorlake.ai/faqs/workflows-faq Tensorlake Workflows automate and orchestrate complex tasks by composing functions into a Graph of parallel or sequential steps. Tensorlake Workflows let you compose functions into a Graph and execute them in parallel or serially. Below are the most common questions about how Workflows work. ## What is durable execution? Durable execution is a runtime model where the outputs of each step in a long-running program are checkpointed, so a crash, timeout, or retry can resume from the last completed step instead of restarting from scratch. It's the foundation behind systems like Temporal, Inngest, and Restate, and is commonly used for AI agents, long-running data pipelines, multi-step orchestration, and workflows that span minutes, hours, or days. Tensorlake Workflows implement durable execution natively in Python: function outputs are checkpointed to object storage, and on failure the scheduler [replays](/applications/architecture#replay) the call graph and skips already-completed steps. See [Durable Execution](/applications/durability) for the full model. ## What are Tensorlake Workflows? Tensorlake Workflows are a way to automate and orchestrate complex tasks. 
You define a series of functions that execute in parallel or sequentially, and Tensorlake handles distribution, persistence, and recovery. ## What is a Graph in a Tensorlake Workflow? A Graph connects multiple functions together into a workflow. It contains: * **Node** — a function that operates on data. * **Start Node** — the first function executed when the graph is invoked. * **Edges** — represent data flow between functions. * **Conditional Edge** — evaluates input data from the previous function and decides which edges to take. Like an if-else statement. Graphs are workflows whose functions can be executed in parallel, while Pipelines are linear workflows that execute functions serially. ## How do I define a function in a Tensorlake Workflow? Functions are regular Python functions decorated with `@tensorlake_function()`. A function executes in a distributed manner and its output is stored, so if downstream functions fail they can resume from that output. The decorator accepts parameters to configure retry behavior, placement constraints, and more. ## How do I run a sequential pipeline in Tensorlake? Chain nodes with `add_edge` so each function transforms the output of the previous one until reaching the end node. ```mermaid theme={null} flowchart TD node1 --> node2 node2 --> node3 ``` ```python theme={null} @tensorlake_function() def node1(input: int) -> int: return input + 1 @tensorlake_function() def node2(input2: int) -> int: return input2 + 2 @tensorlake_function() def node3(input3: int) -> int: return input3 + 3 graph = Graph(name="pipeline", start_node=node1) graph.add_edge(node1, node2) graph.add_edge(node2, node3) ``` ***Use case:*** Transforming a video into text by first extracting the audio, and then doing Automatic Speech Recognition (ASR) on the extracted audio. ## How do I run workflow steps in parallel in Tensorlake? Add multiple edges from one start node to different downstream functions. 
Each branch produces an output for the same input in parallel.

```mermaid theme={null}
flowchart TD
    start_node --> add_two
    start_node --> is_even
```

```python theme={null}
@tensorlake_function()
def start_node(input: int) -> int:
    return input + 1

@tensorlake_function()
def add_two(input: int) -> int:
    return input + 2

@tensorlake_function()
def is_even(input: int) -> bool:
    return input % 2 == 0

graph = Graph(name="pipeline", start_node=start_node)
graph.add_edge(start_node, add_two)
graph.add_edge(start_node, is_even)
```

***Use case:*** Extracting embeddings and structured data from the same unstructured data.

## How do I parallelize a function across many items (map) in Tensorlake?

When an upstream function returns a sequence and the downstream function accepts a single element of that sequence, Tensorlake automatically parallelizes the downstream function (one invocation per element) across machines and worker processes.

```mermaid theme={null}
flowchart TD
    map(map)
    node1(node)
    node2(node)
    node3(node)
    map --> node1
    map --> node2
    map --> node3
```

```python theme={null}
import requests

@tensorlake_function()
def fetch_urls() -> list[str]:
    return [
        'https://example.com/page1',
        'https://example.com/page2',
        'https://example.com/page3',
    ]

# scrape_page is called in parallel for every element returned by fetch_urls,
# across many machines in a cluster or many worker processes on one machine
@tensorlake_function()
def scrape_page(url: str) -> str:
    content = requests.get(url).text
    return content
```

***Use case:*** Generating an embedding for every chunk of a document.

## How do I aggregate results across many items (reduce) in Tensorlake?

Reduce functions aggregate outputs from one or more functions that return sequences. They have two key properties:

* **Lazy evaluation**: reduce functions are invoked incrementally as elements become available, so they stream over large datasets efficiently.
* **Stateful aggregation**: the aggregated value persists between invocations.
  Each call receives the current accumulated state along with the new element to process.

```mermaid theme={null}
flowchart TD
    map(map)
    reducer1(reducer)
    reducer2(reducer)
    reducer3("reducer<br/>output=accumulator")
    map --> reducer1
    reducer1 --> reducer2
    map --> reducer2
    reducer2 --> reducer3
    map --> reducer3
```

```python theme={null}
from pydantic import BaseModel  # assumed: BaseModel here is pydantic's

@tensorlake_function()
def fetch_numbers() -> list[int]:
    return [1, 2, 3, 4, 5]

class Total(BaseModel):
    value: int = 0

@tensorlake_function(accumulate=Total)
def accumulate_total(total: Total, number: int) -> Total:
    total.value += number
    return total
```

***Use case:*** Aggregating a summary from hundreds of web pages.

## How do I conditionally route data between functions in Tensorlake?

Use `@tensorlake_router` on a function that returns the list of downstream functions to invoke based on custom logic. The router decides at runtime which branch(es) to take.

```mermaid theme={null}
flowchart TD
    router{router}
    router -->|if <condition>| node1
    router -->|else| node2
```

```python theme={null}
from typing import List, Union

@tensorlake_function()
def handle_error(text: str):
    # Logic to handle error messages
    pass

@tensorlake_function()
def handle_normal(text: str):
    # Logic to process normal text
    pass

# Routes data to handle_error or handle_normal based on the
# logic inside the function.
@tensorlake_router()
def analyze_text(text: str) -> List[Union[handle_error, handle_normal]]:
    if 'error' in text.lower():
        return [handle_error]
    else:
        return [handle_normal]
```

***Use case:*** Processing outputs differently based on classification results.

## How do Tensorlake Workflows compare to durable-execution systems like Temporal or Inngest?

Tensorlake Workflows are a durable-execution runtime in the same category as Temporal, Inngest, and Restate: function outputs are checkpointed, and on failure the scheduler replays the call graph from the last completed checkpoint instead of re-running everything from scratch.

The differences are surface and integration:

* **Authored as plain Python.** Functions are decorated with `@tensorlake_function`, with no separate worker SDK or activity/workflow split.
* **One runtime for code and isolation.** Workflows run on the same platform as [Tensorlake Sandboxes](/sandboxes/introduction), so the durable functions and the isolated environments they call into are managed by one scheduler. * **Output storage built in.** Function outputs are persisted to object storage by default, so you can pass large files between steps without external workarounds. See [Architecture](/applications/architecture) and [Durable Execution](/applications/durability) for how checkpointing and replay work. # ChromaDB Source: https://docs.tensorlake.ai/integrations/chroma Build citation-aware RAG systems that trace every AI answer back to exact source locations [ChromaDB](https://www.trychroma.com/) is an open-source vector database designed for AI applications. When combined with Tensorlake's document parsing, you can build RAG systems where every generated statement links directly to its source, complete with page numbers and bounding boxes. This integration is critical for legal, medical, financial, and compliance applications where source verification isn't optional. Run this end-to-end in Colab: ## Why Use Tensorlake + ChromaDB? **The Problem:** * Traditional RAG can't prove where answers come from * Users have no way to verify AI-generated claims * Compliance and audit requirements demand source attribution * Hallucinations are impossible to trace back to their origin **The Solution:** Tensorlake preserves spatial metadata (page numbers, bounding boxes) during parsing. Combined with ChromaDB's vector search, you get citation-aware RAG: every AI response includes exact source locations users can verify. 
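As a rough illustration of this idea (not Tensorlake's or ChromaDB's API), the sketch below keeps spatial metadata in a lookup table keyed by anchor ID and resolves the IDs an answer cites back to page numbers and bounding boxes. The `SourceLocation` class, the `resolve_citations` helper, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceLocation:
    """Hypothetical record of where a piece of text sits in a document."""
    file: str
    page_number: int
    bbox: tuple  # assumed layout: (x, y, width, height)


def resolve_citations(cited_ids, locations):
    """Map cited anchor IDs to source locations, skipping unknown IDs."""
    return {cid: locations[cid] for cid in cited_ids if cid in locations}


# Illustrative values only
locations = {
    "S1.1": SourceLocation("report.pdf", 3, (72, 234, 450, 24)),
    "S1.2": SourceLocation("report.pdf", 3, (72, 260, 450, 36)),
}

# "S9.9" stands in for an ID the model made up; it is silently dropped
resolved = resolve_citations(["S1.2", "S9.9"], locations)
for anchor_id, loc in resolved.items():
    print(f"{anchor_id}: {loc.file}, page {loc.page_number}, bbox {loc.bbox}")
```

Keeping the coordinates outside the text means an unknown or invented ID simply fails to resolve, which is what makes the citations checkable.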
**Key Benefits:**

* **Citation provenance** - Track every claim to specific paragraphs in source documents
* **Audit-ready outputs** - Meet regulatory requirements with verifiable source attribution
* **Hallucination detection** - Instantly verify whether answers are grounded in your documents
* **Production-ready** - Handle legal contracts, medical research, and financial reports with confidence

## Installation

```bash theme={null}
pip install tensorlake chromadb openai
```

## Quick Start

### Step 1: Parse Documents with Spatial Metadata

Tensorlake captures not just text, but coordinates of every element:

```python theme={null}
from tensorlake.documentai import DocumentAI, ParseStatus

doc_ai = DocumentAI()

file_id = doc_ai.upload("research_paper.pdf")
result = doc_ai.parse_and_wait(file_id)
assert result.status == ParseStatus.SUCCESSFUL

# Access parsed pages with spatial metadata
pages = result.pages  # Each page contains fragments with bounding boxes
```

### Step 2: Build Citation-Ready Chunks

Create sections with embedded citation anchors while storing coordinates separately:

```python theme={null}
import json

def build_citation_chunks(result, file_name: str):
    """
    Chunk document by sections, embedding citation anchors inline
    while storing bounding boxes and page numbers in metadata.
    """
    sections = []
    current_section = None

    # Group by section headers
    for page in result.pages:
        page_num = page.page_number
        for fragment in page.page_fragments:
            content = fragment.content.content.strip()
            bbox = fragment.bbox

            if fragment.fragment_type == "section_header":
                if current_section:
                    sections.append(current_section)
                current_section = [{
                    "page_number": page_num,
                    "text": content,
                    "bbox": bbox
                }]
            elif content and current_section is not None:
                current_section.append({
                    "page_number": page_num,
                    "text": content,
                    "bbox": bbox
                })

    if current_section:
        sections.append(current_section)

    # Build chunks with citation anchors
    chunks, metadatas, ids = [], [], []
    for sec_idx, section in enumerate(sections, start=1):
        citation_map = {}
        text_lines = []

        for elem_idx, element in enumerate(section, start=1):
            anchor_id = f"S{sec_idx}.{elem_idx}"
            # Prefix each element with its anchor so the LLM can cite it.
            # The bracket format is an illustrative choice; any marker works
            # as long as the prompt in Step 4 describes the same format.
            text_lines.append(f"[{anchor_id}] {element['text']}")
            citation_map[anchor_id] = {
                "page_number": element["page_number"],
                "bbox": element["bbox"]
            }

        chunks.append("\n".join(text_lines))
        metadatas.append({
            "file": file_name,
            "citations": json.dumps(citation_map)
        })
        ids.append(f"section-{sec_idx}")

    return chunks, metadatas, ids
```

### Step 3: Store in ChromaDB

```python theme={null}
import os

import chromadb
from chromadb.utils import embedding_functions

client = chromadb.Client()
collection = client.create_collection(
    name="citation_aware_rag",
    embedding_function=embedding_functions.OpenAIEmbeddingFunction(
        api_key=os.environ["OPENAI_API_KEY"],
        model_name="text-embedding-3-small"
    )
)

# Add chunks with citation metadata
chunks, metadatas, ids = build_citation_chunks(result, "research_paper.pdf")
collection.add(documents=chunks, metadatas=metadatas, ids=ids)
```

### Step 4: Query with Automatic Citation Extraction

```python theme={null}
import json
import os
from typing import List

from openai import OpenAI
from pydantic import BaseModel

class CitedResponse(BaseModel):
    response: str
    citations: List[str]

def query_with_citations(query: str, k: int = 3):
    # Retrieve relevant chunks
    results = collection.query(query_texts=[query], n_results=k)
    context = "\n---\n".join(results["documents"][0])

    # Build citation lookup map
    citation_map = {}
    for metadata in results["metadatas"][0]:
        citations = json.loads(metadata["citations"])
        file_name = metadata["file"]
        for anchor_id, coords in citations.items():
            citation_map[anchor_id] = {
                "file": file_name,
                "page": coords["page_number"],
                "bbox": coords["bbox"]
            }

    # Generate response with citation extraction
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    completion = client.chat.completions.parse(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": """Answer questions based on provided context.
Citation IDs appear in square brackets in the context (format: [S1.2]).
List the IDs you used, without brackets, in your citations list.
Only cite sources you actually used."""
            },
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"}
        ],
        response_format=CitedResponse
    )
    response = completion.choices[0].message.parsed

    # Map citations to source locations
    print(f"Answer: {response.response}\n")
    print("Sources:")
    for anchor_id in response.citations:
        # Tolerate IDs returned with their brackets still attached
        cite = citation_map.get(anchor_id.strip("[]"))
        if cite:
            print(f"  • {cite['file']} | Page {cite['page']}")
            print(f"    Location: {cite['bbox']}")

    return response, citation_map

# Example usage
query_with_citations("What methodology did the researchers use?")
```

## Output

```
Answer: The researchers used SMOTE (Synthetic Minority Over-sampling Technique) to address class imbalance in their training dataset, which improved model performance on minority classes.

Sources:
  • research_paper.pdf | Page 3
    Location: {'x': 72, 'y': 234, 'width': 450, 'height': 24}
  • research_paper.pdf | Page 3
    Location: {'x': 72, 'y': 260, 'width': 450, 'height': 36}
```

Every answer includes verifiable source locations. Users can jump directly to the cited text in the original document.

## How Citation Anchors Work

Traditional RAG loses document structure during chunking. You can't trace an AI answer back to its source.
**Citation-aware RAG changes the architecture:**

1. **During parsing**: Tensorlake captures bounding boxes and page numbers for every text element
2. **During chunking**: We embed citation anchors (short IDs such as `S1.2`) directly in the text while storing coordinates in metadata
3. **During retrieval**: Citation anchors travel with the text, so the LLM sees which sentences came from where
4. **During generation**: The LLM references citation IDs when answering
5. **After generation**: We map citation IDs back to page numbers and bounding boxes for verification

The key insight: **Citation anchors stay with the text during embedding**, ensuring semantic relevance, while **spatial coordinates stay in metadata**, keeping embeddings clean.

## Use Cases

### Legal Document Analysis

Extract contract clauses with exact page and paragraph references for court filings. Citation metadata creates automatic audit trails.

### Medical Research

Build literature review systems that cite specific sentences from research papers. Meet peer review standards with verifiable references.

### Financial Compliance

Generate audit reports where every figure traces back to source statements in regulatory filings. Essential for SOX and SEC compliance.

### Insurance Claims Processing

Verify policy coverage with direct links to relevant policy document sections. Speed up claims review while maintaining accuracy.

### Pharmaceutical Documentation

Meet FDA requirements by citing specific sections in clinical trial reports. Citation metadata enables regulatory audit trails.

## Best Practices

### 1. Optimize Chunking Strategy

Chunk by semantic boundaries (sections, subsections) rather than character counts. Include section headers for better retrieval context.

### 2. Validate Citation Accuracy

Implement validation to ensure citation integrity:

```python theme={null}
def validate_citations(response, citation_map, context):
    """Verify that cited anchors exist in the citation map and in the context."""
    for anchor_id in response.citations:
        if anchor_id not in citation_map:
            print(f"⚠️ Invalid citation: {anchor_id}")
        elif anchor_id not in context:
            # The ID is known, but its anchor text was not part of the retrieved context
            print(f"⚠️ Citation not found in context: {anchor_id}")
```

### 3. Adapt Citation Formats

Customize anchor formats for your domain:

```python theme={null}
# Legal: Paragraph numbering
anchor = f"¶{sec_idx}.{elem_idx}"

# Academic: Section.subsection.item
anchor = f"{section_title}.{elem_idx}"

# Medical: Protocol step IDs
anchor = f"PROTOCOL_{sec_idx}_STEP_{elem_idx}"
```

### 4. Handle Tables and Figures

For complex documents with tables, use Tensorlake's table summaries as separate chunks with their own citation anchors.

### 5. Use Persistent Storage

For production, use `chromadb.PersistentClient(path="./chroma_db")` to avoid re-embedding on restart.

## Complete Example

Try the full working example with research paper analysis: a complete code walkthrough including citation validation and accuracy metrics.

## Resources

* [ChromaDB Documentation](https://docs.trychroma.com)
* [Tensorlake Blog: Why Citations Matter in RAG](https://tlake.link/blog/rag-citations)
* [API Reference](https://tlake.link/docs)

## Need Help?

Join our community to discuss citation-aware RAG:

* [Slack Community](https://tlake.link/slack)

# Databricks
Source: https://docs.tensorlake.ai/integrations/databricks

Build serverless pipelines to ingest unstructured data into Databricks Data Intelligence Platform.

[Databricks](https://www.databricks.com/) is a unified data analytics platform built on Apache Spark, designed for data engineering, machine learning, and analytics at scale.
When combined with Tensorlake's document parsing and serverless agentic application runtime, you can build AI workflows and agents that automate the processing of documents and other unstructured data and land the results in Databricks.

In Databricks's Medallion Architecture, Tensorlake can extract semi-structured (JSON) or structured data from unstructured data and land it in Bronze-stage tables in Databricks. This enables enterprises to increase data coverage in Databricks for downstream analytics use cases.

## Integration Architecture

There are two main ways of integrating Tensorlake with Databricks:

1. **Document Ingestion API**: Use Tensorlake's Document Ingestion API from Databricks Jobs or Notebooks to extract structured data or markdown from documents, then load them into Databricks tables.
2. **Full Ingestion Pipeline on Tensorlake**: Build the entire pipeline of ingestion, transformation, and writing to Databricks on Tensorlake's platform. These pipelines are exposed as HTTP APIs and run whenever data is ingested, eliminating infrastructure management and scaling concerns.

Tensorlake lets you write distributed Python applications, which simplifies building and deploying scalable pipelines.

## Installation

```bash theme={null}
pip install tensorlake databricks-sql-connector pandas pyarrow
```

## Quick Start: Simple Document-to-Database Integration

This example demonstrates the core integration pattern between Tensorlake's DocumentAI and Databricks.
### Step 1: Extract Structured Data from a Document

Define a schema and extract structured data using Tensorlake:

```python theme={null}
from tensorlake.documentai import DocumentAI, StructuredExtractionOptions, ParseStatus
from pydantic import BaseModel, Field
from typing import List

# Define your extraction schema
class CompanyInfo(BaseModel):
    """Basic company information from a document"""
    company_name: str = Field(description="Name of the company")
    revenue: str = Field(description="Annual revenue")
    industry: str = Field(description="Primary industry")

# Initialize DocumentAI
doc_ai = DocumentAI()
records = []

# Extract structured data
parse_id = doc_ai.extract(
    file="https://example.com/company-report.pdf",
    structured_extraction_options=[
        StructuredExtractionOptions(
            schema_name="CompanyInfo",
            json_schema=CompanyInfo
        )
    ]
)
result = doc_ai.result(parse_id=parse_id)

if result.status == ParseStatus.SUCCESSFUL:
    for data in result.structured_data:
        # .data holds the extracted fields as a dictionary
        records.append(data.data)
```

### Step 2: Load Data into Databricks

From a Databricks notebook, convert the extracted records to a Spark DataFrame and append them to a table:

```python theme={null}
import pandas as pd

# `spark` and `display` are provided by the Databricks notebook environment
dataframe = pd.DataFrame(records)
spark_df = spark.createDataFrame(dataframe)

spark.sql("CREATE DATABASE IF NOT EXISTS companies")

(
    spark_df
    .write
    .mode("append")
    .saveAsTable("companies.company_info")
)

print("Loaded", len(records), "records into companies.company_info")
display(spark_df)
```

## How the Integration Works

The integration follows a straightforward pipeline:

1. **Document Processing**: Tensorlake's DocumentAI parses documents and extracts structured data based on your Pydantic schemas
2. **Database Loading**: Data is loaded into Databricks tables using the Spark DataFrame API
3. **Orchestration**: You can orchestrate this process from Databricks Jobs, Notebooks, or any other orchestrator.

## Full Ingestion Pipeline on Tensorlake

The orchestration of your ingestion pipeline happens on Tensorlake.
You can write a distributed and durable ingestion pipeline in pure Python and Tensorlake will automatically queue requests as they arrive and scale the cluster to process data. The platform is serverless, so you only pay for compute resources used for processing data. Architecture diagram showing documents flowing through Tensorlake Platform's serverless Python functions into Databricks tables. For a comprehensive example including page classification, multi-document processing, and advanced analytics, see our tutorial: [Query SEC Filings Stored in Databricks](/examples/tutorials/query-sec-filings-databricks) ## What's Next? Learn more about Tensorlake and Databricks: * [Structured Extraction Guide](/document-ingestion/parsing/structured-extraction) - Define custom schemas * [Applications Documentation](/applications/quickstart) - Deploy production pipelines * [Databricks Documentation](https://docs.databricks.com/) - Learn more about Databricks features # LangChain Source: https://docs.tensorlake.ai/integrations/langchain Build agentic workflows that automatically parse documents on-demand [LangChain](https://www.langchain.com/) is a framework for building LLM-powered applications. When combined with Tensorlake, you can create agents that automatically parse complex documents during conversations—no manual preprocessing needed. This integration is essential for building financial analysts, research assistants, and document QA agents that need to process files on-the-fly. Run this end-to-end in Colab: ## Why Use Tensorlake + LangChain? **The Problem:** * Agents need to process documents mid-conversation but parsing happens outside the workflow * Manual file preprocessing breaks agentic automation * Agents can't extract structured data, tables, or figures without custom code * No way to handle document parsing as a tool in agent toolchains **The Solution:** Tensorlake's LangChain tool enables agents to parse documents on-demand. 
When an agent encounters a file URL, it automatically calls Tensorlake to extract text, tables, and summaries. **Key Benefits:** * **Automatic parsing** - Agents parse documents when needed, no preprocessing * **Tool integration** - Document parsing becomes a native agent capability * **Structured extraction** - Pull metadata, tables, and figures in agent workflows * **Production-ready** - Handle financial reports, research papers, and contracts in conversational AI ## Installation ```bash theme={null} pip install langchain-tensorlake ``` ## Quick Start ### Step 1: Set API Keys ```bash theme={null} export TENSORLAKE_API_KEY="your-tensorlake-api-key" export OPENAI_API_KEY="your-openai-api-key" ``` ### Step 2: Create Agent with Document Parsing Tool Build a LangGraph agent that can parse documents automatically: ```python theme={null} from langchain_tensorlake import document_markdown_tool from langgraph.prebuilt import create_react_agent # Create agent with document parsing capability agent = create_react_agent( model="openai:gpt-4o-mini", tools=[document_markdown_tool], prompt=( """ I have a document that needs to be parsed. Please parse this document and answer the question about it. """ ), name="financial-analyst", ) # Agent automatically parses documents when needed result = agent.invoke({ "messages": [{ "role": "user", "content": "What is the quarterly revenue of Apple based on this file? https://www.apple.com/newsroom/pdfs/fy2025-q2/FY25_Q2_Consolidated_Financial_Statements.pdf" }] }) print(result["messages"][-1].content) ``` **Output:** ``` Based on the financial statements from Apple's second quarter of FY2025, the quarterly revenue figures are as follows: - **Total Net Sales** for the quarter ended March 29, 2025: **$95,359 million**. This total includes revenue from both products and services: - **Products Revenue**: $68,714 million - **Services Revenue**: $26,645 million ``` The agent automatically: 1. Detected the PDF URL in the query 2. 
Called Tensorlake to parse the financial statement
3. Extracted revenue data from tables
4. Answered the question with specific figures

## How Agent-Based Parsing Works

Traditional document pipelines require upfront processing. Agents can't adapt to new files during conversations.

**This integration changes the workflow:**

1. **During conversation**: User mentions a file URL
2. **Tool invocation**: Agent recognizes it needs document content and calls the Tensorlake tool
3. **Parsing**: Tensorlake parses the document and extracts text, tables, and data
4. **Context injection**: Parsed content returns to the agent's context window
5. **Response generation**: Agent answers using the parsed document

The key insight: **Parsing happens on-demand** as part of the agent's reasoning loop, not as a separate preprocessing step.

## Use Cases

### Financial Analysis Agents

Build analysts that parse earnings reports, balance sheets, and regulatory filings on-demand. Extract revenue, expenses, and key metrics without manual preprocessing.

### Research Assistants

Create agents that read research papers mid-conversation. Automatically extract abstracts, methodologies, and experimental results when users ask questions.

### Legal Document Review

Build agents that analyze contracts and legal briefs. Parse clause content, extract key terms, and compare documents during conversations.

### Customer Support Automation

Enable support agents to parse product manuals, warranty documents, and technical specs when helping customers.

### Compliance Monitoring

Create agents that review regulatory filings and compliance documents. Extract required disclosures and flag missing information.

## Best Practices

### 1. Design Clear Agent Prompts

Help agents understand when to use document parsing:

```python theme={null}
agent = create_react_agent(
    model="openai:gpt-4o-mini",
    tools=[document_markdown_tool],
    name="analyst",
    # create_react_agent takes the system instructions via `prompt`
    prompt="""You are a financial analyst. When users provide PDF links
to financial documents, use the document parsing tool to extract
and analyze the content."""
)
```

### 2. Handle Multiple Documents Efficiently

Include all file URLs in one message so the agent can parse each document within a single turn:

```python theme={null}
# Agent can process multiple documents in one turn
result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": """Compare these three quarterly reports:
Q1: https://example.com/q1.pdf
Q2: https://example.com/q2.pdf
Q3: https://example.com/q3.pdf"""
    }]
})
```

### 3. Validate Tool Usage

Monitor agent behavior to ensure proper tool usage:

```python theme={null}
# `conversation` is your list of chat messages
result = agent.invoke({"messages": conversation})

# Check tool calls: AI messages list the calls they requested in `tool_calls`
for message in result["messages"]:
    for call in getattr(message, "tool_calls", None) or []:
        print(f"Tool used: {call['name']}")
        print(f"Arguments: {call['args']}")
```

## Using the Python SDK Directly

For non-agentic workflows, use the Tensorlake Python SDK directly in LangChain pipelines:

```python theme={null}
from tensorlake.documentai import DocumentAI, ParsingOptions, ChunkingStrategy
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

# Parse with Tensorlake
doc_ai = DocumentAI()
file_id = doc_ai.upload("contract.pdf")

parse_options = ParsingOptions(
    chunking_strategy=ChunkingStrategy.SECTION
)
result = doc_ai.parse_and_wait(file_id, parse_options)

# Use in LangChain
documents = [chunk.content for chunk in result.chunks]

# Create embeddings and vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_texts(documents, embeddings)

# Build retrieval chain
retriever = vectorstore.as_retriever()
```

## Complete Example

Try the full working example with a financial analysis agent: a complete code walkthrough.

## What's Next?
**Build advanced agents:** * [LangGraph Documentation](https://langchain-ai.github.io/langgraph/) - Learn agent architectures * [Tensorlake API Reference](https://tlake.link/docs) - Explore parsing options **Combine with vector databases:** * [Qdrant Integration](/integrations/qdrant) - Build RAG agents * [ChromaDB Integration](/integrations/chroma) - Add citation tracking ## Resources * [LangChain Documentation](https://python.langchain.com/) * [langchain-tensorlake Package](https://tlake.link/tools/langchain) * [Tensorlake API Reference](https://tlake.link/docs) ## Need Help? Join our community to discuss agentic workflows: * [Slack Community](https://tlake.link/slack) # MotherDuck Source: https://docs.tensorlake.ai/integrations/motherduck Build serverless pipelines to ingest unstructured data into DuckDB and MotherDuck. # MotherDuck [MotherDuck](https://motherduck.com/) is a serverless analytics platform built on DuckDB, designed for fast, collaborative data analysis. When combined with Tensorlake's document parsing and serverless agentic application orchestration, you get end-to-end document intelligence pipelines. Tensorlake runs your Python application that ingests and extracts information from PDF or other forms of unstructured data, and lands them into MotherDuck. Deploying these pipelines takes minutes. This integration is essential to integrate information from unstructured data sources into DuckDB for analytics. ## Integration Architecture There are two main ways of integrating Tensorlake with MotherDuck: 1. **Document Ingestion API**: Use Tensorlake's Document Ingestion API from your existing workflows to extract structured data from documents, then load the results into MotherDuck. 2. **Full Pipeline on Tensorlake**: Build the entire pipeline of ingestion, transformation, and writing to MotherDuck on Tensorlake's platform. 
These pipelines are exposed as HTTP APIs and run whenever data is ingested, eliminating infrastructure management and scaling concerns. ## Installation ```bash theme={null} pip install tensorlake duckdb==1.3.2 ``` ## Quick Start: Simple Document-to-Database Integration This example demonstrates the core integration pattern between Tensorlake's DocumentAI and MotherDuck. ### Step 1: Extract Structured Data from a Document Define a schema and extract structured data using Tensorlake: ```python theme={null} from tensorlake.documentai import DocumentAI, StructuredExtractionOptions from pydantic import BaseModel, Field from typing import List # Define your extraction schema class CompanyInfo(BaseModel): """Basic company information from a document""" company_name: str = Field(description="Name of the company") revenue: str = Field(description="Annual revenue") industry: str = Field(description="Primary industry") # Initialize DocumentAI doc_ai = DocumentAI() # Extract structured data result = doc_ai.parse_and_wait( file="https://example.com/company-report.pdf", structured_extraction_options=[ StructuredExtractionOptions( schema_name="CompanyInfo", json_schema=CompanyInfo ) ] ) extracted_data = result.structured_data[0].data ``` ### Step 2: Load Data into MotherDuck Connect to MotherDuck and insert the extracted data: ```python theme={null} import duckdb import pandas as pd # Connect to MotherDuck (uses $motherduck_token environment variable) con = duckdb.connect('md:my_database') # Convert extracted data to DataFrame df = pd.DataFrame([extracted_data]) # Create table and load data con.execute("CREATE OR REPLACE TABLE companies AS SELECT * FROM df") # Verify the data result = con.execute("SELECT * FROM companies").fetchdf() print(result) ``` ### Step 3: Query Your Data Run SQL analytics on the document data: ```python theme={null} # Example: Query companies by industry industry_summary = con.execute(""" SELECT industry, COUNT(*) as company_count, AVG(CAST(revenue AS 
DECIMAL)) as avg_revenue FROM companies GROUP BY industry ORDER BY company_count DESC """).fetchdf() print(industry_summary) ``` ## How the Integration Works The integration follows a straightforward pipeline: 1. **Document Processing**: Tensorlake's DocumentAI parses documents and extracts structured data based on your Pydantic schemas 2. **Data Transformation**: Extracted data is converted into a format compatible with DuckDB (typically DataFrames or dictionaries) 3. **Database Loading**: Data is loaded into MotherDuck tables using DuckDB's Python API 4. **SQL Analytics**: Run complex queries, joins, and aggregations on your document data using standard SQL ## Best Practices ### 1. Design Schemas for Queryability Structure your Pydantic models to match your analysis needs: ```python theme={null} # Good schema for SQL analytics class CompanyFiling(BaseModel): company_name: str # For GROUP BY queries filing_date: str # For time series analysis fiscal_year: str # For year-over-year comparisons risk_count: int # For aggregations risks: List[dict] # Nested data for detailed queries ``` ### 2. Handle Nested Data Appropriately Use DuckDB's JSON functions for nested structures: ```python theme={null} # Extract from nested arrays query = """ SELECT company_name, json_extract(item.value, '$.field') as extracted_field FROM companies, json_each(nested_array) as item """ result = con.execute(query).fetchdf() ``` ### 3. 
Process Multiple Documents When working with multiple documents, extract from all documents then load in bulk: ```python theme={null} document_urls = ["url1.pdf", "url2.pdf", "url3.pdf"] all_extractions = [] # Extract data from all documents for url in document_urls: result = doc_ai.parse_and_wait( file=url, structured_extraction_options=[ StructuredExtractionOptions( schema_name="CompanyInfo", json_schema=CompanyInfo ) ] ) all_extractions.append(result.structured_data[0].data) # Load all data at once df = pd.DataFrame(all_extractions) con.execute("CREATE OR REPLACE TABLE companies AS SELECT * FROM df") ``` ## Complete Example with Advanced Features For a more comprehensive example including page classification, multi-document processing, and advanced analytics, see our blog post: [Building Document Intelligence Pipelines with Tensorlake and MotherDuck](https://github.com/tensorlakeai/mother-duck) ## What's Next? Build on this foundation: * [Structured Extraction Guide](/document-ingestion/parsing/structured-extraction) - Define custom schemas * [Applications Documentation](/applications/quickstart) - Deploy production pipelines * [MotherDuck Documentation](https://motherduck.com/docs) - Learn more about MotherDuck features # Integrations Source: https://docs.tensorlake.ai/integrations/overview Connect Tensorlake with your favorite AI frameworks, vector databases, and data platforms. Tensorlake is designed to be the ingestion and processing layer for your AI stack. We provide seamless integrations with leading orchestration frameworks, vector databases, and data platforms to help you build end-to-end workflows. ## Orchestration Frameworks Build powerful agents and RAG pipelines by combining Tensorlake's document understanding with your preferred framework. Use Tensorlake loaders and tools directly within your LangChain chains and agents. Power your OpenAI agents with structured data extraction and document processing tools. 
## Vector Databases Index your parsed documents efficiently for Retrieval Augmented Generation (RAG). Upsert structured points and embeddings directly into Qdrant collections. Store and retrieve document chunks with rich metadata in ChromaDB. ## Data Platforms Turn unstructured documents into analytical tables for SQL queries and business intelligence. Load extracted data into Databricks Delta Tables for large-scale analytics. Query your parsed documents using SQL with DuckDB in the cloud. ## Advanced Processing Enhance your pipelines with specialized processing tools. Semantic chunking that respects the structure of your Tensorlake-parsed documents. # Qdrant Source: https://docs.tensorlake.ai/integrations/qdrant Build RAG applications with richer embeddings from tables, figures, and structured metadata [Qdrant](https://qdrant.tech/) is a high-performance vector database built for AI applications. When combined with Tensorlake's document parsing, you get RAG systems with complete document understanding—including table summaries, figure descriptions, and metadata filtering that traditional parsing misses. This integration is essential for academic research, financial reports, legal documents, and technical documentation where visual content matters. Run this end-to-end in Colab: ## Why Use Tensorlake + Qdrant? **The Problem:** * Traditional parsing loses tables, figures, and reading order * Text-only embeddings miss critical visual content * Generic chunking breaks document structure and context * No way to filter by document metadata before searching **The Solution:** Tensorlake preserves tables, figures, and structure during parsing. Qdrant stores embeddings with rich metadata for filtering. Together, they create RAG systems that understand complete documents. 
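To make the filter-then-search idea concrete, here is a minimal pure-Python sketch of what happens conceptually: keep only the points whose metadata matches, then rank the remainder by cosine similarity. It does not use Qdrant, and a real vector database applies filters far more efficiently. The `filtered_search` helper, the toy 2-D vectors, and the payload values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(points, query_vector, author, limit=5):
    """Keep points whose payload lists `author`, then rank by similarity."""
    candidates = [p for p in points if author in p["payload"].get("authors", [])]
    ranked = sorted(
        candidates,
        key=lambda p: cosine(p["vector"], query_vector),
        reverse=True,
    )
    return ranked[:limit]


# Toy points: text, table-summary, and figure-summary embeddings side by side
points = [
    {"vector": [1.0, 0.0], "payload": {"authors": ["A. Author"], "type": "text"}},
    {"vector": [0.6, 0.8], "payload": {"authors": ["A. Author"], "type": "table"}},
    {"vector": [1.0, 0.1], "payload": {"authors": ["B. Writer"], "type": "figure"}},
]

hits = filtered_search(points, query_vector=[1.0, 0.0], author="A. Author")
print([h["payload"]["type"] for h in hits])
```

The figure point is close to the query but is excluded because its author does not match, which is the effect metadata filtering is meant to have.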
**Key Benefits:** * **Richer embeddings** - Include table summaries and figure descriptions, not just text * **Advanced filtering** - Search within specific authors, dates, or document types * **Better chunking** - Semantic sections (abstract, methods, results) instead of arbitrary splits * **Accurate results** - Complete context leads to more relevant retrieval ## Installation ```bash theme={null} pip install tensorlake qdrant-client sentence-transformers ``` ## Quick Start ### Step 1: Parse Documents with Tensorlake Configure Tensorlake to extract structured data, tables, and figures in one API call: ```python theme={null} from tensorlake.documentai import ( DocumentAI, ParsingOptions, StructuredExtractionOptions, EnrichmentOptions, ChunkingStrategy, TableParsingFormat, TableOutputMode ) # Initialize client doc_ai = DocumentAI() # Define schema for structured extraction json_schema = { "title": "ResearchPaper", "type": "object", "properties": { "title": {"type": "string"}, "authors": {"type": "array", "items": {"type": "string"}}, "abstract": {"type": "string"}, "conference_name": {"type": "string"}, "publication_year": {"type": "integer"} } } # Configure parsing parsing_options = ParsingOptions( chunking_strategy=ChunkingStrategy.SECTION, # Chunk by semantic sections table_parsing_strategy=TableParsingFormat.TSR, table_output_mode=TableOutputMode.MARKDOWN, ) structured_extraction_options = [StructuredExtractionOptions( schema_name="ResearchPaper", json_schema=json_schema, )] enrichment_options = EnrichmentOptions( figure_summarization=True, figure_summarization_prompt="Summarize this figure in the context of the research paper.", table_summarization=True, table_summarization_prompt="Summarize this table's data and significance to the research.", ) # Parse document file_url = "https://example.com/research_paper.pdf" parse_id = doc_ai.parse( file_url, parsing_options, structured_extraction_options, enrichment_options ) result = doc_ai.wait_for_completion(parse_id) 
``` **What You Get:** * Markdown chunks preserving reading order and structure * Structured metadata (title, authors, conference, year) * Table summaries that capture data meaning * Figure descriptions explaining visual content ### Step 2: Create Embeddings and Store in Qdrant Transform parsed content into embeddings and store with metadata for filtering: ```python theme={null} from qdrant_client import QdrantClient, models from sentence_transformers import SentenceTransformer from uuid import uuid4 # Initialize Qdrant and embedding model qdrant_client = QdrantClient(":memory:") # Use QdrantClient(url="...") for production model = SentenceTransformer("all-MiniLM-L6-v2") collection_name = "research_papers" # Create collection qdrant_client.create_collection( collection_name=collection_name, vectors_config=models.VectorParams( size=model.get_sentence_embedding_dimension(), distance=models.Distance.COSINE ) ) # Extract structured metadata structured_metadata = result.structured_data[0].data if result.structured_data else {} # Create embeddings for text chunks points = [] for chunk in result.chunks: embedding = model.encode(chunk.content).tolist() payload = { **structured_metadata, 'content': chunk.content, 'type': 'text' } points.append(models.PointStruct( id=str(uuid4()), vector=embedding, payload=payload )) # Create embeddings for table summaries for page in result.pages: for fragment in page.page_fragments: if fragment.fragment_type == "table" and fragment.content.summary: embedding = model.encode(fragment.content.summary).tolist() payload = { **structured_metadata, 'content': fragment.content.summary, 'type': 'table', 'page': page.page_number } points.append(models.PointStruct( id=str(uuid4()), vector=embedding, payload=payload )) # Create embeddings for figure summaries for page in result.pages: for fragment in page.page_fragments: if fragment.fragment_type == "figure" and fragment.content.summary: embedding = model.encode(fragment.content.summary).tolist() 
payload = { **structured_metadata, 'content': fragment.content.summary, 'type': 'figure', 'page': page.page_number } points.append(models.PointStruct( id=str(uuid4()), vector=embedding, payload=payload )) # Upload to Qdrant qdrant_client.upsert(collection_name=collection_name, points=points) # Create index for filtering by author qdrant_client.create_payload_index( collection_name=collection_name, field_name="authors", field_schema="keyword", ) print(f"Uploaded {len(points)} embeddings to Qdrant") ``` **Key Insight:** Table and figure summaries get their own embeddings. When users search for data, they'll retrieve both text explanations and visual content summaries. ### Step 3: Query with Filtering Combine semantic search with metadata filtering for precise results: ```python theme={null} # Query with author filter query = "Does computer science education improve problem solving skills?" query_embedding = model.encode(query).tolist() results = qdrant_client.query_points( collection_name=collection_name, query=query_embedding, query_filter=models.Filter( must=[ models.FieldCondition( key="authors", match=models.MatchValue(value="William G. Griswold"), ) ] ), limit=5, ).points # Display results for point in results: print(f"Title: {point.payload.get('title', 'Unknown')}") print(f"Authors: {point.payload.get('authors', 'Unknown')}") print(f"Score: {point.score:.4f}") print(f"Content: {point.payload.get('content')[:200]}...") print("-" * 80) ``` **Output Example:** ``` Title: Teaching Problem Solving Through CS Education Authors: ['William G. Griswold', 'Jane Smith'] Score: 0.8752 Content: Our study demonstrates that computer science courses significantly improve students' problem-solving abilities across multiple domains... 
-------------------------------------------------------------------------------- ``` ### Step 4: Build an Intelligent Agent Let AI decide when to apply filters based on query intent: ```python theme={null} from openai import OpenAI client = OpenAI() def smart_search(query: str): """Use LLM to extract filters and search Qdrant.""" # Extract filters using LLM response = client.chat.completions.create( model="gpt-4o-mini", messages=[ { "role": "system", "content": "Extract author names from the query if mentioned. Return JSON." }, {"role": "user", "content": query} ] ) filters_json = response.choices[0].message.content # Parse and build Qdrant filter... # Search with extracted filters results = qdrant_client.query_points( collection_name=collection_name, query=model.encode(query).tolist(), query_filter=build_filter(filters_json), limit=5 ) return results # Example queries smart_search("What did John Doe publish about neural networks?") smart_search("Recent papers on transformer architectures from 2024") ``` ## How Rich Embeddings Work Traditional RAG only embeds text chunks. Tables and figures are ignored or poorly represented. **This integration changes the data flow:** 1. **During parsing**: Tensorlake extracts tables and generates summaries like "Comparison of model accuracy across three datasets showing 5-10% improvement" 2. **During embedding**: Both text chunks and table/figure summaries get vectorized separately 3. **During storage**: Metadata (authors, year, conference) is stored as filterable payload fields 4. **During retrieval**: Queries match against text AND visual content summaries 5. **During response**: Results include both explanatory text and data from tables/figures The key insight: **Visual content becomes searchable** through AI-generated summaries, dramatically improving RAG accuracy for documents with tables and figures. ## Use Cases ### Academic Research Search through research papers with complex layouts. 
Retrieve both methodology text and experimental results from tables in a single query. ### Financial Reports Parse earnings reports, balance sheets, and regulatory filings. Filter by company, quarter, and fiscal year while searching across narrative and tabular data. ### Legal Documents Handle contracts and regulatory documents with proper structure. Filter by contract type, date range, or parties while searching clause content. ### Technical Documentation Process API docs, manuals, and specifications. Search across text explanations and data tables showing parameters, configurations, or benchmarks. ### Medical Literature Parse clinical studies with methods, results, and patient data tables. Filter by study type, date, or authors while retrieving complete experimental context. ## Best Practices ### 1. Optimize Chunking Strategy Use semantic chunking by section rather than fixed token limits. Include section headers for context. ```python theme={null} parsing_options = ParsingOptions( chunking_strategy=ChunkingStrategy.SECTION, # Not FIXED_SIZE ) ``` ### 2. Create Strategic Indices Index frequently filtered fields for performance: ```python theme={null} # Common filters qdrant_client.create_payload_index(collection_name, "authors", "keyword") qdrant_client.create_payload_index(collection_name, "publication_year", "integer") qdrant_client.create_payload_index(collection_name, "conference_name", "keyword") ``` ### 3. Handle Large Tables Intelligently For tables spanning multiple pages, create summaries with proper context: ```python theme={null} enrichment_options = EnrichmentOptions( table_summarization_prompt="""Summarize this table's data including: 1. What the table measures 2. Key findings or patterns 3. How it relates to the paper's main argument""" ) ``` ### 4. 
Combine Multiple Filter Conditions Build complex queries that narrow results effectively: ```python theme={null} query_filter=models.Filter( must=[ models.FieldCondition(key="publication_year", range=models.Range(gte=2020)), models.FieldCondition(key="conference_name", match=models.MatchValue(value="NeurIPS")) ] ) ``` ### 5. Validate Embeddings Quality Spot-check that table and figure summaries are meaningful: ```python theme={null} # Review summaries before embedding for page in result.pages: for fragment in page.page_fragments: if fragment.fragment_type == "table": print(f"Table summary: {fragment.content.summary}") # Ensure it's descriptive, not just "A table of data" ``` ## Complete Example Try the full working example with research paper search: Complete code walkthrough including agent-based filtering and result ranking ## What's Next? **Build on this foundation:** * [ChromaDB Integration](/integrations/chroma) - Add citation tracking to results * [Chonkie Integration](/examples/cookbooks/chonkie) - Advanced semantic chunking strategies **Learn more about document AI:** * [Blog: Fix Broken Context in RAG](https://tlake.link/blog/chonkie) - Why chunking matters ## Resources * [Qdrant Documentation](https://qdrant.tech/documentation/) * [Tensorlake API Reference](https://tlake.link/docs) * [Blog: RAG Without Compromise](https://tlake.link/smarter-rag) ## Need Help? Join our community to discuss RAG architectures: * [Slack Community](https://tlake.link/slack) # Deploy the Indexify Workflow Engine Source: https://docs.tensorlake.ai/opensource/ce_engine Learn how to deploy the open-source Indexify workflow engine on your own infrastructure — server, container images, and graphs. Indexify is a Function Execution Engine, not a container orchestration engine. Self Hosting Indexify involves - 1. Deploying the Indexify Server 2. Building and Deploying Container Images capable of running functions of your workflow. 3. 
Deploying your Graphs to Indexify Server using the SDK. Indexify doesn't depend on Kubernetes or Docker. It can be deployed in any of the following ways: * Bare Metal and VMs * Docker Compose * Kubernetes (or any other container orchestrator) ## Bare Metal #### Download Indexify Server and Python SDK * Download the server from the [releases page](https://github.com/tensorlakeai/indexify/releases/tag/v0.2.9). * Install the Python SDK using pip. ```bash theme={null} pip install indexify ``` #### Start Server Start the server on one machine. Read the configuration reference to understand how to customize the server to use blob stores for storing function outputs. ```bash theme={null} indexify-server ``` By default the server saves graphs, invocations, and function outputs in the `indexify_storage` folder. If you delete that folder, you will need to redeploy the graphs. We have a replicated mode for the server, based on the Raft consensus protocol. It's not public yet because we are still figuring out how to make it easy for developers to configure, operate, and use. If you are interested in using it, please reach out to us. #### Start Executor Start as many executors as you want on different machines. ```bash theme={null} indexify-cli executor --server-addr : [--image-version=] ``` Arguments: * `server-addr`: the address of the Indexify server to connect to. Default: `127.0.0.1:8900`. * `image-version`: the version of the image exposed by this executor. Default: `1`. The server uses it to identify which image version an executor runs when placing functions. For more technical details about Executors, see this [README](https://github.com/tensorlakeai/indexify/blob/main/indexify/src/indexify/executor/README.md). ## Docker Compose You can spin up the server and executor using Docker Compose to deploy and run workflows in a production-like environment.
Copy the [docker-compose.yaml file from here](https://raw.githubusercontent.com/tensorlakeai/indexify/refs/heads/main/docker-compose.yaml). ```bash theme={null} docker compose up ``` This starts the server and two replicas of the executor in separate containers. Change the `replicas` field for the executor in the compose file to add more executors (i.e., more parallelism) to the workflow. This uses a default executor container based on Debian and a vanilla Python installation. We generally provide Docker Compose files for local testing of every example project in the repository. ## Kubernetes We provide some basic Helm charts to deploy Indexify on Kubernetes. If you'd like to try with your own cluster, check out the [instructions][operations/k8s]. [operations/k8s]: https://github.com/tensorlakeai/indexify/tree/main/operations/k8s # Indexify Server Configuration Source: https://docs.tensorlake.ai/opensource/configuration Configure the open-source Indexify server with a YAML file: generate a sample with the CLI and tweak it to fit your deployment. The server is configured by a YAML configuration file. The easiest way to start is to generate one with the CLI or download a sample configuration file, and then tweak it to fit your needs. ## Overview ```yaml theme={null} listen_addr: 0.0.0.0:8900 state_store_path: /tmp/indexify/state blob_storage: backend: disk disk: path: /tmp/indexify/blobs ``` * **listen\_addr:** The address and port on which the server listens. Typically you would want to listen on all interfaces. Default: `0.0.0.0:8900`. * **state\_store\_path:** Path to the state store, which holds the state of the graph. The server needs it to resume a graph from where it left off after a failure. Default: `indexify_storage`. * **blob\_storage:** Configuration for storing blobs. Blobs are raw bytes of data stored in the system. Blob storage holds intermediate data passed between functions.
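
As a rough illustration of how the three fields fit together, the following Python sketch renders a configuration with the disk backend. The helper name `render_config` is an assumption made for this example and is not part of Indexify; the nesting mirrors the sample in the Overview.

```python
def render_config(
    listen_addr: str = "0.0.0.0:8900",
    state_store_path: str = "indexify_storage",
    blob_path: str = "/tmp/indexify/blobs",
) -> str:
    """Render a minimal server config that uses the disk blob backend."""
    return (
        f"listen_addr: {listen_addr}\n"
        f"state_store_path: {state_store_path}\n"
        "blob_storage:\n"
        "  backend: disk\n"
        "  disk:\n"
        f"    path: {blob_path}\n"
    )


if __name__ == "__main__":
    # Print the rendered YAML; redirect it to a file of your choice.
    print(render_config(state_store_path="/tmp/indexify/state"))
```

Keeping the defaults in one function makes it easy to produce per-environment variants without hand-editing YAML.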
## Blob Storage Configuration Blob storage is used to store the output of functions. We support two forms of blob storage at the moment: disk and S3. ### Disk ```yaml theme={null} blob_storage: backend: disk disk: path: /tmp/indexify-blob-storage ``` ### S3 Storage For S3 storage, make sure the following three environment variables are set. Once they are configured, the S3 integration takes care of the rest. * `AWS_ACCESS_KEY_ID` * `AWS_SECRET_ACCESS_KEY` * `AWS_REGION` ```yaml theme={null} blob_storage: backend: s3 s3: path: "s3://my-bucket/" ``` You can create the bucket using the following command: ``` aws s3api create-bucket --bucket my-bucket --region us-east-1 ``` ### Executor health checks You can fetch the current Executor health status by issuing an HTTP GET request to its monitoring endpoint, e.g.: ``` curl localhost:7000/monitoring/health {"status": "nok", "message": "A Function Executor health check failed", "checker": "GenericHealthChecker"} ``` An HTTP status of 200 means that the Executor is healthy; 503 means that a health check failed. We recommend automatically restarting the Executor when its health checks fail, because this helps mitigate recurring bugs in functions, OS drivers, and so on. You can specify a different monitoring host and port for an Executor using its CLI arguments `--monitoring-server-host` and `--monitoring-server-port`. # Kubernetes Deployment with Helm Source: https://docs.tensorlake.ai/opensource/deployment Deploy the open-source Indexify server and executor on Kubernetes using the provided Helm chart as a starting point. Deployment on Kubernetes is done using Helm charts. We provide a Helm chart to deploy the Indexify server and executor on Kubernetes. The Helm chart is very lightweight and is meant to be a starting point for deploying Indexify on Kubernetes. It is not meant to be a one-size-fits-all solution, and you may need to customize it to fit your needs.
But we are always open to contributions and feedback to make it more useful for everyone! ## Components * [Server][server.yaml] - The API server which manages the graph and orchestrates the execution of functions. It is deployed as a StatefulSet; the server is a stateful application, and it requires a persistent volume to store the state of the execution graph. The server requires a blob store to store the output of functions. * [Executor][executor.yaml] - Executors are the workers that execute the functions in the graph. They are deployed as a Deployment. The helm chart deploys the default executor by default, but you can customize it to deploy many executors with the same Indexify server. * Blob Store - The blob store is used to store the output of functions. We require using an S3 like service for the blob store. The credentials are stored as a Kubernetes secret and mounted as environment variables in the server. [server.yaml]: https://github.com/tensorlakeai/indexify/blob/main/operations/k8s/helm/templates/server.yaml [executor.yaml]: https://github.com/tensorlakeai/indexify/blob/main/operations/k8s/helm/templates/executor.yaml ## Values The Helm chart is parameterized with the following values: ### Required #### Blob Store * `blobStore.endpoint` - The endpoint for the blob store. * `blobStore.config.s3.accessKey` - The access key for the blob store. * `blobStore.config.s3.secretKey` - The secret key for the blob store. * `blobStore.allowHTTP` - Whether to allow HTTP connections to the blob store. #### Server * `server.persistance.size` - The size of the persistent volume for the server. * `server.persistance.storageClass` - The storage class for the persistent volume. This will depend on the cloud provider you are using. For example, `standard` for GCP, `gp2` for AWS, etc. ### Optional #### Server * `server.image` - The Docker image for the Indexify server. * `server.ingress.enabled` - Whether to create an Ingress resource for the server. 
#### Executors * `executors.replicas` - The number of replicas for the executor. 1 by default. * `executors.image` - The Docker image for the Indexify executor. * `executors.name` - The name of the executor. `indexify-executor` by default. ##### Adding additional executors You can add additional executors by adding a new entry to the `executors` list in the values file. For example, the following values define four executors, each with its own name and image: ```yaml theme={null} executors: - name: indexify-pdf-blueprint-downloader image: tensorlake/pdf-blueprint-download:latest replicas: 1 - name: indexify-pdf-blueprint-parser image: tensorlake/pdf-blueprint-pdf-parser:latest replicas: 1 - name: indexify-pdf-blueprint-lancdb image: tensorlake/pdf-blueprint-lancdb replicas: 1 - name: indexify-pdf-blueprint-st image: tensorlake/pdf-blueprint-st:latest replicas: 1 ``` Each executor entry must specify its `name`, `image`, and `replicas`. ## Dependencies ### Blob Store We recommend using an S3-like service for the blob store. Our [local][helm/local.yaml] Helm values override uses MinIO for this. See the environment variable patch (`minio/api.yaml`) for how this gets configured. [helm/local.yaml]: https://github.com/tensorlakeai/indexify/blob/main/operations/k8s/helm/local.yaml #### GCP * You'll want to create an [HMAC key][gcp-hmac] to use as `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. * Set `AWS_ENDPOINT_URL` to `https://storage.googleapis.com/` [gcp-hmac]: https://cloud.google.com/storage/docs/authentication/hmackeys #### Other Clouds Not all clouds expose an S3 interface. For those that don't, check out the [s3proxy][s3proxy] project. That said, we'd love help implementing your native blob storage of choice! Please open an [issue][issue] so that we can discuss how that would look for the project.
[s3proxy]: https://github.com/gaul/s3proxy [issue]: https://github.com/tensorlakeai/indexify/issues # Indexify Source: https://docs.tensorlake.ai/opensource/indexify Open Source Compute Engine for Agentic Data Applications We have open-sourced the core task scheduler and orchestration engine that powers Tensorlake Workflows. You can use it to build a platform for your company if you are building data-intensive workflows for AI Applications or Data Science projects. Our Document Ingestion Engine is built on Indexify as well. Indexify is an alternative to - * Apache Airflow, Prefect and Temporal - For building and running data-intensive workflows. * Apache Spark/Ray/Dask - For doing map-reduce style parallel processing. In short, you get a durable workflow engine like Temporal, with the ability to scale out like Spark on 1000s of nodes in a cluster. And, in addition, you can - Run workflows across many different clouds or compute providers. For example, you can run the control plane in AWS, store data in S3, and run the data processing on GCP, Azure, Lambda Labs, Digital Ocean, etc. This enables you to acquire the right compute resources, at the right price point, with the developer experience of building and testing workflows locally and deploying and running them in the cloud. ## Quick Start Let's create a simple workflow to summarize a website on-demand! It demonstrates how to build and serve a workflow as a **remote** Python API. Install the Indexify SDK. ```bash theme={null} pip install indexify openai requests ``` We will write two functions, `scrape_website` and `summarize_text`. We create a Graph `website-summarizer` that executes the scrape function, and then executes the summarizer with the outputs of the scraper. 
```python theme={null} from tensorlake.applications import function, application @application() @function() def scrape_website(url: str) -> str: import requests text: str = requests.get(f"http://r.jina.ai/{url}").text return summarize_text(text) @function() def summarize_text(text: str) -> str: from openai import OpenAI completion = OpenAI().chat.completions.create( model="gpt-4o-mini-2024-07-18", messages=[ {"role": "system", "content": "You are a helpful assistant. Generate a summary of this website"}, {"role": "user", "content": text}, ], ) return completion.choices[0].message.content ``` The graph can be run as-is, this is useful for testing. ```python theme={null} from tensorlake.applications import run_remote_application, Request request: Request = run_remote_application(scrape_website, "https://en.wikipedia.org/wiki/Golden_State_Warriors") result: str = request.output() print(result) ``` When it's time to consume your graph from other applications, you can serve it as an API. You can run the server in production in many ways, but here we run this in our laptop to show how it works. ```bash theme={null} indexify-cli server-dev-mode ``` Note: The `indexify-cli` command is part of the `indexify` python package previously installed. This starts the following processes - * **Server:** Orchestrates functions in the graph, stores execution state, and hosts Remote Graph APIs. * **Executor:** Runs the individual functions in the graph. Once the server is ready, you can deploy the graph - ```python theme={null} from indexify import RemoteGraph RemoteGraph.deploy(g, server_url="http://localhost:8900") ``` Once the graph is deployed, you can get a reference of the Graph in any application. ```python theme={null} graph = RemoteGraph.by_name(name="website-summarizer", server_url="http://localhost:8900") ``` You can now call the graph as a remote API. 
```python theme={null} invocation_id = graph.run(block_until_done=True, url="https://en.wikipedia.org/wiki/Golden_State_Warriors") results = graph.output(invocation_id, "summarize_text") ``` # Monitoring and Troubleshooting Source: https://docs.tensorlake.ai/opensource/monitoring Monitor and troubleshoot the open-source Indexify server with Prometheus metrics exposed at /metrics/service and other internal endpoints. This guide provides information on how to monitor and troubleshoot your Indexify deployment using available metrics and internal endpoints. ## Prometheus Metrics Indexify Server exposes Prometheus metrics at `{server_url}/metrics/service`. These metrics are valuable for monitoring system health and performance. ### Key Metrics for Monitoring | Metric | Description | Use Case | | ----------------------------------------------------------------- | ----------------------------------------- | ---------------------------------------- | | `active_invocations_gauge` | Count of uncompleted invocations | Monitors system backlog | | `active_tasks` | Count of uncompleted tasks | Tracks overall system load | | `unallocated_tasks` | Count of tasks not allocated to executors | Identifies resource constraints | | `max_invocation_age_seconds` | Age of oldest running invocation | Detects stuck invocations | | `max_task_age_seconds` | Age of oldest running task | Identifies abnormally long-running tasks | | `task_completion_latency_seconds_bucket_count{outcome="Success"}` | Count of successfully completed tasks | Tracks successful throughput | | `task_completion_latency_seconds_bucket_count{outcome="Failure"}` | Count of failed tasks | Monitors system errors | | `task_completion_latency_seconds_bucket` | Distribution of task completion times | Analyzes performance trends | Additional internal metrics are available and documented in the `/metrics/service` endpoint. 
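
To act on these metrics without a full Prometheus setup, a script can scrape the endpoint and read individual values. The sketch below parses the Prometheus text exposition format using only the standard library. The helper name `parse_metrics` and the sample payload are made up for illustration, and the parser deliberately handles only simple unlabeled samples; real output from `/metrics/service` may contain labels and additional metric families.

```python
def parse_metrics(text: str) -> dict:
    """Parse unlabeled samples from Prometheus text exposition output."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments (# HELP / # TYPE), and labeled samples.
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            values[parts[0]] = float(parts[1])
        except ValueError:
            continue
    return values


# Illustrative payload only; not captured from a real server.
SAMPLE = """\
# TYPE active_tasks gauge
active_tasks 12
# TYPE unallocated_tasks gauge
unallocated_tasks 3
max_task_age_seconds 47.5
"""

metrics = parse_metrics(SAMPLE)
if metrics.get("unallocated_tasks", 0) > 0:
    print(f"{metrics['unallocated_tasks']:.0f} tasks are waiting for an executor")
```

For live data, the text could be read with `urllib.request.urlopen` against `{server_url}/metrics/service` and decoded before being passed to the parser.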
## Troubleshooting Endpoints Indexify provides internal endpoints for deeper troubleshooting when issues are detected through metrics: | Endpoint | Description | Use Case | | ----------------------------------------- | -------------------------------------- | -------------------------------- | | `{server_url}/internal/allocations` | Lists current allocations per executor | Debugging executor load balance | | `{server_url}/internal/unallocated_tasks` | Lists all tasks not being allocated | Identifying resource bottlenecks | ## Common Troubleshooting Scenarios ### High Count of Unallocated Tasks If `unallocated_tasks` metric is high: 1. Check if you have executors capable of handling the specific task types. Note: Make sure all unallocated tasks have at least one executor with the `--function` argument matching the unallocated task. 2. Check the current load on executors by examining the `/internal/allocations` endpoint to see if executors are at capacity 3. Examine executor logs for errors 4. Examine server logs for errors ### Abnormally Long-Running Tasks If `max_task_age_seconds` is unusually high: 1. Use `/internal/allocations` to identify the specific long-running tasks 2. Check the `stdout` of long running tasks using the Indexify UI at `{server_url}/ui` 3. Check the executor logs handling these tasks 4. Consider adjusting resource allocations or timeouts ### Failed Tasks If `task_completion_latency_seconds_bucket_count{outcome="Failure"}` is increasing: 1. Make sure Invocation Input Payload is valid. 2. Check for the root cause in the `stdout` or `stderr` of failed tasks using the Indexify UI at `{server_url}/ui` 3. Verify your Compute Graph code to see if logs seen in stdout or stderr can be explained. # Access Control Source: https://docs.tensorlake.ai/platform/access-control Organization and project hierarchy, role-based permissions, and user management. 
This guide covers Tensorlake's access control system, including user roles and permissions for both dashboard users and programmatic API access. The system manages access through a hierarchical structure of organizations and projects, with role-based permissions that apply to both human users and API keys. Dashboard users interact with organizations and projects through the [Tensorlake Cloud dashboard](https://cloud.tensorlake.ai), while developers can also use API keys for programmatic access. API keys operate at the project level with project-member permissions, making them ideal for integrating Tensorlake into applications and automated workflows. ## Entities and Relationships ### Organizations Organizations are the top-level entity in our system. Each organization can contain multiple projects and has its own set of members. Organizations implement a role-based access control system with two distinct roles: admin and member. These roles determine what actions users can perform within the organization and its projects. ### Projects Projects exist within organizations and serve as containers for related resources that require similar access control. Unlike team-based structures, projects are designed to group resources that should be protected and accessed in a consistent manner. This resource-centric approach allows for fine-grained access control based on the nature of the resources rather than organizational hierarchy. ### API Keys API Keys function exclusively at the project level and have the same permissions as project members. They can: * Access project resources and data * Make API calls within the project scope They cannot perform any administrative actions. API Keys are ideal for service accounts, automated processes, and integrations that need programmatic access to project resources. ## Membership Rules Project membership is tied to organization membership.
A user must first be a member of an organization before they can be added to any projects within that organization. This hierarchical structure ensures proper access control across your resources. API keys have the same permissions as project members. This means they can access project resources but cannot perform administrative actions that are reserved for project admins. ## Roles and Permissions The following table categorizes permissions by functional area to clearly show what each role can do: ### Organization Management Permissions | Permission | Org Admin | Org Member | Project Admin | Project Member | API Key | | -------------------------------- | :-------: | :--------: | :-----------: | :------------: | :-----: | | Create new projects | ✅ | ❌ | ❌ | ❌ | ❌ | | Invite users to organization | ✅ | ❌ | ❌ | ❌ | ❌ | | View organization members | ✅ | ✅ | ❌ | ❌ | ❌ | | Manage organization member roles | ✅ | ❌ | ❌ | ❌ | ❌ | | Remove members from organization | ✅ | ❌ | ❌ | ❌ | ❌ | ### Project Access Control Permissions | Permission | Org Admin | Org Member | Project Admin | Project Member | API Key | | ------------------------------------- | :-------: | :--------: | :-----------: | :------------: | :-----: | | Access all projects automatically | ✅ | ❌ | ❌ | ❌ | ❌ | | Add organization members to a project | ✅ | ❌ | ✅ | ❌ | ❌ | | Remove members from a project | ✅ | ❌ | ✅ | ❌ | ❌ | | Change project member roles | ✅ | ❌ | ✅ | ❌ | ❌ | | View projects they are members of | ✅ | ✅ | ✅ | ✅ | N/A | ### Resource Access Permissions | Permission | Org Admin | Org Member | Project Admin | Project Member | API Key | | ----------------------------- | :-------: | :--------: | :-----------: | :------------: | :-----: | | View project resources\* | ✅ | ❌ | ✅ | ✅ | ✅ | | Manage project resources\* | ✅ | ❌ | ✅ | ❌ | ❌ | | Create API keys for a project | ✅ | ❌ | ✅ | ❌ | ❌ | | Create Webhooks for a project | ✅ | ❌ | ✅ | ❌ | ❌ | \*Project resources include Files, Datasets, and Webhooks. 
API Keys are specific to a project *and* user. ### Organization Roles in Detail #### Organization Admin Organization admins have complete control over the organization. They have full access to all projects within the organization, regardless of whether they are explicitly added as project members. Organization admins are the only users who can create new projects, invite users to join the organization, manage the roles of organization members, and remove members from the organization. Org admins can also [configure and enforce SSO](/platform/sso) for the organization. #### Organization Member Organization members have limited access within the organization. They can view the member list of the organization but can only access projects to which they have been explicitly added. Their permissions within accessible projects are determined by their project role. ### Project Roles in Detail #### Project Admin Project admins have management capabilities within their specific project. They can add existing organization members to their project, remove members from the project, and change the roles of project members. However, project admins cannot invite new users to the organization—this capability is reserved for organization admins. #### Project Member Project members have basic access to the project resources according to the system's permission model. They can view and interact with the project but cannot modify membership or roles. ## Invitation Process User invitations can only be created by organization admins. When creating an invitation, the admin specifies the invitee's email address, organization role, and a default project and project role. Upon invitation creation, an email is sent to the invitee with a unique link. After the invitee authenticates and accepts the invitation, the system verifies that the account email matches the invitation email. 
Once verified, the user is added to the organization with the specified role and to the default project contained in the invitation. Invitations expire 7 days after creation. The invitation can only be accepted if the account accepting it has the same email as the invitation. ## Usage Guidelines Projects should be used strategically to group resources that require similar access control patterns. Rather than organizing by teams or departments, consider organizing projects based on resource types, security requirements, or functional boundaries. Consider the following best practices: * Create projects based on resource sensitivity and access requirements * Group resources that are commonly accessed together in the same project * Use projects to implement the principle of least privilege by limiting access to only necessary resources * Regularly audit project membership and permissions * Rotate API keys periodically for enhanced security ## Frequently Asked Questions Organization roles (admin/member) control access to organization-wide functions like creating projects and inviting users. Project roles (admin/member) control access to specific project resources and project-level management. Yes, your organization role and project roles are independent. An organization member can be a project admin for specific projects they're added to. Only organization admins can invite new users. Go to your organization settings and create an invitation with the user's email, organization role, and default project assignment. No, API keys have the same permissions as project members. They can access project resources but cannot manage users, create projects, or perform administrative functions. Invitations expire after 7 days. If expired, an organization admin will need to create a new invitation for the user. Yes, as long as there is one Organization admin, other admins can be removed or changed to be a member, regardless of if they made the Organization. 
Organization admins automatically have access to all projects. Organization members can only see and access projects they've been explicitly added to. To find out which projects you have access to, go to the organization and click on the dropdown menu to select a project. You can also get a full list by going to `https://cloud.tensorlake.ai/organizations/[YOUR_ORG_ID]/projects`.

Yes, both project admins and project members can create API keys for their projects. Only organization admins and project admins can manage other aspects of projects.

Organize projects based on resource sensitivity and access requirements rather than team structure. Group resources that need similar access controls and are commonly used together. Datasets, files, API keys, and Webhooks are organized by project.

Regularly review project membership, especially when team members change roles or leave. Also rotate API keys periodically for enhanced security.

# Authentication
Source: https://docs.tensorlake.ai/platform/authentication

Learn how to make API requests to the Tensorlake APIs

## Tensorlake Account

You need a Tensorlake Cloud account to make API requests, whether you're using the Python SDK, the `tensorlake` npm package, a generated TypeScript client, or calling the REST API directly. You can create an account on [cloud.tensorlake.ai](https://cloud.tensorlake.ai).

## API keys

API keys are project-specific credentials that allow programmatic access to resources within a project. Each API key exists solely within the context of its project and has the [same permissions as a project member](/platform/access-control#project-access-control-permissions). API keys cannot have organization-level permissions.

#### Creating API keys

1. Go to the [Tensorlake Dashboard](https://cloud.tensorlake.ai)
2. Select the project to make API calls against.
3. Create an API key.

Every Tensorlake API key starts with `tl_apiKey_*`.
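As a minimal sketch, a key can be read from the environment and sanity-checked against the `tl_apiKey_` prefix before it is handed to a client. The helper name `load_api_key` is hypothetical and not part of the Tensorlake SDK; `TENSORLAKE_API_KEY` is the environment variable name used elsewhere in these docs.

```python
import os

API_KEY_PREFIX = "tl_apiKey_"  # prefix stated in the docs above


def load_api_key(env=None, var="TENSORLAKE_API_KEY"):
    """Hypothetical helper: read the key from the environment and sanity-check it."""
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set")
    if not key.startswith(API_KEY_PREFIX):
        raise ValueError(f"{var} does not look like a Tensorlake API key")
    return key
```

Failing early on a missing or malformed key tends to give a clearer error than a rejected API request later on.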
## Tensorlake Python SDK

The [Tensorlake SDK](https://github.com/tensorlakeai/tensorlake) leverages API keys for authentication. For example:

```python your_app.py theme={null}
from tensorlake.documentai import DocumentAI, ParsingOptions

API_KEY="tl_apiKey_xxxx"

doc_ai = DocumentAI(api_key=API_KEY)
file_id = doc_ai.upload(path="/path/to/file.pdf")
parse_id = doc_ai.read(file_id=file_id, parsing_options=ParsingOptions())
```

## TypeScript

For sandbox and cloud/application APIs, use the official `tensorlake` npm package.

```bash theme={null}
npm install tensorlake
export TENSORLAKE_API_KEY=your-api-key-here
```

```ts theme={null}
import { Sandbox } from "tensorlake";

// Sandbox.create is awaited, as in the sandbox examples elsewhere in these docs.
const sandbox = await Sandbox.create({
  apiKey: process.env.TENSORLAKE_API_KEY,
});
```

These examples assume Node.js 18+, Bun, or Deno so `fetch` is available globally.

The current npm package covers sandboxes plus cloud/application APIs. Document Ingestion TypeScript examples still use a generated client from Tensorlake's public OpenAPI schema:

```bash theme={null}
npm install @hey-api/client-fetch
npm install -D @hey-api/openapi-ts typescript
npx @hey-api/openapi-ts \
  -i https://docs.tensorlake.ai/api-reference/openapi.yaml \
  -o src/lib/tensorlake \
  -c @hey-api/client-fetch \
  -p @hey-api/sdk
```

Use the npm package inline on pages that already show Python examples:

* [Sandboxes](/sandboxes/introduction)

Use the generated client on Document Ingestion pages:

* [Document Ingestion Quickstart](/document-ingestion/quickstart)
* [Read Documents](/document-ingestion/parsing/read)

## REST API

REST API requests need to include the API key in the header as a Bearer Token.
For example, to make a request to the Document File Management API, you would use the following curl command:

```bash theme={null}
curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/files \
  --header 'Authorization: Bearer ' \
  --header 'Content-Type: multipart/form-data' \
  --form 'labels={}'
```

For enterprise SSO, see [Single Sign-On (SSO)](/platform/sso).

## Frequently Asked Questions

You can regenerate your API key from the Tensorlake Dashboard. Go to your project settings, find the API keys section, and create a new key. Remember to update your applications with the new key.

No, API keys are project-specific. Each API key only works within the context of the project where it was created. You'll need separate API keys for each project.

API keys have the same permissions as a project member. They cannot have organization-level permissions and are limited to project-specific operations.

Create a new API key first, update your applications to use the new key, then delete the old key from the dashboard. This ensures no downtime during rotation.

Immediately delete the compromised API key from the Tensorlake Dashboard and generate a new one. Update all applications using the old key as soon as possible.

API keys do not have explicit expiration dates. Each API key will remain active until it is deleted.

This usually means your API key is invalid, was deleted, or is not properly included in the Authorization header as a Bearer token. Verify your key exists in the project you expect on the Tensorlake Dashboard, then verify format and header structure.

Use environment variables or secure credential management systems. Never hardcode API keys in your source code or commit them to version control.

API keys are for programmatic access and machine-to-machine communication, while user authentication is for interactive dashboard access. API keys don't expire with user sessions. API keys inherit project member permissions.
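The Bearer-token header described in the REST API section can be sketched with the Python standard library. `build_request` is a hypothetical helper, the URL is the Document File Management endpoint from the curl example, and the request is only constructed here, never sent.

```python
import urllib.request

FILES_URL = "https://api.tensorlake.ai/documents/v2/files"  # from the curl example


def build_request(api_key: str, url: str = FILES_URL, method: str = "POST") -> urllib.request.Request:
    """Hypothetical helper: attach the API key as a Bearer token."""
    if not api_key:
        # An empty key produces a malformed "Bearer " header, a common cause of auth errors.
        raise ValueError("api_key is empty")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method=method,
    )


req = build_request("tl_apiKey_xxxx")
```

Inspecting `req.get_header("Authorization")` before sending is a quick way to check header structure when troubleshooting rejected requests.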
# Billing
Source: https://docs.tensorlake.ai/platform/billing

Tensorlake Cloud uses usage-based billing; see the Billing page in your dashboard for current usage and invoices.

Billing in Tensorlake Cloud is usage-based. For detailed information, please refer to the Billing page.

# EU Endpoints
Source: https://docs.tensorlake.ai/platform/eu-data-residency

Use Tensorlake's EU endpoints for data residency and compute in Europe, available for both Document Ingestion and Workflows APIs.

Tensorlake APIs are available in the EU region, to provide data residency and compute in Europe.

## Document Ingestion API

```python theme={null}
from tensorlake.documentai import (DocumentAI, Region)

doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_API_KEY", region=Region.EU)
```

## EU HTTP Endpoint

The EU HTTP endpoint is `https://api.eu.tensorlake.ai/`.

## Workflows API

```python theme={null}
# Assumption: Region is the same enum used in the Document Ingestion example;
# adjust this import if your SDK version exposes it elsewhere.
from tensorlake.documentai import Region
from tensorlake.functions_sdk import Graph

graph = Graph(name="my_graph", region=Region.EU)
```

## API Keys and Webhooks

The same API key can be used for both US and EU regions. Webhooks are supported in both regions.

# Tensorlake Cloud Playground Overview
Source: https://docs.tensorlake.ai/platform/playground/overview

Get started with the Tensorlake Cloud Playground, an interactive interface to parse and extract data from documents.

The [Tensorlake Playground](https://cloud.tensorlake.ai/playground) is our interactive visual interface that allows you to parse and extract data from documents with the full capabilities of our API. You can use our sample documents, or upload your own documents.
In addition to the sandbox experience, this is where you can:

* Create Organizations and Projects
* Manage access across your organization and Projects
* Create and manage API keys
* Visually explore the output of jobs you ran through the SDK or API

Start parsing documents and extracting structured data without writing any code

# Sample Documents
Source: https://docs.tensorlake.ai/platform/playground/sample-documents

Understand Tensorlake's capabilities through our sample documents, available in the Playground

# Security Policies
Source: https://docs.tensorlake.ai/platform/security

Tensorlake's data storage, encryption, and compliance practices for enterprise customers in healthcare, financial services, legal, and government.

At Tensorlake, we take data security and privacy extremely seriously. We serve customers across healthcare, financial services, legal, and government sectors who entrust us with high-stakes personally identifiable information (PII) and mission-critical data. We understand that protecting this sensitive information isn't just important; it's essential to our customers' operations and regulatory compliance.

We have implemented robust, enterprise-grade security measures to ensure the highest level of protection. This report outlines our data storage practices, encryption protocols, and compliance adherence.

## Data Storage

The two types of data that may be stored on Tensorlake are documents and parsed output. Below are the default policies for data storage.

There are options for Hybrid and Fully-Disconnected On Prem usage of Tensorlake.
Contact us at [support@tensorlake.ai](mailto:support@tensorlake.ai) if you have to ensure your documents and parsed data never leave your servers. 1. **Documents**: Documents can either be uploaded to Tensorlake or provided via a link. * Uploaded documents are stored in accordance with our data storage policy below. * Linked documents are not stored by Tensorlake. 2. **Parse Output**: The output from a parse job includes a markdown representation, a document layout, structured data, and page classifications. * All of the output of your parse job is stored in accordance with our data storage policy below. ### Storage Policies 1. **Storage Location**: We utilize Amazon Web Services (AWS) S3 for storing data. Data is encrypted at rest and in transit. 2. **Access Permissions**: Access to AWS S3 storage is strictly limited to the internal document processing services. This ensures that only authorized and authenticated processes can interact with the stored data, minimizing the risk of unauthorized access. 3. **Data Retention**: For all users, you can delete your documents and data from our servers at any time using our APIs. 4. **Data Usage**: For all users, we never use any of your data for training purposes. We respect the privacy of our customers and ensure only they have access to the data from their requests. ## Deleting Your Data While your data is stored securely in accordance with our storage policies outlined above, we understand that you may want to remove specific documents or parse outputs at any time. If you need to request complete data deletion and/or access audit logs from Tensorlake, please contact us at [support@tensorlake.ai](mailto:support@tensorlake.ai) . 1. **Document Deletion**: You can delete any uploaded document from our servers using the document ID (`doc_id`). Once deleted, the document is permanently removed from our storage and cannot be recovered. The output of any parse job that referenced this document will still be accessible. 2. 
**Parse Output Deletion**: You can delete the output from any parse job using the parse ID (`parse_id`). This removes all associated data including the markdown representation, document layout, structured data, and page classifications. Deleting the parse output will not delete the document that was parsed. 3. **Immediate Deletion**: When you request deletion, the data is immediately removed from our active systems. This ensures that you maintain full control over your data lifecycle. 4. **API Access**: Data deletion can be performed through our API endpoints, allowing you to integrate data management into your workflows and compliance processes. Whether you need to comply with data retention policies, respond to data subject requests, or simply manage your storage usage, you have the tools to delete your data whenever needed. ## Encryption 1. **Encryption at Rest**: All data stored in AWS S3 is encrypted at rest using industry-standard encryption algorithms. This means that even if unauthorized individuals were to gain access to the stored data, they would not be able to decipher it without the proper encryption keys. 2. **Encryption in Transit**: We employ encryption protocols to protect data in transit. All communication between our systems and the data storage is conducted over secure channels using encryption mechanisms such as SSL/TLS. This ensures that data remains confidential and tamper-proof during transmission. If you have any further questions or require additional information regarding our security practices, please don't hesitate to reach out to [support@tensorlake.ai](mailto:support@tensorlake.ai). ### List of Authorized Subprocessors | Company | Description | Country (where subprocessing takes place) | | ------------------------------- | ----------------------- | ----------------------------------------- | | Amazon Web Services, Inc. 
(AWS) | Cloud Infrastructure | United States, EU | | OpenAI, LLC | Artificial Intelligence | United States, EU | | Anthropic PBC | Artificial Intelligence | United States, EU | | Datadog | Error Monitoring | United States | | PostHog, Inc. | Product Analytics | United States | | Google Cloud | Cloud Infrastructure | United States, EU | | Microsoft Azure | Cloud Infrastructure | United States, EU | | Lambda Labs. | Cloud Infrastructure | United States, EU | # Single Sign-On (SSO) Source: https://docs.tensorlake.ai/platform/sso Configure and enforce SSO for your organization using OIDC or SAML 2.0 identity providers. Single Sign-On (SSO) lets your team sign in to Tensorlake through your company's identity provider (IdP). Once configured, members authenticate with your IdP instead of managing separate Tensorlake credentials. ## Prerequisites * You must be an [organization admin](/platform/access-control#organization-admin). * SSO access must be enabled for your organization. If you don't have access, request it from the SSO settings page in the dashboard. ## Setup Navigate to **Organization Settings > SSO** and click **Configure SSO Connection**. Tensorlake supports two protocols: * **OIDC (OpenID Connect)** — recommended for providers like Google Workspace, Okta, and Auth0. * **SAML 2.0** — supported for providers like Azure AD (Entra ID), OneLogin, and other SAML-compatible IdPs. Provide the following details: | Field | Description | | -------------------------- | --------------------------------------------------------------------------------------------------------------- | | **Domain** | Your organization's email domain (e.g. `yourcompany.com`). Users with this domain will be directed to your IdP. | | **Issuer URL** | The issuer or entity ID from your IdP. | | **Client ID** | The application/client ID assigned by your IdP. | | **Client Secret** | The client secret from your IdP (OIDC only). 
| | **Authorization Endpoint** | The URL where users are sent to authenticate (OIDC). | | **Token Endpoint** | The URL used to exchange authorization codes for tokens (OIDC). | | **ACS URL / SSO URL** | The Assertion Consumer Service URL (SAML). Provided by Tensorlake. | | **Certificate** | The X.509 signing certificate from your IdP (SAML). | **Attribute mapping:** Ensure your IdP sends at minimum the user's email address. Name attributes (first name, last name) are recommended for a complete profile. After saving your configuration, test the connection by performing a test login. 1. Click **Test Connection** in the SSO settings. 2. You will be redirected to your IdP to authenticate. 3. After successful authentication, you are redirected back to Tensorlake and the provider is marked as **Verified**. SSO enforcement cannot be enabled until you have completed a successful test login. Enforcing an untested configuration could lock users out of your organization. SSO enforcement requires all organization members to sign in through your IdP. When enabled, password-based login is disabled for all members — the only way to sign in is through the IdP. To enable enforcement: 1. Designate at least one organization admin as a **bypass user**. This is required before enforcement can be enabled. 2. Toggle **Enforce SSO** in the SSO settings. **Bypass users** are organization admins who retain password-based login for emergency recovery — for example, if your IdP goes down and you need to access the dashboard to disable enforcement. Only organization admins can be designated as bypass users. Enabling SSO enforcement invalidates all existing sessions for the organization. All members (except bypass users) will be signed out and must re-authenticate through the IdP. Only organization admins can enable or disable SSO enforcement and manage bypass users. ## How SSO login works When SSO is configured for your organization, the login flow works as follows: 1. 
A user enters their email on the Tensorlake login page. 2. Tensorlake checks whether the email domain has an SSO provider configured. 3. If SSO is configured and enforced, the user is redirected to the IdP with an `SSO_REQUIRED` response. Password-based login is not available. 4. If SSO is configured but not enforced, the user can choose to sign in with SSO or with their Tensorlake password. 5. After authenticating with the IdP, the user is redirected back to Tensorlake and signed in. New users who sign in via SSO on their first login are automatically provisioned with a Tensorlake account. Users who previously signed in with email OTP may need to be [invited to the organization](/platform/access-control#invitation-process) before SSO login will work for them. ## Frequently Asked Questions Tensorlake supports any IdP that implements OIDC or SAML 2.0. Common providers include Okta, Azure AD (Entra ID), Google Workspace, OneLogin, and Auth0. Each organization supports one SSO provider at a time. If you need to switch providers, update the existing SSO configuration with the new provider's details. All members must authenticate through your IdP. Password-based login is disabled for the entire organization, except for designated bypass users. Only organization admins can be designated as bypass users. At least one bypass user is required before SSO enforcement can be enabled. Bypass users retain password-based login solely for emergency recovery, such as disabling enforcement if your IdP becomes unavailable. API keys are not affected by SSO enforcement. Existing API keys continue to work regardless of SSO settings. API keys authenticate directly with the Tensorlake API and do not go through the IdP login flow. If your IdP is unavailable, members will not be able to sign in. A bypass user (org admin with password-based login retained) can sign in and disable enforcement until the IdP is restored. Yes. SSO is not enforced until you explicitly enable enforcement. 
During setup and testing, all users can continue to sign in with their existing credentials.

No. Existing organization members continue to have access. They will simply be redirected to the IdP on their next login if enforcement is enabled.

# Webhook Configuration
Source: https://docs.tensorlake.ai/platform/webhooks/configuration

Add a webhook endpoint in Tensorlake, select event types to subscribe to, and disable CSRF protection on the receiving endpoint.

## Adding a webhook

*It is important to disable CSRF protection for the endpoint if the framework you use enables it by default.*

To add an endpoint, open the Webhooks tab, provide a URL that you control, and select the event types that you want to listen to.

# Webhooks
Source: https://docs.tensorlake.ai/platform/webhooks/overview

Learn how to use webhooks to get notified when Tensorlake Jobs finish.

You can use webhooks to get notified about the status of Document Ingestion Jobs. Webhooks are configured on a per-project basis. The project associated with the API key that is being used to configure the webhook is the one for which the webhook will be triggered.

*Our webhook events are managed by Svix. Note that none of your data is sent to Svix, only event statuses.*

## Create and Configure Your Webhook

To use Webhooks with Tensorlake, make sure you have created a project on [Tensorlake Cloud](https://cloud.tensorlake.ai).

It is important to disable CSRF protection for the endpoint if the framework you use enables it by default.

Go to the project that you want to create the webhook for, and click into the Webhooks tab.

Only admins for a project can create webhooks. However, members of a project can view webhooks.

Give the webhook a name, and provide a URL that you control.

Configure your webhook by selecting the event types that you want to listen to.

The three event types are:

1.
`tensorlake.document_ingestion.job.created`: Triggered when Tensorlake receives the request to parse a document and kicks off the parsing.
2. `tensorlake.document_ingestion.job.failed`: Triggered when a parsing job fails.
3. `tensorlake.document_ingestion.job.completed`: Triggered when a parsing job succeeds.

## Understand the Webhook Payload

The payload of the webhook will depend on the event type received. The payload will be a JSON object with the following structure:

```json theme={null}
{
  "job_id": "parse_XXX",
  "status": "pending",
  "created_at": "2023-10-01T12:00:00Z"
}
```

```json theme={null}
{
  "job_id": "parse_XXX",
  "status": "failure",
  "error": "Error message describing the failure",
  "created_at": "2023-10-01T12:00:00Z",
  "finished_at": "2023-10-01T12:05:00Z"
}
```

```json theme={null}
{
  "job_id": "parse_XXX",
  "status": "successful",
  "created_at": "2023-10-01T12:00:00Z",
  "finished_at": "2023-10-01T12:05:00Z",
  "usage": {
    "pages_parsed": 10,
    "extraction_input_tokens_used": 0,
    "extraction_output_tokens_used": 0,
    "ocr_input_tokens_used": 0,
    "ocr_output_tokens_used": 0,
    "signature_detected_pages": 0,
    "strikethrough_detected_pages": 0,
    "summarization_input_tokens_used": 0,
    "summarization_output_tokens_used": 0
  }
}
```

### Workflow Payloads

When all functions of an application finish running for a given input:

```json theme={null}
{
  "workflow_name": "workflow_XXXX",
  "invocation_id": "invocation_XXXX",
  "fn_status": {
    "fn_A": "success",
    "fn_B": "failure"
  }
}
```

Possible statuses for each function:

* `success` - The function finished running successfully.
* `failure` - The function failed to finish running.

## Test Your Webhooks in Tensorlake Cloud

We have a UI to test webhooks. After creating a webhook, select the event type you want to test.

## Secure your Endpoints with Signature Verification

Without signature verification, anyone could send forged requests to your endpoint.
Each webhook endpoint has a unique secret to verify the authenticity of incoming requests. To get your webhook secret:

1. Navigate to your project's Webhooks tab in [Tensorlake Cloud](https://cloud.tensorlake.ai)
2. Click on your webhook to view its details
3. Copy the webhook secret displayed in the interface

You can use the [**Svix libraries**](https://docs.svix.com/receiving/verifying-payloads/how) to handle secret verification automatically in your application code.

Alternatively, you can [**manually verify the signature**](https://docs.svix.com/receiving/verifying-payloads/how-manual) using the secret. The verification process involves:

* Extracting the signature from the `svix-signature` header
* Computing the expected signature using your webhook secret
* Comparing the computed signature with the received signature

For more details and other verification methods, refer to the following resources:

* [Receiving Webhooks with Bridge](https://docs.svix.com/receiving/verifying-payloads/receiving-with-bridge)
* [Additional Authentication](https://docs.svix.com/receiving/additional-authentication)
* [Static Source IP Addresses](https://docs.svix.com/receiving/source-ips)

# Document Ingestion Webhook Payload
Source: https://docs.tensorlake.ai/platform/webhooks/payloads/document-ingestion

Schema of the payload Tensorlake sends to your webhook URL when a Document Ingestion parse job finishes: job id, status, and dataset.

The following payload is sent to your configured webhook URL when a job finishes.

```json theme={null}
{
  "job_id": "job_XXXX",
  "status": "success",
  "dataset": "dataset_XXXX"
}
```

If a Parse job is created without a dataset, the `dataset` field will be `null`.

The possible statuses are:

* `created` - The job is created.
* `success` - The job finished successfully.
* `failure` - The job failed to finish.
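A receiver will usually branch on the `status` field. The sketch below parses the payload shape shown above and maps each documented status to an action. `handle_payload` and the returned action strings are hypothetical illustrations, not part of any Tensorlake SDK.

```python
import json

KNOWN_STATUSES = {"created", "success", "failure"}  # statuses listed above


def handle_payload(raw_body: bytes) -> str:
    """Hypothetical dispatcher: decide what to do with a Document Ingestion webhook."""
    payload = json.loads(raw_body)
    job_id = payload["job_id"]
    status = payload["status"]
    dataset = payload.get("dataset")  # None when the job was created without a dataset

    if status not in KNOWN_STATUSES:
        return f"ignore {job_id}: unknown status {status!r}"
    if status == "success":
        scope = f"dataset {dataset}" if dataset else "no dataset"
        return f"fetch results for {job_id} ({scope})"
    if status == "failure":
        return f"alert on {job_id}"
    return f"track {job_id}"
```

Treating unknown statuses as ignorable rather than as errors keeps the receiver tolerant if new statuses are added later.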
# Workflow Webhook Payload
Source: https://docs.tensorlake.ai/platform/webhooks/payloads/workflows

Schema of the payload Tensorlake sends to your webhook URL when a Serverless Workflow invocation finishes, including per-function status.

The following payload is sent to your configured webhook URL when all functions of a Serverless Workflow finish running for a given input.

```json theme={null}
{
  "workflow_name": "workflow_XXXX",
  "invocation_id": "invocation_XXXX",
  "fn_status": {
    "fn_A": "success",
    "fn_B": "failure"
  }
}
```

The following statuses are possible for each function:

* `success` - The function finished running successfully.
* `failure` - The function failed to finish running.

### Example

Once you have configured the webhook, test it with this [example snippet](https://github.com/tensorlakeai/tensorlake/blob/main/examples/webhook.py) that:

1. Parses a document, and sets `deliver_webhook` to `True`.
2. Once the parsing is complete, you should see the webhook payload delivered to your configured webhook URL.

We highly recommend using Svix Play to test whether webhooks are getting delivered.

```python theme={null}
import time

from tensorlake.documentai import DocumentAI
from tensorlake.documentai.parse import ParsingOptions, TableParsingStrategy

API_KEY = "tl_apiKey_XXXX"

doc_ai = DocumentAI(api_key=API_KEY)

# Skip this if you are passing a pre-signed URL to the `DocumentParser`.
# or pass an external URL
# file_id = doc_ai.upload(path="/path/to/files")

job_id = doc_ai.parse(
    # file_id,
    # You can pass in a publicly accessible URL instead of a file_id
    "https://pub-157277cc11d64fb1a11f71cc52c688eb.r2.dev/invoice-example.pdf",
    options=ParsingOptions(
        table_parsing_strategy=TableParsingStrategy.VLM,
        deliver_webhook=True,
    ),
)

print(f"job id: {job_id}")

result = doc_ai.get_job(job_id=job_id)
print(f"job status: {result.status}")

# Poll until the job leaves the pending/processing states.
while True:
    if result.status in ["pending", "processing"]:
        print("waiting 5s...")
        time.sleep(5)
        result = doc_ai.get_job(job_id)
        print(f"job status: {result.status}")
    else:
        if result.status == "successful":
            # save the result to a file
            with open(f"{job_id}.json", "w", encoding="utf-8") as f:
                f.write(result.model_dump_json())
        # Stop polling on success or on any terminal failure status.
        break
```

# Signature Verification
Source: https://docs.tensorlake.ai/platform/webhooks/signature-verification

Verify that incoming webhook messages were actually sent by Tensorlake (via Svix) to prevent forged webhook deliveries.

Webhook signatures let you verify that webhook messages are actually sent by Svix. Without verification, forged webhooks can be sent to your endpoint.

## Svix verification links

* [Verify Webhooks with the Svix Libraries](https://docs.svix.com/receiving/verifying-payloads/how)
* [Receiving Webhooks with Bridge](https://docs.svix.com/receiving/verifying-payloads/receiving-with-bridge)
* [Verifying Webhooks Manually](https://docs.svix.com/receiving/verifying-payloads/how-manual)
* [Additional Authentication](https://docs.svix.com/receiving/additional-authentication)
* [Static Source IP Addresses](https://docs.svix.com/receiving/source-ips)

# Testing Webhooks
Source: https://docs.tensorlake.ai/platform/webhooks/testing

Use Tensorlake's UI to trigger test webhook events for a registered endpoint and verify your receiver before going live.

We have a UI to test webhooks. After creating a webhook, select the event type you want to test.
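When exercising a receiver with test events, it can help to confirm locally that its signature check behaves as expected. The sketch below follows the manual scheme described in the Svix documentation linked under Signature Verification, as best understood here: an HMAC-SHA256 over `id.timestamp.body`, keyed with the base64 part of the `whsec_` secret. The function names are hypothetical, and the `svix-id` and `svix-timestamp` header names are taken from Svix's docs rather than from this page. The Svix libraries remain the safer choice for production.

```python
import base64
import hashlib
import hmac


def expected_signature(secret: str, msg_id: str, timestamp: str, body: bytes) -> str:
    """Compute the base64 HMAC-SHA256 signature for one webhook message."""
    # Svix secrets look like "whsec_<base64>"; the part after the prefix is the key.
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed_content = f"{msg_id}.{timestamp}.".encode() + body
    digest = hmac.new(key, signed_content, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


def verify(secret: str, headers: dict, body: bytes) -> bool:
    """Return True if any v1 signature in the header matches the expected one."""
    expected = expected_signature(
        secret, headers["svix-id"], headers["svix-timestamp"], body
    )
    # The header may carry several space-separated "v1,<signature>" entries.
    for entry in headers["svix-signature"].split():
        version, _, signature = entry.partition(",")
        if version == "v1" and hmac.compare_digest(signature, expected):
            return True
    return False
```

This sketch skips the timestamp tolerance check that Svix recommends to limit replay attacks, so a real receiver would also reject messages whose `svix-timestamp` is too old.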
# Agentic Autoresearch Loop
Source: https://docs.tensorlake.ai/sandboxes/agentic-autoresearch

Autonomously improve an ML training script overnight using an LLM agent that proposes code modifications, races them in parallel sandboxes, and hill-climbs toward lower validation loss.

Automate ML research iteration with an LLM agent that reads your training script, proposes targeted code changes, and validates each change by running it in an isolated sandbox. Inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch) (March 2026), the loop runs unattended: each accepted modification becomes the new baseline, and the agent builds on what it has already learned.

## How it works

1. **Calibrate**: Run the baseline training script in a sandbox to establish a starting validation loss.
2. **Propose**: The agent reads the current best script and experiment history, then proposes *N* candidate modifications (with increasing temperature for diversity).
3. **Race**: All *N* candidates run in parallel TensorLake sandboxes for a fixed step budget.
4. **Evaluate**: Parse `val_loss` from each sandbox's stdout. The candidate with the lowest loss wins the round.
5. **Hill-climb**: Accept the winner only if it beats the current best. Update the baseline script and history.
6. **Repeat**: Loop until the iteration budget is exhausted.

### Why sandboxes are required

The agent emits complete, self-contained Python scripts. Running untrusted LLM-generated training code in your host process would be unsafe: the model could write arbitrary filesystem operations or import unexpected packages. Each candidate runs in an isolated sandbox with a fixed memory ceiling and is killed automatically when the step budget completes.
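The Evaluate and Hill-climb steps can be sketched independently of any sandbox. In this minimal illustration, `run_in_sandbox` is a hypothetical stand-in for whatever executes a script and returns its stdout, and candidates are scored sequentially rather than raced in parallel.

```python
import re

VAL_LOSS_RE = re.compile(r"val_loss:\s*([0-9]*\.?[0-9]+)")


def parse_val_loss(stdout: str) -> float:
    """Return the last val_loss printed, or +inf if the run printed none."""
    matches = VAL_LOSS_RE.findall(stdout)
    return float(matches[-1]) if matches else float("inf")


def hill_climb_round(best_script, best_loss, candidates, run_in_sandbox):
    """Score every candidate and accept the winner only if it beats best_loss."""
    if not candidates:
        return best_script, best_loss, False
    scored = [(parse_val_loss(run_in_sandbox(script)), script) for script in candidates]
    win_loss, win_script = min(scored, key=lambda pair: pair[0])
    if win_loss < best_loss:
        return win_script, win_loss, True  # accepted: becomes the new baseline
    return best_script, best_loss, False   # rejected: keep the current baseline
```

Mapping a missing `val_loss` to infinity means a crashed or malformed candidate can never win a round.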
***

## Prerequisites

```bash theme={null}
pip install tensorlake openai rich python-dotenv
```

Create a `.env` file in your project root:

```
TENSORLAKE_API_KEY="your-api-key-here"
OPENAI_API_KEY="your-openai-key-here"
```

***

## TypeScript SDK starter

The Node.js version follows the same core loop: propose candidates, race them in parallel sandboxes, parse `val_loss`, and keep the winner.

```typescript theme={null}
import { Sandbox } from "tensorlake";

async function evaluateCandidate(script: string) {
  const sandbox = await Sandbox.create({
    cpus: 2.0,
    memoryMb: 4096,
    timeoutSecs: 900,
    allowInternetAccess: false,
  });

  try {
    await sandbox.writeFile(
      "/workspace/train.py",
      new TextEncoder().encode(script),
    );

    const result = await sandbox.run("python", {
      args: ["/workspace/train.py"],
      workingDir: "/workspace",
      timeout: 900,
    });

    // A run that prints no val_loss scores as +Infinity and can never win.
    const match = result.stdout.match(/val_loss:\s*([0-9.]+)/);
    return {
      valLoss: match ? Number(match[1]) : Number.POSITIVE_INFINITY,
      stdout: result.stdout,
      stderr: result.stderr,
    };
  } finally {
    // Always release the sandbox, even if the run throws.
    await sandbox.terminate();
  }
}

const candidates = [
  "print('val_loss: 1.2345')",
  "print('val_loss: 1.1021')",
];

const results = await Promise.all(candidates.map(evaluateCandidate));
const winner = results.reduce((best, cur) =>
  cur.valLoss < best.valLoss ? cur : best,
);

console.log(winner);
```

Use the OpenAI response to generate each candidate script, then keep appending accepted experiments to your history exactly like the Python version below.

***

## Full example

Pass `--smoke` for a fast proof-of-concept run (3 iterations, 2 candidates, 150 training steps, \~5 minutes). The full run uses 8 iterations, 3 candidates, and 300 steps (\~20 minutes).

````python theme={null}
"""
Karpathy Autoresearch Loop + TensorLake Sandboxes
==================================================
Inspired by github.com/karpathy/autoresearch (April 2026, 64k⭐).

The "Karpathy Loop":
  1. Give an AI agent a training script and a plain-English program.md
  2. Agent proposes one targeted code modification per iteration
  3. Run the modified script in isolation for a fixed step budget
  4. If val_loss improves → accept, update the baseline
  5. Repeat overnight → hundreds of validated improvements

TensorLake sandboxes are the right tool here:
  • Modified training code is untrusted (the agent could emit anything)
  • Multiple candidate modifications can race in parallel sandboxes
  • Each sandbox is killed after the time budget, so no runaway experiments
  • The host process never imports/executes model weights or agent code

This example:
  State      : current best train.py + full experiment history
  Action     : LLM proposes one self-contained code modification
  Sandbox    : TensorLake runs the modified script for STEPS training steps
  Reward     : Δval_loss = best_val_loss − new_val_loss (positive = improvement)
  Update     : Greedy hill-climbing: accept if reward > 0
  Parallelism: CANDIDATES sandboxes race each iteration (ThreadPoolExecutor)
  Smoke      : --smoke → 3 iters, 2 candidates, 150 steps/run (~5 min)
  Full       : 8 iters, 3 candidates, 300 steps/run (~20 min)
"""

from dotenv import load_dotenv
load_dotenv()

import re
import sys
import time
from dataclasses import dataclass, field
from typing import List, Optional

from openai import OpenAI
from tensorlake.sandbox import Sandbox
from tensorlake.applications import application, function
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from rich.rule import Rule
from rich import box

console = Console()
SMOKE = "--smoke" in sys.argv

# ─── program.md: plain-English guidance for the agent ────────────────────────
# This is what Karpathy calls "the human's job": describe the search space.
PROGRAM_GUIDANCE = """\
You are an ML research agent optimising a character-level MLP language model
trained on a small text corpus using numpy only (no torch/tensorflow).

The training script defines these tunable constants near the top:
  CTX     (context window / n-gram size, int)
  HIDDEN  (hidden layer size, int)
  LR      (learning rate, float)
  BATCH   (batch size, int)
  WDECAY  (L2 weight decay, float)
  STEPS   (DO NOT CHANGE: fixed budget)

Good things to try:
  • Learning rate: sweep 1e-4 → 0.1
  • Learning rate decay: multiply LR by 0.999 each step (add near opt.step)
  • Hidden size: 32 / 64 / 128 / 256
  • Context window CTX: 2 / 4 / 8
  • Weight decay WDECAY: 0 → 1e-4
  • Activation: replace np.tanh with np.maximum(0, x) (ReLU) or np.clip(x,0,None)
  • Initialization scale: change 0.01 to 0.1 or use He/Xavier init
  • Add a second hidden layer (W3, b3) with size HIDDEN//2
  • Momentum: track velocity vectors, apply SGD+momentum
  • Batch size: 16 / 32 / 64

Constraints:
  • numpy only: do not import torch, tensorflow, sklearn
  • STEPS must stay unchanged
  • Output format: last printed line must be val_loss: X.XXXX
"""

# ─── Baseline training script ─────────────────────────────────────────────────
# ~130 K-param nano-GPT on a small public-domain text.
# Runs in ~15 s on CPU (150 steps) / ~30 s (300 steps).
BASELINE_SCRIPT = '''\ import subprocess, sys subprocess.run(["python3","-m","pip","install","numpy","-q","--target","/tmp/pkgs"], capture_output=True, check=False) sys.path.insert(0, "/tmp/pkgs") import numpy as np np.random.seed(42) # ── Corpus (opening of Alice in Wonderland, public domain) ────────────────── TEXT = ( "Alice was beginning to get very tired of sitting by her sister on the bank," " and of having nothing to do: once or twice she had peeped into the book her" " sister was reading, but it had no pictures or conversations in it, and what" " is the use of a book thought Alice without pictures or conversations so she" " was considering in her own mind as well as she could for the hot day made" " her feel very sleepy and stupid whether the pleasure of making a daisy-chain" " would be worth the trouble of getting up and picking the daisies when" " suddenly a White Rabbit with pink eyes ran close by her there was nothing so" " very remarkable in that nor did Alice think it so very much out of the way" " to hear the Rabbit say to itself oh dear oh dear I shall be late when she" " thought it over afterwards it occurred to her that she ought to have wondered" " at this but at the time it all seemed quite natural but when the Rabbit" " actually took a watch out of its waistcoat-pocket and looked at it and then" " hurried on Alice started to her feet for it flashed across her mind that she" " had never before seen a rabbit with either a waistcoat-pocket or a watch to" " take out of it and burning with curiosity she ran across the field after it" ) * 4 # ~4 800 chars # ── Tokeniser ──────────────────────────────────────────────────────────────── chars = sorted(set(TEXT)) vocab = len(chars) stoi = {c: i for i, c in enumerate(chars)} data = [stoi[c] for c in TEXT] split = int(0.9 * len(data)) train_d, val_d = data[:split], data[split:] # ── Hyperparameters (agent modifies these) ─────────────────────────────────── CTX = 4 # context window (n-gram) HIDDEN = 64 # 
hidden layer size LR = 0.05 # learning rate BATCH = 32 # mini-batch size WDECAY = 0.0 # L2 weight decay STEPS = STEPS_PLACEHOLDER # fixed budget — do not change # ── Parameters ─────────────────────────────────────────────────────────────── W1 = np.random.randn(vocab * CTX, HIDDEN) * 0.01 b1 = np.zeros(HIDDEN) W2 = np.random.randn(HIDDEN, vocab) * 0.01 b2 = np.zeros(vocab) def get_batch(d): idx = np.random.randint(0, len(d) - CTX, BATCH) X = np.zeros((BATCH, vocab * CTX)) for i, start in enumerate(idx): for j in range(CTX): X[i, j * vocab + d[start + j]] = 1.0 Y = np.array([d[i + CTX] for i in idx]) return X, Y def forward(X): H = np.tanh(X @ W1 + b1) logits = H @ W2 + b2 logits -= logits.max(1, keepdims=True) probs = np.exp(logits) probs /= probs.sum(1, keepdims=True) return H, probs def ce_loss(probs, Y): return -np.log(probs[np.arange(len(Y)), Y] + 1e-8).mean() # ── Training loop ───────────────────────────────────────────────────────────── for step in range(STEPS): X, Y = get_batch(train_d) H, probs = forward(X) dl = probs.copy(); dl[np.arange(BATCH), Y] -= 1; dl /= BATCH dW2 = H.T @ dl; db2 = dl.sum(0) dH = dl @ W2.T * (1 - H**2) dW1 = X.T @ dH; db1 = dH.sum(0) W1 -= LR * (dW1 + WDECAY * W1) b1 -= LR * db1 W2 -= LR * (dW2 + WDECAY * W2) b2 -= LR * db2 # ── Evaluate ────────────────────────────────────────────────────────────────── losses = [ce_loss(forward(Xv)[1], Yv) for Xv, Yv in (get_batch(val_d) for _ in range(30))] print(f"val_loss: {np.mean(losses):.4f}") ''' # ─── Data models ────────────────────────────────────────────────────────────── @dataclass class Experiment: iteration: int candidate: int description: str script: str val_loss: Optional[float] = None delta: Optional[float] = None # positive = improvement accepted: bool = False error: Optional[str] = None @dataclass class ResearchState: best_script: str best_val_loss: float = 999.0 history: List[Experiment] = field(default_factory=list) def history_summary(self) -> str: if not self.history: return
"No experiments yet." lines = [] for e in self.history[-8:]: # last 8 status = "✓ ACCEPTED" if e.accepted else ("✗ error" if e.error else "✗ rejected") vl = f"{e.val_loss:.4f}" if e.val_loss else "—" d = f"Δ{e.delta:+.4f}" if e.delta is not None else "" lines.append(f" [{status}] iter={e.iteration} val={vl} {d} — {e.description}") return "\n".join(lines) # ─── Agent: propose one code modification ──────────────────────────────────── def propose_modification(state: ResearchState, candidate_idx: int) -> tuple[str, str]: """Returns (description, modified_script).""" client = OpenAI() prompt = f"""{PROGRAM_GUIDANCE} Current best val_loss: {state.best_val_loss:.4f} Experiment history: {state.history_summary()} Current best script: ```python {state.best_script} ``` Propose modification #{candidate_idx + 1} (make it different from recent attempts). Return ONLY a JSON object with two keys: "description": one sentence describing the change "script": the complete modified Python script No markdown fences around the JSON. 
Just the raw JSON object.""" resp = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": prompt}], temperature=0.9 + candidate_idx * 0.1, # more exploration for later candidates response_format={"type": "json_object"}, ) import json data = json.loads(resp.choices[0].message.content) return data["description"], data["script"] # ─── TensorLake @function for map-reduce ────────────────────────────────────────── @function() def run_experiment_in_sandbox(exp_data: dict) -> dict: """Run exp.script in a TensorLake sandbox via map-reduce.""" import re iteration = exp_data["iteration"] candidate = exp_data["candidate"] description = exp_data["description"] script = exp_data["script"] max_retries = 5 last_error = None for attempt in range(max_retries): try: box = Sandbox.create(memory_mb=4096, timeout_secs=900) ex = box.run("python3", ["-c", script], timeout=300) stdout = (ex.stdout or "").strip() stderr = (ex.stderr or "").strip() m = re.search(r"val_loss:\s*([0-9.]+)", stdout) if not m: return { "iteration": iteration, "candidate": candidate, "description": description, "val_loss": None, "error": (stderr or stdout)[:120] or "no val_loss in output" } return { "iteration": iteration, "candidate": candidate, "description": description, "val_loss": float(m.group(1)), "error": None } except Exception as exc: last_error = str(exc)[:150] if attempt < max_retries - 1: wait_time = 2 ** attempt time.sleep(wait_time) continue return { "iteration": iteration, "candidate": candidate, "description": description, "val_loss": None, "error": last_error } return { "iteration": iteration, "candidate": candidate, "description": description, "val_loss": None, "error": last_error } # ─── Main autoresearch loop (TensorLake @application) ────────────────────────────── @application() @function() def autoresearch(iterations: int = 8, candidates: int = 3): steps = 150 if SMOKE else 300 console.print(Panel( "[bold cyan]Karpathy Autoresearch Loop + TensorLake 
Map-Reduce[/bold cyan]\n\n" "[dim]Inspired by github.com/karpathy/autoresearch (April 2026)\n\n" "Loop:\n" " 1. Agent reads current best script + experiment history\n" " 2. Proposes CANDIDATES modifications (different temperatures)\n" " 3. All CANDIDATES run in parallel via Tensorlake map-reduce\n" " 4. Best val_loss wins; accepted if it beats the current best\n" " 5. Accepted script becomes the new baseline\n\n" "Reward = Δval_loss (positive = improvement)\n" "Policy = GPT-4o prompted with program.md + experiment history\n" "Update = greedy hill-climbing (accept if reward > 0)\n" f"Mode = {'SMOKE (3 iters, 2 candidates, 150 steps)' if SMOKE else f'Full ({iterations} iters, {candidates} candidates, {steps} steps)'}[/dim]", border_style="cyan", )) # ── Calibrate baseline ─────────────────────────────────────────────────── baseline = BASELINE_SCRIPT.replace("STEPS_PLACEHOLDER", str(steps)) console.print(Rule("[yellow]Calibrating baseline[/yellow]", style="yellow")) console.print("[dim]Running baseline script in sandbox to establish starting val_loss...[/dim]") calib_result = run_experiment_in_sandbox({ "iteration": 0, "candidate": 0, "description": "baseline", "script": baseline }) if calib_result["error"] or calib_result["val_loss"] is None: console.print(f"[red]Baseline failed: {calib_result['error']}[/red]") return calib_val_loss = calib_result["val_loss"] state = ResearchState(best_script=baseline, best_val_loss=calib_val_loss) console.print(f" Baseline val_loss: [bold yellow]{state.best_val_loss:.4f}[/bold yellow]\n") try: # ── Research iterations ────────────────────────────────────────────────── for it in range(1, iterations + 1): console.print(Rule(f"[cyan]Iteration {it}/{iterations}[/cyan]", style="cyan")) console.print(f" [dim]Current best: {state.best_val_loss:.4f} " f"Proposing {candidates} candidates (map-reduce)...[/dim]") # Propose candidates (sequentially — they call the LLM) proposals_data = [] for c in range(candidates): desc, script = 
propose_modification(state, c) proposals_data.append({ "iteration": it, "candidate": c, "description": desc, "script": script }) # Run all candidates in parallel using Tensorlake map-reduce console.print(f" [dim]Executing {candidates} candidates in parallel (Tensorlake map-reduce)...[/dim]") result_dicts = run_experiment_in_sandbox.map(proposals_data) results = [] for res in result_dicts: exp = Experiment( iteration=res["iteration"], candidate=res["candidate"], description=res["description"], script=proposals_data[res["candidate"]]["script"], val_loss=res["val_loss"], error=res["error"] ) results.append(exp) # Score & rank valid = [r for r in results if r.val_loss is not None] for r in valid: r.delta = state.best_val_loss - r.val_loss # positive = improvement valid.sort(key=lambda r: r.val_loss) # Print iteration table t = Table(box=box.SIMPLE, show_header=True, header_style="bold white") t.add_column("C", width=3) t.add_column("Modification", width=52) t.add_column("val_loss", width=9, justify="right") t.add_column("Δ", width=9, justify="right") t.add_column("", width=3) for r in sorted(results, key=lambda r: (r.val_loss or 999)): if r.error: t.add_row(str(r.candidate), r.description[:50], "—", "—", f"[red]✗[/red]") continue delta_str = (f"[green]{r.delta:+.4f}[/green]" if r.delta and r.delta > 0 else f"[red]{r.delta:+.4f}[/red]" if r.delta else "—") t.add_row(str(r.candidate), r.description[:50], f"{r.val_loss:.4f}", delta_str, "") console.print(t) # Accept best if improved if valid and valid[0].delta is not None and valid[0].delta > 0: winner = valid[0] winner.accepted = True prev_best = state.best_val_loss state.best_val_loss = winner.val_loss state.best_script = winner.script console.print( f" [bold green]✓ Accepted: {winner.description}\n" f" val_loss {prev_best:.4f} → {state.best_val_loss:.4f} " f"(Δ{winner.delta:+.4f})[/bold green]" ) else: console.print(" [dim]No improvement this iteration — baseline unchanged.[/dim]") state.history.extend(results) # ── Final summary
──────────────────────────────────────────────────────── finally: accepted = [e for e in state.history if e.accepted] total_improvement = calib_val_loss - state.best_val_loss pct = total_improvement / calib_val_loss * 100 color = "bold green" if pct > 5 else "yellow" if pct > 0 else "red" console.print(Panel( f"[bold green]Autoresearch complete[/bold green]\n\n" f"Baseline val_loss : [yellow]{calib_val_loss:.4f}[/yellow]\n" f"Final val_loss : [bold green]{state.best_val_loss:.4f}[/bold green]\n" f"Total improvement : [{color}]{total_improvement:+.4f} ({pct:+.1f}%)[/{color}]\n" f"Accepted changes : {len(accepted)} / {len(state.history)}\n\n" + ("\n".join(f" ✓ iter {e.iteration}: {e.description}" for e in accepted) if accepted else " (none)"), title="[bold]Research Summary[/bold]", border_style="green", )) # ── Result interpretation ───────────────────────────────────────────────── accept_rate = len(accepted) / len(state.history) * 100 if state.history else 0 console.print(Panel( f"[bold]What these numbers mean[/bold]\n\n" f"val_loss is cross-entropy on held-out characters (nats).\n" f"Lower = the model assigns higher probability to the correct next character.\n\n" f" Baseline {calib_val_loss:.4f} → Final {state.best_val_loss:.4f} " f"([{color}]{pct:+.1f}%[/{color}])\n\n" f"Context:\n" f" • A random character predictor on this ~50-char vocabulary scores ln(50) ≈ 3.91\n" f" • The baseline MLP ({calib_val_loss:.2f}) already beats random — it learned\n" f" that 'e', space, and 't' are far more likely than 'Z'\n" f" • Each accepted change is a genuine algorithmic improvement:\n" f" the agent modified real training code and the sandbox verified it\n" f" on held-out data — not on the training set\n\n" f"Acceptance rate: {len(accepted)}/{len(state.history)} ({accept_rate:.0f}%)\n" f" • Typical for greedy hill-climbing on a small model: 25–40% is normal\n" f" • Rejected experiments are still informative — they update the agent's\n" f" memory so it avoids the same dead ends 
next iteration\n\n" f"Smoke vs full run:\n" f" • Smoke (3 iters, 2 candidates, 150 steps) is a proof-of-concept\n" f" • Full run (8 iters, 3 candidates, 300 steps) gives the agent\n" f" enough budget to explore LR schedules, second layers, momentum,\n" f" and architecture changes — improvements compound across iterations\n" f" • Karpathy's original loop ran ~700 experiments overnight and found\n" f" 11% speed improvement; the same pattern scales here", title="[bold cyan]Score interpretation[/bold cyan]", border_style="cyan", )) if accepted: console.print(Rule("[green]Final best script[/green]", style="green")) console.print(Panel(state.best_script[:1200] + ("..." if len(state.best_script) > 1200 else ""), border_style="green")) if __name__ == "__main__": from tensorlake.applications import run_local_application, Request request: Request = run_local_application( autoresearch, iterations=3 if SMOKE else 8, candidates=2 if SMOKE else 3, )
````

***

## What happens step-by-step

| Step  | Component          | Action |
| :---- | :----------------- | :----- |
| **1** | **Calibration**    | Baseline script runs in a sandbox. The resulting `val_loss` becomes the threshold every candidate must beat. |
| **2** | **Proposal**       | GPT-4o receives the current best script, `program.md` guidance, and the last 8 experiments. It is called once per candidate, and each call returns a JSON object with a `description` and a complete modified `script`. |
| **3** | **Parallel race**  | All *N* candidates are dispatched in parallel with `run_experiment_in_sandbox.map(...)`. Each call creates a TensorLake sandbox, runs the script for the fixed step budget, and parses `val_loss` from stdout. |
| **4** | **Selection**      | Candidates are ranked by `val_loss`. The winner is accepted only if its loss is strictly lower than the current best (greedy hill-climbing). |
| **5** | **State update**   | The accepted script replaces the baseline. All results, accepted and rejected, are appended to the history so the agent avoids revisiting dead ends. |
| **6** | **Next iteration** | The agent is prompted again with the updated script and history. Improvements compound across iterations. |

***

## Key design decisions

### Increasing temperature across candidates

Each candidate is proposed with a slightly higher temperature (`0.9 + candidate_idx * 0.1`). The first candidate tends to be the most conservative change, and later candidates are more exploratory. This covers both the safe and speculative ends of the search space in a single iteration.

### Experiment history as agent memory

The agent receives a rolling window of the last 8 experiments (accepted and rejected), each annotated with `val_loss` and `Δ`. This discourages the agent from re-proposing changes that already failed and nudges it toward unexplored directions without any external memory store.

### Fixed `STEPS` budget enforced in `program.md`

The `STEPS` constant is explicitly marked as off-limits in the guidance. Without this constraint, the agent could trivially reduce `val_loss` by running more training steps, a form of reward hacking that would make comparisons between candidates meaningless.

### Greedy hill-climbing over rollout

The loop uses simple greedy acceptance (accept if `Δval_loss > 0`) rather than beam search or rollout. For an overnight research loop where each experiment costs real CPU time, greedy hill-climbing keeps every sandbox run focused on testing a change against the current best script.

***

This example uses `python-dotenv` to load the API keys from the `.env` file described under Prerequisites. Both clients pick them up automatically.

## What to build next

* Train a model directly with RL using sandboxes as the reward oracle, a complementary approach to autoresearch.
* Dispatch parallel sandboxes across a swarm of specialized worker agents.

# Agentic Dungeons & Dragons

Source: https://docs.tensorlake.ai/sandboxes/agentic-d&g

Build a dynamic D&D-style game where parallel AI agents act as scene writers and a Dungeon Master agent orchestrates the story.

Create a dynamic, unpredictable storytelling game using a swarm of AI agents. This guide demonstrates how to build a Dungeons & Dragons-style RPG where multiple "Scene Agents" draft possible outcomes in parallel, and a "Dungeon Master" agent weaves them into a coherent narrative based on player choice.

This pattern uses a "Map-Reduce" model: parallel workers generate possibilities (map), and a lead agent synthesizes them (reduce).

## How it works

1. **Branching Possibilities**: For a given player choice, the application imagines several potential actions (e.g., "Fight," "Flee," "Negotiate").
2. **Map (Parallel Scene Writers)**: A `scene_agent` is spawned for each potential action. These run in parallel, each in its own sandbox.
3. **Sandbox Execution**: Each `scene_agent` uses an LLM to generate a Python script that simulates a dice roll and determines the outcome of its assigned action. The script runs securely in the sandbox and outputs a JSON object with narrative text, consequences, and an ASCII art illustration.
4. **Reduce (Dungeon Master)**: A `dungeon_master` agent receives the drafted scenes from all parallel workers.
5. **Narrate & Update State**: The DM selects the draft corresponding to the player's *actual* choice, applies the consequences (e.g., HP loss, new item), and uses an LLM to write the next part of the story, complete with new choices for the player.

***

## Prerequisites

You'll need the Tensorlake SDK, the OpenAI client, Pydantic, and the `rich` library for the terminal UI.

```bash theme={null}
pip install tensorlake openai pydantic rich python-dotenv
```

This example uses the `python-dotenv` library to load your API keys from a `.env` file. Create a file named `.env` in your project root and add your keys:

```
TENSORLAKE_API_KEY="your-api-key-here"
OPENAI_API_KEY="your-openai-key-here"
```

The clients will automatically use these keys.

***

## TypeScript SDK starter

In Node.js, model each branch as `LLM -> sandbox -> JSON scene draft`, then reduce the drafts with your Dungeon Master step:

````typescript theme={null}
import OpenAI from "openai";
import { Sandbox } from "tensorlake";

type SceneDraft = {
  branch_id: number;
  branch_label: string;
  narrative: string;
  consequences: string;
  image_prompt: string;
  ascii_art: string;
};

const openai = new OpenAI();

// Draft one branch: ask the LLM for a script, run it in a sandbox, parse its JSON.
async function sceneAgent(branchId: number, branchLabel: string) {
  const prompt = `Write Python that simulates the "${branchLabel}" branch and prints one JSON object.`;
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
  });

  // Strip markdown fences the model may wrap around the script.
  const generatedCode =
    response.choices[0].message.content
      ?.replace("```python", "")
      .replace("```", "")
      .trim() ?? "";

  const sandbox = await Sandbox.create({
    allowInternetAccess: false,
    timeoutSecs: 600,
  });

  try {
    const execution = await sandbox.run("python3", {
      args: ["-c", generatedCode],
    });
    return JSON.parse(execution.stdout) as SceneDraft;
  } finally {
    // Always release the sandbox, even when parsing fails.
    await sandbox.terminate();
  }
}

// Map stage: all three branches are drafted in parallel.
const drafts = await Promise.all([
  sceneAgent(0, "Fight"),
  sceneAgent(1, "Flee"),
  sceneAgent(2, "Negotiate"),
]);

console.log(drafts);
````

This gives you the same parallel-map stage as the Python example. Your Dungeon Master step can stay in Node.js and operate on the returned JSON drafts.

***

## Full Example

The complete script below orchestrates the entire game loop. You can run it directly to play in your terminal.
````python theme={null} from dotenv import load_dotenv load_dotenv() from tensorlake.sandbox import Sandbox from pydantic import BaseModel from typing import List, Optional from openai import OpenAI from rich.console import Console from rich.panel import Panel from rich.text import Text from rich.rule import Rule from rich import box import json from concurrent.futures import ThreadPoolExecutor import time console = Console() # ─── Data Models ───────────────────────────────────────────────────────────── class PlayerState(BaseModel): player_name: str hp: int = 20 max_hp: int = 20 inventory: List[str] = ["torch", "dagger"] story_history: List[str] = [] current_choice: Optional[str] = None turn: int = 0 class SceneDraft(BaseModel): branch_id: int branch_label: str narrative: str consequences: str image_prompt: str ascii_art: str class StoryBeat(BaseModel): scene_narrative: str choices: List[str] image_prompt: str ascii_art: str updated_state: PlayerState # ─── Agent 1: Scene Writer (runs in parallel per branch) ───────────────────── def scene_agent(args: dict) -> SceneDraft: """ Each scene_agent drafts ONE possible branch outcome in an isolated sandbox. Runs in parallel — one sandbox per branch. """ branch_id = args["branch_id"] branch_label = args["branch_label"] player_state = PlayerState(**args["player_state"]) setting = args["setting"] print(f"⚔️ Scene Agent [{branch_label}]: Drafting branch in sandbox...") client = OpenAI() prompt = f""" You are a D&D scene writer. The player chose: "{branch_label}". Setting: {setting} Player: {player_state.player_name}, HP: {player_state.hp}/{player_state.max_hp} Inventory: {player_state.inventory} Story so far: {' | '.join(player_state.story_history[-3:]) or 'Adventure begins.'} Write a Python script that: 1. Uses 'random' to simulate a D20 dice roll 2. Determines success/failure of the action "{branch_label}" based on the roll (>=10 is success) 3. 
Prints a single valid JSON (no markdown, no extra text) with these exact keys: - branch_id: {branch_id} - branch_label: "{branch_label}" - narrative: vivid 3-sentence scene description of the outcome - consequences: one of "-N HP", "+item_name", "no change", or "unlocked secret" - image_prompt: a DALL-E prompt for this scene in dark fantasy style - ascii_art: a 10-15 line ASCII art illustration using / \\ | _ . * # @ ~ ^ that depicts the scene visually. Must be a single string with \\n for newlines. Make it evocative of the environment — dungeon, dragon, forest, castle, monster, etc. IMPORTANT: The script must print ONLY valid JSON to stdout. No markdown, no code fences. """ response = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": prompt}] ) generated_code = ( response.choices[0].message.content .replace("```python", "") .replace("```", "") .strip() ) print(f"⚔️ Scene Agent [{branch_label}]: Executing dice logic in Sandbox...") sandbox = Sandbox.create() execution = sandbox.run("python3", ["-c", generated_code]) output = execution.stdout.strip() print(f"⚔️ Scene Agent [{branch_label}]: Result -> {output[:80]}...") data = json.loads(output) return SceneDraft(**data) # ─── Agent 2: Dungeon Master (aggregator + narrator) ───────────────────────── def dungeon_master(args: dict) -> StoryBeat: """ The DM receives all parallel branch drafts, picks the player's chosen one, narrates the next scene with 3 new choices, and updates state. """ drafts = [SceneDraft(**d) for d in args["drafts"]] player_state = PlayerState(**args["player_state"]) chosen_label = player_state.current_choice print(f"🎲 Dungeon Master: Received {len(drafts)} branch drafts. Chosen: '{chosen_label}'") chosen = next((d for d in drafts if d.branch_label == chosen_label), drafts[0]) client = OpenAI() prompt = f""" You are an epic Dungeon Master continuing a D&D adventure. 
The player chose: "{chosen.branch_label}" What happened: {chosen.narrative} Consequences: {chosen.consequences} Player state: HP={player_state.hp}/{player_state.max_hp}, Inventory={player_state.inventory} Turn number: {player_state.turn} Now write the next story beat. Respond ONLY as raw JSON (no markdown, no code fences) with: - scene_narrative: 2 vivid paragraphs in second person ("You...") describing what unfolds - choices: list of exactly 3 short action choices for the player (action verbs, max 5 words each) - image_prompt: a DALL-E prompt for the scene illustration in dark fantasy oil painting style """ response = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": prompt}] ) raw = ( response.choices[0].message.content .replace("```json", "") .replace("```", "") .strip() ) data = json.loads(raw) # Apply consequences to player state updated = player_state.model_copy(deep=True) cons = chosen.consequences.lower() if "hp" in cons and "-" in cons: try: dmg = int(''.join(filter(str.isdigit, cons.split("hp")[0]))) updated.hp = max(0, updated.hp - dmg) except ValueError: pass elif "hp" in cons and "+" in cons: try: heal = int(''.join(filter(str.isdigit, cons.split("hp")[0]))) updated.hp = min(updated.max_hp, updated.hp + heal) except ValueError: pass if "+" in cons and "hp" not in cons: item = chosen.consequences.replace("+", "").strip() if item and item not in updated.inventory: updated.inventory.append(item) updated.story_history.append(chosen.narrative[:100]) updated.turn += 1 return StoryBeat( scene_narrative=data["scene_narrative"], choices=data["choices"], image_prompt=data["image_prompt"], ascii_art=chosen.ascii_art, updated_state=updated, ) # ─── Application: One Full RPG Turn ────────────────────────────────────────── def rpg_adventure(player_name: str, choice: str, state_json: str, setting: str) -> str: """ One full turn: 1. Fan out 3 parallel scene agents (one per branch) 2. DM aggregates and narrates the chosen branch 3. 
Returns next StoryBeat as JSON """ print(f"\n🧙 RPG Turn: Player='{player_name}', Choice='{choice}'") state = json.loads(state_json) state["current_choice"] = choice branches = [ {"branch_id": 0, "branch_label": "Fight"}, {"branch_id": 1, "branch_label": "Flee"}, {"branch_id": 2, "branch_label": "Negotiate"}, ] scene_args = [ {**b, "player_state": state, "setting": setting} for b in branches ] # Threads are needed to run multiple sandboxes concurrently. # The sandboxes THEMSELVES run in parallel on the server, but threads allow # our script to wait for multiple results simultaneously. with ThreadPoolExecutor(max_workers=len(branches)) as executor: drafts = list(executor.map(scene_agent, scene_args)) beat = dungeon_master({ "drafts": [d.model_dump() for d in drafts], "player_state": state, }) return beat.model_dump_json(indent=2) # ─── UI Helpers ────────────────────────────────────────────────────────────── TITLE_SCREEN = r""" ____ ____ _ __ ___ __ ______________ _________ / __ \/ __ \/ | / / / _ | / / / / __/ ___/ _ \/ ___/ __/ / /_/ / / / / |/ / / __ |/ /_/ / _// (_ / // / /__/ _/ \____/_/ /_/_/|_/ /_/ |_|\____/___/\___/____/\___/___/ 🐉 A N A I - P O W E R E D A D V E N T U R E 🐉 """ def print_title(): console.print() console.print(Text(TITLE_SCREEN, style="bold red")) console.print(Rule(style="red")) console.print() def print_scene(beat: StoryBeat): console.print() console.print(Rule("⚔ NEW SCENE", style="yellow")) # ASCII art panel console.print( Panel( Text(beat.ascii_art, style="bold green", justify="center"), border_style="dim green", padding=(1, 4), ) ) # Narrative panel console.print( Panel( beat.scene_narrative, title="[bold cyan]📖 What Unfolds[/bold cyan]", border_style="cyan", padding=(1, 2), ) ) def print_stats(state: PlayerState): hp_color = "bold green" if state.hp > 10 else "bold yellow" if state.hp > 5 else "bold red" hp_bar = "█" * state.hp + "░" * (state.max_hp - state.hp) inv_str = ", ".join(state.inventory) if state.inventory else "nothing" 
console.print( Panel( f"[{hp_color}]❤ HP: {state.hp}/{state.max_hp} [{hp_bar}][/{hp_color}]\n" f"[bold white]🎒 Inventory:[/bold white] [dim]{inv_str}[/dim]\n" f"[bold white]📜 Turn:[/bold white] [dim]{state.turn}[/dim]", title=f"[bold magenta]🧙 {state.player_name}[/bold magenta]", border_style="magenta", box=box.SIMPLE, ) ) def print_choices(choices: List[str]): console.print() console.print(Rule("🎮 YOUR MOVE", style="bold yellow")) for i, c in enumerate(choices, 1): console.print(f" [bold yellow]{i}.[/bold yellow] [white]{c}[/white]") console.print(f" [dim]q. Quit adventure[/dim]") console.print() def get_player_choice(choices: List[str]) -> Optional[str]: while True: console.print("[bold]Enter 1, 2, 3 or q:[/bold] ", end="") raw = input().strip().lower() if raw == "q": return None if raw in ("1", "2", "3"): idx = int(raw) - 1 if idx < len(choices): return choices[idx] console.print(" [bold red]⚠ Invalid input. Try 1, 2, 3 or q.[/bold red]") # ─── Entry Point ───────────────────────────────────────────────────────────── if __name__ == "__main__": print_title() # Hero name console.print("[bold]Enter your hero's name[/bold] (or press Enter for 'Aldric the Bold'): ", end="") player_name = input().strip() or "Aldric the Bold" console.print(f"\n[bold green]Welcome, {player_name}! Your legend begins...[/bold green]\n") time.sleep(1) # Initial state & setting state = PlayerState(player_name=player_name) setting = ( "A crumbling dungeon entrance lit by sickly green torchlight. " "Ancient runes glow on the walls. Something massive growls in the darkness ahead. " "The air smells of sulfur and old bones." 
    )
    current_choice = "Explore"

    # ── Game Loop ──
    while True:
        console.print(f"\n[dim]⏳ Generating scene for: '[italic]{current_choice}[/italic]'...[/dim]")

        try:
            # Directly call the function instead of using run_local_application
            result_json = rpg_adventure(
                player_name=player_name,
                choice=current_choice,
                state_json=state.model_dump_json(),
                setting=setting,
            )
            beat = StoryBeat.model_validate_json(result_json)
        except Exception as e:
            console.print(f"\n[bold red]❌ Error generating scene: {e}[/bold red]")
            console.print("[dim]Retrying...[/dim]")
            continue

        # Update state
        state = beat.updated_state

        # Render scene
        print_scene(beat)
        print_stats(state)

        # Death check
        if state.hp <= 0:
            console.print(
                Panel(
                    "[bold red]💀 You have fallen in battle.\n\nYour legend ends here... for now.[/bold red]",
                    border_style="red",
                    padding=(1, 2),
                )
            )
            break

        # Choices
        print_choices(beat.choices)
        chosen = get_player_choice(beat.choices)

        if chosen is None:
            console.print(
                "\n[bold yellow]🏰 You sheathe your sword and walk away into the mist.\n"
                "Farewell, adventurer. Your story is unfinished.[/bold yellow]\n"
            )
            break

        # Roll forward
        current_choice = chosen
        setting = beat.scene_narrative[-300:]  # tail of scene becomes new setting context
````

***

### What Happens Step-by-Step

| Step  | Component          | Action |
| :---- | :----------------- | :----- |
| **1** | **Orchestrator**   | Triggers 3 parallel `scene_agent` tasks using `.map()`, one for each potential action ("Fight", "Flee", "Negotiate"). |
| **2** | **Scene Agent**    | Uses GPT-4o to generate a Python script that simulates a dice roll and determines the outcome for its assigned branch. |
| **3** | **Sandbox**        | Securely executes the generated script, capturing the JSON output containing the narrative, consequences, and ASCII art. |
| **4** | **Dungeon Master** | Receives all drafted scenes and selects the one matching the player's actual choice. |
| **5** | **Dungeon Master** | Applies consequences (e.g., HP loss, new item) to the player's state based on the sandbox output. |
| **6** | **Dungeon Master** | Uses GPT-4o to narrate the next story beat and generate three new, context-aware choices for the player. |
| **7** | **UI Loop**        | The main game loop receives the final `StoryBeat`, renders the scene and stats, and prompts the player for their next move. |

***

## How to Extend This Example

### Generate Images

The agents already create image prompts for DALL-E. You could extend the `dungeon_master` to call an image generation API and display the resulting image, creating a true multimedia experience.

### Add More Complex Logic

The sandbox is well suited to running more complex game mechanics. You could:

* Implement a full combat system with multiple enemy types.
* Create skill checks that depend on the player's inventory or stats.
* Generate dynamic loot tables or environmental puzzles.

### Use Snapshots for Faster Turns

If your `scene_agent` sandboxes need to install libraries like `numpy` for more complex simulations, running `pip install` on every turn adds latency. You can pre-install dependencies into a base sandbox and create a **Snapshot**. Future turns can then launch from that snapshot without repeating the install.

```python theme={null}
# In your scene_agent:
sandbox = Sandbox.create(snapshot_id="your-snapshot-id")
# Dependencies are already installed!
execution = sandbox.run("python3", ["-c", generated_code])
```

***

## What to build next

* See another example of the Map-Reduce pattern with parallel agents.
* Optimize your game's turn speed by pre-baking dependencies.

# Reproducible Environments for RL Rollouts
Source: https://docs.tensorlake.ai/sandboxes/agentic-rl-reproducible-env

Use Tensorlake sandboxes to guarantee isolated, deterministic rollouts for reinforcement learning training.
A rollout is a complete episode of agent-environment interaction: an agent takes actions, the environment transitions, and rewards accumulate until the episode ends. Reproducibility means that given the same random seed and the same action sequence, every rollout produces exactly the same observations, transitions, and rewards. This property is foundational for RL engineering: without it, you cannot reliably compare two policy versions, reproduce a bug seen during training, or verify that a reward spike was real and not noise.

The hard part is that real training runs hundreds or thousands of rollouts in parallel. Each worker must be completely isolated from the others: no shared filesystem, no shared process state, no network side-effects leaking across episodes. If any state bleeds between workers, your "reproducible" seed no longer controls the outcome and you lose the guarantee. Tensorlake sandboxes enforce this isolation at the infrastructure level: every rollout gets its own fresh environment, and the seed is the only variable in play.

***

## Core concepts

**Isolation** means each rollout runs in its own compute environment with no shared resources. Two workers seeded with different values must not be able to influence each other's trajectories through any shared channel: not a shared pip cache, not a shared `/tmp`, not a shared network state. In production, this matters most when you are running hundreds of rollouts per training step: any shared state becomes a source of variance that your reward signal cannot explain.

**Stateful resets** mean the environment always starts from a known, controlled baseline when a new rollout begins. A reset that partially inherits state from a previous episode is one of the most common and hardest-to-debug sources of non-reproducibility. Because each sandbox is created fresh per rollout, the reset is total: there is no prior episode state to inherit.
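The difference between a partial reset and a fresh environment can be sketched in plain Python. `ToyEnv` below is a hypothetical stand-in written for this illustration, not a Gymnasium or Tensorlake class; its `visits` counter plays the role of state that a careless reset forgets to clear.

```python
import random


class ToyEnv:
    """Hypothetical toy environment used only for illustration."""

    def __init__(self) -> None:
        self.rng = random.Random()
        self.visits = 0  # residual state that a partial reset leaves behind

    def partial_reset(self, seed: int) -> None:
        # Reseeds the RNG but does not clear `visits`.
        self.rng.seed(seed)

    def step(self) -> float:
        self.visits += 1
        return self.rng.random() + self.visits


def rollout(env: ToyEnv, seed: int, steps: int = 5) -> list:
    """Reset the environment with `seed`, then record `steps` observations."""
    env.partial_reset(seed)
    return [env.step() for _ in range(steps)]


# Fresh environment per rollout: the seed fully determines the trajectory.
fresh_a = rollout(ToyEnv(), seed=42)
fresh_b = rollout(ToyEnv(), seed=42)

# One environment reused across rollouts: leftover state changes the result.
shared = ToyEnv()
reused_a = rollout(shared, seed=42)
reused_b = rollout(shared, seed=42)

print("fresh rollouts identical: ", fresh_a == fresh_b)    # True
print("reused rollouts identical:", reused_a == reused_b)  # False
```

Creating one sandbox per rollout corresponds to the first case: nothing from an earlier episode exists, so there is nothing for a reset to miss.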
**Determinism** means the environment's random number generator is seeded before any interaction begins, and the seed is the sole source of randomness for the entire episode. Given the same seed, the same initial observation, and the same action sequence, the trajectory must be identical byte-for-byte. This lets you replay any episode from training history, compare policy versions on equal footing, and write regression tests against specific trajectories.

***

## How Tensorlake sandboxes provide this

`Sandbox.create()` starts a fresh, isolated compute environment and returns a `box` handle. Every sandbox is a separate process tree with its own filesystem and memory. There is no shared state between two sandboxes created from the same client.

The seed is passed into the environment harness as a string literal embedded in the Python script that runs inside the sandbox, not set on the host process. This keeps the host's random state completely separate from the environment's, which is important when you dispatch many rollouts from a single host thread pool.

Parallel rollouts map cleanly onto `ThreadPoolExecutor`: each thread creates its own sandbox, runs its episode, collects its trajectory, and the sandbox is destroyed when the context manager exits. The executor manages concurrency; the sandboxes manage isolation.

***

## Prerequisites

```bash theme={null}
pip install tensorlake gymnasium python-dotenv
```

Create a `.env` file in your project root:

```
TENSORLAKE_API_KEY="your-api-key-here"
```

***

## TypeScript SDK starter

The same reproducibility pattern works from Node.js: embed the seed in the harness, run one rollout per sandbox, and compare trajectories across identical seeds.
```typescript theme={null}
import { Sandbox } from "tensorlake";

function gymHarness(seed: number) {
  return `
import gymnasium as gym, json

seed = ${seed}
env = gym.make("CartPole-v1")
obs, _ = env.reset(seed=seed)
env.action_space.seed(seed)

trajectory = []
total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()
    next_obs, reward, terminated, truncated, _ = env.step(action)
    trajectory.append((obs.tolist(), int(action), float(reward), bool(terminated)))
    total_reward += reward
    obs = next_obs
    if terminated or truncated:
        break

print(json.dumps({"seed": seed, "total_reward": total_reward, "trajectory": trajectory}))
`;
}

async function runSingleRollout(seed: number) {
  const sandbox = await Sandbox.create({
    memoryMb: 2048,
    allowInternetAccess: false,
  });

  try {
    await sandbox.run("python3", {
      args: ["-m", "pip", "install", "gymnasium", "--break-system-packages", "-q"],
    });
    const result = await sandbox.run("python3", {
      args: ["-c", gymHarness(seed)],
    });
    return JSON.parse(result.stdout);
  } finally {
    // Each rollout cleans up its own sandbox, so no shared client needs closing.
    await sandbox.terminate();
  }
}

const [rolloutA, rolloutB] = await Promise.all([
  runSingleRollout(42),
  runSingleRollout(42),
]);

console.assert(
  JSON.stringify(rolloutA.trajectory) === JSON.stringify(rolloutB.trajectory),
  "same seed should produce the same trajectory",
);
```

For batch collection, replace the final `Promise.all()` with one rollout per seed and aggregate the returned JSON results by seed.

***

## Full example

```python theme={null}
"""
Reproducible RL Rollouts with Tensorlake Sandboxes
===================================================

Demonstrates three properties:
  1. Isolation — each rollout runs in its own sandbox
  2. Determinism — same seed → same trajectory, verified by assertion
  3.
Parallelism — multiple seeds dispatched concurrently via ThreadPoolExecutor """ from dotenv import load_dotenv load_dotenv() import json from concurrent.futures import ThreadPoolExecutor, as_completed from dataclasses import dataclass, field from typing import List, Tuple from tensorlake.sandbox import Sandbox # ─── Data models ────────────────────────────────────────────────────────────── @dataclass class RolloutConfig: seed: int env_name: str = "CartPole-v1" max_steps: int = 200 @dataclass class RolloutResult: seed: int total_reward: float steps: int # Each element is (observation, action, reward, terminated) trajectory: List[Tuple] = field(default_factory=list) # ─── Gymnasium harness ──────────────────────────────────────────────────────── # This script runs inside the sandbox. It is a self-contained string so that # the host's Python environment has no influence on the episode's random state. _GYM_HARNESS = """ import gymnasium as gym import json import sys seed = {seed} env_name = {env_name!r} max_steps = {max_steps} env = gym.make(env_name) obs, _ = env.reset(seed=seed) # env.reset(seed=) only seeds the observation/transition RNG. # The action space has its own RNG that must be seeded separately. env.action_space.seed(seed) trajectory = [] total_reward = 0.0 steps = 0 for _ in range(max_steps): action = env.action_space.sample() next_obs, reward, terminated, truncated, _ = env.step(action) trajectory.append((obs.tolist(), int(action), float(reward), bool(terminated))) total_reward += reward steps += 1 obs = next_obs if terminated or truncated: break env.close() result = {{ "seed": seed, "total_reward": total_reward, "steps": steps, "trajectory": trajectory, }} # The only output is the JSON result — the caller reads stdout print(json.dumps(result)) """ # ─── Single rollout ─────────────────────────────────────────────────────────── def run_single_rollout(config: RolloutConfig) -> RolloutResult: """ Run one complete RL episode in a fresh, isolated sandbox. 
A new sandbox is created for every call so there is no shared filesystem or process state between concurrent rollouts. The seed is embedded in the harness string rather than set on the host, which keeps the host's random state fully separate from the environment's. """ harness = _GYM_HARNESS.format( seed=config.seed, env_name=config.env_name, max_steps=config.max_steps, ) box = Sandbox.create(memory_mb=2048) # Use python3 -m pip to install into the sandbox's managed environment box.run("python3", ["-m", "pip", "install", "gymnasium", "--break-system-packages", "-q"]) execution = box.run("python3", ["-c", harness]) raw = (execution.stdout or "").strip() data = json.loads(raw) return RolloutResult( seed=data["seed"], total_reward=data["total_reward"], steps=data["steps"], trajectory=data["trajectory"], ) # ─── Parallel rollout collection ────────────────────────────────────────────── def collect_parallel_rollouts( seeds: List[int], env_name: str = "CartPole-v1", max_steps: int = 200, ) -> List[RolloutResult]: """ Dispatch one sandbox per seed, all running concurrently. ThreadPoolExecutor manages the concurrency; the sandboxes manage isolation. Results are returned in seed order regardless of completion order. """ configs = [RolloutConfig(seed=s, env_name=env_name, max_steps=max_steps) for s in seeds] results_by_seed = {} with ThreadPoolExecutor(max_workers=len(configs)) as pool: future_to_seed = {pool.submit(run_single_rollout, cfg): cfg.seed for cfg in configs} for future in as_completed(future_to_seed): seed = future_to_seed[future] results_by_seed[seed] = future.result() return [results_by_seed[s] for s in seeds] # ─── Reproducibility check ──────────────────────────────────────────────────── def verify_reproducibility( seed: int = 42, env_name: str = "CartPole-v1", max_steps: int = 200, ) -> None: """ Run the same seed twice in independent sandboxes and assert the trajectories are identical. 
This is the core guarantee: isolation + determinism means the seed fully determines the episode. """ print(f"Verifying reproducibility for seed={seed}...") config = RolloutConfig(seed=seed, env_name=env_name, max_steps=max_steps) result_a = run_single_rollout(config) result_b = run_single_rollout(config) assert result_a.steps == result_b.steps, ( f"Step count mismatch: {result_a.steps} vs {result_b.steps}" ) assert result_a.total_reward == result_b.total_reward, ( f"Reward mismatch: {result_a.total_reward} vs {result_b.total_reward}" ) assert result_a.trajectory == result_b.trajectory, ( "Trajectory mismatch: observations or actions differed between runs" ) print( f" Passed. seed={seed} → {result_a.steps} steps, " f"reward={result_a.total_reward:.1f} (identical across both runs)" ) # ─── Main ───────────────────────────────────────────────────────────────────── if __name__ == "__main__": # Step 1: Verify that the same seed always produces the same trajectory verify_reproducibility(seed=42) print( " → Same seed, two independent sandboxes, identical trajectory.\n" " The seed is the only source of variation — no shared state, no host RNG leakage." 
    )

    # Step 2: Collect 4 rollouts in parallel, one sandbox per seed
    seeds = [0, 1, 2, 3]
    print(f"\nCollecting {len(seeds)} parallel rollouts...")
    results = collect_parallel_rollouts(seeds)

    # Step 3: Print a summary table
    print(f"\n{'Seed':>6}   {'Steps':>6}   {'Total Reward':>14}")
    print("-" * 32)
    for r in results:
        print(f"{r.seed:>6}   {r.steps:>6}   {r.total_reward:>14.1f}")

    best = max(results, key=lambda r: r.total_reward)
    worst = min(results, key=lambda r: r.total_reward)
    print(
        f"\n  → CartPole rewards 1.0 per step, so total reward equals episode length.\n"
        f"    Seed {best.seed} balanced the longest ({int(best.total_reward)} steps); "
        f"seed {worst.seed} fell first ({int(worst.total_reward)} steps).\n"
        f"    Different seeds produce different episodes because the initial state\n"
        f"    and the sampled actions both depend on the seed. Run again with the\n"
        f"    same seeds and you get identical numbers."
    )
```

**Expected output:**

```
Verifying reproducibility for seed=42...
  Passed. seed=42 → 30 steps, reward=30.0 (identical across both runs)

Collecting 4 parallel rollouts...

  Seed    Steps     Total Reward
--------------------------------
     0       18             18.0
     1       29             29.0
     2       14             14.0
     3       15             15.0
```

In CartPole, the reward is 1.0 per step regardless of action, so total reward equals step count. The episode ends when the pole tips past 12 degrees or the cart leaves the track. Different seeds produce different episode lengths because both the initial state and the sampled action sequence depend on the seed. The reproducibility assertion confirms that seed=42 always produces the exact same 30-step trajectory in two independent sandboxes.

***

## Tic-tac-toe: policy evaluation

This example extends the CartPole infrastructure to a custom two-player environment and shows where sandboxes are more directly necessary. Policies are defined as code strings, the same pattern used in [RL Training with GSPO](/sandboxes/gspo-agentic-rl) for LLM-generated completions. A policy that crashes, loops, or behaves unexpectedly only kills its own sandbox; the rest of the evaluation runs unaffected.
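As a rough sketch of the code-string pattern, the snippet below loads a policy string into its own namespace with `exec` and calls its `choose_action` with a seeded RNG. It runs in the local process purely to show the mechanics and uses only the standard library; `load_policy` is a hypothetical helper name introduced here. In the pattern this page describes, the same steps happen inside a sandbox so that untrusted policy code never runs on the host.

```python
import random

# A policy is plain source text that defines choose_action(board, player, rng).
RANDOM_POLICY = """
def choose_action(board, player, rng):
    moves = [i for i, v in enumerate(board) if v is None]
    return rng.choice(moves)
"""


def load_policy(source: str):
    """Hypothetical helper: exec the source in a private namespace."""
    namespace = {}
    exec(source, namespace)  # only do this locally with trusted strings
    return namespace["choose_action"]


choose = load_policy(RANDOM_POLICY)
board = ["X", None, "O", None, None, None, None, None, None]

# The same seed yields the same move, which keeps evaluations repeatable.
move_a = choose(board[:], "X", random.Random(7))
move_b = choose(board[:], "X", random.Random(7))
print(move_a, move_b, move_a == move_b)
```

Giving each policy its own namespace means two policies can both define `choose_action`, or any other global, without overwriting each other.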
The data model is the same as CartPole: `TttConfig` extends `RolloutConfig` by replacing `env_name` with `policy_x` and `policy_o`; `run_ttt_batch` returns the same `RolloutResult`. The `total_reward` field becomes the mean return per game from X's perspective (+1 win, −1 loss, 0 draw). `evaluate_matchup` follows the same parallel dispatch pattern as `collect_parallel_rollouts`, running one sandbox per seed to get a reliable return estimate. This is the **policy evaluation** step in policy iteration — you would call it after each policy update to measure how much the return improved. ```python theme={null} from dotenv import load_dotenv load_dotenv() import json import statistics from concurrent.futures import ThreadPoolExecutor, as_completed from dataclasses import dataclass, field from typing import Dict, List, Tuple from tensorlake.sandbox import Sandbox # ─── Reuse RolloutResult from the CartPole section ──────────────────────────── # total_reward = mean reward per game, X's perspective (+1 win, -1 loss, 0 draw) # steps = total moves across all games in the batch # trajectory = list of per-game outcomes @dataclass class RolloutResult: seed: int total_reward: float steps: int trajectory: List[dict] = field(default_factory=list) # ─── Tic-tac-toe config ─────────────────────────────────────────────────────── # Extends the RolloutConfig pattern: swap env_name for policy_x / policy_o, # add n_games (games per sandbox call = one rollout batch). @dataclass class TttConfig: seed: int policy_x: str # key into POLICIES policy_o: str n_games: int = 50 # ─── Policies as code strings ───────────────────────────────────────────────── # Treat these like LLM-generated completions: they run inside the sandbox, # never in the host process. A buggy policy crashes its sandbox, not the loop. 
POLICIES: Dict[str, str] = { "random": """ def choose_action(board, player, rng): moves = [i for i, v in enumerate(board) if v is None] return rng.choice(moves) """, "greedy": """ def choose_action(board, player, rng): WINS = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)] opponent = "O" if player == "X" else "X" moves = [i for i, v in enumerate(board) if v is None] # Take the win if available for move in moves: b = board[:]; b[move] = player for a, c, d in WINS: if b[a] and b[a] == b[c] == b[d]: return move # Block the opponent's win for move in moves: b = board[:]; b[move] = opponent for a, c, d in WINS: if b[a] and b[a] == b[c] == b[d]: return move return rng.choice(moves) """, } # ─── Harness ────────────────────────────────────────────────────────────────── # Runs n_games games inside a single sandbox and returns the batch return. # Both policies execute in separate namespaces so they can't overwrite each # other's globals — important when policies come from different sources. 
_TTT_HARNESS = """ import json, random WINS = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)] ns_x, ns_o = {{}}, {{}} exec({policy_x!r}, ns_x); exec({policy_o!r}, ns_o) choose_x = ns_x["choose_action"]; choose_o = ns_o["choose_action"] def winner(b): for a, c, d in WINS: if b[a] and b[a] == b[c] == b[d]: return b[a] return None rng = random.Random({seed}) games = [] for _ in range({n_games}): board, moves_played = [None] * 9, 0 for turn in range(9): player = "X" if turn % 2 == 0 else "O" action = (choose_x if player == "X" else choose_o)(board[:], player, rng) board[action] = player; moves_played += 1 w = winner(board) if w: games.append({{"outcome": w + " wins", "reward": 1 if w == "X" else -1, "moves": moves_played}}) break else: games.append({{"outcome": "draw", "reward": 0, "moves": moves_played}}) print(json.dumps({{ "total_reward": sum(g["reward"] for g in games) / len(games), "steps": sum(g["moves"] for g in games), "trajectory": games, }})) """ # ─── Interactive move oracle ────────────────────────────────────────────────── # For interactive play the sandbox stays open for the whole game session. # Each opponent turn sends the current board and gets one action back. # timeout_secs gives the human up to 5 minutes of total think time. 
_MOVE_HARNESS = """ import random ns = {{}} exec({policy!r}, ns) action = ns["choose_action"]({board!r}, {player!r}, random.Random({seed})) print(action) """ WINS = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)] def _winner(board: list): for a, c, d in WINS: if board[a] and board[a] == board[c] == board[d]: return board[a] return None def _display(board: list) -> None: row = lambda i: " | ".join( str(i * 3 + j) if board[i * 3 + j] is None else board[i * 3 + j] for j in range(3) ) print(f" {row(0)}\n---+---+---\n {row(1)}\n---+---+---\n {row(2)}\n") def play_against(human_side: str = "X", opponent_policy: str = "greedy") -> None: """ Play a game of tic-tac-toe against a policy running in a sandbox. The sandbox opens once at the start of the game and stays live until the game ends. Each opponent turn is a single box.run() call — the policy code never executes in the host process. human_side: "X" (you move first) or "O" (opponent moves first) opponent_policy: any key in POLICIES """ assert human_side in ("X", "O"), "human_side must be 'X' or 'O'" opponent_side = "O" if human_side == "X" else "X" board = [None] * 9 print(f"\nYou are {human_side}. Opponent: {opponent_policy}.") print("Empty squares show their position number (0–8).\n") _display(board) # Keep one sandbox alive for the whole game — no re-creation per move box = Sandbox.create(memory_mb=1024, timeout_secs=300) for turn in range(9): player = "X" if turn % 2 == 0 else "O" available = [i for i, v in enumerate(board) if v is None] if player == human_side: while True: try: move = int(input(f"Your move ({human_side}), choose from {available}: ")) if move in available: break print(f" Square {move} is taken. 
Choose from {available}.") except ValueError: print(f" Enter a number from {available}.") else: # The seed is the turn number — deterministic but varies per turn harness = _MOVE_HARNESS.format( policy=POLICIES[opponent_policy], board=board, player=player, seed=turn, ) ex = box.run("python3", ["-c", harness]) move = int((ex.stdout or "").strip()) print(f" {opponent_side} ({opponent_policy}) plays {move}") board[move] = player _display(board) w = _winner(board) if w: print("You win!" if w == human_side else f"{opponent_policy} wins!") return print("Draw!") # ─── Single batch rollout ───────────────────────────────────────────────────── def run_ttt_batch(config: TttConfig) -> RolloutResult: """ Run one batch of n_games in a fresh sandbox and return a RolloutResult. Follows the same signature as run_single_rollout from the CartPole section: one config in, one RolloutResult out, one sandbox per call. """ harness = _TTT_HARNESS.format( policy_x=POLICIES[config.policy_x], policy_o=POLICIES[config.policy_o], seed=config.seed, n_games=config.n_games, ) box = Sandbox.create(memory_mb=1024) ex = box.run("python3", ["-c", harness]) data = json.loads((ex.stdout or "").strip()) return RolloutResult( seed=config.seed, total_reward=data["total_reward"], steps=data["steps"], trajectory=data["trajectory"], ) # ─── Policy evaluation ──────────────────────────────────────────────────────── def evaluate_matchup( policy_x: str, policy_o: str, seeds: List[int], n_games: int = 50, ) -> Tuple[float, float]: """ Run one batch per seed in parallel; return (mean_return, std_return). Follows the same parallel dispatch pattern as collect_parallel_rollouts: one sandbox per seed, all running concurrently. More seeds = tighter estimate of the true policy return. 
""" configs = [ TttConfig(seed=s, policy_x=policy_x, policy_o=policy_o, n_games=n_games) for s in seeds ] returns: List[float] = [0.0] * len(configs) with ThreadPoolExecutor(max_workers=len(configs)) as pool: futures = {pool.submit(run_ttt_batch, cfg): i for i, cfg in enumerate(configs)} for future in as_completed(futures): returns[futures[future]] = future.result().total_reward return statistics.mean(returns), statistics.stdev(returns) # ─── Q-learning ─────────────────────────────────────────────────────────────── # Uses str(s)+","+str(a) as Q-key to avoid f-string braces conflicting # with .format() when the harness template is rendered on the host. _QLEARN_HARNESS = """ import json, random WINS = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)] def greedy_move(board, rng): moves = [i for i, v in enumerate(board) if v is None] for move in moves: b = board[:]; b[move] = "O" for a, c, d in WINS: if b[a] and b[a] == b[c] == b[d]: return move for move in moves: b = board[:]; b[move] = "X" for a, c, d in WINS: if b[a] and b[a] == b[c] == b[d]: return move return rng.choice(moves) def winner(b): for a, c, d in WINS: if b[a] and b[a] == b[c] == b[d]: return b[a] return None def skey(b): return tuple(0 if v is None else 1 if v == "X" else 2 for v in b) def qkey(s, a): return str(s) + "," + str(a) def qv(q, s, a): return q.get(qkey(s, a), 0.0) q = json.loads({q_json!r}) rng = random.Random({seed}) alpha, gamma, epsilon = {alpha}, {gamma}, {epsilon} ep_rewards = [] for _ in range({n_episodes}): board = [None] * 9 ep_r = 0.0 while True: moves = [i for i, v in enumerate(board) if v is None] if not moves: ep_rewards.append(ep_r); break s = skey(board) a = rng.choice(moves) if rng.random() < epsilon else max(moves, key=lambda x: qv(q, s, x)) board[a] = "X" w = winner(board) if w or not any(v is None for v in board): r = 1.0 if w == "X" else -1.0 if w == "O" else 0.0 q[qkey(s, a)] = qv(q, s, a) + alpha * (r - qv(q, s, a)) ep_r += r; ep_rewards.append(ep_r); 
break board[greedy_move(board[:], rng)] = "O" w = winner(board) r = 1.0 if w == "X" else -1.0 if w == "O" else 0.0 s2 = skey(board) moves2 = [i for i, v in enumerate(board) if v is None] nq = max((qv(q, s2, x) for x in moves2), default=0.0) if moves2 else 0.0 q[qkey(s, a)] = qv(q, s, a) + alpha * (r + gamma * nq - qv(q, s, a)) ep_r += r if w or not moves2: ep_rewards.append(ep_r); break print(json.dumps({{"q_table": q, "mean_reward": sum(ep_rewards)/len(ep_rewards), "n_states": len(q)}})) """ @dataclass class QConfig: seed: int q_table: dict = field(default_factory=dict) epsilon: float = 0.3 # exploration rate — high early, can decay over iterations alpha: float = 0.5 # learning rate gamma: float = 0.9 # discount factor n_episodes: int = 300 def run_qlearning_iter(config: QConfig) -> dict: """Run one training iteration in a sandbox; return updated Q-table + stats.""" harness = _QLEARN_HARNESS.format( q_json=json.dumps(config.q_table), seed=config.seed, alpha=config.alpha, gamma=config.gamma, epsilon=config.epsilon, n_episodes=config.n_episodes, ) box = Sandbox.create(memory_mb=1024) ex = box.run("python3", ["-c", harness]) return json.loads((ex.stdout or "").strip()) def train_q(n_iter: int = 8, episodes_per_iter: int = 300) -> dict: """ Train a Q-table over n_iter sequential sandbox calls. Each call receives the Q-table from the previous iteration and returns an updated one. Mean reward moving from negative to positive confirms the policy is improving against the greedy opponent. """ q_table: dict = {} print(f"{'Iter':>5} {'Mean reward':>13} {'Q-states':>10}") print("-" * 34) for i in range(n_iter): result = run_qlearning_iter(QConfig(seed=i, q_table=q_table, n_episodes=episodes_per_iter)) q_table = result["q_table"] print(f"{i+1:>5} {result['mean_reward']:>+13.3f} {result['n_states']:>10}") return q_table def q_policy_code(q_table: dict) -> str: """ Serialize the Q-table into a choose_action string compatible with POLICIES. 
This lets the learned policy plug directly into evaluate_matchup and play_against without any changes to those functions. """ q_json = json.dumps(q_table) return ( "import json as _j\n" "_Q = _j.loads(" + repr(q_json) + ")\n" "def choose_action(board, player, rng):\n" " def skey(b): return tuple(0 if v is None else 1 if v == 'X' else 2 for v in b)\n" " def qkey(s, a): return str(s) + ',' + str(a)\n" " moves = [i for i, v in enumerate(board) if v is None]\n" " return max(moves, key=lambda a: _Q.get(qkey(skey(board), a), 0.0))\n" ) # ─── Main ───────────────────────────────────────────────────────────────────── if __name__ == "__main__": matchups = [ ("random", "random"), ("greedy", "random"), ("random", "greedy"), ("greedy", "greedy"), ] seeds = [0, 1, 2, 3] # one sandbox per seed per matchup = 16 sandboxes total print("Evaluating all matchups (4 seeds × 50 games each, one sandbox per seed)...") print(f"\n{'X policy':>10} {'O policy':>10} {'mean return':>13} {'std':>6}") print("-" * 48) eval_results = {} for x, o in matchups: mean, std = evaluate_matchup(x, o, seeds=seeds) eval_results[(x, o)] = (mean, std) print(f"{x:>10} {o:>10} {mean:>+13.3f} {std:>6.3f}") print( f"\n → Mean return is the expected reward per game from X's perspective\n" f" (+1 win, −1 loss, 0 draw), averaged over {seeds} seeds × 50 games.\n" f" greedy-vs-random ({eval_results[('greedy','random')][0]:+.3f}) shows how\n" f" strongly a win/block heuristic dominates pure chance.\n" f" greedy-vs-greedy ({eval_results[('greedy','greedy')][0]:+.3f} ≠ 0) reveals a\n" f" fork vulnerability: X can reach positions that greedy-O cannot\n" f" simultaneously block, which a stronger policy would eliminate.\n" f" Low std (0.04–0.10) confirms 4 seeds × 50 games is enough to\n" f" rank policies reliably — scale up seeds for tighter confidence intervals." 
) # ── Train and add the learned policy ───────────────────────────────────── print("\nTraining Q-learner vs greedy opponent (8 iterations × 300 episodes)...") q_table = train_q(n_iter=8, episodes_per_iter=300) # Serialize the Q-table into a choose_action string — same interface as # random and greedy, so evaluate_matchup works without any changes. POLICIES["q_learned"] = q_policy_code(q_table) print("\nEvaluating learned policy against baselines:") print(f"\n{'Matchup':>28} {'mean return':>13} {'std':>6}") print("-" * 54) for x, o in [("q_learned", "greedy"), ("greedy", "q_learned"), ("q_learned", "random")]: mean, std = evaluate_matchup(x, o, seeds=seeds) print(f"{x+' vs '+o:>28} {mean:>+13.3f} {std:>6.3f}") print( "\n → q_learned was trained as X against greedy O.\n" " It does not know how to play as O — greedy vs q_learned\n" " exposes this: the policy is role-specialized, not general." ) # ── Play a game ─────────────────────────────────────────────────────────── side = input("\nPlay a game? Choose your side [X/O] (or press Enter to skip): ").strip().upper() if side in ("X", "O"): available_policies = list(POLICIES.keys()) opp = input(f"Opponent policy {available_policies} (default: greedy): ").strip().lower() if opp not in POLICIES: opp = "greedy" play_against(human_side=side, opponent_policy=opp) ``` **Expected output — evaluation:** ``` Evaluating all matchups (4 seeds × 50 games each, one sandbox per seed)... X policy O policy mean return std ------------------------------------------------ random random +0.255 0.100 greedy random +0.900 0.043 random greedy -0.640 0.069 greedy greedy +0.180 0.059 → Mean return is the expected reward per game from X's perspective (+1 win, −1 loss, 0 draw), averaged over [0, 1, 2, 3] seeds × 50 games. greedy-vs-random (+0.900) shows how strongly a win/block heuristic dominates pure chance. 
    greedy-vs-greedy (+0.180 ≠ 0) reveals a
    fork vulnerability: X can reach positions that greedy-O cannot
    simultaneously block, which a stronger policy would eliminate.
    Low std (0.04–0.10) confirms 4 seeds × 50 games is enough to
    rank policies reliably.
```

**Expected output — Q-learning training:**

```
Training Q-learner vs greedy opponent (8 iterations × 300 episodes)...
 Iter     Mean reward     Q-states
----------------------------------
    1          -0.470          393
    2          -0.113          592
    3          -0.177          823
    4          -0.080          963
    5          +0.117         1051
    6          +0.087         1159
    7          +0.073         1226
    8          +0.053         1297

Evaluating learned policy against baselines:

                     Matchup     mean return      std
------------------------------------------------------
         q_learned vs greedy          +0.927    0.034
         greedy vs q_learned          +0.990    0.008
         q_learned vs random          +0.785    0.051

  → q_learned was trained as X against greedy O.
    It does not know how to play as O — greedy vs q_learned
    exposes this: the policy is role-specialized, not general.
```

The training loop passes the Q-table from each iteration into the next via JSON. Mean reward moving from −0.47 to +0.05 over 8 iterations shows the policy improving against a greedy opponent. Each iteration is a separate sandbox call: the host owns the Q-table and the loop control; the sandbox owns the episode dynamics.

The jump in Q-states (state-action entries in the Q-table) from 393 to 1297 reflects the agent exploring new board positions as its policy improves. Early iterations barely escape losing positions; later ones have enough coverage to exploit the greedy opponent's fork blindspot.

After training, `q_policy_code()` serializes the Q-table into a `choose_action` string with the same interface as `random` and `greedy`. This lets the learned policy drop into `evaluate_matchup` and `play_against` with zero changes to those functions.

After the evaluation the script prompts you to play. Choose `X` to move first or `O` to let the opponent open.

**Expected output — interactive game as O against greedy:**

```
Play a game? Choose your side [X/O] (or press Enter to skip): O
Opponent policy ['random', 'greedy', 'q_learned'] (default: greedy):

You are O. Opponent: greedy.
Empty squares show their position number (0–8).

 0 | 1 | 2
---+---+---
 3 | 4 | 5
---+---+---
 6 | 7 | 8

  X (greedy) plays 4
 0 | 1 | 2
---+---+---
 3 | X | 5
---+---+---
 6 | 7 | 8

Your move (O), choose from [0, 1, 2, 3, 5, 6, 7, 8]: 0
 O | 1 | 2
---+---+---
 3 | X | 5
---+---+---
 6 | 7 | 8

  X (greedy) plays 8
...
```

The opponent's policy code runs inside the sandbox on every turn: the `choose_action` function never executes in your host process. The sandbox stays open for the whole game session (`timeout_secs=300`); only the move oracle harness re-runs on each turn.

***

## Key design callouts

### Why the seed is embedded in the harness string

The seed is formatted directly into the Python script that runs inside the sandbox, not set via an environment variable or a host-side call. This means the host process's random state has no path into the episode. If you set the seed on the host and then passed the environment object into the sandbox, any host-side RNG calls between setup and rollout would shift the environment's random state relative to what you expected. Embedding it in the harness makes the episode fully self-contained.

In gymnasium specifically, `env.reset(seed=seed)` only seeds the observation and transition RNG. The action space has a separate RNG that must be seeded independently with `env.action_space.seed(seed)`. Forgetting the second call produces non-deterministic trajectories even when everything else is correct.

### Why each rollout gets its own sandbox

Sharing a sandbox across rollouts would mean sharing filesystem state, installed package versions, and any residual process state from prior episodes. Even if you call `env.reset()` correctly, state outside the environment object (temporary files, cached computations, mutated globals) can persist and affect the next episode. Creating a fresh sandbox per rollout makes the isolation structural rather than depending on careful cleanup.

### How this relates to the GSPO pattern

In [RL Training with GSPO](/sandboxes/gspo-agentic-rl), the sandbox is a reward oracle: each model completion is sent to a sandbox that runs a hidden test suite and returns a score. The concern is different there: what matters is that each completion is evaluated fairly, not that the environment is deterministic. But the underlying mechanism is the same: one sandbox per evaluation, no shared state. The reproducibility pattern here is what you would use when the environment itself (not just the evaluator) needs to be deterministic across training runs.

***

This example uses `python-dotenv` to load your Tensorlake API key. Create a `.env` file in your project root:

```
TENSORLAKE_API_KEY="your-api-key-here"
```

The SDK will pick it up automatically.

***

## What to build next

* Use sandboxes as a reward oracle to fine-tune a language model on code generation tasks.
* Dispatch parallel sandboxes across a swarm of worker agents for large-scale rollout collection.
* Freeze environment state mid-rollout to create branching experiments without re-running from scratch.

# Agentic Swarm Intelligence
Source: https://docs.tensorlake.ai/sandboxes/agentic-swarm-intelligence

Orchestrate a swarm of LLM agents running specialized tasks in parallel sandboxes.

Combine the power of LLM orchestration with secure, isolated execution environments. This guide shows how to build a "swarm" of agents, where multiple worker agents generate and execute code in parallel sandboxes to analyze a problem from different perspectives, and a lead agent synthesizes their findings.

## How it works

1. **Define Worker Agents**: Create a function that uses an LLM to generate code for a specific perspective (e.g., Scientific, Economic).
2.
**Execute in Sandboxes**: Each worker spins up a secure Tensorlake Sandbox to run the generated code and capture the output.
3. **Map (Parallelize)**: Launch multiple instances of the worker agent in parallel.
4. **Reduce (Aggregate)**: A lead agent receives all the reports and synthesizes a final insight.

***

## Prerequisites

```bash theme={null}
pip install tensorlake openai pydantic python-dotenv
```

## TypeScript SDK starter

If your orchestrator already runs in Node.js, use the same pattern: the LLM generates code, one sandbox executes it, and `Promise.all()` fans the scouts out in parallel.

````typescript theme={null}
import OpenAI from "openai";
import { Sandbox } from "tensorlake";

type ScoutReport = {
  perspective: string;
  score: number;
  insight: string;
};

const openai = new OpenAI();

async function scoutAgent(perspective: string): Promise<ScoutReport> {
  const prompt = `You are a ${perspective} analyst for a Mars mission. Write Python that prints one JSON object with perspective, score, and insight.`;

  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
  });

  // Strip any markdown fences the model wrapped around the script
  const generatedCode =
    response.choices[0].message.content
      ?.replace("```python", "")
      .replace("```", "")
      .trim() ?? "";

  const sandbox = await Sandbox.create({
    allowInternetAccess: false,
    timeoutSecs: 600,
  });

  try {
    await sandbox.run("pip", {
      args: ["install", "numpy", "--user", "--break-system-packages"],
    });
    const execution = await sandbox.run("python3", {
      args: ["-c", generatedCode],
    });
    return JSON.parse(execution.stdout) as ScoutReport;
  } finally {
    // Always release the sandbox, even if the generated code fails
    await sandbox.terminate();
  }
}

const reports = await Promise.all(
  ["Scientific", "Economic", "Ethical"].map(scoutAgent),
);
console.log(reports);
````

## Full example

This example simulates a Mars mission planning scenario where "scout" agents analyze different risks (Scientific, Economic, Ethical, etc.) by writing and running simulations in isolated sandboxes.
````python theme={null}
from dotenv import load_dotenv
load_dotenv()  # Load environment variables from .env file

from tensorlake.sandbox import Sandbox
from pydantic import BaseModel
from typing import List
from openai import OpenAI
from concurrent.futures import ThreadPoolExecutor

class ScoutReport(BaseModel):
    agent_id: int
    raw_data: str

class FinalInsight(BaseModel):
    summary: str

# 1. Worker Agent: LLM + Sandbox Execution
def scout_agent(task_id: int) -> ScoutReport:
    """Each scout analyzes a specific aspect of the mission."""
    perspectives = ["Scientific", "Economic", "Ethical", "Logistical", "Psychological"]
    perspective = perspectives[task_id % len(perspectives)]
    print(f"🕵️ Scout {task_id}: Analyzing {perspective} perspective...")

    client = OpenAI()

    # Step A: LLM decides what to do
    prompt = f"""
    You are a {perspective} analyst for a Mars mission.
    Write a Python script to perform a simple simulation using the 'numpy' library.
    The simulation should model a key factor from your perspective
    (e.g., scientific sensor data, economic cost projection, logistical supply levels).
    The script MUST print a single valid JSON string to standard output.
    This JSON should contain:
    'perspective': '{perspective}',
    'score': an integer from 0-100 derived from your simulation (higher is better),
    'insight': a brief, unique risk or opportunity revealed by the simulation.
    Do NOT use markdown blocks."""

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )

    # Clean up markdown formatting (remove ```python ... ```)
    generated_code = response.choices[0].message.content.replace("```python", "").replace("```", "").strip()
    print(f"🕵️ Scout {task_id}: Generated code -> {generated_code}")

    # Step B: Secure execution in a Sandbox
    sandbox = Sandbox.create()
    try:
        print(f"🕵️ Scout {task_id}: Installing dependencies in Sandbox...")
        sandbox.run("pip", ["install", "numpy", "--user", "--break-system-packages"])

        print(f"🕵️ Scout {task_id}: Running simulation in Sandbox...")
        execution = sandbox.run("python3", ["-c", generated_code])
        output = execution.stdout.strip()
    finally:
        # Release the sandbox even if installation or execution fails
        sandbox.terminate()

    print(f"🕵️ Scout {task_id}: Execution complete. Output: {output}")
    return ScoutReport(agent_id=task_id, raw_data=output)

# 2. Lead Agent: LLM Aggregator (The "Reducer")
def lead_aggregator(reports: List[ScoutReport]) -> FinalInsight:
    """The Lead LLM reviews all sandbox outputs to find patterns."""
    print(f"👑 Lead Agent: Received {len(reports)} scout reports. Aggregating...")
    client = OpenAI()

    combined_reports = "\n".join([f"Report {r.agent_id}: {r.raw_data}" for r in reports])

    # The 'Intelligence' step: synthesizing multiple sources
    prompt = (
        f"You are the Mission Commander for Mars Colonization. Review these viability reports:\n{combined_reports}\n\n"
        "1. Calculate the average viability score.\n"
        "2. Synthesize a strategic Go/No-Go recommendation.\n"
        "3. Summarize key risks."
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return FinalInsight(summary=response.choices[0].message.content)

# 3. The Swarm Application
def intelligence_swarm(count: int) -> str:
    print(f"🚀 Launching a swarm of {count} scouts...")

    # Parallel Map: Launch multiple sandboxed scouts
    with ThreadPoolExecutor() as executor:
        reports = list(executor.map(scout_agent, range(count)))

    # Reduce: Use the Lead Agent to combine results
    final_insight = lead_aggregator(reports)
    return final_insight.summary

if __name__ == "__main__":
    # This runs 5 parallel LLM calls, 5 parallel Sandboxes, and 1 Aggregator LLM call
    result = intelligence_swarm(count=5)
    print(f"\n--- SWARM INTELLIGENCE REPORT ---\n{result}")
````

### Workflow: Step-by-Step Execution

| Step  | Component        | Action                                                                                 |
| ----- | ---------------- | -------------------------------------------------------------------------------------- |
| **1** | **Orchestrator** | Triggers 5 parallel scout tasks using `executor.map(scout_agent, ...)` on a thread pool. |
| **2** | **Scout Agent**  | Leverages GPT-4o to draft a custom simulation script based on a specific perspective.  |
| **3** | **Sandbox**      | Securely installs `numpy`, handles dependencies, and executes the script in isolation. |
| **4** | **Scout Agent**  | Compiles simulation data into a structured `ScoutReport` for return.                   |
| **5** | **Lead Agent**   | Aggregates all reports and prompts GPT-4o for a final **Go/No-Go** decision.           |

***

This example uses the `python-dotenv` library to load your Tensorlake API key from a `.env` file. Create a file named `.env` in your project root and add your key:

```
TENSORLAKE_API_KEY="your-api-key-here"
```

The SDK will automatically use this key.

## Production Tips

### Reduce Latency with Snapshots

The example above runs `pip install numpy` inside every scout's sandbox. In a real swarm with dozens of agents, this adds unnecessary latency and bandwidth usage.

For production, create a "base" sandbox, install your common dependencies, and create a **Snapshot**. Then have your agents initialize from that snapshot, so the install step is skipped.

```python theme={null}
# 1.
Create a snapshot ID (do this once) # snapshot = sandbox.checkpoint() # 2. Use it in your agent sandbox = Sandbox.create(snapshot_id="snps_abc123") # Numpy is already installed! sandbox.run("python", ["-c", generated_code]) ``` See the Snapshots guide for details. ### Security: Lock down the network Since the scouts run code generated by an LLM, it is safer to disable internet access to prevent data exfiltration or malicious downloads. ```python theme={null} sandbox = Sandbox.create(allow_internet_access=False) # ... ``` ## What to build next Learn how to build a stateful code interpreter for a single agent. Optimize your swarm's startup time by pre-baking dependencies. # Async SDK (Python) Source: https://docs.tensorlake.ai/sandboxes/async Use the AsyncSandbox class to drive sandboxes from asyncio code. The Python SDK ships an async-native variant of the sandbox API on top of asyncio. Every method on the sync [`Sandbox`](/sandboxes/sdk-reference#sandbox-handle) handle has a one-to-one async counterpart on `AsyncSandbox` — same names, same parameters, just `async def` and awaited. ## When to use it Reach for the async API when: * You're driving multiple sandboxes concurrently (e.g. fanning out work with `asyncio.gather`). * Your application is already asyncio-based — FastAPI, aiohttp, an LLM agent loop, etc. — and you don't want to mix in blocking calls. * You're streaming output from many processes at once. If you only ever use one sandbox at a time and your code is otherwise synchronous, the sync `Sandbox` API is simpler and equivalent. ## The shape of the API ```python theme={null} from tensorlake.sandbox import AsyncSandbox ``` `AsyncSandbox` is the runtime handle for a single sandbox. Use `await AsyncSandbox.create(...)` to provision and connect, or `await AsyncSandbox.connect(sandbox_id)` to attach to an existing one. 
Every instance method is awaited: ```python theme={null} sandbox = await AsyncSandbox.create() result = await sandbox.run("python", ["-c", "print('hello')"]) await sandbox.write_file("/workspace/data.csv", b"name,score\nAlice,95\n") content = await sandbox.read_file("/workspace/data.csv") ``` Refer to the [SDK Reference](/sandboxes/sdk-reference) for the full method list — the names, parameters, and return types are identical to the sync API. The pages below walk through the same workflow with async syntax. ## Create and run ```python theme={null} import asyncio from tensorlake.sandbox import AsyncSandbox async def main(): sandbox = await AsyncSandbox.create(cpus=2.0, memory_mb=2048) try: result = await sandbox.run("python", ["-c", "print('hello')"]) print(result.stdout) finally: await sandbox.terminate() asyncio.run(main()) ``` `AsyncSandbox` is also an async context manager — use `async with` to terminate the sandbox automatically when the block exits: ```python theme={null} async with await AsyncSandbox.create(cpus=2.0, memory_mb=2048) as sandbox: result = await sandbox.run("python", ["-c", "print('hello')"]) print(result.stdout) # sandbox is terminated here ``` ## Run many sandboxes in parallel The async API is designed for fan-out. Use `asyncio.gather` to start and run sandboxes concurrently: ```python theme={null} import asyncio from tensorlake.sandbox import AsyncSandbox async def evaluate(prompt: str) -> str: async with await AsyncSandbox.create(cpus=1.0, memory_mb=1024) as sandbox: result = await sandbox.run("python", ["-c", prompt]) return result.stdout async def main(): prompts = [ "print(2 + 2)", "print(sum(range(100)))", "import math; print(math.pi)", ] outputs = await asyncio.gather(*(evaluate(p) for p in prompts)) for out in outputs: print(out.strip()) asyncio.run(main()) ``` Each `evaluate` call creates, executes against, and terminates its own sandbox in parallel with the others. 
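
When the fan-out grows past a handful of sandboxes, you may want to cap how many run at once. The following is a minimal sketch of that idea, not part of the Tensorlake SDK: `bounded_gather` and its `limit` parameter are hypothetical names, and `fake_evaluate` is a stand-in for the `evaluate` coroutine above so the snippet runs without an API key.

```python
import asyncio
from typing import Awaitable, Callable, List


async def bounded_gather(
    worker: Callable[[str], Awaitable[str]],
    items: List[str],
    limit: int = 2,
) -> List[str]:
    """Run worker(item) for every item, with at most `limit` running at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item: str) -> str:
        # Each task waits here until one of the `limit` slots is free
        async with semaphore:
            return await worker(item)

    # gather preserves input order in its result list
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_evaluate(prompt: str) -> str:
    # Hypothetical stand-in for `evaluate`; a real worker would create a sandbox here
    await asyncio.sleep(0.01)
    return f"ran: {prompt}"


if __name__ == "__main__":
    print(asyncio.run(bounded_gather(fake_evaluate, ["a", "b", "c"], limit=2)))
```

To use it with real sandboxes, pass the `evaluate` coroutine from the example above in place of `fake_evaluate`. The right limit depends on your account quotas and workload, which this sketch does not try to guess.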
## Connect to an existing sandbox Reattach to a named sandbox after `resume`, or operate on a sandbox another process created: ```python theme={null} sandbox = await AsyncSandbox.connect("my-env") info = await sandbox.info() print(info.sandbox_id) # sandbox.sandbox_id is now populated too ``` Unlike the sync `Sandbox.sandbox_id` property, which transparently fetches sandbox info on first access, the async `AsyncSandbox.sandbox_id` cannot block on a network call. Call `await sandbox.info()` (or any other awaited method that resolves the sandbox, like `status()`) once before reading `sandbox.sandbox_id` on a freshly connected handle. ## Background processes and streaming output Start a process, keep the handle, and collect its output once it finishes: ```python theme={null} proc = await sandbox.start_process("python", ["-c", """ import time for i in range(5): print(f'tick {i}') time.sleep(1) """]) print(proc.pid) # follow_output blocks until the process exits, then returns a TracedIterator # of the captured events you can iterate normally. events = await sandbox.follow_output(proc.pid) for event in events: print(event.line, end="") ``` For long-running processes you want to stop yourself, send a signal directly — don't `follow_output` first, since it would block waiting for the process to exit: ```python theme={null} import signal proc = await sandbox.start_process("python", ["-m", "http.server", "8080"]) # ... do work that talks to the server ... await sandbox.send_signal(proc.pid, signal.SIGTERM) ``` ## File operations ```python theme={null} await sandbox.write_file("/workspace/data.csv", b"name,score\nAlice,95\n") content = await sandbox.read_file("/workspace/data.csv") print(content.value.decode("utf-8")) listing = await sandbox.list_directory("/workspace") for entry in listing.value.entries: print(entry.name, entry.is_dir, entry.size) ``` ## Suspend, resume, and snapshot Suspend and resume require a named sandbox — pass `name=` at creation time. 
`checkpoint` works on any sandbox, including ephemeral ones. ```python theme={null} sandbox = await AsyncSandbox.create(name="my-env", cpus=1.0) await sandbox.suspend() await sandbox.resume() snapshot = await sandbox.checkpoint() restored = await AsyncSandbox.create(snapshot_id=snapshot.snapshot_id) ``` ## Learn more Full method list — applies to both sync and async APIs. State machine, suspend/resume, timeouts. Background processes, stdin, signals. Capture and restore full VM state. # Drive Chrome over CDP Source: https://docs.tensorlake.ai/sandboxes/chrome-cdp Run Google Chrome inside an ubuntu-vnc sandbox and drive it locally through the Chrome DevTools Protocol over a tunnel. The `tensorlake/ubuntu-vnc` image ships with Google Chrome pre-installed. Combined with [Local Tunnels](/sandboxes/tunnels), this gives you a real, sandboxed Chrome that any DevTools-Protocol client (Playwright, Puppeteer, `chrome-remote-interface`, plain WebSocket) can drive from your laptop as if it were running locally — no headless container, no screenshot polling, no public port. If you want to drive the whole XFCE desktop (mouse, keyboard, screenshots) instead of just Chrome, use the higher-level [Computer Use](/sandboxes/computer-use) API — it talks to the same `tensorlake/ubuntu-vnc` image through `connect_desktop()` / `connectDesktop()`. The two workflows compose: keep the agent loop on CDP and attach a human reviewer over VNC. This guide walks through: 1. Launching `tensorlake/ubuntu-vnc`. 2. Starting Chrome with CDP enabled on the desktop session. 3. Tunneling the CDP port to `127.0.0.1`. 4. Driving the browser from Python or Playwright. ## Prerequisites ```bash theme={null} curl -fsSL https://tensorlake.ai/install | sh export TENSORLAKE_API_KEY=your-api-key ``` You can also use `tl login` to obtain a Personal Access Token interactively. The desktop password for the managed `tensorlake/ubuntu-vnc` image is `tensorlake`. ## 1. 
Launch the Sandbox ```bash theme={null} tl sbx create -i tensorlake/ubuntu-vnc -c 4 -m 4096 chrome-cdp ``` `chrome-cdp` is a name (optional, but it lets you suspend and resume later). The CLI prints the new sandbox id; reuse it as `` below. Four CPUs and 4 GiB of RAM is a comfortable default for a single Chrome session. ## 2. Start Chrome with CDP Enabled Start Chrome on the existing VNC display (`:1`) as the desktop user (`tl-user`). Two flags matter: * `--remote-debugging-port=9222` opens the DevTools Protocol endpoint on `127.0.0.1:9222` inside the sandbox. * `--remote-allow-origins=*` is required by current Chrome versions before they will accept a WebSocket whose `Origin` is anything other than the request host. Without it the HTTP `/json/version` endpoint works but `ws://127.0.0.1:9222/devtools/...` returns `403 Forbidden`. ```bash theme={null} tl sbx exec -- bash -lc ' sudo -u tl-user bash -c " nohup env DISPLAY=:1 XAUTHORITY=/home/tl-user/.Xauthority \ google-chrome \ --no-first-run \ --no-default-browser-check \ --remote-debugging-port=9222 \ --remote-allow-origins=* \ --user-data-dir=/tmp/chrome-cdp \ > /tmp/chrome-cdp.log 2>&1 & disown " ' ``` ```python theme={null} from tensorlake.sandbox import Sandbox with Sandbox.connect("") as sandbox: sandbox.start_process( "sudo", args=[ "-u", "tl-user", "env", "DISPLAY=:1", "XAUTHORITY=/home/tl-user/.Xauthority", "google-chrome", "--no-first-run", "--no-default-browser-check", "--remote-debugging-port=9222", "--remote-allow-origins=*", "--user-data-dir=/tmp/chrome-cdp", ], ) ``` `start_process` returns immediately and the sandbox daemon keeps Chrome alive — no `nohup`, no shell, no log redirection. Stdout and stderr are captured by the daemon and reachable via `sandbox.get_stdout(pid)` / `sandbox.get_stderr(pid)` if you want to inspect them. 
```javascript theme={null} import { Sandbox } from "tensorlake"; const sandbox = await Sandbox.connect({ sandboxId: "" }); await sandbox.startProcess("sudo", { args: [ "-u", "tl-user", "env", "DISPLAY=:1", "XAUTHORITY=/home/tl-user/.Xauthority", "google-chrome", "--no-first-run", "--no-default-browser-check", "--remote-debugging-port=9222", "--remote-allow-origins=*", "--user-data-dir=/tmp/chrome-cdp", ], }); ``` `startProcess` returns once the daemon has spawned the child; Chrome keeps running in the background. Read its output later with `sandbox.getStdout(pid)` / `sandbox.getStderr(pid)`. Confirm CDP is up: ```bash theme={null} tl sbx exec -- bash -lc 'curl -s http://127.0.0.1:9222/json/version' ``` You should see a JSON response with `Browser`, `Protocol-Version`, and `webSocketDebuggerUrl`. Because Chrome is running on the VNC display `:1`, you can also attach a VNC viewer through the [Local Tunnels](/sandboxes/tunnels) workflow and watch it operate in real time. CDP control and human observation can run side by side. ## 3. Open a Tunnel Forward `127.0.0.1:9222` on your laptop to `127.0.0.1:9222` inside the sandbox: ```bash theme={null} tl sbx tunnel 9222 ``` Leave the command running. Open a second terminal for the rest of this guide. ```javascript theme={null} import { Sandbox } from "tensorlake"; const sandbox = await Sandbox.connect({ sandboxId: "" }); const tunnel = await sandbox.createTunnel(9222, { localPort: 9222 }); console.log(`CDP at http://127.0.0.1:${tunnel.address().port}`); // ... drive the browser ... await tunnel.close(); ``` Verify locally: ```bash theme={null} curl http://127.0.0.1:9222/json/version ``` Same JSON, but reached from your laptop. Every byte transits an authenticated WebSocket — port `9222` never has to be in `exposed_ports`. ## 4. Drive the Browser ### Open a Tab CDP exposes an HTTP control surface on the same port. 
Open a fresh tab with a `PUT`: ```bash theme={null} curl -X PUT "http://127.0.0.1:9222/json/new?https://news.ycombinator.com" ``` The response includes a `webSocketDebuggerUrl` for the new tab. List all tabs with `curl http://127.0.0.1:9222/json/list` and close one with `curl http://127.0.0.1:9222/json/close/`. ### Playwright ```python theme={null} from playwright.sync_api import sync_playwright with sync_playwright() as p: browser = p.chromium.connect_over_cdp("http://127.0.0.1:9222") context = browser.contexts[0] page = context.new_page() page.goto("https://news.ycombinator.com") titles = page.locator(".titleline > a").all_text_contents() print(titles[:5]) ``` ```javascript theme={null} import { chromium } from "playwright"; const browser = await chromium.connectOverCDP("http://127.0.0.1:9222"); const [context] = browser.contexts(); const page = await context.newPage(); await page.goto("https://news.ycombinator.com"); const titles = await page.locator(".titleline > a").allTextContents(); console.log(titles.slice(0, 5)); ``` ### Raw CDP via WebSocket When you want to issue protocol calls directly — `Runtime.evaluate`, `Page.navigate`, `DOM.getDocument` — connect to the per-tab WebSocket and exchange JSON messages: ```python theme={null} import json import urllib.request import websocket # pip install websocket-client targets = json.loads(urllib.request.urlopen("http://127.0.0.1:9222/json/list").read()) page = next(t for t in targets if t["type"] == "page") ws = websocket.create_connection(page["webSocketDebuggerUrl"]) ws.send(json.dumps({ "id": 1, "method": "Runtime.evaluate", "params": { "expression": "document.title", "returnByValue": True, }, })) print(json.loads(ws.recv())["result"]["result"]["value"]) ws.close() ``` This is also the path you take when wiring CDP into an LLM agent: expose `open_url`, `evaluate`, and `list_targets` as tools that wrap these calls. 
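
The tool wrappers mentioned above could look roughly like this. It is a minimal sketch using only the Python standard library. The function names (`new_tab_request`, `open_url`, `list_targets`, `evaluate_message`) and the `CDP_BASE` constant are illustrative, and it assumes the tunnel from step 3 is listening on `127.0.0.1:9222`.

```python
import json
import urllib.request

CDP_BASE = "http://127.0.0.1:9222"  # assumption: tunnel from step 3 on the default port


def new_tab_request(url: str, base: str = CDP_BASE) -> urllib.request.Request:
    """Build the PUT request that asks Chrome to open `url` in a fresh tab."""
    return urllib.request.Request(f"{base}/json/new?{url}", method="PUT")


def open_url(url: str, base: str = CDP_BASE) -> dict:
    """Open a tab and return Chrome's JSON description of the new target."""
    with urllib.request.urlopen(new_tab_request(url, base)) as resp:
        return json.loads(resp.read())


def list_targets(base: str = CDP_BASE) -> list:
    """Return every target (tab, worker, ...) Chrome currently exposes."""
    with urllib.request.urlopen(f"{base}/json/list") as resp:
        return json.loads(resp.read())


def evaluate_message(expression: str, msg_id: int = 1) -> str:
    """Serialize a Runtime.evaluate call to send over a tab's WebSocket."""
    return json.dumps({
        "id": msg_id,
        "method": "Runtime.evaluate",
        "params": {"expression": expression, "returnByValue": True},
    })
```

`open_url` and `list_targets` talk to the HTTP control surface, while `evaluate_message` only builds the payload; sending it still goes through the per-tab WebSocket shown in the raw CDP example. Keeping message construction separate from transport makes the tools easy to unit test without a running browser.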
### Coding Agents (`chrome-devtools` MCP) Claude Code and OpenAI Codex can both drive the same sandboxed Chrome through the official [`chrome-devtools-mcp`](https://github.com/ChromeDevTools/chrome-devtools-mcp) server. The MCP attaches to an existing Chrome via `--browser-url`; match that URL to the tunnel's local port and no other configuration is needed — using Chrome's canonical `9222` on both sides keeps everything default-on-default. Register the MCP once for your user: ```bash theme={null} claude mcp add chrome-devtools -- npx chrome-devtools-mcp@latest \ --browser-url http://127.0.0.1:9222 ``` Stored at user scope by default. Pass `--scope project` to write it to the current project's `.mcp.json` instead. ```bash theme={null} codex mcp add chrome-devtools -- npx chrome-devtools-mcp@latest \ --browser-url http://127.0.0.1:9222 ``` Writes to `~/.codex/config.toml` (or `$CODEX_HOME/config.toml`). Codex has no project-vs-user scope — the file is always user-global. The equivalent block, if you prefer to edit the file by hand: ```toml theme={null} [mcp_servers.chrome-devtools] command = "npx" args = ["chrome-devtools-mcp@latest", "--browser-url", "http://127.0.0.1:9222"] ``` The `--browser-url` flag is what tells the MCP to attach to an existing Chrome instead of launching its own. With Chrome already running inside the sandbox (step 2) and a tunnel open at the default local port: ```bash theme={null} tl sbx tunnel 9222 ``` restart the agent so it picks up the new MCP (Claude Code re-reads on launch; Codex reads `config.toml` at startup and does not hot-reload), then ask it to do something in the browser: ``` > open https://news.ycombinator.com and read the first headline ``` The agent routes that through `chrome-devtools` → `127.0.0.1:9222` → tunnel → sandbox Chrome on display `:1`. 
If port `9222` is already taken on your laptop (a local Chrome with debugging on, another tunnel, etc.), pick any free port for both sides and keep them aligned: ```bash theme={null} # tunnel the sandbox's 9222 to local 12222 tl sbx tunnel 9222 --listen-port 12222 # point the MCP at the same local port claude mcp add chrome-devtools -- npx chrome-devtools-mcp@latest \ --browser-url http://127.0.0.1:12222 ``` ```bash theme={null} # tunnel the sandbox's 9222 to local 12222 tl sbx tunnel 9222 --listen-port 12222 # point the MCP at the same local port codex mcp add chrome-devtools -- npx chrome-devtools-mcp@latest \ --browser-url http://127.0.0.1:12222 ``` Verify the path before you point an agent at it: `curl http://127.0.0.1:9222/json/version` should return Chrome's JSON. The tunnel CLI keeps the local port bound even when the sandbox upstream goes away (terminated, suspended without auto-resume), so a hung `curl` usually means the sandbox is gone, not that the MCP is misconfigured. ## 5. Tear Down Stop the tunnel with `Ctrl+C`. Stop Chrome inside the sandbox when you no longer need it: ```bash theme={null} tl sbx exec -- bash -lc 'sudo -u tl-user pkill -f google-chrome || true' ``` Suspend the sandbox to keep the user-data-dir warm for next time, or terminate it to release resources: ```bash theme={null} tl sbx suspend # named sandboxes only tl sbx terminate ``` ## Notes and Pitfalls * **`--remote-allow-origins=*` is required** for Chrome ≥ 111. Without it, the HTTP CDP endpoints work but every WebSocket handshake fails with `403`. Restart Chrome with the flag if you forget. * **Bind address.** `--remote-debugging-port` only listens on `127.0.0.1` by default, which is exactly what you want — the tunnel forwards to `127.0.0.1` inside the sandbox, so DevTools stays unreachable from anywhere else. 
* **`--user-data-dir` is required for CDP.** Chrome ≥ 136 refuses to enable `--remote-debugging-port` against the default profile and prints `DevTools remote debugging requires a non-default data directory. Specify this using --user-data-dir.` to its log. Always pass `--user-data-dir=/tmp/` (or any path other than `~/.config/google-chrome`). * **Headless mode.** If you do not need the VNC view, you can launch with `--headless=new` instead of attaching to display `:1`. The tunneling and CDP usage remain identical. * **Sandboxing inside containers.** Chrome's setuid sandbox sometimes fails inside container/VM combinations. If you see `Failed to move to new namespace` errors, add `--no-sandbox` to the launch flags. * **Multiple agents.** Each tab has its own `webSocketDebuggerUrl`. Two clients can drive different tabs of the same Chrome at the same time — useful when an agent loop and a human reviewer both want a window. ## Related Guides * [Computer Use](/sandboxes/computer-use) — drive the full XFCE desktop (mouse, keyboard, screenshots) on the same `tensorlake/ubuntu-vnc` image. * [Local Tunnels](/sandboxes/tunnels) — the tunneling primitive that carries CDP traffic from your laptop into the sandbox. * [Snapshots](/sandboxes/snapshots) — fork a warmed-up Chrome profile so parallel agents start with cookies, history, and extensions already in place. # CICD & Build Systems Source: https://docs.tensorlake.ai/sandboxes/cicd-build Execute build steps and run tests in isolated, reproducible environments. Build systems and CI/CD pipelines often require clean, isolated environments to ensure reproducibility and prevent dependency conflicts. Tensorlake Sandboxes allow you to spin up ephemeral containers on demand, upload source code, run tests, and retrieve artifacts. This example demonstrates a complete mini-CI pipeline that creates a dummy project, runs tests, and builds a distribution package inside a sandbox. 
## TypeScript SDK starter

If your build runner is already in Node.js, the workflow is the same: stream project files into a sandbox, run each CI step, and pull artifacts back out if needed.

```typescript theme={null}
import { readFile, readdir } from "node:fs/promises";
import path from "node:path";
import { Sandbox } from "tensorlake";

// Recursively copy a local directory into the sandbox
async function copyTree(sandbox: Sandbox, localDir: string, remoteDir: string) {
  await sandbox.run("mkdir", { args: ["-p", remoteDir] });
  for (const entry of await readdir(localDir, { withFileTypes: true })) {
    const localPath = path.join(localDir, entry.name);
    // Remote paths are always POSIX, regardless of the host OS
    const remotePath = path.posix.join(remoteDir, entry.name);
    if (entry.isDirectory()) {
      await copyTree(sandbox, localPath, remotePath);
    } else {
      await sandbox.writeFile(remotePath, await readFile(localPath));
    }
  }
}

// Run one CI step and fail fast on a non-zero exit code
async function runStep(
  sandbox: Sandbox,
  name: string,
  command: string,
  args: string[],
) {
  const result = await sandbox.run(command, {
    args,
    workingDir: "/workspace/project",
  });
  if (result.exitCode !== 0) {
    throw new Error(`${name} failed\n${result.stderr}`);
  }
}

const sandbox = await Sandbox.create({ timeoutSecs: 900 });

try {
  await copyTree(sandbox, "./my_cool_project", "/workspace/project");
  await runStep(sandbox, "Install Dependencies", "pip", [
    "install",
    "-r",
    "requirements.txt",
    "--user",
    "--break-system-packages",
  ]);
  await runStep(sandbox, "Run Tests", "python", [
    "-m",
    "pytest",
    "/workspace/project",
  ]);
  await runStep(sandbox, "Build Package", "python", [
    "setup.py",
    "sdist",
    "bdist_wheel",
  ]);
} finally {
  await sandbox.terminate();
}
```

## Example: CI/CD Pipeline

The following script simulates a CI pipeline. It generates a simple Python project, uploads it to a sandbox, installs dependencies, runs `pytest`, and builds a wheel file.
```python theme={null}
import os
import posixpath
import tempfile
import shutil

from dotenv import load_dotenv
load_dotenv()

from tensorlake.sandbox import Sandbox

def create_dummy_project(base_dir):
    """
    Helper to create a simple Python project layout locally
    so we have something to build/test.
    """
    project_root = os.path.join(base_dir, "my_cool_project")
    src_dir = os.path.join(project_root, "src", "my_cool_project")
    tests_dir = os.path.join(project_root, "tests")

    os.makedirs(src_dir, exist_ok=True)
    os.makedirs(tests_dir, exist_ok=True)

    # Create setup.py
    with open(os.path.join(project_root, "setup.py"), "w") as f:
        f.write("from setuptools import setup, find_packages\n"
                "setup(name='my_cool_project', version='0.1.0', "
                "package_dir={'': 'src'}, packages=find_packages(where='src'))")

    # Create source code
    with open(os.path.join(src_dir, "__init__.py"), "w") as f:
        f.write("def add(a, b): return a + b")

    # Create test
    with open(os.path.join(tests_dir, "test_logic.py"), "w") as f:
        f.write("from my_cool_project import add\ndef test_add(): assert add(2, 3) == 5")

    # Create requirements.txt
    with open(os.path.join(project_root, "requirements.txt"), "w") as f:
        f.write("pytest\nsetuptools\nwheel\n")

    return project_root

def run_ci_step(sandbox, name, command, working_dir="/workspace/project", env=None):
    """Runs a command in the sandbox, prints the output, and checks for errors."""
    print(f"--- Running Step: {name} ---")
    result = sandbox.run(command[0], command[1:], env=env, working_dir=working_dir)

    print(f"STDOUT:\n{result.stdout}")
    if result.stderr:
        print(f"STDERR:\n{result.stderr}")

    if result.exit_code != 0:
        print(f"❌ Step '{name}' FAILED with exit code {result.exit_code}")
        raise RuntimeError(f"CI step '{name}' failed.")
    else:
        print(f"✅ Step '{name}' PASSED")
    print("-" * (len(name) + 20))

def copy_to_sandbox(sandbox, local_path, remote_path):
    """Recursively copies a local directory to the sandbox."""
    print(f"Copying {local_path} -> {remote_path} ...")
    sandbox.run("mkdir", ["-p", remote_path])

    for root, dirs, files in os.walk(local_path):
        rel_root = os.path.relpath(root, local_path)
        # Remote paths are always POSIX, even when the host is Windows
        remote_root = (
            remote_path
            if rel_root == "."
            else posixpath.join(remote_path, *rel_root.split(os.sep))
        )

        for d in dirs:
            sandbox.run("mkdir", ["-p", posixpath.join(remote_root, d)])

        for file in files:
            local_file = os.path.join(root, file)
            remote_file = posixpath.join(remote_root, file)
            with open(local_file, "rb") as f:
                sandbox.write_file(remote_file, f.read())

def main():
    # 1. Setup local dummy project
    temp_dir = tempfile.mkdtemp()
    project_path = create_dummy_project(temp_dir)
    print(f"Dummy project created at: {project_path}")

    sandbox = None
    try:
        # 2. Create Sandbox
        sandbox = Sandbox.create()
        print("🚀 Sandbox created for CI/CD pipeline.")

        # 3. Upload Code
        copy_to_sandbox(sandbox, project_path, "/workspace/project")

        # 4. Install Dependencies
        run_ci_step(
            sandbox,
            "Install Dependencies",
            ["pip", "install", "-r", "requirements.txt", "--user", "--break-system-packages"],
            working_dir="/workspace/project"
        )

        # 5. Run Tests
        run_ci_step(
            sandbox,
            "Run Tests",
            ["python", "-m", "pytest", "/workspace/project"],
            env={"PYTHONPATH": "/workspace/project/src"},
        )

        # 6. Build Artifacts
        run_ci_step(sandbox, "Build Package", ["python", "setup.py", "sdist", "bdist_wheel"])

        print("\n🎉 CI/CD Pipeline finished successfully! 🎉")

    except Exception as e:
        print(f"\n🔥 CI/CD Pipeline FAILED: {e}")
    finally:
        # Release the sandbox and clean up the local temp project
        if sandbox is not None:
            sandbox.terminate()
        shutil.rmtree(temp_dir)

if __name__ == "__main__":
    main()
```

## How It Works

1. **Environment Creation**: The script creates a fresh sandbox. This ensures no leftover files or environment variables from previous builds affect the current run.
2. **File Injection**: The custom `copy_to_sandbox` function walks the local directory tree and streams files into the sandbox using `sandbox.write_file()`. This simulates the "checkout" phase of a CI pipeline.
3.
**Step Execution**: The `run_ci_step` helper function executes shell commands (like `pip` and `pytest`) inside the sandbox using `sandbox.run()`. It captures `stdout`, `stderr`, and exit codes to determine success or failure. 4. **Artifact Generation**: The build step generates `.whl` and `.tar.gz` files inside the sandbox. In a real-world scenario, you would use `sandbox.read_file()` to download these artifacts back to your storage. ## Learn More Learn how to efficiently move large files and directories in and out of sandboxes. Understand how to manage long-running processes and handle exit codes. # Execute Commands Source: https://docs.tensorlake.ai/sandboxes/commands Run commands with output capture, streaming, and error handling Run shell commands inside sandboxes with full stdout/stderr capture, real-time streaming, and configurable timeouts. Sandbox-specific operations use the sandbox proxy URL: `https://.sandbox.tensorlake.ai` For named sandboxes, you can use the sandbox **name** in place of the ID — both in the proxy hostname and in CLI/API commands. For example, `https://my-env.sandbox.tensorlake.ai/api/v1/processes` and `tl sbx exec my-env python main.py` work the same as their ID-based equivalents. The proxy resolves the name to the underlying sandbox automatically. The command and process APIs documented here run on the management URL on port `9501`, which always requires authentication. Unauthenticated proxy access applies only to exposed user ports. ## Basic Execution ```bash theme={null} # Run in an existing sandbox — use the sandbox ID or name tl sbx exec my-env python -c 'print("Hello from sandbox!")' # Or create, run, and tear down in one step tl sbx run python -c 'print("Hello from sandbox!")' ``` ```python theme={null} from tensorlake.sandbox import Sandbox sandbox = Sandbox.create() result = sandbox.run("python", ["-c", "print('Hello from sandbox!')"]) print(result.stdout) # Hello from sandbox! 
print(result.exit_code) # 0 ``` ```typescript theme={null} const result = await sandbox.run("python", { args: ["-c", "print('Hello from sandbox!')"], }); console.log(result.stdout); // Hello from sandbox! console.log(result.exitCode); // 0 ``` ```bash theme={null} # Start a Python process inside the sandbox curl -X POST https://.sandbox.tensorlake.ai/api/v1/processes \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "command": "python", "args": ["-c", "print(\"Hello from sandbox!\")"] }' ``` **Response:** ```json theme={null} { "pid": 294, "status": "running", "exit_code": null, "signal": null, "stdin_writable": false, "command": "python", "args": ["-c", "print(\"Hello from sandbox!\")"], "started_at": 1773950042728, "ended_at": null } ``` ## CLI Options ```bash theme={null} # Timeout is in seconds tl sbx exec --timeout 10 python -c 'print("hi")' # Run from a specific working directory tl sbx exec --workdir /workspace python main.py # Inject environment variables into a single command tl sbx exec --env MODE=prod --env DEBUG=0 /bin/sh -lc 'printf "%s %s\n" "$MODE" "$DEBUG"' # Keep the sandbox after a one-shot run so you can inspect it afterwards tl sbx run --keep /bin/sh -lc 'echo KEEP_TEST && sleep 1' ``` A verified `--env` run printed `prod 0`. A verified `--keep` run ended with `Sandbox kept alive.`, and `tl sbx ls --all` then showed that sandbox as `running`. 
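When a script drives the CLI rather than the SDK, the flags above can be assembled programmatically. The following is a minimal sketch, not part of the CLI or SDK: `build_exec_argv` is a hypothetical helper, and it assumes the options come first, followed by the sandbox ID or name and then the command, as in `tl sbx exec my-env python main.py`.

```python
def build_exec_argv(sandbox, command, *, timeout=None, workdir=None, env=None):
    """Build an argv list for `tl sbx exec` (hypothetical helper)."""
    argv = ["tl", "sbx", "exec"]
    if timeout is not None:
        argv += ["--timeout", str(timeout)]  # seconds
    if workdir is not None:
        argv += ["--workdir", workdir]
    for key, value in (env or {}).items():
        argv += ["--env", f"{key}={value}"]  # one --env flag per variable
    # Assumption: the sandbox ID or name precedes the command to execute.
    return argv + [sandbox, *command]


# Usage (requires the `tl` CLI on PATH):
#   import subprocess
#   subprocess.run(build_exec_argv("my-env", ["python", "main.py"], timeout=10), check=True)
```

Passing the result to `subprocess.run` as a list avoids shell quoting problems for values that contain spaces.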
```python theme={null} # Run with a timeout, working directory, and per-command environment result = sandbox.run( "python", ["main.py"], env={"MODE": "prod", "DEBUG": "0"}, working_dir="/workspace", timeout=10, ) ``` ```typescript theme={null} const result = await sandbox.run("python", { args: ["main.py"], env: { MODE: "prod", DEBUG: "0" }, workingDir: "/workspace", timeout: 10, }); console.log(result.exitCode); ``` ```bash theme={null} # Start a process with custom environment variables and working directory curl -X POST https://.sandbox.tensorlake.ai/api/v1/processes \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "command": "python", "args": ["main.py"], "env": {"MODE": "prod", "DEBUG": "0"}, "working_dir": "/workspace" }' ``` ## Shell Commands ```bash theme={null} # Count Python files in /workspace with a pipe tl sbx exec bash -c "ls -la /workspace | grep '.py' | wc -l" # Redirect stdout and stderr to files tl sbx exec bash -c "python script.py > output.txt 2> errors.txt" # Chain setup and execution in one shell command tl sbx exec bash -c "cd /workspace && pip install -r requirements.txt && python main.py" ``` ```python theme={null} sandbox = Sandbox.create() # Pipes result = sandbox.run("bash", ["-c", "ls -la /workspace | grep '.py' | wc -l"]) print(result.stdout) # Redirects sandbox.run("bash", ["-c", "python script.py > output.txt 2> errors.txt"]) # Command chaining sandbox.run("bash", ["-c", "cd /workspace && pip install -r requirements.txt && python main.py"]) ``` ```typescript theme={null} const count = await sandbox.run("bash", { args: ["-lc", "ls -la /workspace | grep '.py' | wc -l"], }); console.log(count.stdout); await sandbox.run("bash", { args: ["-lc", "python script.py > output.txt 2> errors.txt"], }); await sandbox.run("bash", { args: [ "-lc", "cd /workspace && pip install -r requirements.txt && python main.py", ], }); ``` ```bash theme={null} # Use bash when you need pipes, redirects, or command chaining 
curl -X POST https://.sandbox.tensorlake.ai/api/v1/processes \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "command": "bash", "args": ["-c", "ls -la /workspace | grep .py | wc -l"] }' ``` ## Get Process Output ```python theme={null} sandbox = Sandbox.create() result = sandbox.run("python", ["-c", "print('hello')"]) print(result.stdout) print(result.stderr) ``` ```typescript theme={null} import { ProcessStatus } from "tensorlake"; const proc = await sandbox.startProcess("python", { args: [ "-c", "import sys; print('hello'); print('oops', file=sys.stderr)", ], }); let info = await sandbox.getProcess(proc.pid); while (info.status === ProcessStatus.RUNNING) { await new Promise((resolve) => setTimeout(resolve, 100)); info = await sandbox.getProcess(proc.pid); } console.log((await sandbox.getStdout(proc.pid)).lines); console.log((await sandbox.getStderr(proc.pid)).lines); console.log((await sandbox.getOutput(proc.pid)).lines); ``` ```bash theme={null} # Get stdout curl https://.sandbox.tensorlake.ai/api/v1/processes//stdout \ -H "Authorization: Bearer $TL_API_KEY" # Get stderr curl https://.sandbox.tensorlake.ai/api/v1/processes//stderr \ -H "Authorization: Bearer $TL_API_KEY" # Get combined output curl https://.sandbox.tensorlake.ai/api/v1/processes//output \ -H "Authorization: Bearer $TL_API_KEY" ``` **Combined output response:** ```json theme={null} { "pid": 297, "lines": ["hello", "oops"], "line_count": 2 } ``` Not supported in the CLI. 
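Output is only complete once the process has exited, which is why the TypeScript example polls the process status before reading it. The same idea can be sketched in Python without assuming any particular SDK method: `fetch_process` below is a hypothetical callable that returns the process-status JSON as a dict (for example, a thin wrapper around the process status endpoint), and the `running` status value is taken from the responses shown in this guide.

```python
import time


def wait_for_exit(fetch_process, pid, *, interval=0.1, timeout=30.0):
    """Poll until the process leaves the "running" state, then return its info dict.

    `fetch_process` is a hypothetical callable: pid -> dict with a "status" key.
    """
    deadline = time.monotonic() + timeout
    while True:
        info = fetch_process(pid)
        if info["status"] != "running":
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"process {pid} still running after {timeout}s")
        time.sleep(interval)
```

Once `wait_for_exit` returns, the stdout, stderr, and combined-output endpoints hold the full output of the process.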
## Interactive Shell

```bash theme={null}
# Open an interactive shell in the sandbox
tl sbx ssh <sandbox-id>

# Use a custom shell
tl sbx ssh <sandbox-id> --shell /bin/sh
```

```python theme={null}
sandbox = Sandbox.create()

pty = sandbox.create_pty(
    command="/bin/bash",
    rows=24,
    cols=80,
)
pty.send_input("pwd\nexit\n")
print(pty.wait())
```

```typescript theme={null}
const pty = await sandbox.createPty({
  command: "/bin/bash",
  rows: 24,
  cols: 80,
});
await pty.sendInput("pwd\nexit\n");
console.log(await pty.wait());
```

```bash theme={null}
# 1. Create a PTY session
curl -X POST https://<sandbox-id>.sandbox.tensorlake.ai/api/v1/pty \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"command": "/bin/bash", "rows": 24, "cols": 80}'
```

**Response:**

```json theme={null}
{
  "session_id": "LYtJOrxE9Kz3bphPUDzuX",
  "token": "<token>"
}
```

```bash theme={null}
# 2. Connect via WebSocket
wscat -c "wss://<sandbox-id>.sandbox.tensorlake.ai/api/v1/pty/<session-id>/ws?token=<token>"
```

`tl sbx ssh` requires an interactive terminal and automatically resumes a suspended sandbox before opening the PTY session.

For the full programmatic PTY flow, including the `READY` handshake, binary WebSocket opcodes, and clean shutdown, see [PTY Sessions](/sandboxes/pty-sessions).
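The two HTTP steps above fit together by feeding the `session_id` and `token` from the create response into the WebSocket URL. The sketch below shows that assembly with the standard library only. `pty_ws_url` is an illustrative name, and the URL layout (sandbox ID or name as the host prefix, session ID as a path segment, token as a query parameter) is an assumption read off the `wscat` example, not a documented contract.

```python
import json
from urllib.parse import quote, urlencode


def pty_ws_url(sandbox: str, create_response: str) -> str:
    """Build the PTY WebSocket URL from the JSON body returned by POST /api/v1/pty."""
    session = json.loads(create_response)
    host = f"{sandbox}.sandbox.tensorlake.ai"
    # Percent-encode both values so unusual characters cannot break the URL.
    path = f"/api/v1/pty/{quote(session['session_id'], safe='')}/ws"
    query = urlencode({"token": session["token"]})
    return f"wss://{host}{path}?{query}"
```

Treat the resulting URL as a secret for the lifetime of the session, since it embeds the token.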
## Error Handling

```bash theme={null}
# The CLI prints stderr and returns a non-zero exit code on failure
tl sbx exec <sandbox-id> python -c "import nonexistent_module"
```

```python theme={null}
sandbox = Sandbox.create()

result = sandbox.run("python", ["-c", "import nonexistent_module"])
if result.exit_code != 0:
    print(f"Command failed with exit code {result.exit_code}")
    print(f"stderr: {result.stderr}")
```

```typescript theme={null}
const result = await sandbox.run("python", {
  args: ["-c", "import nonexistent_module"],
});

if (result.exitCode !== 0) {
  console.log(`Command failed with exit code ${result.exitCode}`);
  console.log(`stderr: ${result.stderr}`);
}
```

```bash theme={null}
# Check the exited process status
curl https://<sandbox-id>.sandbox.tensorlake.ai/api/v1/processes/<pid> \
  -H "Authorization: Bearer $TL_API_KEY"
```

**Process status response:**

```json theme={null}
{
  "pid": 305,
  "status": "exited",
  "exit_code": 1,
  "signal": null,
  "stdin_writable": false,
  "command": "python",
  "args": ["-c", "import nonexistent_module"],
  "started_at": 1773950228855,
  "ended_at": 1773950228866
}
```

**stderr response:**

```json theme={null}
{
  "pid": 305,
  "lines": [
    "Traceback (most recent call last):",
    "  File \"<string>\", line 1, in <module>",
    "ModuleNotFoundError: No module named 'nonexistent_module'"
  ],
  "line_count": 3
}
```

## Streaming Output

Stream stdout/stderr in real time for long-running commands using Server-Sent Events.

`tl sbx exec` streams combined output to your terminal while the process runs.
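Clients that read the follow endpoints directly need to decode the Server-Sent Events framing themselves. The sketch below is one way to do that with the standard library. It assumes standard SSE framing (an `event:` line, a `data:` line carrying JSON, and a blank line between events) and the `output` and `eof` event names used in this guide; `iter_sse_events` is an illustrative helper, not an SDK function.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, data_dict) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)
    if data:
        # Flush a final event that was not followed by a blank line.
        yield event, json.loads("\n".join(data))


# Usage: print lines as they arrive and stop at the end-of-stream marker.
#   for name, payload in iter_sse_events(response_lines):
#       if name == "eof":
#           break
#       print(payload["line"])
```

Any HTTP client that can iterate over response lines can feed this generator.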
```python theme={null} sandbox = Sandbox.create() # Start a long-running process proc = sandbox.start_process("python", ["-c", """ import time for i in range(5): print(f"Step {i+1}/5") time.sleep(1) """]) # Stream output as it arrives for event in sandbox.follow_output(proc.pid): print(event.line, end="") ``` ```typescript theme={null} const proc = await sandbox.startProcess("python", { args: [ "-c", "import time\nfor i in range(5):\n print(f'Step {i+1}/5')\n time.sleep(1)", ], }); for await (const event of sandbox.followOutput(proc.pid)) { process.stdout.write(event.line); } ``` ```bash theme={null} # Follow stdout via SSE curl -N https://.sandbox.tensorlake.ai/api/v1/processes//stdout/follow \ -H "Authorization: Bearer $TL_API_KEY" # Follow combined output via SSE curl -N https://.sandbox.tensorlake.ai/api/v1/processes//output/follow \ -H "Authorization: Bearer $TL_API_KEY" ``` **SSE stream:** ``` event: output data: {"line":"Step 1/2","timestamp":1773950220162,"stream":"stdout"} event: output data: {"line":"Step 2/2","timestamp":1773950220162,"stream":"stdout"} event: eof data: {} ``` ## Learn More Manage background processes. Read, write, and copy files. Sandbox states, resources, and timeouts. # Computer Use Source: https://docs.tensorlake.ai/sandboxes/computer-use Launch ubuntu-vnc sandboxes and drive their desktop from Python or JavaScript. `tensorlake/ubuntu-vnc` is a managed desktop image for browser automation and computer-use agents. It boots XFCE, TigerVNC, and Firefox for you, and the SDK connects through the authenticated sandbox proxy so you can drive the desktop without manually exposing port `5901`. This guide builds on [Sandboxes](/sandboxes/introduction). If you already have a Tensorlake API key, you can create a desktop sandbox, capture screenshots, and send mouse and keyboard input in just a few lines. 
If you specifically want to automate **Chrome** rather than the desktop, see [Drive Chrome over CDP](/sandboxes/chrome-cdp) — it pairs `tensorlake/ubuntu-vnc` with a [tunnel](/sandboxes/tunnels) so Playwright, Puppeteer, or `chrome-devtools-mcp` can drive the in-sandbox browser as if it were running locally. ## Prerequisites ```bash theme={null} pip install tensorlake export TENSORLAKE_API_KEY=your-api-key ``` ```bash theme={null} npm install tensorlake export TENSORLAKE_API_KEY=your-api-key ``` ```bash theme={null} curl -fsSL https://tensorlake.ai/install | sh export TENSORLAKE_API_KEY=your-api-key ``` Prefer an interactive login? Run `tl login` instead of setting `TENSORLAKE_API_KEY`; it stores a Personal Access Token in `~/.config/tensorlake/credentials.toml`. The current managed `tensorlake/ubuntu-vnc` image uses `tensorlake` as its VNC password. ## Launch a Desktop Sandbox Use `tensorlake/ubuntu-vnc` when you want a full Linux desktop instead of a shell-only environment. ```python theme={null} from tensorlake.sandbox import Sandbox sandbox = Sandbox.create(image="tensorlake/ubuntu-vnc") print(sandbox.sandbox_id) ``` ```javascript theme={null} import { Sandbox } from "tensorlake"; const sandbox = await Sandbox.create({ image: "tensorlake/ubuntu-vnc", }); try { console.log(sandbox.sandboxId); } finally { await sandbox.terminate(); } ``` ```bash theme={null} tl sbx create -i tensorlake/ubuntu-vnc ``` `tl sbx create` prints the sandbox id on stdout. Reuse it with `tl sbx tunnel`, `tl sbx ssh`, `tl sbx exec`, and the rest of the `tl sbx ...` subcommands. List running sandboxes with `tl sbx ls` and terminate one with `tl sbx terminate `. You still get a normal `Sandbox` object back, so computer use fits naturally alongside `run()`, file operations, PTY sessions, snapshots, and tunnels. ## Capture Screenshots Once the sandbox is running, attach to the desktop and save a PNG. 
This is the easiest way to inspect the layout and discover click coordinates before sending pointer events. Fresh desktop sandboxes can take a few seconds to finish starting XFCE and other desktop services. If your first screenshot is blank or input does not land where you expect, wait briefly after connecting and then retry. ```python theme={null} import time from pathlib import Path from tensorlake.sandbox import Sandbox sandbox = Sandbox.create(image="tensorlake/ubuntu-vnc") with sandbox.connect_desktop(password="tensorlake") as desktop: time.sleep(4.0) screenshot = desktop.screenshot() Path("sandbox-desktop.png").write_bytes(screenshot) print(desktop.width, desktop.height) ``` ```javascript theme={null} import { writeFile } from "node:fs/promises"; import { Sandbox } from "tensorlake"; const sandbox = await Sandbox.create({ image: "tensorlake/ubuntu-vnc", }); try { const desktop = await sandbox.connectDesktop({ password: "tensorlake", }); try { await new Promise((resolve) => setTimeout(resolve, 4000)); const screenshot = await desktop.screenshot(); await writeFile("sandbox-desktop.png", screenshot); console.log(desktop.width, desktop.height); } finally { await desktop.close(); } } finally { await sandbox.terminate(); } ``` ## Send Keyboard and Mouse Input The desktop client supports keyboard shortcuts, typed input, clicks, double-clicks, mouse movement, and scrolling. The example below uses a reliable keyboard-driven flow: open a terminal, type a command, and then verify the result from the sandbox shell. ```python theme={null} import time from tensorlake.sandbox import Sandbox sandbox = Sandbox.create(image="tensorlake/ubuntu-vnc") with sandbox.connect_desktop(password="tensorlake") as desktop: # Give XFCE a moment to finish initializing the keybind daemon and # window manager. 
On a freshly-restored snapshot the in-VM `vncserver` # is up before XFCE has finished settling, so the very first # `Ctrl+Alt+T` can be lost if it lands before the keybind handler # registers. time.sleep(5.0) desktop.press(["ctrl", "alt", "t"]) time.sleep(4.0) desktop.type_text("echo docs-test > /tmp/desktop-test.txt") desktop.press("enter") time.sleep(3.0) # Mouse helpers are also available when you know the coordinates. desktop.move_mouse(640, 400) desktop.scroll_down() result = sandbox.run("bash", ["-lc", "cat /tmp/desktop-test.txt"]) print(result.stdout.strip()) # docs-test ``` ```javascript theme={null} import { Sandbox } from "tensorlake"; const sandbox = await Sandbox.create({ image: "tensorlake/ubuntu-vnc", }); try { const desktop = await sandbox.connectDesktop({ password: "tensorlake", }); try { // Give XFCE a moment to finish initializing the keybind daemon and // window manager. On a freshly-restored snapshot the in-VM `vncserver` // is up before XFCE has finished settling, so the very first // `Ctrl+Alt+T` can be lost if it lands before the keybind handler // registers. await new Promise((resolve) => setTimeout(resolve, 5000)); await desktop.press(["ctrl", "alt", "t"]); await new Promise((resolve) => setTimeout(resolve, 4000)); await desktop.typeText("echo docs-test > /tmp/desktop-test.txt"); await desktop.press("enter"); await new Promise((resolve) => setTimeout(resolve, 3000)); // Mouse helpers are also available when you know the coordinates. await desktop.moveMouse(640, 400); await desktop.scrollDown(); } finally { await desktop.close(); } const result = await sandbox.run("bash", { args: ["-lc", "cat /tmp/desktop-test.txt"], }); console.log(result.stdout.trim()); // docs-test } finally { await sandbox.terminate(); } ``` Coordinate-based actions are screen-relative. A common workflow is: 1. Take a screenshot. 2. Inspect the desktop layout and note the coordinates you care about. 3. 
Use `move_mouse()` / `moveMouse()`, `click()`, `double_click()` / `doubleClick()`, and `scroll()` with those coordinates.

## Reconnect to an Existing Sandbox

If a sandbox is already running, connect by sandbox ID and attach to the desktop without creating a new VM.

```python theme={null}
from pathlib import Path

from tensorlake.sandbox import Sandbox

sandbox_id = "your-running-sandbox-id"

with Sandbox.connect(sandbox_id) as sandbox:
    with sandbox.connect_desktop(password="tensorlake") as desktop:
        Path("existing-sandbox.png").write_bytes(desktop.screenshot())
```

```javascript theme={null}
import { writeFile } from "node:fs/promises";
import { Sandbox } from "tensorlake";

const sandbox = await Sandbox.connect({
  sandboxId: "your-running-sandbox-id",
});

try {
  const desktop = await sandbox.connectDesktop({
    password: "tensorlake",
  });

  try {
    const screenshot = await desktop.screenshot();
    await writeFile("existing-sandbox.png", screenshot);
  } finally {
    await desktop.close();
  }
} finally {
  sandbox.close();
}
```

When you connect to an existing sandbox, closing the client only drops the connection. It does not terminate the running VM.

## Connect with a VNC Client

If you want to drive the desktop from a real VNC viewer (Screen Sharing on macOS, TigerVNC, RealVNC, Remmina, etc.) rather than the SDK, open a TCP tunnel to the sandbox's VNC port and point your client at the local end. The tunnel handles sandbox-proxy authentication for you, so you do **not** need to expose `5901` publicly.

Open the tunnel with `tl sbx tunnel`. Replace `<sandbox-id>` with the id printed by `tl sbx create` (or `tl sbx ls`):

```bash theme={null}
tl sbx tunnel <sandbox-id> 5901 --listen-port 15901
```

Leave that command running. It forwards `127.0.0.1:15901` on your machine to port `5901` inside the sandbox over an authenticated WebSocket.
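If a script starts the tunnel and a viewer back to back, the viewer can race the tunnel's listener. A small readiness check avoids that. This is a sketch using only the standard library; `wait_for_port` is an illustrative helper, and the default port matches the `--listen-port 15901` value used above.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=15901, *, timeout=15.0, interval=0.25):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed immediately; only reachability matters here.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A successful check only shows that the local end of the tunnel is listening. The desktop inside the sandbox may still need a few seconds to finish starting.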
Then connect any VNC client to `localhost:15901` using the desktop password `tensorlake`: Use the built-in Screen Sharing client: ```bash theme={null} open vnc://localhost:15901 ``` Enter `tensorlake` when macOS prompts for the password. ```bash theme={null} vncviewer localhost:15901 ``` Most distributions ship `vncviewer` in the `tigervnc-viewer` package (`apt install tigervnc-viewer` on Debian/Ubuntu, `dnf install tigervnc` on Fedora). Any RFB-compatible viewer works — RealVNC Viewer, TightVNC, Remmina, KRDC, etc. Point it at `localhost:15901` and use `tensorlake` as the password. Stop the tunnel with `Ctrl+C` when you are done. Closing the tunnel does not terminate the sandbox; reopen it any time with the same command. ## Use noVNC in the Browser If you want a human to interact with the sandbox desktop in real time, use a real VNC client in the browser instead of polling screenshots. [`noVNC`](https://novnc.com/info.html) is a good fit here. The recommended architecture is: 1. Keep the Tensorlake API key on your backend. 2. Use the backend to open a TCP tunnel to the sandbox's VNC port `5901`. 3. Bridge that local tunnel to a browser WebSocket endpoint such as `/vnc/`. 4. Point `noVNC` at your backend WebSocket and authenticate with the desktop password `tensorlake`. This keeps sandbox proxy authentication server-side and gives the browser a low-latency live desktop stream. You do **not** need to expose port `5901` publicly yourself. If you are also running an agent loop, a good pattern is to use: * `noVNC` for the live human-facing desktop stream * `sandbox.connectDesktop()` for screenshots and high-level computer-use actions on the backend That separation avoids turning the browser view into a screenshot polling loop. 
### Browser Client with noVNC Install `noVNC` in your frontend: ```bash theme={null} npm install @novnc/novnc ``` Then connect the browser to your own WebSocket bridge: ```ts theme={null} import RFB from "@novnc/novnc/lib/rfb"; const host = document.getElementById("desktop"); if (!(host instanceof HTMLDivElement)) { throw new Error("Missing #desktop container"); } const protocol = window.location.protocol === "https:" ? "wss:" : "ws:"; const url = `${protocol}//${window.location.host}/vnc`; const rfb = new RFB(host, url, { credentials: { password: "tensorlake" }, shared: true, }); rfb.scaleViewport = true; rfb.clipViewport = false; rfb.showDotCursor = true; ``` Use a fixed-size container for the desktop surface: ```html theme={null}
``` ## Desktop API Surface Python uses `snake_case`, while JavaScript uses `camelCase`, but both SDKs expose the same core capabilities: * Screenshots: `screenshot()` * Mouse input: `move_mouse()` / `moveMouse()`, `mouse_press()` / `mousePress()`, `mouse_release()` / `mouseRelease()`, `click()`, `double_click()` / `doubleClick()`, `scroll()`, `scroll_up()` / `scrollUp()`, and `scroll_down()` / `scrollDown()` * Keyboard input: `key_down()` / `keyDown()`, `key_up()` / `keyUp()`, `press()`, and `type_text()` / `typeText()` * Desktop size: `width` and `height` `connect_desktop()` and `connectDesktop()` go through the authenticated sandbox proxy, so you do not need to bind or expose the VNC port yourself. For interactive debugging through a real VNC viewer, see [Connect with a VNC Client](#connect-with-a-vnc-client) above. ## Related Guides * [Drive Chrome over CDP](/sandboxes/chrome-cdp) — point Playwright, Puppeteer, or `chrome-devtools-mcp` at the Chrome that ships in `tensorlake/ubuntu-vnc`. * [Local Tunnels](/sandboxes/tunnels) — the tunneling primitive used by both the VNC viewer and Chrome CDP workflows. * [Snapshots](/sandboxes/snapshots) — fork warm desktops to parallelize agent runs without re-launching XFCE. # Data Analysis Source: https://docs.tensorlake.ai/sandboxes/data-analysis Perform parallel data analysis and model benchmarking in isolated sandboxes. Run parallel data analysis, model training, and benchmarking tasks in secure, isolated sandbox environments. Each sandbox can have its own dependencies and resource limits, allowing you to compare different models or process large datasets concurrently. This example demonstrates how to benchmark several `scikit-learn` classification models in parallel by running each in its own sandbox. ## TypeScript SDK starter The same benchmarking pattern works in Node.js: one model per sandbox, `Promise.all()` for fan-out, and JSON on stdout for aggregation. 
```typescript theme={null}
import { Sandbox } from "tensorlake";

async function runModelBenchmark(modelName: string, sklearnPath: string) {
  // Split "sklearn.svm.SVC" into module path and class name.
  const splitAt = sklearnPath.lastIndexOf(".");
  const modulePath = sklearnPath.slice(0, splitAt);
  const className = sklearnPath.slice(splitAt + 1);

  const code = `
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from ${modulePath} import ${className}
import json

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.3)
model = ${className}()
model.fit(X_train, y_train)
print(json.dumps({"model": "${modelName}", "accuracy": model.score(X_test, y_test)}))
`;

  // Internet access stays enabled so that pip can download packages.
  const sandbox = await Sandbox.create({ timeoutSecs: 900 });

  try {
    await sandbox.run("pip", {
      args: [
        "install",
        "numpy",
        "scikit-learn",
        "--user",
        "--break-system-packages",
      ],
    });

    const result = await sandbox.run("python", {
      args: ["-c", code],
    });

    return JSON.parse(result.stdout);
  } finally {
    await sandbox.terminate();
  }
}

const modelsToTest = {
  "Random Forest": "sklearn.ensemble.RandomForestClassifier",
  SVM: "sklearn.svm.SVC",
  "Logistic Regression": "sklearn.linear_model.LogisticRegression",
};

// Fan out: one sandbox per model, all running concurrently.
const results = await Promise.all(
  Object.entries(modelsToTest).map(([name, path]) =>
    runModelBenchmark(name, path),
  ),
);

console.table(results);
```

## Example: Parallel Model Benchmarking

The following script benchmarks five different `scikit-learn` models on the Iris dataset. Each model is trained and evaluated in a separate, concurrent sandbox.

```python theme={null}
import asyncio
import json

from dotenv import load_dotenv

load_dotenv()

from tensorlake.sandbox import Sandbox


async def run_model_benchmark(model_name, sklearn_path):
    """
    Runs a model benchmark inside an isolated sandbox.
    Returns a dict with model name and accuracy.
    """
    module_path, class_name = sklearn_path.rsplit('.', 1)

    code = f"""
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from {module_path} import {class_name}
import json

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.3)
model = {class_name}()
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
print(json.dumps({{"model": "{model_name}", "accuracy": score}}))
"""

    def _sync_benchmark():
        sandbox = Sandbox.create()
        print(f"🚀 Sandbox started for {model_name}...")

        # install scikit-learn and its dependencies in the sandbox
        sandbox.run("pip", ["install", "--user", "--break-system-packages", "numpy", "scikit-learn"])

        # run the code in the sandbox
        result = sandbox.run("python", ["-c", code])
        output_data = json.loads(result.stdout.strip())
        return output_data

    # The SDK calls are blocking, so run them in a worker thread.
    return await asyncio.to_thread(_sync_benchmark)


async def main():
    models_to_test: dict[str, str] = {
        "Random Forest": "sklearn.ensemble.RandomForestClassifier",
        "SVM": "sklearn.svm.SVC",
        "Logistic Regression": "sklearn.linear_model.LogisticRegression",
        "Decision Tree": "sklearn.tree.DecisionTreeClassifier",
        "KNN": "sklearn.neighbors.KNeighborsClassifier",
    }

    tasks = [run_model_benchmark(name, path) for name, path in models_to_test.items()]

    print("Gathering results from all sandboxes...\n")
    results = await asyncio.gather(*tasks)

    print("--- Benchmark Results ---")
    for r in results:
        print(f"{r['model']:<20}: {r['accuracy']:.4f}")


if __name__ == "__main__":
    asyncio.run(main())
```

## How It Works

The script orchestrates the parallel execution of model benchmarks using Python's `asyncio` library.

**1. Parallel Execution:** The `main` function defines a dictionary of models to test and builds one coroutine per model with a list comprehension. `asyncio.gather` runs all of them concurrently.

**2. Sandbox Task:** The `run_model_benchmark` function is responsible for a single benchmark.
For each model, it:

* Creates a new, isolated sandbox.
* Installs the necessary Python libraries (`numpy` and `scikit-learn`) inside the sandbox using `sandbox.run()`. The `--break-system-packages` flag overrides pip's PEP 668 "externally managed environment" check, which newer Python distributions enable by default.
* Executes a Python script that trains the model on the Iris dataset, calculates its accuracy, and prints the result as a JSON string to standard output.
* Captures the `stdout`, parses the JSON, and returns the result.

**3. Aggregate Results:** Once all sandboxes have completed their tasks, `asyncio.gather` returns a list of all the results, which are then printed to the console.

This example uses the `python-dotenv` library to load your Tensorlake API key from a `.env` file. Create a file named `.env` in your project root and add your key:

```
TENSORLAKE_API_KEY="your-api-key-here"
```

The SDK will automatically use this key.

## Pro Tips

### Faster Execution with Snapshots

The example installs dependencies every time a sandbox is created. This is simple but inefficient for repeated runs. To speed up your workflow, use **Snapshots**:

1. Create a "base" sandbox and install all your dependencies.
2. Create a snapshot of that sandbox.
3. Start new sandboxes from the snapshot ID.

The new sandboxes start with all the dependencies pre-installed, so each run skips the install step. Learn more in the [Snapshots guide](/sandboxes/snapshots).

## Learn More

* Install Tensorlake and create your first sandbox.
* Learn how to upload custom datasets and other files to your sandboxes.

# Run Docker
Source: https://docs.tensorlake.ai/sandboxes/docker

Run Docker containers inside Tensorlake sandboxes using the `tensorlake/ubuntu-systemd` base image, which provides full systemd support for compose, networking, and daemons.
### Create the sandbox

```bash theme={null}
tl sbx create my-docker-sandbox --image tensorlake/ubuntu-systemd --cpus 2.0 --memory 2048
```

```python theme={null}
from tensorlake.sandbox import Sandbox

sandbox = Sandbox.create(
    name="my-docker-sandbox",
    image="tensorlake/ubuntu-systemd",
    cpus=2.0,
    memory_mb=2048,
)
```

```typescript theme={null}
import { Sandbox } from "tensorlake";

const sandbox = await Sandbox.create({
  name: "my-docker-sandbox",
  image: "tensorlake/ubuntu-systemd",
  cpus: 2.0,
  memoryMb: 2048,
});
```

### Install Docker

Install Docker from [Docker's official apt repository for Ubuntu](https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository):

```bash theme={null}
tl sbx exec my-docker-sandbox bash -c '
set -e
apt-get update
apt-get install -y ca-certificates curl
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
chmod a+r /etc/apt/keyrings/docker.asc
. /etc/os-release && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu ${UBUNTU_CODENAME:-$VERSION_CODENAME} stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
'
```

```python theme={null}
script = """
set -e
apt-get update
apt-get install -y ca-certificates curl
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
chmod a+r /etc/apt/keyrings/docker.asc
. /etc/os-release && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu ${UBUNTU_CODENAME:-$VERSION_CODENAME} stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
"""

# Run the whole script in one shell so `set -e` stops at the first failure.
result = sandbox.run("bash", ["-c", script])
if result.exit_code != 0:
    raise RuntimeError(result.stderr)
```

```typescript theme={null}
const script = [
  "set -e",
  "apt-get update",
  "apt-get install -y ca-certificates curl",
  "install -m 0755 -d /etc/apt/keyrings",
  "curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc",
  "chmod a+r /etc/apt/keyrings/docker.asc",
  '. /etc/os-release && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu ${UBUNTU_CODENAME:-$VERSION_CODENAME} stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null',
  "apt-get update",
  "apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin",
].join("\n");

// Run the whole script in one shell so `set -e` stops at the first failure.
const result = await sandbox.run("bash", { args: ["-c", script] });
if (result.exitCode !== 0) throw new Error(result.stderr);
```

### Verify

```bash theme={null}
tl sbx exec my-docker-sandbox docker run hello-world
```

SSH into the sandbox to run Docker commands interactively:

```bash theme={null}
tl sbx ssh my-docker-sandbox
sudo docker run hello-world
```

# Environment Variables
Source: https://docs.tensorlake.ai/sandboxes/environment-variables

Set per-command and per-PTY environment variables in the CLI, Python, and TypeScript.

Use this page as a quick reference for environment variable scopes in Sandboxes. For endpoint-level and workflow details, also see [Execute Commands](./commands) and [PTY Sessions](./pty-sessions).
| Scope          | Use this when                                              | API surface                                                                              |
| -------------- | ---------------------------------------------------------- | ---------------------------------------------------------------------------------------- |
| Command-scoped | Variables should apply to one command only                 | `tl sbx exec --env KEY=VALUE ...`, `sandbox.run(..., env=...)`                           |
| PTY-scoped     | Variables should apply to one interactive PTY session only | `tl sbx ssh --env KEY=VALUE ...`, `create_pty(..., env=...)` / `createPty({ env: ... })` |

## Prerequisites

* You have the `tl` CLI installed and authenticated.
* You have a running sandbox connection in the CLI, Python, or TypeScript.
* You already set `TENSORLAKE_API_KEY` in your local environment.

## 1. Env vars for a single command in a sandbox

Use command-scoped env when values should not persist across all sandbox processes.

```bash theme={null}
# Set command-scoped env vars with repeated --env flags
tl sbx exec \
  --env MODE=prod \
  --env DEBUG=0 \
  bash -lc 'echo MODE=$MODE DEBUG=$DEBUG'
```

```bash theme={null}
# Same pattern when creating a one-shot sandbox with tl sbx run
tl sbx run \
  --env MODE=prod \
  --env DEBUG=0 \
  bash -lc 'echo MODE=$MODE DEBUG=$DEBUG'
```

```python theme={null}
from tensorlake.sandbox import Sandbox

sandbox = Sandbox.create()

result = sandbox.run(
    "bash",
    ["-lc", "echo MODE=$MODE DEBUG=$DEBUG"],
    env={"MODE": "prod", "DEBUG": "0"},
)
print(result.stdout)
```

```python theme={null}
# Advanced: set additional command-only variables.
result = sandbox.run(
    "bash",
    ["-lc", "echo APP_ENV=$APP_ENV TRACE_ID=$TRACE_ID"],
    env={"APP_ENV": "staging", "TRACE_ID": "req-123"},
)
```

```typescript theme={null}
const result = await sandbox.run("bash", {
  args: ["-lc", "echo MODE=$MODE DEBUG=$DEBUG"],
  env: { MODE: "prod", DEBUG: "0" },
});
console.log(result.stdout);
```

```typescript theme={null}
// Advanced: set additional command-only variables.
const overrideResult = await sandbox.run("bash", {
  args: ["-lc", "echo APP_ENV=$APP_ENV TRACE_ID=$TRACE_ID"],
  env: { APP_ENV: "staging", TRACE_ID: "req-123" },
});
```

## 2. Env vars when creating a PTY session

Use PTY-scoped env when you need custom variables inside one interactive terminal session.

```bash theme={null}
# Open an interactive PTY session
tl sbx ssh
```

```bash theme={null}
# Set custom PTY env vars
tl sbx ssh \
  --env APP_ENV=dev \
  --env TERM=screen-256color
```

```bash theme={null}
# Optional: custom shell, shell args, and working directory
tl sbx ssh \
  --shell /bin/zsh \
  --shell-arg -l \
  --workdir /workspace \
  --env APP_ENV=dev
```

`tl sbx ssh` always creates a PTY session with defaults like `TERM` and `COLORTERM=truecolor`; your `--env` values are merged in and can override those defaults.

```python theme={null}
from tensorlake.sandbox import Sandbox

sandbox = Sandbox.create()

pty = sandbox.create_pty(
    command="/bin/bash",
    args=["-l"],
    env={"TERM": "xterm-256color", "APP_ENV": "dev"},
    working_dir="/workspace",
    rows=24,
    cols=80,
)
pty.send_input("echo APP_ENV=$APP_ENV\nexit\n")
pty.wait()
```

```typescript theme={null}
const pty = await sandbox.createPty({
  command: "/bin/bash",
  args: ["-l"],
  env: { TERM: "xterm-256color", APP_ENV: "dev" },
  workingDir: "/workspace",
  rows: 24,
  cols: 80,
});
await pty.sendInput("echo APP_ENV=$APP_ENV\nexit\n");
await pty.wait();
```

## Choosing the right scope

* Use `run(..., env=...)` for one-off command values.
* Use PTY `env` for interactive terminal sessions.

# File Operations
Source: https://docs.tensorlake.ai/sandboxes/file-operations

Copy, read, write, and manage files inside Tensorlake sandboxes — transfer between your local machine and the sandbox filesystem over the proxy URL.

File operations use the sandbox proxy URL: `https://.sandbox.tensorlake.ai`

Named sandboxes can use the sandbox name in place of the ID in the proxy hostname.
The file APIs documented here run on the management URL on port `9501`, which always requires authentication. Unauthenticated proxy access applies only to exposed user ports.

## Copy Files

```bash theme={null}
# Copy a local file into the sandbox
tl sbx cp data.csv :/workspace/data.csv

# Copy a file from the sandbox to local
tl sbx cp :/workspace/data.csv ./data.csv
```

```python theme={null}
from tensorlake.sandbox import Sandbox

sandbox = Sandbox.create()

sandbox.write_file("/workspace/data.csv", b"name,score\nAlice,95\nBob,87")
content = sandbox.read_file("/workspace/data.csv")
print(bytes(content).decode("utf-8"))
```

```typescript theme={null}
await sandbox.writeFile(
  "/workspace/data.csv",
  new TextEncoder().encode("name,score\nAlice,95\nBob,87"),
);
const content = await sandbox.readFile("/workspace/data.csv");
console.log(new TextDecoder().decode(content));
```

```bash theme={null}
# $'...' makes the shell expand \n into real newlines before sending the body
curl -X PUT "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/data.csv" \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/octet-stream" \
  --data-binary $'name,score\nAlice,95\nBob,87'

curl "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/data.csv" \
  -H "Authorization: Bearer $TL_API_KEY"
```

`tl sbx cp` is file-only today. Directory copy workflows should use the Python SDK, TypeScript SDK, or the raw file API.
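
Because `tl sbx cp` copies one file at a time, a directory upload has to walk the local tree and write each file individually. The sketch below does that using only `sandbox.run("mkdir", ...)` and `sandbox.write_file(...)`, the two calls shown on this page. `upload_directory` is a hypothetical helper name, not part of the SDK, and the explicit `mkdir -p` assumes that `write_file` does not create missing parent directories.

```python
from pathlib import Path, PurePosixPath


def upload_directory(sandbox, local_dir: str, remote_dir: str) -> list[str]:
    """Copy every regular file under local_dir into remote_dir in the sandbox.

    Hypothetical helper: `sandbox` is any object exposing the `run` and
    `write_file` methods used elsewhere on this page.
    """
    local_root = Path(local_dir)
    remote_root = PurePosixPath(remote_dir)
    uploaded = []
    for path in sorted(local_root.rglob("*")):
        if not path.is_file():
            continue
        # Sandbox paths are POSIX paths regardless of the local OS.
        remote_path = remote_root / path.relative_to(local_root).as_posix()
        # Assumption: parent directories must exist before writing.
        sandbox.run("mkdir", ["-p", str(remote_path.parent)])
        sandbox.write_file(str(remote_path), path.read_bytes())
        uploaded.append(str(remote_path))
    return uploaded
```

A call such as `upload_directory(sandbox, "./data", "/workspace/data")` returns the list of sandbox paths it wrote, which is convenient for logging or for a later cleanup pass.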
## Read Files

```bash theme={null}
tl sbx cp :/workspace/data.csv ./data.csv
tl sbx exec cat /workspace/data.csv
```

```python theme={null}
content = sandbox.read_file("/workspace/data.csv")
print(bytes(content).decode("utf-8"))

image_bytes = sandbox.read_file("/workspace/chart.png")
with open("chart.png", "wb") as f:
    f.write(image_bytes)
```

```typescript theme={null}
const content = await sandbox.readFile("/workspace/data.csv");
console.log(new TextDecoder().decode(content));

const bytes = await sandbox.readFile("/workspace/chart.png");
console.log(`Read ${bytes.byteLength} bytes`);
```

```bash theme={null}
curl "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/data.csv" \
  -H "Authorization: Bearer $TL_API_KEY"

curl "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/chart.png" \
  -H "Authorization: Bearer $TL_API_KEY" \
  -o chart.png
```

## Write Files

```bash theme={null}
tl sbx cp config.json :/workspace/config.json
```

```python theme={null}
sandbox.write_file("/workspace/config.json", b'{"debug": true, "port": 8080}')

with open("model.pkl", "rb") as f:
    sandbox.write_file("/workspace/model.pkl", f.read())
```

```typescript theme={null}
import { readFile } from "node:fs/promises";

await sandbox.writeFile(
  "/workspace/config.json",
  new TextEncoder().encode('{"debug": true, "port": 8080}'),
);

const modelBytes = await readFile("model.pkl");
await sandbox.writeFile("/workspace/model.pkl", modelBytes);
```

```bash theme={null}
curl -X PUT "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/config.json" \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/octet-stream" \
  --data-binary '{"debug": true, "port": 8080}'

curl -X PUT "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/model.pkl" \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @model.pkl
```

## List Directory Contents

```bash theme={null}
tl sbx exec ls -la /workspace
```

```python theme={null}
entries = sandbox.list_directory("/workspace")
for entry in entries.entries:
    print(f"{entry.name} ({entry.size} bytes)")
```

```typescript theme={null}
const listing = await sandbox.listDirectory("/workspace");
for (const entry of listing.entries) {
  console.log(`${entry.name} (${entry.size ?? 0} bytes)`);
}
```

```bash theme={null}
curl "https://.sandbox.tensorlake.ai/api/v1/files/list?path=/workspace" \
  -H "Authorization: Bearer $TL_API_KEY"
```

## Delete Files

```bash theme={null}
tl sbx exec rm -rf /workspace/temp
```

```python theme={null}
sandbox.delete_file("/workspace/temp")
```

```typescript theme={null}
await sandbox.deleteFile("/workspace/temp");
```

```bash theme={null}
curl -X DELETE "https://.sandbox.tensorlake.ai/api/v1/files?path=/workspace/temp" \
  -H "Authorization: Bearer $TL_API_KEY"
```

## Organize Files

```bash theme={null}
tl sbx exec mkdir -p /workspace/src/components
tl sbx exec mv /workspace/old.txt /workspace/new.txt
```

```python theme={null}
sandbox.run("mkdir", ["-p", "/workspace/src/components"])
sandbox.run("mv", ["/workspace/old.txt", "/workspace/new.txt"])
```

```typescript theme={null}
await sandbox.run("mkdir", {
  args: ["-p", "/workspace/src/components"],
});
await sandbox.run("mv", {
  args: ["/workspace/old.txt", "/workspace/new.txt"],
});
```

Not supported in the HTTP API.

## Best Practices

* Use `/workspace` as the default directory for application files.
* Use absolute paths to avoid ambiguity.
* Use `write_file` / `read_file` for programmatic access.
* Use `tl sbx cp` for single-file transfers.
* Use the Python SDK, TypeScript SDK, or raw file API for directory-oriented workflows.

## Learn More

Execute commands in sandboxes.

Save and restore sandbox filesystem, memory, and running processes.

Sandbox states, resources, and timeouts.
# RL Training with GSPO
Source: https://docs.tensorlake.ai/sandboxes/gspo-agentic-rl

Fine-tune a language model on code generation tasks using Group Sequence Policy Optimization, with TensorLake sandboxes as the reward oracle.

Train a language model to write correct Python functions using reinforcement learning — without ever running untrusted model-generated code in your training process. This guide walks through a two-phase setup: a supervised fine-tuning warmup followed by GSPO fine-tuning, where every completion is evaluated inside an isolated TensorLake sandbox running a hidden pytest suite.

## How it works

1. **SFT warmup (Phase 1)**: Supervised pass on correct solutions so the model starts generating valid Python. Without this, all completions score 0, reward variance is 0, and the RL trainer has no gradient signal.
2. **GSPO fine-tuning (Phase 2)**: The `GRPOTrainer` (with `importance_sampling_level="sequence"`) generates *G* completions per step and dispatches them to *G* parallel sandboxes.
3. **Sandbox reward**: Each sandbox runs a hidden pytest suite against the model's code and returns `tests_passed / total` as the reward signal (0.0–1.0).
4. **Why sandboxes are required**: Model-generated code is untrusted. Running it in-process during training would be unsafe. Each completion is fully isolated.

### GSPO vs GRPO

Both algorithms use clipped importance sampling, but at different granularities:

| Algorithm | IS clipping                                     |
| :-------- | :---------------------------------------------- |
| **GRPO**  | `clip(π_θ(t) / π_old(t))` per token             |
| **GSPO**  | `clip(∏_t π_θ(t) / π_old(t))` once per sequence |

For long function bodies, token-level clipping lets noisy individual tokens dominate the gradient. Sequence-level clipping treats the entire trajectory as one unit, which is a better fit for code generation tasks.
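
To make the comparison concrete, the sketch below computes both clipped ratios from per-token log-probabilities in plain Python. It is illustrative only: the function names and the `epsilon=0.2` default are assumptions, the clipping is shown in isolation without the advantage term, and the sequence-level ratio is length-normalized (a geometric mean of the token ratios), which is how we understand the GSPO paper to define it, rather than the raw product written in the table.

```python
import math


def token_level_ratios(logp_new, logp_old, epsilon=0.2):
    """GRPO-style: one clipped importance ratio per token."""
    return [
        min(max(math.exp(n - o), 1 - epsilon), 1 + epsilon)
        for n, o in zip(logp_new, logp_old)
    ]


def sequence_level_ratio(logp_new, logp_old, epsilon=0.2):
    """GSPO-style: one clipped ratio for the whole sequence.

    Assumption: the ratio is length-normalized, i.e. exp(mean(log-ratio)).
    """
    mean_log_ratio = sum(n - o for n, o in zip(logp_new, logp_old)) / len(logp_new)
    return min(max(math.exp(mean_log_ratio), 1 - epsilon), 1 + epsilon)


# One noisy token (the third) among otherwise unchanged tokens.
logp_old = [-1.0, -1.0, -1.0, -1.0]
logp_new = [-1.0, -1.0, -0.6, -1.0]

print(token_level_ratios(logp_new, logp_old))    # the third entry hits the 1 + epsilon clip
print(sequence_level_ratio(logp_new, logp_old))  # the outlier is averaged over the sequence first
```

In this toy input the single outlier token is clipped on its own in the token-level view, while the sequence-level ratio stays inside the clip range because the outlier is averaged with the unchanged tokens.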
***

## Prerequisites

```bash theme={null}
pip install tensorlake transformers trl datasets torch rich python-dotenv
```

Create a `.env` file in your project root with your Tensorlake API key:

```
TENSORLAKE_API_KEY="your-api-key-here"
```

***

## TypeScript SDK starter

In Node.js, the critical part is still the reward oracle: each completion gets written into its own sandbox, the hidden pytest suite runs there, and the pass ratio becomes the reward.

```typescript theme={null}
import { Sandbox } from "tensorlake";

const encoder = new TextEncoder();

// Score one completion: fraction of hidden tests that pass (0.0 to 1.0).
async function scoreCompletion(
  solutionSource: string,
  hiddenTests: string,
): Promise<number> {
  const sandbox = await Sandbox.create({
    cpus: 1.0,
    memoryMb: 1024,
    timeoutSecs: 300,
    allowInternetAccess: false,
  });

  try {
    await sandbox.writeFile("/workspace/solution.py", encoder.encode(solutionSource));
    await sandbox.writeFile("/workspace/test_hidden.py", encoder.encode(hiddenTests));

    await sandbox.run("python", {
      args: ["-m", "pip", "install", "pytest", "--user", "--break-system-packages"],
    });

    const result = await sandbox.run("python", {
      args: ["-m", "pytest", "-q", "/workspace/test_hidden.py"],
      workingDir: "/workspace",
      timeout: 300,
    });

    // Parse the counts out of pytest's summary line.
    const passed = Number(result.stdout.match(/(\d+) passed/)?.[1] ?? 0);
    const failed = Number(result.stdout.match(/(\d+) failed/)?.[1] ?? 0);
    return passed / Math.max(1, passed + failed);
  } finally {
    // Always release the sandbox, even if a step above throws.
    await sandbox.terminate();
  }
}

const hiddenTests = `
from solution import sum_list

def test_basic():
    assert sum_list([1, 2, 3]) == 6
`;

const completions = [
  "def sum_list(nums):\n    return sum(nums)",
  "def sum_list(nums):\n    return 0",
];

// One sandbox per completion, scored in parallel.
const rewards = await Promise.all(
  completions.map((completion) => scoreCompletion(completion, hiddenTests)),
);

console.log(rewards);
```

That reward function plugs into the same GSPO loop described below.
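
The same scoring rule can be written in Python. This sketch mirrors the regex logic of `scoreCompletion`: it pulls the `passed` and `failed` counts out of pytest's summary text and returns the pass ratio. `reward_from_pytest_output` is a hypothetical helper name used only for illustration; collection errors (for example a syntax error in `solution.py`) produce neither count and therefore score 0.0.

```python
import re


def reward_from_pytest_output(stdout: str) -> float:
    """Return passed / (passed + failed) from pytest's summary text."""
    passed = re.search(r"(\d+) passed", stdout)
    failed = re.search(r"(\d+) failed", stdout)
    n_passed = int(passed.group(1)) if passed else 0
    n_failed = int(failed.group(1)) if failed else 0
    # max(1, ...) avoids division by zero when no tests were collected.
    return n_passed / max(1, n_passed + n_failed)


print(reward_from_pytest_output("3 passed, 1 failed in 0.12s"))  # 0.75
print(reward_from_pytest_output("1 error in 0.05s"))             # 0.0
```

Keeping the parsing in a pure function like this makes the reward rule easy to unit-test without starting a sandbox.
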
The model/trainer side can stay in Python, but the sandbox evaluation path can be moved to TypeScript if your orchestration layer already lives there.

***

## Full example

The script below runs end-to-end: baseline evaluation → SFT warmup → GSPO fine-tuning → final evaluation. Pass `--smoke` for a fast 5-minute CPU run (3 tasks, 20 SFT steps, 1 GSPO epoch).

````python theme={null}
"""
RL GSPO Reasoner — Code Generation with Hidden Test Suites
===========================================================
Algorithm : GSPO — Group Sequence Policy Optimization (Zheng et al., 2507.18071)
            GRPOConfig(importance_sampling_level="sequence")

Why sandboxes are non-negotiable here
--------------------------------------
The model generates arbitrary Python function bodies. Running untrusted
model-generated code in the training process directly would be unsafe.
Each completion is executed inside an isolated TensorLake sandbox.
The sandbox runs a hidden pytest suite and returns tests_passed/total as reward.

Training strategy
-----------------
Phase 1 — SFT warmup (N steps):
  Supervised pass on correct solutions so the model outputs valid Python.
  Without this, all G completions score 0 → reward_std=0 → no gradient.

Phase 2 — GSPO fine-tuning:
  GRPOTrainer with sequence-level IS. The reward function dispatches G
  parallel sandboxes per step and prints every completion that scores > 0.

Smoke : --smoke → 3 functions, 20 SFT steps, 1 GSPO epoch (~5 min CPU)
Full  : 10 functions, 60 SFT steps, 3 GSPO epochs (~30 min CPU)
"""

from dotenv import load_dotenv
load_dotenv()

import re
import sys
import textwrap
import torch
from concurrent.futures import ThreadPoolExecutor, as_completed
from datasets import Dataset
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import GRPOTrainer, GRPOConfig
from tensorlake.sandbox import Sandbox
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from rich.rule import Rule
from rich import box
from typing import List

console = Console()

MODEL_NAME = "HuggingFaceTB/SmolLM2-135M-Instruct"
OUTPUT_DIR = "./gspo_coder"
SMOKE = "--smoke" in sys.argv

# ─── Dataset ──────────────────────────────────────────────────────────────────

TASKS = [
    dict(
        name="sum_list",
        prompt=(
            "Write a Python function:\n\n"
            "def sum_list(nums: list) -> int:\n"
            '    """Return the sum of all integers in nums."""'
        ),
        tests=textwrap.dedent("""\
            from solution import sum_list
            def test_empty(): assert sum_list([]) == 0
            def test_single(): assert sum_list([5]) == 5
            def test_mixed(): assert sum_list([1, 2, 3]) == 6
            def test_neg(): assert sum_list([-1, -2, 3]) == 0
        """),
        solution="def sum_list(nums: list) -> int:\n    return sum(nums)",
    ),
    dict(
        name="is_palindrome",
        prompt=(
            "Write a Python function:\n\n"
            "def is_palindrome(s: str) -> bool:\n"
            '    """Return True if s reads the same forwards and backwards."""'
        ),
        tests=textwrap.dedent("""\
            from solution import is_palindrome
            def test_yes(): assert is_palindrome("racecar") is True
            def test_no(): assert is_palindrome("hello") is False
            def test_empty(): assert is_palindrome("") is True
            def test_single(): assert is_palindrome("a") is True
        """),
        solution="def is_palindrome(s: str) -> bool:\n    return s == s[::-1]",
    ),
    dict(
        name="fizzbuzz",
        prompt=(
            "Write a Python function:\n\n"
            "def fizzbuzz(n: int) -> list:\n"
            '    """Return a list 1..n: "Fizz" div by 3, "Buzz" div by 5,\n'
            '    "FizzBuzz" both, else the number as a string."""'
        ),
        tests=textwrap.dedent("""\
            from solution import fizzbuzz
            def test_basic():
                r = fizzbuzz(15)
                assert r[2] == "Fizz"
                assert r[4] == "Buzz"
                assert r[14] == "FizzBuzz"
                assert r[0] == "1"
            def test_length(): assert len(fizzbuzz(5)) == 5
        """),
        solution=(
            'def fizzbuzz(n: int) -> list:\n'
            '    out = []\n'
            '    for i in range(1, n + 1):\n'
            '        if i % 15 == 0: out.append("FizzBuzz")\n'
            '        elif i % 3 == 0: out.append("Fizz")\n'
            '        elif i % 5 == 0: out.append("Buzz")\n'
            '        else: out.append(str(i))\n'
            '    return out'
        ),
    ),
    dict(
        name="count_vowels",
        prompt=(
            "Write a Python function:\n\n"
            "def count_vowels(s: str) -> int:\n"
            '    """Return the number of vowels (a,e,i,o,u, case-insensitive) in s."""'
        ),
        tests=textwrap.dedent("""\
            from solution import count_vowels
            def test_basic(): assert count_vowels("hello") == 2
            def test_upper(): assert count_vowels("AEIOU") == 5
            def test_none(): assert count_vowels("bcdf") == 0
            def test_empty(): assert count_vowels("") == 0
        """),
        solution=(
            "def count_vowels(s: str) -> int:\n"
            "    return sum(1 for c in s.lower() if c in 'aeiou')"
        ),
    ),
    dict(
        name="flatten",
        prompt=(
            "Write a Python function:\n\n"
            "def flatten(lst: list) -> list:\n"
            '    """Flatten one level of nesting: [[1,2],[3]] -> [1,2,3]."""'
        ),
        tests=textwrap.dedent("""\
            from solution import flatten
            def test_basic(): assert flatten([[1,2],[3,4]]) == [1,2,3,4]
            def test_empty(): assert flatten([]) == []
            def test_single(): assert flatten([[1]]) == [1]
            def test_mixed(): assert flatten([[1,2],[]]) == [1,2]
        """),
        solution=(
            "def flatten(lst: list) -> list:\n"
            "    return [x for sub in lst for x in sub]"
        ),
    ),
    dict(
        name="max_consecutive",
        prompt=(
            "Write a Python function:\n\n"
            "def max_consecutive(nums: list) -> int:\n"
            '    """Return the length of the longest run of equal consecutive elements."""'
        ),
        tests=textwrap.dedent("""\
            from solution import max_consecutive
            def test_basic(): assert max_consecutive([1,1,2,2,2,3]) == 3
            def test_single(): assert max_consecutive([5]) == 1
            def test_empty(): assert max_consecutive([]) == 0
            def test_all(): assert max_consecutive([7,7,7]) == 3
        """),
        solution=(
            "def max_consecutive(nums: list) -> int:\n"
            "    if not nums: return 0\n"
            "    best = cur = 1\n"
            "    for a, b in zip(nums, nums[1:]):\n"
            "        cur = cur + 1 if a == b else 1\n"
            "        best = max(best, cur)\n"
            "    return best"
        ),
    ),
    dict(
        name="second_largest",
        prompt=(
            "Write a Python function:\n\n"
            "def second_largest(nums: list) -> int | None:\n"
            '    """Return the second largest unique value, or None if fewer than 2 unique values."""'
        ),
        tests=textwrap.dedent("""\
            from solution import second_largest
            def test_basic(): assert second_largest([3,1,4,1,5]) == 4
            def test_two(): assert second_largest([2,1]) == 1
            def test_dupes(): assert second_largest([1,1,1]) is None
            def test_empty(): assert second_largest([]) is None
        """),
        solution=(
            "def second_largest(nums: list):\n"
            "    u = sorted(set(nums), reverse=True)\n"
            "    return u[1] if len(u) >= 2 else None"
        ),
    ),
    dict(
        name="run_length_encode",
        prompt=(
            "Write a Python function:\n\n"
            "def run_length_encode(s: str) -> str:\n"
            '    """Run-length encode s: "aaabbc" -> "a3b2c1"."""'
        ),
        tests=textwrap.dedent("""\
            from solution import run_length_encode
            def test_basic(): assert run_length_encode("aaabbc") == "a3b2c1"
            def test_single(): assert run_length_encode("a") == "a1"
            def test_empty(): assert run_length_encode("") == ""
            def test_mixed(): assert run_length_encode("abcd") == "a1b1c1d1"
        """),
        solution=(
            "def run_length_encode(s: str) -> str:\n"
            "    if not s: return ''\n"
            "    out, cur, n = [], s[0], 1\n"
            "    for c in s[1:]:\n"
            "        if c == cur: n += 1\n"
            "        else: out.append(f'{cur}{n}'); cur, n = c, 1\n"
            "    out.append(f'{cur}{n}')\n"
            "    return ''.join(out)"
        ),
    ),
    dict(
        name="rotate_list",
        prompt=(
            "Write a Python function:\n\n"
            "def rotate_list(lst: list, k: int) -> list:\n"
            '    """Return lst rotated right by k positions."""'
        ),
        tests=textwrap.dedent("""\
            from solution import rotate_list
            def test_basic(): assert rotate_list([1,2,3,4,5], 2) == [4,5,1,2,3]
            def test_zero(): assert rotate_list([1,2,3], 0) == [1,2,3]
            def test_empty(): assert rotate_list([], 3) == []
            def test_full(): assert rotate_list([1,2,3], 3) == [1,2,3]
        """),
        solution=(
            "def rotate_list(lst: list, k: int) -> list:\n"
            "    if not lst: return []\n"
            "    k = k % len(lst)\n"
            "    return lst[-k:] + lst[:-k] if k else lst[:]"
        ),
    ),
    dict(
        name="word_frequency",
        prompt=(
            "Write a Python function:\n\n"
            "def word_frequency(text: str) -> dict:\n"
            '    """Return word -> count (case-insensitive, split on whitespace)."""'
        ),
        tests=textwrap.dedent("""\
            from solution import word_frequency
            def test_basic(): assert word_frequency("the cat sat") == {"the":1,"cat":1,"sat":1}
            def test_repeat(): assert word_frequency("a a b") == {"a":2,"b":1}
            def test_case(): assert word_frequency("A a") == {"a":2}
            def test_empty(): assert word_frequency("") == {}
        """),
        solution=(
            "def word_frequency(text: str) -> dict:\n"
            "    d = {}\n"
            "    for w in text.lower().split():\n"
            "        d[w] = d.get(w, 0) + 1\n"
            "    return d"
        ),
    ),
]

SYSTEM_PROMPT = (
    "You are a Python coding assistant. "
    "Write ONLY the function — no imports, no test code, no explanation. "
    "Output raw Python starting with `def`."
)

# ─── Dataset helpers ──────────────────────────────────────────────────────────

def build_dataset(tasks: list) -> Dataset:
    return Dataset.from_dict({
        "prompt": [
            [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": t["prompt"]},
            ]
            for t in tasks
        ],
        "tests": [t["tests"] for t in tasks],
    })


def _extract_code(text) -> str:
    if isinstance(text, list):
        text = text[0]["content"] if text else ""
    text = text or ""
    m = re.search(r"```(?:python)?\s*(.*?)```", text, re.DOTALL)
    return (m.group(1) if m else text).strip()


# ─── Sandbox reward ───────────────────────────────────────────────────────────

_HARNESS = """\
import sys, os, subprocess, re
sys.path.insert(0, "/tmp/pkgs")
if not os.path.isdir("/tmp/pkgs"):
    subprocess.run(
        ["python3", "-m", "pip", "install", "pytest", "-q", "--target", "/tmp/pkgs"],
        capture_output=True, check=False,
    )
    sys.path.insert(0, "/tmp/pkgs")
os.makedirs("/tmp/sol", exist_ok=True)
open("/tmp/sol/solution.py", "w").write({code!r})
open("/tmp/sol/test_sol.py", "w").write({tests!r})
r = subprocess.run(
    ["python3", "-m", "pytest", "/tmp/sol/test_sol.py",
     "--tb=no", "-q", "--import-mode=importlib"],
    capture_output=True, text=True,
    env={{**os.environ, "PYTHONPATH": "/tmp/pkgs:/tmp/sol"}},
)
p = int((re.search(r"(\\d+) passed", r.stdout) or [0,0])[1])
f = int((re.search(r"(\\d+) failed", r.stdout) or [0,0])[1])
t = p + f
print(f"{{p}}/{{t}}")
"""


def _run_sandbox(code: str, tests: str) -> float:
    harness = _HARNESS.format(code=code, tests=tests)
    try:
        box = Sandbox.create(memory_mb=2048)
        ex = box.run("python3", ["-c", harness])
        last = (ex.stdout or "").strip().splitlines()
        last = last[-1] if last else "0/0"
        p, t = (int(x) for x in last.split("/"))
        return p / t if t > 0 else 0.0
    except Exception:
        return 0.0


# ─── Reward function — logs best completion of every batch ───────────────────

_reward_log: List[dict] = []   # accumulates {code, score, step} across training
_step = [0]                    # mutable counter (closure-friendly)


def reward_sandbox(completions, tests: List[str], **kwargs) -> List[float]:
    """
    Reward = fraction of hidden pytest tests that pass (0.0–1.0).
    G completions are dispatched to G parallel sandboxes.
    Every batch whose best score > 0 is printed immediately.
    """
    codes = [_extract_code(c) for c in completions]
    _step[0] += 1

    with ThreadPoolExecutor(max_workers=len(codes)) as pool:
        futures = {pool.submit(_run_sandbox, code, test): i
                   for i, (code, test) in enumerate(zip(codes, tests))}
        scores = [0.0] * len(codes)
        for fut in as_completed(futures):
            i = futures[fut]
            scores[i] = fut.result()
            _reward_log.append({"step": _step[0], "code": codes[i], "score": scores[i]})

    best_i = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best_i] > 0:
        console.print(
            f"\n [bold green]↑ step {_step[0]} reward={scores[best_i]:.0%}"
            f" ({int(scores[best_i]*4)}/4 tests)[/bold green]"
        )
        console.print(Panel(
            codes[best_i],
            title=f"[bold green]Best completion — step {_step[0]}[/bold green]",
            border_style="green",
        ))

    return scores


def print_top_completions(n: int = 3):
    nonzero = [e for e in _reward_log if e["score"] > 0]
    if not nonzero:
        console.print("[yellow]No non-zero rewards recorded during training.[/yellow]")
        return
    top = sorted(nonzero, key=lambda e: e["score"], reverse=True)[:n]
    console.print(Rule(f"[bold green]Top {len(top)} completions by reward[/bold green]", style="green"))
    for rank, entry in enumerate(top, 1):
        color = "green" if entry["score"] >= 0.75 else "yellow"
        console.print(Panel(
            entry["code"],
            title=f"[bold]#{rank} reward={entry['score']:.0%} step={entry['step']}[/bold]",
            border_style=color,
        ))


# ─── Phase 1: SFT warmup ──────────────────────────────────────────────────────

def sft_warmup(model, tokenizer, tasks: list, steps: int = 30):
    """
    Brief supervised pass on correct solutions.
    Teaches the model to emit valid Python before GSPO takes over.
    Without this, reward_std=0 every step and GSPO has no gradient signal.
    """
    console.print(Rule("[magenta]Phase 1 — SFT warmup[/magenta]", style="magenta"))
    console.print(
        f"[dim]{steps} gradient steps on correct solutions "
        f"({len(tasks)} tasks, cycling). Goal: non-zero reward_std in Phase 2.[/dim]\n"
    )

    optimizer = AdamW(model.parameters(), lr=2e-5)
    model.train()

    texts = []
    for task in tasks:
        messages = [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": task["prompt"]},
            {"role": "assistant", "content": task["solution"]},
        ]
        texts.append(tokenizer.apply_chat_template(messages, tokenize=False))

    for step in range(1, steps + 1):
        text = texts[(step - 1) % len(texts)]
        enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        labels = enc["input_ids"].clone()
        outputs = model(**enc, labels=labels)
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        if step % max(1, steps // 5) == 0 or step == steps:
            console.print(f" SFT step {step:3d}/{steps} loss={outputs.loss.item():.4f}")

    del optimizer
    console.print("[dim]SFT warmup done.\n[/dim]")


# ─── Evaluation ───────────────────────────────────────────────────────────────

def evaluate(model, tokenizer, tasks: list):
    model.eval()
    device = next(model.parameters()).device

    t = Table(box=box.SIMPLE, show_header=True, header_style="bold white")
    t.add_column("Function", width=20)
    t.add_column("Tests", width=7, justify="right")
    t.add_column("Generated code (first 55 chars)", width=57)
    t.add_column("", width=5)

    total = 0.0
    for task in tasks:
        messages = [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": task["prompt"]},
        ]
        text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
        enc = tokenizer(text, return_tensors="pt")
        input_ids = enc["input_ids"].to(device)
        attention_mask = enc["attention_mask"].to(device)
        with torch.no_grad():
            out = model.generate(
                input_ids,
                attention_mask=attention_mask,
                max_new_tokens=160,
                do_sample=False,
                pad_token_id=tokenizer.eos_token_id,
            )
        response = tokenizer.decode(out[0][input_ids.shape[1]:], skip_special_tokens=True)
        code = _extract_code(response)
        score = _run_sandbox(code, task["tests"])
        total += score

        bar = "█" * int(score * 5) + "░" * (5 - int(score * 5))
        color = "green" if score == 1.0 else "yellow" if score > 0 else "red"
        t.add_row(
            task["name"],
            f"[{color}]{score:.0%}[/{color}]",
            code.replace("\n", "↵ ")[:55],
            f"[{color}]{bar}[/{color}]",
        )

    console.print(t)
    avg = total / len(tasks)
    console.print(f" Average test pass rate: [bold cyan]{avg:.1%}[/bold cyan]")
    return avg


# ─── Main ─────────────────────────────────────────────────────────────────────

def train_gspo():
    tasks = TASKS[:3] if SMOKE else TASKS
    sft_steps = 20 if SMOKE else 60
    gspo_epochs = 1 if SMOKE else 3

    console.print(Panel(
        "[bold green]RL GSPO — Code Generation with Hidden Test Suites[/bold green]\n\n"
        "[dim]Algorithm : GSPO (sequence-level IS) — GRPOConfig(importance_sampling_level='sequence')\n"
        "Model : SmolLM2-135M-Instruct (135 M params, CPU-friendly)\n"
        "Task : Implement Python functions from docstrings\n"
        "Reward : fraction of hidden pytest tests passing (sandbox oracle)\n"
        "Sandboxes : G parallel TensorLake sandboxes per GSPO step\n"
        "Phase 1 : SFT warmup — correct solutions so model starts generating valid Python\n"
        "Phase 2 : GSPO — refines via reward signal from sandbox test results\n"
        "GPU needed : No\n"
        f"Mode : {'SMOKE (3 tasks, 20 SFT steps, 1 GSPO epoch)' if SMOKE else f'Full ({len(tasks)} tasks, {sft_steps} SFT steps, {gspo_epochs} GSPO epochs)'}[/dim]",
        border_style="green",
    ))

    console.print("\n[dim]Loading SmolLM2-135M-Instruct...[/dim]")
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, dtype=torch.float32)

    split = max(1, int(0.75 * len(tasks)))
    train_tasks = tasks[:split]
    eval_tasks = tasks[split:]
    console.print(f"[dim]{split} train tasks / {len(eval_tasks)} eval tasks[/dim]\n")

    # ── Baseline ────────────────────────────────────────────────────────────
    console.print(Rule("[cyan]Baseline — before any training[/cyan]", style="cyan"))
    evaluate(model, tokenizer, eval_tasks)

    # ── Phase 1: SFT warmup ─────────────────────────────────────────────────
    sft_warmup(model, tokenizer, train_tasks, steps=sft_steps)
    console.print(Rule("[cyan]After SFT warmup[/cyan]", style="cyan"))
    evaluate(model, tokenizer, eval_tasks)

    # ── Phase 2: GSPO ────────────────────────────────────────────────────────
    console.print(Rule("[yellow]Phase 2 — GSPO fine-tuning[/yellow]", style="yellow"))
    console.print(
        "[dim]Best completions printed live as reward > 0 is observed.\n"
        "reward_std > 0 confirms the policy is exploring.[/dim]\n"
    )

    config = GRPOConfig(
        output_dir=OUTPUT_DIR,
        importance_sampling_level="sequence",  # ← GSPO vs GRPO
        num_generations=2 if SMOKE else 4,
        max_completion_length=200,
        temperature=1.4,  # high temp forces diverse G completions → reward_std > 0
        learning_rate=2e-6,
        num_train_epochs=gspo_epochs,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=2 if SMOKE else 4,
        warmup_steps=5,
        beta=0.001,
        epsilon=0.2,
        logging_steps=1,
        save_steps=999,
        seed=42,
        report_to="none",
        bf16=False,
        fp16=False,
    )

    trainer = GRPOTrainer(
        model=model,
        args=config,
        train_dataset=build_dataset(train_tasks),
        reward_funcs=[reward_sandbox],
        processing_class=tokenizer,
    )
    trainer.train()

    # ── Results ──────────────────────────────────────────────────────────────
    print_top_completions(n=3)

    console.print(Rule("[cyan]After GSPO training[/cyan]", style="cyan"))
    final_acc = evaluate(model, tokenizer, eval_tasks)

    console.print(Panel(
        f"[bold]Result: {final_acc:.0%} average test pass rate on held-out functions[/bold]\n\n"
        "Context:\n"
        " • Eval functions were [bold]never seen[/bold] during SFT or GSPO training\n"
        " • Baseline before any training: [red]0%[/red]\n"
        f" • After GSPO: [bold green]{final_acc:.0%}[/bold green]"
        " ← model generalised from 7 training functions to unseen ones\n\n"
        "Why 25 % is a reasonable outcome for this setup:\n"
        " • 135 M params is the [italic]smallest[/italic] publicly available instruct model\n"
        " • Only 60 SFT steps on 7 reference solutions (~5 min CPU)\n"
        " • 25 % means 1 / 4 tests pass per function — the model correctly\n"
        " handles the empty-input edge case on all three unseen functions,\n"
        " showing the pattern [italic]transferred[/italic] across task types\n"
        " • Typical zero-shot pass@1 for 135 M models on HumanEval is < 5 %\n\n"
        "Cheap ways to push higher (no extra hardware):\n"
        " 1. [cyan]temperature=1.4[/cyan] (already set) — forces reward_std > 0 so GSPO\n"
        " has a gradient signal instead of collapsing to all-zero advantages\n"
        " 2. More SFT examples (50+ functions, ~10 min) before GSPO\n"
        " 3. Switch to [cyan]Qwen2.5-0.5B-Instruct[/cyan] (4× more params, same CPU time)",
        title="[bold cyan]Score interpretation[/bold cyan]",
        border_style="cyan",
    ))

    console.print(Panel(
        "[bold]Why GSPO + sandbox here?[/bold]\n\n"
        "1. [cyan]Sandboxes required[/cyan]: model code is untrusted — cannot run in-process.\n\n"
        "2. [cyan]Hidden test suites[/cyan]: the model never sees the tests.\n"
        " Sandbox is the only oracle → no reward hacking.\n\n"
        "3. [cyan]GSPO over GRPO[/cyan]: long function bodies mean many tokens.\n"
        " Token-level IS clipping (GRPO) lets noisy tokens dominate the gradient.\n"
        " Sequence-level clipping (GSPO) clips the whole trajectory once:\n\n"
        " GRPO: clip( π_θ(t)/π_old(t) ) per token\n"
        " GSPO: clip( Π_t π_θ(t)/π_old(t) ) once per sequence",
        title="[bold cyan]Design rationale[/bold cyan]",
        border_style="cyan",
    ))


if __name__ == "__main__":
    train_gspo()
````

***

## What happens step-by-step

| Step  | Phase         | What happens |
| :---- | :------------ | :----------- |
| **1** | Setup         | Model and tokenizer loaded. Tasks split 75/25 into train and eval sets. |
| | **2** | Baseline | Eval tasks run through the untrained model and scored via sandbox. Typically \~0%. | | **3** | SFT warmup | N supervised gradient steps on correct reference solutions. Ensures the model produces parseable Python before RL begins. | | **4** | After SFT | Eval re-run. Reward variance should now be non-zero — a prerequisite for GSPO to have a gradient signal. | | **5** | GSPO loop | For each training step, *G* completions are generated and dispatched to *G* parallel sandboxes. Each sandbox runs the hidden pytest suite and returns a score. | | **6** | Reward signal | `reward_sandbox` collects scores, logs the best completion, and returns the score list to `GRPOTrainer`. | | **7** | Final eval | Held-out functions (never seen during training) are evaluated. A 25% pass rate on a 135M parameter model is the expected outcome. | *** ## Key design decisions ### Why `temperature=1.4` GSPO requires diversity across the *G* completions in each group to produce a non-zero reward standard deviation. If all completions are identical (low temperature), `reward_std = 0` and the advantage normalization produces zero gradients — training stalls. Setting temperature high forces the model to explore different implementations. ### Why SFT warmup is required Without warmup, a randomly-initialized or instruction-tuned model produces malformed Python that scores 0 on every test case. All-zero rewards mean all-zero advantages after normalization, and GSPO has nothing to optimize. Even 20 supervised steps on correct solutions is enough to bootstrap non-zero reward variance. ### Why sandboxes prevent reward hacking The model never has access to the test file. The only feedback is the pass rate returned by the sandbox. This makes it impossible for the model to overfit to specific assertion patterns — it must actually implement the correct logic. *** This example uses `python-dotenv` to load your Tensorlake API key. 
Create a `.env` file in your project root:

```
TENSORLAKE_API_KEY="your-api-key-here"
```

The SDK will pick it up automatically.

## What to build next

* Use a sandbox as a tool inside an agentic LLM loop.
* Dispatch parallel sandboxes across a swarm of worker agents.

# Harbor

Source: https://docs.tensorlake.ai/sandboxes/harbor

Run Harbor evaluations and RL rollouts on Tensorlake Sandboxes — fresh isolation per trial, pre-warmed snapshots for expensive environments, and independent test verification.

[Harbor](https://github.com/harbor-framework/harbor) is a framework from the creators of [Terminal-Bench](https://www.tbench.ai/) for evaluating and optimizing agents and language models. With Harbor you can evaluate arbitrary agents (Claude Code, OpenHands, Codex CLI, and others) against curated datasets like Terminal-Bench, SWE-Bench, and Aider Polyglot, build and share your own benchmarks, run thousands of trials in parallel across cloud providers, and generate rollouts for RL optimization.

Harbor abstracts the execution backend behind an `--env` flag. Tensorlake plugs in as one of those providers — alongside other sandboxes and local Docker — so the same Harbor commands run on Tensorlake sandboxes without changing your tasks, agents, or evaluators.

This guide focuses on running CLI-agent evaluations against benchmarks like Terminal-Bench. Harbor also supports generating rollouts for RL optimization — we'll cover those workflows in follow-up guides.

New to Tensorlake? Sign up at the [dashboard](https://cloud.tensorlake.ai) — new accounts include free credits, enough to run a full Terminal-Bench sweep before you pay for anything.

## Quick start

Grab a Tensorlake API key from the [Tensorlake Dashboard](https://cloud.tensorlake.ai). You'll also need an API key for whichever agent provider you want to evaluate (e.g., Anthropic).

The `harbor[tensorlake]` extra installs the `TensorLakeEnvironment` provider alongside Harbor.
```bash theme={null}
uv pip install "harbor[tensorlake]"
```

Or, with `pip`:

```bash theme={null}
pip install "harbor[tensorlake]"
```

Then export your API keys:

```bash theme={null}
export TENSORLAKE_API_KEY="tl_..."
export ANTHROPIC_API_KEY="sk-ant-..."  # or another agent provider
```

Run a single Terminal-Bench task on Tensorlake with Claude Code as the agent:

```bash theme={null}
harbor run --env tensorlake \
  --include-task-name pytorch-model-cli \
  --dataset terminal-bench@2.0 \
  --agent claude-code \
  --model anthropic/claude-sonnet-4-6 \
  --ae ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY
```

Drop `--include-task-name` to run the full Terminal-Bench 2.0 suite. `--ae KEY=VALUE` forwards an environment variable from your shell into the sandbox where the agent runs — add more `--ae` flags for any other secrets the agent needs.

## Why Tensorlake for Harbor

Harbor's value comes from running large fleets of environments in parallel and trusting the results. Tensorlake's runtime is designed for exactly that workload:

* **Per-trial sandboxes** — each task starts on a clean machine and is destroyed at the end. No shared kernel state between trials, which matters for both eval reproducibility and RL reward integrity.
* **Pre-warmed snapshots** — environments with heavy `apt`/`pip` installs (PyTorch, CUDA toolchains, full Linux desktops) can be built once, snapshotted, and restored in under a second for every subsequent trial or rollout.
* **Independent verification** — Harbor's test script runs inside the sandbox and writes `1.0` or `0.0` to `reward.txt`. The agent never sees or touches the verifier, so "the agent said it worked" is never confused with "the tests pass."
* **Parallel scale** — Tensorlake schedules thousands of sandboxes concurrently, which is what RL rollout generation and full benchmark sweeps need.
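
Because the verifier reduces each trial to a `1.0` or `0.0` in `reward.txt`, the pass rate of a sweep is simply the mean of those files. The sketch below illustrates that arithmetic only. The `*/verifier/reward.txt` layout and the `pass_rate` helper are assumptions made for illustration, not part of Harbor's documented output format, so adjust the glob pattern to wherever your run actually writes its reward files.

```python
from pathlib import Path


def read_reward(path: Path) -> float:
    """Parse a reward file that holds a single number such as 1.0 or 0.0."""
    return float(path.read_text().strip())


def pass_rate(jobs_dir: Path, pattern: str = "*/verifier/reward.txt") -> float:
    """Mean reward over every file matching `pattern` (hypothetical layout)."""
    rewards = [read_reward(p) for p in sorted(jobs_dir.glob(pattern))]
    if not rewards:
        raise ValueError(f"no reward files found under {jobs_dir}")
    return sum(rewards) / len(rewards)
```

Raising on an empty result set is deliberate: a sweep with no reward files most likely failed before verification, and reporting it as a 0% pass rate would hide that.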
## Anatomy of a Harbor task

Harbor expects each task to follow a fixed layout. Take [gcode-to-text](https://github.com/harbor-framework/terminal-bench-2/tree/main/gcode-to-text) as an example:

```
gcode-to-text
├── environment
│   ├── Dockerfile
│   └── text.gcode.gz
├── instruction.md
├── solution
│   └── solve.sh
├── task.toml
└── tests
    ├── test_outputs.py
    └── test.sh
```

* `environment/Dockerfile` defines the base image and any setup steps.
* `instruction.md` is the prompt the agent receives.
* `solution/` is an oracle reference used to validate the environment itself.
* `tests/test.sh` runs after the agent finishes and produces `reward.txt`.

## Tune sandbox resources

Each task's `task.toml` controls the sandbox Harbor provisions on Tensorlake. Set resources in the `[environment]` block:

```toml task.toml theme={null}
[environment]
cpus = 2
memory_mb = 4096
storage_mb = 20480
allow_internet = true
```

| Field            | Default | Forwarded to Tensorlake |
| ---------------- | ------- | ----------------------- |
| `cpus`           | `1`     | `cpus`                  |
| `memory_mb`      | `2048`  | `memory_mb`             |
| `storage_mb`     | `10240` | `ephemeral_disk_mb`     |
| `allow_internet` | `true`  | `allow_internet_access` |

Tensorlake requires `memory_mb` to be between 1024 and 8192 MB per CPU core.

A few rules of thumb:

* **Large or heavy images** — if your `environment/Dockerfile` pulls in big toolchains (PyTorch, CUDA, full Linux desktops, large datasets), bump `cpus` and `memory_mb` so the build and runtime have headroom, and raise `storage_mb` past the image size plus working-set room. Underprovisioned sandboxes show up as build timeouts or OOMs mid-trial.
* **Lock down `allow_internet`** — set `allow_internet = false` to stop the agent from searching the web for answers. If the verifier needs network access, bake those dependencies into the Dockerfile. Per-host allowlists are coming soon, so you'll be able to block search engines while leaving package mirrors reachable.
## Interactive debugging

When a trial fails and you want to poke around the live environment, attach to the session:

```bash theme={null}
harbor env attach
```

This drops you directly into the running sandbox, where you can inspect state, rerun tests by hand, and confirm whether the failure came from the agent or the environment.

## Structured logs

Each trial produces structured artifacts, e.g.:

```
gcode-to-text__UFALMLv
├── agent/
├── verifier/
├── result.json
└── trial.log
```

With these you can trace:

* The agent's actions and outputs
* What the verifier checked
* Why the trial passed or failed

## What to build next

* Build an environment once, snapshot it, and restore in seconds for every trial.
* Use sandboxes as a deterministic reward oracle for RL training loops.

# Sandbox Images

Source: https://docs.tensorlake.ai/sandboxes/images

Define reusable named sandbox images in Python, TypeScript, or Dockerfiles.

Sandbox images let you prebuild dependencies, files, and environment setup once, then launch fresh sandboxes from that prepared state. An image is a project-scoped name backed by a filesystem snapshot. You can define one with a Dockerfile, the Python SDK, or the TypeScript SDK, then pass the registered name to `image=` when creating sandboxes.

The usual flow is:

1. Choose a base image.
2. Define the setup steps with a Dockerfile or `Image` object.
3. Build and register the image name in your project.
4. Create sandboxes from that registered name.

## Prerequisites

Images are scoped to the authenticated project. Before creating one, make sure you have installed the SDK or CLI and authenticated:

```bash theme={null}
pip install tensorlake
tl login
```

```bash theme={null}
npm install tensorlake
npx tl login
```

For SDK image builds, set `TENSORLAKE_API_KEY` in the process environment. If you authenticate with a personal access token instead, also set `TENSORLAKE_ORGANIZATION_ID` and `TENSORLAKE_PROJECT_ID` so the SDK can choose the project.
## Base Images Tensorlake ships preconfigured base images that boot quickly and are tuned for common sandbox workloads: * `tensorlake/ubuntu-minimal` (*default sandbox image*): Minimal Ubuntu without systemd. Use this when you want the fastest cold starts. * `tensorlake/ubuntu-systemd`: Ubuntu with systemd. Use this when you need services such as Docker or Kubernetes inside the sandbox. * `tensorlake/debian-minimal`: Minimal Debian 13. In environments where desktop automation is enabled, you may also see: * `tensorlake/ubuntu-vnc`: Desktop-enabled Ubuntu based on `tensorlake/ubuntu-systemd`, with XFCE, TigerVNC, and Firefox preinstalled. Use it for browser automation and computer-use workloads. See [Computer Use](/sandboxes/computer-use). The short aliases `ubuntu-minimal`, `ubuntu-systemd`, `ubuntu-vnc`, and `debian-minimal` resolve to their canonical `tensorlake/`-prefixed names. Both forms work anywhere `image=` is accepted. ### `FROM` vs `image=` `FROM` and `image=` are related, but they are resolved at different times: * `FROM` in a Dockerfile, `base_image=` in Python, or `baseImage` in TypeScript accepts any OCI image reference. Tensorlake resolves it during the image build. * `image=` on `Sandbox.create()` or `--image` on `tl sbx create` accepts a registered sandbox image name. It is not resolved against an upstream registry at sandbox-create time. For example, `FROM python:3.12-slim` is valid in a Dockerfile build. `Sandbox.create(image="python:3.12-slim")` will fail unless you have already registered an image with that exact name. ## Build and Register an Image You can define the same image with a Dockerfile, Python, or TypeScript. During a build, Tensorlake prepares a temporary builder sandbox, applies the setup steps, snapshots the prepared root filesystem, and registers the snapshot under the image name. 
```dockerfile Dockerfile theme={null} FROM tensorlake/ubuntu-systemd RUN apt-get update && apt-get install -y python3 python3-pip COPY requirements.txt /tmp/requirements.txt RUN python3 -m pip install --break-system-packages -r /tmp/requirements.txt RUN mkdir -p /workspace/cache ENV APP_ENV=prod WORKDIR /workspace ``` ```bash theme={null} tl sbx image create ./Dockerfile --registered-name data-tools-image ``` ```python theme={null} from tensorlake import Image image = ( Image(name="data-tools-image", base_image="tensorlake/ubuntu-systemd") .copy("requirements.txt", "/tmp/requirements.txt") .run("apt-get update && apt-get install -y python3 python3-pip") .run("python3 -m pip install --break-system-packages -r /tmp/requirements.txt") .run("mkdir -p /workspace/cache") .env("APP_ENV", "prod") .workdir("/workspace") ) image.build(registered_name="data-tools-image") ``` ```typescript theme={null} import { Image } from "tensorlake"; const image = new Image({ name: "data-tools-image", baseImage: "tensorlake/ubuntu-systemd", }) .copy("requirements.txt", "/tmp/requirements.txt") .run("apt-get update && apt-get install -y python3 python3-pip") .run("python3 -m pip install --break-system-packages -r /tmp/requirements.txt") .run("mkdir -p /workspace/cache") .env("APP_ENV", "prod") .workdir("/workspace"); await image.build({ registeredName: "data-tools-image", contextDir: ".", }); ``` `contextDir` controls how relative `copy()` and `add()` sources are resolved in SDK builds. Dockerfile builds use the Dockerfile's parent directory as the build context. ### Build from an OCI Base You are not limited to `tensorlake/*` bases. The build base can be any standard OCI image reference, including `python:3.12-slim`, `debian:bookworm-slim`, `node:22-alpine`, `ghcr.io/...`, or `public.ecr.aws/...`. 
```dockerfile Dockerfile theme={null} FROM python:3.12-slim RUN apt-get update && apt-get install -y curl RUN python3 -m pip install pandas pyarrow duckdb WORKDIR /workspace ``` ```bash theme={null} tl sbx image create ./Dockerfile --registered-name py-data-tools ``` The first build from a new OCI base takes longer because Tensorlake has to fetch and prepare the upstream image. After registration, sandboxes launched from `py-data-tools` use the registered sandbox image. ### Private Registries Registry credentials are read from your local Docker config file: `~/.docker/config.json`, or `$DOCKER_CONFIG/config.json` when `DOCKER_CONFIG` is set. Any registry that works with `docker login` can be used here, including Docker Hub, GHCR, ECR, GCR, Quay, and self-hosted registries. ```bash theme={null} docker login ghcr.io tl sbx image create ./Dockerfile --registered-name my-private-image ``` In CI, make sure the runner has a populated Docker config before running `tl sbx image create`. There is no separate environment-variable or programmatic registry-auth path today. ## Launch Sandboxes from an Image Create a sandbox from the registered image name. You can still override CPU, memory, disk, timeout, and entrypoint when the sandbox starts. 
```bash theme={null} tl sbx create --image data-tools-image ``` ```bash theme={null} tl sbx create \ --image data-tools-image \ --cpus 4.0 \ --memory 4096 \ --disk_mb 51200 \ --timeout 1800 ``` ```python theme={null} from tensorlake.sandbox import Sandbox sandbox = Sandbox.create( image="data-tools-image", cpus=4.0, memory_mb=4096, disk_mb=51200, timeout_secs=1800, ) try: result = sandbox.run( "python3", ["-c", "import pandas, pyarrow; print('ready')"], ) print(result.stdout) finally: sandbox.terminate() ``` ```typescript theme={null} import { Sandbox } from "tensorlake"; const sandbox = await Sandbox.create({ image: "data-tools-image", cpus: 4.0, memoryMb: 4096, diskMb: 51200, timeoutSecs: 1800, }); try { const result = await sandbox.run("python3", { args: ["-c", "import pandas, pyarrow; print('ready')"], }); console.log(result.stdout); } finally { await sandbox.terminate(); } ``` ## Python Packages The Tensorlake Ubuntu and Debian base images ship a PEP 668-managed system Python, so `pip install` requires `--break-system-packages` unless you create a virtual environment. Without it, `pip` exits with `error: externally-managed-environment`. For one-off installs in a running sandbox: ```python theme={null} sandbox.run( "python3", ["-m", "pip", "install", "--break-system-packages", "pandas", "pyarrow", "duckdb"], ) ``` ```typescript theme={null} await sandbox.run("python3", { args: ["-m", "pip", "install", "--break-system-packages", "pandas", "pyarrow", "duckdb"], }); ``` For repeatable installs, put the packages in `requirements.txt` and install them during the image build, as shown in [Build and Register an Image](#build-and-register-an-image). Do not sidestep PEP 668 by switching Python versions. `python3.11 -m pip install ...` or another alternate system Python can produce the same `externally-managed-environment` error. Use `--break-system-packages` with the system `python3`, or create an explicit virtual environment. 
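
The virtual-environment route can be sketched as follows. The helper only builds argument lists in the `(command, args)` shape that `sandbox.run` takes in the examples on this page, so it runs without a live sandbox. The `/workspace/.venv` location and the `venv_commands` name are assumptions for illustration, and a minimal base image may need the `python3-venv` package installed before `python3 -m venv` works.

```python
def venv_commands(
    packages: list[str], venv_dir: str = "/workspace/.venv"
) -> list[tuple[str, list[str]]]:
    """Build (command, args) pairs that create a venv and install packages into it."""
    if not packages:
        raise ValueError("at least one package is required")
    venv_pip = f"{venv_dir}/bin/pip"
    return [
        # 1. Create the virtual environment with the system interpreter.
        ("python3", ["-m", "venv", venv_dir]),
        # 2. Install with the venv's own pip. PEP 668 protects only the
        #    system environment, so --break-system-packages is not needed here.
        (venv_pip, ["install", *packages]),
    ]


for command, args in venv_commands(["pandas", "pyarrow", "duckdb"]):
    print(command, *args)
```

Against a live sandbox, each pair would be passed along as `sandbox.run(command, args)`, after which `/workspace/.venv/bin/python` runs code with the installed packages available.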
## Build Resources Image builds run inside a temporary builder sandbox. You can allocate more CPU, memory, or disk for that builder, and you can separately choose the root disk size of the generated sandbox image. ```bash theme={null} tl sbx image create ./Dockerfile \ --registered-name data-tools-image \ --cpus 4 \ --memory 4096 \ --disk_mb 25600 \ --builder_disk_mb 32768 ``` ```python theme={null} image.build( registered_name="data-tools-image", cpus=4.0, memory_mb=4096, disk_mb=25600, builder_disk_mb=32768, ) ``` ```typescript theme={null} await image.build({ registeredName: "data-tools-image", cpus: 4.0, memoryMb: 4096, diskMb: 25600, builderDiskMb: 32768, contextDir: ".", }); ``` `disk_mb` / `diskMb` sets the root disk size for sandboxes created from the registered image. `builder_disk_mb` / `builderDiskMb` only affects the temporary builder sandbox. Build defaults are `cpus=2.0`, `memory=4096 MB`, and a generated root disk of `10240 MiB` (10 GiB). ## Register an Existing Snapshot as an Image If you already have a completed filesystem snapshot, you can give it a reusable image name without rebuilding: ```bash theme={null} tl sbx image register data-tools-image snap_01HX... \ --dockerfile ./Dockerfile ``` The first positional argument is the image name to register, the second is the completed snapshot ID, and `--dockerfile` is stored alongside the image so `tl sbx image describe` can show how it was built. Add `--public` to make the name resolvable from any namespace (see [Public Images](#public-images)). The snapshot must be in `Completed` status with a durable `snapshot_uri`; `tl sbx image register` rejects snapshots that haven't finished uploading. ## Inspect and List Registered Images ```bash theme={null} tl sbx image ls # list every image registered in the current project tl sbx image describe data-tools-image # show Dockerfile, snapshot ID, image size ``` `describe` accepts either the registered image name or the underlying sandbox-template ID. 
## Public Images By default a registered image is namespace-scoped. Pass `--public`, `is_public=True`, or `isPublic: true` to make the image name resolvable from any namespace. This is how the `tensorlake/*` base images work. ```bash theme={null} tl sbx image create ./Dockerfile --registered-name shared-base --public ``` ```python theme={null} image.build(registered_name="shared-base", is_public=True) ``` ```typescript theme={null} await image.build({ registeredName: "shared-base", isPublic: true, contextDir: ".", }); ``` Public image names must be globally unique for the registry. Names that collide with an already-registered public image will be rejected at creation time. ## Examples ### Skills Image This variant preloads the [Tensorlake skills repo](/agent-skills) so coding agents can auto-discover it at startup: ```dockerfile Dockerfile theme={null} FROM tensorlake/ubuntu-systemd RUN apt-get update && apt-get install -y git nodejs npm python3 python3-pip RUN npm install -g skills RUN skills add tensorlakeai/tensorlake-skills --all -y --copy RUN python3 -m pip install --break-system-packages tensorlake ``` If the file is named `Dockerfile`, the registered name defaults to the parent directory name. Otherwise it defaults to the file stem. Registered image names must be unique within a project. ## Supported Build Operations and Limitations Across the supported image-definition DSLs, Tensorlake currently materializes these build operations into the sandbox: * `RUN` * `WORKDIR` * `ENV` * `COPY` * `ADD` These metadata-oriented operations are preserved with the image definition but are not materialized into the snapshot: * `CMD` * `ENTRYPOINT` * `EXPOSE` * `HEALTHCHECK` * `LABEL` * `STOPSIGNAL` * `VOLUME` These operations are not currently supported for sandbox image creation: * `ARG` * `ONBUILD` * `SHELL` * `USER` Additional limitations: * Multi-stage Dockerfiles are not supported yet. 
* `COPY` and `ADD` sources are read from the local filesystem, relative to the Dockerfile build context. Remote URLs and advanced BuildKit-only features are not supported.
* `tl sbx image create` and SDK image builds use the current authenticated project context.
* `tl sbx image describe` shows the registered Dockerfile and snapshot metadata for a sandbox image.

## See Also

* Understand the underlying snapshot primitive used to save and restore sandbox state.
* Learn which sandbox settings you can still override when launching from an image.
* Ship Tensorlake SDK docs inside sandbox images for agents and tools.

# Sandbox and Orchestration Infrastructure for Agents

Source: https://docs.tensorlake.ai/sandboxes/introduction

Tensorlake provides isolated MicroVM sandboxes that boot in hundreds of milliseconds, with memory and filesystem preserved across suspend/resume.

Get set up in a few minutes, and start a sandbox in a few seconds. Sandboxes can run agent harnesses and tool calls, or serve as VMs for coding agents, builds, and IDEs.

## How it works

Sandboxes are created on demand via API calls and run as MicroVMs backed by Firecracker and Cloud Hypervisor. You can specify images and resources when creating them. The default image, `tensorlake/ubuntu-minimal`, starts up in a few hundred milliseconds, while `tensorlake/ubuntu-systemd` has a full init system and more tools and takes around 1 second to boot.

```bash cli theme={null}
tl sbx create
```

```python sandbox.py theme={null}
from tensorlake.sandbox import Sandbox

resp = Sandbox.create(
    image="tensorlake/ubuntu-minimal",
    cpus=4,
    memory_mb=8192,
)
```

```typescript sandbox.ts theme={null}
import { Sandbox } from "tensorlake";

const resp = await Sandbox.create({
  image: "tensorlake/ubuntu-minimal",
  cpus: 4,
  memoryMb: 8192,
});
```

#### Start Using Sandboxes

Install the SDK and run your first sandbox.
The mental model behind ephemeral, named, suspend, and snapshot. Use and customize sandbox images for your use case. How to persist sandbox state across runs with suspend and snapshots. ## Trust and support Tensorlake is HIPAA and SOC 2 Type II compliant, supports EU data residency, and offers zero data retention. Chat with our engineers. [support@tensorlake.ai](mailto:support@tensorlake.ai) Use cases and product updates. # Lifecycle Source: https://docs.tensorlake.ai/sandboxes/lifecycle Sandbox states, creation, suspend/resume, and cleanup ## Overview Sandboxes come in two flavors: * **Ephemeral** — no name. Runs until you terminate it or it times out. Cannot be suspended. * **Named** — a name given at creation (or assigned later). Supports suspend and resume, so you can pause between tasks and pick up exactly where you left off. | | Ephemeral | Named | | -------------------- | ------------------------------------ | ---------------------------------------- | | **Created with** | `tl sbx create` | `tl sbx create ` | | **Suspend / Resume** | Not supported | Supported | | **Reference by** | ID only | ID **or** name | | **Use when** | Short-lived tasks, one-off execution | Multi-step work, persistent environments | ## Lifecycle states Every sandbox moves through the states below. Create starts the sandbox in `Pending`; from `Running`, you can suspend (named only), snapshot, or terminate. Ephemeral sandboxes follow the same flow but skip `Suspending`/`Suspended`. ```mermaid theme={null} stateDiagram-v2 [*] --> Pending: • create
• restore from snapshot Pending --> Running Running --> Snapshotting: snapshot Snapshotting --> Running: snapshot complete Running --> Suspending: named sandbox
• suspend
• timeout Suspending --> Suspended Suspended --> Running: resume Running --> Terminated: • terminate
• timeout (ephemeral) Suspended --> Terminated: terminate Terminated --> [*] style Suspending fill:#E8F4FF,stroke:#1D70B8,color:#0B3C6F,stroke-width:2px style Suspended fill:#E8F4FF,stroke:#1D70B8,color:#0B3C6F,stroke-width:2px ``` | State | What it means | How you exit it | | ---------------- | ----------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------- | | **Pending** | Sandbox is being scheduled and booted. Not yet ready to accept commands. | Transitions to `Running` automatically once boot completes. | | **Running** | Sandbox is live and accepting commands, file operations, and process execution. Snapshots can be taken from this state. | Call `suspend` (named only) or `terminate`. | | **Snapshotting** | A reusable snapshot artifact is being captured from the sandbox's filesystem, memory, and running processes. | Returns to `Running` when capture completes. | | **Suspending** | Named sandbox is being paused in place. Triggered by manual suspend or by `timeout_secs` elapsing. | Transitions to `Suspended` automatically. | | **Suspended** | Named sandbox is paused. Consumes no compute; state is preserved for resume under the same sandbox ID. | Call `resume` to return to `Running`, or `terminate` to end it. | | **Terminated** | Sandbox has stopped — manually, or via timeout for ephemeral sandboxes. Final state; cannot be reversed. | — | ## Suspend vs. snapshot Suspend and snapshot both preserve sandbox state, but they serve different purposes: * **Suspend** pauses *this* sandbox so you can resume it later under the same ID. * **Snapshot** captures a reusable artifact you can restore into a *new* sandbox. Suspend/resume is covered on this page; see [Snapshots](/sandboxes/snapshots) for save-and-restore. ### When should I use what? 
| Scenario | Use Suspend | Use Snapshot | | ------------------------------- | ----------- | ------------ | | Pause and resume later | ✅ | ❌ | | Save cost when idle | ✅ | ❌ | | Keep agent memory alive | ✅ | ❌ | | Retry from a checkpoint | ❌ | ✅ | | Run experiments from same state | ❌ | ✅ | | Clone environment | ❌ | ✅ | ## Create a sandbox Create an ephemeral sandbox by calling create with no name. Add a name to make the sandbox persistent and eligible for suspend/resume. You can also boot a sandbox from an existing snapshot to restore a previously captured filesystem, memory, and running processes. See [Restoring from a snapshot](/sandboxes/snapshots#restoring-from-a-snapshot) for details. ```bash theme={null} # Ephemeral — runs until terminated or timed out tl sbx create # Named — can be suspended and resumed tl sbx create my-env ``` ```python theme={null} from tensorlake.sandbox import Sandbox # Ephemeral sandbox — no name, cannot be suspended ephemeral = Sandbox.create() # Named sandbox — can be suspended and resumed named = Sandbox.create(name="my-env") print(f"Sandbox ID: {named.sandbox_id}") print(f"Status: {named.status}") ``` ```typescript theme={null} // Ephemeral sandbox — no name, cannot be suspended const ephemeral = await Sandbox.create(); // Named sandbox — can be suspended and resumed const named = await Sandbox.create({ name: "my-env" }); console.log(named.sandboxId, named.status); ``` ```bash theme={null} # Ephemeral curl -X POST https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{}' # Named curl -X POST https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{"name": "my-env"}' ``` ### Resources Configure CPU, memory, and disk size per sandbox. These are fixed when the sandbox is created and cannot be changed afterwards — create a new sandbox if you need different resources. 
```bash theme={null}
tl sbx create --cpus 2.0 --memory 2048 --disk_mb 25600
```

```python theme={null}
from tensorlake.sandbox import Sandbox

sandbox = Sandbox.create(
    cpus=2.0,
    memory_mb=2048,
    disk_mb=25600,
)
```

```typescript theme={null}
import { Sandbox } from "tensorlake";

const sandbox = await Sandbox.create({
  cpus: 2.0,
  memoryMb: 2048,
  diskMb: 25600,
});
```

```bash theme={null}
curl -X POST https://api.tensorlake.ai/sandboxes \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "resources": {
      "cpus": 2.0,
      "memory_mb": 2048,
      "disk_mb": 25600
    }
  }'
```

| Parameter   | Type    | Default | Description |
| ----------- | ------- | ------- | ----------- |
| `cpus`      | `float` | `1.0`   | Number of CPUs to allocate. |
| `memory_mb` | `int`   | `1024`  | Memory in megabytes. Must be between 1024 and 8192 MB per CPU core. |
| `disk_mb`   | `int`   | `10240` | Root filesystem size in MiB. Must be between 10240 and 102400 (10–100 GiB). The CLI flag `--disk_mb` is also in MiB. Accepted on fresh creates. With `image`, or with a `snapshot_id` from a filesystem snapshot, it can only grow the root disk (at create time or restore time, respectively). |

### Timeout

Set a timeout to automatically stop a sandbox that runs too long. The behavior depends on the sandbox type:

* **Named sandboxes** — timeout triggers a suspend, preserving state for later resume.
* **Ephemeral sandboxes** — timeout triggers termination (final state).

If `timeout_secs` is not set, the default of `600` seconds (10 minutes) applies.
The maximum allowed `timeout_secs` depends on your plan: **1 hour** on Free (unverified), **2 hours** on Free (verified), and **24 hours** on On-Demand (pay-as-you-go). Setting `timeout_secs=0` requests the plan maximum. See [tensorlake.ai/pricing](https://www.tensorlake.ai/pricing) for higher limits on committed plans. ```bash theme={null} tl sbx create --timeout 300 ``` ```python theme={null} sandbox = Sandbox.create(timeout_secs=300) ``` ```typescript theme={null} const sandbox = await Sandbox.create({ timeoutSecs: 300 }); ``` ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{"timeout_secs": 300}' ``` ### Secrets Inject secrets into the sandbox at creation. Secrets must be pre-configured in your Tensorlake account with `tl secrets set OPENAI_API_KEY=`. Not supported in the CLI. ```python theme={null} sandbox = Sandbox.create( secret_names=["OPENAI_API_KEY", "DATABASE_URL"] ) ``` Use the HTTP API for secret injection. ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "secret_names": ["OPENAI_API_KEY", "DATABASE_URL"] }' ``` ### Runtime environment Sandboxes run on Tensorlake's managed Ubuntu 24.04 environment by default. If you need reusable setup or preinstalled dependencies, create a [Sandbox Image](/sandboxes/images) and launch sandboxes with `--image`. For one-off startup setup, create the sandbox and then use [command execution](/sandboxes/commands) to run those steps explicitly. ## Name and reference a sandbox You can assign or update a sandbox's name after it is created. This is how you convert an ephemeral sandbox into a named one so it becomes eligible for suspend and resume. 
```bash theme={null} # Assign a name to a running sandbox tl sbx name my-env # Change an existing name tl sbx name my-env new-name ``` ```python theme={null} from tensorlake.sandbox import Sandbox sandbox = Sandbox.create() named_sbx = sandbox.update(name="my-env") print(named_sbx.name) ``` ```typescript theme={null} import { Sandbox } from "tensorlake"; const sandbox = await Sandbox.create(); const named_sbx = await sandbox.update({ name: "my-env" }); console.log(named_sbx.name); ``` ```bash theme={null} curl -X PATCH https://api.tensorlake.ai/sandboxes/ \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{"name": "my-env"}' ``` Once a sandbox has a name, you can use either the name or the UUID anywhere a sandbox identifier is accepted. Use `connect` to get an operable handle from either identifier. ```bash theme={null} # All of these work with either the ID or the name tl sbx exec my-env python main.py tl sbx ssh my-env tl sbx cp ./file.py my-env:/workspace/file.py tl sbx suspend my-env tl sbx resume my-env tl sbx terminate my-env tl sbx checkpoint my-env tl sbx name my-env new-name ``` ```python theme={null} info = Sandbox.connect("my-env") print(info.status) sandbox = Sandbox.connect(identifier="my-env") print(sandbox.sandbox_id) # server UUID, e.g. 
"s7jus08qec4axzgbpq76h" print(sandbox.name) # "my-env" result = sandbox.run("python", ["main.py"]) print(result.stdout) renamed = sandbox.update(name= "new-name") print(renamed.name) sandbox.terminate() ``` ```typescript theme={null} const info = await Sandbox.connect("my-env"); console.log(info.status); const sandbox = Sandbox.connect("my-env"); console.log(sandbox.sandboxId); // server UUID console.log(sandbox.name); // "my-env" const result = await sandbox.run("python", { args: ["main.py"] }); console.log(result.stdout); await sandbox.update({ name: "new-name" }); await sandbox.terminate(); ``` ```bash theme={null} curl https://api.tensorlake.ai/sandboxes/my-env \ -H "Authorization: Bearer $TL_API_KEY" ``` Authenticated requests can use either the sandbox ID or sandbox name. Unauthenticated proxy requests can also use sandbox names for exposed user ports when `allow_unauthenticated_access` is enabled. The management URL on port `9501` still requires authentication. ## Inspect and list Use `get` to check a single sandbox's status and configuration, or `list` to see all sandboxes in your namespace. ```bash theme={null} # List active sandboxes tl sbx ls # Running sandboxes only tl sbx ls --running # Include all sandboxes regardless of state tl sbx ls --all ``` ```python theme={null} from tensorlake.sandbox import Sandbox # Connect returns a Sandbox handle (not SandboxInfo) sandbox = Sandbox.connect("my-env") print(sandbox.status) # property — fetches fresh from server print(sandbox.name) print(sandbox.sandbox_id) # Get the full metadata (image, resources, timeouts, etc.) 
info = sandbox.info() print(info.image) print(f"{info.resources.cpus} CPUs, {info.resources.memory_mb} MB RAM") # List all sandboxes in the namespace for sb in Sandbox.list(): print(f"{sb.sandbox_id}: {sb.status}") ``` ```typescript theme={null} import { Sandbox } from "tensorlake"; // Connect returns a Sandbox handle (not SandboxInfo) const sandbox = await Sandbox.connect("my-env"); console.log(await sandbox.status()); // status is an async method in TS console.log(sandbox.name); // name is a getter console.log(sandbox.sandboxId); // Get the full metadata (image, resources, timeouts, etc.) const info = await sandbox.info(); console.log(info.image, info.resources.cpus, info.resources.memoryMb); // List all sandboxes in the namespace const sandboxes = await Sandbox.list(); for (const sb of sandboxes) { console.log(`${sb.sandboxId}: ${sb.status}`); } ``` ```bash theme={null} # Get one sandbox (by name or ID) curl https://api.tensorlake.ai/sandboxes/my-env \ -H "Authorization: Bearer $TL_API_KEY" # List all sandboxes curl https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TL_API_KEY" ``` ## Suspend and resume Suspend a running named sandbox to pause it in place, then resume the same sandbox later exactly where it left off. Suspend and resume do not create a reusable artifact — for that, use [Snapshots](/sandboxes/snapshots). Ephemeral sandboxes cannot be suspended — suspend calls on them return an error. ```bash theme={null} # Suspend a named sandbox (by name or ID) tl sbx suspend my-env # Resume it later tl sbx resume my-env ``` ```python theme={null} sandbox.suspend() sandbox.resume() ``` ```typescript theme={null} await sandbox.suspend(); await sandbox.resume(); ``` ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes/my-env/suspend \ -H "Authorization: Bearer $TL_API_KEY" curl -X POST https://api.tensorlake.ai/sandboxes/my-env/resume \ -H "Authorization: Bearer $TL_API_KEY" ``` ## Terminate Terminate a sandbox when the work is done. 
`Terminated` is a final state and cannot be reversed. Ephemeral sandboxes also terminate automatically once their timeout elapses; named sandboxes are suspended instead (see [Timeout](#timeout)).

```bash theme={null}
tl sbx terminate my-env
```

```python theme={null}
sandbox.terminate()
```

```typescript theme={null}
await sandbox.terminate();
```

```bash theme={null}
curl -X DELETE https://api.tensorlake.ai/sandboxes/my-env \
  -H "Authorization: Bearer $TL_API_KEY"
```

## End-to-end example

If you want a single example that creates a sandbox, inspects it, lists sandboxes, and cleans up when finished, use one of the sessions below.

```bash theme={null}
# Create an ephemeral sandbox (no name, so it cannot be suspended)
tl sbx create --cpus 1.0 --memory 1024 --timeout 300

# Create a named sandbox (can be suspended and resumed)
tl sbx create my-env --cpus 1.0 --memory 1024 --timeout 300

# Check status or list sandboxes
tl sbx ls
tl sbx ls --all

# Terminate the sandbox when you are done (by name or ID)
tl sbx terminate my-env
```

```python theme={null}
from tensorlake.sandbox import Sandbox

# Ephemeral sandbox: no name, cannot be suspended
ephemeral = Sandbox.create(
    cpus=1.0,
    memory_mb=1024,
    disk_mb=10240,
    timeout_secs=300,
)

# Named sandbox: can be suspended and resumed
named = Sandbox.create(
    name="my-env",
    cpus=1.0,
    memory_mb=1024,
    disk_mb=10240,
    timeout_secs=300,
)

print(f"Sandbox ID: {named.sandbox_id}")
print(f"Status: {named.status}")

# Get the full metadata (image, resources, timeouts, etc.)
info = named.info()
print(f"Image: {info.image}")
print(f"Resources: {info.resources.cpus} CPUs, {info.resources.memory_mb} MB RAM")

sandboxes = Sandbox.list()
for sb in sandboxes:
    print(f"{sb.sandbox_id}: {sb.status}")

# Clean up both sandboxes created above
ephemeral.terminate()
named.terminate()
print("Sandboxes terminated")
```

```typescript theme={null}
import { Sandbox } from "tensorlake";

const ephemeral = await Sandbox.create({
  cpus: 1.0,
  memoryMb: 1024,
  diskMb: 10240,
  timeoutSecs: 300,
});

const named = await Sandbox.create({
  name: "my-env",
  cpus: 1.0,
  memoryMb: 1024,
  diskMb: 10240,
  timeoutSecs: 300,
});

// status is an async method in TS
console.log(named.sandboxId, await named.status());

// Get the full metadata (image, resources, timeouts, etc.)
const info = await named.info();
console.log(info.image, info.resources.cpus, info.resources.memoryMb);

const sandboxes = await Sandbox.list();
for (const sb of sandboxes) {
  console.log(`${sb.sandboxId}: ${sb.status}`);
}

// Clean up both sandboxes created above
await ephemeral.terminate();
await named.terminate();
```

```bash theme={null}
# Create an ephemeral sandbox
curl -X POST https://api.tensorlake.ai/sandboxes \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "resources": {"cpus": 1.0, "memory_mb": 1024},
    "timeout_secs": 300
  }'

# Create a named sandbox (supports suspend/resume)
curl -X POST https://api.tensorlake.ai/sandboxes \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-env",
    "resources": {"cpus": 1.0, "memory_mb": 1024},
    "timeout_secs": 300
  }'

# Get one sandbox
curl https://api.tensorlake.ai/sandboxes/<sandbox-id-or-name> \
  -H "Authorization: Bearer $TL_API_KEY"

# List all sandboxes
curl https://api.tensorlake.ai/sandboxes \
  -H "Authorization: Bearer $TL_API_KEY"

# Delete
curl -X DELETE https://api.tensorlake.ai/sandboxes/<sandbox-id-or-name> \
  -H "Authorization: Bearer $TL_API_KEY"
```

## Sandbox object reference

### Sandbox

The `Sandbox` object returned by `Sandbox.create()` and `Sandbox.connect()` exposes the following properties.
Both are resolved from the server on first access and cached for the lifetime of the object. | Property (Python) | Property (TypeScript) | Type | Description | | ----------------- | --------------------- | ------------- | -------------------------------------------------------------------------------------- | | `sandbox_id` | `sandboxId` | `str` | Server-assigned UUID. Always a UUID, never a name, even if you connected using a name. | | `name` | `name` | `str \| None` | Human-readable name, or `None` for ephemeral sandboxes. | ```python theme={null} sandbox = Sandbox.connect(identifier="my-env") print(sandbox.sandbox_id) # "s7jus08qec4axzgbpq76h" ← UUID print(sandbox.name) # "my-env" ← name ``` ### SandboxInfo The `SandboxInfo` object returned by `Sandbox.info()` and `Sandbox.list()` contains: | Field | Type | Description | | --------------- | ------------------------ | ------------------------------------------------------ | | `sandbox_id` | `str` | Unique sandbox identifier | | `name` | `str \| None` | Name of the sandbox, or `None` for ephemeral sandboxes | | `namespace` | `str` | Namespace the sandbox belongs to | | `status` | `str` | Current lifecycle state | | `image` | `str` | Container image used | | `resources` | `ContainerResourcesInfo` | CPU and memory allocation | | `secret_names` | `list[str]` | Injected secret names | | `timeout_secs` | `int` | Timeout in seconds | | `entrypoint` | `list[str]` | Custom entrypoint command | | `created_at` | `datetime \| None` | Creation timestamp | | `terminated_at` | `datetime \| None` | Termination timestamp | ## Learn more Save and restore sandbox filesystem, memory, and running processes. Control internet access and blocked destinations. # Networking Source: https://docs.tensorlake.ai/sandboxes/networking Route internet traffic into sandbox applications and control outbound internet access Sandboxes support two networking features: 1. 
Routing internet traffic into services running inside a sandbox through `*.sandbox.tensorlake.ai`
2. Restricting the sandbox's own outbound internet access

## Sandbox Public URL

Every running sandbox is reachable through the sandbox proxy domain.

* `https://<sandbox-id-or-name>.sandbox.tensorlake.ai` routes to the sandbox management API on port `9501`
* `https://<port>-<sandbox-id-or-name>.sandbox.tensorlake.ai` routes to a user service listening on `<port>` inside the sandbox

The proxy preserves the request path and query string, supports WebSocket upgrades, and forwards gRPC over HTTP/2.

The hostname can use either the sandbox ID or a sandbox name. The proxy resolves names to the sandbox's canonical ID before forwarding the request.

If you fetch sandbox details over the HTTP API, the returned `sandbox_url` is the management URL on port `9501`.

## Route Traffic Into Sandbox Apps

There are two access modes for internet-facing sandbox traffic:

1. `Authenticated requests`: the caller sends TensorLake auth credentials, and the proxy authorizes the request before forwarding it.
2. `Unauthenticated requests`: the sandbox owner explicitly makes selected user ports public, and the proxy skips auth for those user ports.

### Expose a User Port

Port `9501` is the built-in management API and is always routable through the bare sandbox hostname. For any other port, the proxy only forwards requests if that port is listed in `exposed_ports`.

`allow_unauthenticated_access` does not expose a port by itself. User ports still have to be present in `exposed_ports`.

#### Authenticated-Only Exposure with the HTTP API

Use this when a port should be routable from the internet but still requires TensorLake auth on every request.
```python theme={null} from tensorlake.sandbox import Sandbox sandbox = client.expose_ports( "my-env", [8080], allow_unauthenticated_access=False, ) print(sandbox.exposed_ports) sandbox = client.unexpose_ports("my-env", [8080]) print(sandbox.exposed_ports) ``` ```typescript theme={null} const sandbox = await client.exposePorts( "my-env", [8080], { allowUnauthenticatedAccess: false }, ); console.log(sandbox.exposedPorts); const updated = await client.unexposePorts("my-env", [8080]); console.log(updated.exposedPorts); ``` ```bash theme={null} curl -X PATCH https://api.tensorlake.ai/sandboxes/ \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "allow_unauthenticated_access": false, "exposed_ports": [8080] }' curl -X PATCH https://api.tensorlake.ai/sandboxes/ \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "allow_unauthenticated_access": false, "exposed_ports": [] }' ``` #### Unauthenticated Public Internet Access with the CLI Use this when you want anyone on the internet to be able to reach a sandbox app without TensorLake credentials. Common cases include webhook receivers, demo apps, public APIs, browser clients, and temporary preview environments. 
```bash theme={null} tl sbx port expose 8080 tl sbx port ls tl sbx port rm 8080 ``` ```python theme={null} from tensorlake.sandbox import Sandbox sandbox = client.expose_ports( "my-public-sandbox", [8080], allow_unauthenticated_access=True, ) print(sandbox.allow_unauthenticated_access, sandbox.exposed_ports) sandbox = client.unexpose_ports("my-public-sandbox", [8080]) print(sandbox.allow_unauthenticated_access, sandbox.exposed_ports) ``` ```typescript theme={null} const sandbox = await client.exposePorts( "my-public-sandbox", [8080], { allowUnauthenticatedAccess: true }, ); console.log(sandbox.allowUnauthenticatedAccess, sandbox.exposedPorts); const updated = await client.unexposePorts("my-public-sandbox", [8080]); console.log(updated.allowUnauthenticatedAccess, updated.exposedPorts); ``` ```bash theme={null} curl -X PATCH https://api.tensorlake.ai/sandboxes/ \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "allow_unauthenticated_access": true, "exposed_ports": [8080] }' curl -X PATCH https://api.tensorlake.ai/sandboxes/ \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "allow_unauthenticated_access": false, "exposed_ports": [] }' ``` The CLI `port expose` workflow sets both: * `exposed_ports` * `allow_unauthenticated_access=true` So traffic to that user port becomes publicly reachable from the internet without TensorLake auth. ### Authenticated Requests Authenticated routing is the default model for sandbox access. 
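The exposure rules described so far (port `9501` is always routable and always authenticated, other ports must be listed in `exposed_ports`, and `allow_unauthenticated_access` only affects ports that are already exposed) can be summarized as a small decision function. This is a reader's model of the documented behavior, not the proxy's actual code; `route_decision` and its return labels are made-up names.

```python
# Illustrative model of the documented routing rules; not the real proxy logic.
MANAGEMENT_PORT = 9501


def route_decision(port, exposed_ports, allow_unauthenticated_access):
    """Return 'auth-required', 'public', or 'not-routable' for a proxy request."""
    if port == MANAGEMENT_PORT:
        # The management API is always routable and never becomes public.
        return "auth-required"
    if port not in (exposed_ports or []):
        # allow_unauthenticated_access does not expose a port by itself.
        return "not-routable"
    return "public" if allow_unauthenticated_access else "auth-required"


if __name__ == "__main__":
    for port in (9501, 8080, 3000):
        print(port, route_decision(port, [8080], allow_unauthenticated_access=True))
```

Passing `None` for `exposed_ports` is treated the same as an empty list, which matches the `null` default for that field.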
* The management URL on port `9501` always requires auth * User ports can also require auth when they are exposed but `allow_unauthenticated_access=false` Verified against `sandbox-proxy`, the proxy accepts these auth modes: * API key: `Authorization: Bearer ` * Personal access token: `Authorization: Bearer tl_pat...` plus `X-Forwarded-Organization-Id` and `X-Forwarded-Project-Id` * Session cookie: `tl.session_token` or legacy `tl-session`, plus the same forwarded organization/project context For browser WebSocket clients that cannot set custom `X-Forwarded-*` headers, the proxy also accepts `organizationId` and `projectId` in the query string. ```bash theme={null} curl https://8080-.sandbox.tensorlake.ai/health \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" ``` ```bash theme={null} curl https://8080-.sandbox.tensorlake.ai/health \ -H "Authorization: Bearer $TENSORLAKE_PAT" \ -H "X-Forwarded-Organization-Id: $TENSORLAKE_ORGANIZATION_ID" \ -H "X-Forwarded-Project-Id: $TENSORLAKE_PROJECT_ID" ``` ```bash theme={null} curl https://8080-.sandbox.tensorlake.ai/health \ -H "Cookie: tl.session_token=$TENSORLAKE_SESSION_TOKEN" \ -H "X-Forwarded-Organization-Id: $TENSORLAKE_ORGANIZATION_ID" \ -H "X-Forwarded-Project-Id: $TENSORLAKE_PROJECT_ID" ``` You can use the same authenticated routing model for HTTP, gRPC, and WebSocket services: ```bash theme={null} # HTTP curl https://8080-.sandbox.tensorlake.ai/health \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" # gRPC grpcurl \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ 50051-.sandbox.tensorlake.ai:443 \ list # WebSocket wscat \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -c "wss://3000-.sandbox.tensorlake.ai/socket" ``` ```typescript theme={null} const response = await fetch( "https://8080-my-env.sandbox.tensorlake.ai/health", { headers: { Authorization: `Bearer ${process.env.TENSORLAKE_API_KEY}`, }, }, ); console.log(await response.text()); ``` ### Unauthenticated Requests To make a user port public on the 
internet, both of these conditions must be true: * the port is in `exposed_ports` * `allow_unauthenticated_access=true` When those are set, the proxy skips TensorLake auth for that user port. ```bash theme={null} curl -X PATCH https://api.tensorlake.ai/sandboxes/ \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "allow_unauthenticated_access": true, "exposed_ports": [8080] }' ``` After that, requests to the exposed user port can omit auth entirely: ```bash theme={null} curl https://8080-.sandbox.tensorlake.ai/health ``` ```typescript theme={null} const response = await fetch( "https://8080-my-public-sandbox.sandbox.tensorlake.ai/health", ); console.log(await response.text()); ``` Unauthenticated access only applies to user ports. The management API on port `9501` never becomes public. If a named sandbox is suspended, the proxy can auto-resume it when a request arrives for an exposed port. ## Outbound Internet Access By default, sandboxes have outbound internet access enabled. Disable it for untrusted code: ```python theme={null} from tensorlake.sandbox import Sandbox sandbox = Sandbox.create( allow_internet_access=False ) ``` ```typescript theme={null} const sandbox = await Sandbox.create({ allowInternetAccess: false, }); console.log(sandbox.sandboxId); ``` ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "network": {"allow_internet_access": false} }' ``` Not supported in the CLI. In a verified public-cloud test, a sandbox created with `allow_internet_access=False` failed DNS resolution for `https://example.com`, confirming that outbound internet access was disabled. ## Allow Specific Destinations Use `allow_out` when you want a sandbox to reach only selected destinations. 
* `allow_out` rules are evaluated before `deny_out` * when `allow_internet_access=false`, `allow_out` acts as an explicit outbound allowlist * values should be destination IPs or CIDR ranges ```python theme={null} sandbox = Sandbox.create( allow_internet_access=False, allow_out=["10.0.0.0/8", "8.8.8.8"], ) ``` ```typescript theme={null} const sandbox = await Sandbox.create({ allowInternetAccess: false, allowOut: ["10.0.0.0/8", "8.8.8.8"], }); console.log(sandbox.sandboxId); ``` ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "network": { "allow_internet_access": false, "allow_out": ["10.0.0.0/8", "8.8.8.8"] } }' ``` Not supported in the CLI. ## Block Specific Destinations ```python theme={null} sandbox = Sandbox.create( deny_out=["example.com"] ) ``` ```typescript theme={null} const sandbox = await Sandbox.create({ denyOut: ["example.com"], }); console.log(sandbox.sandboxId); ``` ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TENSORLAKE_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "network": {"deny_out": ["example.com"]} }' ``` Not supported in the CLI. In a verified public-cloud request, `deny_out=["example.com"]` blocked `https://example.com` while `https://api.openai.com/v1/models` still returned `401`, confirming outbound connectivity was still available for destinations that were not denied. ## Network Configuration Summary | Parameter | Type | Default | Description | | ------------------------------ | ------------------- | ------- | ------------------------------------------------------------------------- | | `allow_internet_access` | `bool` | `true` | Enable or disable all outbound internet access | | `allow_out` | `list[str]` | `[]` | Explicitly allowed outbound destinations. 
Evaluated before `deny_out` | | `deny_out` | `list[str]` | `[]` | Denied outbound destinations | | `exposed_ports` | `list[int] \| null` | `null` | User ports that the sandbox proxy is allowed to route to | | `allow_unauthenticated_access` | `bool` | `false` | Skip TensorLake auth for exposed user ports. Never applies to port `9501` | # Process Management Source: https://docs.tensorlake.ai/sandboxes/processes Start, monitor, and manage background processes in sandboxes Start long-running services, stream their output, send signals, and manage process lifecycles inside sandboxes. Process operations use the sandbox proxy URL: `https://.sandbox.tensorlake.ai` Named sandboxes can use the sandbox name in place of the ID in the proxy hostname. The process APIs documented here run on the management URL on port `9501`, which always requires authentication. Unauthenticated proxy access applies only to exposed user ports. ## Start a Background Process ```python theme={null} from tensorlake.sandbox import Sandbox sandbox = Sandbox.create() # Start a background process proc = sandbox.start_process("python", ["-m", "http.server", "8080"]) print(f"PID: {proc.pid}") ``` ```typescript theme={null} const proc = await sandbox.startProcess("python", { args: ["-m", "http.server", "8080"], }); console.log(`PID: ${proc.pid}`); ``` ```bash theme={null} # Run a command in the background using shell syntax tl sbx exec bash -c "python -m http.server 8080 &" ``` ```bash theme={null} curl -X POST https://.sandbox.tensorlake.ai/api/v1/processes \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "command": "python", "args": ["-m", "http.server", "8080"] }' ``` **Response** (`201 Created`): ```json theme={null} { "pid": 42, "status": "running", "command": "python", "args": ["-m", "http.server", "8080"], "stdin_writable": false, "started_at": 1710000000000 } ``` **Request options:** ```json theme={null} { "command": "python", "args": ["-m", "http.server", 
"8080"], "env": {"PORT": "8080"}, "working_dir": "/workspace", "stdin_mode": "pipe", "stdout_mode": "capture", "stderr_mode": "capture" } ``` ## List Processes ```python theme={null} # List is available via the process manager procs = sandbox.list_processes() for p in procs: print(f"PID {p.pid}: {p.status}") ``` ```typescript theme={null} const processes = await sandbox.listProcesses(); for (const proc of processes) { console.log(`PID ${proc.pid}: ${proc.status}`); } ``` ```bash theme={null} curl https://.sandbox.tensorlake.ai/api/v1/processes \ -H "Authorization: Bearer $TL_API_KEY" ``` **Response:** ```json theme={null} { "processes": [ { "pid": 42, "status": "running", "command": "python", "args": ["-m", "http.server", "8080"], "stdin_writable": false, "started_at": 1710000000000 } ] } ``` ## Stream Process Output Monitor process output in real time as it produces stdout/stderr: ```python theme={null} sandbox = Sandbox.create() proc = sandbox.start_process("python", ["-c", """ import time for i in range(10): print(f"Processing item {i+1}/10") time.sleep(1) """]) # Stream output line by line for event in sandbox.follow_output(proc.pid): print(event.line, end="") ``` ```typescript theme={null} const proc = await sandbox.startProcess("python", { args: [ "-c", "import time\nfor i in range(3):\n print(f'Processing item {i+1}/3')\n time.sleep(1)", ], }); for await (const event of sandbox.followOutput(proc.pid)) { console.log(event.line); } ``` ```bash theme={null} # Follow combined output via SSE curl -N https://.sandbox.tensorlake.ai/api/v1/processes//output/follow \ -H "Authorization: Bearer $TL_API_KEY" ``` **SSE stream:** ``` event: output data: {"line":"Processing item 1/10\n","timestamp":1710000000000,"stream":"stdout"} event: output data: {"line":"Processing item 2/10\n","timestamp":1710000001000,"stream":"stdout"} event: eof data: {} ``` ## Send Signals Send POSIX signals to running processes: ```python theme={null} import signal sandbox = Sandbox.create() 
proc = sandbox.start_process("python", ["-m", "http.server", "8080"])

# Gracefully stop the process
sandbox.send_signal(proc.pid, signal.SIGTERM)
```

```typescript theme={null}
await sandbox.sendSignal(proc.pid, 15);
```

```bash theme={null}
# Send SIGTERM (15)
curl -X POST https://<sandbox-id>.sandbox.tensorlake.ai/api/v1/processes/<pid>/signal \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"signal": 15}'

# Send SIGKILL (9)
curl -X POST https://<sandbox-id>.sandbox.tensorlake.ai/api/v1/processes/<pid>/signal \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"signal": 9}'
```

## Kill a Process

```python theme={null}
sandbox.send_signal(proc.pid, signal.SIGKILL)
```

```typescript theme={null}
await sandbox.killProcess(proc.pid);
```

```bash theme={null}
curl -X DELETE https://<sandbox-id>.sandbox.tensorlake.ai/api/v1/processes/<pid> \
  -H "Authorization: Bearer $TL_API_KEY"
```

## Write to Stdin

Send input to a running process with stdin in pipe mode:

```typescript theme={null}
import { StdinMode } from "tensorlake";

const proc = await sandbox.startProcess("python", {
  args: ["-i"],
  stdinMode: StdinMode.PIPE,
});

await sandbox.writeStdin(
  proc.pid,
  new TextEncoder().encode("print('hello')\n"),
);
await sandbox.closeStdin(proc.pid);
```

```bash theme={null}
# Start a process with stdin pipe
curl -X POST https://<sandbox-id>.sandbox.tensorlake.ai/api/v1/processes \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"command": "python", "args": ["-i"], "stdin_mode": "pipe"}'

# Write to stdin ($'...' makes the shell send a real newline instead of a literal \n)
curl -X POST https://<sandbox-id>.sandbox.tensorlake.ai/api/v1/processes/<pid>/stdin \
  -H "Authorization: Bearer $TL_API_KEY" \
  -H "Content-Type: application/octet-stream" \
  --data-binary $'print("hello")\n'

# Close stdin
curl -X POST https://<sandbox-id>.sandbox.tensorlake.ai/api/v1/processes/<pid>/stdin/close \
  -H "Authorization: Bearer $TL_API_KEY"
```

## Learn More

Execute commands in sandboxes. Sandbox states, resources, and timeouts.
Control internet access and outbound destinations. # SSH and PTY Sessions Source: https://docs.tensorlake.ai/sandboxes/pty-sessions Reach a running sandbox over standard SSH, or open a programmatic PTY session over WebSocket There are two ways to drive an interactive shell inside a sandbox: * **[SSH](#ssh)** — connect with `ssh`, `scp`, `sftp`, `rsync`, VS Code Remote-SSH, JetBrains Gateway, and any other tool that speaks SSH. Use this when you want a normal terminal, file transfer, or port forwarding. * **[PTY sessions](#pty-sessions)** — create a PTY over HTTPS, attach to it over a WebSocket, and drive terminal I/O programmatically. Use this when you're building a UI or browser app that needs a shell, when you need WebSocket-only access, or when you want a session you can disconnect and reattach by token. ## SSH The Tensorlake sandbox proxy exposes a standard SSH endpoint at `sandbox.tensorlake.ai`. The username is the sandbox id (or name); your laptop's SSH key, registered once with your Tensorlake account, authenticates the connection. ### One-time setup ```bash theme={null} tl login # if you aren't already logged in tl ssh-keys add --name laptop ~/.ssh/id_ed25519.pub tl ssh-keys ls ``` The key is associated with your user across every project you're a member of — there's no per-sandbox or per-project re-registration. ### Connect ```bash theme={null} ssh @sandbox.tensorlake.ai ``` You land in `/home/tl-user` as the `tl-user` POSIX account, which is in the `sudo` group. The sandbox's hostname inside the session is `tl-sbx`. 
To target a specific port (default is the SSH server on 22), prefix the username with the port: ```bash theme={null} ssh 8080-@sandbox.tensorlake.ai ``` ### File transfer `scp`, `sftp`, and `rsync` ride the same connection: ```bash theme={null} # Push a file in scp ./script.py @sandbox.tensorlake.ai:/workspace/ # Pull a directory out scp -r @sandbox.tensorlake.ai:/workspace/results ./ # Interactive sftp browser sftp @sandbox.tensorlake.ai # Mirror with rsync rsync -avz ./src/ @sandbox.tensorlake.ai:/workspace/src/ ``` ### Port forwarding All four standard forwarding modes are supported — TCP and UNIX-socket, each direction. **Local forward (`-L`)** — reach a service running inside the sandbox from your laptop: ```bash theme={null} # Web server on :8000 inside the sandbox → localhost:8888 on your laptop ssh -L 8888:localhost:8000 @sandbox.tensorlake.ai ``` **Dynamic SOCKS (`-D`)** — route arbitrary traffic through the sandbox's network namespace: ```bash theme={null} ssh -D 1080 -N -f @sandbox.tensorlake.ai curl --socks5 localhost:1080 https://example.com ``` **Remote forward (`-R`)** — let processes inside the sandbox reach a service running on your laptop: ```bash theme={null} # Service on your laptop's :9000 → reachable from inside the sandbox at localhost:9000 ssh -R 9000:localhost:9000 @sandbox.tensorlake.ai ``` **UNIX-socket forwards** — same shapes with socket paths instead of ports: ```bash theme={null} ssh -L /tmp/local.sock:/tmp/remote.sock @sandbox.tensorlake.ai ssh -R /tmp/remote.sock:/tmp/local.sock @sandbox.tensorlake.ai ``` ### VS Code Remote-SSH Add an entry to `~/.ssh/config`: ```sshconfig theme={null} Host my-sandbox HostName sandbox.tensorlake.ai User IdentityFile ~/.ssh/id_ed25519 IdentitiesOnly yes ``` Then run **Remote-SSH: Connect to Host…** in VS Code and pick `my-sandbox`. VS Code installs its server inside the sandbox automatically. JetBrains Gateway, Cursor, and any other Remote-SSH client work the same way. 
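If you connect to several sandboxes, the `~/.ssh/config` entry above can be generated rather than typed by hand. The helper below is a convenience sketch, not a Tensorlake tool: `ssh_config_entry` is a hypothetical name, and the only facts it relies on are the ones stated in this section (the host is `sandbox.tensorlake.ai`, the user is the sandbox id or name, and a port can be prefixed to the username).

```python
# Hypothetical helper that renders an ~/.ssh/config entry for a sandbox.
SSH_HOST = "sandbox.tensorlake.ai"


def ssh_config_entry(alias, sandbox, identity_file="~/.ssh/id_ed25519", port=None):
    """Return the text of a Host block; `sandbox` is a sandbox id or name."""
    # A port prefix targets that port instead of the default SSH server.
    user = f"{port}-{sandbox}" if port is not None else sandbox
    lines = [
        f"Host {alias}",
        f"    HostName {SSH_HOST}",
        f"    User {user}",
        f"    IdentityFile {identity_file}",
        "    IdentitiesOnly yes",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(ssh_config_entry("my-sandbox", "my-env"))
```

Append the printed block to `~/.ssh/config` yourself; the sketch deliberately does not write to the file, so an existing configuration cannot be overwritten by accident.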
### Persistent shells `tmux` and `screen` work normally inside the sandbox — useful if you want a session that survives an `ssh` disconnect: ```bash theme={null} ssh @sandbox.tensorlake.ai tmux new -s work # … run things … # detach with Ctrl-b d, exit ssh, reconnect later, then: ssh @sandbox.tensorlake.ai tmux attach -t work ``` ### Troubleshooting When auth fails, the proxy disconnects with one of three specific messages. **Key not registered.** ```text theme={null} your SSH public key is not registered with Tensorlake. Run `tl login` and `tl ssh-keys add ~/.ssh/id_ed25519.pub`. ``` The offered key isn't on your Tensorlake account. Run `tl ssh-keys add ~/.ssh/id_ed25519.pub`. **Sandbox not in any of your projects.** ```text theme={null} sandbox is not present in any of your projects (verify the id with `tl sbx ls -r`). ``` Either a typo in the id, or the sandbox lives in a project you're not a member of. Run `tl sbx ls -r` to see running sandboxes in your active project. **Sandbox is not running.** ```text theme={null} sandbox is currently — resume it (`tl sbx resume `) or create a new one. ``` The sandbox exists in your project but isn't `running`. For named sandboxes, `tl sbx resume `; otherwise create a fresh one. If your client offers multiple keys and one is unregistered, you'll see the static banner followed by `Permission denied (publickey).` because OpenSSH iterates through them. Constrain it to the registered key: ```sshconfig theme={null} Host *.tensorlake.ai IdentitiesOnly yes IdentityFile ~/.ssh/id_ed25519 ``` ### CLI shortcut If you don't need standard `ssh` semantics — e.g. you just want a quick shell without setting up keys — `tl sbx ssh` opens an interactive PTY using the WebSocket flow described below: ```bash theme={null} tl sbx ssh my-sandbox tl sbx ssh my-sandbox --shell /bin/sh ``` `tl sbx ssh` requires an interactive terminal and doesn't support port forwarding or file transfer — use `ssh`, `scp`, etc. for that. 
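For scripts that wrap `ssh`, the three proxy failure messages listed under Troubleshooting can be mapped to their fixes with simple substring matching. This is a sketch under the assumption that the message wording stays as quoted above; `suggest_fix` and its hint strings are illustrative names, not part of the `tl` CLI.

```python
# Illustrative mapping from documented SSH failure messages to the suggested fix.
KNOWN_FAILURES = [
    ("ssh public key is not registered",
     "Register the key: tl ssh-keys add ~/.ssh/id_ed25519.pub"),
    ("sandbox is not present in any of your projects",
     "Check the id and active project: tl sbx ls -r"),
    ("sandbox is currently",
     "Resume the named sandbox with tl sbx resume, or create a new one"),
]


def suggest_fix(ssh_output):
    """Return a remediation hint for known proxy messages, or None if unknown."""
    text = ssh_output.lower()
    for marker, hint in KNOWN_FAILURES:
        if marker in text:
            return hint
    return None


if __name__ == "__main__":
    print(suggest_fix("your SSH public key is not registered with Tensorlake."))
```

A `None` result means the output did not match any documented message, for example a plain `Permission denied (publickey).` from a client offering several keys.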
## PTY sessions Use PTY sessions when you need to drive an interactive shell programmatically — for example a browser-based terminal UI, a recorder, or a remote-control tool — without a real SSH client. The session is created over HTTPS and terminal I/O moves over a WebSocket. The PTY management endpoints live on the sandbox proxy host, not `https://api.tensorlake.ai`: `https://.sandbox.tensorlake.ai` Create, list, get, resize, and kill requests require `Authorization: Bearer $TENSORLAKE_API_KEY`. The WebSocket attach step also requires the per-session PTY token returned from session creation. ### Happy Path 1. Call `createPty()` or `create_pty()` on a connected sandbox client. 2. Tensorlake creates the PTY session, opens the WebSocket, and sends the initial `READY` frame for you. 3. Use the returned handle to send input, resize the terminal, stream output, wait for exit, disconnect, reconnect, or kill the session. 4. If you need to reattach later, call `connectPty()` or `connect_pty()` with the original `sessionId` and `token`. ### High-Level SDK API The connected sandbox client now exposes a high-level PTY handle instead of making you manage WebSocket framing yourself. The handle exposes: * `sendInput()` / `send_input()` to write terminal input * `resize()` to change rows and columns * `wait()` to block until the PTY exits and get the exit code * `disconnect()` to close the current WebSocket without killing the PTY * `connect()` to reattach the same handle later * `kill()` to terminate the PTY session over HTTP * `onData()` / `on_data()` and `onExit()` / `on_exit()` to subscribe to output and exit events Use `tl sbx ssh` when you want an interactive terminal immediately and do not need to manage PTY sessions programmatically: ```bash theme={null} tl sbx ssh my-sandbox ``` ```bash theme={null} tl sbx ssh my-sandbox --shell /bin/sh ``` `tl sbx ssh` uses the PTY API under the hood and requires an interactive terminal. 
For reconnectable sessions or application-managed PTY control, use the Python or TypeScript SDK.

```python theme={null}
from tensorlake.sandbox import Sandbox

sandbox_client = Sandbox.create()

try:
    pty = sandbox_client.create_pty(
        command="/bin/bash",
        args=["-l"],
        env={"TERM": "xterm-256color"},
        working_dir="/workspace",
        cols=80,
        rows=24,
    )

    pty.on_data(lambda data: print(data.decode("utf-8"), end=""))
    pty.on_exit(lambda code: print(f"\nExited: {code}"))

    # The trailing "\n" is a real newline so the shell runs the command;
    # the "\\n" inside the printf argument is interpreted by printf itself.
    pty.send_input("printf 'hello from PTY\\n'; pwd\n")
    pty.resize(120, 40)
    pty.send_input("exit\n")

    exit_code = pty.wait()
    print(f"Final exit code: {exit_code}")
finally:
    sandbox_client.terminate()
```

To reconnect later:

```python theme={null}
pty = sandbox_client.connect_pty(session_id, token)
```

```typescript theme={null}
import { Sandbox } from "tensorlake";

const sandboxClient = await Sandbox.create();

try {
  const pty = await sandboxClient.createPty({
    command: "/bin/bash",
    args: ["-l"],
    env: { TERM: "xterm-256color" },
    workingDir: "/workspace",
    cols: 80,
    rows: 24,
    onData: (data) => process.stdout.write(Buffer.from(data)),
    onExit: (exitCode) => console.log("Exited:", exitCode),
  });

  // The trailing "\n" is a real newline so the shell runs the command;
  // the "\\n" inside the printf argument is interpreted by printf itself.
  await pty.sendInput("printf 'hello from PTY\\n'; pwd\n");
  await pty.resize(120, 40);
  await pty.sendInput("exit\n");

  const exitCode = await pty.wait();
  console.log("Final exit code:", exitCode);
} finally {
  await sandboxClient.terminate();
}
```

To reconnect later, keep `pty.sessionId` and `pty.token` and call:

```typescript theme={null}
const pty = await sandboxClient.connectPty(sessionId, token, {
  onData: (data) => process.stdout.write(Buffer.from(data)),
});
```

`createPty()` / `create_pty()` already open the WebSocket and send `READY`. Use `connectPty()` / `connect_pty()` only when you are reattaching to an existing session.

### Disconnect or kill

`disconnect()` closes the WebSocket but leaves the PTY running, so you can reattach later with `connectPty()` / `connect_pty()`. `kill()` terminates the session over HTTP.
```python theme={null}
# Detach without killing the shell; reconnect later with sandbox_client.connect_pty(...)
pty.disconnect()

# Terminate the session immediately
pty.kill()
```

```typescript theme={null}
// Detach without killing the shell; reconnect later with sandboxClient.connectPty(...)
pty.disconnect();

// Terminate the session immediately
await pty.kill();
```

### Raw HTTP and WebSocket Flow

The raw protocol is small enough that you can drive it yourself from any HTTP client plus any WebSocket client. These calls assume you already have a running sandbox ID or sandbox name.

#### 1. Create the PTY session

```bash theme={null}
curl -sS -X POST https://.sandbox.tensorlake.ai/api/v1/pty \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "command": "/bin/bash",
    "args": ["-l"],
    "env": {"TERM": "xterm-256color"},
    "working_dir": "/workspace",
    "rows": 24,
    "cols": 80
  }'
```

Response:

```json theme={null}
{
  "session_id": "LYtJOrxE9Kz3bphPUDzuX",
  "token": ""
}
```

#### 2. Attach the WebSocket

Open this URL:

```text theme={null}
wss://.sandbox.tensorlake.ai/api/v1/pty//ws
```

Send the PTY token on the upgrade request:

```http theme={null}
X-PTY-Token:
```

If your client cannot set headers, append `?token=` to the WebSocket URL instead.

#### 3. Exchange PTY frames

| Direction        | Bytes                               | Meaning                                                      |
| ---------------- | ----------------------------------- | ------------------------------------------------------------ |
| Client -> server | `02`                                | `READY`: flush any buffered output                           |
| Client -> server | `00` + UTF-8 bytes                  | Send terminal input                                          |
| Client -> server | `01` + `cols` + `rows`              | Resize terminal, with `cols` then `rows` as big-endian `u16` |
| Server -> client | `00` + raw bytes                    | Terminal output                                              |
| Server -> client | `03` + 4-byte big-endian signed int | Process exit code                                            |

Common examples:

| Action           | Bytes               |
| ---------------- | ------------------- |
| Send `READY`     | `02`                |
| Run `pwd\n`      | `00 70 77 64 0a`    |
| Run `exit\n`     | `00 65 78 69 74 0a` |
| Exit code `0`    | `03 00 00 00 00`    |
| Resize to 120x40 | `01 00 78 00 28`    |

#### 4. Close or abort

To close cleanly, write `exit\n` to the shell and wait for the `0x03` exit frame followed by the normal WebSocket close.

To terminate the session immediately:

```bash theme={null}
curl -X DELETE https://.sandbox.tensorlake.ai/api/v1/pty/ \
  -H "Authorization: Bearer $TENSORLAKE_API_KEY"
```

### Notes

* `createPty()` / `create_pty()` send `READY` for you immediately after the socket opens.
* Closing the WebSocket does not kill the PTY session. You can reconnect while the shell is still running.
* Persist the original PTY token if you plan to reconnect. [Get PTY Session](/api-reference/v2/pty/get) and [List PTY Sessions](/api-reference/v2/pty/list) do not return it again.
* PTY sessions with no connected clients are killed after 300 seconds of inactivity.
* You can resize either with the `0x01` WebSocket frame or with [Resize PTY Session](/api-reference/v2/pty/resize).
* For the endpoint-by-endpoint API reference, see [PTY Sessions API](/api-reference/v2/pty/introduction).

## Related Guides

Run one-shot commands without an interactive shell. Long-running background processes managed by the sandbox daemon.
Forward arbitrary TCP ports (Postgres, VNC, custom binary protocols) over an authenticated WebSocket. How long a sandbox lives, and what happens on suspend, resume, and terminate. # Sandboxes Quickstart Source: https://docs.tensorlake.ai/sandboxes/quickstart Install the SDK, authenticate, and run your first sandbox in under five minutes. ## Setup ```bash theme={null} curl -fsSL https://tensorlake.ai/install | sh ``` This installs the `tl` and `tensorlake` CLIs, which you can use to manage sandboxes and other resources from the command line. ```bash theme={null} tl login ``` ```bash theme={null} pip install tensorlake ``` This installs the Python SDK. It also ships the `tl` and `tensorlake` CLIs in your Python toolchain's `bin` directory. Get an API key from the [Tensorlake Dashboard](https://cloud.tensorlake.ai) and set it in your environment: ```bash theme={null} export TENSORLAKE_API_KEY=your-api-key-here ``` ```bash theme={null} npm install tensorlake ``` This installs the TypeScript SDK. It also ships the `tl` and `tensorlake` CLIs in your Node toolchain's `bin` directory. Get an API key from the [Tensorlake Dashboard](https://cloud.tensorlake.ai) and set it in your environment: ```bash theme={null} export TENSORLAKE_API_KEY=your-api-key-here ``` After you run `tl login`, you can manage your sandboxes in the [Tensorlake Dashboard](https://cloud.tensorlake.ai). You can also create API keys there for sandbox connections. See [Authentication](/platform/authentication#api-keys) for the full API key setup flow. ## Run your first sandbox Create a tiny sandbox for a quick task, or provision one with more CPU and memory for heavier workloads. 
```bash theme={null} # Create an ephemeral sandbox (no name — terminates when done, cannot be suspended) tl sbx create # Run code inside the sandbox tl sbx exec python -c 'print("Hello from sandbox")' # Copy files in or out as the sandbox accumulates state tl sbx cp local-file.txt :/workspace/local-file.txt ``` ```python theme={null} from tensorlake.sandbox import Sandbox # Ephemeral sandbox — no name, terminates when done, cannot be suspended sandbox = Sandbox.create() # Run code inside the sandbox result = sandbox.run("python", ["-c", "print('Hello from sandbox')"]) print(result.stdout) # Copy files in or out as the sandbox accumulates state sandbox.write_file("/workspace/local-file.txt", b"example content") file_bytes = bytes(sandbox.read_file("/workspace/local-file.txt")) print(file_bytes.decode("utf-8")) ``` ```typescript theme={null} import { Sandbox } from "tensorlake"; // Ephemeral sandbox — no name, terminates when done, cannot be suspended const sandbox = await Sandbox.create(); console.log(sandbox.sandboxId, sandbox.status); for (const sb of await Sandbox.list()) { console.log(sb.sandboxId, sb.status); } // Run code inside the sandbox const result = await sandbox.run("python", { args: ["-c", "print('Hello from sandbox')"], }); console.log(result.stdout); // Copy files in or out as the sandbox accumulates state await sandbox.writeFile( "/workspace/local-file.txt", new TextEncoder().encode("example content"), ); const fileBytes = await sandbox.readFile("/workspace/local-file.txt"); console.log(new TextDecoder().decode(fileBytes)); ``` #### Configure CPU, Memory, Disk, and Timeout You can specify CPU, memory, disk, and timeout parameters when creating sandboxes. The defaults are 1 CPU, 1024 MB memory, 10 GB disk, and 600 seconds timeout. 
```bash theme={null}
tl sbx create --cpus 2.0 --memory 2048 --disk_mb 51200 --timeout 600
```

```python theme={null}
sandbox = Sandbox.create(cpus=2.0, memory_mb=2048, disk_mb=12000, timeout_secs=600)
```

```typescript theme={null}
const sandbox = await Sandbox.create({
  cpus: 2.0,
  memoryMb: 2048,
  diskMb: 12000,
  timeoutSecs: 600,
});
```

#### Suspend and Resume

Tensorlake sandboxes can be suspended and resumed. A resumed sandbox continues from the exact memory and file system state in which it was suspended. This is useful when you want to preserve the sandbox state without paying for idle compute time.

You must name a sandbox to make it suspendable after its timeout. Sandboxes without a name are ephemeral and are discarded after the timeout.

```bash cli theme={null}
tl sbx suspend
tl sbx resume
```

```python sandbox.py theme={null}
sandbox = Sandbox.create(name="my-agent-env", cpus=2.0, memory_mb=2048)
sandbox.suspend()
sandbox.resume()
```

```typescript sandbox.ts theme={null}
const sandbox = await Sandbox.create({
  name: "my-agent-env",
  cpus: 2.0,
  memoryMb: 2048,
  timeoutSecs: 600,
});
await sandbox.suspend();
await sandbox.resume();
```

#### Sandbox Checkpoints

Checkpoints are point-in-time snapshots of a sandbox that you can use to start new sandboxes.

```bash cli theme={null}
tl sbx checkpoint
```

```python sandbox.py theme={null}
# Save a checkpoint you can return to later
snapshot = sandbox.checkpoint()
print(snapshot.snapshot_id)
```

```typescript sandbox.ts theme={null}
const snapshot = await sandbox.checkpoint();
console.log(snapshot.snapshotId);
```

#### Terminate Sandboxes

```bash cli theme={null}
tl sbx terminate
```

```python sandbox.py theme={null}
sandbox.terminate()
```

```typescript sandbox.ts theme={null}
await sandbox.terminate();
```

## SSH Access

You can also SSH into your sandboxes for an interactive terminal experience.

```bash theme={null}
tl sbx ssh
```

This uses a WebSocket-based PTY session to connect you to the sandbox.
For programmatic access, you can create and control PTY sessions with the [Python and TypeScript SDKs](/sandboxes/pty-sessions).

## Next Steps

Understand the different states and behaviors of sandboxes. Run shell commands and stream output. Use and customize sandbox images for your use case.

# SDK Reference

Source: https://docs.tensorlake.ai/sandboxes/sdk-reference

Sandbox, commands, processes, PTYs, files, snapshots, desktop control, and networking reference

This page is the runtime API surface of the Sandbox SDK in one place. It maps the Python and TypeScript sandbox-management APIs you'll use to create sandboxes, execute work inside them, manage files and processes, and interact with desktop sandboxes. Each detail page linked below expands on the same APIs with longer examples and edge cases.

All method names below use the Python form. The TypeScript SDK mirrors them in camelCase (`start_process` → `startProcess`, `read_file` → `readFile`, `memory_mb` → `memoryMb`, etc.). The same JavaScript runtime API is used from Node.js.

This page focuses on the sandbox runtime SDK surface. Related interfaces documented elsewhere include the CLI and HTTP API, the image-building DSLs in [Sandbox Images](/sandboxes/images), and the browser/VNC integration details in [Computer Use](/sandboxes/computer-use).

Every method below is also available as an async-native variant on `AsyncSandbox` in Python, with the same names and parameters, just `await`ed. See the [Async SDK](/sandboxes/async) page for usage. The TypeScript SDK is already Promise-based, so the methods shown in the TypeScript tabs *are* the async API.

## Sandbox

`Sandbox` is the top-level entry point for managing sandboxes in your namespace. Use `Sandbox.create()` to start a sandbox and get a handle. Use `Sandbox.connect()` to reconnect to an existing sandbox by ID or name.
```python theme={null}
from tensorlake.sandbox import Sandbox

# Authentication comes from `tl login` or TENSORLAKE_API_KEY env var
```

```typescript theme={null}
import { Sandbox } from "tensorlake";
```

See [Authentication](/platform/authentication) for the full auth flow.

### Create

Create a sandbox. Omit `name` for an ephemeral sandbox (cannot be suspended); pass `name` to create a named sandbox that supports suspend/resume. Returns a connected `Sandbox` handle (blocks until the sandbox is `running`).

| Parameter                | Type                | Default          | Description                                                                               |
| ------------------------ | ------------------- | ---------------- | ----------------------------------------------------------------------------------------- |
| `name`                   | `str \| None`       | `None`           | Human-readable name. Required for suspend/resume.                                         |
| `cpus`                   | `float`             | `1.0`            | Number of CPUs to allocate.                                                               |
| `memory_mb`              | `int`               | `1024`           | Memory in megabytes. 1024–8192 MB per CPU core.                                           |
| `timeout_secs`           | `int`               | `600`            | Auto-suspend (named) or auto-terminate (ephemeral) after this many seconds.               |
| `image`                  | `str \| None`       | platform default | Name or ID of a prebuilt [Sandbox Image](/sandboxes/images).                              |
| `snapshot_id`            | `str \| None`       | `None`           | Restore from a snapshot instead of booting a fresh VM.                                    |
| `secret_names`           | `list[str] \| None` | `None`           | Secrets to inject as environment variables. Must be pre-registered with `tl secrets set`. |
| `entrypoint`             | `list[str] \| None` | `None`           | Custom entrypoint command.                                                                |
| `allow_internet_access`  | `bool`              | `True`           | Allow outbound internet traffic (see [Networking](/sandboxes/networking)).                |
| `allow_out` / `deny_out` | `list[str] \| None` | `None`           | Outbound destination allow/deny lists.                                                    |

To expose user ports for inbound traffic, call [`sandbox.update(exposed_ports=..., allow_unauthenticated_access=...)`](#update) after `create()`.
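
The `memory_mb` row in the parameter table gives a range of 1024–8192 MB per CPU core. A client-side pre-check can catch an out-of-range combination before the request is sent. This is a minimal sketch, assuming the bounds scale linearly with `cpus` (the table does not say how fractional CPUs are treated); `validate_resources` is a hypothetical helper, not part of the SDK.

```python
def validate_resources(cpus: float, memory_mb: int) -> None:
    """Raise ValueError if memory_mb is outside 1024-8192 MB per CPU core.

    Hypothetical pre-check; assumes the documented per-core range scales
    linearly with the requested CPU count.
    """
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    low, high = 1024 * cpus, 8192 * cpus
    if not (low <= memory_mb <= high):
        raise ValueError(
            f"memory_mb={memory_mb} is outside {low:.0f}-{high:.0f} MB for cpus={cpus}"
        )


# 2 CPUs allow 2048-16384 MB under this assumption, so 4096 MB passes silently.
validate_resources(cpus=2.0, memory_mb=4096)
```

The server remains the source of truth for limits; a check like this only shortens the feedback loop.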
```python theme={null}
sandbox = Sandbox.create(
    name="my-env",
    cpus=2.0,
    memory_mb=4096,
    timeout_secs=1800,
    secret_names=["OPENAI_API_KEY"],
)
print(sandbox.sandbox_id, sandbox.status)
```

```typescript theme={null}
const sandbox = await Sandbox.create({
  name: "my-env",
  cpus: 2.0,
  memoryMb: 4096,
  timeoutSecs: 1800,
});
```

### Create and connect

`Sandbox.create()` creates a sandbox and returns a live `Sandbox` handle you can immediately run commands against.

```python theme={null}
sandbox = Sandbox.create(cpus=2.0, memory_mb=2048)
try:
    result = sandbox.run("python", ["-c", "print('hello')"])
    print(result.stdout)
finally:
    # Terminate the sandbox when done
    sandbox.terminate()
```

```typescript theme={null}
const sandbox = await Sandbox.create({ cpus: 2.0, memoryMb: 2048 });
try {
  const result = await sandbox.run("python", { args: ["-c", "print('hello')"] });
  console.log(result.stdout);
} finally {
  await sandbox.terminate();
}
```

### Connect

Get a `Sandbox` handle for an existing sandbox (by ID or name) without creating a new one. Use this to rejoin a named sandbox after `resume`, or to operate on a sandbox a different process created.

```python theme={null}
sandbox = Sandbox.connect("my-env")
print(sandbox.sandbox_id)  # always UUID, even if you connected by name
print(sandbox.name)        # "my-env"
```

```typescript theme={null}
const sandbox = await Sandbox.connect("my-env");
```

### List and get

`Sandbox.connect()` attaches to a single sandbox by ID or name. To enumerate all sandboxes in your namespace, use `Sandbox.list()`.
```python theme={null} sandbox = Sandbox.connect("my-env") print(sandbox.sandbox_id, sandbox.name, sandbox.status) for sb in Sandbox.list(): print(sb.sandbox_id, sb.status) ``` ```typescript theme={null} const sandbox = await Sandbox.connect("my-env"); console.log(sandbox.sandboxId, sandbox.name, sandbox.status); ``` ### Update `sandbox.update()` is the unified instance method for changing a sandbox's name, exposed user ports, or unauthenticated-access flag — renaming and port exposure are the same call. Assigning a name to an ephemeral sandbox converts it to a named sandbox that supports suspend/resume. | Parameter | Type | Default | Description | | ------------------------------ | ------------------- | ------- | -------------------------------------------------------------------------------------------------------- | | `name` | `str \| None` | `None` | New name for the sandbox. Naming an ephemeral sandbox makes it non-ephemeral and enables suspend/resume. | | `allow_unauthenticated_access` | `bool \| None` | `None` | Whether exposed user ports accept traffic without TensorLake auth. | | `exposed_ports` | `list[int] \| None` | `None` | User ports routable through the sandbox proxy. Port `9501` is reserved. | ```python theme={null} info = sandbox.update(name="my-env") ``` ```typescript theme={null} const info = await sandbox.update({ name: "my-env" }); ``` If you only have a sandbox ID (for example, from `Sandbox.list()`), connect first and chain `update()`: ```python theme={null} info = Sandbox.connect("sbx-123").update(name="my-env", exposed_ports=[8080]) ``` ```typescript theme={null} const sandbox = await Sandbox.connect("sbx-123"); const info = await sandbox.update({ name: "my-env", exposedPorts: [8080] }); ``` ### Suspend and resume Pause a running named sandbox in place; resume it later under the same ID with its memory, filesystem, and running processes intact. Ephemeral sandboxes return an error on `suspend`. 
```python theme={null} sandbox.suspend() sandbox.resume() ``` ```typescript theme={null} await sandbox.suspend(); await sandbox.resume(); ``` ### Terminate `terminate()` ends the sandbox permanently. `Terminated` is a final state and cannot be reversed. ```python theme={null} sandbox.terminate() ``` ```typescript theme={null} await sandbox.terminate(); ``` See [Lifecycle](/sandboxes/lifecycle) for the full state machine. ### Expose and unexpose ports Route public internet traffic to services listening on user ports inside the sandbox. Requests arrive at `https://-.sandbox.tensorlake.ai`. Port exposure is just a `sandbox.update()` call — pass `exposed_ports` and (optionally) `allow_unauthenticated_access`. Pass `exposed_ports=[]` to remove all exposed ports. ```python theme={null} sandbox.update(exposed_ports=[8080], allow_unauthenticated_access=False) sandbox.update(exposed_ports=[]) # remove all ``` ```typescript theme={null} await sandbox.update({ exposedPorts: [8080], allowUnauthenticatedAccess: false }); await sandbox.update({ exposedPorts: [] }); // remove all ``` See [Networking](/sandboxes/networking) for authenticated vs. unauthenticated access and how clients reach user ports. ### Snapshot and restore Capture a reusable artifact of the sandbox (filesystem + memory + running processes). Restore by passing `snapshot_id` to `Sandbox.create()`. 
```python theme={null} snapshot = sandbox.checkpoint() # Restore into a fresh sandbox restored = Sandbox.create(snapshot_id=snapshot.snapshot_id) # Manage snapshots sandbox.list_snapshots() Sandbox.get_snapshot(snapshot.snapshot_id) Sandbox.delete_snapshot(snapshot.snapshot_id) ``` ```typescript theme={null} const snapshot = await sandbox.checkpoint(); const restored = await Sandbox.create({ snapshotId: snapshot.snapshotId }); await sandbox.listSnapshots(); await Sandbox.getSnapshot(snapshot.snapshotId); await Sandbox.deleteSnapshot(snapshot.snapshotId); ``` Suspend pauses *this* sandbox; snapshot captures a reusable artifact you restore into a *new* sandbox. See [Snapshots](/sandboxes/snapshots). ## Sandbox handle The `Sandbox` object returned by `Sandbox.create()` or `Sandbox.connect()` is how you execute work inside a running sandbox. All methods below target a single live sandbox. | Property (Python) | Property (TypeScript) | Type | Description | | ----------------- | --------------------- | ------------- | --------------------------------------------- | | `sandbox_id` | `sandboxId` | `str` | Server-assigned UUID. | | `name` | `name` | `str \| None` | Human-readable name, or `None` for ephemeral. | ### Run a command `run()` is the short-lived foreground execution primitive: send a command, wait for it to exit, receive captured output. Use it for the common case of "do this one thing and give me the result." ```python theme={null} result = sandbox.run("python", ["-c", "print('hello')"], timeout=30) print(result.stdout) print(result.stderr) print(result.exit_code) ``` ```typescript theme={null} const result = await sandbox.run("python", { args: ["-c", "print('hello')"], timeout: 30, }); console.log(result.stdout, result.stderr, result.exitCode); ``` Pass `env={"KEY": "value"}` (Python) or `env: { KEY: "value" }` (TypeScript) for per-command environment variables. See [Environment Variables](/sandboxes/environment-variables). 
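
This page does not say whether `run()` raises on a non-zero exit code, so callers may prefer an explicit check on the result. The following is a minimal sketch that uses a stand-in dataclass carrying the three documented `CommandResult` fields (`stdout`, `stderr`, `exit_code`) so that it runs without the SDK. `FakeResult` and `require_success` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Stand-in with the same three fields as the SDK's CommandResult."""

    stdout: str
    stderr: str
    exit_code: int


def require_success(result) -> str:
    """Return stdout if the command exited with 0, otherwise raise RuntimeError."""
    if result.exit_code != 0:
        raise RuntimeError(
            f"command failed with exit code {result.exit_code}: {result.stderr.strip()}"
        )
    return result.stdout


print(require_success(FakeResult(stdout="hello\n", stderr="", exit_code=0)), end="")
```

With a live sandbox the same helper would wrap the call directly, for example `require_success(sandbox.run("python", ["-c", "print('hello')"]))`.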
See [Execute Commands](/sandboxes/commands) for streaming, multi-step shell pipelines, and error handling. ### Background processes For long-running or concurrent work, start a process and keep the handle so you can monitor, stream output, and signal it. ```python theme={null} proc = sandbox.start_process("python", ["-m", "http.server", "8080"]) print(proc.pid) for p in sandbox.list_processes(): print(p.pid, p.command, p.status) for event in sandbox.follow_output(proc.pid): print(event.line, end="") import signal sandbox.send_signal(proc.pid, signal.SIGTERM) ``` ```typescript theme={null} const proc = await sandbox.startProcess("python", { args: ["-m", "http.server", "8080"] }); for (const p of await sandbox.listProcesses()) { console.log(p.pid, p.command, p.status); } for await (const event of sandbox.followOutput(proc.pid)) { console.log(event.line); } await sandbox.sendSignal(proc.pid, 15); // SIGTERM await sandbox.killProcess(proc.pid); // SIGKILL ``` ### Writing to stdin Drive a process interactively from code by writing bytes to its stdin, then closing the stream when you're done. ```python theme={null} proc = sandbox.start_process("python", ["-c", "import sys; print(sys.stdin.read())"]) sandbox.write_stdin(proc.pid, b"hello from stdin\n") sandbox.close_stdin(proc.pid) ``` ```typescript theme={null} const proc = await sandbox.startProcess("python", { args: ["-c", "import sys; print(sys.stdin.read())"], }); await sandbox.writeStdin(proc.pid, new TextEncoder().encode("hello from stdin\n")); await sandbox.closeStdin(proc.pid); ``` See [Process Management](/sandboxes/processes) for the full API. ### PTY sessions Open an interactive terminal inside the sandbox. The PTY is created over HTTPS; terminal I/O then moves over a WebSocket attached to the session. ```python theme={null} pty = sandbox.create_pty( command="bash", cols=120, rows=40, env={"PS1": "sandbox$ "}, ) # `pty` is a connected Pty handle — see PTY Sessions for send_input / resize / wait. 
# Reconnect later from session_id and token: reattached = sandbox.connect_pty(pty.session_id, pty.token) ``` ```typescript theme={null} const pty = await sandbox.createPty({ command: "bash", cols: 120, rows: 40, }); ``` See [PTY Sessions](/sandboxes/pty-sessions) for the wire protocol, reconnect flow, and resize frames. ### File operations Copy data in and out of the sandbox filesystem without spawning a shell. ```python theme={null} sandbox.write_file("/workspace/data.csv", b"name,score\nAlice,95\n") content = bytes(sandbox.read_file("/workspace/data.csv")) print(content.decode("utf-8")) for entry in sandbox.list_directory("/workspace"): print(entry.name, entry.is_dir, entry.size) sandbox.delete_file("/workspace/data.csv") ``` ```typescript theme={null} await sandbox.writeFile( "/workspace/data.csv", new TextEncoder().encode("name,score\nAlice,95\n"), ); const bytes = await sandbox.readFile("/workspace/data.csv"); console.log(new TextDecoder().decode(bytes)); const entries = await sandbox.listDirectory("/workspace"); await sandbox.deleteFile("/workspace/data.csv"); ``` See [File Operations](/sandboxes/file-operations) for binary uploads, recursive listings, and move/copy patterns. ### Desktop sessions Desktop sandboxes expose a higher-level remote-control handle on top of the normal `Sandbox` APIs. Use this with `tensorlake/ubuntu-vnc` to capture screenshots and drive mouse and keyboard input through the authenticated sandbox proxy. 
```python theme={null} with sandbox.connect_desktop(password="tensorlake") as desktop: png_bytes = desktop.screenshot() desktop.move_mouse(640, 400) desktop.click() desktop.type_text("hello from desktop") print(desktop.width, desktop.height) ``` ```typescript theme={null} const desktop = await sandbox.connectDesktop({ password: "tensorlake" }); try { const pngBytes = await desktop.screenshot(); await desktop.moveMouse(640, 400); await desktop.click(); await desktop.typeText("hello from desktop"); console.log(desktop.width, desktop.height); } finally { await desktop.close(); } ``` Common desktop methods are: | Python | TypeScript | Description | | ------------------------------- | ----------------------------- | ------------------------------------------------ | | `screenshot()` | `screenshot()` | Capture the current desktop as PNG bytes. | | `move_mouse(x, y)` | `moveMouse(x, y)` | Move the pointer to absolute screen coordinates. | | `click()` | `click()` | Click the current pointer location. | | `double_click()` | `doubleClick()` | Double-click the current pointer location. | | `scroll_up()` / `scroll_down()` | `scrollUp()` / `scrollDown()` | Scroll vertically. | | `press(keys)` | `press(keys)` | Send a key or key chord. | | `type_text(text)` | `typeText(text)` | Type text into the active window. | | `width`, `height` | `width`, `height` | Desktop resolution. | See [Computer Use](/sandboxes/computer-use) for reconnect patterns, coordinate workflows, and noVNC integration. ### Terminate Shortcut for `sandbox.terminate()` that uses the handle you already have. ```python theme={null} sandbox.terminate() ``` ```typescript theme={null} await sandbox.terminate(); ``` ## Data models The SDK returns typed objects for every API call. The fields below are the ones you'll read most often. Field names shown in Python `snake_case`; TypeScript uses the `camelCase` equivalent. 
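
The naming rule is mechanical, so a field name seen in one SDK can be translated to the other by hand or in code. As an illustration only, here is a small helper (hypothetical, not part of either SDK) that maps a Python field name to the TypeScript spelling used on this page.

```python
def to_camel_case(snake_name: str) -> str:
    """Convert a snake_case name to camelCase, e.g. sandbox_id to sandboxId."""
    head, *rest = snake_name.split("_")
    return head + "".join(part.capitalize() for part in rest)


for name in ("sandbox_id", "memory_mb", "exit_code", "name"):
    print(f"{name} -> {to_camel_case(name)}")
```

Single-word names such as `name` and `status` come out unchanged, which matches the property tables above.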
### SandboxInfo Returned by `Sandbox.create()`, `Sandbox.connect()`, `client.list()`, and the suspend/resume/expose calls. | Field | Type | Description | | ------------------------------ | ------------------------ | ------------------------------------------------------------------------------------- | | `sandbox_id` | `str` | Server UUID. | | `name` | `str \| None` | Name, or `None` for ephemeral. | | `namespace` | `str` | Namespace owning the sandbox. | | `status` | `SandboxStatus` | One of `pending`, `running`, `snapshotting`, `suspending`, `suspended`, `terminated`. | | `image` | `str \| None` | Sandbox image in use. | | `resources` | `ContainerResourcesInfo` | `.cpus: float`, `.memory_mb: int`. | | `secret_names` | `list[str]` | Injected secrets. | | `timeout_secs` | `int \| None` | Auto-suspend/terminate timeout. | | `exposed_ports` | `list[int] \| None` | Public-routed user ports. | | `allow_unauthenticated_access` | `bool` | Whether exposed ports accept unauthenticated traffic. | | `entrypoint` | `list[str] \| None` | Custom entrypoint command. | | `network` | `NetworkConfig \| None` | Outbound network configuration (`allow_internet_access`, `allow_out`, `deny_out`). | | `created_at` | `datetime \| None` | Creation timestamp. | | `terminated_at` | `datetime \| None` | Termination timestamp (if terminated). | ### Sandbox Returned by `Sandbox.create()` and `Sandbox.connect()`. Exposes the runtime methods documented above plus: | Property | Type | Description | | -------------------------- | ------------- | -------------------------------- | | `sandbox_id` / `sandboxId` | `str` | UUID, even if connected by name. | | `name` | `str \| None` | Human-readable name. | ### ProcessInfo Returned by `start_process()` and `list_processes()`. | Field | Type | Description | | ------------ | ------------------ | -------------------------------------------------------- | | `pid` | `int` | Process ID inside the sandbox. | | `command` | `str` | The executed command. 
| | `args` | `list[str]` | Command arguments. | | `status` | `ProcessStatus` | One of `running`, `exited`, `signaled`. | | `exit_code` | `int \| None` | Exit code once the process has exited. | | `signal` | `int \| None` | Signal number if the process was terminated by a signal. | | `started_at` | `datetime` | When the process started. | | `ended_at` | `datetime \| None` | When the process ended. | ### CommandResult Returned by `run()`. | Field | Type | Description | | ----------- | ----- | ------------------------- | | `stdout` | `str` | Captured standard output. | | `stderr` | `str` | Captured standard error. | | `exit_code` | `int` | Process exit code. | ### SnapshotInfo Returned by the snapshot APIs. | Field | Type | Description | | ------------- | ------------------ | ----------------------------------------------- | | `snapshot_id` | `str` | Server-assigned ID, use to restore. | | `sandbox_id` | `str` | Source sandbox this snapshot was captured from. | | `status` | `SnapshotStatus` | One of `in_progress`, `completed`, `failed`. | | `size_bytes` | `int \| None` | Size of the snapshot artifact. | | `created_at` | `datetime \| None` | Capture timestamp. | ## Learn more State machine, suspend/resume, timeouts. Run commands, capture output, stream. Background processes, stdin, signals. Interactive shells over WebSocket. Read, write, list, delete. Capture and restore full VM state. Prebuild dependencies into reusable images. Desktop sessions, screenshots, mouse, keyboard. Expose user ports to the internet. Per-command and per-PTY environment. # Skills in Sandboxes Source: https://docs.tensorlake.ai/sandboxes/skills-in-sandboxes Pre-load TensorLake skill files inside sandbox images so coding agents auto-discover them at startup. Coding agents discover skill files by scanning specific directories at startup. 
By placing TensorLake skill files in the right paths inside a sandbox image, any agent running in the sandbox will automatically pick them up without manual installation. ## How Agents Discover Skills Each coding agent scans a different directory for skill files: | Agent | Skill File | Discovery Path | | -------------- | ----------- | ---------------------------------------------------------------- | | Claude Code | `SKILL.md` | `~/.claude/skills//SKILL.md` | | OpenAI Codex | `AGENTS.md` | `~/.agents/skills//SKILL.md` or `AGENTS.md` in working dir | | Google ADK | `SKILL.md` | Loaded explicitly via `load_skill_from_dir()` | | Cursor | `.mdc` | `.cursor/rules/*.mdc` | | Cline | `.md` | `.clinerules/` | | Windsurf | `.md` | `.windsurf/rules/*.md` | | GitHub Copilot | `.md` | `.github/copilot-instructions.md` | To make skills work inside a sandbox, bake the skill files into the image at the paths the agent expects. ## Create a Skills Image ### Any Agent The simplest way to install skills for any agent is with the [skills](https://skills.sh) CLI. It places skill files in the correct discovery paths for Claude Code, Codex, Cursor, Windsurf, and other supported agents. 
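
Whichever install route is used, it can help to confirm that files actually landed on a discovery path. The sketch below checks only the Claude Code location from the table above (a `SKILL.md` inside a per-skill directory under `~/.claude/skills/`). `find_claude_skills` is a hypothetical helper, and the directory name each installer chooses for the skill is not assumed.

```python
from pathlib import Path


def find_claude_skills(home: Path) -> list[str]:
    """Return names of directories under <home>/.claude/skills/ that contain a SKILL.md."""
    skills_root = home / ".claude" / "skills"
    if not skills_root.is_dir():
        return []
    return sorted(p.parent.name for p in skills_root.glob("*/SKILL.md") if p.is_file())


# On a machine or image with no skills installed this prints an empty list.
print(find_claude_skills(Path.home()))
```

Run inside a sandbox built from a skills image, this gives a quick answer to "did the agent get anything to discover" without starting the agent itself.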
```python theme={null} from tensorlake import Image image = ( Image(name="with-skills", base_image="tensorlake/ubuntu-systemd") .run("apt-get update && apt-get install -y nodejs npm python3 python3-pip") .run("npm install -g skills") .run("skills add tensorlakeai/tensorlake-skills --all -y --copy") .run("python3 -m pip install --break-system-packages tensorlake") ) ``` ```typescript theme={null} import { Image } from "tensorlake"; const image = new Image({ name: "with-skills", baseImage: "tensorlake/ubuntu-systemd", }) .run("apt-get update && apt-get install -y nodejs npm python3 python3-pip") .run("npm install -g skills") .run("skills add tensorlakeai/tensorlake-skills --all -y --copy") .run("python3 -m pip install --break-system-packages tensorlake"); ``` ```dockerfile Dockerfile theme={null} FROM tensorlake/ubuntu-systemd RUN apt-get update && apt-get install -y nodejs npm python3 python3-pip RUN npm install -g skills RUN skills add tensorlakeai/tensorlake-skills --all -y --copy RUN python3 -m pip install --break-system-packages tensorlake ``` * `--all` installs skills to all detected agents * `-y` skips confirmation prompts for non-interactive use * `--copy` copies files instead of symlinking, which is more reliable inside containers ### Claude Code Only If you only need Claude Code support, copy the skill into `~/.claude/skills/` inside the image: ```python theme={null} from tensorlake import Image image = ( Image(name="claude-code-skills", base_image="tensorlake/ubuntu-systemd") .run("apt-get update && apt-get install -y git python3 python3-pip") .run("git clone https://github.com/tensorlakeai/tensorlake-skills /tmp/tensorlake-skills") .run( "mkdir -p /root/.claude/skills/tensorlake && " "cp -r /tmp/tensorlake-skills/SKILL.md /tmp/tensorlake-skills/references " "/root/.claude/skills/tensorlake/" ) .run("rm -rf /tmp/tensorlake-skills") .run("python3 -m pip install --break-system-packages tensorlake") ) ``` ```typescript theme={null} import { Image } from 
"tensorlake"; const image = new Image({ name: "claude-code-skills", baseImage: "tensorlake/ubuntu-systemd", }) .run("apt-get update && apt-get install -y git python3 python3-pip") .run("git clone https://github.com/tensorlakeai/tensorlake-skills /tmp/tensorlake-skills") .run( "mkdir -p /root/.claude/skills/tensorlake && " + "cp -r /tmp/tensorlake-skills/SKILL.md /tmp/tensorlake-skills/references " + "/root/.claude/skills/tensorlake/", ) .run("rm -rf /tmp/tensorlake-skills") .run("python3 -m pip install --break-system-packages tensorlake"); ``` ```dockerfile Dockerfile theme={null} FROM tensorlake/ubuntu-systemd RUN apt-get update && apt-get install -y git python3 python3-pip RUN git clone https://github.com/tensorlakeai/tensorlake-skills /tmp/tensorlake-skills RUN mkdir -p /root/.claude/skills/tensorlake \ && cp -r /tmp/tensorlake-skills/SKILL.md /tmp/tensorlake-skills/references /root/.claude/skills/tensorlake/ RUN rm -rf /tmp/tensorlake-skills RUN python3 -m pip install --break-system-packages tensorlake ``` Claude Code scans `~/.claude/skills/` at startup. The `SKILL.md` file and `references/` directory at `/root/.claude/skills/tensorlake/` are auto-discovered. 
## Create a Reusable Sandbox Image Register the image once, then launch new sandboxes with the skills already baked in: ```bash theme={null} tl sbx image create ./Dockerfile --registered-name claude-code-skills ``` ```bash theme={null} npx tl sbx image create ./Dockerfile --registered-name claude-code-skills ``` ```typescript theme={null} import { createSandboxImage, Image } from "tensorlake"; const image = new Image({ name: "claude-code-skills", baseImage: "tensorlake/ubuntu-systemd", }) .run("apt-get update && apt-get install -y nodejs npm python3 python3-pip") .run("npm install -g skills") .run("skills add tensorlakeai/tensorlake-skills --all -y --copy") .run("python3 -m pip install --break-system-packages tensorlake"); await createSandboxImage(image, { contextDir: ".", }); ``` Then launch sandboxes from that image: ```bash theme={null} tl sbx create --image claude-code-skills ``` ## Use with the SDK You can also install skills programmatically each time you create a sandbox: ```python theme={null} from tensorlake.sandbox import Sandbox sandbox = Sandbox.create() sandbox.run("bash", ["-c", "apt-get update && apt-get install -y nodejs npm"]) sandbox.run("bash", ["-c", "npm install -g skills"]) sandbox.run("bash", ["-c", "skills add tensorlakeai/tensorlake-skills --all -y --copy"]) result = sandbox.run( "find", ["/", "-name", "SKILL.md", "-type", "f", "-not", "-path", "*/node_modules/*"], ) print(result.stdout) ``` ```typescript theme={null} import { Sandbox } from "tensorlake"; const sandbox = await Sandbox.create(); try { await sandbox.run("bash", { args: ["-lc", "apt-get update && apt-get install -y nodejs npm"], }); await sandbox.run("bash", { args: ["-lc", "npm install -g skills"], }); await sandbox.run("bash", { args: [ "-lc", "skills add tensorlakeai/tensorlake-skills --all -y --copy", ], }); const result = await sandbox.run("find", { args: [ "/", "-name", "SKILL.md", "-type", "f", "-not", "-path", "*/node_modules/*", ], }); console.log(result.stdout); } 
finally {
  await sandbox.terminate();
}
```

For sandboxes you create frequently, use the [sandbox image approach](#create-a-reusable-sandbox-image) to avoid reinstalling skills on every launch.

## What Gets Included

The skill repo contains SDK references that the agent uses as context:

```text theme={null}
tensorlake-skills/
├── AGENTS.md                  # Skill definition (OpenAI Codex)
├── SKILL.md                   # Skill definition (Claude Code, Google ADK)
└── references/
    ├── applications_sdk.md    # Orchestrate API reference
    ├── sandbox_sdk.md         # Sandbox API reference
    ├── documentai_sdk.md      # DocumentAI API reference
    └── integrations.md        # Integration patterns
```

## See Also

Learn about TensorLake skills and how to install them for your coding agent.

Create reusable sandbox images from Dockerfiles or the TensorLake image DSLs.

# Snapshots

Source: https://docs.tensorlake.ai/sandboxes/snapshots

Save and restore sandbox filesystem, memory, and running processes

Tensorlake supports two snapshot types:

* `filesystem`: captures filesystem state and restores with a cold boot.
* `memory`: captures filesystem, memory, and running process state and restores with a warm start.

When you do not specify a type, Tensorlake uses `filesystem` by default.

Snapshots are independent of sandbox [lifecycle](/sandboxes/lifecycle): once captured, the artifact persists after the source sandbox is terminated. This means you can snapshot an ephemeral sandbox before it ends, then restore that state into a new sandbox much later.

If you only need to pause a single sandbox in place rather than produce a reusable artifact, use [suspend/resume](/sandboxes/lifecycle#suspend-and-resume) instead.
## Creating a Snapshot ```bash theme={null} tl sbx checkpoint tl sbx checkpoint --checkpoint-type filesystem tl sbx checkpoint --checkpoint-type memory tl sbx checkpoint --timeout 600 ``` ```python theme={null} from tensorlake.sandbox import CheckpointType, Sandbox sandbox = Sandbox.create() sandbox.run("pip", ["install", "numpy", "pandas", "--user", "--break-system-packages"]) sandbox.run("python", ["-c", "import pandas as pd; pd.DataFrame({'a': [1,2,3]}).to_csv('/data/output.csv')"]) # Default (server-side default, currently `filesystem`). snapshot = sandbox.checkpoint() # Explicitly request a memory checkpoint (warm-restore VM memory + processes). snapshot = sandbox.checkpoint(checkpoint_type=CheckpointType.MEMORY) # Filesystem-only checkpoint (cold-boot from snapshot tarball). snapshot = sandbox.checkpoint(checkpoint_type=CheckpointType.FILESYSTEM) print(snapshot.snapshot_id) ``` ```typescript theme={null} import { Sandbox, type CheckpointType } from "tensorlake"; const sandbox = await Sandbox.create(); await sandbox.run("pip", { args: [ "install", "numpy", "pandas", "--user", "--break-system-packages", ], }); await sandbox.run("python", { args: [ "-c", "import pandas as pd; pd.DataFrame({'a': [1,2,3]}).to_csv('/data/output.csv')", ], }); // Default (server-side default, currently `filesystem`). let snapshot = await sandbox.checkpoint(); // Explicitly request a memory checkpoint (warm-restore VM memory + processes). snapshot = await sandbox.checkpoint({ checkpointType: "memory" }); // Filesystem-only checkpoint (cold-boot from snapshot tarball). snapshot = await sandbox.checkpoint({ checkpointType: "filesystem" }); console.log(snapshot?.snapshotId); ``` ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes//snapshot \ -H "Authorization: Bearer $TL_API_KEY" ``` ## Restoring from a Snapshot Create a new sandbox from a snapshot. If the snapshot is filesystem (default), the new sandbox restores the captured filesystem. 
You can change sandbox resources (CPU, memory, disk) for the new sandbox. If the snapshot is memory, the new sandbox restores filesystem, memory, and running processes exactly as they were. Image, resources (CPUs, memory), entrypoint, and secrets come from the snapshot and cannot be changed at restore time. If you need different resources, create a fresh sandbox instead of restoring. For filesystem snapshots, you can pass `--disk_mb` / `resources.disk_mb` at restore time to grow root disk size (growth-only). ```bash theme={null} tl sbx create --snapshot ``` ```python theme={null} restored = Sandbox.create(snapshot_id=snapshot.snapshot_id) result = restored.run("cat", ["/data/output.csv"]) print(result.stdout) ``` ```typescript theme={null} const restored = await Sandbox.create({ snapshotId: snapshot.snapshotId, }); const result = await restored.run("cat", { args: ["/data/output.csv"], }); console.log(result.stdout); ``` ```bash theme={null} curl -X POST https://api.tensorlake.ai/sandboxes \ -H "Authorization: Bearer $TL_API_KEY" \ -H "Content-Type: application/json" \ -d '{"snapshot_id": ""}' ``` ## Clone a Sandbox `tl sbx clone` creates a `memory` checkpoint and immediately boots a new sandbox from it, so the clone warm-restores filesystem, memory, and running processes from the source. The intermediate snapshot persists — it shows up in `tl sbx checkpoint ls` and counts toward storage until you delete it with `tl sbx checkpoint rm `. ```bash theme={null} tl sbx clone tl sbx clone --timeout 600 ``` Not supported in the Python SDK. Not yet exposed in the TypeScript SDK. Use the CLI for one-step clone workflows, or call `checkpoint()` followed by `Sandbox.create()` explicitly. Not supported in the HTTP API. 
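From the Python SDK, the two-step equivalent of a clone can be sketched as follows. This is not an SDK feature: `clone_via_checkpoint` is a hypothetical helper that relies only on the `checkpoint()` and `Sandbox.create(snapshot_id=...)` calls shown earlier on this page. It takes the sandbox class as a parameter so the logic can be exercised without network access.

```python
from typing import Any


def clone_via_checkpoint(source: Any, sandbox_cls: Any, checkpoint_type: Any) -> tuple[Any, str]:
    """Checkpoint `source`, then boot a new sandbox from that snapshot.

    Returns the new sandbox and the intermediate snapshot ID so the caller
    can delete the snapshot once it is no longer needed.
    """
    snapshot = source.checkpoint(checkpoint_type=checkpoint_type)
    clone = sandbox_cls.create(snapshot_id=snapshot.snapshot_id)
    return clone, snapshot.snapshot_id


# Usage with the real SDK (requires TENSORLAKE_API_KEY), shown as comments
# because it needs a live sandbox:
#   from tensorlake.sandbox import CheckpointType, Sandbox
#   clone, snapshot_id = clone_via_checkpoint(sandbox, Sandbox, CheckpointType.MEMORY)
#   Sandbox.delete_snapshot(snapshot_id)  # the intermediate snapshot persists until removed
```

Returning the snapshot ID alongside the clone keeps the cleanup step explicit, since the intermediate snapshot counts toward storage until it is deleted.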
## Managing Snapshots ### List Snapshots ```bash theme={null} tl sbx checkpoint ls ``` ```python theme={null} snapshots = sandbox.list_snapshots() for s in snapshots: print( f"{s.snapshot_id} | {s.status.value} | {s.snapshot_type.value if s.snapshot_type else '-'} | {s.size_bytes} bytes" ) ``` ```typescript theme={null} const snapshots = await sandbox.listSnapshots(); for (const snapshot of snapshots) { console.log( `${snapshot.snapshotId} | ${snapshot.status} | ${snapshot.snapshotType ?? "-"} | ${snapshot.sizeBytes ?? 0} bytes`, ); } ``` ```bash theme={null} curl https://api.tensorlake.ai/snapshots \ -H "Authorization: Bearer $TL_API_KEY" ``` ### Get Snapshot Details ```typescript theme={null} const info = await Sandbox.getSnapshot("snapshot-id"); console.log(info.status, info.snapshotType, info.baseImage, info.sizeBytes); ``` ```python theme={null} info = Sandbox.get_snapshot("snapshot_id") print(info.status, info.snapshot_type) ``` ```bash theme={null} curl https://api.tensorlake.ai/snapshots/ \ -H "Authorization: Bearer $TL_API_KEY" ``` Not supported in the CLI. 
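Because snapshots count toward storage until they are deleted, it can help to total their sizes by type before deciding what to remove. The sketch below assumes each item returned by `list_snapshots()` exposes the `snapshot_type` and `size_bytes` attributes used in the listing example above, and that `size_bytes` may be missing. The function name `storage_by_type` is hypothetical.

```python
from collections import defaultdict
from typing import Any, Iterable


def storage_by_type(snapshots: Iterable[Any]) -> dict[str, int]:
    """Sum snapshot sizes in bytes, grouped by snapshot type ('-' when unknown)."""
    totals: dict[str, int] = defaultdict(int)
    for snap in snapshots:
        kind = snap.snapshot_type.value if snap.snapshot_type else "-"
        totals[kind] += snap.size_bytes or 0  # treat a missing size as 0
    return dict(totals)


# Usage with the SDK: storage_by_type(sandbox.list_snapshots())
```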
### Delete a Snapshot ```bash theme={null} tl sbx checkpoint rm ``` ```typescript theme={null} await Sandbox.deleteSnapshot("snapshot-id"); ``` ```python theme={null} Sandbox.delete_snapshot("snapshot_id") ``` ```bash theme={null} curl -X DELETE https://api.tensorlake.ai/snapshots/ \ -H "Authorization: Bearer $TL_API_KEY" ``` ## `checkpoint()` Parameters | Parameter | Type | Default | Description | | ----------------- | ------------------------------------------------------------------------------------- | --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- | | `sandbox_id` | `str` | — | ID of the running sandbox to snapshot | | `checkpoint_type` | `CheckpointType` (Python) / `CheckpointType` (TypeScript: `"memory" \| "filesystem"`) | server default (currently `filesystem`) | Checkpoint type. `FILESYSTEM` captures filesystem-only state; `MEMORY` captures filesystem + VM memory + running processes. | | `timeout` | `float` | `300` | Max seconds to wait for completion | | `poll_interval` | `float` | `1.0` | Seconds between status polls | `CheckpointType` is exported from `tensorlake.sandbox` (Python) and `tensorlake` (TypeScript). The TypeScript field on `CheckpointOptions` is `checkpointType`. ## Related Guides Create, suspend, resume, and terminate sandboxes — the operations snapshots build on. Build reusable base images. Pair with snapshots for warm starts on top of pinned dependencies. Snapshot a warmed-up `ubuntu-vnc` desktop and fork parallel agent sessions. Snapshot a Chrome profile so parallel browser agents start with cookies and history already in place. # Tool Calls Source: https://docs.tensorlake.ai/sandboxes/tool-calls Expose Tensorlake sandboxes as tools to your LLM agents — give models a fresh, isolated execution environment for code, shell, and file operations. ## How it works The pattern is the same regardless of which LLM you use: 1. 
   **Define a `run_code` tool**: tell the LLM it can call a function that accepts a code string and returns stdout/stderr.
2. **Create a sandbox once** and keep it alive across the agent loop: reusing one sandbox preserves state between tool calls (installed packages, files written to disk). Each `run_code` call is a fresh Python process, so variables and imports must be redefined in each call.
3. **Execute the tool call**: when the LLM invokes `run_code`, pass the code into `sandbox.run()` and return the result.
4. **Clean up**: terminate the sandbox when the agent session ends.

***

## TypeScript SDK starter

If your agent loop already runs in Node.js, keep one connected sandbox alive for the session and wrap it as a tool:

```typescript theme={null}
import { Sandbox } from "tensorlake";

const sandbox = await Sandbox.create({
  cpus: 1.0,
  memoryMb: 1024,
  timeoutSecs: 600,
  allowInternetAccess: false,
});

// Run a Python snippet in the shared sandbox and format the result for the LLM.
async function runCode(code: string): Promise<string> {
  const result = await sandbox.run("python", {
    args: ["-c", code],
  });

  const chunks = [result.stdout.trim()];
  if (result.stderr.trim()) chunks.push(`[stderr]\n${result.stderr.trim()}`);
  if (result.exitCode !== 0) chunks.push(`[exit code: ${result.exitCode}]`);
  return chunks.filter(Boolean).join("\n\n") || "(no output)";
}

try {
  const output = await runCode(
    "import statistics\nnums = [4, 8, 15, 16, 23, 42]\nprint(statistics.mean(nums))",
  );
  console.log(output);
} finally {
  // Always terminate the sandbox when the session ends.
  await sandbox.terminate();
}
```

Use this `runCode()` helper as the implementation behind your OpenAI or Anthropic tool/function call.

***

## Claude (Anthropic SDK)

### Prerequisites

```bash theme={null}
pip install tensorlake anthropic
```

### Full example

```python theme={null}
import anthropic
from tensorlake.sandbox import Sandbox

SYSTEM_PROMPT = """You are a data analysis assistant. You have access to a Python sandbox.
Use the run_code tool whenever you need to compute something, analyze data, or verify your reasoning with code.
Each run_code call is a fresh Python process — include all imports and redefine any variables you need. Installed packages and files written to disk persist across calls.""" # Define the tool schema Claude will use RUN_CODE_TOOL = { "name": "run_code", "description": ( "Execute Python code in a secure sandbox. " "Each call is a fresh Python process — include all imports and redefine any variables you need. " "Installed packages and files written to disk persist across calls. " "Returns stdout and stderr." ), "input_schema": { "type": "object", "properties": { "code": { "type": "string", "description": "Python code to execute.", } }, "required": ["code"], }, } def run_agent(user_message: str) -> str: anthropic_client = anthropic.Anthropic() # Create one sandbox for the entire agent session sandbox = Sandbox.create( cpus=1.0, memory_mb=1024, timeout_secs=600, allow_internet_access=False, # lock down network for untrusted code ) messages = [{"role": "user", "content": user_message}] try: while True: response = anthropic_client.messages.create( model="claude-opus-4-5", max_tokens=4096, system=SYSTEM_PROMPT, tools=[RUN_CODE_TOOL], messages=messages, ) # Append assistant's response to history messages.append({"role": "assistant", "content": response.content}) # If no tool use, we're done if response.stop_reason == "end_turn": # Extract the final text response for block in response.content: if hasattr(block, "text"): return block.text # Process all tool calls in this response tool_results = [] for block in response.content: if block.type != "tool_use": continue code = block.input["code"] print(f"\n[sandbox] executing:\n{code}\n") result = sandbox.run("python", ["-c", code]) output = result.stdout or "" if result.stderr: output += f"\n[stderr]\n{result.stderr}" if result.exit_code != 0: output += f"\n[exit code: {result.exit_code}]" print(f"[sandbox] output:\n{output}") tool_results.append({ "type": "tool_result", "tool_use_id": block.id, "content": output or "(no 
output)", }) # Feed all results back to Claude in one message messages.append({"role": "user", "content": tool_results}) finally: sandbox.close() # always clean up if __name__ == "__main__": answer = run_agent( "I have a list of numbers: [4, 8, 15, 16, 23, 42]. " "What is the mean, median, and standard deviation? " "Also plot a histogram and tell me if the distribution looks normal." ) print("\n=== Final answer ===") print(answer) ``` ### What happens step by step | Step | What Claude does | What your code does | | ---- | --------------------------------------------------------------- | ------------------------------------------------------ | | 1 | Reads the user question | Sends to Claude with `run_code` tool available | | 2 | Decides it needs to compute something, emits a `tool_use` block | Detects `stop_reason == "tool_use"` | | 3 | — | Calls `sandbox.run()` with the generated code | | 4 | — | Appends result as `tool_result` and calls Claude again | | 5 | Reads the output, continues reasoning or calls tool again | Loops until `stop_reason == "end_turn"` | | 6 | Returns final text answer | Returns it to the caller, closes sandbox | *** ## OpenAI (function calling) ### Prerequisites ```bash theme={null} pip install tensorlake openai ``` ### Full example ```python theme={null} import json import openai from tensorlake.sandbox import Sandbox SYSTEM_PROMPT = """You are a data analysis assistant with access to a Python sandbox. Always use the run_code function to execute code — never compute or guess answers yourself. Each run_code call is a fresh Python process — include all imports and redefine any variables you need. Installed packages and files written to disk are available across calls.""" # Define the function schema OpenAI will use RUN_CODE_FUNCTION = { "type": "function", "function": { "name": "run_code", "description": ( "Execute Python code in a secure isolated sandbox. 
" "Each call runs in a fresh Python process — include all imports and redefine " "any variables you need. Installed packages and files written to disk persist across calls. " "Returns stdout and stderr as a string." ), "parameters": { "type": "object", "properties": { "code": { "type": "string", "description": "Python code to execute.", } }, "required": ["code"], }, }, } def run_agent(user_message: str) -> str: openai_client = openai.OpenAI() # Create one sandbox for the entire agent session sandbox = Sandbox.create( cpus=1.0, memory_mb=1024, timeout_secs=600, allow_internet_access=False, ) messages = [ {"role": "system", "content": SYSTEM_PROMPT}, {"role": "user", "content": user_message}, ] try: while True: response = openai_client.chat.completions.create( model="gpt-5.1", messages=messages, tools=[RUN_CODE_FUNCTION], tool_choice="auto", ) msg = response.choices[0].message messages.append(msg) # No tool calls — agent is done if not msg.tool_calls: return msg.content # Process all tool calls for tool_call in msg.tool_calls: args = json.loads(tool_call.function.arguments) code = args["code"] print(f"\n[sandbox] executing:\n{code}\n") result = sandbox.run("python", ["-c", code]) output = result.stdout or "" if result.stderr: output += f"\n[stderr]\n{result.stderr}" if result.exit_code != 0: output += f"\n[exit code: {result.exit_code}]" print(f"[sandbox] output:\n{output}") messages.append({ "role": "tool", "tool_call_id": tool_call.id, "content": output or "(no output)", }) finally: sandbox.close() if __name__ == "__main__": answer = run_agent( "Using only Python stdlib (random, datetime), generate sample monthly revenue and cost " "data for the last 12 months (seed 42). Print a table showing each month, profit, and " "profit margin. Then print which month had the best and worst margin." 
) print("\n=== Final answer ===") print(answer) ``` *** ## Using OpenAI Agents SDK If you are using the newer [OpenAI Agents SDK](https://openai.github.io/openai-agents-python/), you can wrap the sandbox as a `FunctionTool` directly: ### Prerequisites ```bash theme={null} pip install tensorlake openai-agents ``` ### Full example ```python theme={null} from agents import Agent, ModelSettings, Runner, function_tool from tensorlake.sandbox import Sandbox # Keep one sandbox alive for the agent's lifetime _sandbox = Sandbox.create( cpus=1.0, memory_mb=1024, timeout_secs=600, allow_internet_access=False, ) @function_tool def run_code(code: str) -> str: """Execute Python code in a secure sandbox. Each call is a fresh Python process — include all imports and redefine any variables you need. Installed packages and files written to disk persist across calls.""" result = _sandbox.run("python", ["-c", code]) output = result.stdout or "" if result.stderr: output += f"\n[stderr]\n{result.stderr}" if result.exit_code != 0: output += f"\n[exit code: {result.exit_code}]" return output or "(no output)" agent = Agent( name="Data Analyst", instructions="You are a data analysis assistant. Always use run_code to compute answers — never calculate or guess yourself.", tools=[run_code], model_settings=ModelSettings(tool_choice="required"), ) result = Runner.run_sync( agent, "Write and run Python code to calculate the compound annual growth rate " "if revenue grew from $1M to $3.2M over 5 years. Print the result." ) print(result.final_output) _sandbox.close() ``` *** ## Production tips ### Reuse one sandbox per session, not per call Creating a new sandbox on every tool call adds cold-start latency and loses any state (installed packages, files written to disk) from prior calls. Create the sandbox before the agent loop and close it afterward. ```python theme={null} # ✅ Create once, reuse across all tool calls sandbox = Sandbox.create(...) 
try:
    run_agent_loop(sandbox)
finally:
    sandbox.close()

# ❌ Don't do this: it loses state and adds latency on every call
def run_code_tool(code):
    sandbox = Sandbox.create()  # new sandbox every call
    return sandbox.run("python", ["-c", code])
```

### Pre-install dependencies with Snapshots

If your agent always needs the same libraries (pandas, numpy, matplotlib, etc.), install them once, snapshot the sandbox, and boot future sandboxes from that snapshot. This avoids re-running `pip install` on every session.

```python theme={null}
# One-time setup: build a snapshot with dependencies pre-installed
setup_sandbox = Sandbox.create()
setup_sandbox.run("pip", ["install", "pandas", "numpy", "matplotlib", "scipy"])
snapshot = setup_sandbox.checkpoint()  # checkpoint the sandbox that has the packages
setup_sandbox.close()
print(f"Snapshot ready: {snapshot.snapshot_id}")

# Every future session starts with packages already installed
sandbox = Sandbox.create(snapshot_id=snapshot.snapshot_id)
```

See the [Snapshots guide](/sandboxes/snapshots) for details.

### Let the agent install packages on demand

If your agent may need arbitrary or unknown packages, tell it in the system prompt that it can install them with pip. Because installed packages persist on disk across tool calls, a package installed in one call is available in all subsequent calls.

```python theme={null}
SYSTEM_PROMPT = """You are a data analysis assistant. You have access to a Python sandbox.
Use the run_code tool whenever you need to compute something, analyze data, or verify your reasoning with code.
Each run_code call is a fresh Python process: include all imports and redefine any variables you need.
Installed packages and files written to disk persist across calls.
If a required package is missing, install it before using it:
import subprocess; subprocess.run(["pip", "install", "--break-system-packages", ""], check=True)"""
```

Use this approach when dependencies are unpredictable.
For a known set of dependencies, pre-installing via [Snapshots](#pre-install-dependencies-with-snapshots) is faster since it avoids repeating `pip install` on every session. ### Lock down the network By default, sandboxes have internet access. For agents executing untrusted or LLM-generated code, disable it: ```python theme={null} sandbox = Sandbox.create(allow_internet_access=False) ``` If the agent needs outbound access, keep internet enabled or use `deny_out` to block destinations you know should be unreachable. See the [Networking guide](/sandboxes/networking). *** ## What to build next * **[Data Analysis](/sandboxes/data-analysis)** — spin up sandboxes with data science libraries to analyze complex datasets and stream results back in real time. * **[Snapshots](/sandboxes/snapshots)** — pre-install dependencies so agent sessions start instantly. # Local Tunnels Source: https://docs.tensorlake.ai/sandboxes/tunnels Forward a local TCP port to a port inside a sandbox over an authenticated WebSocket. Tunnels give your machine a `localhost:` that maps directly to a port inside a running sandbox. The relay travels over a WebSocket through the sandbox proxy, so your TensorLake credentials authenticate every connection — you do **not** need to add the port to `exposed_ports` or make it public. **Reach for a tunnel when you need a raw TCP connection into a sandbox.** The sandbox proxy at `*.sandbox.tensorlake.ai` only speaks HTTP, WebSocket, gRPC, and SSH. Anything else — VNC's RFB protocol, the Postgres wire protocol, MySQL, Redis's RESP, MongoDB, custom binary protocols — needs a tunnel because the proxy cannot frame those bytes for you. You can also use a tunnel for HTTP/WS/gRPC traffic when you would rather keep the port private to your laptop than expose it through the public sandbox URL. 
Driving Chrome's DevTools Protocol from your laptop is a typical case: CDP is WebSocket, so the proxy could carry it, but a tunnel keeps the debugger reachable only at `127.0.0.1` and skips the per-port `exposed_ports` configuration. Tunnels and exposed ports are independent. A tunnel works even when the port is not in `exposed_ports`. ## Open a Tunnel The simplest way is the CLI. Pick any local port (defaults to the same number as the remote port) and leave the command running. ```bash theme={null} tl sbx tunnel 5901 --listen-port 15901 ``` The command keeps running and prints connection events. Press `Ctrl+C` to stop the tunnel; the sandbox keeps running. Without `--listen-port`, the local port matches the remote port: ```bash theme={null} tl sbx tunnel 9222 ``` ```javascript theme={null} import { Sandbox } from "tensorlake"; const sandbox = await Sandbox.connect({ sandboxId: "" }); const tunnel = await sandbox.createTunnel(5901, { localPort: 15901 }); const { host, port } = tunnel.address(); console.log(`tunnel listening on ${host}:${port}`); // ... use it ... await tunnel.close(); ``` `createTunnel(remotePort, options)` returns a `TcpTunnel`. Useful options: * `localHost` — bind interface (defaults to `127.0.0.1`). * `localPort` — local port number; pass `0` for an ephemeral port and read it back from `tunnel.address()`. * `connectTimeout` — seconds to wait for each WebSocket connection (defaults to `10`). The Python SDK does not yet ship a native tunnel helper. Drive the CLI from a subprocess: ```python theme={null} import subprocess tunnel = subprocess.Popen( ["tl", "sbx", "tunnel", "", "9222", "-l", "9222"], ) try: # Use http://127.0.0.1:9222 from your code. ... finally: tunnel.terminate() tunnel.wait() ``` The local listener is per-process. If you want two clients to share one tunnel, run the CLI once and connect both clients to the same `localhost:`. ## How It Works The CLI and the TypeScript SDK both speak the same protocol: 1. 
A WebSocket is opened to the sandbox proxy, carrying your API key, PAT, or session cookie. 2. The proxy authorizes the request, finds the dataplane that owns the sandbox, and pipes bytes to `127.0.0.1:` inside the sandbox. 3. The local TCP listener accepts a connection from your client and relays bytes both ways across the WebSocket. Because every byte rides on an authenticated WebSocket, the remote port stays private to your account — there is no public hostname for it. ## Common Patterns | Inside the sandbox | Local port | Client | | --------------------------------- | ---------- | ------------------------------------------------------------------- | | `5901` (TigerVNC) | `15901` | macOS Screen Sharing, RealVNC, TigerVNC, Remmina | | `9222` (Chrome DevTools Protocol) | `9222` | Playwright `connect_over_cdp`, Puppeteer, `chrome-remote-interface` | | `5432` (Postgres) | `5432` | `psql`, DBeaver, TablePlus | | `3000` (dev server) | `3000` | Browser at `http://localhost:3000` | Tunneling is also the easiest way to reach the sandbox's authenticated [Computer Use](/sandboxes/computer-use) VNC port from a desktop client without polling screenshots, or to point Playwright at sandboxed Chrome — see [Drive Chrome over CDP](/sandboxes/chrome-cdp) for the full walkthrough. ## Troubleshooting * **`Connection refused` from the local end.** The remote service inside the sandbox is not listening on the port yet. Tail its logs (`tl sbx exec -- bash -lc 'ss -ltnp'`) and retry. * **`502 Bad Gateway` during handshake.** The sandbox has not finished booting the workload. Wait a few seconds and reconnect; the proxy returns 502 when nothing is listening on the remote port. * **WebSocket auth failures.** Confirm `tl whoami` shows the right organization and project, or that `TENSORLAKE_API_KEY` is set in the shell that runs the CLI. 
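When the tunnel is started from a subprocess, the local listener may not be bound yet by the time client code runs. A small standard-library helper can poll until the local port accepts connections. This is a sketch with a hypothetical `wait_for_port` name. It only confirms that something is accepting TCP connections locally, and does not prove that the service inside the sandbox is up.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a listener is bound; close it right away.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # refused or timed out: retry until the deadline
    return False


# Usage after launching `tl sbx tunnel ...` in a subprocess:
#   if not wait_for_port("127.0.0.1", 9222):
#       raise RuntimeError("tunnel listener did not come up")
```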
# Parsing ACORD Forms for Insurance Workflows Source: https://docs.tensorlake.ai/use-cases/insurance-financial-services/acord-form-processing Extract consistent, schema-based data from ACORD insurance forms with Tensorlake — layout-aware parsing, structured extraction, and citations back to the original document. ACORD forms are a standardized format for exchanging insurance data—but they’re complex, dense, and often vary slightly in layout across carriers. Manually extracting structured data from these forms is tedious, error-prone, and hard to scale. Tensorlake parses ACORD forms and extracts consistent, schema-based outputs that you can write directly into your internal systems. Teams use Tensorlake to capture coverage details from ACORD submissions and certificates, store them in databases, and automate customer success and audit workflows. Tensorlake combines layout-aware parsing, schema-driven extraction, and citations so you can build reliable ingestion pipelines with traceability back to the original document. Try parsing a sample ACORD form with this Colab Notebook *** ## What you can extract from ACORD forms ACORD workflows vary by line of business and carrier, but most ingestion pipelines depend on the same core signals. 
Tensorlake can extract: * **Policy and submission identifiers**: policy number, agency, carrier, producer, NAIC, submission or reference IDs * **Named insured and parties**: named insured, additional insured, certificate holder, mailing addresses * **Dates**: effective date, expiration date, retroactive date, policy term * **Coverages and limits**: coverage type, limit amounts, deductibles, occurrence versus claims-made flags, umbrella and excess limits * **Locations and operations**: location addresses, classification codes, description of operations * **Signatures and attestations**: presence of signatures and signature blocks when applicable The output is schema-validated JSON that you can treat as a stable interface for downstream automation. *** ## Common ACORD use cases * **Coverage ingestion to database**: extract coverages and limits into structured tables that power account servicing, renewals, and risk review. * **Audit and compliance**: standardize coverage evidence across customers and carriers, then flag missing fields or expired dates for follow-up. * **Customer success automation**: route requests to the right team based on coverage type, limits, and special conditions. * **Workflow triggers**: create underwriting tasks, request missing information, or open tickets when required fields are missing. * **Downstream integrations**: feed policy administration systems, CRMs, data warehouses, and reporting pipelines with normalized coverage data. *** ## Citations for every extracted field Structured extraction is more useful when it is explainable. Tensorlake provides layout information for page elements, including page numbers and bounding boxes. 
This makes it easy to: * **Show where a field came from** in a review UI * **Create audit trails** for coverage values and attestations * **Debug extraction quality** by jumping directly to the source region on the page If you are building an agentic workflow, citations let you keep deterministic control flow and still provide human-verifiable evidence.
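To illustrate how citations can support review and audit flags, the sketch below pairs each extracted value with the page and bounding box it came from, then flags missing required fields and an expired policy. Everything here is an assumption made for illustration: the `Citation` and `ExtractedField` classes, the field names, and the bounding-box layout are hypothetical and do not describe Tensorlake's actual output format.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Citation:
    page_number: int
    bbox: tuple[float, float, float, float]  # hypothetical (x0, y0, x1, y1) layout


@dataclass
class ExtractedField:
    name: str
    value: Optional[str]
    citation: Optional[Citation] = None


def review_flags(fields: list[ExtractedField], required: set[str], today: date) -> list[str]:
    """Return flags for missing required fields and an expired expiration date."""
    by_name = {f.name: f for f in fields}
    flags = []
    for name in sorted(required):
        found = by_name.get(name)
        if found is None or not found.value:
            flags.append(f"missing: {name}")
    # Assumes dates were normalized to ISO format (YYYY-MM-DD) during extraction.
    exp = by_name.get("expiration_date")
    if exp and exp.value and date.fromisoformat(exp.value) < today:
        where = f" (page {exp.citation.page_number})" if exp.citation else ""
        flags.append(f"expired: expiration_date={exp.value}{where}")
    return flags
```

Keeping the citation next to each value lets a review UI jump to the source region for any flagged field, while the flagging logic itself stays deterministic.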