Run parallel data analysis, model training, and benchmarking tasks in secure, isolated sandbox environments. Each sandbox can have its own dependencies and resource limits, allowing you to compare different models or process large datasets concurrently. This example demonstrates how to benchmark several scikit-learn classification models in parallel by running each in its own sandbox.
The following script benchmarks five different scikit-learn models on the Iris dataset. Each model is trained and evaluated in a separate, concurrent sandbox.
```python
import asyncio
import json

from dotenv import load_dotenv

load_dotenv()

from tensorlake.sandbox import Sandbox


async def run_model_benchmark(model_name, sklearn_path):
    """
    Runs a model benchmark inside an isolated sandbox.
    Returns a dict with model name and accuracy.
    """
    module_path, class_name = sklearn_path.rsplit('.', 1)
    code = f"""
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from {module_path} import {class_name}
import json

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.3)
model = {class_name}()
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
print(json.dumps({{"model": "{model_name}", "accuracy": score}}))
"""

    def _sync_benchmark():
        sandbox = Sandbox.create()
        print(f"🚀 Sandbox started for {model_name}...")
        # Install scikit-learn and its dependencies in the sandbox
        sandbox.run("pip", ["install", "--user", "--break-system-packages", "numpy", "scikit-learn"])
        # Run the benchmark code in the sandbox
        result = sandbox.run("python", ["-c", code])
        output_data = json.loads(result.stdout.strip())
        return output_data

    return await asyncio.to_thread(_sync_benchmark)


async def main():
    models_to_test: dict[str, str] = {
        "Random Forest": "sklearn.ensemble.RandomForestClassifier",
        "SVM": "sklearn.svm.SVC",
        "Logistic Regression": "sklearn.linear_model.LogisticRegression",
        "Decision Tree": "sklearn.tree.DecisionTreeClassifier",
        "KNN": "sklearn.neighbors.KNeighborsClassifier",
    }

    tasks = [run_model_benchmark(name, path) for name, path in models_to_test.items()]

    print("Gathering results from all sandboxes...\n")
    results = await asyncio.gather(*tasks)

    print("--- Benchmark Results ---")
    for r in results:
        print(f"{r['model']:<20}: {r['accuracy']:.4f}")


if __name__ == "__main__":
    asyncio.run(main())
```
The script orchestrates the parallel execution of model benchmarks using Python's `asyncio` library.

1. **Parallel Execution**: The `main` function defines a dictionary of models to test and creates a list of asynchronous tasks using a list comprehension. `asyncio.gather` runs all of these tasks concurrently.
2. **Sandbox Task**: The `run_model_benchmark` function is responsible for a single benchmark. For each model, it:
   - Creates a new, isolated sandbox.
   - Installs the necessary Python libraries (`numpy` and `scikit-learn`) inside the sandbox using `sandbox.run()`. The `--break-system-packages` flag is used to comply with PEP 668 in newer Python environments.
   - Executes a Python script that trains the model on the Iris dataset and calculates its accuracy.
   - Prints the result as a JSON string to standard output.
   - Captures the stdout, parses the JSON, and returns the result.
3. **Aggregate Results**: Once all sandboxes have completed their tasks, `asyncio.gather` returns a list of all the results, which are then printed to the console.
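The orchestration pattern above can be seen in miniature without the sandbox service. The sketch below is a local stand-in: it substitutes a plain subprocess for `sandbox.run()`, but uses the same `asyncio.to_thread` + `asyncio.gather` structure and the same JSON-over-stdout convention for returning results:

```python
import asyncio
import json
import subprocess
import sys


def run_snippet(code: str) -> dict:
    # Stand-in for sandbox.run("python", ["-c", code]): run the snippet locally
    result = subprocess.run([sys.executable, "-c", code], capture_output=True, text=True)
    # Parse the single JSON line the snippet printed to stdout
    return json.loads(result.stdout.strip())


async def main() -> list[dict]:
    # Each snippet prints its result as JSON, just like the benchmark code
    snippets = [
        f'import json; print(json.dumps({{"task": {i}, "square": {i * i}}}))'
        for i in range(3)
    ]
    # Offload each blocking subprocess call to a thread, then run them concurrently
    tasks = [asyncio.to_thread(run_snippet, code) for code in snippets]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    for r in asyncio.run(main()):
        print(r)
```

`asyncio.gather` preserves the order of the tasks it was given, which is why the results can be matched back to the models without extra bookkeeping.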
This example uses the python-dotenv library to load your Tensorlake API key from a .env file. Create a file named .env in your project root and add your key:
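A minimal `.env` file might look like the following. The variable name `TENSORLAKE_API_KEY` is an assumption here; check the Tensorlake documentation or dashboard for the exact name the SDK expects:

```shell
TENSORLAKE_API_KEY=your_api_key_here
```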
The example installs dependencies every time a sandbox is created. This is simple but inefficient for repeated runs. To significantly speed up your workflow, you can use Snapshots.
1. Create a “base” sandbox and install all your dependencies.
2. Create a snapshot of that sandbox.
3. Start new sandboxes from the snapshot ID. The new sandboxes will have all the dependencies pre-installed, saving you valuable setup time.
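The steps above might look roughly like the sketch below. This is illustrative pseudocode, not a verified implementation: the method names `base.snapshot()` and the `snapshot_id` argument to `Sandbox.create()` are assumptions — consult the Snapshots reference for the actual API:

```python
from tensorlake.sandbox import Sandbox

# 1. Create a "base" sandbox and install dependencies once
base = Sandbox.create()
base.run("pip", ["install", "--user", "--break-system-packages", "numpy", "scikit-learn"])

# 2. Snapshot it (hypothetical method name -- see the Snapshots docs)
snapshot_id = base.snapshot()

# 3. Start new sandboxes from the snapshot: dependencies are pre-installed,
#    so the pip install step is skipped entirely
sandbox = Sandbox.create(snapshot_id=snapshot_id)
result = sandbox.run("python", ["-c", "import sklearn; print(sklearn.__version__)"])
print(result.stdout)
```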