Turn images into structured data

With Tensorlake Applications, you can build programs that process multimodal data and store structured outputs for further analysis and consumption. This use case shows how to build an application that extracts driver’s license data from images using an OpenAI model. It also shows how to use custom Tensorlake images and secrets to tailor your application’s environment: we’ll build a custom image with additional dependencies and use a secret to access OpenAI’s API.

structured_extraction.py

import os
import base64

import requests
from pydantic import BaseModel
from tensorlake.applications import application, function, Image

# Install dependencies for your application
image = Image().run("pip install openai pydantic requests")
# List of secrets required by the application.
# The application expects to find these secrets in the environment.
secrets = ["OPENAI_API_KEY"]

class DrivingLicense(BaseModel):
    name: str
    date_of_birth: str
    address: str
    license_number: str
    license_expiration_date: str

@application()
@function(image=image, secrets=secrets)
def extract_driving_license_data(url: str) -> DrivingLicense:
    from openai import OpenAI

    # Download image from URL; the timeout keeps a stalled server from hanging the function
    response = requests.get(url, timeout=30)
    response.raise_for_status()

    # Encode image as base64
    image_base64 = base64.b64encode(response.content).decode("utf-8")

    # Determine image format from the Content-Type header, defaulting to JPEG
    content_type = response.headers.get("content-type", "").lower()
    image_format = "png" if "png" in content_type else "jpeg"

    # The API key comes from the OPENAI_API_KEY secret injected into the environment
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

    response = client.beta.chat.completions.parse(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "Extract the personal information from the driving license image.",
            },
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/{image_format};base64,{image_base64}"
                        },
                    }
                ],
            },
        ],
        response_format=DrivingLicense,
    )
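    dl = response.choices[0].message.parsed
    if dl is None:
        # parsed is None when the model refuses or returns nothing matching the schema
        raise ValueError("The model did not return a parsable driving license")
    return dl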
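
The function above falls back to JPEG whenever the Content-Type header is inconclusive. As a minimal sketch of a more robust check, the helper below also inspects the file signature of the downloaded bytes. `detect_image_format` is a hypothetical helper for illustration, not part of Tensorlake or OpenAI, and it only covers the two formats handled above.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file
JPEG_SIGNATURE = b"\xff\xd8\xff"      # JPEG start-of-image marker plus next marker prefix


def detect_image_format(data: bytes, content_type: str = "") -> str:
    """Guess "png" or "jpeg" from the header, then from the bytes; default to "jpeg"."""
    content_type = content_type.lower()
    if "png" in content_type:
        return "png"
    if "jpeg" in content_type or "jpg" in content_type:
        return "jpeg"
    # Header missing or generic (e.g. application/octet-stream): inspect the bytes.
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return "jpeg"  # same default as the function above
```

Inside the function, this could be called as `detect_image_format(response.content, response.headers.get("content-type", ""))` right after the download.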

Building custom images lets you install almost anything you need in your function’s environment. See the Dependency management guide to learn more.

Before deploying this application on Tensorlake, make sure the function can access the secret API key. You can do this by running the tensorlake secrets command in your terminal:

tensorlake secrets set OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>

If you want to learn more about how we manage secrets, take a look at the Secrets management guide.
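
Because the application expects to find its secrets in the environment, a secret that was never set may only show up as an error once the OpenAI client is created or called. A minimal sketch of failing early with a clearer message follows. `require_secret` is a hypothetical helper written for this example, not a Tensorlake API.

```python
import os


def require_secret(name: str) -> str:
    """Return the secret's value from the environment, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Secret {name!r} is not set; run `tensorlake secrets set {name}=<value>` first"
        )
    return value
```

In the function body, the client could then be created with `OpenAI(api_key=require_secret("OPENAI_API_KEY"))`.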

Now that you’ve defined your custom image and set the secret API key, you can deploy the application:

tensorlake deploy structured_extraction.py

You should see Tensorlake stream the build logs while your image is being built. Once the image is built, you can invoke the application as before.

curl -N -X POST https://api.tensorlake.ai/applications/extract_driving_license_data \
-H "Authorization: Bearer $TENSORLAKE_API_KEY" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d '"https://tlake.link/dl"'
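
If you prefer to invoke the application from Python, the sketch below builds the same POST request as the curl command using only the standard library. The function names `build_invoke_request` and `invoke` are hypothetical, the 60-second timeout is an arbitrary choice, and the response is returned as raw text because its exact shape is not described here.

```python
import json
import urllib.request

# Base URL taken from the curl command above.
API_BASE = "https://api.tensorlake.ai/applications"


def build_invoke_request(app_name: str, api_key: str, image_url: str) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    return urllib.request.Request(
        url=f"{API_BASE}/{app_name}",
        # The body is the JSON-encoded function argument (here, a JSON string).
        data=json.dumps(image_url).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def invoke(app_name: str, api_key: str, image_url: str, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_invoke_request(app_name, api_key, image_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Call `invoke(...)` with the application name used in the curl URL, the value of `TENSORLAKE_API_KEY`, and the image URL.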
