The Tensorlake Playground is the easiest way to explore and understand how our document parsing framework works, even before writing any code.

For this example, we’ll upload a real estate purchase agreement and configure a parsing pipeline that extracts key information like the buyer and seller names, signature status, and signature dates.

Try it out yourself in the Tensorlake Playground by following these steps:

  1. Document Upload: Upload the PDF document you want to parse. If you don’t have a document with signatures to test, you can download the one from this blog post here.

  2. Structured Extraction: Define what you want to extract using the schema builder or JSON. This includes specifying the fields for the buyer and seller names, signature status, and signature dates. Here is the schema for this example:

    real-estate-schema.json
    {
        "properties": {
            "buyer": {
                "properties": {
                    "buyer_name": { "type": "string" },
                    "buyer_signature_date": { "type": "string" },
                    "buyer_signed": { "type": "boolean" }
                },
                "type": "object"
            },
            "seller": {
                "properties": {
                    "seller_name": { "type": "string" },
                    "seller_signature_date": { "type": "string" },
                    "seller_signed": { "type": "boolean" }
                },
                "type": "object"
            }
        },
        "title": "real_estate_purchase_agreement",
        "type": "object"
    }

  3. Extraction and Parsing Options: Configure options like page range, chunking strategy, table summarization, and signature detection. For this example, we skip OCR because the goal is to detect the handwritten signatures themselves, not just convert them to text. We’ll also parse all pages and enable signature detection.

  4. As output, you will get a markdown version of your document, a markdown preview, a JSON file with all of the extracted data, and a JSON file with the structured data you defined.

    Specifically for this example, we are focused on the JSON file with the extracted data, including the buyer and seller names, signature dates, and whether they signed the agreement:

    output.json
    {
        "pages": [
            {
                "page_number": 0,
                "json_result": {
                    "buyer": {
                        "buyer_name": "Nova Ellison",
                        "buyer_signature_date": "September 10, 2025",
                        "buyer_signed": true
                    },
                    "seller": {
                        "seller_name": "Juno Vega",
                        "seller_signature_date": "September 10, 2025",
                        "seller_signed": true
                    }
                }
            }
        ]
    }

    With this output, you get precise, automated extraction with very little code, and you can extend the workflow with webhooks, LangGraph agents, or even downstream CRM updates.
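
To give a sense of how the structured output might be consumed downstream, here is a minimal Python sketch. It assumes the JSON has the same shape as the output.json sample above (a `pages` list whose entries carry a `json_result`). The helper names `signature_summary` and `fully_executed` are hypothetical and are not part of any Tensorlake SDK.

```python
import json

# Sample payload copied from the output.json example above.
# In practice you would read this from the file you download.
OUTPUT_JSON = """
{
    "pages": [
        {
            "page_number": 0,
            "json_result": {
                "buyer": {
                    "buyer_name": "Nova Ellison",
                    "buyer_signature_date": "September 10, 2025",
                    "buyer_signed": true
                },
                "seller": {
                    "seller_name": "Juno Vega",
                    "seller_signature_date": "September 10, 2025",
                    "seller_signed": true
                }
            }
        }
    ]
}
"""

PARTIES = ("buyer", "seller")


def signature_summary(result: dict) -> dict:
    """Collect name, signed flag, and signature date for each party (hypothetical helper)."""
    summary = {}
    for page in result.get("pages", []):
        fields = page.get("json_result", {})
        for party in PARTIES:
            data = fields.get(party)
            if not data:
                continue  # this page has no data for the party
            summary[party] = {
                "name": data.get(f"{party}_name"),
                "signed": bool(data.get(f"{party}_signed")),
                "date": data.get(f"{party}_signature_date"),
            }
    return summary


def fully_executed(summary: dict) -> bool:
    """Return True only if every party is present and marked as signed."""
    return all(p in summary and summary[p]["signed"] for p in PARTIES)


summary = signature_summary(json.loads(OUTPUT_JSON))
for party, info in summary.items():
    status = "signed" if info["signed"] else "NOT signed"
    print(f"{party}: {info['name']} - {status} ({info['date']})")
print("Fully executed:", fully_executed(summary))
```

A check like `fully_executed` is the kind of condition you could use to decide whether to trigger a webhook or a CRM update, although how that is wired up depends on your own stack.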