Introduction - Tensorlake

Document Ingestion APIs

The Tensorlake Parse API accepts documents, slides, spreadsheets, images, and other unstructured data, and returns structured data. The response contains chunks of text, split according to the chunking strategy you specify, and structured data that matches the schemas you provide.

Interact with the APIs

Get your API keys and start interacting with the APIs.

The Parse API offers the following capabilities:

Parse Documents

Markdown conversion

Converts a document into markdown chunks. The primary use case is to simplify post-processing of documents so that they can be indexed and summarized. The response also includes the document layout, with bounding boxes for page elements such as tables, charts, and images.
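As a rough illustration of working with layout information, the sketch below filters page elements by type and computes the area of each bounding box. The payload shape is an assumption made for this example: the keys "type", "page", and "bbox", and the [x1, y1, x2, y2] coordinate order, are hypothetical and may not match the actual response format.

```python
from typing import Dict, List

# Hypothetical layout payload. The keys "type", "page", and "bbox" and the
# [x1, y1, x2, y2] coordinate order are assumptions for illustration only.
layout: List[Dict] = [
    {"type": "text", "page": 1, "bbox": [50, 40, 550, 120]},
    {"type": "table", "page": 1, "bbox": [50, 140, 550, 400]},
    {"type": "figure", "page": 2, "bbox": [80, 60, 520, 380]},
    {"type": "table", "page": 3, "bbox": [50, 100, 550, 300]},
]


def elements_of_type(elements: List[Dict], kind: str) -> List[Dict]:
    """Return the layout elements whose "type" equals `kind`, in order."""
    return [e for e in elements if e["type"] == kind]


def bbox_area(bbox: List[float]) -> float:
    """Area of an [x1, y1, x2, y2] box; degenerate boxes have area 0."""
    x1, y1, x2, y2 = bbox
    return max(0, x2 - x1) * max(0, y2 - y1)


if __name__ == "__main__":
    for table in elements_of_type(layout, "table"):
        print(table["page"], bbox_area(table["bbox"]))
```

Filtering by element type in this way makes it possible to send tables and figures to separate post-processing steps while the text chunks go to an index.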

Summarization of tables and charts

With the enrichment options of the Parse API, you can create detailed summaries of tables and charts from your documents, spreadsheets, and presentations.

Structured Extraction

Provide a JSON schema to the Parse API, and it returns JSON objects that match the schema. This is useful for extracting structured data from business documents such as invoices, receipts, and contracts.
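To make the idea of a schema concrete, here is a minimal sketch of an invoice schema together with a small local check that an extracted object has the required fields and expected types. The field names are hypothetical, and the checker is a simplified stand-in written for this example; it is not part of the Parse API and covers only a small subset of JSON Schema.

```python
# Hypothetical invoice schema; the field names are illustrative assumptions.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["invoice_number", "total"],
}

# Mapping from JSON Schema type names to Python types (subset only).
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_against_schema(obj: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the object passed.

    Only "required" and top-level property types are checked.
    """
    problems = []
    for name in schema.get("required", []):
        if name not in obj:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in obj:
            continue
        value = obj[name]
        expected = _JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        is_stray_bool = isinstance(value, bool) and spec["type"] != "boolean"
        if is_stray_bool or not isinstance(value, expected):
            problems.append(f"wrong type for field: {name}")
    return problems


if __name__ == "__main__":
    extracted = {"invoice_number": "INV-001", "total": 1250.0, "currency": "USD"}
    print(check_against_schema(extracted, INVOICE_SCHEMA))
```

A check like this is useful downstream of extraction, before the objects are written to a database. For full validation, a dedicated JSON Schema validator is the better choice.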

Document Classification

Classify each page of a document into a category. This is useful for categorizing and tagging pages by document type, such as invoices, forms, receipts, and contracts.
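Per-page labels are often easier to use once grouped by category. The sketch below assumes the classification result has already been reduced to (page number, label) pairs; that shape and the label names are assumptions for illustration, not the documented response format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_pages_by_label(page_labels: Iterable[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Group page numbers under their assigned label, pages sorted ascending."""
    grouped = defaultdict(list)
    for page, label in page_labels:
        grouped[label].append(page)
    return {label: sorted(pages) for label, pages in grouped.items()}


# Hypothetical per-page labels for a five-page scanned bundle.
page_labels = [
    (1, "invoice"),
    (2, "invoice"),
    (3, "receipt"),
    (4, "contract"),
    (5, "receipt"),
]

if __name__ == "__main__":
    for label, pages in group_pages_by_label(page_labels).items():
        print(label, pages)
```

With pages grouped this way, each category can be routed to its own extraction schema, for example an invoice schema for the invoice pages only.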

Signature Detection

Our in-house models detect signatures in documents and return the bounding boxes of the signatures.
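One possible use of the returned bounding boxes is to check whether a signature falls inside the region of a page where one is expected. The sketch below assumes boxes are given as [x1, y1, x2, y2] in a shared page coordinate system; the coordinate format, the signing region, and the function names are hypothetical.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed order: [x1, y1, x2, y2]


def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share a region of non-zero area."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2


def is_signed(signature_boxes: List[Box], signing_region: Box) -> bool:
    """True if any detected signature box overlaps the expected region."""
    return any(overlaps(box, signing_region) for box in signature_boxes)


if __name__ == "__main__":
    # Hypothetical signature line near the bottom of a page.
    signing_region = [80, 680, 400, 780]
    detected = [[100, 700, 300, 760]]
    print(is_signed(detected, signing_region))
```

An overlap test is deliberately lenient: a signature that spills outside the signature line still counts. A stricter workflow could require a minimum overlap ratio instead.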
