Introduction
Tensorlake API Reference for Document Ingestion
Document Ingestion APIs
The Tensorlake Parse API accepts documents, slides, spreadsheets, images, and other unstructured data, and returns structured data. The output includes chunks of text, split according to the chunking strategy you specify, and structured data that conforms to the schemas you provide.
Interact with the APIs
Get your API keys and start interacting with the APIs.
The Parse API offers the following capabilities:
Parse Documents
Markdown conversion
Converts a document into markdown chunks. The primary use case is to simplify post-processing, so that documents can be indexed and summarized. The output also includes the document layout, with bounding boxes for page elements such as tables, charts, and images.
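As an illustration of post-processing layout output, the sketch below filters page elements by type and computes their bounding-box area. The `LayoutElement` class, its field names, and the `(x0, y0, x1, y1)` coordinate convention are assumptions made for this example, not the Parse API's actual response format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LayoutElement:
    # Hypothetical shape of one layout element; not the real response schema.
    kind: str                                 # e.g. "table", "chart", "image", "text"
    page: int                                 # 1-based page number (assumed)
    bbox: Tuple[float, float, float, float]   # (x0, y0, x1, y1), assumed convention


def elements_of_kind(elements: List[LayoutElement], kind: str) -> List[LayoutElement]:
    """Return only the elements whose type matches `kind`."""
    return [el for el in elements if el.kind == kind]


def bbox_area(el: LayoutElement) -> float:
    """Area of the element's bounding box, in the same units as the coordinates."""
    x0, y0, x1, y1 = el.bbox
    return (x1 - x0) * (y1 - y0)


# Made-up sample data for the sketch.
layout = [
    LayoutElement("text", 1, (50.0, 40.0, 550.0, 200.0)),
    LayoutElement("table", 1, (50.0, 220.0, 550.0, 420.0)),
    LayoutElement("chart", 2, (100.0, 100.0, 400.0, 300.0)),
]

tables = elements_of_kind(layout, "table")
```

A filter like this is one way to route tables and charts to separate handling while the remaining text chunks go to an index.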
Summarization of tables and charts
With the enrichment options of the Parse API, you can create detailed summaries of tables and charts from your documents, spreadsheets, and presentations.
Structured Extraction
Provide a JSON schema to the Parse API, and it returns JSON objects that match the schema. This is useful for extracting structured data from business documents such as invoices, receipts, and contracts.
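To make the idea concrete, here is a minimal sketch of what a schema for an invoice might look like, written as a Python dictionary and serialized to JSON. The field names (`invoice_number`, `vendor_name`, and so on), the sample extracted object, and the `missing_required` helper are all hypothetical; how the schema is actually submitted to the Parse API is not shown here.

```python
import json

# Hypothetical invoice schema; field names are illustrative only.
invoice_schema = {
    "title": "Invoice",
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "vendor_name": {"type": "string"},
        "total_amount": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
                "required": ["description", "amount"],
            },
        },
    },
    "required": ["invoice_number", "total_amount"],
}

# Made-up example of an object that matches the schema above.
extracted = {
    "invoice_number": "INV-001",
    "vendor_name": "Example Supplies",
    "total_amount": 125.5,
    "line_items": [
        {"description": "Paper", "amount": 25.5},
        {"description": "Toner", "amount": 100.0},
    ],
}


def missing_required(obj: dict, schema: dict) -> list:
    """List top-level required keys that are absent from obj (a shallow check only)."""
    return [key for key in schema.get("required", []) if key not in obj]


# JSON text of the schema, ready to be sent wherever a JSON schema is expected.
schema_json = json.dumps(invoice_schema, indent=2)
```

The shallow `missing_required` check is only a sanity test on returned objects; a full JSON Schema validator would be the more thorough choice.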
Document Classification
Classify each page of a document into a category. This is useful for categorizing and tagging documents by type, such as invoices, forms, receipts, and contracts.
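Once each page has a category, a common next step is to group pages by category so that each group can be processed differently. The sketch below assumes a hypothetical list of per-page labels with `page` and `category` keys; the actual shape of the classification output may differ.

```python
from collections import defaultdict

# Made-up per-page labels; the key names are assumptions for this sketch.
page_labels = [
    {"page": 1, "category": "invoice"},
    {"page": 2, "category": "invoice"},
    {"page": 3, "category": "receipt"},
    {"page": 4, "category": "contract"},
]


def pages_by_category(labels):
    """Map each category to the list of page numbers assigned to it, in input order."""
    grouped = defaultdict(list)
    for item in labels:
        grouped[item["category"]].append(item["page"])
    return dict(grouped)


grouped_pages = pages_by_category(page_labels)
```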
Signature Detection
Our in-house models detect signatures in documents and return the bounding boxes of the signatures.
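One possible use of signature bounding boxes is to check whether a signature falls inside the region of a page where one is expected, such as a signature block. The sketch below assumes boxes are `(x0, y0, x1, y1)` tuples in a shared coordinate system; that convention and the sample coordinates are assumptions for illustration.

```python
def boxes_overlap(a, b):
    """Return True if two axis-aligned boxes (x0, y0, x1, y1) share any interior area."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


# Made-up coordinates for the sketch.
expected_signature_region = (350.0, 650.0, 600.0, 800.0)
detected_signature = (400.0, 700.0, 550.0, 750.0)
unrelated_box = (0.0, 0.0, 10.0, 10.0)

is_signed_in_place = boxes_overlap(detected_signature, expected_signature_region)
```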