Document Understanding
API for parsing documents, chunking and layout analysis, and table extraction
The Document Parsing API parses a document and returns:
- A Markdown form of the document, optionally chunked into sections or fragments.
- A JSON form of the document, which has more details about the layout of the document.
- Bounding boxes for each text, table, figure and page.
- A dictionary of the document indexed by page number.
- The layout type of individual text, table, figure and other elements on the pages.
- Tables encoded as LaTeX, CSV or Markdown.
- A summary of each table, in addition to the raw data.
- Figure content, obtained by extracting any text or summarizing non-textual visual content.
Supported file types: PDF, JPEG and PNG.
Quick Start
Upload the Document
Upload the document to the API.
This returns a file ID, which is used to parse the document.
Parse the Document
Parse the document using the parse_async endpoint.
This returns a job ID, which is used to get the result of the parsing.
Get the Result
Get the result from the API.
This returns the result of the parsing.
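The three steps chain together through the IDs they return. The sketch below illustrates that flow in Python. The `DocumentClient` interface and its method names (`upload`, `parse_async`, `get_result`) are hypothetical stand-ins, not the actual SDK, and `InMemoryClient` is a fake that exists only so the example runs without network access. The result shape (a `status` plus a list of `chunks`) follows the fields described under Retrieving the Result; treating chunks as plain strings is an assumption.

```python
from typing import Protocol


class DocumentClient(Protocol):
    """Hypothetical interface mirroring the three Quick Start steps."""

    def upload(self, path: str) -> str: ...          # returns a file ID
    def parse_async(self, file_id: str) -> str: ...  # returns a job ID
    def get_result(self, job_id: str) -> dict: ...   # returns the job result


class InMemoryClient:
    """Fake client for illustration only; it performs no HTTP calls."""

    def __init__(self) -> None:
        self._files: dict[str, str] = {}
        self._jobs: dict[str, str] = {}

    def upload(self, path: str) -> str:
        file_id = f"file-{len(self._files) + 1}"
        self._files[file_id] = path
        return file_id

    def parse_async(self, file_id: str) -> str:
        if file_id not in self._files:
            raise KeyError(f"unknown file ID: {file_id}")
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = file_id
        return job_id

    def get_result(self, job_id: str) -> dict:
        file_id = self._jobs[job_id]
        # A real job may still be PROCESSING; the fake finishes immediately.
        return {"status": "SUCCESSFUL", "chunks": [f"parsed {self._files[file_id]}"]}


def run_quick_start(client: DocumentClient, path: str) -> dict:
    file_id = client.upload(path)          # 1. upload -> file ID
    job_id = client.parse_async(file_id)   # 2. parse  -> job ID
    return client.get_result(job_id)       # 3. fetch the result by job ID


result = run_quick_start(InMemoryClient(), "invoice.pdf")
print(result["status"])
```

A real client would replace `InMemoryClient` with one that issues the corresponding HTTP requests; `run_quick_start` itself would not need to change.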
File Upload API
You can upload a file before parsing it. The upload returns a tensorlake:// URL, which can be used to parse the document.
You can also provide a pre-signed S3 URL or publicly accessible URL to the parse endpoint.
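Because the parse endpoint accepts several kinds of file reference, it can help to check a reference before submitting a job. The helper below is a minimal sketch, not part of the API: `check_file_reference` is a hypothetical name, and the set of accepted schemes is an assumption inferred from the text above (tensorlake:// for uploaded files, http or https for pre-signed S3 and public URLs).

```python
from urllib.parse import urlparse

# Schemes assumed acceptable: tensorlake:// for uploaded files, and http(s)
# for pre-signed S3 or publicly accessible URLs.
ALLOWED_SCHEMES = {"tensorlake", "https", "http"}


def check_file_reference(url: str) -> str:
    """Return the URL unchanged if it looks usable, else raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported scheme {parsed.scheme!r} in {url!r}")
    if not parsed.netloc:
        raise ValueError(f"missing host or file identifier in {url!r}")
    return url
```

This catches the common mistake of passing a local path such as `invoice.pdf` where a URL is expected. It cannot tell whether a remote URL is actually reachable by the service.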
Listing Uploaded Files
You can list all the documents in a project using the following API call:
Parse API Reference
URL: https://api.tensorlake.ai/documents/v1/parse_async
Output Modes
Attribute: outputMode
The Document Parsing API supports two output modes:
- markdown: Parses the document into Markdown, ideal for indexing and other text-based post-processing.
- json: Parses the document into JSON, which has more details about the layout of the document and about the page elements.
Chunking Strategy
Attribute: chunkStrategy
Documents are chunked automatically when they are parsed. Each strategy logically divides the parsed output for further processing.
The supported strategies are:
- None: This is the default option. The entire parsed output is returned as one entity.
- Page: Output data is separated by individual page, using a dictionary indexed by page number.
- Section: The parsing model tries to detect logical sections and returns the output separated by section.
- Fragment: The parsing model uses detected fragments (TextBox, Table or Figure) to separate the parsed output.
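When consuming page-chunked output, keep in mind that JSON object keys are always strings, so a dictionary indexed by page number needs a numeric sort to come back in reading order. The sketch below assumes the Page strategy yields a mapping from page number to that page's content. The exact response shape is not shown here, so `pages_in_order` and the sample data are illustrative only.

```python
def pages_in_order(pages: dict) -> list:
    """Return page contents ordered by numeric page number.

    JSON object keys are always strings, so "10" would sort before "2"
    unless the keys are converted to integers first.
    """
    return [pages[key] for key in sorted(pages, key=int)]


# Sample data invented for illustration.
sample = {"1": "Title page", "10": "Appendix", "2": "Introduction"}
print(pages_in_order(sample))  # ['Title page', 'Introduction', 'Appendix']
```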
Switching between Models
Attribute: parseMode
You can trade off between speed and accuracy while parsing.
- fast: Faster, smaller models are used for low-latency parsing.
- accurate: Slightly slower but more accurate; uses a combination of vision models and VLMs (vision-language models).
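The outputMode, chunkStrategy and parseMode attributes can be validated on the client before a job is submitted. The sketch below builds, but does not send, a request for the parse_async endpoint. Several details are assumptions and are not confirmed by this page: that the attributes are sent as JSON body fields, the `file` field name, Bearer-token authentication, the lowercase spellings of the chunk strategies, and the default values for outputMode and parseMode.

```python
import json
import urllib.request

PARSE_URL = "https://api.tensorlake.ai/documents/v1/parse_async"

OUTPUT_MODES = {"markdown", "json"}
# Assumed wire spellings for the None / Page / Section / Fragment strategies.
CHUNK_STRATEGIES = {"none", "page", "section", "fragment"}
PARSE_MODES = {"fast", "accurate"}


def build_parse_request(file_url, api_key, output_mode="markdown",
                        chunk_strategy="none", parse_mode="fast"):
    """Build (but do not send) a POST request for the parse_async endpoint."""
    if output_mode not in OUTPUT_MODES:
        raise ValueError(f"outputMode must be one of {sorted(OUTPUT_MODES)}")
    if chunk_strategy not in CHUNK_STRATEGIES:
        raise ValueError(f"chunkStrategy must be one of {sorted(CHUNK_STRATEGIES)}")
    if parse_mode not in PARSE_MODES:
        raise ValueError(f"parseMode must be one of {sorted(PARSE_MODES)}")

    body = {
        "file": file_url,  # hypothetical field name for the document URL
        "outputMode": output_mode,
        "chunkStrategy": chunk_strategy,
        "parseMode": parse_mode,
    }
    return urllib.request.Request(
        PARSE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job. Keeping construction separate from sending makes the option handling easy to test offline.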
Webhook Delivery
Attribute: deliverWebhook
You can configure the API to deliver a webhook when the parsing job finishes. Learn more about Webhooks.
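On the receiving side, a webhook endpoint only needs to accept a POST, read the body and acknowledge it. The handler below is a rough sketch using Python's standard library. The payload format is not described on this page, so the `jobId` and `status` field names are assumptions, and `extract_finished_job` and `WebhookHandler` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler


def extract_finished_job(raw_body: bytes):
    """Return (job_id, status) from a webhook body, or None if it is unusable.

    The "jobId" and "status" field names are assumptions.
    """
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return None
    if not isinstance(payload, dict) or "jobId" not in payload:
        return None
    return payload["jobId"], payload.get("status")


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = extract_finished_job(self.rfile.read(length))
        # Acknowledge with 200 for a usable body, 400 otherwise.
        self.send_response(200 if event else 400)
        self.end_headers()
        if event:
            print("job finished:", event)
```

To try it locally, serve the handler with `http.server.HTTPServer(("", 8000), WebhookHandler).serve_forever()`. A production endpoint would also need to authenticate the sender.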
Retrieving the Result
You can retrieve the result of the parsing using the jobId returned from the parse_async endpoint.
- status: PROCESSING, SUCCESSFUL or FAILED.
- chunks: List of chunks returned from the parsing step.
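Because a job can remain in PROCESSING for a while, clients typically poll until the status changes. The function below is a minimal polling sketch under stated assumptions: `fetch` is any caller-supplied callable that takes a job ID and returns the decoded result as a dict containing `status` and, on success, `chunks`. The interval and timeout defaults are arbitrary, and `wait_for_result` is a hypothetical name.

```python
import time


def wait_for_result(fetch, job_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until the job leaves PROCESSING, then return its chunks."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(job_id)
        status = result.get("status")
        if status == "SUCCESSFUL":
            return result.get("chunks", [])
        if status == "FAILED":
            raise RuntimeError(f"parsing job {job_id} failed")
        if status != "PROCESSING":
            raise ValueError(f"unexpected status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still processing after {timeout}s")
        sleep(interval)
```

The `sleep` parameter exists so tests can substitute a no-op and run instantly. If webhook delivery is configured, polling like this is unnecessary.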
Listing and Deleting Jobs
Listing Jobs
You can list all the jobs in a project using the following API call:
Deleting Jobs
You can delete a job and its associated extracted data once you have downloaded the data.