Read Documents

The Read API converts documents to Markdown and provides the spatial layout of each page. The response contains:

Markdown representation of pages, with page elements in their natural reading order
Tables encoded as Markdown or HTML
Summaries of tables and figures, guided by custom prompts
Bounding boxes for each page element (e.g. signature, key-value pair, figure)

Read the Overview to understand how to integrate Document Parsing into your existing workflows.

API Usage Guide

Calling the read endpoint creates a new document parsing job that starts in the pending state. The job transitions to the processing state, and then to the successful state once the document has been parsed.


If you are using the Python SDK, all the configuration options described below are expressed through the ParsingOptions class.

from tensorlake.documentai import (
  DocumentAI,
  ParsingOptions,
  ChunkingStrategy,
  TableOutputMode,
  TableParsingFormat,
)

doc_ai = DocumentAI(api_key="xxxx")
file_id = "file_xxxx"

parsing_options = ParsingOptions(
    chunking_strategy=ChunkingStrategy.FRAGMENT,
    table_output_mode=TableOutputMode.MARKDOWN
)

parse_id = doc_ai.read(file_id=file_id, page_range="1-2", parsing_options=parsing_options)
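
The job lifecycle described earlier (pending, then processing, then successful) suggests a simple polling loop on the client side. The sketch below is a minimal illustration, not SDK code: `wait_for_job` and the `fetch_status` callable are hypothetical names, and you would connect `fetch_status` to whatever status lookup your client exposes. The job here is simulated so the example runs without network access.

```python
import time
from typing import Callable

# State names taken from the job lifecycle described in this guide.
IN_PROGRESS_STATES = {"pending", "processing"}


def wait_for_job(
    fetch_status: Callable[[], str],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until the job leaves the pending/processing states.

    fetch_status is a hypothetical callable returning the current job state
    as a string. Any state outside IN_PROGRESS_STATES is treated as final.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status not in IN_PROGRESS_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(poll_interval)


# Simulated job that moves through the documented states.
states = iter(["pending", "processing", "processing", "successful"])
final_state = wait_for_job(lambda: next(states), poll_interval=0.0)
print(final_state)
```

Returning the final state instead of raising on anything other than successful leaves the decision about error handling to the caller.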

Options for Parsing Documents

Document Parsing can be customized by providing the parsing_options and enrichment_options in your request.

Parameter	Description
`parsing_options`	Customizes the OCR and table parsing process and chunking strategies. See Parsing Options.
`enrichment_options`	Enables and configures table and figure summarization. See Summarization.

See the /parse section of the API reference for the full list of configuration options.

Parsing Options

Parameter	Description	Default Value
`chunking_strategy`	Chooses how the output is split into chunks (for example, `ChunkingStrategy.FRAGMENT` in the Python SDK).	`None`
`table_output_mode`	Choose between Markdown and HTML.	`HTML`
`ocr_model`	Choose between `model01`, `model02`, `model03`, and `gemini3`.	`model01`
`disable_layout_detection`	Boolean flag to skip layout detection and directly extract text. Useful for documents with many tables or images.	`false`
`skew_detection`	Detect and correct skewed or rotated pages. This can increase processing time.	`false`
`signature_detection`	Detect signatures in the document. This can increase processing time and incurs additional cost.	`false`
`remove_strikethrough_lines`	Remove strikethrough lines from the document. This can increase processing time and incurs additional cost.	`false`
`ignore_sections`	A set of document fragments to ignore during parsing. This can be useful for excluding irrelevant sections from the output.	`[]`
`cross_page_header_detection`	A boolean flag to enable header hierarchy detection across pages. This can improve the accuracy of header extraction in multi-page documents.	`false`

OCR Models

Tensorlake has a few different OCR models, with different strengths and weaknesses. We recommend experimenting with the models on your documents and using the best model for your use case.

model03 - This is our recommended model. It can read and describe complex tables and figures, and supports large-scale ingestion of documents.
model01 - Provides balanced accuracy and throughput characteristics.
model02 - Often works better than model01 on complex tables, such as those found in financial reports.
gemini3 - Uses Google’s Gemini3 for OCR.

A key difference between model03 and model01/model02 is that model01 and model02 provide bounding boxes for table cells, while model03 does not. gemini3 does not provide any bounding boxes.

Retrieve Output

The parsed document output can be retrieved using the /parse/{parse_id} endpoint, or using the get_parsed_result SDK function.

result = doc_ai.get_parsed_result(parse_id)

Markdown Chunks

Working with the Markdown chunks is a common next step after parsing a document.

for chunk in result.chunks:
    print(f"## Page Number: {chunk.page_number}\n")
    print(f"## Content: {chunk.content}\n")

See Parse Output for more details about the output.
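
When chunks are smaller than a page, it can be handy to reassemble them into one Markdown string per page before indexing or display. The sketch below assumes only the two chunk fields used in the loop above, `page_number` and `content`. The `Chunk` dataclass is a stand-in for the SDK object and `markdown_by_page` is a hypothetical helper name.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Chunk:
    """Stand-in for an SDK chunk; only the two fields used above are modeled."""
    page_number: int
    content: str


def markdown_by_page(chunks):
    """Join chunk contents per page, keeping the order in which chunks arrive."""
    pages = defaultdict(list)
    for chunk in chunks:
        pages[chunk.page_number].append(chunk.content)
    # Separate chunks with a blank line so Markdown blocks stay distinct.
    return {page: "\n\n".join(parts) for page, parts in sorted(pages.items())}


# Sample data invented for illustration.
sample = [
    Chunk(1, "# Invoice"),
    Chunk(1, "Total due: 42 USD"),
    Chunk(2, "Terms and conditions"),
]
for page, text in markdown_by_page(sample).items():
    print(f"--- page {page} ---\n{text}")
```

With real output, `result.chunks` could be passed in place of `sample`, provided the chunk objects expose the same two attributes.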

Bounding Boxes

Each page fragment includes bounding box coordinates that specify the exact location of the content on the page. This is useful for creating citations, highlighting source content in a UI, or debugging extraction quality.


Accessing Bounding Boxes

result = doc_ai.parse_and_wait(file_id)

for page in result.pages:
  for fragment in page.page_fragments:
    bbox = fragment.bbox
    print(f"Fragment type: {fragment.fragment_type}")
    print(f"Top-left: ({bbox['x1']}, {bbox['y1']})")
    print(f"Bottom-right: ({bbox['x2']}, {bbox['y2']})")

Coordinate System

Bounding boxes use the following coordinate system:

x1, y1: Top-left corner of the bounding box
x2, y2: Bottom-right corner of the bounding box
Origin (0,0): Top-left corner of the page
Units: Pixels

All fragment types include bounding box coordinates.
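
For highlighting source content in a UI, pixel coordinates are often converted to fractions of the page size so that overlays scale with the rendered page. The sketch below assumes the `x1`, `y1`, `x2`, `y2` keys and top-left origin described above. The `normalize_bbox` helper is hypothetical, and the page width and height in pixels are inputs you would need to supply, since this guide does not state where they come from.

```python
def normalize_bbox(bbox, page_width, page_height):
    """Convert a pixel bounding box to fractions of the page size (0.0 to 1.0).

    bbox uses the x1/y1 (top-left) and x2/y2 (bottom-right) keys described
    above; page_width and page_height are assumed to be known, in pixels.
    """
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    return {
        "left": bbox["x1"] / page_width,
        "top": bbox["y1"] / page_height,
        "width": (bbox["x2"] - bbox["x1"]) / page_width,
        "height": (bbox["y2"] - bbox["y1"]) / page_height,
    }


# Example values are made up for illustration.
bbox = {"x1": 100, "y1": 200, "x2": 500, "y2": 400}
overlay = normalize_bbox(bbox, page_width=1000, page_height=2000)
print(overlay)
```

The resulting fractions map directly onto percentage-based positioning, so the same overlay works at any zoom level.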

Table and Figure Summarization

The Document Ingestion API can summarize tables and figures in documents.

Parameter	Description	Default Value
`table_summarization`	Enable summarization of tables present in the document. This will generate a summary of the table content, including key insights and trends.	`false`
`figure_summarization`	Enable summarization of figures present in the document. This will generate a summary of the figure content, including key insights and trends.	`false`
`table_summarization_prompt`	A custom prompt to use for table summarization. This can be used to provide additional context or instructions to the LLM. If not specified, the default prompt will be used.	-
`figure_summarization_prompt`	A custom prompt to use for figure summarization. This can be used to provide additional context or instructions to the LLM. If not specified, the default prompt will be used.	-
`include_full_page_image`	Include the full page image as additional context when summarizing tables and figures, which can improve accuracy by capturing surrounding headers, captions, and related content.	`false`

Tables

Tables can be summarized by setting table_summarization to true in the enrichment_options JSON object when calling the parse API.


from tensorlake.documentai import DocumentAI
from tensorlake.documentai.models.options import (
    EnrichmentOptions,
)

enrichment_options = EnrichmentOptions(
    table_summarization=True,
    table_summarization_prompt="Summarize the table in a concise manner.",
)

doc_ai = DocumentAI(api_key=API_KEY)

parse_id = doc_ai.read(
    file_id="file_XXX",  # Replace with your file ID or URL
    enrichment_options=enrichment_options,
)

Figures

Figures can be summarized by setting figure_summarization to true in the enrichment_options JSON object when calling the parse API.


from tensorlake.documentai import (
    DocumentAI,
    EnrichmentOptions,
)

doc_ai = DocumentAI(api_key=API_KEY)

enrichment_options = EnrichmentOptions(
    figure_summarization=True,
    figure_summarization_prompt="Summarize the figure in a way that is easy to understand and use for answering questions.",
)

parse_id = doc_ai.read(
    file_id="file_XXX",  # Replace with your file ID or URL
    enrichment_options=enrichment_options,
)

Full Page Image Context

When summarizing tables and figures, you can optionally include the full page image as additional context. This helps the model better understand the surrounding content, headers, footers, and relationships between elements on the page.


enrichment_options = EnrichmentOptions(
    table_summarization=True,
    figure_summarization=True,
    include_full_page_image=True
)

result = doc_ai.parse_and_wait(
    file_id,
    enrichment_options=enrichment_options
)
