## Call the parse endpoint

The parse endpoint accepts:

- `file_id`: the ID returned from uploading a file to Tensorlake Cloud.
- `file_url`: a URL that points to a publicly accessible file.
- `page_range`: the range of pages to parse, e.g. `1-2` or `1,3,5`. By default, all pages will be parsed.
- `labels`: metadata to identify the parse request. The labels are returned along with the parse response.

The response contains:

- `parse_id`: the unique ID Tensorlake uses to reference the specific parsing job. This ID can be used to get the output when the parsing job is completed and to revisit previously used settings.
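For illustration, here is a minimal sketch of submitting a parse request with Python's `requests` library. The base URL, the bearer-token auth header, and the `POST /parse` submit path are assumptions, not confirmed values; consult the API reference for the authoritative details.

```python
import os

import requests

BASE_URL = "https://api.tensorlake.ai"  # assumed base URL; check the API reference
API_KEY = os.environ["TENSORLAKE_API_KEY"]  # assumed auth scheme: bearer token

resp = requests.post(
    f"{BASE_URL}/parse",  # assumed submit path, paired with GET /parse/{parse_id}
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "file_id": "file_abc123",            # or "file_url": "https://example.com/doc.pdf"
        "page_range": "1-2",                 # optional: defaults to all pages
        "labels": {"source": "quickstart"},  # echoed back in the parse response
    },
)
resp.raise_for_status()
parse_id = resp.json()["parse_id"]
```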
## Query the status of the parsing job

The `/parse/{parse_id}` endpoint will return:

- `status`: the status of the parsing job. This can be `failure`, `pending`, `processing`, or `successful`.

If the status is `pending` or `processing`, you should wait a few seconds and then check again by re-calling the endpoint, as in the polling sketch below.
## Retrieve the parsed result

Once the status is `successful`, you can retrieve the parsed result by calling the `/parse/{parse_id}` endpoint. The response payload will include a `Response` object with:

- `chunks`: an array of objects, each containing a chunk number (determined by the chunk strategy) and the markdown content for that chunk.
- `document_layout`: a JSON representation of the document's visual structure, including page dimensions, bounding boxes for each element (text, tables, figures, signatures), and reading order.
- `labels`: labels associated with the parse job.
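Continuing the sketch above, the fields just listed could be read from a completed job as follows; note that the `content` key inside each chunk object is an assumed name for the markdown body, not a confirmed field.

```python
result = wait_for_parse(parse_id)
if result["status"] == "successful":
    for chunk in result["chunks"]:
        # "content" is an assumed key; the chunk number comes from
        # the configured chunking strategy.
        print(chunk["content"][:100])
    layout = result["document_layout"]  # page dimensions, bounding boxes, reading order
    print(result["labels"])             # labels supplied with the original request
```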
| Core Function | Description |
|---|---|
| Structured Data Extraction | Pull out fields from a document. Specify the schema using either JSON Schema or Pydantic models. |
| Page Classification | Automatically identify and label different sections or types of pages (e.g., cover, table of contents, appendix) within a document. |
| Document Chunking | Enable agents to read documents, or index chunks for building RAG and knowledge graph applications. |
| Bounding Boxes | Precisely reference every element in the document for citations and highlighting. |
| Summarization | Summarize tables, charts, and figures in documents. |
| Unlimited Pages and File Size | Parse any number of large documents. You pay only for what you use. |
| Unlimited Fields Per Document | Capture every detail in even the most complex documents, without the ~100-field limits of other APIs. |
| Flexible Usage | All of the above features can be used individually or combined in a single API call, so you don't need to build custom multi-stage document parsing pipelines. |
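As a hedged illustration of the structured data extraction feature, a Pydantic model can serve as the extraction schema, since Pydantic emits standard JSON Schema. The `json_schema` request parameter below is a hypothetical name for where the schema would attach, not a confirmed API field.

```python
from pydantic import BaseModel

class InvoiceFields(BaseModel):
    """Example extraction schema for an invoice document."""
    vendor_name: str
    invoice_number: str
    total_amount: float

# Pydantic v2 models emit standard JSON Schema, one of the two
# schema formats the feature accepts.
schema = InvoiceFields.model_json_schema()

# Hypothetical request shape: "json_schema" is an assumed parameter name.
payload = {
    "file_id": "file_abc123",
    "json_schema": schema,
}
```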