parse_id field. You can query the status and results of the parse operation with the Get Parse Result endpoint.
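Querying the status until the parse operation finishes can be sketched as a small polling loop. This is a minimal sketch under stated assumptions: `poll_until_done`, `fetch`, and `is_done` are hypothetical names, and the caller supplies both the function that calls the Get Parse Result endpoint and the test for completion, since the response fields are not described here.

```python
import time
from typing import Any, Callable


def poll_until_done(
    fetch: Callable[[str], Any],
    parse_id: str,
    is_done: Callable[[Any], bool],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Hypothetical helper: call fetch(parse_id) until is_done(result) is true.

    fetch is assumed to wrap the Get Parse Result endpoint; is_done decides
    from the returned value whether the parse operation has finished.
    """
    for attempt in range(max_attempts):
        result = fetch(parse_id)
        if is_done(result):
            return result
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"parse {parse_id} not finished after {max_attempts} attempts")
```

Passing `fetch` and `is_done` as arguments keeps the loop independent of any particular HTTP client or response shape.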
file_id
: The ID of a file that has been previously uploaded through the Upload Files endpoint. This is the most common method.
file_url
: A publicly accessible URL that points to the file you want to parse. The API will download the file from this URL. Redirects are also supported, but the URL and the Location header must point to a file that is publicly accessible.
raw_text
: Raw text content, if you want to perform structured extraction from non-file sources such as emails, HTML, CSV, or XML.

text/plain
: Plain text files (default for raw_text).
text/csv
: CSV files.
text/html
: HTML files.
application/pdf
: PDF files.
application/vnd.openxmlformats-officedocument.wordprocessingml.document
: DOCX files.
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
: XLSX files.
application/vnd.ms-excel.sheet.macroEnabled.12
: XLSM files.
application/vnd.openxmlformats-officedocument.presentationml.presentation
: PPTX files.
application/vnd.ms-excel
: XLS files.
image/jpeg
: JPEG images.
image/png
: PNG images.

Use the mime_type field to override the inferred MIME type. This is useful if you know the content type of the file and want to ensure the model interprets it correctly.
For the raw_text method, you must specify the mime_type field to indicate the type of content you are providing. This is necessary for the model to correctly interpret the text.
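The input rules above can be illustrated with a small request-body builder. This is a sketch, not the documented client: the helper name `build_parse_request`, the flat dictionary layout, and the rule that exactly one input method is given per request are assumptions. Only the field names file_id, file_url, raw_text, and mime_type and the MIME type list come from this section.

```python
from typing import Optional

# MIME types listed in this section.
SUPPORTED_MIME_TYPES = {
    "text/plain",
    "text/csv",
    "text/html",
    "application/pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.ms-excel.sheet.macroEnabled.12",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "application/vnd.ms-excel",
    "image/jpeg",
    "image/png",
}


def build_parse_request(
    file_id: Optional[str] = None,
    file_url: Optional[str] = None,
    raw_text: Optional[str] = None,
    mime_type: Optional[str] = None,
) -> dict:
    """Hypothetical helper: build a request body from one input method."""
    inputs = {"file_id": file_id, "file_url": file_url, "raw_text": raw_text}
    provided = {name: value for name, value in inputs.items() if value is not None}
    # Assumption: a request carries exactly one input method.
    if len(provided) != 1:
        raise ValueError("provide exactly one of file_id, file_url, raw_text")
    # The text above says mime_type must be specified for raw_text.
    if raw_text is not None and mime_type is None:
        raise ValueError("mime_type is required with raw_text")
    if mime_type is not None and mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported mime_type: {mime_type}")
    body = dict(provided)
    if mime_type is not None:
        body["mime_type"] = mime_type
    return body
```

For example, `build_parse_request(raw_text="a,b\n1,2", mime_type="text/csv")` yields a body with both fields set, while omitting `mime_type` for raw text raises a `ValueError` before any request is sent.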
page_classifications field. The API will return the page class for each page of the document.
structured_extraction_options array, which can contain multiple objects.
Known limitations include:
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
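A header of this form can be built as follows. This is a minimal sketch: `bearer_headers` is a hypothetical helper, and the use of the standard HTTP `Authorization` header name is an assumption, since the text above gives only the `Bearer <token>` form.

```python
def bearer_headers(token: str) -> dict:
    """Hypothetical helper: return headers carrying the token in Bearer form."""
    if not token or token != token.strip():
        raise ValueError("token must be non-empty with no surrounding whitespace")
    # Assumption: the token travels in the standard Authorization header.
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.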
This object defines the request body for the parse endpoint.
Created parse job details
The response is of type object.