Initiate upload process for large files
This API call initiates the upload process for large files. It returns a presigned URL that can be used to upload the file in chunks.
This API call also returns a temporary identifier for the file, which identifies the file during the upload process. The identifier is of the form tl-presigned-<temporary_file_id>.
The presigned URL is valid for 1 hour. After the file is uploaded, you must call the Finalize the upload process for large files API to complete the upload process.
Files which have not been finalized won’t be available for parsing or structured extraction.
Uploading to a presigned URL
The presigned URL is an AWS S3 presigned URL. You can use any HTTP client to upload the file to it. The file must be uploaded as a PUT request with the following headers:
Content-Type: The content type of the file being uploaded. Required. For example, application/pdf for PDF files or image/jpeg for JPEG images.
Content-Length: The size of the file being uploaded, in bytes. Required.
x-amz-sdk-checksum-algorithm: The checksum algorithm used to verify the upload. Required. Must be set to SHA256.
x-amz-checksum-sha256: The SHA256 checksum of the file being uploaded. Required. Must be set to the base64-encoded SHA256 digest of the file (the raw digest bytes, not their hexadecimal representation).
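As a rough illustration, the headers above could be assembled and sent with Python's standard library as follows. The function names build_upload_headers and upload_to_presigned_url are hypothetical, the payload is held in memory for brevity, and the checksum is assumed to be the base64 encoding of the raw SHA256 digest, which is what S3 expects for x-amz-checksum-sha256.

```python
import base64
import hashlib
import urllib.request


def build_upload_headers(data: bytes, content_type: str) -> dict[str, str]:
    """Return the four required headers for an in-memory payload (hypothetical helper)."""
    digest = hashlib.sha256(data).digest()  # raw 32-byte digest, not the hex string
    return {
        "Content-Type": content_type,
        "Content-Length": str(len(data)),
        "x-amz-sdk-checksum-algorithm": "SHA256",
        "x-amz-checksum-sha256": base64.b64encode(digest).decode("ascii"),
    }


def upload_to_presigned_url(presigned_url: str, data: bytes, content_type: str) -> int:
    """PUT the payload to the presigned URL and return the HTTP status code."""
    request = urllib.request.Request(
        presigned_url,
        data=data,
        method="PUT",
        headers=build_upload_headers(data, content_type),
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
    # for example when the checksum does not match the uploaded bytes.
    with urllib.request.urlopen(request) as response:
        return response.status
```

For very large files, a client that streams the body from disk is preferable to loading the whole file into memory; the headers stay the same.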
This is an example of how to calculate the SHA256 hash of a file in Python:
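A minimal sketch, assuming the header expects the base64 encoding of the raw digest bytes. The function name sha256_base64 and the 1 MiB chunk size are illustrative choices, not part of the API.

```python
import base64
import hashlib


def sha256_base64(file_path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the base64-encoded SHA256 digest of a file, reading it in chunks."""
    sha256 = hashlib.sha256()
    with open(file_path, "rb") as f:
        # Read fixed-size chunks so large files never have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha256.update(chunk)
    # digest() gives the raw bytes; hexdigest() would give the hex string instead.
    return base64.b64encode(sha256.digest()).decode("ascii")
```

The returned string can be used directly as the value of the x-amz-checksum-sha256 header.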
Here is an example of how to calculate the value of the x-amz-checksum-sha256 header in bash:
Note: The official Document AI Python SDK handles the large file upload process for you. It automatically calculates the SHA256 hash and uploads the file to the presigned URL.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
Response
The response is of type object.