POST /documents/v1/files_large
curl --request POST \
  --url https://api.tensorlake.ai/documents/v1/files_large \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "file_size": 123,
  "filename": "<string>",
  "mime_type": "<string>",
  "sha256_checksum": "<string>"
}'
{
  "id": "<string>",
  "presigned_url": "<string>"
}

This API call initiates the upload process for large files. It returns a presigned URL that can be used to upload the file in chunks.

It also returns a temporary identifier for the file, which identifies the file during the upload process. The identifier is of the form tl-presigned-<temporary_file_id>.
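As a minimal sketch, a client could tell a temporary identifier apart from other IDs by checking for the documented prefix. The helper name isTemporaryFileId is hypothetical and not part of any Tensorlake SDK; only the tl-presigned- prefix comes from the description above.

```typescript
// Prefix documented for temporary identifiers returned by this endpoint.
const TEMPORARY_ID_PREFIX = "tl-presigned-";

// Hypothetical helper: true when `id` has the documented prefix followed by
// a non-empty temporary file ID.
function isTemporaryFileId(id: string): boolean {
  return id.startsWith(TEMPORARY_ID_PREFIX) && id.length > TEMPORARY_ID_PREFIX.length;
}
```

A check like this may help avoid passing a not-yet-finalized ID to calls that expect a finalized file.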

The presigned URL is valid for 1 hour. After the file is uploaded, you must call the "Finalize the upload process for large files" API to complete the upload.
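Because the URL expires after one hour, a client may want to check that it is still usable before starting a long upload. The sketch below assumes the client records the time at which it received the URL; the function name, the issuedAtMs parameter, and the safety margin are illustrative assumptions, not part of the API.

```typescript
// Validity window stated in the documentation: 1 hour.
const PRESIGNED_URL_TTL_MS = 60 * 60 * 1000;

// Hypothetical helper: returns true while the presigned URL should still be
// valid. `issuedAtMs` is the client-side timestamp (in milliseconds) recorded
// when the URL was received; `safetyMarginMs` leaves headroom for the upload.
function isPresignedUrlUsable(
  issuedAtMs: number,
  nowMs: number = Date.now(),
  safetyMarginMs: number = 60 * 1000
): boolean {
  return nowMs + safetyMarginMs < issuedAtMs + PRESIGNED_URL_TTL_MS;
}
```

If the check fails, requesting a fresh presigned URL is likely simpler than attempting the upload and handling the rejection.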

Files which have not been finalized won’t be available for parsing or structured extraction.

Uploading to a presigned URL

The presigned URL is an AWS S3 presigned URL. You can use any HTTP client to upload the file to it. The file must be uploaded with a PUT request that sets the following header:

  • Content-Type (required): The content type of the file being uploaded, for example application/pdf for PDF files or image/jpeg for JPEG images.

Note: The official Document AI Python SDK handles the large file upload process for you. The SDK will automatically calculate the SHA256 hash and upload the file to the presigned URL.

TypeScript Example

const TENSORLAKE_API_KEY = "tl_api_key_****";

/**
 * Calculates the SHA-256 hash of a File object using the Web Crypto API.
 * This is used for verifying file integrity before upload.
 * 
 * @param file - The File object to hash (e.g. from an <input type="file">)
 * @returns A Promise that resolves to the hexadecimal SHA-256 hash string
 */
async function calculateSha256(file: File): Promise<string> {
  const arrayBuffer = await file.arrayBuffer();
  const hashBuffer = await crypto.subtle.digest("SHA-256", arrayBuffer);
  const hashArray = Array.from(new Uint8Array(hashBuffer));
  const hashHex = hashArray
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
  return hashHex;
}

/**
 * Uploads a large file to Tensorlake using the presigned URL flow.
 *
 * @param file - The File object to upload
 * @returns Promise resolving to the Tensorlake file ID after successful upload
 */
async function uploadLargeFile({ file }: { file: File }): Promise<string> {
  // Step 1: Generate SHA-256 checksum of the file
  const sha256Checksum = await calculateSha256(file);

  // Step 2: Request a presigned URL for the large file upload
  const response = await fetch("https://api.tensorlake.ai/documents/v1/files_large", {
    method: "POST",
    // fetch does not serialize objects, so the body must be a JSON string
    body: JSON.stringify({
      filename: file.name,
      mime_type: file.type,
      file_size: file.size,
      sha256_checksum: sha256Checksum,
    }),
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${TENSORLAKE_API_KEY}`,
    },
  });

  // Step 3: Handle response and validate presigned upload setup
  if (!response.ok) {
    const errorText = await response.text();
    throw new Error(
      `Upload request failed with status ${response.status}: ${errorText}`
    );
  }

  const { presigned_url, id: presignedId } = await response.json();
  if (!presigned_url) {
    throw new Error("Upload failed - no presigned URL received");
  }

  // Step 4: Upload the file directly to the cloud using the presigned URL
  let uploadResponse: Response;
  try {
    uploadResponse = await fetch(presigned_url, {
      method: "PUT",
      body: file,
      headers: {
        "Content-Type": file.type,
      },
    });
  } catch (error) {
    // fetch rejects only on network-level failures, not on HTTP error statuses
    throw new Error(
      `File upload failed: ${
        error instanceof Error ? error.message : "Unknown error"
      }`
    );
  }

  if (!uploadResponse.ok) {
    const errorText = await uploadResponse.text();
    throw new Error(
      `File upload failed with status ${uploadResponse.status}: ${errorText}`
    );
  }

  // Step 5: Finalize the upload so Tensorlake can process it
  const finalizeResponse = await fetch(`https://api.tensorlake.ai/documents/v1/files_large/${presignedId}`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${TENSORLAKE_API_KEY}`,
    },
  });

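  // Step 6: Confirm finalization and return the file ID
  if (!finalizeResponse.ok) {
    const errorText = await finalizeResponse.text();
    throw new Error(
      `Finalize failed with status ${finalizeResponse.status}: ${errorText}`
    );
  }

  // Assumes the finalize response body is a JSON object with an `id` field
  const finalized = await finalizeResponse.json();
  if (!finalized || !finalized.id)
    throw new Error("Upload failed - no file ID received");

  return finalized.id;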
}
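The calculateSha256 helper above loads the whole file into memory with file.arrayBuffer(), which may be impractical for very large files. In a Node.js environment, one possible alternative is to hash the file chunk by chunk with createHash from node:crypto. This is a sketch under that assumption; the function name sha256HexFromChunks is illustrative.

```typescript
import { createHash } from "node:crypto";

// Hashes data incrementally so the full file never has to be held in memory.
// `chunks` can be any async iterable of bytes, such as a Node.js read stream.
async function sha256HexFromChunks(
  chunks: AsyncIterable<Uint8Array>
): Promise<string> {
  const hash = createHash("sha256");
  for await (const chunk of chunks) {
    hash.update(chunk);
  }
  // Hexadecimal digest, the same format calculateSha256 produces
  return hash.digest("hex");
}
```

With a file on disk, the input could be a stream created by createReadStream from node:fs, for example sha256HexFromChunks(createReadStream(path)), where path is the location of the file to upload.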

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
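Based on the fields shown in the curl example, the request body can be sketched as a TypeScript type with a small builder. The builder name and its validation rules are assumptions: the checksum is assumed to be a 64-character hexadecimal string (the format the TypeScript example produces), and file_size is assumed to be a non-negative integer byte count.

```typescript
// Request body fields as they appear in the curl example.
interface LargeFileUploadRequest {
  file_size: number;
  filename: string;
  mime_type: string;
  sha256_checksum: string;
}

// Hypothetical builder: validates the inputs and returns the JSON string
// to send as the request body.
function buildLargeFileUploadBody(input: {
  filename: string;
  mimeType: string;
  fileSize: number;
  sha256Hex: string;
}): string {
  if (!Number.isInteger(input.fileSize) || input.fileSize < 0) {
    throw new RangeError("fileSize must be a non-negative integer");
  }
  // Assumption: the checksum is sent as a 64-character hex string
  if (!/^[0-9a-f]{64}$/i.test(input.sha256Hex)) {
    throw new Error("sha256Hex must be a 64-character hexadecimal string");
  }
  const body: LargeFileUploadRequest = {
    file_size: input.fileSize,
    filename: input.filename,
    mime_type: input.mimeType,
    sha256_checksum: input.sha256Hex,
  };
  return JSON.stringify(body);
}
```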

Response

200
application/json

Response object with the temporary File ID and the presigned URL.

The response is of type object.
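The response object can be described with a TypeScript interface and checked at runtime before use. This is a sketch based only on the two fields shown in the example response; the interface and function names are illustrative.

```typescript
// Shape of the 200 response, taken from the example response.
interface LargeFileUploadResponse {
  id: string;
  presigned_url: string;
}

// Runtime type guard: narrows parsed JSON to LargeFileUploadResponse
// when both fields are present and are strings.
function isLargeFileUploadResponse(
  value: unknown
): value is LargeFileUploadResponse {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.id === "string" &&
    typeof candidate.presigned_url === "string"
  );
}
```

Running the guard on the parsed JSON body before reading presigned_url gives a clear failure if the response does not have the expected shape.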