You can upload files to Tensorlake for parsing without relying on any external storage such as S3.

The file upload API returns a unique file_id that can be used with the /parse endpoint to parse the file.

Tensorlake also accepts pre-signed URLs and any other publicly accessible file URL. If your file is already reachable this way, you can skip the upload step and pass the URL directly to the /parse endpoint.
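The choice between uploading first and passing a URL directly can be sketched as a small check. This is only an illustration: `needs_upload` is a hypothetical helper, not part of the Tensorlake SDK, and it assumes that anything that is not an http(s) URL is a local path that has to be uploaded.

```python
from urllib.parse import urlparse


def needs_upload(source: str) -> bool:
    """Return True when `source` looks like a local path rather than an http(s) URL.

    Hypothetical helper for illustration; not part of the Tensorlake SDK.
    """
    scheme = urlparse(source).scheme.lower()
    # Only http and https URLs are treated as remotely accessible files.
    return scheme not in ("http", "https")


if __name__ == "__main__":
    for source in ("/path/to/file.pdf", "https://example.com/report.pdf"):
        action = "upload first" if needs_upload(source) else "pass the URL to /parse"
        print(f"{source}: {action}")
```

A check like this does not verify that the URL is actually reachable without authentication; it only separates local paths from URLs.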

Upload Files

Uploaded files are scoped to a specific project in your account. The API key sent with each API call determines which project a file is uploaded to. This scoping secures uploaded files and isolates them from the other projects in your account.

from tensorlake.documentai import DocumentAI

doc_ai = DocumentAI(api_key="xxxx")
file_id = doc_ai.upload(file_path="/path/to/file.pdf")

The Python SDK handles files of any size.

List Files

Using the API key for a specific Tensorlake project, you can list all of the files that belong to that project.

from tensorlake.documentai import DocumentAI, FileInfo
doc_ai = DocumentAI(api_key="xxxx")

files_page = doc_ai.files()

Delete Files

To remove a document from Tensorlake Cloud, delete it by passing its file_id.

from tensorlake.documentai import DocumentAI

doc_ai = DocumentAI(api_key="xxxx")
# Delete the file identified by the file_id returned at upload time
doc_ai.delete_file(file_id="tensorlake-unique_id")

Supported File Types

Tensorlake supports the following file types:

  • PDF
  • Images (PNG, JPG)
  • Presentations (PPTX, Keynote)
  • Raw Text (plain text, HTML)
  • Spreadsheets (XLSX, XLS, CSV)
  • Word Documents (DOCX)
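
Before uploading, it can help to check a file's extension against the list above. The sketch below is an assumption-laden illustration, not SDK behavior: `is_supported` and `SUPPORTED_EXTENSIONS` are hypothetical names, and the mapping from each listed type to an extension (for example `.key` for Keynote and `.txt` for plain text) is inferred from common conventions rather than stated by Tensorlake.

```python
from pathlib import Path

# Extensions assumed to correspond to the file types listed above.
# This mapping is an illustration; the service may accept other variants.
SUPPORTED_EXTENSIONS = {
    ".pdf",
    ".png", ".jpg",
    ".pptx", ".key",
    ".txt", ".html",
    ".xlsx", ".xls", ".csv",
    ".docx",
}


def is_supported(file_path: str) -> bool:
    """Return True if the file's extension is in the assumed supported set."""
    # Compare case-insensitively so "REPORT.PDF" matches ".pdf".
    return Path(file_path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("/path/to/file.pdf", "slides.PPTX", "archive.zip"):
        print(name, is_supported(name))
```

This only inspects the file name, so a mislabeled file would still pass the check.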