Parse with Dataset - Tensorlake

cURL

curl --request POST \
  --url https://api.tensorlake.ai/documents/v2/datasets/{dataset_id}/parse \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "file_id": "<string>",
  "file_url": "https://pub-226479de18b2493f96b64c6674705dd8.r2.dev/real-estate-purchase-all-signed.pdf",
  "raw_text": "<string>",
  "page_range": "1",
  "mime_type": null,
  "labels": {
    "priority": "high",
    "source": "email"
  }
}'

{
  "parse_id": "parse_id-12345",
  "created_at": "2023-10-01T12:00:00Z"
}

POST /documents/v2/datasets/{dataset_id}/parse

Submit a document for parsing using the configuration defined in a specific Dataset. The parsed results become available in that Dataset.

Using a file

When submitting a parse job to the dataset, you can provide the content of the file in one of three ways:

file_id: The ID of a file previously uploaded through the Upload File endpoint. This is the most common method.
file_url: A publicly accessible URL that points to the file you want to parse. The API downloads the file from this URL. Redirects are supported, but both the original URL and the target in the Location header must point to a publicly accessible file.
raw_text: Raw text content, for structured extraction from non-file sources such as emails, HTML, CSV, or XML.
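
The three input options can be sketched as a small Python helper. This is a minimal sketch, not part of any Tensorlake SDK: `build_parse_request` is a hypothetical name, the URL and headers are copied from the cURL example, and the rule that exactly one of the three fields must be set is an assumption drawn from the "one of three ways" wording.

```python
import json
import urllib.request

# Base URL taken from the cURL example on this page.
API_BASE = "https://api.tensorlake.ai/documents/v2"


def build_parse_request(dataset_id, token, *, file_id=None, file_url=None,
                        raw_text=None, **options):
    """Build (but do not send) a parse request for a dataset.

    Hypothetical helper. Assumes exactly one of file_id, file_url,
    or raw_text should be supplied per request.
    """
    sources = {"file_id": file_id, "file_url": file_url, "raw_text": raw_text}
    provided = {k: v for k, v in sources.items() if v is not None}
    if len(provided) != 1:
        raise ValueError("provide exactly one of file_id, file_url, raw_text")

    # Extra keyword arguments (for example page_range or labels) are
    # passed through into the JSON body unchanged.
    body = {**provided, **options}
    return urllib.request.Request(
        url=f"{API_BASE}/datasets/{dataset_id}/parse",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually submit the job, the returned object could be passed to `urllib.request.urlopen`; building the request separately keeps the source-selection check testable without network access.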

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

dataset_id (string, required)

The ID of the dataset whose configuration is used to parse the document.

Body

application/json

Response

200 (application/json)

Dataset file parsed successfully. The response is of type object.
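
As an illustration of handling that object, the sketch below reads the sample response shown with the cURL example. It assumes the object carries the `parse_id` and `created_at` fields from that sample and nothing else is relied on; `ParseJob` and `parse_job_from_json` are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ParseJob:  # hypothetical container, not part of any SDK
    parse_id: str
    created_at: datetime


def parse_job_from_json(payload: str) -> ParseJob:
    """Convert the JSON response body into a ParseJob."""
    data = json.loads(payload)
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
    return ParseJob(parse_id=data["parse_id"], created_at=created)


# Sample body copied from the example response on this page.
sample = '{"parse_id": "parse_id-12345", "created_at": "2023-10-01T12:00:00Z"}'
job = parse_job_from_json(sample)
```

Keeping the `parse_id` is the useful part: it identifies the parse job created by this request.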
