List Dataset Parse Jobs

cURL

curl --request GET \
  --url https://api.tensorlake.ai/documents/v2/datasets/{dataset_id}/data \
  --header 'Authorization: Bearer <token>'

{
  "items": [
    {
      "parse_id": "parse_abcd1234",
      "dataset_id": null,
      "parsed_pages_count": 5,
      "status": "pending",
      "error": null,
      "pages": null,
      "chunks": [],
      "structured_data": null,
      "page_classes": null,
      "created_at": "",
      "finished_at": null,
      "labels": {}
    }
  ],
  "has_more": true,
  "next_cursor": "<string>",
  "prev_cursor": "<string>"
}

GET /documents/v2/datasets/{dataset_id}/data

List all parse jobs associated with a specific dataset. The endpoint returns the status and metadata of each parse job submitted under that dataset.

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
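
As a minimal sketch of the header format described above, the following Python snippet builds (but does not send) a request using only the standard library. The helper name `build_list_request`, the `BASE_URL` constant, and the placeholder dataset ID and token are illustrative assumptions, not part of the API.

```python
import urllib.request

# Base URL taken from the cURL example; the constant name is an assumption.
BASE_URL = "https://api.tensorlake.ai/documents/v2"


def build_list_request(dataset_id: str, token: str) -> urllib.request.Request:
    """Build a GET request for a dataset's parse jobs (hypothetical helper)."""
    url = f"{BASE_URL}/datasets/{dataset_id}/data"
    # The Authorization header carries the token in the form "Bearer <token>".
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder values; substitute a real dataset ID and auth token.
req = build_list_request("my_dataset_id", "my_token")
```

Passing `req` to `urllib.request.urlopen` would send the request; that step is left out here so the sketch runs without network access.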

Path Parameters

dataset_id

string

required

The ID of the dataset to retrieve data for.

Query Parameters

cursor

string | null

Optional cursor for pagination.

This is a base64-encoded string representing a timestamp. It is used to paginate through the results.

direction

enum<string>

The direction of pagination.

This can be either next or prev.

The default is next, which means the next page of results will be returned.

Available options: next, prev

limit

integer

The maximum number of results to return per page.

The default is 100.

Required range: x >= 0
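
The cursor, direction, and limit parameters can be combined into a paging loop. The sketch below assumes that the `next_cursor` value from one response is passed back as `cursor` on the following request, and that `has_more` being false marks the last page; this page does not spell out either behavior. `fetch_page` is a hypothetical callable that takes a cursor (or `None` for the first page) and returns the decoded JSON response, so the loop can be exercised without network access.

```python
from typing import Callable, Iterator, Optional


def iter_parse_jobs(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[dict]:
    """Yield every item across pages by following next_cursor (assumed flow)."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        # Stop when the server reports no further pages or returns no cursor.
        if not page.get("has_more") or not cursor:
            break


# Stand-in for the HTTP call: two canned pages keyed by the cursor that requests them.
_fake_pages = {
    None: {"items": [{"parse_id": "parse_1"}], "has_more": True, "next_cursor": "c1"},
    "c1": {"items": [{"parse_id": "parse_2"}], "has_more": False, "next_cursor": None},
}

job_ids = [job["parse_id"] for job in iter_parse_jobs(_fake_pages.__getitem__)]
```

In real use, `fetch_page` would issue the GET request with the `cursor` query parameter set and return the parsed body.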

status

enum<string> | null

Optional. Filters the results by the status of the parse operation.

The possible values are pending, processing, successful, and failure.

Available options: pending, processing, successful, failure
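
To illustrate how the status filter fits with the other query parameters, here is a small sketch that validates the status value against the options listed above and builds a query string. The function name `build_query` and the choice to validate on the client side are assumptions made for this example.

```python
from typing import Optional
from urllib.parse import urlencode

# Status values listed under "Available options" for the status parameter.
VALID_STATUSES = {"pending", "processing", "successful", "failure"}


def build_query(
    status: Optional[str] = None,
    limit: int = 100,
    direction: str = "next",
    cursor: Optional[str] = None,
) -> str:
    """Build the query string for the list endpoint (hypothetical helper)."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    params = {"direction": direction, "limit": limit, "status": status, "cursor": cursor}
    # Drop parameters left as None so they are omitted from the URL.
    return urlencode({k: v for k, v in params.items() if v is not None})


query = build_query(status="failure", limit=25)
```

The resulting string can be appended after a `?` to the endpoint URL.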

parse_id

string | null

Optional. Filters the results by the ID of the parse operation.

To get the details of a specific parse operation, prefer the /documents/v2/parse/{parse_id} endpoint over filtering by parse_id.

file_name

string | null

Optional. Filters the results by the name of the file associated with the parse operation.

created_after

string | null

The date and time after which the parse operation was created.

The date should be in RFC3339 format.

created_before

string | null

The date and time before which the parse operation was created.

The date should be in RFC3339 format.

finished_after

string | null

The date and time after which the parse operation was finished.

The date should be in RFC3339 format.

finished_before

string | null

The date and time before which the parse operation was finished.

The date should be in RFC3339 format.
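
The four date filters all expect RFC3339 timestamps. The sketch below shows one way to produce them from Python `datetime` objects and URL-encode them. The helper name `to_rfc3339` is an assumption, the dates are arbitrary examples, and normalizing to UTC with a `Z` suffix is one valid RFC3339 form rather than a stated requirement of the API.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_rfc3339(dt: datetime) -> str:
    """Format a timezone-aware datetime as an RFC3339 string in UTC."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Arbitrary example window: jobs created during January 2025 (UTC).
window = {
    "created_after": to_rfc3339(datetime(2025, 1, 1, tzinfo=timezone.utc)),
    "created_before": to_rfc3339(datetime(2025, 2, 1, tzinfo=timezone.utc)),
}

# urlencode percent-encodes the colons in the timestamps.
query = urlencode(window)
```

The same helper applies to finished_after and finished_before.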

Response

200

application/json

List of dataset jobs retrieved successfully

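The response is an object containing an items array of parse jobs and the pagination fields has_more, next_cursor, and prev_cursor.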
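
As a rough illustration of consuming the response, the following sketch decodes a body shaped like the example response on this page and counts jobs by status. The trimmed sample body and the variable names are assumptions for the example; real responses carry the additional fields shown in the sample.

```python
import json
from collections import Counter

# Trimmed copy of the example response body; "<string>" is a placeholder cursor.
body = """
{
  "items": [
    {"parse_id": "parse_abcd1234", "status": "pending", "error": null, "labels": {}}
  ],
  "has_more": true,
  "next_cursor": "<string>",
  "prev_cursor": "<string>"
}
"""

response = json.loads(body)

# Tally jobs per status value.
status_counts = Counter(item["status"] for item in response["items"])

# Collect IDs of jobs that reported an error (none in this sample).
failed_ids = [item["parse_id"] for item in response["items"] if item["error"]]
```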
