GET /documents/v2/datasets/{dataset_id}
cURL
curl --request GET \
  --url https://api.tensorlake.ai/documents/v2/datasets/{dataset_id} \
  --header 'Authorization: Bearer <token>'
{
  "name": "Invoices Dataset",
  "dataset_id": "dataset_12345",
  "status": "idle",
  "created_at": "2023-10-01T12:00:00Z",
  "updated_at": "2023-10-01T12:00:00Z",
  "description": "This dataset contains invoices for the year 2023.",
  "analytics": "<unknown>"
}
Get the details of a specific dataset associated with your project. This endpoint returns the dataset's ID, name, description, status, and timestamps. To modify the dataset's settings, use the Update Dataset endpoint. The properties of the dataset include:
  • name: The name given to the dataset.
  • dataset_id: The unique identifier for the dataset.
  • description: A brief description of the dataset if provided.
  • status: The current status of the dataset (e.g., idle, processing). If the dataset has at least one parse job in the processing or pending state, the dataset status will be processing.
  • created_at: The timestamp when the dataset was created, formatted as an RFC 3339 string.
  • updated_at: The timestamp when the dataset was last updated, formatted as an RFC 3339 string.
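
The status rule described above can be sketched in a few lines. This is only an illustration of the documented behavior, not the service's implementation: the function name `derive_dataset_status` and the idea of passing a list of per-job state strings are assumptions made for the example, since the API returns only the final `status` value.

```python
# States that keep a dataset in "processing", as named in the status description.
ACTIVE_JOB_STATES = {"processing", "pending"}


def derive_dataset_status(job_states):
    """Return "processing" if any parse job is active, otherwise "idle".

    `job_states` is a hypothetical list of per-job state strings used only
    to illustrate the rule; it is not a field returned by the API.
    """
    if any(state in ACTIVE_JOB_STATES for state in job_states):
        return "processing"
    return "idle"


print(derive_dataset_status(["successful", "pending"]))  # processing
print(derive_dataset_status(["successful", "error"]))    # idle
print(derive_dataset_status([]))                         # idle
```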

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

dataset_id
string
required

The ID of the dataset to retrieve

Query Parameters

include_analytics
boolean

Retrieve the dataset analytics.

When set to true, the response includes the dataset's analytics data, including:

  • Number of running parsing jobs
  • Number of completed parsing jobs
  • Number of failed parsing jobs
  • Number of pending parsing jobs

Defaults to false.
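
As a rough sketch of how the path parameter, query parameter, and authorization header fit together, the request could be assembled with Python's standard library as follows. The helper name `build_get_dataset_request` is hypothetical, and serializing the boolean as the literal string `true` is an assumption; the base URL and header format come from the cURL sample.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Base URL taken from the cURL sample for this endpoint.
BASE_URL = "https://api.tensorlake.ai/documents/v2"


def build_get_dataset_request(dataset_id, token, include_analytics=False):
    """Build (but do not send) a GET request for a single dataset."""
    # Percent-encode the ID so it is safe to place in the URL path.
    url = f"{BASE_URL}/datasets/{quote(dataset_id, safe='')}"
    # The parameter defaults to false, so it is only sent when enabled.
    if include_analytics:
        url += "?" + urlencode({"include_analytics": "true"})
    return Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_get_dataset_request("dataset_12345", "<token>", include_analytics=True)
print(request.full_url)
```

To actually send the request, the returned object can be passed to `urllib.request.urlopen` and the response body decoded as JSON.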

Response

Dataset retrieved successfully

name
string
required

The name of the dataset.

This is a human-readable name that identifies the dataset.

Example:

"Invoices Dataset"

dataset_id
string
required

The unique identifier for the dataset.

This identifier is used to refer to the dataset in API endpoints and operations.

This value is automatically generated and is unique within the organization and project context.

Example:

"dataset_12345"

status
enum<string>
required

The current status of the dataset.

This indicates whether the dataset is currently idle or processing.

Available options: idle, processing

created_at
string
required

The date and time when the dataset was created.

The timestamp is in RFC 3339 format (e.g., "2023-10-01T12:00:00Z").

Example:

"2023-10-01T12:00:00Z"

updated_at
string
required

The date and time when the dataset was last updated.

The timestamp is in RFC 3339 format (e.g., "2023-10-01T12:00:00Z").

Example:

"2023-10-01T12:00:00Z"

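Since created_at and updated_at are RFC 3339 strings, a client will usually want to convert them to timezone-aware values. The sketch below shows one way to do that with the standard library; `parse_rfc3339` is a hypothetical helper, and it only covers the common case of a trailing "Z" or an explicit numeric offset, as in the example values above.

```python
from datetime import datetime, timezone


def parse_rfc3339(value):
    """Parse an RFC 3339 timestamp such as "2023-10-01T12:00:00Z"."""
    # Older Python versions of fromisoformat() reject a trailing "Z",
    # so rewrite it as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


created = parse_rfc3339("2023-10-01T12:00:00Z")
print(created.isoformat())  # 2023-10-01T12:00:00+00:00
```
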
description
string | null

An optional description of the dataset.

This description is the one provided during dataset creation or update.

Example:

"This dataset contains invoices for the year 2023."

analytics
object

Understand the status of the dataset and its parse jobs.

This field provides insights into the dataset's processing state, including the number of parse jobs in various states (processing, pending, error, successful).

To include analytics in the response, set the include_analytics query parameter to true.

This is useful for monitoring and analytics purposes.
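
To show how the response fields might be consumed on the client side, here is a minimal sketch that loads the JSON body into a typed object. The `Dataset` class and its `from_json` method are hypothetical names for this example. The required and optional fields follow the response schema above; because the internal structure of `analytics` is not specified here, it is kept as an opaque value.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional

# Status values listed under "Available options" for the status field.
VALID_STATUSES = ("idle", "processing")


@dataclass
class Dataset:
    name: str
    dataset_id: str
    status: str
    created_at: str
    updated_at: str
    description: Optional[str] = None  # may be null or missing
    analytics: Optional[Any] = None    # structure not specified; kept opaque

    @classmethod
    def from_json(cls, payload: str) -> "Dataset":
        data = json.loads(payload)
        if data["status"] not in VALID_STATUSES:
            raise ValueError(f"unexpected status: {data['status']!r}")
        return cls(
            name=data["name"],
            dataset_id=data["dataset_id"],
            status=data["status"],
            created_at=data["created_at"],
            updated_at=data["updated_at"],
            description=data.get("description"),
            analytics=data.get("analytics"),
        )


# Example body based on the sample response, without analytics.
EXAMPLE = """{
  "name": "Invoices Dataset",
  "dataset_id": "dataset_12345",
  "status": "idle",
  "created_at": "2023-10-01T12:00:00Z",
  "updated_at": "2023-10-01T12:00:00Z",
  "description": "This dataset contains invoices for the year 2023."
}"""

dataset = Dataset.from_json(EXAMPLE)
print(dataset.name, dataset.status)  # Invoices Dataset idle
```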