# Retrieve Dataset Data

> Retrieve the data stored in a Tensorlake Dataset.

A Tensorlake Dataset is a collection of parse results for documents that were processed using the options defined by the Dataset.
You can retrieve the parsed result data stored in a Dataset using the `/datasets/{dataset_id}/data` endpoint.

<CodeGroup>
  ```python Python theme={null}
  from tensorlake.documentai.client import (
    DocumentAI,
    ParseStatus,
  )

  doc_ai = DocumentAI(api_key="YOUR_TENSORLAKE_API_KEY")

  # If you don't know your dataset ID, you can go to https://cloud.tensorlake.ai
  # and find it in the Datasets section.
  dataset = doc_ai.get_dataset("your_dataset_id")

  dataset_data = doc_ai.get_dataset_data(dataset)

  for parsed_result in dataset_data.items:
      print(f"Parse ID: {parsed_result.parse_id}")

      if parsed_result.status == ParseStatus.SUCCESSFUL:
          print("Parsed Document:")
          print(parsed_result.document)
      else:
          print(f"Parse Status: {parsed_result.status}")
  ```

  ```bash curl theme={null}
  curl --request GET \
    --url https://api.tensorlake.ai/documents/v2/datasets/{dataset_id}/data \
    --header "Authorization: Bearer ${TENSORLAKE_API_KEY}" \
    --header 'Content-Type: application/json'
  ```
</CodeGroup>

Dataset data is returned as a paginated list of results from [parse jobs initiated via the Dataset](/document-ingestion/datasets/create).

Each item in the list follows the same structure as the [parse results from regular parse jobs](/document-ingestion/parsing/read#understand-the-parsing-output).

Both the API and the Python SDK use cursor-based pagination to retrieve the Dataset data. The response will include
a `next_cursor` field that you can use to retrieve the next page of results.
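
The cursor loop can be sketched independently of any particular client. The example below is a minimal sketch, not the SDK's implementation: `iter_dataset_items` and `fetch_page` are hypothetical names, and it assumes each page is a dictionary with an `items` list and a `next_cursor` value that is empty or missing on the last page. How the cursor is sent back to the server (for example, as a query parameter) is left to `fetch_page`, since that detail is covered by the API reference rather than this page.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Assumed page shape: {"items": [...], "next_cursor": "<opaque string>" or None}
Page = Dict[str, Any]


def iter_dataset_items(
    fetch_page: Callable[[Optional[str]], Page],
) -> Iterator[Any]:
    """Yield every item across all pages of a cursor-paginated listing.

    `fetch_page` is a hypothetical callable supplied by the caller: it takes
    the cursor for the page to fetch (None for the first page) and returns
    that page as a dictionary.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        # Hand back the items of the current page one at a time.
        yield from page.get("items", [])
        # A missing or empty next_cursor is treated as "no more pages".
        cursor = page.get("next_cursor")
        if not cursor:
            break
```

In practice, `fetch_page` would wrap whichever client you use to call the endpoint. Keeping the loop separate from the transport makes it easy to test against in-memory pages.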

### Filtering Dataset Data

The [`/datasets/{dataset_id}/data`](/api-reference/v2/datasets/data) endpoint supports filtering the Dataset data by various parameters. You can filter by:

* `status`: Filter by the status of the parse job (e.g., `Pending`, `Processing`, `Successful`, `Failure`).
* `file_name`: Filter by the name of the file that was parsed. This may not be available if the parsed input was not a file uploaded to Tensorlake (e.g., if you used a `file_url` or `raw_text`).
* `created_after`: Return only parse jobs created on or after this date (inclusive). The date must be in RFC 3339 format (e.g., `2023-10-01T00:00:00Z`).
* `created_before`: Return only parse jobs created on or before this date (inclusive). The date must be in RFC 3339 format (e.g., `2023-10-01T00:00:00Z`).
* `finished_after`: Return only parse jobs that finished on or after this date (inclusive). The date must be in RFC 3339 format (e.g., `2023-10-01T00:00:00Z`).
* `finished_before`: Return only parse jobs that finished on or before this date (inclusive). The date must be in RFC 3339 format (e.g., `2023-10-01T00:00:00Z`).
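
As a rough illustration of combining these filters, the sketch below builds a request URL using only the Python standard library. It assumes the filters are passed as URL query parameters on the GET request, which this page does not state explicitly, so check the API reference before relying on it. `build_dataset_data_url` is a hypothetical helper, not part of the Tensorlake SDK, and the base URL is taken from the curl example above.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Base URL taken from the curl example on this page.
BASE_URL = "https://api.tensorlake.ai/documents/v2"

# Filter names listed in this section.
ALLOWED_FILTERS = {
    "status",
    "file_name",
    "created_after",
    "created_before",
    "finished_after",
    "finished_before",
}


def build_dataset_data_url(dataset_id: str, **filters) -> str:
    """Build the dataset data URL with optional filters (hypothetical helper).

    Assumption: filters are sent as query parameters.
    """
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"Unsupported filters: {sorted(unknown)}")

    params = {}
    for name, value in filters.items():
        if value is None:
            continue  # Skip filters that were not set.
        if isinstance(value, datetime):
            # Treat naive datetimes as UTC, then format as RFC 3339.
            if value.tzinfo is None:
                value = value.replace(tzinfo=timezone.utc)
            value = value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        params[name] = value

    query = urlencode(params)
    url = f"{BASE_URL}/datasets/{dataset_id}/data"
    return f"{url}?{query}" if query else url


# Example: successful parse jobs created on or after 1 October 2023 (UTC).
example_url = build_dataset_data_url(
    "your_dataset_id",
    status="Successful",
    created_after=datetime(2023, 10, 1, tzinfo=timezone.utc),
)
```

Rejecting unknown filter names up front keeps typos such as `create_after` from silently producing an unfiltered request.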
