Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
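As a minimal sketch of the header format described above, the helper below builds the header as a Python dictionary. The function name `build_auth_headers` is hypothetical. The standard `Authorization` header name is assumed, since the text only gives the `Bearer <token>` value format.

```python
def build_auth_headers(token: str) -> dict[str, str]:
    """Return HTTP headers carrying the auth token in Bearer form."""
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    # Assumption: the standard "Authorization" header name is used.
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of most HTTP clients.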
Body
This object defines the request body for creating a new dataset.
A Dataset is a collection of parsed results from files.
It can be used to store and manage related data, such as invoices, receipts, or any other documents that need to be parsed and analyzed.
Once a dataset is created, you can use it to parse related files using the same configuration and options, allowing for consistent and efficient data extraction.
The name of the dataset.
The name can only contain alphanumeric characters, hyphens, and underscores.
The name must be unique within the organization and project context.
"invoices dataset"
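The character rule for names can be checked on the client before a request is sent. The sketch below is one possible reading of that rule, assuming "alphanumeric" means ASCII letters and digits. The helper name `is_valid_dataset_name` is hypothetical. Uniqueness within the organization and project can only be verified by the API itself.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_dataset_name(name: str) -> bool:
    """Check that a name uses only letters, digits, hyphens, and underscores."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

A check like this only catches character-set problems early; the server remains the authority on whether a name is accepted.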
The properties of this object define the configuration for the document parsing process.
Tensorlake provides sane defaults that work well for most documents, so this object is not required. However, every document is different, and you may want to customize the parsing process to better suit your needs.
The properties of this object define the configuration for structured data extraction.
If this object is present, the API will perform structured data extraction on the document.
The properties of this object define the configuration for page classification.
If this object is present, the API will perform page classification on the document.
The properties of this object extend the output of the document parsing process with additional information.
This includes summaries of tables and figures, which can give a more complete understanding of the document.
This object is not required, and the API will use default settings if it is not present.
A description of the dataset.
This field is optional and can be used to provide additional context about the dataset.
"This dataset contains all invoices from 2023."
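To illustrate how the required name and the optional fields might fit together, the sketch below assembles a JSON request body. The JSON keys `name` and `description` are assumptions, because the field names are not shown in this section. The optional configuration objects are passed through under whatever keys the caller supplies, and the function name `build_create_dataset_body` is hypothetical.

```python
import json
from typing import Any, Optional


def build_create_dataset_body(
    name: str,
    description: Optional[str] = None,
    options: Optional[dict[str, Any]] = None,
) -> str:
    """Serialize a create-dataset request body to JSON."""
    # Optional configuration objects go in first, so they cannot
    # overwrite the name or description set below.
    body: dict[str, Any] = dict(options or {})
    body["name"] = name  # assumed key for the dataset name
    if description is not None:
        body["description"] = description  # assumed key for the description
    return json.dumps(body)
```

Leaving `options` empty corresponds to relying on the default parsing settings described above.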
Response
Dataset created successfully
The human-readable name of the dataset provided during creation.
"invoices dataset"
The unique identifier for the dataset.
This identifier is used to refer to the dataset in API endpoints and operations.
This value is automatically generated and is unique within the organization and project context.
"dataset_12345"
The date and time when the dataset was created.
The date is in RFC 3339 format (e.g., "2023-10-01T12:00:00Z").
"2023-10-01T12:00:00Z"
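A client will usually want the creation time as a timezone-aware value rather than a string. The sketch below parses a response shaped like the examples above. The JSON keys `name`, `dataset_id`, and `created_at` are assumptions, since this section does not show the field names, and the `CreatedDataset` type is hypothetical. The timestamp handling covers the UTC "Z" form shown in the example, not every variant that RFC 3339 allows.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CreatedDataset:
    name: str
    dataset_id: str
    created_at: datetime


def parse_created_dataset(raw: str) -> CreatedDataset:
    """Parse a create-dataset response body into a typed record."""
    payload = json.loads(raw)
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    timestamp = payload["created_at"].replace("Z", "+00:00")
    return CreatedDataset(
        name=payload["name"],  # assumed key
        dataset_id=payload["dataset_id"],  # assumed key
        created_at=datetime.fromisoformat(timestamp),
    )
```

The resulting `created_at` carries its UTC offset, so it can be compared safely with other timezone-aware datetimes.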