- Convert the document to Markdown for feeding into an LLM.
- Extract structured data from the document, as specified by a JSON schema.
Prerequisites
- A Tensorlake API key
- [Optional] Tensorlake SDK for Python
Convert to Markdown
- Python SDK
- REST API
1
Install the SDK
2
Set your API key
Export the variable. The SDK reads your API key from the TENSORLAKE_API_KEY environment variable.
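Before calling the SDK, it can help to confirm that the variable is actually visible to your Python process. The following is a minimal sketch using only the standard library; `get_api_key` is a hypothetical helper name and is not part of the Tensorlake SDK.

```python
import os


def get_api_key(env=os.environ):
    """Return the API key from TENSORLAKE_API_KEY, or raise a clear error.

    Hypothetical helper for illustration; the SDK itself reads the same variable.
    """
    key = env.get("TENSORLAKE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "TENSORLAKE_API_KEY is not set; export it before running quickstart.py"
        )
    return key
```

Calling `get_api_key()` at the top of quickstart.py fails fast with a readable message instead of a later authentication error.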
3
Parse a document
quickstart.py
4
Wait for the job to complete
quickstart.py
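Waiting for a parse job usually comes down to polling its status until it leaves a pending state. The sketch below shows that pattern in general terms. It does not use the Tensorlake SDK: `fetch_status` is a hypothetical callable you would supply, and the status strings `"pending"` and `"processing"` are assumptions, so check the SDK reference for the real names.

```python
import time


def wait_until_done(fetch_status, interval_s=2.0, timeout_s=300.0,
                    pending=("pending", "processing")):
    """Poll fetch_status() until it returns a non-pending status.

    fetch_status is a hypothetical zero-argument callable that returns the
    job's current status as a string. The pending status names are assumed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status not in pending:
            return status  # job reached a final state (success or failure)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} seconds")
        time.sleep(interval_s)
```

A fixed interval keeps the sketch short; for long-running jobs you may prefer to increase the interval between polls.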
5
Use the results
quickstart.py
Output
When parsing is complete, you will see the Markdown chunks:
- markdown_chunks.md
Markdown Chunks
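To feed the result into an LLM, you will typically join the chunks into a single Markdown string. Here is a minimal sketch, assuming you have already pulled the chunk text out of the parse result as a list of strings; `write_markdown_chunks` is a hypothetical helper, and the shape of the real result object is not shown here.

```python
from pathlib import Path


def write_markdown_chunks(chunks, path="markdown_chunks.md"):
    """Join chunk strings with blank lines and write them to a Markdown file.

    chunks is assumed to be a list of Markdown strings; empty chunks are skipped.
    Returns the joined text so it can be passed straight to an LLM prompt.
    """
    text = "\n\n".join(chunk.strip() for chunk in chunks if chunk.strip())
    Path(path).write_text(text + "\n", encoding="utf-8")
    return text
```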
Extract Structured Data
- Python SDK
- REST API
1
Parse a document
quickstart.py
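Structured extraction is driven by a JSON schema that describes the fields you want back. As a rough illustration, a schema can be written as a plain Python dict and serialized with the standard library. The invoice fields below are hypothetical, and how the schema is passed to the SDK is not shown here.

```python
import json

# Hypothetical schema for illustration; replace the fields with the ones
# your own documents contain.
INVOICE_SCHEMA = {
    "title": "Invoice",
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string", "description": "ISO 8601 date"},
        "total_amount": {"type": "number"},
    },
    "required": ["invoice_number", "total_amount"],
}

# Serialize to a JSON string, e.g. for a REST request body or a schema file.
schema_json = json.dumps(INVOICE_SCHEMA, indent=2)
```

Short field descriptions, like the one on `issue_date`, give the extractor a hint about the expected format.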
2
Wait for the job to complete
quickstart.py
3
Use the results
quickstart.py
Output
When parsing is complete, you will see the structured data in the console.
- structured_data.json
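Once you have the structured data, a quick sanity check is to confirm that every field the schema marks as required actually came back. The sketch below uses only the standard library. The schema and result are hypothetical stand-ins, and `missing_required` is an illustrative helper, not an SDK function.

```python
import json


def missing_required(data, schema):
    """Return the schema's required field names that are absent or null in data."""
    return [name for name in schema.get("required", []) if data.get(name) is None]


# Hypothetical schema and extraction result, for illustration only.
schema = {"type": "object", "required": ["invoice_number", "total_amount"]}
result = json.loads('{"invoice_number": "INV-001", "total_amount": null}')

print(missing_required(result, schema))  # prints ['total_amount']
```

This only checks for presence. For full validation of types and nested objects, use a dedicated JSON Schema validator.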