The Document Ingestion API can summarize tables, figures, and charts in documents.

Why would you want to summarize tables, figures and charts?

  • Even though LLMs have long context windows, embedding models often don’t. In such cases, summarizing tables, embedding the summaries, and storing each table’s image alongside its summary can help retrieve the right table or figure when the LLM needs it to answer questions.

  • Figures often encode complex information which can’t be converted to Markdown or HTML. Summarizing and indexing them can help retrieve the right figure when relevant questions are asked.
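
The retrieval pattern described above can be sketched roughly as follows. This is a minimal illustration, not part of the Document Ingestion API: the IndexedItem class, the image paths, and the sample summaries are hypothetical, and the word-count embed function is only a toy stand-in for a real embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class IndexedItem:
    summary: str      # text summary of a table or figure
    image_path: str   # hypothetical location of the stored image


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, items: List[IndexedItem]) -> IndexedItem:
    # Return the item whose summary is most similar to the query.
    q = embed(query)
    return max(items, key=lambda item: cosine(q, embed(item.summary)))


items = [
    IndexedItem("Quarterly revenue by region for 2023", "images/table_1.png"),
    IndexedItem("Bar chart of monthly active users", "images/figure_2.png"),
]
best = retrieve("revenue by region", items)
print(best.image_path)  # images/table_1.png
```

The summary is what gets matched against the question, while the stored image is what would be handed to the LLM once the right item is found.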

Summarizing Tables

Tables can be summarized by setting tableSummarization to true in the settings JSON object when calling the parse API.

{
    "settings": {
        "tableSummarization": true,
        "tableSummarizationPrompt": "Summarize the table in a way that is easy to understand and use for answering questions."
    }
}

The table summary prompt is optional. If not provided, a default prompt will be used.
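
As a small sketch, the settings object above could be built in Python like this. The helper name build_table_settings is hypothetical; only the tableSummarization and tableSummarizationPrompt keys come from this page. How the body is sent to the parse API (URL, authentication, other fields) is not covered here and is left out.

```python
import json
from typing import Optional


def build_table_settings(prompt: Optional[str] = None) -> dict:
    """Build the settings object that turns on table summarization."""
    settings = {"tableSummarization": True}
    # The prompt is optional; omitting it leaves the default prompt in effect.
    if prompt is not None:
        settings["tableSummarizationPrompt"] = prompt
    return {"settings": settings}


# Without a custom prompt, only the flag is set.
default_body = build_table_settings()

# With a custom prompt, both keys appear in the settings object.
custom_body = build_table_settings(
    "Summarize the table in a way that is easy to understand and use for answering questions."
)
print(json.dumps(custom_body, indent=4))
```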

Summarizing Figures and Charts

Figures can be summarized by setting figureSummarization to true in the settings JSON object when calling the parse API.

{
    "settings": {
        "figureSummarization": true,
        "figureSummarizationPrompt": "Summarize the figure in a way that is easy to understand and use for answering questions."
    }
}

The figure summary prompt is optional. If not provided, a default prompt will be used.
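
The sketch below builds a settings object that can enable either kind of summarization. The function build_parse_settings and its parameter names are hypothetical. Setting both flags in a single request is an assumption; this page only shows each flag on its own.

```python
import json
from typing import Optional


def build_parse_settings(
    summarize_tables: bool = False,
    summarize_figures: bool = False,
    table_prompt: Optional[str] = None,
    figure_prompt: Optional[str] = None,
) -> dict:
    """Build a settings object for the parse API from the chosen options."""
    settings: dict = {}
    if summarize_tables:
        settings["tableSummarization"] = True
        # Prompts are optional; omitted prompts fall back to the defaults.
        if table_prompt is not None:
            settings["tableSummarizationPrompt"] = table_prompt
    if summarize_figures:
        settings["figureSummarization"] = True
        if figure_prompt is not None:
            settings["figureSummarizationPrompt"] = figure_prompt
    return {"settings": settings}


figures_only = build_parse_settings(
    summarize_figures=True,
    figure_prompt="Summarize the figure in a way that is easy to understand and use for answering questions.",
)

# Assumption: both flags are accepted together in one request.
both = build_parse_settings(summarize_tables=True, summarize_figures=True)
print(json.dumps(both, indent=4))
```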