# Summarization

> Summarize Tables, Figures and Charts in Documents

The Document Ingestion API can summarize the tables, figures, and charts in a document.

| Parameter                     | Description                                                                                                                                                                                                                 | Default Value |
| ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- |
| `table_summarization`         | Enable summarization of tables present in the document. This will generate a summary of the table content, including key insights and trends.                                                                               | `false`       |
| `figure_summarization`        | Enable summarization of figures present in the document. This will generate a summary of the figure content, including key insights and trends.                                                                             | `false`       |
| `table_summarization_prompt`  | A custom prompt to use for table summarization. This can be used to provide additional context or instructions to the AI model for summarizing tables in the document. If not specified, the default prompt will be used.   | -             |
| `figure_summarization_prompt` | A custom prompt to use for figure summarization. This can be used to provide additional context or instructions to the AI model for summarizing figures in the document. If not specified, the default prompt will be used. | -             |
| `chart_extraction`            | Enable extraction of the chart type and structured data series from images. The result is delivered as clean JSON suitable for analytics and ingestion.                                                                    | `false`       |

#### Why would you want to summarize tables, figures and charts?

* Even though LLMs have long context windows, embedding models often don't. In such cases, summarizing tables, embedding
  the summaries, and storing each table's image alongside its summary can help retrieve the right table or figure when
  the LLM needs it to answer questions.

* Figures often encode complex information that can't be converted to Markdown or HTML. Summarizing and indexing them
  can help retrieve the right figure when relevant questions are asked.
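
The retrieval pattern described above can be sketched in a few lines. This is a minimal illustration, not part of the Tensorlake SDK: `SummaryRecord`, `embed`, `cosine`, and `retrieve` are hypothetical names, the summaries and image paths are made up, and the word-count `embed` function is only a toy stand-in for a real embedding model.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class SummaryRecord:
    summary: str     # text produced by table or figure summarization
    image_path: str  # hypothetical location of the stored table/figure image


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, records: list[SummaryRecord]) -> SummaryRecord:
    # Return the record whose summary is most similar to the query.
    q = embed(query)
    return max(records, key=lambda r: cosine(q, embed(r.summary)))


# Made-up summaries, for illustration only.
records = [
    SummaryRecord(
        "Quarterly revenue by region, with growth highest in Europe.",
        "images/table_1.png",
    ),
    SummaryRecord(
        "Bar chart of employee headcount per department.",
        "images/figure_2.png",
    ),
]

best = retrieve("Which region had the highest revenue growth?", records)
print(best.image_path)
```

Because the summary is what gets matched and the image is what gets returned, the LLM can be handed the original table or figure image even when the embedding model could never have encoded it directly.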

## Summarizing Tables

Tables can be summarized by setting `table_summarization` to `true` in the `enrichment_options` JSON object when calling the `parse` API.

<CodeGroup>
  ```json JSON Request theme={null}
  {
      "enrichment_options": {
          "table_summarization": true,
          "table_summarization_prompt": "Summarize the table in a way that is easy to understand and use for answering questions."
      }
  }
  ```

  ```python Python SDK theme={null}
  from tensorlake.documentai import DocumentAI
  from tensorlake.documentai.models.options import (
      EnrichmentOptions,
  )

  enrichment_options = EnrichmentOptions(
      table_summarization=True,
      table_summarization_prompt="Summarize the table in a concise manner.",
  )

  doc_ai = DocumentAI(api_key=API_KEY)

  parse_id = doc_ai.read(
      file_id="file_XXX",  # Replace with your file ID or URL
      enrichment_options=enrichment_options,
  )
  ```
</CodeGroup>

The table summary prompt is optional. If not provided, a default prompt will be used.

## Summarizing Figures and Charts

Figures can be summarized by setting `figure_summarization` to `true` in the `enrichment_options` JSON object when calling the `parse` API.

<CodeGroup>
  ```json JSON Request theme={null}
  {
      "enrichment_options": {
          "figure_summarization": true,
          "figure_summarization_prompt": "Summarize the figure in a way that is easy to understand and use for answering questions."
      }
  }
  ```

  ```python Python SDK theme={null}
  from tensorlake.documentai import (
      DocumentAI,
      EnrichmentOptions,
  )

  doc_ai = DocumentAI(api_key=API_KEY)

  enrichment_options = EnrichmentOptions(
      figure_summarization=True,
      figure_summarization_prompt="Summarize the figure in a way that is easy to understand and use for answering questions.",
  )

  parse_id = doc_ai.read(
      file_id="file_XXX",  # Replace with your file ID or URL
      enrichment_options=enrichment_options,
  )
  ```
</CodeGroup>

The figure summary prompt is optional. If not provided, a default prompt will be used.

### Charts

Structured information about charts can be extracted by setting `chart_extraction` to `true` in the `enrichment_options` JSON object when calling the `parse` API.

<CodeGroup>
  ```python Python SDK theme={null}
  from tensorlake.documentai import DocumentAI
  from tensorlake.documentai.models.options import (
      EnrichmentOptions,
  )

  enrichment_options = EnrichmentOptions(
      chart_extraction=True,
  )

  doc_ai = DocumentAI(api_key=API_KEY)

  parse_id = doc_ai.read(
      file_id="file_XXX",  # Replace with your file ID or URL
      enrichment_options=enrichment_options,
  )
  ```

  ```json REST API theme={null}
  {
      "enrichment_options": {
          "chart_extraction": true
      }
  }
  ```
</CodeGroup>
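
If you build request bodies by hand rather than through the SDK, a small helper can keep the option names and defaults in one place. The sketch below is an assumption-laden illustration: `EnrichmentRequest` is a hypothetical class, not part of the Tensorlake SDK. Its field names and defaults are taken from the parameter table on this page, and it assumes the options can be combined in a single request, which this page does not state explicitly.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnrichmentRequest:
    # Hypothetical helper, not part of the Tensorlake SDK. Field names and
    # defaults mirror the parameter table on this page.
    table_summarization: bool = False
    figure_summarization: bool = False
    table_summarization_prompt: Optional[str] = None
    figure_summarization_prompt: Optional[str] = None
    chart_extraction: bool = False

    def to_payload(self) -> dict:
        options = {
            "table_summarization": self.table_summarization,
            "figure_summarization": self.figure_summarization,
            "chart_extraction": self.chart_extraction,
        }
        # Include a custom prompt only when its feature is enabled and a
        # prompt was supplied; otherwise the default prompt applies.
        if self.table_summarization and self.table_summarization_prompt:
            options["table_summarization_prompt"] = self.table_summarization_prompt
        if self.figure_summarization and self.figure_summarization_prompt:
            options["figure_summarization_prompt"] = self.figure_summarization_prompt
        return {"enrichment_options": options}


payload = EnrichmentRequest(table_summarization=True, chart_extraction=True).to_payload()

# json.dumps always emits strictly valid JSON, so the body never ends up
# with a trailing comma the way a hand-edited request can.
body = json.dumps(payload, indent=4)
print(body)
```

Serializing with `json.dumps` instead of writing the body by hand avoids syntax slips in the request, and leaving the prompt keys out entirely lets the default prompts take effect.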
