The Document Ingestion API can detect signatures in documents in two ways:

  1. Get the bounding boxes of detected signatures in the document.
  2. Get the context of the signatures detected in the document.

Getting Bounding Boxes of Signatures

To get the bounding boxes of signatures, set signature_detection to true in the parse_options JSON object when calling the parse API. In the Python SDK, the same flag is the signature_detection argument of ParsingOptions:

from tensorlake.documentai import DocumentAI
from tensorlake.documentai.models.options import ParsingOptions

doc_ai = DocumentAI(api_key="YOUR_API_KEY")

parsing_options = ParsingOptions(
    signature_detection=True,
)

parse_id = doc_ai.parse(
    file="tensorlake-XXX",  # Replace with your file ID or URL
    parsing_options=parsing_options,
)

Response

The bounding boxes of signatures are included in the Document object returned by the parse API. This JSON object contains every object detected in the document, such as tables, figures, charts, and signatures.

# parse_id is the ID returned by doc_ai.parse(...) in the request above
results = doc_ai.get_job(parse_id)
# There is a signature on page 10 of this document
# results.outputs.document.pages[10].page_fragments[0]
# PageFragment(fragment_type=<PageFragmentType.SIGNATURE: 'signature'>, content=Text(content='Signature detected'), reading_order=-1, page_number=None, bbox={'x1': 79.0, 'x2': 200.0, 'y1': 812.0, 'y2': 855.0})
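
Once the fragments are available, the signature boxes can be collected with a simple filter. The sketch below is illustrative only: Fragment and signature_boxes are hypothetical stand-ins (not SDK classes) so that the example runs without an API key, and the sample data just mirrors the fragment printed above. With the real SDK objects you would likely compare fragment_type against PageFragmentType.SIGNATURE instead of the plain string.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


# Hypothetical stand-in for the SDK's page fragment (assumption, not the SDK class).
@dataclass
class Fragment:
    fragment_type: str
    bbox: Optional[Dict[str, float]]


def signature_boxes(pages: Dict[int, List[Fragment]]) -> List[dict]:
    """Return one record per signature fragment, with its page and box size."""
    found = []
    for page_number, fragments in sorted(pages.items()):
        for fragment in fragments:
            # Skip non-signature fragments and fragments without a box.
            if fragment.fragment_type != "signature" or fragment.bbox is None:
                continue
            box = fragment.bbox
            found.append(
                {
                    "page": page_number,
                    "bbox": box,
                    "width": box["x2"] - box["x1"],
                    "height": box["y2"] - box["y1"],
                }
            )
    return found


# Sample data shaped like the response above (page 10 holds one signature).
pages = {
    10: [
        Fragment("text", {"x1": 50.0, "x2": 500.0, "y1": 100.0, "y2": 130.0}),
        Fragment("signature", {"x1": 79.0, "x2": 200.0, "y1": 812.0, "y2": 855.0}),
    ]
}

boxes = signature_boxes(pages)
print(boxes)
```

The width and height are derived from the x1/x2 and y1/y2 corners, which is handy when cropping the signature region from a rendered page image.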

Getting Context of Signatures

The context of signatures can be extracted with the Structured Extraction API. You specify a schema that captures the context, such as whether the signee has signed and the name of the person signing.

A sample schema for signature context is shown below:

from typing import List, Optional
from pydantic import BaseModel, Field

from tensorlake.documentai import DocumentAI
from tensorlake.documentai.models.options import (
    StructuredExtractionOptions,
    ParsingOptions
)

class Signature(BaseModel):
    has_signed: Optional[str] = Field(
        None, description="Whether the signee has signed"
    )
    name_signee: Optional[str] = Field(None, description="Name of the signee")

class Signatures(BaseModel):
    signatures: List[Signature]

signatures_extraction = StructuredExtractionOptions(
    schema_name="signatures",
    json_schema=Signatures
)

parsing_options = ParsingOptions(
    signature_detection=True
)

doc_ai = DocumentAI(api_key="YOUR_API_KEY")
parse_id = doc_ai.parse(
    file="tensorlake-XXX",  # Replace with your file ID or URL
    parsing_options=parsing_options,
    structured_extraction_options=[signatures_extraction],
)

results = doc_ai.wait_for_completion(parse_id)

print(results)
# {
#   "signatures": [
#     {
#       "has_signed": "Yes",
#       "name_signee": "John Doe"
#     }
#   ]
# }
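
A common follow-up is to flag signature blocks that are still unsigned. The sketch below assumes the structured output has the JSON shape shown above; unsigned_signees is a hypothetical helper, the sample payload is made up for illustration, and treating only "yes" or "true" as signed is an assumption about how has_signed is populated.

```python
import json
from typing import List


def unsigned_signees(payload: str) -> List[str]:
    """Return the names of signees whose has_signed value is not affirmative."""
    data = json.loads(payload)
    missing = []
    for entry in data.get("signatures", []):
        # has_signed is optional in the schema, so treat None as "not signed".
        has_signed = (entry.get("has_signed") or "").strip().lower()
        if has_signed not in {"yes", "true"}:
            missing.append(entry.get("name_signee") or "<unknown>")
    return missing


# Made-up payload in the same shape as the extraction output above.
sample = """
{
  "signatures": [
    {"has_signed": "Yes", "name_signee": "John Doe"},
    {"has_signed": "No", "name_signee": "Jane Roe"},
    {"has_signed": null, "name_signee": null}
  ]
}
"""

pending = unsigned_signees(sample)
print(pending)
```

Because has_signed is a free-form string in this schema, normalizing case and whitespace before comparing keeps the check tolerant of values like "yes" or " Yes ".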