Configuration

The server is configured with a YAML file. The easiest way to start is to generate one with the CLI or download a sample configuration file, then adjust it to fit your needs.

Overview

listen_addr: 0.0.0.0:8900
state_store_path: /tmp/indexify/state
blob_storage:
  backend: disk
  disk:
    path: /tmp/indexify/blobs

listen_addr: The address and port on which the server listens. Typically you would want to listen on all interfaces. Default: 0.0.0.0:8900.
state_store_path: Path where the state store is kept. It holds the state of the graph, which is needed to resume the graph from where it left off after a failure. Default: indexify_storage.
blob_storage: Configuration for storing blobs. Blobs are raw bytes of data stored in the system; they are used for storing intermediate data between functions.
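
As a rough illustration of how the three keys above fit together, the following Python sketch renders a minimal configuration with disk blob storage. The function name render_config and the blob path default are assumptions made for this example, not part of the server or its CLI; the listen_addr and state_store_path defaults are the ones documented above.

```python
def render_config(listen_addr="0.0.0.0:8900",
                  state_store_path="indexify_storage",
                  blob_path="/tmp/indexify/blobs"):
    """Return the text of a minimal server config using disk blob storage.

    Hypothetical helper for illustration only. Values are written unquoted,
    so this assumes they contain no characters that are special in YAML.
    """
    # Basic sanity check: listen_addr is expected to look like host:port.
    host, _, port = listen_addr.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"listen_addr must look like host:port, got {listen_addr!r}")
    return (
        f"listen_addr: {listen_addr}\n"
        f"state_store_path: {state_store_path}\n"
        "blob_storage:\n"
        "  backend: disk\n"
        "  disk:\n"
        f"    path: {blob_path}\n"
    )
```

Calling render_config() with no arguments produces a file equivalent to the sample above, except that state_store_path falls back to its documented default.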

Blob Storage Configuration

Blob storage is used to store the output of functions. Two forms of blob storage are currently supported: Disk and S3 Storage.

Disk

blob_storage:
  backend: disk
  disk:
    path: /tmp/indexify-blob-storage

S3 Storage

For S3 Storage, you’ll also need to make sure the following three environment variables are configured. Once they are set, our S3 integration will take care of the rest.

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_REGION

blob_storage:
  backend: s3
  s3:
    path: "s3://my-bucket/"

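You can create the bucket using the following command (for regions other than us-east-1, the AWS CLI also requires --create-bucket-configuration LocationConstraint=<region>):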

aws s3api create-bucket --bucket my-bucket --region us-east-1
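
Before starting the server with the s3 backend, it can help to confirm that the three environment variables listed above are actually set. The sketch below is one way to do that in Python; missing_aws_vars is a hypothetical helper written for this example and is not part of the server.

```python
import os

# The three variables named in the S3 Storage section above.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def missing_aws_vars(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration. Pass a dict to check something
    other than the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]
```

An empty list means all three variables are present; otherwise the returned names tell you which ones to export before launching the server.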

Executor health checks

You can fetch the current Executor health status by issuing an HTTP GET request to its monitoring endpoint, e.g.:

curl localhost:7000/monitoring/health
{"status": "nok", "message": "A Function Executor health check failed", "checker": "GenericHealthChecker"}

An HTTP status of 200 means that the Executor is healthy; 503 means that a health check failed. We recommend automatically restarting the Executor when its health checks fail, because this helps mitigate recurring bugs in functions, OS drivers, and so on. You can specify a different monitoring server host and port for an Executor using its CLI arguments --monitoring-server-host and --monitoring-server-port.
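
A supervisor script could poll the endpoint and act on the status code. The following is a minimal sketch using only the Python standard library; check_executor_health is a hypothetical name, the default URL is taken from the curl example above, and treating an unreachable Executor as unhealthy is an assumption of this sketch rather than documented behavior.

```python
import json
import urllib.error
import urllib.request


def check_executor_health(url="http://localhost:7000/monitoring/health", timeout=5.0):
    """Return (healthy, payload); healthy is True only for HTTP 200.

    Hypothetical helper for illustration. payload is the decoded JSON body,
    or None if the body is not JSON or the Executor could not be reached.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status, body = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on non-2xx responses such as 503; the body is still readable.
        status, body = err.code, err.read()
    except (urllib.error.URLError, OSError):
        # Assumption: an unreachable Executor is reported as unhealthy.
        return False, None
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None
    return status == 200, payload
```

A restart policy can then be as simple as calling this function on an interval and restarting the Executor process when healthy is False; the payload's message field, when present, is useful for logging.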
