Indexify
Open-Source Compute Engine for Building Multi-Stage Data-Intensive Workflows
Indexify is a compute engine for building durable, data-intensive workflows and serving them as APIs. Workflows are elastic: functions run in parallel across multiple machines, and Indexify automatically manages data flow between dependent functions. Graphs are served as live API endpoints for seamless integration with existing systems.
If you know Python functions, you already know how to use Indexify!
Quick Start
Jump right into building a website summarizer!
Key Features
- Conditional Branching and Data Flow: Router functions can dynamically choose one or more edges in a Graph, making it easy to invoke expert models based on inputs.
- Local Inference: Run LLMs in workflow functions using LLamaCPP, vLLM, or Hugging Face Transformers.
- Distributed Map and Reduce: Automatically parallelizes functions over sequences across multiple machines. Reducer functions are durable and invoked as map functions finish.
- Version Graphs and Backfill: Backfill API to update previously processed data when functions or models are updated.
- Request Queuing and Batching: Automatically queues and batches parallel workflow invocations to maximize GPU utilization.
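To make the routing and map/reduce ideas in the list above concrete, here is a minimal single-machine sketch in plain Python. It does not use Indexify's API: the function names (`route`, `chunk`, `count_words`, `run`) and the word-count task are hypothetical stand-ins, and a thread pool plays the role that Indexify's distributed executors would play across machines.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def route(words: list[str]) -> str:
    """Router step: pick a branch based on the input (hypothetical rule)."""
    return "long" if len(words) > 4 else "short"


def chunk(words: list[str], size: int = 3) -> list[list[str]]:
    """Split the input into the sequence that the map step runs over."""
    return [words[i:i + size] for i in range(0, len(words), size)]


def count_words(piece: list[str]) -> int:
    """Map step: runs independently for each chunk."""
    return len(piece)


def add(total: int, n: int) -> int:
    """Reduce step: folds each map result into a running total."""
    return total + n


def run(text: str) -> tuple[str, int]:
    words = text.split()
    branch = route(words)
    if branch == "short":
        # Short inputs skip the fan-out entirely.
        return branch, count_words(words)
    # Long inputs fan out over chunks in parallel, then fold the results.
    with ThreadPoolExecutor() as pool:
        counts = list(pool.map(count_words, chunk(words)))
    return branch, reduce(add, counts, 0)
```

Calling `run("a b c")` takes the short branch, while a seven-word input takes the long branch and is counted chunk by chunk. In an engine like Indexify the same shape applies, except that scheduling, retries, and data movement between steps are handled for you.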
While traditional workflows were often linear, we choose a graph-based approach to unlock inherent parallelism in AI tasks such as embeddings, chunking, summarization, object detection, and transcription.
A webscraper and summarizer workflow built using Indexify.
Migrating to Indexify
You can adopt Indexify incrementally. Get an overview of the steps with an example.
Why Indexify?
Interacting with models isn't the most challenging aspect of building Gen AI applications. The harder part is making models work with a significant amount of business data in enterprises or user data in consumer applications. A large portion of this data is unstructured, and text and vision LLMs are particularly effective at processing it.
Using LLMs on a small amount of data in a notebook is the easiest way to quickly develop workflows. Indexify lets you iterate on workflows locally without interruption. You can write software as you normally would, using Python classes and functions.
However, several hurdles arise when moving from a prototype to a production-ready service:
- State Management: Sharing and persisting the state of dependent stages in your application.
- Version Control and Data Migration: As models evolve, developers must version workflow code and re-process existing data with newer models (e.g., improved structured extraction, summarization, or embedding models).
- Compound Systems: Applications often require multiple models based on input context, necessitating dynamic routing of data to different functions. For instance, different document extraction models might be optimal for specific layouts, requiring a modular workflow that can adapt to various input types and contexts.
- Hardware Optimization: For local inference using open-source or custom-trained models, efficiently utilizing GPUs for model inference and CPUs for other workflow components.
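As an illustration of the version control and data migration hurdle above, the following sketch shows the kind of hand-written backfill script a team ends up maintaining without a workflow engine. It is not Indexify code: `Record`, `CURRENT_VERSION`, `summarize`, and `backfill` are hypothetical names, and the "model" is a trivial stand-in that keeps the first five words.

```python
from dataclasses import dataclass

# Assumed version number of the function currently deployed.
CURRENT_VERSION = 2


@dataclass
class Record:
    source: str   # raw input, kept so it can be re-processed later
    output: str   # result of the most recent run
    version: int  # version of the function that produced `output`


def summarize(text: str) -> str:
    """Stand-in for a model call: keep the first five words."""
    return " ".join(text.split()[:5])


def backfill(store: list[Record]) -> int:
    """Re-run `summarize` on stale records and return how many were updated."""
    updated = 0
    for rec in store:
        if rec.version < CURRENT_VERSION:
            rec.output = summarize(rec.source)
            rec.version = CURRENT_VERSION
            updated += 1
    return updated
```

Even this toy version has to track which function version produced each output and keep the raw input around. A real script also needs batching, retries, and coordination with live traffic.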
Once you start using Indexify, you don't have to worry about:
- Handling state between functions.
- Distributing functions across multiple machines for parallelism.
- Building APIs to integrate with other systems.
- Developing scripts for versioning and backfilling data.
You can program workflows as if they were running on a single machine, while benefiting from a robust, distributed infrastructure.