Retrieval-Augmented Generation (RAG) lets general-purpose LLMs access your private data sources. Many LLM use cases rely on RAG under the hood to ground the LLM in accurate and up-to-date data.

RAG usually consists of the following stages -

  1. Indexing - The process of loading your data from various sources and converting it into forms that can be queried easily by LLM applications.
  2. Querying and Generation - The process of querying the indexed data and generating responses. The querying part is usually done by a retriever that retrieves the most relevant data from the indexed data.
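The two stages above can be sketched in a few lines of plain Python. This is only an illustration, not Indexify's API: `embed`, `build_index`, `retrieve`, and `build_prompt` are hypothetical helpers, and the bag-of-words "embedding" is a stand-in for a real embedding model. The generation step is represented by the prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding (assumption): a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(documents):
    """Indexing stage: store each document next to its vector."""
    return [(doc, embed(doc)) for doc in documents]

def retrieve(index, question, k=2):
    """Querying stage: return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question, context):
    """Generation stage input: the prompt an LLM would receive."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

documents = [
    "Invoices are due within 30 days of receipt.",
    "The office is closed on public holidays.",
    "Refunds are processed within 5 business days.",
]
index = build_index(documents)
top = retrieve(index, "When are invoices due?", k=1)
prompt = build_prompt("When are invoices due?", top)
print(prompt)
```

In a real system the toy vectors would be replaced by model embeddings and the list scan by a vector store, but the shape of the two stages stays the same.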

In the real world, indexing and querying usually run in parallel and continuously. This means that beyond the core RAG algorithms, you need to build a system that can continuously and reliably index and query data.
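As a rough sketch of what "in parallel and continuously" means, the snippet below runs an indexing worker in a background thread while queries are served from the main thread. Everything here is hypothetical and unrelated to Indexify's actual implementation: `LiveIndex` and `indexing_worker` are illustrative names, and substring matching stands in for real retrieval.

```python
import queue
import threading

class LiveIndex:
    """Hypothetical in-memory index that accepts writes while serving reads."""

    def __init__(self):
        self._docs = []
        self._lock = threading.Lock()

    def add(self, doc):
        with self._lock:
            self._docs.append(doc)

    def search(self, term):
        # Copy under the lock, then filter outside it to keep writes unblocked.
        with self._lock:
            snapshot = list(self._docs)
        return [d for d in snapshot if term.lower() in d.lower()]

def indexing_worker(incoming, live_index):
    """Consume documents from the queue until a None sentinel arrives."""
    while True:
        doc = incoming.get()
        if doc is None:
            break
        live_index.add(doc)

incoming = queue.Queue()
live_index = LiveIndex()
worker = threading.Thread(target=indexing_worker, args=(incoming, live_index))
worker.start()

incoming.put("Quarterly report for Q1")
incoming.put("Meeting notes from Monday")
# A query can be issued at any time; it sees whatever has been indexed so far.
partial = live_index.search("report")
incoming.put("Quarterly report for Q2")
incoming.put(None)  # signal the worker to stop
worker.join()

results = live_index.search("report")
print(results)
```

The sketch leaves out the parts that make this hard in production, such as retries, durability, and backpressure, which is where a dedicated pipeline system comes in.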

RAG using Indexify

You can perform data loading and implement indexing and other data transformation algorithms as workflows. Indexify makes it easy to build and operationalize pipelines that process and index data continuously.

You can migrate from one RAG algorithm to another, and re-index already processed data when you change embedding models by running data migrations. You can also create namespaces for different security-sensitive data and manage access control.

You have full control over how you index and query data. Which approach you choose depends on your use case, your data, and the LLMs you use for generation.

Example RAG Algorithms

We show some examples of RAG algorithms that can be implemented using Indexify.