Some Perspectives on HybridRAG in an ArangoDB World
Estimated reading time: 7 minutes
Introduction
Graph databases continue to gain momentum, thanks to their knack for handling intricate relationships and context. Developers and tech leaders are seeing the potential of pairing them with the creative strength of large language models (LLMs). This combination is opening the door to more precise, context-aware answers to natural language prompts. That’s where retrieval-augmented generation (RAG) comes in: it pulls in useful information, whether from raw text (VectorRAG) or a structured knowledge graph (GraphRAG), and feeds it into the LLM. The result? Smarter, more relevant responses that are grounded in actual data.
A recent collaborative study between NVIDIA and BlackRock, “HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction” (Sarmah, Pasquali, Mehta, Rao, Hall, and Patel), delves into this realm in a compelling and thought-provoking fashion. HybridRAG contemplates a “best of both worlds” combination of GraphRAG and VectorRAG.
The study shows that combining knowledge graphs with vector-based retrieval increases the effectiveness of information extraction, especially in complex, domain-specific applications. This blog will unpack the original work’s propositions, methodology, findings, and conclusions. We will also discuss whether GraphRAG alone may, in some cases, be sufficient.
Study Methodology
Data Preparation
The methodology of the cited study begins with constructing a vector store and a knowledge graph from a dataset of corporate earnings reports. The knowledge graph captures relationships between entities, enabling structured and precise querying. At the same time, vector embeddings are generated to represent the semantic meaning of the text, which is crucial for performing similarity searches.
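As a rough illustration of that preparation (not the study’s actual code), the sketch below chunks documents, embeds the chunks with an off-the-shelf sentence-transformers model, and leaves knowledge-graph triple extraction as an explicit placeholder. The model name, chunk sizes, and the extract_triples helper are all assumptions for illustration.

```python
# Data-preparation sketch: chunk the reports, embed the chunks for the
# vector store, and collect (subject, relation, object) triples for the
# knowledge graph. Parameters and helpers are illustrative only.
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping fixed-size chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    """Placeholder: plug in your own entity/relation extractor (LLM prompt, NLP pipeline, ...)."""
    return []

documents = ["...earnings report text..."]  # placeholder corpus

chunks = [c for doc in documents for c in chunk(doc)]
embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(chunks)  # vectors for the store

triples = [t for doc in documents for t in extract_triples(doc)]  # edges for the graph
```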
Evaluation Metrics
The study evaluates HybridRAG using accuracy metrics such as Faithfulness, Answer Relevance, Context Precision, and Context Recall, comparing its performance to standalone GraphRAG and VectorRAG methods.
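These four metrics are commonly computed with the RAGAS evaluation framework. A minimal sketch of such an evaluation might look like the following; the toy record is invented, and the exact column names and API details vary across ragas versions.

```python
# Sketch of a RAGAS-style evaluation run; the exact API and dataset column
# names differ across ragas versions, so treat this as a guide only.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    faithfulness, answer_relevancy, context_precision, context_recall,
)

# One toy record: question, generated answer, retrieved contexts, reference.
data = Dataset.from_dict({
    "question": ["What was Q2 revenue?"],
    "answer": ["Q2 revenue was $1.2B."],
    "contexts": [["The company reported Q2 revenue of $1.2B."]],
    "ground_truth": ["Q2 revenue was $1.2B."],
})

scores = evaluate(data, metrics=[faithfulness, answer_relevancy,
                                 context_precision, context_recall])
print(scores)
```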
Study Findings and Conclusions
The study concludes that HybridRAG outperforms the standalone GraphRAG and VectorRAG methods in terms of overall accuracy, particularly in complex domains. The combination of graph and vector techniques could allow for more nuanced and contextually relevant results.
| Metric | VectorRAG | GraphRAG | HybridRAG |
|---|---|---|---|
| F | 0.94 | 0.96 | 0.96 |
| AR | 0.91 | 0.89 | 0.96 |
| CP | 0.84 | 0.96 | 0.79 |
| CR | 1.00 | 0.85 | 1.00 |
Figure 1.0 – Performance Metrics
Figure 1.0 above reproduces the study’s performance metrics for the different RAG pipelines. Here, F, AR, CP, and CR refer to Faithfulness, Answer Relevance, Context Precision, and Context Recall, respectively.
Complete study results and findings can be found in the original paper.
ArangoDB’s Perspective on GraphRAG vs. VectorRAG
At ArangoDB, we have no philosophical objection to HybridRAG. Indeed, the original study presents a compelling case for HybridRAG based on the findings, especially as it relates to Answer Relevance. But what would be the steps required for data preparation and processing in a HybridRAG scenario? Is it worth the extra effort? As we will see, it depends on the situation.
It’s important to note that the study contemplates a separate set of steps for three main data requirements of HybridRAG:
- Chunking the earnings reports and creating vector embeddings that are stored in the vector database (and re-creating the embeddings and reindexing the data as it changes).
- Creation and loading of the Knowledge Graph.
- Merging the results of GraphRAG and VectorRAG queries to generate meaningful (and in some cases richer and more accurate) results via HybridRAG, as sketched below.
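To make that third step concrete, here is a deliberately minimal sketch of a HybridRAG merge in Python. The vector_retrieve, graph_retrieve, and llm helpers are hypothetical stand-ins for your actual retrievers and model; the study’s pipeline is more sophisticated than this.

```python
# Minimal HybridRAG sketch: pull context from both retrievers, concatenate,
# and ask the LLM to ground its answer in the combined context.
# All three helpers are hypothetical stubs; swap in your own components.

def vector_retrieve(question: str) -> list[str]:
    return ["<chunks returned by similarity search>"]  # stub

def graph_retrieve(question: str) -> list[str]:
    return ["<facts returned by a knowledge-graph traversal>"]  # stub

def llm(prompt: str) -> str:
    return "<answer generated from the supplied context>"  # stub

def hybrid_rag(question: str) -> str:
    contexts = vector_retrieve(question) + graph_retrieve(question)
    prompt = (
        "Answer the question using only the context below.\n\n"
        "Context:\n" + "\n".join(contexts) +
        f"\n\nQuestion: {question}"
    )
    return llm(prompt)

print(hybrid_rag("What drove revenue growth in Q2?"))
```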
Naturally, each of these steps involves human and computational resources. To assess whether (and when) the additional steps and resources are justified, let’s first understand how GraphRAG is fundamentally different from VectorRAG in four key areas:
- Embeddings Considerations
- Predictable Query Cost Model
- Multi-Model Flexibility
- Context vs. Semantics
1. Embeddings Considerations
Generating vector embeddings can be time-consuming and costly. The time and resources required are not always predictable and depend on the dataset size and complexity. Moreover, once data is added or updated, vector embeddings may need to be recomputed and reindexed. Finally, there are considerations for storing embeddings cost-effectively.
With GraphRAG alone, there’s no need to explicitly generate and refresh embeddings, allowing for immediate queries against your data. This not only saves time but also reduces computational overhead, making the information retrieval process more efficient and cost-effective.
As the study notes, a GraphRAG query may benefit from “encoding the graph structure into embeddings that the [LLM] model can interpret”. Crucially, however, there is no strict requirement to precompute graph embeddings. When a query is run in GraphRAG, relevant nodes and edges can be retrieved directly by traversing the knowledge graph. Alternatively, subgraphs can be created to make GraphRAG more efficient without the use of embeddings, and without the need to traverse the entire graph.
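As a sketch of that retrieval step, here is what fetching a bounded subgraph and serializing it into LLM-ready context could look like with the python-arango driver. The database, graph, and attribute names are made up for illustration.

```python
# Fetch a bounded subgraph around one entity and turn it into plain-text
# facts for the LLM, with no embeddings involved. Names are illustrative.
from arango import ArangoClient

db = ArangoClient(hosts="http://localhost:8529").db(
    "finance", username="root", password="openSesame"
)

aql = """
FOR v, e IN 1..2 OUTBOUND @start GRAPH 'earningsGraph'
  RETURN {"from": e._from, "relation": e.label, "to": e._to}
"""
edges = db.aql.execute(aql, bind_vars={"start": "companies/acme"})

# Serialize the edges into facts the LLM can consume as context.
facts = "\n".join(f"{e['from']} -[{e['relation']}]-> {e['to']}" for e in edges)
print(facts)
```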
2. Predictable Query Cost Model
The cost of precomputing, recomputing, indexing, and storing vector embeddings is significant, but a major challenge lies in accurately predicting and budgeting for VectorRAG querying costs as embedding volumes scale. While GPUs can reduce the variability in query latency and associated costs, they do not eliminate the unpredictability caused by fluctuating data distributions and evolving workload demands.
In contrast, GraphRAG offers a more predictable cost model, for two reasons. First, because your data already lives in ArangoDB, you can use AQL (the ArangoDB Query Language) to control the amount of data returned by each query. For example:
```aql
// Traverse 1 to 3 hops outbound from the start node in the named graph
FOR v, e IN 1..3 OUTBOUND 'nodes/startNode' GRAPH 'exampleGraph'
  FILTER v.attribute == 'desiredValue'  // keep only matching vertices
  LIMIT 100                             // cap results for predictable cost
  RETURN v
```
This example limits the number of results returned, making it easier to manage query performance and cost in a predictable fashion.
Second, ArangoDB’s outstanding horizontal scaling features (including dynamic data distribution and lights-out load balancing across nodes as they are added to the cluster) enable you to support massive graph datasets while maintaining query performance even as the data grows. With the ability to elastically scale the cluster, a GraphRAG approach using ArangoDB has the added benefit of not needing to over-provision resources. A side benefit is that it avoids the need to precompute graph embeddings, even for a massive knowledge graph.
3. Multi-Model Flexibility
ArangoDB’s multi-model approach allows you to incorporate various data types—graphs, documents, full-text, key-value, and geospatial data—into a single GraphRAG query. Vector stores alone do not offer these multi-model benefits, but this by itself does not invalidate the potential of HybridRAG. ArangoDB’s multi-model benefits can be adequately paired with VectorRAG in the right scenarios.
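For instance, a single AQL query can combine a document filter, a geospatial predicate, and a graph traversal. The collection, graph, and attribute names below are invented for illustration, and the sketch reuses the db handle from the earlier example.

```python
# One AQL query spanning three models: document attributes (FILTER),
# geospatial data (GEO_DISTANCE), and the graph (OUTBOUND traversal).
# Collection, graph, and attribute names are illustrative.
aql = """
FOR company IN companies
  FILTER company.sector == 'technology'
  FILTER GEO_DISTANCE(company.location, [-73.98, 40.75]) < 50000
  FOR patent IN 1..1 OUTBOUND company._id GRAPH 'filingsGraph'
    RETURN {company: company.name, patent: patent.title}
"""
results = db.aql.execute(aql)  # `db` as connected in the earlier sketch
```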
4. Context vs. Semantics
When comparing VectorRAG and GraphRAG, the choice also boils down to whether you’re looking for semantic meaning or relationship context. VectorRAG is great when you need to pull information based on the meaning of the text, even if the wording isn’t exact. GraphRAG, conversely, is built for prompts that rely on understanding the relationships between different entities or data.
Imagine a patent attorney is searching for patents that explain “machine learning” as “learning from data” or “pattern recognition”. VectorRAG might retrieve something like: “Adaptive Algorithms for Data-Driven Decision Making.” This result doesn’t use the exact terms “machine learning,” “learning from data,” or “pattern recognition,” but the semantic meaning overlaps. It captures the concept of algorithms adapting based on data, which is closely related to machine learning.
In such cases, GraphRAG would fall short because it focuses more on the relationships between entities (e.g., authors, organizations, or topics) rather than identifying subtle semantic connections between different phrasings in the text.
Now, let’s flip the script and look at where GraphRAG is uniquely suited. If the attorney asks, “Show me patents filed by [organization names] related to ‘machine learning’ research,” GraphRAG would excel. It can map the relationships between organizations and patents, showing how different entities are connected.
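A GraphRAG retrieval for that prompt could be a direct traversal over filing edges. All collection, graph, and attribute names in this sketch are hypothetical.

```python
# Map organizations to the patents they filed, via edges in a patent graph.
# Collection, graph, and attribute names are hypothetical.
aql = """
FOR org IN organizations
  FILTER org.name IN @orgs
  FOR patent IN 1..1 OUTBOUND org._id GRAPH 'patentGraph'
    FILTER CONTAINS(LOWER(patent.abstract), 'machine learning')
    RETURN {organization: org.name, patent: patent.title}
"""
rows = db.aql.execute(aql, bind_vars={"orgs": ["Acme Corp", "Initech"]})
```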
VectorRAG would, in most cases, be at a disadvantage here: it might return individual patents that mention “machine learning” and some organization, but it wouldn’t understand that the organization filed the patent or connect multiple patents under one organization. In some cases, you wouldn’t get any results at all. VectorRAG lacks the ability to navigate the structured relationships between entities, which are essential for answering a query about who filed which patents.
In summary, VectorRAG is better when you need to match the meaning of text, while GraphRAG excels when you need to trace relationships between entities. The right choice depends on your use case and the type of application you are building. If your application needs both types, HybridRAG is really the only way to go.
Conclusion
The cited HybridRAG study highlights the potential benefits of integrating knowledge graphs and vector-based retrieval. ArangoDB’s GraphRAG approach utilizes the strengths of its multi-model database, while also offering unique advantages such as cost predictability and the ability to query multiple data models simultaneously.
A combination of GraphRAG and VectorRAG can indeed be beneficial, and in some cases absolutely required. A careful assessment of each application’s natural language requirements will answer the million-dollar question: can a GraphRAG-only approach built on ArangoDB offer the most efficient and rapidly deployable solution for AI-driven information extraction?
Finally, for those interested in leveraging the full power of ArangoDB’s GraphRAG, consider how the system’s integration with LangChain, together with its support for both public and private LLMs to dynamically generate AQL queries, offers additional flexibility and performance.
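As a rough sketch of what that integration looks like: import paths and class locations vary across LangChain releases, and the LLM choice here is an assumption, so treat this as a guide rather than a fixed recipe.

```python
# LLM-generated AQL via LangChain's ArangoDB integration.
# Import paths and class locations vary by LangChain version.
from langchain.chains import ArangoGraphQAChain
from langchain_community.graphs import ArangoGraph
from langchain_openai import ChatOpenAI  # any supported public or private LLM

graph = ArangoGraph(db)  # `db` is a python-arango database handle
chain = ArangoGraphQAChain.from_llm(
    ChatOpenAI(temperature=0), graph=graph, verbose=True
)
# The chain translates the question into AQL, runs it, and answers from the results.
chain.invoke("Which companies reported revenue growth last quarter?")
```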