
Enhance your RAG with RAGA: The Key to Precision and Relevance


Published: Aug 02, 2024

Time to read: 2 min


In the fast-paced world of AI, integrating applications with large language models (LLMs) and generative AI is the new frontier. Retrieval-Augmented Generation (RAG) emerged to combat hallucinations, but it introduced a new challenge: the “Lost in the Middle” problem, where models tend to overlook relevant information buried in the middle of a long retrieved context. And when retrieval returns irrelevant chunks, there are few benchmarks to catch it. Even tech giants like Google and AWS have faced these hurdles with their custom RAG clients.

Enter RAGA benchmarks—a game-changer for evaluating and refining your AI application.

What is RAGA?

RAGA, or Retrieval-Augmented Generation Assessment, is your toolkit for ensuring your AI's output meets expectations. It assesses whether the data retrieved and the answers generated by your LLM align with the question. Key metrics include:

  • Faithfulness: Are the generated claims supported by the context?
  • Answer Relevancy: Is the generated answer relevant to the question?
  • Context Relevancy: Does the context pertain to the question?
  • Context Precision: Are the relevant chunks ranked near the top of the retrieved results?

Unlocking the RAGA Advantage

Implementing custom versions of these metrics makes it possible to score and rank feature rollouts, enforcing strict quality thresholds before anything goes live. Here's the current state of the application:
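
Such a threshold gate can be sketched in a few lines of Python. The metric names match the RAGA metrics above, but the threshold values and function name here are illustrative, not the production configuration:

```python
# Hypothetical rollout gate: block a release unless every RAGA metric
# clears its threshold. The bar values below are examples only.
THRESHOLDS = {
    "faithfulness": 0.80,
    "answer_relevancy": 0.80,
    "context_precision": 0.60,
    "context_relevancy": 0.60,
}

def passes_gate(scores: dict) -> bool:
    """Return True only if every tracked metric meets its threshold."""
    return all(scores.get(metric, 0.0) >= bar for metric, bar in THRESHOLDS.items())

# Example: the retriever scores below would block this rollout.
scores = {
    "faithfulness": 0.81,
    "answer_relevancy": 0.82,
    "context_precision": 0.57,
    "context_relevancy": 0.647,
}
print(passes_gate(scores))  # False: context_precision is below 0.60
```

A gate like this can run in CI against a fixed evaluation set, so a regression in any single metric is enough to stop a deployment.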

Component    Context Precision    Context Relevancy
Retriever    0.57                 0.647

Component    Faithfulness    Answer Relevancy
Generator    0.81            0.82

For the generator, the formulas implemented were:

  • Faithfulness = (# claims that can be derived from context) / (# total claims) 
  • Answer Relevancy = mean(cosine_similarity(questions artificially generated from the answer, original question))
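
The two generator formulas above reduce to a ratio and a mean of cosine similarities. A minimal sketch, assuming the claim extraction, question generation, and embedding steps (normally done by an LLM and an embedding model) have already produced their outputs:

```python
import math

def faithfulness(claims, supported):
    """Share of generated claims that can be derived from the context.
    `claims` is the list of claims extracted from the answer; `supported`
    is the set of those claims verifiable against the retrieved context."""
    if not claims:
        return 0.0
    return sum(1 for c in claims if c in supported) / len(claims)

def cosine_similarity(a, b):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def answer_relevancy(question_vec, generated_question_vecs):
    """Mean cosine similarity between the original question and the
    questions generated back from the answer (embeddings precomputed)."""
    sims = [cosine_similarity(question_vec, v) for v in generated_question_vecs]
    return sum(sims) / len(sims) if sims else 0.0
```

In practice the embeddings would come from a real embedding model; the toy vectors here just make the arithmetic visible.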

For the retriever, the formulas used were:

  • Context Precision = (# of chunks that ranked high @k) / (total chunks)
  • Context Relevancy = |S| / (total # of sentences) 

Here, k is the rank position at which precision is evaluated for each retrieved chunk, and S is the set of context sentences relevant to the question.
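
The retriever formulas above can be sketched the same way. This assumes per-chunk relevance judgments and per-sentence relevance labels are already available (in practice an LLM judge produces them); the function names are illustrative:

```python
def context_precision(relevance_at_k):
    """Fraction of retrieved chunks judged relevant, over the top-k ranking.
    relevance_at_k[i] is True if the chunk ranked at position i+1 is relevant."""
    if not relevance_at_k:
        return 0.0
    return sum(relevance_at_k) / len(relevance_at_k)

def context_relevancy(relevant_sentences, all_sentences):
    """|S| / total sentences: the share of context sentences in the
    relevant set S, out of all sentences in the retrieved context."""
    if not all_sentences:
        return 0.0
    return len(relevant_sentences) / len(all_sentences)
```

For example, if two of four retrieved chunks are relevant, context precision is 0.5; if one of four context sentences answers the question, context relevancy is 0.25.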

Adjusting parameters like chunk size and overlap led to new experiments and the following results:

Component    Context Precision    Context Relevancy
Retriever    0.53                 0.6

Component    Faithfulness    Answer Relevancy
Generator    0.85            0.86

The custom approach to RAGA has highlighted areas for improvement and identified what can remain unchanged:

  • Retriever Performance: Context precision slipped from 0.57 to 0.53 and context relevancy from 0.647 to 0.6, so there is significant room for improvement. Enhancing the relevance and accuracy of retrieved chunks and further tuning chunk size and overlap are essential next steps.
  • Generator Performance: The generator shows promising results with a faithfulness score of 0.85 and answer relevancy of 0.86, indicating that the LLM generates responses consistent with the provided context and relevant to the questions asked.

RAGA in Action: Elevating AI Excellence

RAGA (Retrieval-Augmented Generation Assessment) metrics are game-changers for improving the quality and reliability of RAG-based AI applications. These metrics provide crucial insights into the performance of both retriever and generator components, offering a quantitative basis for continuous improvement.

Adopting RAGA addresses the "Lost in the Middle" problem and establishes a robust framework for evaluating and enhancing AI applications. This approach ensures more accurate, relevant, and reliable AI-generated responses, leading to a better user experience and increased trust in the system.

Charting the Future with RAGA

Looking ahead, iterating on these metrics and incorporating additional RAGA parameters will continuously refine the RAG pipeline. This commitment to quantifiable quality assurance is essential for navigating the rapidly evolving landscape of AI and LLM applications.

Ready to elevate your RAG game? Dive into the world of RAGA and transform your AI application into a powerhouse of precision and relevancy!


About Contentstack

The Contentstack team comprises highly skilled professionals specializing in product marketing, customer acquisition and retention, and digital marketing strategy. With extensive experience holding senior positions at renowned technology companies across Fortune 500, mid-size, and start-up sectors, our team offers impactful solutions based on diverse backgrounds and extensive industry knowledge.

Contentstack is on a mission to deliver the world’s best digital experiences through a fusion of cutting-edge content management, customer data, personalization, and AI technology. Iconic brands, such as AirFrance KLM, ASICS, Burberry, Mattel, Mitsubishi, and Walmart, depend on the platform to rise above the noise in today's crowded digital markets and gain their competitive edge.

In January 2025, Contentstack proudly secured its first-ever position as a Visionary in the 2025 Gartner® Magic Quadrant™ for Digital Experience Platforms (DXP). Further solidifying its prominent standing, Contentstack was recognized as a Leader in the Forrester Research, Inc. March 2025 report, “The Forrester Wave™: Content Management Systems (CMS), Q1 2025.” Contentstack was the only pure headless provider named as a Leader in the report, which evaluated 13 top CMS providers on 19 criteria for current offering and strategy.

Follow Contentstack on LinkedIn.


