Retrieval Augmented Generation (RAG)

Introduction

RAG is a technique for harnessing LLMs so that they can serve a user's custom purpose. LLMs are, in almost all cases, trained on data that is publicly available online. This means the content used to train LLMs has the following properties:

Take, for example, some proprietary documents from your company. You want to use an LLM to answer questions from the information contained in those documents. Training an LLM for this purpose would be costly; moreover, your company may not have enough data, resources, or time to train a model for this specific case. This is where RAG comes into play.

RAG still uses an LLM to answer questions. However, the prompt fed into the LLM is constructed so that the LLM is instructed to derive its answer from the content of the prompt itself. If we build the prompt from the documents we are interested in, and instruct the LLM to answer only from that content, we get answers grounded in those documents.
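To make the idea of a constructed prompt concrete, here is a minimal sketch of how retrieved passages and a question might be combined into a single prompt string. The function name `build_prompt`, the instruction wording, and the sample passages are all illustrative assumptions, not a fixed template.

```python
def build_prompt(question: str, contexts: list[str]) -> str:
    """Assemble a prompt that asks the model to answer only from the given contexts."""
    # Number each passage so that individual contexts are easy to tell apart.
    context_block = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(contexts, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say \"I don't know.\"\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical passages standing in for text retrieved from company documents.
prompt = build_prompt(
    "What is the notice period?",
    ["Employees must give 30 days of notice.", "The office closes at 6 pm."],
)
print(prompt)
```

The instruction line is what steers the model toward the supplied passages; it is a request rather than a guarantee, so the exact wording usually needs some experimentation.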

Thus, Retrieval Augmented Generation (RAG) is a form of prompt engineering that uses a pre-trained LLM to extract information from user-specified documents.

Please visit the RAG description page for a complete picture of the concept of RAG.

RAG Development

Getting Ready

First, let us install the required components. The following packages are needed:

pip install chromadb
pip install sentence_transformers
pip install pandas
pip install rouge_score
pip install nltk
pip install accelerate transformers

Code Development

We have identified four major steps in the process of RAG development, preceded by a prelude section that must be completed first. The four steps follow the prelude in order. All the code presented here, once collected and compiled, should be runnable.

Please visit the RAG description page for a complete picture of the concept of RAG and to learn more about the steps.

Prelude

Step 1: Embedding Documents
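As a rough illustration of what embedding documents means, the sketch below turns each document into a normalised vector and keeps it in an in-memory store. It deliberately uses a toy bag-of-words embedding from the standard library as a stand-in for a sentence_transformers model, and a plain dictionary as a stand-in for a Chroma collection; the document ids and texts are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> dict[str, float]:
    """Toy embedding: L2-normalised word counts (a stand-in for a real model)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    # An empty text has no words, so it maps to an empty vector.
    return {word: c / norm for word, c in counts.items()} if norm else {}


# Hypothetical documents; in practice these would be chunks of your own files.
documents = {
    "doc-1": "Employees must give 30 days of notice before leaving.",
    "doc-2": "The office closes at 6 pm on weekdays.",
}

# The "vector store": document id -> (embedding, original text).
store = {doc_id: (embed(text), text) for doc_id, text in documents.items()}
```

A real embedding model produces dense vectors that capture meaning rather than exact word overlap, but the shape of the step is the same: each document is embedded once, up front, and stored alongside its text.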

Step 2 & 3: Searching Relevant Contexts for Question
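The search step amounts to embedding the question and ranking the stored document vectors by similarity to it. A vector store normally does this internally; the sketch below shows the underlying idea with cosine similarity over small hand-written vectors. The vectors, document ids, and the helper names `cosine` and `top_k` are assumptions made for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2
) -> list[tuple[str, float]]:
    """Return the k document ids most similar to the query, best first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical 3-dimensional embeddings; real ones have hundreds of dimensions.
doc_vecs = {
    "doc-1": [0.9, 0.1, 0.0],
    "doc-2": [0.0, 0.2, 0.9],
    "doc-3": [0.7, 0.6, 0.1],
}
question_vec = [1.0, 0.2, 0.0]

results = top_k(question_vec, doc_vecs, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

The texts belonging to the returned ids are the contexts that go into the prompt. The value of k is a trade-off: more contexts raise the chance of including the answer but also lengthen the prompt.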

Step 4: Getting Answer
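The final step passes the retrieved contexts and the question to the LLM and cleans up what comes back. The sketch below keeps the model behind a plain callable so that it runs without downloading anything; `get_answer` and `fake_llm` are hypothetical names, and `fake_llm` merely imitates a model that echoes the prompt before its completion, which some generation APIs do by default.

```python
from typing import Callable


def get_answer(
    question: str, contexts: list[str], generate: Callable[[str], str]
) -> str:
    """Ask the model to answer from the contexts and return the cleaned answer."""
    if not contexts:
        # Nothing was retrieved, so there is nothing to ground an answer in.
        return "I don't know."
    prompt = (
        "Context:\n" + "\n".join(contexts) + f"\n\nQuestion: {question}\nAnswer:"
    )
    raw = generate(prompt)
    # If the model echoed the prompt, keep only the text that follows it.
    if raw.startswith(prompt):
        raw = raw[len(prompt):]
    return raw.strip()


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: echoes the prompt, then a canned answer."""
    return prompt + " 30 days.\n"


answer = get_answer(
    "What is the notice period?",
    ["Employees must give 30 days of notice."],
    fake_llm,
)
print(answer)
```

To use a real model, replace `fake_llm` with any function that takes the prompt string and returns the generated text.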

Validation
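Since rouge_score is among the installed packages, one plausible way to validate an answer is a ROUGE-style word-overlap score against a reference answer. The sketch below is a simplified, standard-library version of ROUGE-1 (unigram overlap, no stemming), written to show what the metric measures rather than to reproduce the library's exact numbers; the reference and candidate strings are hypothetical.

```python
import re
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict[str, float]:
    """Simplified ROUGE-1: unigram precision, recall, and F1."""
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((ref & cand).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1(
    reference="Employees must give 30 days of notice.",
    candidate="The notice period is 30 days.",
)
print(scores)
```

Word overlap rewards answers that reuse the reference wording, so a correct answer phrased differently can still score low; it is best read alongside a manual check of a few question and answer pairs.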

In my case, the question picked at random is shown below:

The following is the answer produced: