Serverless Q&A App with RAG Using Amazon Kendra, Bedrock, and AWS Amplify

Nwachukwu Chibuike
Published in Towards AWS · Dec 22, 2023

[Architecture diagram]

In this article, I show how to build a serverless Retrieval-Augmented Generation (RAG) question-and-answer application using Amazon Bedrock with Amazon Kendra. We also explore the benefits of RAG and how it reduces hallucinations from foundation models (FMs).

Let's get right into it!

Foundation Models

First, before diving into the build, we need to understand what acts as the brain behind the application: the foundation model (FM).

So what are FMs?

Foundation models are large, pre-trained models that serve as the basis for various downstream tasks, including natural language processing (NLP).

These models are trained on massive datasets and have a broad understanding of natural language. They are capable of general language understanding and can be fine-tuned for specific tasks.

Amazon Bedrock

Amazon Bedrock is a serverless service that makes it easy to build and scale generative AI applications using foundation models via a single API.

These FMs are from leading AI companies like Anthropic (Claude), AI21 Labs (Jurassic), Cohere (Command and Embed), Meta (Llama 2), Stability AI (Stable Diffusion XL), and Amazon (Titan).

It supports customizing models to suit your own requirements through fine-tuning, and it supports RAG through its Knowledge Base (KB) feature. In this article, however, we won't use Knowledge Base; we'll use Kendra for the RAG implementation, since KB does not support web crawling.

AWS Amplify

AWS Amplify makes it easy to build full-stack web and mobile apps. It provides the tools needed to build and deploy full-stack applications that integrate with services like Amazon Bedrock and Amazon Kendra, making it a good fit for generative AI apps.

Retrieval Augmented Generation (RAG)

So what exactly is RAG (Retrieval-Augmented Generation)?

Simply put, it is a strategy that boosts FMs by incorporating external data sources. This enhancement enables the models to generate answers that are contextually aware, leading to the discovery of valuable insights.

By fetching data from company data sources and enriching the prompt with that data, RAG helps the model deliver more relevant and accurate responses.
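The core of the idea fits in a few lines. Here is a minimal, illustrative sketch (the helper name `buildRagPrompt` and the sample passages are my own inventions, not part of any SDK) showing how retrieved passages get folded into the prompt:

```typescript
// Illustrative only: fold retrieved passages into the prompt so the
// model answers from supplied context rather than from memory alone.
function buildRagPrompt(question: string, passages: string[]): string {
  const context = passages.join("\n");
  return `Human: Here is some context, contained in <context> tags:\n\n<context>\n${context}\n</context>\n\nGiven the context answer this question: ${question}\n\nAssistant:`;
}

// The model now sees company data alongside the user's question.
const ragPrompt = buildRagPrompt("Who leads the platform team?", [
  "The platform team is led by Ada.",
  "The platform team owns the CI pipeline.",
]);
```

Later in the article, Amazon Kendra plays the role of the retriever that supplies those passages.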

Amazon Kendra

As per the AWS website: “Amazon Kendra is an intelligent enterprise search service that helps you search across different content repositories with built-in connectors”.

It connects multiple data repositories to an index, ingesting and crawling documents and websites into it. This leads to a more robust search experience.

Project Setup

With the definitions out of the way, let's head straight into development.

Setting up Next.js

npx create-next-app@latest

Output:

Next, cd into the newly created app and add Amplify.

cd qa-app && amplify init

Output:


Adding Authentication

Since there is a charge for every interaction with an LLM, it is recommended to secure the endpoint that interacts with it.
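To sketch why this matters: the server-side handler we build later resolves the caller's Cognito credentials, and Amplify's credentials object carries an `authenticated` flag. A guard like the following (`canInvokeModel` is my own illustrative helper, not an Amplify API) could reject guest callers before any billable Bedrock call is made:

```typescript
// Illustrative guard: only signed-in users may reach the model.
// Amplify's Auth.currentCredentials() resolves to an object with an
// `authenticated` boolean; guests get unauthenticated credentials.
interface CredentialsLike {
  authenticated: boolean;
}

function canInvokeModel(credentials: CredentialsLike | null): boolean {
  return credentials !== null && credentials.authenticated;
}

const allowed = canInvokeModel({ authenticated: true }); // signed-in user
const denied = canInvokeModel(null); // no credentials resolved
```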

Run the command to add authentication to your Amplify project:

amplify add auth

Next, run the push command to apply these configurations to the cloud:

amplify push

You will see the following:

Building the App

Time to set up the application.

Setup

We need to implement the UI users will use to interact with the LLM via the RAG approach. To do that, we'll use the Amplify UI React library to build the interface we need.

Add the Amplify packages to help with that:

npm i --save @aws-amplify/ui-react@5.0.6 aws-amplify@5.0.25 

Next, head over to the src/pages/_app.tsx file and replace its contents with the following:

import "@aws-amplify/ui-react/styles.css";
import { Amplify } from "aws-amplify";
import awsconfig from "../aws-exports";
import type { AppProps } from "next/app";
import { Authenticator } from "@aws-amplify/ui-react";

Amplify.configure({
  ...awsconfig,
  // this lets you run Amplify code on the server-side in Next.js
  ssr: true,
});

export default function App({ Component, pageProps }: AppProps) {
  return (
    <Authenticator>
      <Component {...pageProps} />
    </Authenticator>
  );
}

This configures Amplify in the application and wraps it with the Authenticator component, providing the authentication the app needs without us having to implement it manually. That's the beauty of Amplify!

Run the app to test:

npm run dev

You should see the Authentication page displayed:

After creating an account and signing in, the default Next.js starter page is then displayed:

Adding API Layer

Add the Amazon Bedrock and Amazon Kendra dependencies to the app:

npm i --save @aws-sdk/client-bedrock-runtime @aws-sdk/client-kendra

Next, rename api/hello.ts to api/chat.ts and replace its contents with the code below. This gets the question from the client and sends it directly to Amazon Bedrock.

import {
  BedrockRuntimeClient,
  InvokeModelCommand,
} from "@aws-sdk/client-bedrock-runtime";

import { Amplify, withSSRContext } from "aws-amplify";

import type { NextApiRequest, NextApiResponse } from "next";
import awsExports from "@/aws-exports";

Amplify.configure({ ...awsExports, ssr: true });

export default async function handler(
  req: NextApiRequest,
  res: NextApiResponse
) {
  const question = JSON.parse(req.body).question;
  if (!question) {
    return res.status(400).json({ error: "Please provide question" });
  }

  const SSR = withSSRContext({ req });
  const credentials = await SSR.Auth.currentCredentials();
  const settings = {
    serviceId: "bedrock",
    region: "us-east-1",
    credentials,
  };
  const bedrock = new BedrockRuntimeClient(settings);

  const prompt = `Human:${question}\n\nAssistant:`;

  const result = await bedrock.send(
    new InvokeModelCommand({
      modelId: "anthropic.claude-v2",
      contentType: "application/json",
      accept: "*/*",
      body: JSON.stringify({
        prompt,
        max_tokens_to_sample: 2000,
        temperature: 1,
        top_k: 250,
        top_p: 0.99,
        stop_sequences: ["\n\nHuman:"],
        anthropic_version: "bedrock-2023-05-31",
      }),
    })
  );
  res.status(200).json(JSON.parse(new TextDecoder().decode(result.body)));
}
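For reference, the Claude v2 body returned by InvokeModel arrives as bytes and, once decoded, contains a `completion` string (the answer) and a `stop_reason`; the `data?.completion` the UI reads later comes from exactly this shape. A small sketch of the decode step (the payload here is hand-made for illustration):

```typescript
// Claude v2 on Bedrock returns JSON bytes with `completion` and
// `stop_reason` fields; decode and parse before reading the answer.
interface ClaudeV2Body {
  completion: string;
  stop_reason: string;
}

function parseClaudeBody(body: Uint8Array): ClaudeV2Body {
  return JSON.parse(new TextDecoder().decode(body));
}

// Hand-made payload standing in for a real InvokeModel result:
const fakeBody = new TextEncoder().encode(
  JSON.stringify({ completion: " Hello there!", stop_reason: "stop_sequence" })
);
const parsedBody = parseClaudeBody(fakeBody);
// parsedBody.completion === " Hello there!"
```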

Update UI - Add QA Form

Next, update the index.tsx file with the following content. This creates a basic form used to gather the user's question, as well as display answers from the foundation model.

import { useState } from "react";

import { Inter } from "next/font/google";

const inter = Inter({ subsets: ["latin"] });

interface FormElements extends HTMLFormControlsCollection {
  question: HTMLTextAreaElement;
}
interface QAFormElement extends HTMLFormElement {
  readonly elements: FormElements;
}

export default function Home() {
  const [isLoading, setIsLoading] = useState(false);
  const [question, setQuestion] = useState("");
  const [answer, setAnswer] = useState("");

  const handleSubmit = async (event: React.FormEvent<QAFormElement>) => {
    event.preventDefault();

    const question = event.currentTarget.elements.question.value;
    if (!question) {
      alert("Please enter a question");
      return;
    }

    setIsLoading(true);
    setAnswer("Loading Answer...");

    fetch("/api/chat", {
      method: "POST",
      body: JSON.stringify({ question }),
    })
      .then((res) => res.json())
      .then((data) => {
        setIsLoading(false);
        if (data.error) {
          alert(data.error);
          return;
        }
        setAnswer(data?.completion || "An error occurred");
      });
  };

  const clearForm = () => {
    setQuestion("");
    setAnswer("");
  };

  const handleQuestionChange = (e: React.ChangeEvent<HTMLTextAreaElement>) => {
    setQuestion(e.target.value);
  };

  const handleAnswerChange = (e: React.ChangeEvent<HTMLTextAreaElement>) => {
    setAnswer(e.target.value);
  };

  return (
    <main
      className={`flex flex-col items-center justify-center min-h-screen p-8 ${inter.className}`}
    >
      <h1 className="text-xl font-bold p-2 m-2"> Q&A app</h1>
      <form
        onSubmit={handleSubmit}
        className="bg-white shadow-md rounded px-8 pt-6 pb-8 mb-4"
      >
        <div className="mb-4">
          <textarea
            id="question"
            rows={2}
            cols={15}
            className="block p-2.5 w-full text-sm text-gray-900 bg-gray-50 rounded-lg border border-gray-300 focus:ring-blue-500 focus:border-blue-500 dark:bg-gray-700 dark:border-gray-600 dark:placeholder-gray-400 dark:text-white dark:focus:ring-blue-500 dark:focus:border-blue-500"
            placeholder="Write your question here..."
            value={question}
            onChange={handleQuestionChange}
          ></textarea>
        </div>
        <div className="mb-4">
          <textarea
            id="answer"
            rows={10}
            cols={60}
            className="block p-2.5 w-full text-sm text-gray-900 bg-gray-50 rounded-lg border border-gray-300 focus:ring-blue-500 focus:border-blue-500 dark:bg-gray-700 dark:border-gray-600 dark:placeholder-gray-400 dark:text-white dark:focus:ring-blue-500 dark:focus:border-blue-500"
            placeholder="Answer will be displayed here..."
            value={answer}
            onChange={handleAnswerChange}
            disabled
          ></textarea>
        </div>
        <div className="flex flex-col m-2 w-full">
          <button
            className="w-full bg-blue-500 hover:bg-blue-700 text-white font-bold my-2 py-2 px-4 rounded focus:outline-none focus:shadow-outline"
            type="submit"
            disabled={isLoading}
          >
            {isLoading ? "Loading..." : "Send"}
          </button>
          <button
            className="w-full bg-black hover:bg-gray-400 text-white font-bold my-2 py-2 px-4 rounded focus:outline-none focus:shadow-outline"
            type="button"
            onClick={clearForm}
          >
            Clear
          </button>
        </div>
      </form>
    </main>
  );
}

If you click the send button, you will get an error like this:

This is because the authenticated Cognito user isn't authorized to access Bedrock.

To fix the problem, we'll update the permissions to allow Cognito access to Amazon Bedrock and Amazon Kendra (since we'll use Kendra later).

amplify override project

This opens the qa-app/amplify/backend/awscloudformation/override.ts file in the default editor.

Next, add a policy to allow the authenticated role access to Amazon Bedrock’s InvokeModel and Amazon Kendra’s Retrieve command like this:

import {
  AmplifyProjectInfo,
  AmplifyRootStackTemplate,
} from "@aws-amplify/cli-extensibility-helper";

export function override(
  resources: AmplifyRootStackTemplate,
  amplifyProjectInfo: AmplifyProjectInfo
) {
  const authRole = resources.authRole;

  const basePolicies = Array.isArray(authRole.policies)
    ? authRole.policies
    : [authRole.policies];

  authRole.policies = [
    ...basePolicies,
    {
      policyName: "CognitoAuthorizedPolicy",
      policyDocument: {
        Version: "2012-10-17",
        Statement: [
          {
            Effect: "Allow",
            Action: "bedrock:InvokeModel",
            Resource:
              "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",
          },
          {
            Effect: "Allow",
            Action: "kendra:Retrieve",
            Resource: "arn:aws:kendra:us-east-1:*:index/*",
          },
        ],
      },
    },
  ];
}

After editing, run amplify push to apply the changes.

amplify push

Testing the App

Before integrating Kendra, which provides the RAG piece, I'll first ask the model questions about myself.

Prompt: Who is Chibuike Nwachukwu?

As can be seen, the model can't answer the question because it lacks context. We'll fix this by using Amazon Kendra to give the foundation model the context it needs to answer the question.

Integrating Amazon Kendra

First, we need to set up Kendra and create an index. Head over to the console and select Amazon Kendra.

Click Create an Index and provide an index name. Under IAM, select Create a new role (Recommended) and enter a role name. Click Next, then Next again on the follow-up screens, and click Create.

Copy the Kendra Index ID and keep it handy; we'll need it when interacting with Kendra later on.

Next, add data sources. For this, I added my website (https://chibuikenwa.com) as a data source.

Click Add data sources, search for Web, pick Web Crawler v2.0, and click Add connector.

Provide the data source name and click Next. Use https://chibuikenwa.com as the Source URL. Select Create a new role (Recommended) at the bottom, type in a role name, and click Next.

Keep the default options and select Run on demand for the Frequency option. Click Next, then Next again, then Add data source.

Once the data source is created, click the Sync Now button. We now have the Data Source active:

Integrating Kendra with the API

Update the contents of the chat.ts file to the code below. It gets the question from the client, queries Kendra for the most relevant results for that question, and then augments the initial Bedrock prompt with this new context:

import {
  BedrockRuntimeClient,
  InvokeModelCommand,
} from "@aws-sdk/client-bedrock-runtime";
import {
  KendraClient,
  RetrieveCommand,
  RetrieveCommandOutput,
} from "@aws-sdk/client-kendra";
import { Amplify, withSSRContext } from "aws-amplify";

import type { NextApiRequest, NextApiResponse } from "next";
import awsExports from "@/aws-exports";

Amplify.configure({ ...awsExports, ssr: true });

async function retrieve_kendra_docs(
  kendraClient: KendraClient,
  QueryText: string
): Promise<RetrieveCommandOutput> {
  try {
    const input = {
      IndexId: process.env.kendraId,
      QueryText,
    };

    const command = new RetrieveCommand(input);
    return await kendraClient.send(command);
  } catch (e) {
    console.log(e);
    return {
      ResultItems: [{ Content: "" }],
    } as RetrieveCommandOutput;
  }
}

export default async function handler(
  req: NextApiRequest,
  res: NextApiResponse
) {
  const question = JSON.parse(req.body).question;
  if (!question) {
    return res.status(400).json({ error: "Please provide question" });
  }

  const SSR = withSSRContext({ req });
  const credentials = await SSR.Auth.currentCredentials();
  const settings = {
    serviceId: "bedrock",
    region: "us-east-1",
    credentials,
  };
  const bedrock = new BedrockRuntimeClient(settings);
  const kendraClient = new KendraClient(settings);

  const context = await retrieve_kendra_docs(kendraClient, question);
  const chunks = context.ResultItems!.map(
    (retrieveResult) => retrieveResult.Content
  );

  const refinedContext = chunks.join("\n");
  const prompt = `Human:You are a friendly, concise chatbot. Here is some context, contained in <context> tags:

<context>
${refinedContext}
</context>

Given the context answer this question: ${question}
\n\nAssistant:`;

  const result = await bedrock.send(
    new InvokeModelCommand({
      modelId: "anthropic.claude-v2",
      contentType: "application/json",
      accept: "*/*",
      body: JSON.stringify({
        prompt,
        max_tokens_to_sample: 2000,
        temperature: 1,
        top_k: 250,
        top_p: 0.99,
        stop_sequences: ["\n\nHuman:"],
        anthropic_version: "bedrock-2023-05-31",
      }),
    })
  );
  res.status(200).json(JSON.parse(new TextDecoder().decode(result.body)));
}

Create a .env.local file and add the content below, replacing the value with your own Kendra Index ID:

kendraId=your-kendra-index-id
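Since an unset variable would silently send an empty IndexId to Kendra, you might add a small fail-fast check. This is my own addition, not something the tutorial requires:

```typescript
// Optional fail-fast check: surface a clear error when kendraId is
// missing instead of letting the Kendra call fail downstream.
function getKendraId(env: Record<string, string | undefined>): string {
  const id = env.kendraId;
  if (!id) {
    throw new Error("kendraId is not set - add it to .env.local");
  }
  return id;
}

// In the API route you would call getKendraId(process.env).
const exampleId = getKendraId({ kendraId: "example-index-id" });
```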

If you are using TypeScript, you will need to exclude the amplify directory in the tsconfig.json file at the root of your project. Update the "exclude" array in your tsconfig to include "amplify":

"exclude": ["node_modules", "amplify"]

Testing the App

Head back to the browser, and test the application to see what we have built so far.

The following are the updated test results after adding Amazon Kendra:

Test 1: Prompt: Who is Chibuike Nwachukwu

Result:

Test 2: Prompt: Which 5 technologies is Chibuike skilled at, and which company did he first work with?

Result:

And just like that, we have the application provide accurate answers using relevant context and information retrieved from my website.

And that is the power of RAG!

Hosting the App

To quickly host the application on AWS Amplify, run:

amplify add hosting

NB: Create a new role and accept the default settings provided for it.

Next, click to expand the Additional settings and add the Kendra index ID you saved as an environment variable.

Click the Add package version override button and set the Node.js version to v20, as v21 was breaking the frontend build:

Update the build image settings to use Amazon Linux:2023:

Lastly, update the build command so that AWS Amplify exports the kendraId environment variable into a .env.production file. This lets Next.js pick it up at build time.

I added this line to make this happen:

 - env | grep -e kendraId >> .env.production

Once the build is complete, head back to the terminal to complete the deployment.

Visit the Domain and you should have the application fully running!

Conclusion

In this article, we learned how to build a serverless Q&A application using Amazon Kendra, Bedrock, and AWS Amplify. We proceeded to host the Web App on AWS Amplify with just a few clicks!

You can find the source code here for the completed application.

Generative AI is here to stay. I can’t wait to see what you build with these great tools!
