This repo contains the codebase for a Retrieval-Augmented Generation (RAG) based chatbot at https://chat.jayeshdev.com
that I built for my technical blog.
This repo also has a companion blog post: "Chat with my blog: A RAG based chatbot that talks about me and my blog!"
The application is built using the following technologies:
- Backend:
- Frontend:
  - NextJS & Chakra UI for the UI
  - LangchainJS for interacting with backend APIs
- Deployment:
  - Docker for containerization and multi-stage builds
  - Docker Compose for orchestrating multi-container applications
The codebase is built on top of the excellent chat-langchain repo by LangChain and carries the MIT License. I made the following modifications to the original code:
- Backend:
  - Refactored to use a self-hosted Chroma Vector Database (with security) instead of Weaviate Cloud.
  - Used Together AI for embeddings (msmarco-bert-base-dot-v5) and answer generation (Mixtral-Instruct-v0.1).
  - Added support for parsing with Unstructured IO during ingestion.
  - Improved the chain so that it generates better standalone questions and incorporates a summary of the chat history.
  - Refactored the code to improve modularity and maintainability.
  - Improved the prompts with step-by-step instructions and few-shot examples.
  - Added support for using open-source Langfuse instead of LangSmith for monitoring.
- Frontend:
  - Removed the LangSmith integration.
  - Modified the example prompts and page contents.
  - Added a footer element with links to my social profiles.
- Deployment:
  - Added Dockerfiles with multi-stage builds for the backend and frontend to keep the deployment lightweight.
To get started, clone this repository to your local machine using the following command:
git clone https://github.com/jayeshmahapatra/rag-chatbot
- Backend
  - Modify dev.config or prod.config at rag_chatbot_backend/chatbot_backend/configs depending on your deployment target.
  - Create a chroma/chroma.env file with the same format and info as chroma/chroma.env.example.
  - Create a rag_chatbot_backend/keys.env file with the same format and info as rag_chatbot_backend/keys.env.example.
- Frontend env file
  - Create a rag_chatbot_frontend/.env.local file with the same format and info as rag_chatbot_frontend/.env.example.
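If you prefer to script the env file setup, here is a minimal Python sketch that copies each example file to its target path. The file paths come from the steps above; the helper name `create_env_files` and the choice to skip targets that already exist are my assumptions and are not part of the repo. You still need to edit the copied files and fill in your own values.

```python
import shutil
from pathlib import Path
from typing import List

# (example file, target file) pairs, relative to the repository root.
ENV_FILES = [
    ("chroma/chroma.env.example", "chroma/chroma.env"),
    ("rag_chatbot_backend/keys.env.example", "rag_chatbot_backend/keys.env"),
    ("rag_chatbot_frontend/.env.example", "rag_chatbot_frontend/.env.local"),
]


def create_env_files(repo_root: str = ".") -> List[str]:
    """Copy each example env file to its target unless the target exists.

    Hypothetical helper (not part of the repo). Returns the relative
    paths of the files that were actually created.
    """
    root = Path(repo_root)
    created = []
    for example, target in ENV_FILES:
        src, dst = root / example, root / target
        # Never overwrite an env file that already holds real values.
        if src.is_file() and not dst.exists():
            shutil.copyfile(src, dst)
            created.append(target)
    return created
```

Calling `create_env_files()` from the repository root creates whichever of the three files are missing and leaves existing ones untouched.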
Use docker compose to build and deploy in detached mode.
docker compose -f docker-compose.dev.yml up --build -d
For the production environment:
docker compose -f docker-compose.prod.yml up --build -d
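The two commands above differ only in the compose file they pass. If you want to drive the deployment from a script, a small wrapper might look like the sketch below. The function names and the `"dev"` / `"prod"` keys are hypothetical; only the compose file names and flags are taken from the commands above.

```python
import subprocess
from typing import List

# Deployment target -> compose file, as named in this repo.
COMPOSE_FILES = {
    "dev": "docker-compose.dev.yml",
    "prod": "docker-compose.prod.yml",
}


def compose_up_command(target: str) -> List[str]:
    """Build the `docker compose ... up --build -d` command for a target."""
    if target not in COMPOSE_FILES:
        raise ValueError(f"unknown target {target!r}, expected one of {sorted(COMPOSE_FILES)}")
    return ["docker", "compose", "-f", COMPOSE_FILES[target], "up", "--build", "-d"]


def deploy(target: str) -> None:
    """Run the compose command and raise if it exits with a non-zero status."""
    subprocess.run(compose_up_command(target), check=True)
```

For example, `deploy("dev")` runs the same command as the development line above.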
If the mounted folders have no data in them, the Chroma Vector Database will be empty.
You can populate it by starting an interactive session in the backend container and running ingestion_pipeline.py.
Find the name or ID of the backend container using docker
docker ps
Launch an interactive session
docker exec -it <backend_container_id_or_name> /bin/bash
Execute the ingestion pipeline
python ingestion_pipeline.py
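The three steps above can also be run without an interactive shell. The following is a rough sketch under a few assumptions: that the backend container's name contains the word "backend" (check your own `docker ps` output, since the actual name depends on your compose project), and that ingestion_pipeline.py sits in the container's default working directory, as the interactive steps suggest. All function names here are hypothetical.

```python
import subprocess
from typing import List, Optional


def list_running_containers() -> List[str]:
    """Return the names of running containers, as reported by `docker ps`."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


def pick_backend_container(names: List[str], keyword: str = "backend") -> Optional[str]:
    """Return the single container whose name contains `keyword`.

    Returns None when there is no match or the match is ambiguous.
    The default keyword is an assumption about how the container is named.
    """
    matches = [name for name in names if keyword in name]
    return matches[0] if len(matches) == 1 else None


def ingestion_command(container: str) -> List[str]:
    """Build a non-interactive equivalent of the exec + python steps."""
    return ["docker", "exec", container, "python", "ingestion_pipeline.py"]


def run_ingestion() -> None:
    """Find the backend container and run the ingestion pipeline inside it."""
    container = pick_backend_container(list_running_containers())
    if container is None:
        raise RuntimeError("could not identify a unique backend container")
    subprocess.run(ingestion_command(container), check=True)
```

Call `run_ingestion()` once the containers are up. If the container cannot be identified unambiguously, the sketch raises an error instead of guessing, and you can fall back to the manual steps.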
The repository structure is organized as follows:
- Root:
  - Contains the Docker Compose files for both development (docker-compose.dev.yml) and production (docker-compose.prod.yml).
- Chroma:
  - Contains environment files (chroma.env) for the Chroma Vector Database used in the project.
- rag_chatbot_backend:
  - Contains the backend codebase for the RAG chatbot.
  - chatbot_backend:
    - Contains the core components of the chatbot backend:
      - chain: Implements the retrieval-augmented generation logic using langchain.
      - configs: Contains configuration files (dev.config, prod.config) for different deployment environments.
      - ingestion_pipeline.py: Script for populating the Chroma Vector Database.
      - main.py: Main FastAPI entry point for the backend langserve server.
      - utils: Contains utility functions that are used during ingestion.
- rag_chatbot_frontend:
  - Contains the frontend codebase for the RAG chatbot.
  - app:
    - components: Contains reusable UI components for the chatbot interface.
    - globals.css: Global styles for the frontend.
    - layout.tsx and page.tsx: Layout and page components.
    - utils: Contains constants (constants.tsx).
  - Other configuration and build files:
    - .env.example and .env.local: Environment variable files.