A simple application for conversing with raw texts using RAG (Retrieval-Augmented Generation).
Ensure Docker is installed and configured on the host machine where you plan to run this app.
Note: for faster performance, use a machine with a GPU.
- First, build the app's Docker image:
docker build -t talk-to-text:latest .
- Then run docker compose to bring up the supporting services (the vector database and NoSQL database) that the application uses to facilitate RAG:
docker compose up -d
Then open the following URL to access the FastAPI docs, trigger the APIs, and get started:
http://localhost:8080/docs
- Upload a raw text using the texts API.
- Create a conversation using the textId obtained from the texts API.
- Post your query about the text to the conversation.
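The three steps above could be scripted roughly as follows, using only Python's standard library. The endpoint paths (`/texts`, `/conversations`) and field names (`textId`, `conversationId`, `query`) are assumptions based on this README, not verified against the actual API; check the docs at http://localhost:8080/docs for the real schemas.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed from the docs URL above


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the given API path."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post(path: str, payload: dict) -> dict:
    """Send a JSON POST request and decode the JSON response."""
    with urllib.request.urlopen(build_request(path, payload)) as resp:
        return json.load(resp)


def run_demo() -> dict:
    """Walk through upload -> conversation -> query (hypothetical endpoints)."""
    # 1. Upload a raw text via the texts API.
    text = post("/texts", {"text": "The mitochondria is the powerhouse of the cell."})
    # 2. Create a conversation from the returned textId.
    conv = post("/conversations", {"textId": text["textId"]})
    # 3. Post a query to the conversation.
    return post(
        f"/conversations/{conv['conversationId']}/queries",
        {"query": "What is the powerhouse of the cell?"},
    )
```

Call `run_demo()` only after the containers are up; it will fail with a connection error otherwise.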
Planned improvements:
- Use Ollama instead of Hugging Face.
- Use FastAPI's APIRouter interface to organize multiple routes.
- Provide an API to customize the Hugging Face model used, and likewise the prompt template.
- Provide an API to customize text chunking and vectorization.
- Add a chat UI.
- Support multiple models in the same conversation (a query can be directed to a specific model via an '@' mention).
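The '@' routing in the last item could look like the sketch below. The model names, the default model, and the mention syntax (a leading `@model ` prefix) are all hypothetical choices for illustration, not part of the current app.

```python
import re

# Models assumed to be registered in a conversation; names are hypothetical.
REGISTERED_MODELS = {"llama3", "mistral"}
DEFAULT_MODEL = "llama3"

# A mention is '@' followed by a model name at the start of the query.
MENTION = re.compile(r"^@(\w+)\s+")


def route_query(query: str) -> tuple[str, str]:
    """Return (model, cleaned_query).

    A leading '@model ' mention directs the query to that model and is
    stripped from the text; anything else falls back to the default model.
    """
    m = MENTION.match(query)
    if m and m.group(1) in REGISTERED_MODELS:
        return m.group(1), query[m.end():]
    return DEFAULT_MODEL, query
```

Unrecognized mentions are deliberately left in the query text so the default model still sees exactly what the user typed.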