`Auto-evaluator` 🧠 📝

This is a lightweight evaluation tool for question answering that uses LangChain to:

Ask the user to input a set of documents of interest
Apply an LLM (GPT-3.5-turbo) to auto-generate question-answer pairs from these docs
Generate a question-answering chain with a specified set of UI-chosen configurations
Use the chain to generate a response to each question
Use an LLM (GPT-3.5-turbo) to score the response relative to the answer
Explore scoring across various chain configurations
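
The steps above can be sketched as a small loop. This is a minimal illustration, not the code in `auto-evaluator.py`: the names `QAPair`, `EvalResult`, `run_eval`, and `compare_configs` are hypothetical, and the question-answering chain and the grader are passed in as plain callables so the example runs without any LLM or API key. In the real tool both roles are filled by an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class QAPair:
    """One evaluation item: a question and its reference answer."""
    question: str
    answer: str


@dataclass
class EvalResult:
    """Per-question rows plus the fraction of responses graded correct."""
    rows: List[dict]
    accuracy: float


def run_eval(
    eval_set: List[QAPair],
    answer_question: Callable[[str], str],
    grade: Callable[[str, str, str], bool],
) -> EvalResult:
    """Answer every question with the chain, then grade each response."""
    rows = []
    for pair in eval_set:
        response = answer_question(pair.question)
        correct = grade(pair.question, pair.answer, response)
        rows.append(
            {
                "question": pair.question,
                "answer": pair.answer,
                "response": response,
                "correct": correct,
            }
        )
    accuracy = sum(1 for r in rows if r["correct"]) / len(rows) if rows else 0.0
    return EvalResult(rows=rows, accuracy=accuracy)


def compare_configs(
    eval_set: List[QAPair],
    chains: Dict[str, Callable[[str], str]],
    grade: Callable[[str, str, str], bool],
) -> Dict[str, float]:
    """Score several chain configurations on the same eval set."""
    return {
        name: run_eval(eval_set, chain, grade).accuracy
        for name, chain in chains.items()
    }


# Stand-ins for the LLM-backed pieces (assumptions for illustration only).
eval_set = [QAPair("What is 2 + 2?", "4"), QAPair("Capital of France?", "Paris")]
canned = {"What is 2 + 2?": "4", "Capital of France?": "Lyon"}


def canned_chain(question: str) -> str:
    return canned[question]


def exact_match(question: str, answer: str, response: str) -> bool:
    return answer.strip().lower() == response.strip().lower()


result = run_eval(eval_set, canned_chain, exact_match)
```

Keeping the chain and the grader as interchangeable callables is what makes the last step, comparing scores across configurations, a matter of passing a different function to the same loop.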

Run as Streamlit app

pip install -r requirements.txt

streamlit run auto-evaluator.py

Inputs

num_eval_questions - Number of questions to auto-generate (if the user does not supply an eval set)

split_method - Method for text splitting

chunk_chars - Chunk size for text splitting

overlap - Chunk overlap for text splitting

embeddings - Embedding method for chunks

retriever_type - Chunk retrieval method

num_neighbors - Number of neighboring chunks to retrieve

model - LLM for summarization of retrieved chunks

grade_prompt - Prompt choice for model self-grading
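
To make the inputs concrete, here is a hedged sketch that gathers them into one configuration object and shows how `chunk_chars` and `overlap` could interact in a simple character-based splitter. The field names come from the list above; the class name `EvalConfig`, the function `split_text`, and every default value are placeholders invented for illustration, not the app's actual options or its splitting implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EvalConfig:
    """Inputs from the list above; all defaults are illustrative placeholders."""
    num_eval_questions: int = 5
    split_method: str = "placeholder-splitter"
    chunk_chars: int = 1000
    overlap: int = 100
    embeddings: str = "placeholder-embeddings"
    retriever_type: str = "placeholder-retriever"
    num_neighbors: int = 3
    model: str = "placeholder-model"
    grade_prompt: str = "placeholder-prompt"


def split_text(text: str, chunk_chars: int, overlap: int) -> List[str]:
    """Split text into chunks of at most chunk_chars characters.

    Consecutive chunks share `overlap` characters, so each new chunk
    starts (chunk_chars - overlap) characters after the previous one.
    """
    if chunk_chars <= 0:
        raise ValueError("chunk_chars must be positive")
    if not 0 <= overlap < chunk_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_chars")

    step = chunk_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_chars])
        # Stop once a chunk reaches the end of the text, so the tail
        # is not repeated in a redundant, shorter chunk.
        if start + chunk_chars >= len(text):
            break
    return chunks


config = EvalConfig(chunk_chars=4, overlap=2)
chunks = split_text("abcdefghij", config.chunk_chars, config.overlap)
```

With `chunk_chars=4` and `overlap=2`, each chunk repeats the last two characters of the one before it. A larger overlap reduces the chance that an answer is cut at a chunk boundary, at the cost of more chunks to embed and retrieve.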

Blog

https://blog.langchain.dev/auto-eval-of-question-answering-tasks/

Disclaimer

You will need an OpenAI API key with access to `GPT-4` and an Anthropic API key to take advantage of all of the default dashboard model settings. However, additional models (e.g., from Hugging Face) can be easily added to the app.
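
A quick pre-flight check for the two keys might look like the sketch below. It assumes the keys are supplied through the conventional `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` environment variables, which this README does not confirm; the helper name `missing_api_keys` is hypothetical.

```python
import os
from typing import List, Mapping

# Assumed variable names; the app may read its keys differently.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_api_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required keys that are unset or blank."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


missing = missing_api_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
```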
