Moondream2 Vision Model Streamlit App

This is a Streamlit app that uses the Moondream2 Vision Model to generate text based on an uploaded image and a user-provided prompt.

Features

Upload an image in PNG or JPEG format.
Enter a prompt to guide the text generation.
Generate text based on the uploaded image and prompt.
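
The PNG/JPEG restriction can also be checked against the file's contents rather than its name. The sketch below is illustrative and is not taken from vision.py: detect_image_format is a hypothetical helper that only inspects the standard PNG and JPEG signature bytes at the start of the data.

```python
from typing import Optional

# Standard file signatures ("magic bytes") for the two supported formats.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_image_format(data: bytes) -> Optional[str]:
    """Return "png" or "jpeg" if the bytes start with a known signature, else None.

    Hypothetical helper for illustration; it does not validate the rest of the file.
    """
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None
```

A check like this could run on the uploaded bytes before they are passed to the model, so that a renamed file of another type is rejected early.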

How to Run

Install the required Python packages:

pip install -r requirements.txt

Run the Streamlit app:

streamlit run vision.py

Open the app in your web browser at http://localhost:8501 (Streamlit's default address; the exact URL is also printed in the terminal).

Usage

Upload an image using the file uploader.
Enter a prompt in the text input field.
Click the "Generate" button to generate text based on the image and prompt.
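
The three steps above map onto a small amount of Streamlit code. The following is a minimal sketch and not the actual contents of vision.py. The function name render_page and the generate callable (image bytes plus prompt in, text out) are assumptions made for illustration. The Streamlit module is passed in as a parameter so the flow can be exercised without a running server.

```python
from typing import Any, Callable, Optional


def render_page(st: Any, generate: Callable[[bytes, str], str]) -> Optional[str]:
    """Draw the upload / prompt / button flow and return the generated text, if any.

    `st` is the imported streamlit module. `generate` is a hypothetical
    callable wrapping the model: (image bytes, prompt) -> text.
    """
    st.title("Moondream2 Vision Model")
    uploaded = st.file_uploader("Upload an image", type=["png", "jpg", "jpeg"])
    prompt = st.text_input("Prompt")
    clicked = st.button("Generate")

    # Streamlit reruns the script on every interaction; do nothing until the click.
    if not clicked:
        return None

    # Both inputs are required before the model is called.
    if uploaded is None or not prompt.strip():
        st.warning("Upload an image and enter a prompt first.")
        return None

    text = generate(uploaded.getvalue(), prompt)
    st.write(text)
    return text


# In a script started with `streamlit run`, the call would look like:
#   import streamlit as st
#   render_page(st, my_generate_function)  # my_generate_function is hypothetical
```

Keeping the model call behind a plain callable means the page logic does not depend on how the model is loaded.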

About the Model

The Moondream2 Vision Model is a small but capable vision language model, reported to outperform models twice its size. It was created by @vikhyatk.

License

This project is open source and released under the MIT License.