shivamarora1/msp

MSP (Movie Search by Plot) uses semantic search to find relevant movies. Unlike traditional search that relies on keyword matching, MSP compares the meaning of a plot description with the plots of known movies, so users can find titles that match what they actually have in mind.

Here are some examples:

Input: Adventure of scientist in his regular life
Result: The Creeping Flesh

Input: Horror love story
Result: Terror in the Aisles

👉 Live Demo

[Image: recording of the demo]

Architecture

[Image: MSP architecture diagram]

Milvus: Vector database that stores the embedding vectors. A free community cloud version of Milvus is also available at https://cloud.zilliz.com/
all-MiniLM-L6-v2: Sentence-embedding model that maps sentences and paragraphs to a 384-dimensional vector space. It converts plain sentences into vector embeddings.
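
To make the idea of searching by embedding concrete, here is a minimal sketch, not the project's actual code. It assumes the sentence-transformers package for loading all-MiniLM-L6-v2 and assumes cosine similarity as the ranking metric (the metric configured in Milvus is not stated here). The names embed_texts, rank_by_similarity, and the toy movie titles are hypothetical. The ranking part runs on toy 3-dimensional vectors so that it works without downloading the model.

```python
import math


def embed_texts(texts):
    """Encode texts into 384-dimensional vectors (assumes sentence-transformers)."""
    # Imported lazily so the rest of this sketch runs without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(texts).tolist()


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, plot_vecs, top_k=1):
    """Return the top_k titles whose vectors are closest to query_vec."""
    scored = [
        (cosine_similarity(query_vec, vec), title)
        for title, vec in plot_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_k]]


# Toy 3-dimensional vectors standing in for real 384-dimensional embeddings.
toy_plots = {
    "Movie A": [0.9, 0.1, 0.0],
    "Movie B": [0.1, 0.9, 0.2],
    "Movie C": [0.0, 0.2, 0.9],
}
toy_query = [0.8, 0.2, 0.1]

best = rank_by_similarity(toy_query, toy_plots, top_k=2)
print(best)
```

In the real application, the vector database performs this nearest-neighbour lookup at scale; the loop above only illustrates what "closest plot" means.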

Steps to run project

  1. Run bash standalone_embed.sh start to host the Milvus database locally.
  2. Set MILVUS_URI and MILVUS_TOKEN in the .env file to the appropriate values.
  3. Download the movies dataset from Hugging Face. It has the following columns:
Release Year | Title | Origin/Ethnicity | Director | Cast | Genre | Wiki Page | Plot | Image
  4. Set up a virtual environment.
python3 -m venv .venv
source .venv/bin/activate
  5. Install the required dependencies.
pip install -r requirements
  6. Create and store the movie plot embeddings.
python create_embeddings.py
  7. Run the application.
make run-streamlit-app
  8. The Streamlit app should open in your browser.
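
As an illustration of the .env step above, the following sketch shows one way the two settings could be read and handed to a Milvus client. It is an assumption about how the configuration might be consumed, not the project's own code: load_env_file and connect_to_milvus are hypothetical helpers, the URI and token values are placeholders, and the connection uses pymilvus's MilvusClient, which accepts uri and token arguments.

```python
import os
import tempfile


def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop optional surrounding quotes around the value.
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def connect_to_milvus(settings):
    """Open a Milvus connection from parsed settings (assumes pymilvus)."""
    # Imported lazily so the parser above works without pymilvus installed.
    from pymilvus import MilvusClient

    return MilvusClient(uri=settings["MILVUS_URI"], token=settings["MILVUS_TOKEN"])


# Demonstrate the parser on a throwaway .env file with placeholder values.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as fh:
        fh.write("# placeholder values\n")
        fh.write("MILVUS_URI=http://localhost:19530\n")
        fh.write('MILVUS_TOKEN="your-token"\n')
    settings = load_env_file(env_path)

print(settings)
```

Calling connect_to_milvus(settings) only succeeds once a Milvus instance is reachable at the configured URI, so the sketch stops after parsing.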

Follow for more such content 👉 https://medium.com/@shivamarora1