Building a Multimodal Product Recommender Demo Using Milvus and Streamlit

What's this blog post about?

The Milvus Multimodal RAG demo is a product recommendation system that uses Google DeepMind's MagicLens multimodal embedding model to encode an image and a text instruction into a single multimodal vector. This vector is then used to search a Milvus vector database for the closest-matching Amazon products. The technologies used in the demo are Google DeepMind's MagicLens, OpenAI's GPT-4o, Milvus, and Streamlit. The data comes from the Amazon Reviews 2023 dataset, of which a 5K-item subset is used for the demonstration. Setting up MagicLens involves creating an environment, installing dependencies, and downloading the model weights. The Milvus server stores, indexes, and searches the vectors, while Streamlit provides a user-friendly interface for uploading images and entering text instructions. The Ask GPT function uses OpenAI's GPT-4o mini multimodal generative model to provide AI-powered recommendations based on the search results.
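
As a rough illustration of the pipeline described above, the sketch below shows how an image-plus-instruction query could flow through Streamlit, Milvus, and GPT-4o mini. It is a minimal sketch rather than the demo's actual code: the encode_image_text helper stands in for the MagicLens encoder, and the collection name, output fields, and prompt wording are assumptions.

# Minimal sketch of the recommendation flow: Streamlit input -> MagicLens-style
# embedding -> Milvus vector search -> GPT-4o mini recommendation.
# encode_image_text, the collection name, and the field names are assumptions.
import streamlit as st
from openai import OpenAI
from pymilvus import MilvusClient

milvus = MilvusClient(uri="http://localhost:19530")   # local Milvus server
openai_client = OpenAI()                              # reads OPENAI_API_KEY

def encode_image_text(image_bytes: bytes, text: str) -> list[float]:
    """Hypothetical wrapper around the MagicLens model: returns a single
    multimodal embedding for the image + instruction pair."""
    raise NotImplementedError

st.title("Multimodal product recommender")
image_file = st.file_uploader("Upload a product image", type=["jpg", "jpeg", "png"])
instruction = st.text_input("Describe what you want, e.g. 'same style but in blue'")

if image_file and instruction:
    query_vec = encode_image_text(image_file.read(), instruction)

    # Search the 5K-item Amazon subset for the nearest product vectors.
    hits = milvus.search(
        collection_name="amazon_products_5k",          # assumed collection name
        data=[query_vec],
        limit=5,
        output_fields=["title", "image_url"],          # assumed schema fields
    )[0]

    titles = [hit["entity"]["title"] for hit in hits]
    st.write(titles)

    # "Ask GPT": let GPT-4o mini pick and explain the best candidate.
    prompt = (
        f"The user asked for: {instruction}\n"
        f"Candidate products: {titles}\n"
        "Recommend the best match and explain why in one sentence."
    )
    reply = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    st.write(reply.choices[0].message.content)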

Company
Zilliz

Date published
July 30, 2024

Author(s)
Christy Bergman, David Wang, and Reina Wang

Word count
1134

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.