Google's Gemini 1.5 Pro is a highly capable multimodal model with context windows ranging from 128K tokens up to 1 million tokens in production, and up to 10 million tokens in research settings. It excels at long-context recall and retrieval, generalizing zero-shot to long inputs such as 3 hours of video or 22 hours of audio with near-perfect recall. The model uses a mixture-of-experts (MoE) architecture for more efficient training and higher-quality responses, reducing training compute requirements despite the larger context windows. Gemini 1.5 Pro shows substantial improvements over prior state-of-the-art models on tasks spanning text, code, vision, and audio, setting a new standard for AI's ability to recall and reason across extensive multimodal contexts.
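As a rough illustration of what long-context multimodal analysis looks like in practice, here is a minimal sketch using the `google-generativeai` Python SDK to ask Gemini 1.5 Pro about an uploaded recording. The file path, API key placeholder, and prompt are assumptions for the example; large media files are handled via the SDK's File API and may take a few minutes to process before inference.

```python
# Minimal sketch: long-context video analysis with gemini-1.5-pro.
# Requires: pip install google-generativeai
import time

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumes a key from Google AI Studio

# Upload a long recording via the File API (path is a placeholder).
video = genai.upload_file(path="lecture_recording.mp4")

# Wait until the uploaded media finishes server-side processing.
while video.state.name == "PROCESSING":
    time.sleep(10)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [video, "Summarize the key events in this video with timestamps."]
)
print(response.text)
```

The same pattern applies to long audio files or large document sets: the entire input fits inside one context window, so no chunking or external retrieval pipeline is needed.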
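To make the MoE idea concrete, the toy sketch below shows top-k expert routing in NumPy: a learned router sends each token to a small subset of expert networks, so only a fraction of the model's parameters is active per token. This is a generic illustration of the technique, not Gemini's actual architecture, whose internals are not public; all dimensions and weights here are made up.

```python
# Toy top-k mixture-of-experts layer: sparse routing of tokens to experts.
import numpy as np

rng = np.random.default_rng(0)
d_model, num_experts, top_k = 16, 4, 2

# Toy "experts": independent feed-forward weight matrices.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(num_experts)]
router_w = rng.standard_normal((d_model, num_experts)) * 0.1


def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token (row of x) to its top-k experts and mix their outputs."""
    logits = x @ router_w                            # (tokens, num_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # indices of the k winners
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = logits[t, top[t]]
        gates = np.exp(chosen - chosen.max())
        gates /= gates.sum()                         # softmax over the k winners
        for gate, e in zip(gates, top[t]):
            out[t] += gate * (x[t] @ experts[e])     # only k experts run per token
    return out


tokens = rng.standard_normal((3, d_model))
print(moe_layer(tokens).shape)  # (3, 16): same shape out, sparse compute
```

The efficiency win is that each token touches only `top_k` of `num_experts` expert networks, so total parameter count can grow without a proportional increase in per-token compute.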