Meta Imagine AI Just Got an Impressive GIF Update

Company

Encord

Date Published

May 13, 2024

Author

Stephen Oladele

Word count

1606

Language

English

Hacker News points

None

URL

encord.com/blog/meta-imagine-ai-image-generators

Summary

Imagine Flash, a new distillation framework from Meta AI, accelerates diffusion models such as Emu by cutting inference time while maintaining high-quality image generation. It generates images in just one to three denoising steps, an improvement over existing methods. The approach combines three key components: Backward Distillation, Shifted Reconstruction Loss, and Noise Correction. Together they reduce the number of iterations required for high-quality image synthesis from 25 in Emu to just 3, which significantly reduces inference time. Imagine Flash is also described as handling extended context capabilities, such as generating images up to 128K, and as producing images at over 800 tokens per second using a fast-sampling approach. Its three components are refined through preformatted fine-tunes, which let the model specialize in specific domains or styles and improve its versatility across use cases. The framework has been shown to outperform existing methods in both quantitative metrics and perceptual quality, with performance comparable to the teacher model.
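The step reduction described in the summary (25 iterations in Emu versus 3 in Imagine Flash) can be illustrated with a toy sampling loop that counts denoiser evaluations. This is only a sketch: `CountingDenoiser` and `sample` are hypothetical names invented for illustration, the "denoiser" is a placeholder that shrinks its input rather than a real network, and nothing here reproduces Backward Distillation, Shifted Reconstruction Loss, or Noise Correction.

```python
import random


class CountingDenoiser:
    """Hypothetical stand-in for a diffusion denoiser that counts its calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: list[float], t: float) -> list[float]:
        self.calls += 1
        # Placeholder update (assumption): a real model would predict noise
        # or a clean image conditioned on the timestep t.
        return [v * 0.5 for v in x]


def sample(denoiser: CountingDenoiser, num_steps: int, dim: int = 4, seed: int = 0) -> list[float]:
    """Run a toy iterative sampler: one denoiser call per denoising step."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from Gaussian noise
    for i in range(num_steps):
        t = 1.0 - i / num_steps  # timestep decreasing from 1 toward 0
        x = denoiser(x, t)
    return x


teacher = CountingDenoiser()
student = CountingDenoiser()
sample(teacher, num_steps=25)  # step count the summary gives for Emu
sample(student, num_steps=3)   # step count the summary gives for Imagine Flash

ratio = teacher.calls / student.calls
print(f"teacher calls: {teacher.calls}, student calls: {student.calls}")
print(f"reduction in denoiser evaluations: {ratio:.1f}x")
```

The ratio only counts denoiser evaluations. Actual wall-clock gains depend on the rest of the pipeline, such as text encoding and image decoding, which this sketch does not model.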