SAM 2, released by Meta AI, is a foundation model for segmenting objects in both images and videos. It represents a significant step forward in computer vision, unifying state-of-the-art segmentation and tracking capabilities in a single model. SAM 2 offers robust zero-shot generalization, real-time interactivity, and improved accuracy and speed over its predecessor, the Segment Anything Model (SAM). Its architecture combines frame embeddings conditioned on a per-session memory module with a mask decoder that predicts multiple candidate masks to resolve ambiguity in video frames. SAM 2 is available under the Apache 2.0 license, and Meta has also released the SA-V dataset, a web demo, and the research paper to encourage further innovation and development in computer vision systems.
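To make the two architectural ideas above more concrete, here is a minimal, self-contained sketch in plain NumPy. This is not the real SAM 2 code: the class names (`ToyMemoryBank`, `decode_masks`), the averaging-based memory conditioning, and the threshold-based mask proposals are all simplified stand-ins, chosen only to illustrate (1) a per-session memory that conditions each frame's embedding on previously processed frames, and (2) a decoder that proposes multiple candidate masks and ranks them by a predicted quality score to resolve ambiguity.

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyMemoryBank:
    """Stores embeddings of past frames for a single tracking session."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = []

    def add(self, embedding):
        self.entries.append(embedding)
        # Keep only the most recent frames, FIFO-style.
        self.entries = self.entries[-self.capacity:]

    def condition(self, frame_embedding):
        """Blend the current frame embedding with the memory average."""
        if not self.entries:
            return frame_embedding
        memory = np.mean(self.entries, axis=0)
        return 0.5 * frame_embedding + 0.5 * memory

def decode_masks(conditioned_embedding, n_candidates=3):
    """Propose several candidate masks with a score each (stand-in logic)."""
    masks = [conditioned_embedding > t
             for t in np.linspace(0.3, 0.7, n_candidates)]
    scores = [float(m.mean()) for m in masks]  # stand-in for predicted quality
    return masks, scores

memory = ToyMemoryBank()
for _ in range(3):  # simulate a short video clip
    frame_embedding = rng.random((8, 8))
    conditioned = memory.condition(frame_embedding)
    masks, scores = decode_masks(conditioned)
    best = masks[int(np.argmax(scores))]  # keep the highest-scoring mask
    memory.add(conditioned)

print(best.shape, len(masks))
```

In the actual model the memory conditioning is attention-based and the decoder is learned, but the control flow per frame (embed, condition on memory, decode candidate masks, select, update memory) follows the same pattern.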