WebUOT-1M is an underwater object tracking dataset of 1.1 million annotated frames. It was created to address the shortcomings of earlier underwater object tracking datasets, which lack sufficient scale and diversity in target categories and scenarios. The dataset covers many kinds of marine life, water environments, and lighting conditions. It provides high-quality bounding box annotations, attributes, and labels for each frame, which makes it a useful resource for research in underwater vision understanding, marine environmental monitoring, and marine animal conservation.

The dataset can be explored in the FiftyOne app, which lets users visualize the data, filter by label, and build dashboards of plots over the various information fields. Video embeddings can be computed with relatively little effort through a plugin that uses the Hiera embedding model from Meta (formerly Facebook), and SAM2 can be applied for object segmentation and detection, with promising results on basic segmentation tasks.

Real-world underwater object tracking, however, requires more than per-frame segmentation. A practical system must handle identity preservation, temporal consistency, occlusion recovery, and group dynamics, and all four challenges are particularly acute in underwater environments.
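To make the identity-preservation problem concrete, here is a minimal sketch of greedy IoU-based association between consecutive frames. It is not the tracker used with WebUOT-1M, FiftyOne, or SAM2; the function names `iou` and `associate`, the `(x1, y1, x2, y2)` box format, and the 0.3 threshold are assumptions chosen purely for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box],
    detections: List[Box],
    next_id: int,
    min_iou: float = 0.3,  # illustrative threshold, not a tuned value
) -> Tuple[Dict[int, Box], int]:
    """Carry track IDs into the next frame by greedy IoU matching.

    Pairs are matched from highest IoU downwards. Detections left
    unmatched start new tracks; tracks left unmatched are dropped.
    """
    pairs = sorted(
        (
            (iou(box, det), track_id, det_idx)
            for track_id, box in tracks.items()
            for det_idx, det in enumerate(detections)
        ),
        reverse=True,
    )
    updated: Dict[int, Box] = {}
    used_tracks, used_dets = set(), set()
    for score, track_id, det_idx in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if track_id in used_tracks or det_idx in used_dets:
            continue
        updated[track_id] = detections[det_idx]
        used_tracks.add(track_id)
        used_dets.add(det_idx)
    for det_idx, det in enumerate(detections):
        if det_idx not in used_dets:
            updated[next_id] = det
            next_id += 1
    return updated, next_id


# Two targets in frame 1; the detector reports them in swapped order in frame 2.
frame1 = [(10, 10, 50, 50), (100, 100, 140, 140)]
frame2 = [(104, 102, 144, 142), (12, 11, 52, 51)]

tracks, next_id = associate({}, frame1, next_id=0)
tracks, next_id = associate(tracks, frame2, next_id)
print(tracks)
```

In this toy case each target keeps its ID even though the detection order changes between frames. The sketch also shows why the harder requirements matter: a track is dropped the moment its target goes unmatched, so an occluded animal returns under a new ID, and tightly grouped targets with heavily overlapping boxes can easily swap identities.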