The text discusses NVIDIA's Multi-Instance GPU (MIG) feature on H100 GPUs, which lets developers split a single physical GPU into two or more virtual GPUs, each with its own isolated memory and compute resources. This enables efficient serving of machine learning models, providing equal or better performance than A100 GPUs at a 20% lower cost. Fractional H100 GPUs also offer support for FP8 precision, increased flexibility, and broader availability across cloud providers and regions. The guide provides an overview of how MIG works, the specs of fractional H100 GPUs, and what performance to expect when serving models on H100 MIG-based instances.
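
As a minimal sketch of the isolation MIG provides, the snippet below assumes an H100 that has already been partitioned into MIG instances and a PyTorch environment; the MIG device UUID is a hypothetical placeholder for one reported by `nvidia-smi -L`, and the memory and SM counts printed will depend on the MIG profile in use.

```python
# Sketch: confirming that a process sees only its MIG slice of an H100.
# Assumes the GPU has already been partitioned with MIG.
import os

# Pin this process to a single MIG instance before CUDA initializes.
# The UUID below is a placeholder; substitute a real one from `nvidia-smi -L`.
os.environ.setdefault(
    "CUDA_VISIBLE_DEVICES",
    "MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
)

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    # A fractional H100 reports only the memory and streaming multiprocessors
    # assigned to its slice, not the full card's resources.
    print(f"Device name:        {props.name}")
    print(f"Total memory (GiB): {props.total_memory / 1024**3:.1f}")
    print(f"Multiprocessors:    {props.multi_processor_count}")
else:
    print("No CUDA device visible to this process.")
```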