The text discusses generating human motion from written descriptions, which has applications in industries such as gaming, filmmaking, robotics, animation, and the emerging metaverse. It presents T2M-GPT, a framework that combines a Vector Quantized Variational Autoencoder (VQ-VAE) with a Generative Pre-trained Transformer (GPT) to synthesize human motion from textual descriptions: the VQ-VAE compresses motion sequences into sequences of discrete codebook indices ("motion tokens"), and the GPT then autoregressively predicts those tokens conditioned on the text. T2M-GPT achieves performance superior to other state-of-the-art approaches, including recent diffusion-based methods. The text also discusses human motion synthesis more broadly: the development of algorithms and models that simulate human movement using motion capture data, physical laws, and artificial intelligence techniques.
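As a rough illustration of this two-stage design, the sketch below pairs a minimal motion VQ-VAE with a small causal transformer in PyTorch. All class names, layer sizes, and the 263-dimensional motion representation are illustrative assumptions rather than the authors' implementation, and the text embedding is a stand-in for one produced by a model such as CLIP.

```python
import torch
import torch.nn as nn

class MotionVQVAE(nn.Module):
    """Stage 1 (sketch): compress a motion sequence into discrete codebook
    indices ("motion tokens") and reconstruct motion from them."""
    def __init__(self, motion_dim=263, latent_dim=512, codebook_size=512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(motion_dim, latent_dim, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv1d(latent_dim, latent_dim, kernel_size=3, padding=1),
        )
        self.decoder = nn.Sequential(
            nn.Conv1d(latent_dim, latent_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(latent_dim, motion_dim, kernel_size=4, stride=2, padding=1),
        )
        self.codebook = nn.Embedding(codebook_size, latent_dim)

    def quantize(self, z):
        # z: (batch, latent_dim, T') -> nearest codebook entry per timestep
        z = z.permute(0, 2, 1)                         # (batch, T', latent_dim)
        dist = torch.cdist(z, self.codebook.weight.unsqueeze(0))
        indices = dist.argmin(dim=-1)                  # discrete motion tokens
        z_q = self.codebook(indices)
        z_q = z + (z_q - z).detach()                   # straight-through estimator
        return z_q.permute(0, 2, 1), indices

    def forward(self, motion):                         # motion: (batch, T, motion_dim)
        z = self.encoder(motion.permute(0, 2, 1))
        z_q, indices = self.quantize(z)
        recon = self.decoder(z_q).permute(0, 2, 1)
        return recon, indices

class TextToMotionGPT(nn.Module):
    """Stage 2 (sketch): a causal transformer that autoregressively predicts
    the next motion token, with the projected text embedding prepended as
    the conditioning token at the first position."""
    def __init__(self, codebook_size=512, text_dim=512, d_model=512,
                 n_heads=8, n_layers=4, max_len=64):
        super().__init__()
        self.token_emb = nn.Embedding(codebook_size + 1, d_model)  # +1: end token
        self.text_proj = nn.Linear(text_dim, d_model)
        self.pos_emb = nn.Parameter(torch.zeros(1, max_len + 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, codebook_size + 1)

    def forward(self, text_emb, token_ids):
        x = torch.cat([self.text_proj(text_emb).unsqueeze(1),
                       self.token_emb(token_ids)], dim=1)
        x = x + self.pos_emb[:, : x.size(1)]
        # Causal mask so each position attends only to earlier positions.
        n = x.size(1)
        causal = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
        return self.head(self.blocks(x, mask=causal))  # next-token logits

# Toy usage: motion -> tokens (stage 1), text + tokens -> logits (stage 2).
vqvae, gpt = MotionVQVAE(), TextToMotionGPT()
motion = torch.randn(2, 64, 263)      # dummy batch of motion sequences
recon, tokens = vqvae(motion)         # tokens: (2, 32) discrete indices
text_emb = torch.randn(2, 512)        # stand-in for a CLIP text embedding
logits = gpt(text_emb, tokens)        # (2, 33, 513) next-token distribution
```

At inference time, only the second stage would be sampled: starting from the text token alone, the transformer generates motion tokens one at a time until an end token, and the VQ-VAE decoder maps the resulting index sequence back to motion.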