In this post, we described how we optimized the performance of Cloudflare's Web Application Firewall (WAF) Machine Learning (ML) models through pre-processing optimization, model inference optimization, and caching. Together, these changes cut average WAF ML execution time from 1519 microseconds to 275 microseconds, an 81.90% reduction in execution time (roughly a 5.5x speedup). As a result, we can handle more traffic with the same resources, improving overall system performance and scalability.
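
As a quick sanity check on the figures above, here is a minimal Python sketch of the arithmetic. The function name `latency_summary` is ours, chosen for illustration, and is not part of the WAF codebase. It shows that the percentage describes the reduction in execution time, and that the same two measurements correspond to a speedup factor.

```python
def latency_summary(before_us: float, after_us: float) -> tuple[float, float]:
    """Return (percent reduction in execution time, speedup factor)."""
    # Share of the original execution time that was eliminated.
    reduction_pct = (before_us - after_us) / before_us * 100
    # How many times faster the optimized path is than the original.
    speedup = before_us / after_us
    return reduction_pct, speedup


# Average WAF ML execution times quoted in this post, in microseconds.
reduction_pct, speedup = latency_summary(1519, 275)
print(f"{reduction_pct:.2f}% less time, {speedup:.2f}x speedup")
# prints "81.90% less time, 5.52x speedup"
```

The two numbers answer different questions: the reduction says how much time each request no longer spends in WAF ML, while the speedup says how many more requests the same CPU time can cover.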