Vlad Krasnov recently joined CloudFlare, where he works on low-level optimization of the company's servers. In a recent blog post, he discusses an improvement to PicoHTTPParser that uses the SSE4.2 instruction PCMPESTRI to find delimiters in HTTP requests and responses. That method is limited by the instruction's high latency and low throughput. Krasnov proposes using AVX2 instructions instead, which operate on 32 bytes at a time and have higher throughput. He also suggests changing the logical flow of the program from latency-bound to throughput-oriented by building bitmaps that mark every occurrence of a delimiter in a long string. The result is a significant performance improvement over the previous PCMPESTRI-based version.
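The bitmap idea can be illustrated with a minimal sketch in C. This is not Krasnov's code or the PicoHTTPParser API: the function names `delim_bitmap32` and `find_all` are hypothetical, and the sketch handles only a single delimiter byte. For each 32-byte block it produces a 32-bit mask with one bit per matching byte, then walks the set bits. When compiled with AVX2 enabled (for example `-mavx2`), a full block is handled by one compare and one movemask; otherwise a scalar loop builds the same mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical helper: bit i of the result is set when block[i] == delim,
 * for i < n. The caller guarantees n <= 32. */
static uint32_t delim_bitmap32(const unsigned char *block, size_t n,
                               unsigned char delim)
{
    assert(n <= 32);
#if defined(__AVX2__)
    if (n == 32) {
        /* Compare all 32 bytes at once, then collapse to one bit per byte. */
        __m256i data = _mm256_loadu_si256((const __m256i *)block);
        __m256i eq = _mm256_cmpeq_epi8(data, _mm256_set1_epi8((char)delim));
        return (uint32_t)_mm256_movemask_epi8(eq);
    }
#endif
    /* Scalar path: used for short tail blocks and for non-AVX2 builds. */
    uint32_t mask = 0;
    for (size_t i = 0; i < n; i++) {
        if (block[i] == delim)
            mask |= (uint32_t)1 << i;
    }
    return mask;
}

/* Hypothetical helper: stores up to max_out offsets of delim found in
 * buf[0..len) and returns how many were stored. */
static size_t find_all(const char *buf, size_t len, char delim,
                       size_t *out, size_t max_out)
{
    size_t count = 0;
    for (size_t base = 0; base < len; base += 32) {
        size_t n = (len - base < 32) ? (len - base) : 32;
        uint32_t mask = delim_bitmap32((const unsigned char *)buf + base, n,
                                       (unsigned char)delim);
        while (mask != 0 && count < max_out) {
            unsigned bit = 0;
            while (((mask >> bit) & 1u) == 0)  /* portable trailing-zero count */
                bit++;
            out[count++] = base + bit;
            mask &= mask - 1;                  /* clear the lowest set bit */
        }
    }
    return count;
}
```

The point of the design is that the expensive step (compare plus mask extraction) runs once per block regardless of how many delimiters the block contains, so independent blocks can be processed back to back instead of waiting on the result of each individual search.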