A History of HTML Parsing at Cloudflare: Part 2

What's this blog post about?

In 2017, developers using Workers, the Cloudflare edge compute platform, wanted the same HTML rewriting capabilities that Cloudflare used internally. To meet this demand, Cloudflare built a streaming HTML rewriter/parser with a CSS-selector-based API in Rust and open-sourced it as LOL HTML. Its main departure from the previous rewriter, LazyHTML, is a dual-parser architecture, adopted to avoid the performance overhead of wrapping and unwrapping every token that is passed to the Workers runtime. The new approach significantly speeds up parsing and reduces both output latency and memory consumption.

Company
Cloudflare

Date published
Nov. 29, 2019

Author(s)
Andrew Galloni, Ivan Nikulin

Word count
3142

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.