IT Brief Canada - Technology news for CIOs & IT decision-makers

Cloudflare uses ClickHouse for quadrillion-event analytics

Tue, 24th Mar 2026

Cloudflare uses ClickHouse to run analytics across quadrillions of events a day, underpinning query performance across its global network.

Details released by the companies show Cloudflare has relied on the open-source database for nearly a decade, making it one of the earliest large-scale users of the software. The system is designed to keep returning results even when traffic surges or large parts of network capacity are lost.

At a recent demonstration of its analytics environment, a single query covering one hour of data scanned 96 trillion events and returned in less than two seconds, with a margin of error below 1%. The same query, run across a full day, covered 1.61 quadrillion events and again returned in less than two seconds.
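A stated margin of error is characteristic of sampled queries: scanning a fixed fraction of the data and scaling the aggregate back up, which is how sub-second answers over quadrillions of rows become feasible. As a rough sketch (the query text, table name, and numbers below are illustrative, not Cloudflare's actual schema), ClickHouse's SQL dialect lets a table declare a sampling key so a query can read, say, 1% of rows:

```python
"""Illustrative sketch of sampling-based approximate analytics.

The SQL string assumes a hypothetical `events` table with a
sampling key; it is not Cloudflare's real schema.
"""

# In ClickHouse, SAMPLE 1/100 reads roughly 1% of the rows, and the
# aggregate is scaled back up to estimate the true total.
SAMPLED_QUERY = """
SELECT count() * 100 AS approx_events
FROM events SAMPLE 1/100
WHERE ts >= now() - INTERVAL 1 HOUR
"""


def scale_up(sample_count: int, sample_fraction: float) -> int:
    """Extrapolate a count taken over a fraction of the data."""
    return round(sample_count / sample_fraction)


# Counting ~960 billion rows in a 1% sample implies ~96 trillion events,
# the order of magnitude quoted for the one-hour query.
approx = scale_up(960_000_000_000, 0.01)
```

The trade-off is the margin of error the article mentions: the estimate wobbles with the sample, but the work done per query shrinks by the sampling factor.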

Scale test

Those figures illustrate the scale of the data Cloudflare handles as it serves about one-fifth of the world's websites. For the company, traffic growth and outages create similar engineering pressures because both can push systems beyond normal operating limits.

Cloudflare also simulated the loss of a major North American data centre, followed by the loss of North America as a whole. Errors rose, but queries still returned results as European clusters absorbed the load through its active-active network design.

The network spans more than 300 data centres, and query results remained within the same narrow margin of error during the disruption scenario.

Steady response

Engineers also expanded the query window from an hour to a day, a week, a month and a year. Response times remained steady across those larger data sets.

The case offers a glimpse into how internet infrastructure groups are rethinking analytics systems as data volumes rise sharply. Rather than build around a fixed peak, operators are trying to keep services usable during both extreme growth and major infrastructure failures.

Why it fits

One reason ClickHouse has suited Cloudflare's environment is its use of HTTP for interactions between analytics clients and the database. Compared with systems that rely on more tightly coupled interfaces, that approach simplifies integration with other tools and services.
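Concretely, ClickHouse accepts SQL queries over plain HTTP (port 8123 by default), so any client that can make a web request can talk to the database. A minimal sketch, assuming a hypothetical local node rather than anything in Cloudflare's environment:

```python
"""Minimal sketch of querying ClickHouse over its HTTP interface.

The host and query are illustrative; ClickHouse listens for HTTP
queries on port 8123 by default.
"""
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(host: str, query: str) -> str:
    """Encode a SQL query as a ClickHouse HTTP GET request URL."""
    return f"http://{host}:8123/?{urlencode({'query': query})}"


def run_query(host: str, query: str) -> str:
    """Send the query and return the raw tab-separated response."""
    with urlopen(build_query_url(host, query)) as resp:
        return resp.read().decode()


# Usage (requires a running ClickHouse node):
#   run_query("localhost", "SELECT 1")
```

Because the transport is just HTTP, load balancers, caches, and internal tools can sit in front of the database without a driver or a custom wire protocol.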

Cloudflare also pointed to a design that requires less coordination between nodes than some distributed database systems. In practice, that lets teams query different nodes and form changing combinations of resources without heavy orchestration.

Soft clusters

Cloudflare refers to this as using "soft clusters", a model that helps engineers steer workloads towards healthy nodes when parts of the system are under strain. It also allows teams to switch features on and off depending on the demands of a specific workload.
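The routing idea behind soft clusters can be sketched in a few lines: because nodes need little coordination, a client can simply pick whichever nodes are currently healthy. The node names and static health map below are hypothetical; a real system would draw health from monitoring:

```python
"""Sketch of "soft cluster" routing: send queries to whichever
nodes are healthy rather than to a fixed cluster.

Node names and the health map are hypothetical stand-ins for
live monitoring data.
"""


def pick_nodes(nodes: list[str], healthy: dict[str, bool], want: int = 2) -> list[str]:
    """Return up to `want` nodes currently marked healthy."""
    available = [n for n in nodes if healthy.get(n, False)]
    return available[:want]


nodes = ["nyc-1", "nyc-2", "fra-1", "fra-2"]
# Simulate the article's scenario: North American nodes lost,
# European nodes absorbing the load.
healthy = {"nyc-1": False, "nyc-2": False, "fra-1": True, "fra-2": True}
picked = pick_nodes(nodes, healthy)
```

Since membership is decided per query rather than fixed in advance, losing a region shrinks the candidate pool instead of breaking a hard-wired cluster topology.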

Even the database's SQL dialect, which took time to learn, has become a useful tool for analysing large volumes of operational data. Access to the open-source code and the surrounding developer community has also been important over the years.

Scaling lesson

Jamie Herre, Senior Director of Engineering at Cloudflare, said the broader lesson is that scaling work is never finished. He framed the issue not simply as handling more traffic, but as preparing systems to remain useful when capacity disappears unexpectedly.

"At Cloudflare we're always scaling," Herre said.

"There's always more trouble tomorrow. But we've designed our system around ClickHouse to be able to deal with that."

He also described how the company approaches planning for sudden shifts in demand or infrastructure availability.

"It's never too early, and it's never too late," Herre said. "You'll never be done."

"Say your workload suddenly scales 10x or 100x: would it fail in a good way, or would it fail in a bad way? And then the inverse of that is, what if you lose nine-tenths of the capacity? Would it just keel over? Or would you still be able to use it successfully?"