🔥 The DB Grill 🔥

Where database blog posts get flame-broiled to perfection

ClickHouse® Kafka Engine vs Tinybird Kafka Connector
Originally from tinybird.co/blog-posts
December 31, 2025 • Roasted by Rick "The Relic" Thompson

Alright, settle down, kids. Let me put down my coffee—the kind that's brewed strong enough to dissolve a tape head—and take a look at this... this masterpiece of modern data engineering someone just forwarded me. "Compare ClickHouse OSS Kafka Engine and Tinybird's Kafka connector." Oh, this is a treat. It's like watching two toddlers argue over which one invented the crayon.

You're talking about the "tradeoffs" and "failure modes" of getting data from one bucket into another. A "pipeline," you call it. Adorable. Back in my day, we called that a batch job. It was written in COBOL, scheduled with JCL, and if it failed, you got a single, unambiguous error code. Usually Abend S0C7. You didn't need a 2,000-word blog post to diagnose it. You fixed the bad data on the punch card and resubmitted the job. Problem solved.

So, let's see. First, we have the "Kafka Engine." You're telling me you've built a database inside your database just to read from a log file? Congratulations, you've invented the READ statement and layered it under eight levels of abstraction and a hundred thousand lines of C++. We used to read from sequential files on tape reels the size of a manhole cover. You ever have to physically mount a tape in a Unisys drive at 3 a.m. because the nightly billing run failed? The smell of ozone and desperation? That builds character. This... this "consumer group lag" you worry about sounds like a personal problem, not a system architecture issue.
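For the kids in the back, here's roughly what this contraption looks like. A minimal sketch, with a made-up broker, topic, and schema, since the article on the grill doesn't hand us one: a table that pretends to be a Kafka consumer, a real table that actually stores something, and a materialized view to shovel rows from one to the other. Three database objects to perform one READ.

```sql
-- ClickHouse Kafka engine: the "database inside your database."
-- Broker address, topic, and columns below are hypothetical.
CREATE TABLE clicks_queue
(
    user_id UInt64,
    url     String,
    ts      DateTime
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list  = 'clicks',
         kafka_group_name  = 'clickhouse_clicks',
         kafka_format      = 'JSONEachRow';

-- The table that actually stores data, the way a table should.
CREATE TABLE clicks
(
    user_id UInt64,
    url     String,
    ts      DateTime
)
ENGINE = MergeTree
ORDER BY (ts, user_id);

-- The materialized view that shovels rows from the queue table into storage.
CREATE MATERIALIZED VIEW clicks_mv TO clicks AS
SELECT user_id, url, ts
FROM clicks_queue;
```

And mind you, that first table stores nothing. Drop the materialized view and nothing gets consumed at all; the rows just sit in Kafka waiting for somebody to care, which is exactly the kind of "failure mode" that earns a blog post.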

And its competitor in this grand showdown? "Tinybird's Kafka connector." Tinybird. What's next, your data warehouse is called "FluffyBunnyDB"? We had names like IMS, IDMS, DB2. Information Management System. It sounded important because it was. It held the payroll for a company with 50,000 employees, not the clickstream data for a cat photo-sharing app. This "connector" is just another piece of middleware, another black box to fail mysteriously when the full moon hits the network switch just right. We called it a "program," and we had the source code. Printed on green bar paper.

The article talks about handling "exactly-once semantics." You kids are obsessed with this. We had "it-ran-or-it-didn't-once" semantics. The job either completed and updated the master record, or it abended and we rolled back the transaction log. A log which, by the way, was also on tape. You want a war story? Try recovering a corrupted VSAM file from a backup tape that's been sitting in an off-site salt mine for six months. That's a "failure mode" that'll put some hair on your chest.

"Understand tradeoffs, failure modes and when to choose each solution..."

Let me tell you the tradeoff. In 1985, we could have built this. We'd have a CICS transaction reading from a message queue (yes, we had those) and stuffing the data into a DB2 table. We'd have kept a summary table to handle the "analytics," the same trick you all rediscovered and called "materialized views" twenty years later. It would have run on a single mainframe, used about 100 MIPS, and it would still be running today, untouched, processing billions of transactions without a single blog post written about its "observability."
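Here's that same trick spelled out in later-vintage DB2, where it eventually got an official name, the materialized query table. A minimal sketch, with hypothetical table names, since nobody is handing me a payroll schema:

```sql
-- A DB2 materialized query table: the summary table your elders kept by hand,
-- with a vendor feature name stapled on. Table names are hypothetical.
CREATE TABLE click_summary AS
  (SELECT url, COUNT(*) AS hits
     FROM click_events
    GROUP BY url)
  DATA INITIALLY DEFERRED
  REFRESH DEFERRED
  MAINTAINED BY SYSTEM;

-- Refresh it when the batch window opens. No connectors, no lag dashboards.
REFRESH TABLE click_summary;
```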

The issues you list here are just symptoms of needless complexity.

So you can have your Kafka Engines and your... chuckles... your Tinybirds. I'll be here, sipping my burnt coffee, secure in the knowledge that there is nothing new under the sun, especially not in databases. Everything you're "inventing" now is just a poorly remembered version of something we perfected on a System/370 while you were still learning to use a fork.

Thanks for the read, but I've got a COBOL program to debug. It's only 40 years old, practically brand new. And don't worry, I've already set up a rule to send any future emails from this blog directly to the bit bucket.