Where database blog posts get flame-broiled to perfection
Ah, another dispatch from the trenches of "industry." A performance benchmark. How utterly... quantifiable. One imagines the authors, chests puffed out, racing their souped-up jalopies down a drag strip, entirely oblivious to the fact that the wheels are about to fall off, the engine is leaking oil, and they have left behind the very passenger they were supposed to transport.
They're so obsessed with the raw, brutal speed of their queries that they've completely forgotten the point of a database management system. What of the relational model? What of Codd's Twelve Rules? I'd wager they couldn't name three of them without consulting a web browser—the very oracle of their intellectual decay. They're measuring transactions per second while blithely violating Rule 3, the Systematic Treatment of Null Values, with every other INSERT. It's... breathtakingly philistine.
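For the practitioners in the back row, here is a minimal sketch of what Rule 3 is about. It assumes SQLite driven through Python's standard `sqlite3` module. The `reading` table and its columns are hypothetical, invented purely for illustration; nothing here is taken from the benchmark under discussion. It contrasts a genuinely missing value stored as NULL with the all-too-common habit of stuffing in a sentinel such as 0.

```python
import sqlite3


def null_vs_sentinel():
    """Compare a missing value stored as NULL with one stored as a 0 sentinel."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: the same measurement stored two ways.
    conn.execute(
        "CREATE TABLE reading ("
        "id INTEGER PRIMARY KEY, null_temp REAL, sentinel_temp REAL)"
    )
    rows = [
        (1, 20.0, 20.0),
        (2, None, 0.0),  # the reading is missing: NULL vs. a fake zero
        (3, 30.0, 30.0),
    ]
    conn.executemany("INSERT INTO reading VALUES (?, ?, ?)", rows)

    # AVG skips NULLs, but it happily averages the sentinel zero.
    avg_null, avg_sentinel = conn.execute(
        "SELECT AVG(null_temp), AVG(sentinel_temp) FROM reading"
    ).fetchone()

    # 'col = NULL' evaluates to NULL (unknown), so it matches no rows.
    eq_matches = conn.execute(
        "SELECT COUNT(*) FROM reading WHERE null_temp = NULL"
    ).fetchone()[0]

    # IS NULL is the systematic way to ask for missing values.
    is_null_matches = conn.execute(
        "SELECT COUNT(*) FROM reading WHERE null_temp IS NULL"
    ).fetchone()[0]

    conn.close()
    return avg_null, avg_sentinel, eq_matches, is_null_matches


if __name__ == "__main__":
    print(null_vs_sentinel())
```

With the NULL column the average covers only the two real readings. With the sentinel column the fabricated zero is quietly averaged in, and no amount of transactions per second will make that number correct.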
And look at the subjects of this grand experiment! Community MySQL, Percona, MariaDB. Ooh, the variety! It's like comparing three different shades of beige paint for a house that has no foundation. They're all just frantic attempts to bolt ever-larger engines onto a chassis that was designed in an era when "web scale" meant your GeoCities page had a guestbook. They tweak buffer pools and fiddle with query caches, all while ignoring the fundamental compromises they've made to data integrity.
I particularly enjoyed this little confession:
...we know that results may vary depending on how you […]
Oh, you know, do you? "Results may vary." That, my dear practitioners, is what one writes when one has failed to control for variables. It is the last refuge of the scientifically inept. It is an open admission that your "benchmark" is less a rigorous experiment and more a child shaking a toy to see what noises it makes. Clearly they've never read Stonebraker's seminal work on benchmarking methodologies. That would have required, you know, reading a paper, an activity seemingly as archaic to them as using a card catalog.
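Since the concept appears to need illustrating, here is a hedged sketch of what "controlling for variables" might look like at its most elementary. It assumes an in-memory SQLite database and a toy insert workload. The functions `run_trial` and `benchmark`, the table `t`, and every parameter are hypothetical names of my own, not anything the benchmark's authors used. The workload is fixed by a seed, a warm-up run is discarded, and the spread across trials is reported alongside the median rather than hidden behind "results may vary."

```python
import random
import sqlite3
import statistics
import time


def run_trial(seed, rows=2000):
    """Time one bulk insert of a seed-determined workload; returns seconds."""
    rng = random.Random(seed)  # same seed => identical data in every trial
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
    data = [(i, rng.randint(0, 1_000_000)) for i in range(rows)]

    start = time.perf_counter()
    with conn:  # commits the transaction when the block exits
        conn.executemany("INSERT INTO t VALUES (?, ?)", data)
    elapsed = time.perf_counter() - start

    conn.close()
    return elapsed


def benchmark(trials=5, warmups=1, seed=42):
    """Run warm-ups, then repeated trials; report median and spread."""
    for _ in range(warmups):
        run_trial(seed)  # warm-up results are discarded
    samples = [run_trial(seed) for _ in range(trials)]
    return {
        "trials": trials,
        "median_s": statistics.median(samples),
        "stdev_s": statistics.stdev(samples),  # requires trials >= 2
        "samples": samples,
    }


if __name__ == "__main__":
    print(benchmark())
```

This is a toy, not a methodology. The point is merely that a single number without its variance and its fixed inputs is an anecdote.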
This frantic obsession with raw throughput is a symptom of a deeper sickness. It is the direct result of a generation that believes the CAP theorem is a menu from which one can pick two, rather than a fundamental, mathematically proven constraint on distributed systems that demands careful, thoughtful design. They gleefully sacrifice Consistency for Availability and then spend millions on "data observability" platforms to figure out why their numbers don't add up.
They'll champion their eventual consistency models and their NoSQL "innovations," conveniently forgetting that they are simply rediscovering the problems we solved with ACID properties forty years ago. For these people:
[…] READ UNCOMMITTED is for, isn't it? Let the chaos reign!
Still, one must encourage the children. So, bravo. You've made the number go up. It's a very big number indeed. By all means, continue measuring how fast you can drive your car towards the cliff's edge.
Now, if you'll excuse me, I have a first-year graduate seminar on relational algebra to prepare. Perhaps one of you might sit in on it someday. You might learn something foundational.