Where database blog posts get flame-broiled to perfection
Ah, yes. I happened upon yet another dispatch from the front lines of 'modern' data engineering, this one breathlessly describing the trials of running a database inside... Kubernetes. It reads less like an engineering document and more like a cry for help from a group of children who have just discovered that playing with matches can, in fact, burn down the treehouse. One is almost compelled to feel pity, but frankly, they brought this upon themselves.
It seems a systematic review of their, shall we say, innovations is in order.
They begin by celebrating the ephemeral nature of their infrastructure. "Pods are ephemeral; nodes can come and go," they chirp, as if building a repository of record on a foundation of quicksand were a laudable design goal. The entire point of a database management system, my dear industry cowboys, is to provide a stable abstraction on top of unreliable hardware. We have known this for half a century. To instead embrace the chaos and call it "cloud-native" is an intellectual capitulation of the highest order. It's a feature, not a bug!
This invariably leads to their absolute fetish for "eventual consistency." This is a delightful euphemism for "currently incorrect." They've traded the 'C' and 'I' in ACID for a vague promise that your data might be correct... eventually. Perhaps next Tuesday. A bank that is only 'eventually consistent' about one's account balance is a bank that is committing fraud. But slap a trendy name on it, and suddenly it's a paradigm shift. The intellectual sloppiness is simply breathtaking.
Then there is the willful, almost proud, ignorance of Brewer's CAP theorem. They prance around shouting about "Globally Distributed ACID Transactions" as if they've suspended the laws of physics through sheer force of marketing. They speak of high availability and strong consistency in the same sentence without a hint of irony. Clearly they've never read Stonebraker's seminal work on the matter, or they simply chose to ignore it in favor of a more marketable fantasy. They haven't "solved" the trade-off; they've just hidden it behind a dozen layers of YAML and hoped no one would notice.
"Kubernetes moves workloads as needed." Yes, and in doing so, it creates precisely the network partitions the theorem warned you about. You've invented a self-inflicted problem. Bravo.
And the data model! If one can even call it that. They've abandoned the mathematical purity of Codd's relational model for what amounts to a glorified key-value store where you can stuff a 20MB JSON document and pray. It violates the spirit, if not the letter, of nearly all Twelve Rules. The idea of a systematic, logical foundation has been replaced by a "flexible schema," which is academic-speak for having no standards whatsoever. It is the informational equivalent of a teenager's bedroom floor.
But do carry on with your little containerized experiments. It's... charming... to see you all discovering, with great fanfare, the very problems that Jim Gray and his contemporaries solved in the 1980s. Keep iterating! With enough venture capital, you might just reinvent the B-Tree next. Now, if you'll excuse me, I have a lecture to prepare on third normal form, a concept I fear is now considered hopelessly quaint.