Where database blog posts get flame-broiled to perfection
(Leans back in a creaking ergonomic nightmare of a chair, stained with coffee from the Reagan administration. Squints at the screen over a pair of bifocals held together with electrical tape.)
Well, look at this. The kids have discovered that if you try to make a relational database act like something it's not, it still acts like a relational database. Groundbreaking stuff. It's a real barn-burner of an article, this one. "Think PostgreSQL with JSONB can replace a document database? Be careful." You don't say. Next, you'll tell me that my station wagon can't win the Indy 500 just because I put a racing stripe on it.
Back in my day, we didn't have "domain-driven aggregates." We had a master file on a tape reel and a transaction file on another. You read 'em both, you wrote a new master file. We called it a "batch job," and it was written in COBOL. If you wanted "data that is always queried together" to be in the same place, you designed your record layouts on a coding form, by hand, and you didn't whine about it. You kids and your fancy "document models"... you've just reinvented the hierarchical database, but with more curly braces and a worse attitude. IMS/DB was doing this on mainframes when your CEO was still learning how to use a fork.
So this fella goes through all this trouble to prove a point. He loads up a million rows of nonsense by piping /dev/urandom into base64. Real cute. We had a keypunch machine and a stack of 80-column cards. Our test data had structure, even if it was just EBCDIC gibberish. You learn respect for data when you can drop it on your foot.
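For the kids following along at home, the exercise looks roughly like this. The table name is my invention, and I'm using pgcrypto's gen_random_bytes() in place of his shell pipe, but the effect is the same: values fat enough to force the database's hand.

```sql
-- A sketch of the article's data load, not his exact script.
-- Requires the pgcrypto extension for gen_random_bytes().
CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE TABLE docs (
    id      bigint PRIMARY KEY,
    payload jsonb
);

-- A million rows of base64 nonsense. pgcrypto caps gen_random_bytes()
-- at 1024 bytes per call, so we concatenate four calls; the resulting
-- ~5.5 KB of incompressible text sails past the ~2 KB threshold where
-- PostgreSQL starts TOASTing values out of line.
INSERT INTO docs (id, payload)
SELECT n,
       jsonb_build_object(
           'filler',
           encode(gen_random_bytes(1024) || gen_random_bytes(1024) ||
                  gen_random_bytes(1024) || gen_random_bytes(1024), 'base64')
       )
FROM generate_series(1, 1000000) AS n;
```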
And the big "gotcha"? He discovers TOAST.
"In PostgreSQL, however, the same JSON value may be split into multiple rows in a separate TOAST table, only hiding the underlying index traversal and joins."
Let me get this straight. You took a bunch of related data, jammed it into a single column to avoid having a second table with a foreign key, and the database... toasted it by splitting it up and storing it in... a second table with an internal key. And this is presented as a shocking exposé?
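Here's the whole conspiracy, laid bare. Every oversized value becomes numbered chunks in a side table, and you can go look at them whenever you like. The relation name below is a placeholder (yours is named after your table's OID), and you may need superuser rights to peek:

```sql
-- Peeking inside a TOAST table: one value's chunks, keyed and ordered.
-- pg_toast_16384 is hypothetical; find yours via pg_class.reltoastrelid.
SELECT chunk_id, chunk_seq, octet_length(chunk_data) AS bytes
FROM pg_toast.pg_toast_16384
ORDER BY chunk_id, chunk_seq
LIMIT 5;
```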
Son, we called this "overflow blocks" in DB2 back in 1985. When a VARCHAR field got too big, the system would dutifully stick the rest of it somewhere else and leave a pointer. It wasn't magic, it was just sensible engineering. You're acting like you've uncovered a conspiracy when all you've done is read the first chapter of the manual. The database is just cleaning up your mess behind the scenes, and you're complaining about the janitor's methods. This whole song and dance with pageinspect and checking B-Tree levels to "prove" there's an index... of course there's an index! How else did you think it was going to find the data chunks? Wishful thinking? Synergy?
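You don't need pageinspect forensics to find that index, either. The catalog hands it over if you ask nicely; here, assuming the docs table from my sketch above:

```sql
-- The TOAST table and its index, straight from the system catalogs.
SELECT c.relname AS main_table,
       t.relname AS toast_table,
       i.relname AS toast_index
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
JOIN pg_index x ON x.indrelid = t.oid
JOIN pg_class i ON i.oid = x.indexrelid
WHERE c.relname = 'docs';
```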
The best part is this line right here: "the lookup to a TOAST table is similar to the old N+1 problem with ORMs." You kids are adorable. You think the "N+1 problem" is some new-fangled issue from these object-relational mappers. We called it "writing a shitty, row-by-row loop in your application code." We didn't write a blog post about it; we just took away your 3270 terminal access until you learned how to write a proper join.
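For the youngsters who've only ever seen this through an ORM, the N+1 pattern is just this, plain as day. The table names here are my own invention:

```sql
-- The N+1 pattern: one query for the parent rows...
SELECT order_id FROM orders WHERE customer_id = 42;
-- ...then one of these per row, fired from a loop in application code:
SELECT sku, qty FROM order_lines WHERE order_id = 1001;

-- The proper join: one round trip, and the planner sees the whole problem.
SELECT o.order_id, l.sku, l.qty
FROM orders o
JOIN order_lines l ON l.order_id = o.order_id
WHERE o.customer_id = 42;
```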
So after all that, the performance is worse. Reading the "embedded" document is slower than the honest, god-fearing JOIN on two properly normalized tables. The buffer hits go up. The query plan looks like a spaghetti monster cooked up by a Node.js developer on a Red Bull bender. And the final conclusion is... drumroll please...
"If your objective is to simulate MongoDB and use a document model to improve data locality, JSONB may not be the best fit."
You have spent thousands of words, generated gigabytes of random data, and meticulously analyzed query plans to arrive at the stunning conclusion that a screwdriver makes a lousy hammer. Congratulations. You get a gold star. We've known this since Codd himself laid down the law. You're treating his Rule #8, physical data independence, like you just discovered it on some ancient scroll, but we were living it while you were still trying to figure out how to load a program from a cassette tape.
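And that "meticulous analysis," for the record, boils down to one command any of us could run. Table and key come from my sketch above; your buffer counts will differ:

```sql
-- Shows actual buffer hits, including the extra ones spent chasing
-- TOAST chunks when the JSONB value lives out of line.
EXPLAIN (ANALYZE, BUFFERS)
SELECT payload -> 'filler'
FROM docs
WHERE id = 123456;
```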
This whole fad is just history repeating itself. In the 90s, it was object databases. In the 2000s, it was shoving everything into giant XML columns. Now it's JSONB. And I'll tell you what happens next, because I've seen this movie before. In about three to five years, there will be a new wave of blog posts. They'll be titled "The Great Un-JSONing: Migrating from JSONB back to a Relational Model." A whole new generation of consultants will make a fortune untangling this mess, writing scripts to parse these blobs back into clean, normalized tables. And I'll be right here, cashing my pension checks and laughing into my Sanka.
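I can even draft the consultants' deliverable for them right now. Every name below is invented, and it assumes the blob carries a 'lines' array of {sku, qty} objects, but the shape of the job never changes:

```sql
-- The Great Un-JSONing, in miniature: shred the blob back into a real table.
INSERT INTO order_lines (order_id, sku, qty)
SELECT d.id,
       line ->> 'sku',
       (line ->> 'qty')::int
FROM docs d
CROSS JOIN LATERAL jsonb_array_elements(d.payload -> 'lines') AS line;
```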
Now if you'll excuse me, I've got a backup tape from '98 that needs to be restored. It's probably got a more sensible data model on it than this.