Where database blog posts get flame-broiled to perfection
Alright, gather 'round, folks, because the titans of database research have dropped another bombshell! We're talking about the earth-shattering revelations from Postgres 18 beta2 performance! And let me tell you, when your main takeaway is 'up to 2% less throughput' on a benchmark step you had to run 10 times longer because you apparently still can't figure out how long to run your 'work in progress' steps, well, that's just riveting stuff, isn't it? It's not a benchmark, it's a never-ending science fair project.
And this 'tl;dr' summary? Oh, it's a masterpiece of understatement. We've got our thrilling 2% decline in one corner, dutifully mimicking previous reports (consistency, at least, in mediocrity!). Then, in the other corner, a whopping 12% gain on a single, specific benchmark step that probably only exists in this particular lab's fever dreams. They call it 'much better'; I call it grasping at straws to justify the whole exercise.
The 'details' are even more glorious. A single client and a cached database, because that's exactly how your high-traffic, real-world systems are configured, right? No contention, no network latency, just pure, unadulterated synthetic bliss. We load 50 million rows, then do 160 million writes, then 40 million more, then create three secondary indexes: all very specific, very meaningful operations, I'm sure. And let's not forget the thrilling suspense of 'waiting for N seconds after the step finishes to reduce variance.' Because nothing says 'robust methodology' like manually injecting idle time to smooth out the bumps.
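In case you're wondering how thin this ritual really is, here it is in fifteen-odd lines of Python. To be fair about what's whose: the step order, row counts, and the post-step idle wait come from the post itself; every function name, signature, and the settle time are my own stand-ins, not their harness.

```python
import time

def load_rows(n): pass                 # stand-in for the 50M-row initial load
def do_writes(n): pass                 # stand-in for the write-heavy steps
def create_secondary_indexes(k): pass  # stand-in for the index builds

def run_step(name, work, settle_seconds=5):
    """Run one benchmark step, then sit idle for settle_seconds, i.e. the
    manually injected quiet time that passes for 'reducing variance'."""
    start = time.monotonic()
    work()
    elapsed = time.monotonic() - start
    print(f"{name}: {elapsed:.2f}s of work, {settle_seconds}s of staring at the wall")
    time.sleep(settle_seconds)

# The sequence the post describes, in order:
run_step("initial load",  lambda: load_rows(50_000_000))
run_step("write phase 1", lambda: do_writes(160_000_000))
run_step("write phase 2", lambda: do_writes(40_000_000))
run_step("index builds",  lambda: create_secondary_indexes(3))
```

That's the whole methodology: do work, wait, repeat. The idle time doesn't remove the variance, it just hides it between the steps.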
Then we get to the alphabet soup of benchmarks: l.i0, l.x, qr100, qp500, qr1000. It's like they're just mashing the keyboard and calling it a workload. My personal favorite is the 'SLA failure' if the target insert rate isn't sustained during a synthetic test. News flash: an SLA failure that only exists in your test harness isn't a failure, it's a toy. No actual customer is calling you at 3 AM because your qr100 benchmark couldn't hit its imaginary insert rate.
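And because no roast is complete without receipts, here's that entire 'SLA' reduced to one comparison. The pass/fail idea is the post's; the function name, the >= comparison, and the reading of qr100 as 'queries with a 100-per-second background insert target' are my guesses, since the naming scheme doesn't exactly explain itself.

```python
def sla_met(measured_insert_rate, target_insert_rate):
    """The 'SLA' the post invents: during a query step like qr100, background
    inserts must sustain the target rate or the step is declared a failure.
    The name and the >= comparison are my assumptions about the rule."""
    return measured_insert_rate >= target_insert_rate

# 97 inserts/s measured against a 100/s target: a 'failure' that pages no one.
print(sla_met(measured_insert_rate=97.0, target_insert_rate=100.0))  # False
```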
And finally, the crowning achievement: relative QPS, meticulously color-coded like a preschooler's art project. Red for less than 0.97, green for greater than 1.03. So, if your performance changes by, say, 1.5% in either direction, it's just 'grey', which, translated from corporate-speak, means "don't look at this, it's statistically insignificant noise we're desperately trying to spin." Oh, and let's not forget the glorious pronouncement: "Normally I summarize the summary but I don't do that here to save space." Because after pages of highly specific, utterly meaningless numerical gymnastics, that's where we decide to be concise.
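For the record, that whole color scheme compresses to one function. The 0.97 and 1.03 thresholds are straight from the post; the function itself is just my paraphrase of the rule.

```python
def relative_qps_color(base_qps, new_qps):
    """Relative QPS = new / base, then bucket it. Thresholds are the post's;
    'grey' covers everything within +/-3%, i.e. most real-world results."""
    ratio = new_qps / base_qps
    if ratio < 0.97:
        return ratio, "red"    # regression, break out the headlines
    if ratio > 1.03:
        return ratio, "green"  # improvement, also headlines
    return ratio, "grey"       # 'don't look at this'

print(relative_qps_color(1000, 985))  # (0.985, 'grey'): a 1.5% swing, officially noise
```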
So, what does this groundbreaking research mean for you, the actual developer or DBA out there? Absolutely nothing. Your production Postgres instance will continue to operate exactly as it did before, blissfully unaware of the thrilling 2% regression on a synthetic query in a cached environment. My prediction? In the next beta, they'll discover a 0.5% gain on a different, equally irrelevant metric, and we'll have to sit through this whole song and dance again. Just deploy the damn thing and hope for the best, because these 'insights' certainly aren't going to save your bacon.