Where database blog posts get flame-broiled to perfection
Ah, another dispatch from the bleeding edge. It's always a treat to see such... enthusiasm for performance, especially when it comes to running unaudited, pre-release software. I must commend your bravery. Compiling five different versions of Postgres, including three separate betas, from source on a production-grade server? That's not just benchmarking; it's a live-fire supply chain attack drill you're running on yourself. Did you even check the commit hashes against a trusted source, or did you just git pull and pray? Bold. Very bold.
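Since we're being thorough where the post wasn't, here is roughly what "checking against a trusted source" looks like. A minimal sketch, assuming upstream's usual REL_18_BETA1-style tag naming; the point is pinning to a published tag and comparing hashes, not the exact commands.

    # Clone the canonical upstream, pin to a release tag, and record the commit
    # hash so it can be compared against the one announced by the project.
    git clone https://git.postgresql.org/git/postgresql.git
    cd postgresql
    git checkout REL_18_BETA1   # assumed tag name; adjust per release
    git rev-parse HEAD          # compare this against the upstream announcement

Four commands. That's the entire distance between "benchmark" and "honeypot seeding ritual."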
I'm particularly impressed by the choice of a large, powerful server. A 48-core AMD EPYC beast. It's the perfect environment to find out just how fast a speculative execution vulnerability can leak the 128GB of cached data you've so helpfully pre-warmed. You're not just testing QPS; you're building a world-class honeypot, and you're not even charging for admission. A true public service.
And the methodology! A masterclass in focusing on the trivial while ignoring the terrifying. You're worried about a ~2% regression in range queries. A rounding error. Meanwhile, you've introduced io_uring in your Postgres 18 configs. That's fantastic. It's a feature with a history of kernel-level vulnerabilities so fresh you can still smell the patches. You're bolting a rocket engine onto your database, and your main concern is whether the speedometer is off by a hair. I'm sure that will hold up well during the incident response post-mortem.
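For readers keeping score at home, the knob in question is PostgreSQL 18's new asynchronous I/O setting. A minimal sketch of flipping it, assuming the io_method GUC described in the 18 release notes (default 'worker'; changing it requires a restart):

    # Append the async I/O setting to the config; 'io_uring' is the rocket
    # engine, 'worker' is the shipped default, 'sync' is the old quiet life.
    echo "io_method = io_uring" >> "$PGDATA/postgresql.conf"
    pg_ctl restart -D "$PGDATA"   # io_method only takes effect on restart

One line of config. One brand-new kernel interface between your data and the internet. But sure, let's talk about the 2%.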
I have to applaud the efficiency here:
"To save time I only run 32 of the 42 microbenchmarks"
Of course. Why test everything? It's the "Known Unknowns" philosophy of security. The 10 microbenchmarks you skipped? I'm certain those weren't edge cases that could trigger some obscure integer overflow or a deadlock condition under load. No, I'm sure they were just the boring, stable ones. It's always the queries you don't run that can't hurt you. Right?
And the results are just... chef's kiss. Look at scan_range and scan.warm_range in beta1 and beta2. A 13-14% performance gain, which then evaporates and turns into a 9-10% performance loss by beta3. You call this a regression search; I call it a flashing neon sign that says "unstable memory management." That's not a performance metric; that's a vulnerability trying to be born. That's the kind of erratic behavior that precedes a beautiful buffer overflow. You're looking for mutex regressions, but you might be finding the next great remote code execution CVE.
Just imagine walking into a SOC 2 audit with this:
"git clone the master branch of a beta project and compile it ourselves."
They wouldn't just fail you; they'd frame your report on the wall as a cautionary tale.
Honestly, this is a beautiful piece of work. It's a perfect snapshot of how to chase single-digit performance gains while opening up attack surfaces the size of a planet. You're worried about a 2% dip while the whole foundation is built on the shifting sands of pre-release code.
Sigh. Another day, another database beta treated like a production candidate. At least it keeps people like me employed. Carry on.