Where database blog posts get flame-broiled to perfection
Alright, let's pull up a chair. I've got my coffee, my risk assessment matrix, and a fresh pot of existential dread. Let's read this... benchmark report.
"Postgres continues to do a great job at avoiding regressions over time." Oh, that's just wonderful. A round of applause for the Postgres team. You've managed to not make the car actively slower while bolting on new features. I feel so much safer already. Itβs like celebrating that your new skyscraper design includes floors. The bar is, as always, on the ground.
But let's dig in, shall we? Because the real gems, the future CVEs, are always in the details you gloss over.
First, your lab environment. An ASUS ExpertCenter PN53. Are you kidding me? That's not a server; that's the box my CFO uses for his Zoom calls. You're running "benchmarks" on a consumer-grade desktop toy with SMT disabled, probably because you read a blog post about Spectre from 2018 and thought, "I'm something of a security engineer myself." What other mitigations did you forget? Is the lid physically open for "air-gapped cooling"? This isn't a hardware spec; it's a cry for help.
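If you genuinely cared which mitigations are active, you wouldn't guess; you'd ask the kernel. A minimal sketch, assuming a Linux box with the standard sysfs layout (nothing here comes from the report, obviously):

```python
# Minimal sketch: report which CPU vulnerability mitigations the kernel
# actually has active, instead of assuming. Linux-only; assumes the
# standard sysfs layout under /sys/devices/system/cpu/vulnerabilities.
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")

def report_mitigations() -> None:
    if not VULN_DIR.is_dir():
        print("No sysfs vulnerability info here (not Linux, or kernel too old).")
        return
    for entry in sorted(VULN_DIR.iterdir()):
        status = entry.read_text().strip()
        print(f"{entry.name:24s} {status}")

if __name__ == "__main__":
    report_mitigations()
```

Thirty seconds of work, and suddenly "SMT disabled" is a fact with a paper trail instead of a vibe.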
And you compiled from source. Fantastic. I hope you enjoyed your make command. Did you verify the GPG signature of the tarball? Did you run a checksum against a trusted source? Did you personally audit the entire toolchain and all dependencies for supply chain vulnerabilities? Of course you didn't. You just downloaded it and ran it, introducing a beautiful, gaping hole for anyone who might've compromised a mirror or a developer's GitHub account. Your entire baseline is built on a foundation of "I trust the internet," which is a phrase that should get you fired from any serious organization.
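Here, since apparently nobody teaches this anymore: a minimal sketch of checksum verification before you build. The filename and digest below are placeholders, not the real Postgres values; get the actual digest over a channel you trust, and verify the GPG signature on top of it:

```python
# Minimal sketch: verify a downloaded tarball against a published
# SHA-256 digest before building. The filename and expected digest
# are placeholders; fetch the real digest over a trusted channel.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "0000...replace-with-the-published-digest"  # placeholder
TARBALL = Path("postgresql-x.y.tar.gz")  # placeholder filename

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

actual = sha256_of(TARBALL)
if actual != EXPECTED_SHA256:
    raise SystemExit(f"Checksum mismatch: got {actual}")
print("Checksum OK; now go verify the GPG signature too.")
```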
Let's look at your methodology. "To save time I only run 32 of the 42 microbenchmarks." I'm sorry, you did what? You cut corners on your own test plan? What dark secrets are lurking in those 10 missing tests? Are those the ones that expose race conditions? Unhandled edge cases? The queries that actually look like the garbage a front-end developer would write? You didn't save time; you curated your results to tell a happy story. That's not data science; that's marketing.
And the test itself: 1 client, 1 table, 50M rows. This is a sterile, hermetically sealed fantasy land. Where's the concurrency? Where are the deadlocks? Where are the long-running analytical queries stomping all over the OLTP workload? Where's the malicious user probing for injection vulnerabilities by sending crafted payloads that look like legitimate queries? You're not testing a database; you're testing a calculator in a vacuum. A real-world workload would make this setup buckle in seconds.
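Here's roughly what a test with a pulse might look like: a pile of point-query clients plus one analytical query stomping the buffer pool. Everything in this sketch, the DSN, the table, the column names, is hypothetical; it's an illustration, not their harness:

```python
# Sketch of a concurrency test: N client threads hammering point queries
# while one thread runs a fat analytical scan over the same table.
# The DSN, table name "t", and columns are hypothetical.
import random
import threading
import time
import psycopg2  # assumes psycopg2 is installed

DSN = "dbname=bench user=bench"  # hypothetical connection string
STOP = threading.Event()

def point_query_client() -> None:
    conn = psycopg2.connect(DSN)
    cur = conn.cursor()
    while not STOP.is_set():
        cur.execute("SELECT v FROM t WHERE id = %s",
                    (random.randint(1, 50_000_000),))
        cur.fetchone()
    conn.close()

def analytic_stomper() -> None:
    conn = psycopg2.connect(DSN)
    cur = conn.cursor()
    while not STOP.is_set():
        cur.execute("SELECT count(*), avg(v) FROM t")  # stomp the buffer pool
        cur.fetchone()
    conn.close()

threads = [threading.Thread(target=point_query_client) for _ in range(32)]
threads.append(threading.Thread(target=analytic_stomper))
for th in threads:
    th.start()
time.sleep(60)  # let the contention actually happen
STOP.set()
for th in threads:
    th.join()
```

Run that and then tell me about your 2% regressions.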
Now for my favorite part: the numbers. You see these tiny 1% and 2% regressions and you hand-wave them away as "new overhead in query execution setup." I see something else. I see non-deterministic performance. I see a timing side-channel. You think that 2% dip is insignificant? An attacker sees a signal. They see a way to leak information one bit at a time by carefully crafting queries and measuring the response time. That tiny regression isn't a performance issue; it's a covert channel waiting for an exploit.
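Don't take my word for it. Sample latencies the way an attacker would and compare distributions; a consistent 2% shift lights up like a flare at a few thousand samples. A toy sketch, with a hypothetical DSN and queries:

```python
# Toy sketch: sample query latencies the way an attacker would, then
# compare distributions. A consistent 2% shift is trivially visible
# at a few thousand samples. Queries and DSN are hypothetical.
import statistics
import time
import psycopg2

conn = psycopg2.connect("dbname=bench user=bench")  # hypothetical DSN
cur = conn.cursor()

def sample_latency(query: str, n: int = 5000) -> list[float]:
    times = []
    for _ in range(n):
        t0 = time.perf_counter()
        cur.execute(query)
        cur.fetchall()
        times.append(time.perf_counter() - t0)
    return times

a = sample_latency("SELECT v FROM t WHERE id = 1")
b = sample_latency("SELECT v FROM t WHERE id = 2")
print(f"median A={statistics.median(a):.6f}s  median B={statistics.median(b):.6f}s")
# If these medians differ consistently, congratulations: you have a channel.
```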
And this... this is just beautiful:
```
point queries
col-1   col-2   col-3
1.01    1.01    0.97    hot-points_range=100
```
You turned on io_uring, a feature that hands your database a shared-memory ring straight into the kernel's asynchronous I/O machinery, and in return, you got a 3% performance loss. You've widened your attack surface, introduced a world of complexity and potential kernel-level vulnerabilities, all for the privilege of making your database slower. This isn't an engineering trade-off; this is a self-inflicted wound. Do you have any idea how an auditor's eye twitches when they see io_uring in a change log? It's a neon sign that says "AUDIT ME UNTIL I WEEP."
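At the very least, confirm what you actually switched on. A sketch, assuming the io_method setting that Postgres 18 introduced (sync, worker, or io_uring) and, as ever, a hypothetical DSN:

```python
# Sketch: confirm which I/O path the server is actually using.
# Assumes a Postgres 18 server exposing the io_method GUC
# (sync, worker, or io_uring); the DSN is hypothetical.
import psycopg2

conn = psycopg2.connect("dbname=bench user=bench")
cur = conn.cursor()
cur.execute("SHOW io_method")
(method,) = cur.fetchone()
print(f"io_method = {method}")
if method == "io_uring":
    print("Hope your kernel is patched and your auditor is sedated.")
```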
You conclude that there are "no regressions larger than 2% but many improvements larger than 5%." You say that like it's a victory. You're celebrating single-digit improvements in a synthetic, best-case scenario while completely ignoring the new attack vectors, the unexplained performance jitters, and the utterly insecure foundation of your testing. This entire report is a compliance nightmare. You can't use this to pass a SOC 2 audit; you'd use this to demonstrate to an auditor that you have no internal controls whatsoever.
But hey, don't let me stop you. Keep chasing those fractional gains on your little desktop machine. It's a cute hobby. Just do us all a favor and don't let this code, or this mindset, anywhere near production data. You've built a faster car with no seatbelts, no brakes, and a mysterious rattle you "hope to explain" later. Good luck with that.