Where database blog posts get flame-broiled to perfection
Alright, let's pour another cup of stale coffee and talk about this. I've seen this movie before, and I know how it ends: with me, a blinking cursor, and the sinking feeling that "compatible" is the most dangerous word in tech. This whole "emulate MongoDB on a relational database" trend gives me flashbacks to that time we tried to run a key-value store on top of SharePoint. Spoiler alert: it didn't go well.
So, let's break down this masterpiece of misplaced optimism, shall we?
First, we have the glorious promise of the "Seamless Migration" via a compatible API. This is the siren song that lures engineering managers to their doom. The demo looks great, the simple queries run, and everyone gets a promotion. Then you hit production traffic. This article's "simple" query (finding 5 records in a range) forced the "compatible" DocumentDB to scan nearly 60,000 index keys, fetch them all, and then sort them in memory just to throw 59,930 of them away. Native Mongo scanned five. Five! That's not a performance gap; that's a performance chasm. It's the technical equivalent of boiling the ocean to make a cup of tea.
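Just to make the pain concrete, here's the shape of the thing in mongosh. The collection, field, and numbers are my own stand-ins rather than anything lifted from the article, and this assumes the emulation reports honest executionStats, but the tell is the same: same five documents back, wildly different amounts of work underneath.

```javascript
// Hypothetical range-plus-limit query; names and values are illustrative.
const plan = db.users
  .find({ score: { $gte: 100, $lt: 200 } })
  .sort({ score: 1 })
  .limit(5)
  .explain("executionStats");

// The tell is the ratio of index keys examined to documents returned.
printjson({
  returned: plan.executionStats.nReturned,             // 5 on both systems
  keysExamined: plan.executionStats.totalKeysExamined  // ~5 native, tens of thousands emulated
});
```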
Then there's the Doubly-Damned Debugging™. My favorite part of any new abstraction layer is figuring out which layer is lying to me at 3 AM. The beauty of this setup is that you don't just get one execution plan; you get two! You get the friendly, happy MongoDB-esque plan that vaguely hints at disaster, and then you get to docker exec into a container and tail PostgreSQL logs to find the real monstrosity of an execution plan underneath. The Oracle version is even better, presenting a query plan that looks like a lost chapter from the Necronomicon. So now, to fix a slow query, I need to be an expert in Mongo query syntax, the emulation's translation layer, and the deep internals of the relational database it's bolted onto. Fantastic. My on-call anxiety just developed a new subtype.
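If you want to play along at home, the 3 AM ritual looks roughly like this. The container name and the translated SQL are placeholders I made up; the article does the real digging.

```javascript
// Layer 1: the friendly plan the Mongo-compatible front door hands you.
db.users
  .find({ score: { $gte: 100, $lt: 200 } })
  .limit(5)
  .explain("queryPlanner");

// Layer 2: the plan that actually runs lives in PostgreSQL. Something like:
//   docker exec -it <container> psql -U postgres
//   EXPLAIN (ANALYZE, BUFFERS) <whatever SQL the translation layer emitted>;
// ...or tail the Postgres logs and hope auto_explain is switched on.
```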
Let's talk about the comically catastrophic corner cases. The author casually mentions that a core performance optimization (pushing the ORDER BY down to the index scan for efficient pagination) is a "TODO" in the DocumentDB RUM index access method. A TODO. In the critical path of a database that's supposed to be production-ready. I can already hear the conversation: "Why does page 200 of our user list take 30 seconds to load?" Because the database is secretly reading every single user from A to Z, sorting them by hand, and then picking out the five you asked for. This isn't a database; it's a very expensive Array.prototype.sort().
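For the morbidly curious, the deep-pagination trap looks something like this. The collection, sort key, and page size are mine, not the article's; the skip arithmetic is the whole point.

```javascript
// Hypothetical "page 200" query. With the ORDER BY pushdown still a TODO,
// the emulation sorts the whole result set in memory before applying
// skip/limit; native MongoDB walks the index and stops after skip + limit keys.
const page = 200;
const pageSize = 5;

db.users
  .find({})
  .sort({ name: 1 })
  .skip((page - 1) * pageSize) // 995 documents read and discarded, at minimum
  .limit(pageSize)
  .toArray();
```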
And the pièce de résistance: the illusion of simplicity. The sales pitch is "keep your relational database that your team knows and trusts!" But this article proves that to make it work, you have to install a constellation of extensions (rum, documentdb_core, pg_cron...), become a Docker and psql wizard just to get a query plan, and then learn about proprietary index types like documentdb_rum that behave differently from everything else. You haven't simplified your stack; you've created a fragile, custom-built contraption. It's like avoiding the hassle of learning to drive a new car by welding your old car's chassis onto a tractor engine. Sure, you still have your fuzzy dice, but good luck when it breaks down in the middle of the highway.
In the end, these emulations are just another beautiful, brilliant way to create new and exciting failure modes. We're not solving problems; we're just shifting the complexity around until it lands on the person who gets paged when it all falls over.
...sigh. I need more coffee.