MongoDB was a big deal when it arrived back circa 2009. It had a very different take on databases. Unlike the venerable MySQL or Oracle DBs, MongoDB did not rely on schemas. Rather, it stored the data into unstructured documents, taking cues from JSON. This similarily, among other features, made it very popular with the nascent NodeJS community. It became the M in "MEAN" stack which soon took over LAMP stacks and other tools.
But the cracks began to spread. Many would complain of missing data, eventually culiminating in the Jepsen data report about lack of consistent data retrieval. MongoDB's general response that these databases weren't set up properly felt odd given how much of a "great, get up and running" database they proclaimed to be.
But what problem does MongoDB solve?
Are there are other ways to approach this?
What problem does MongoDB solve?
MongoDB promises one big Thing: no forced struture.
No structure is big. It's a pain to write little SQL scripts to add a column here, change a default, or enforce a new type. When you're working in the beginning, you want to focus on delivering value for your customer and building out features. Having to not write migrations is one less thing to worry about.
Additionally, though less often talked about, is that MongoDB allows you to manipulate data as if it were JSON.
Using something like PHP, it was common to mainpulate data through
functions that directly called SQL code. You'd write a SQL statement,
whether it be a
INSERT, then use string formatting to
fill in new data. Your function would accept the data that would be
put into that string format. Once you wrote escapes to prevent
SQL-injection, you were good to go.
MongoDB gave you JSON-like functionality. This is principally why it got populate with NodeJS users. You could manipulate the data the way you always had with JSON.
Why It's not as Useful in Practice
Object Relation Mappers, or OBMs, have become ubiquitous with web development frameworks like Ruby-On-Rails and Django. They allow you to manipulate objects backed by a SQL database much like you would manipulate any object in their respective languages.
This negates some of the value from MongoDB. No longer did you have to
write functions to call SQL and format it as you'd like. Now you could
use an object and just call
model.save(). This was huge.
ORMs still had drawbacks with migrations. These were annoying to do
and were now doubly annoying to write. They still are annoying to
write with tools like
sequelize, an ORM
But Django found
south to mitigate a lot of this. True, you did
eventually have to write your own migrations but this took a lot of
the pain out of doing this. Now you could fly as your ORM came up with
good enough defaults for migrating your database. Each new field could
be quickly put into your database and the whole thing synced
south became so useful it was integrated into the Django
ORMs really took a lot out of what MongoDB brought to the table. You no longer had to worry as much about structure because you could add fields quickly and then migrate the database. Later on down the line you were going to have to write raw SQL code and muck with the generate migrations but that was likely going to be true beyond a certain scale anyway. MongoDB's advantages of being quick to set up and iterate with disappeared.
PostgreSQL can do MongoDB just as well as MongoDB (in practice)
Tease out valuable columns as you gain structure
Amazon's DocumentDB is supposedly based on using the PostgreSQL JSONB column with a wrapper for Mongo commands.
Some people dislike how querying for PostgreSQL's json columns works as the syntax is very clunky. In practice you're taking these values out, modifying them, then replacing entirely as opposed to doing in-database changes.
MongoDB is incredibly flexible with what you put into it but PostgreSQL can be too.
These column types are almost as functional as MongoDB. You can put an arbitrary JSON-formatted data into them just as you can in MongoDB. Unlike Mongo, they are backed by a rock solid database that has seen many war scars and its bugs fixed thoroughly. It also has SQL functionality so as you build out your database you can move to a structured format for querying with a lot less pain. Note that JSON uses strings and JSONB uses bytes on the database. The latter is a bit faster in most circumstances. They've gotten so popular that SQLite now supports them.
Sure, it's not perfect. The querying syntax for PostgreSQL's json types is clunky and its probably going to be slow if you're not using a dedicated column and data type. Trying to assign values without pulling them out is like pulling teeth. But I'm often interfacing through an ORM anyhow so this isn't much of an issue. I suspect the same is true to most readers. The tradeoffs here are so worth it that one HackerNews reader suspects Amazon uses it for DocumentDB.
There are likely some use cases where this isn't true but they seem to be extremely specific to your use case. It may be worth re-examining whether your assumptions hold.
MongoDB's ease of use with JSON-like notation and dead-simple schema-less design was extremely appealing early on. But its since been overcome by the introduction of JSON columns. These are far more flexible being that you can get to a SQL database without as much hassle and will give you the same benefits 99% of the time.
If anybody knows of a reason to use MongoDB over PostgreSQL JSON columns -- please reach out.
Links to MongoDB Stories
I'm intrigued by MongoDB's rise and fall(?) so below are some collected stories. They'll likely be added to over time. Feel free to email me more of these