This blog post follows up on my previous one, How to Upgrade MongoDB Using Backups Through Many Major Versions, in which I analyzed the possibility of using backups to upgrade MongoDB through multiple major versions and ended up stumbling on a specific issue regarding restoring a particular subset of Binary data with Oplog Replay.

The Oplog Dump and Oplog Replay tools are essential for data consistency in MongoDB logical backups and restores: they capture the writes that happen after the collection dumps have started, so those writes can later be replayed on a target deployment.

Unfortunately, the values before and after the restore differ for a specific subset of data. I should clarify that, according to my humble tests, this happens only with binary fields of subtype 2, which are byte arrays. Although this subtype is classified as (old) in the documentation, it can still be used freely, so we expect it to work correctly in all instances and tools.
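For reference, a subtype 2 binary field can be created directly from the mongo shell with the BinData constructor; a minimal hypothetical example (not taken from my test scripts):

```javascript
// Insert a document carrying both a generic (subtype 0) and a
// legacy byte-array (subtype 2) binary field.
// BinData(subtype, base64Payload) is the shell's binary constructor.
db.collection01.insertOne({
    name: "sample",
    bindata0: BinData(0, "AAAAAA=="), // generic binary
    bindata2: BinData(2, "AAAAAA==")  // "old" byte array subtype
});
```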

Environment

The problem is not limited to a cross-version restore or to a single MongoDB version. I tested the scenario on the latest minor releases of every major version from 3.6 to 8.0, each with the latest version of its respective mongodb-tools. For versions before 4.4, I used the mongodump and mongorestore binaries shipped within their Community installation packages.

Process

The process was composed of the following steps:

  1. Load the source 1-node replica set with 750k documents (500k in a collection, 250k in another) containing multiple field types (string, number, date, array, boolean, and binary subtypes 0, 2, 3, and 4)
  2. Start another data load process to create documents while mongodump runs
  3. Take a backup using mongodump with the --oplog option
  4. Restore the backup using mongorestore with the --oplogReplay option
  5. Compare the data between the source and target in the database
  6. Compare the data between the source and target with the BSON files generated by mongodump
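Steps 3 and 4 boil down to standard mongodump/mongorestore invocations; a sketch, where the hostnames, ports, and output path are placeholders:

```shell
# Step 3: dump all databases, plus the oplog entries generated while the dump runs
mongodump --host source:27017 --oplog --out /backup/dump

# Step 4: restore the dump, then replay the captured oplog on top of it
mongorestore --host target:27017 --oplogReplay /backup/dump
```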

The detailed steps, scripts used, and oplog dumps can be found in my repository: https://github.com/pclaudinoo/blog_posts/tree/main/oplog_replay_issue


The comparison

The data comparison was the most crucial part of this investigation, so it deserves some extra detail. To compare two documents, I used the mongo shell's ability to open a session to another server and store it in a variable. I first logged in to the target replica set:

Then, from inside of it, I created a connection to the source one:

With that, I could query both replica sets in the same terminal session:
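A sketch of that session, with hostnames and the queried _id as placeholders:

```javascript
// Connected to the target replica set via: mongo --host target:27017
// From inside that shell, open a second connection, to the source:
var sourceConn = new Mongo("source:27017");
var sourceDb = sourceConn.getDB("mydb");

// Both deployments are now queryable from the same terminal session:
db.collection01.findOne({ _id: someId });       // target (current connection)
sourceDb.collection01.findOne({ _id: someId }); // source (via the stored session)
```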

I could compare their results visually or via a script. The script I used compared the fields’ lengths and their toString() and value() results. Its source can be found here: https://raw.githubusercontent.com/pclaudinoo/blog_posts/refs/heads/main/oplog_replay_issue/scripts/data_compare.js

The script fetched all _ids from the target replica set, found them in the source, and compared the documents field by field. When a field mismatches, the output looks like this:
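The core of that comparison can be sketched in plain JavaScript; this is a simplified re-implementation of the idea (field names and document structure are illustrative), not data_compare.js itself:

```javascript
// Compare two documents field by field, flagging length and
// string-value mismatches, in the spirit of data_compare.js.
function compareDocs(sourceDoc, targetDoc) {
    const mismatches = [];
    for (const field of Object.keys(sourceDoc)) {
        const s = String(sourceDoc[field]);
        const t = String(targetDoc[field]);
        if (s.length !== t.length || s !== t) {
            mismatches.push({ field, source: s, target: t });
        }
    }
    return mismatches;
}

// Example: a subtype 2 binary value that changed during the restore
const src = { _id: 1, bindata2: 'BinData(2,"AAAAAA==")' };
const tgt = { _id: 1, bindata2: 'BinData(2,"BAAAAAAAAAA=")' };
console.log(compareDocs(src, tgt)); // reports one mismatch, on bindata2
```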

After finding a mismatch, I would then compare the whole document in both the collection01 and the oplog collections, on both source and target, to double-check the error visually:

It is possible to see in both that the source holds the value BinData(2,"AAAAAA==") while the target holds BinData(2,"BAAAAAAAAAA="). After that confirmation, I would compare it with the oplog BSON file generated by mongodump:

And the field bindata2 holds the same value as seen on the source: {"$binary":{"base64":"AAAAAA==","subType":"02"}}. Somehow, some versions of the Oplog Replay feature of mongorestore transform this value into something else when writing to the target. Unfortunately, I don't have enough knowledge to dive deep into its source code and find out why, so I created the following ticket to report it: https://jira.mongodb.org/browse/TOOLS-3730.
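Decoding the two base64 payloads hints at what the transformation might be. This Node.js snippet (my own analysis, not from the ticket) shows that the target value is exactly the source payload with a 4-byte little-endian length prefix prepended:

```javascript
// Decode the value seen on the source and the value written by mongorestore.
const source = Buffer.from("AAAAAA==", "base64");     // 4 bytes: 00 00 00 00
const target = Buffer.from("BAAAAAAAAAA=", "base64"); // 8 bytes: 04 00 00 00 00 00 00 00

console.log(source.length);                     // 4
console.log(target.length);                     // 8
console.log(target.readInt32LE(0));             // 4 -- a little-endian int32 length prefix
console.log(target.subarray(4).equals(source)); // true -- remainder is the original payload
```

That layout matches the legacy framing of subtype 2 described in the BSON specification, where the "old" binary data carried its own embedded int32 length. It suggests the payload is being re-wrapped during the replay, though only the tools' source code or the ticket can confirm the actual cause.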

Results per version:

Besides the per-version matching results, I spotted some other issues during my tests:

  • The data of type binary subtype 2 is converted to binary subtype 0 when using mongorestore v4.0.24
  • mongodump 100.5.4 had authorization issues dumping data from v6.0.19, so I had to upgrade it to v100.9.5, after which the dump ran successfully without any changes to the user permissions. The error: Failed: error creating intents to dump: error creating intents for database config: error counting config.system.preimages: (Unauthorized) not authorized on config to execute command { count: "system.preimages", lsid: { id: UUID("5b15e198-3417-4b5b-824e-75f75abf6310") }

Conclusion

The blog post that originated this one was about using backups to upgrade MongoDB and the potential issues we can find along the way. This one expanded the investigation into potential data consistency issues when using backups and restores in general. The lesson it reinforces is that testing our backups regularly is crucial, including using tools to compare and validate samples of documents when possible, so we don't have unpleasant surprises when we need to recover from a disaster or replicate an environment. Percona Experts are available 24/7 to assist you with these and other issues with your MongoDB deployments.


