This blog post follows up on my previous one, How to Upgrade MongoDB Using Backups Through Many Major Versions, in which I analyzed the possibility of using backups to upgrade MongoDB through multiple major versions and ended up stumbling on a specific issue regarding restoring a particular subset of Binary data with Oplog Replay.

The Oplog Dump and Oplog Replay tools are essential for data consistency in MongoDB logical backups and restores: they capture the writes that happen after the collection dumps have started, so those writes can later be replayed on a target deployment.

Unfortunately, the values before and after the restore differ for a specific subset of data. I should clarify that, according to my humble tests, this happens only with binary fields of subtype 2, which are byte arrays. Although this subtype is classified as (old) in the documentation, it can still be used freely, so we expect it to work correctly in all instances and tools.
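For reference, a subtype 2 binary field can be created directly from the mongo shell with the BinData constructor; a minimal hypothetical example (not taken from my test scripts):

```javascript
// Insert a document carrying both a generic (subtype 0) and a
// legacy byte-array (subtype 2) binary field.
// BinData(subtype, base64Payload) is the shell's binary constructor.
db.collection01.insertOne({
    name: "sample",
    bindata0: BinData(0, "AAAAAA=="), // generic binary
    bindata2: BinData(2, "AAAAAA==")  // "old" byte array subtype
});
```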

Environment

The problem is not limited to a cross-version restore or to a single MongoDB version. I tested the scenario on the latest minor releases of every major version from 3.6 to 8.0, each with the latest version of its respective mongodb-tools. For versions before 4.4, I used the mongodump and mongorestore binaries shipped within their Community installation packages.

Process

The process was composed of the following steps:

  1. Load the source 1-node replica set with 750k documents (500k in a collection, 250k in another) containing multiple field types (string, number, date, array, boolean, and binary subtypes 0, 2, 3, and 4)
  2. Start another data load process to create documents while mongodump runs
  3. Take a backup using mongodump with the --oplog option
  4. Restore the backup using mongorestore with the --oplogReplay option
  5. Compare the data between the source and target in the database
  6. Compare the data between the source and target with the BSON files generated by mongodump
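Steps 3 and 4 boil down to standard mongodump/mongorestore invocations; a sketch, where the hostnames, ports, and output path are placeholders:

```shell
# Step 3: dump all databases, plus the oplog entries generated while the dump runs
mongodump --host source:27017 --oplog --out /backup/dump

# Step 4: restore the dump, then replay the captured oplog on top of it
mongorestore --host target:27017 --oplogReplay /backup/dump
```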

The detailed steps, scripts used, and oplog dumps can be found in my repository: https://github.com/pclaudinoo/blog_posts/tree/main/oplog_replay_issue


The comparison

The data comparison was the most crucial part of this investigation, so it deserves some extra detail. To compare two documents, I used the mongo shell's ability to open a session to another server and store it in a variable. I first logged in to the target replica set:

Then, from inside of it, I created a connection to the source one:

With that, I could query both replica sets in the same terminal session:
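A sketch of that session, with hostnames and the queried _id as placeholders:

```javascript
// Connected to the target replica set via: mongo --host target:27017
// From inside that shell, open a second connection, to the source:
var sourceConn = new Mongo("source:27017");
var sourceDb = sourceConn.getDB("mydb");

// Both deployments are now queryable from the same terminal session:
db.collection01.findOne({ _id: someId });       // target (current connection)
sourceDb.collection01.findOne({ _id: someId }); // source (via the stored session)
```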

I could compare their results visually or via a script. The script I used compared the fields’ lengths and their toString() and value() results. Its source can be found here: https://raw.githubusercontent.com/pclaudinoo/blog_posts/refs/heads/main/oplog_replay_issue/scripts/data_compare.js

The script fetched all _ids from the target replica set, found them in the source, and compared the documents field by field. When a field mismatches, the output looks like this:
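The core of that comparison can be sketched in plain JavaScript; this is a simplified re-implementation of the idea (field names and document structure are illustrative), not data_compare.js itself:

```javascript
// Compare two documents field by field, flagging length and
// string-value mismatches, in the spirit of data_compare.js.
function compareDocs(sourceDoc, targetDoc) {
    const mismatches = [];
    for (const field of Object.keys(sourceDoc)) {
        const s = String(sourceDoc[field]);
        const t = String(targetDoc[field]);
        if (s.length !== t.length || s !== t) {
            mismatches.push({ field, source: s, target: t });
        }
    }
    return mismatches;
}

// Example: a subtype 2 binary value that changed during the restore
const src = { _id: 1, bindata2: 'BinData(2,"AAAAAA==")' };
const tgt = { _id: 1, bindata2: 'BinData(2,"BAAAAAAAAAA=")' };
console.log(compareDocs(src, tgt)); // reports one mismatch, on bindata2
```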

After finding a mismatch, I would then compare the whole document in both the collection01 and the oplog collections, on both source and target, to double-check the error visually:

It is possible to see in both that the source holds the value BinData(2,"AAAAAA==") while the target holds BinData(2,"BAAAAAAAAAA="). After that confirmation, I would compare it with the oplog BSON file generated by mongodump:

And the field bindata2 holds the same value as seen on the source: {"$binary":{"base64":"AAAAAA==","subType":"02"}}. Somehow, some versions of the Oplog Replay feature of mongorestore transform this value into something else when writing to the target. Unfortunately, I don't have enough knowledge to dive deep into its source code and find out why, so I created the following ticket to report it: https://jira.mongodb.org/browse/TOOLS-3730.
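Decoding the two base64 payloads hints at what the transformation might be. This Node.js snippet (my own analysis, not from the ticket) shows that the target value is exactly the source payload with a 4-byte little-endian length prefix prepended:

```javascript
// Decode the value seen on the source and the value written by mongorestore.
const source = Buffer.from("AAAAAA==", "base64");     // 4 bytes: 00 00 00 00
const target = Buffer.from("BAAAAAAAAAA=", "base64"); // 8 bytes: 04 00 00 00 00 00 00 00

console.log(source.length);                     // 4
console.log(target.length);                     // 8
console.log(target.readInt32LE(0));             // 4 -- a little-endian int32 length prefix
console.log(target.subarray(4).equals(source)); // true -- remainder is the original payload
```

That layout matches the legacy framing of subtype 2 described in the BSON specification, where the "old" binary data carried its own embedded int32 length. It suggests the payload is being re-wrapped during the replay, though only the tools' source code or the ticket can confirm the actual cause.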

Results per version:

Besides the per-version matching results, I spotted some other issues during my tests:

  • The data of type binary subtype 2 is converted to binary subtype 0 when using mongorestore v4.0.24
  • mongodump 100.5.4 had authorization issues dumping data from v6.0.19, so I had to upgrade it to v100.9.5, after which the dump ran successfully without any changes to the user permissions. The error: Failed: error creating intents to dump: error creating intents for database config: error counting config.system.preimages: (Unauthorized) not authorized on config to execute command { count: "system.preimages", lsid: { id: UUID("5b15e198-3417-4b5b-824e-75f75abf6310") }

Conclusion

The blog post that originated this one was about using backups to upgrade MongoDB and the potential issues we can find along the way. This one expanded the investigation into potential data consistency issues when using backups and restores in general. The lesson it reinforces is that testing our backups regularly is crucial, including using tools to compare and validate samples of documents when possible, so we don't have unpleasant surprises when we need to recover from a disaster or replicate an environment. Percona Experts are available 24/7 to assist you with these and other issues with your MongoDB deployments.


