Do not assume _id is ObjectId
OplogThread assumes that _id in the document is an ObjectId here and here.
This also assumes the _ids returned by DocManager.search() and DocManager.get_last_doc() can be coerced to ObjectId.
But mongo does not require _id to be ObjectId.
_id ... may be of any type other than an array
The unit tests don’t even assume it is ObjectId. For example, the test_solr_doc_manager.p unit tests for search use the string ‘1’ as the _id.
Most of the existing DocManagers coerce the _id to a string, store that, and return that string from search and get_last_doc. OplogThread then attempts to coerce that string to ObjectId. So I think fixing this would require DocManagers to return the original _id instance and not a string.
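To make the problem concrete, here is a minimal sketch using only the Python standard library. The `to_stored_id` helper is hypothetical: it stands in for the string coercion described above and is not actual mongo-connector code.

```python
def to_stored_id(doc_id):
    """Hypothetical stand-in for a DocManager coercing _id to a string."""
    return str(doc_id)


# Two distinct MongoDB _id values of different types...
original_ids = [1, "1"]
stored_ids = [to_stored_id(doc_id) for doc_id in original_ids]

# ...become indistinguishable once they are stored as strings.
collision = stored_ids[0] == stored_ids[1]

# Every stored value is a str, so the original type cannot be
# recovered from the stored value alone.
recovered_types = {type(stored).__name__ for stored in stored_ids}

print(collision, recovered_types)
```

Any code that later tries to rebuild the original _id from the stored string has to guess the type, which is where the ObjectId assumption comes in.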
Issue Analytics
- State:
- Created 7 years ago
- Comments:14 (5 by maintainers)
This is tricky; the _id field is coerced into a string for the purposes of storing it in Elastic/Solr, since ObjectId is a MongoDB-specific type. However, we have to turn those strings back into whatever their original type was, so that we can query MongoDB. This means we need to know the original type of _id, which isn’t encoded anywhere. It could even be different types from document to document (insane, perhaps, but possible).

This is obviously a bug, but I’m not sure how big the impact is. I’m guessing that most are using ObjectId as the type for _id, since that’s the default. Of those people who are using something else, only a fraction might experience a rollback that requires mongo-connector’s intervention (and thus hit this bug). The fix for this will require encoding the original type somehow, and will probably require a breaking change. 😦

Looking over the code in oplog_manager.py, I see the proposed changes here to encode the _id as extended JSON were never actually implemented (I’d believed they were, based on the reply preceding mine thanking llvtt for resolving the problem).
And thanks for that formatting edit @jaraco, I see you 👍
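
For reference, the idea of encoding the original type alongside the value could look roughly like the sketch below. It is an illustration under stated assumptions, not the change that was proposed for mongo-connector: `FakeObjectId`, `encode_id`, and `decode_id` are hypothetical names, and `FakeObjectId` only stands in for the real ObjectId type so the example runs with the standard library alone. The `{"$oid": ...}` shape follows the extended JSON convention for ObjectId.

```python
import json


class FakeObjectId:
    """Hypothetical stand-in for ObjectId so the sketch needs no pymongo."""

    def __init__(self, hex_string):
        self.hex_string = hex_string

    def __eq__(self, other):
        return (isinstance(other, FakeObjectId)
                and other.hex_string == self.hex_string)

    def __hash__(self):
        return hash(self.hex_string)


def encode_id(doc_id):
    """Serialize an _id to a JSON string that keeps its original type."""
    def default(value):
        # Called by json.dumps only for values it cannot serialize itself.
        if isinstance(value, FakeObjectId):
            return {"$oid": value.hex_string}
        raise TypeError("unsupported _id type: %r" % type(value))

    return json.dumps(doc_id, default=default)


def decode_id(encoded):
    """Rebuild the original _id, including its type, from the stored string."""
    def object_hook(obj):
        # A dict with the single key "$oid" marks an encoded ObjectId.
        if set(obj) == {"$oid"}:
            return FakeObjectId(obj["$oid"])
        return obj

    return json.loads(encoded, object_hook=object_hook)


int_id, str_id = 1, "1"
oid = FakeObjectId("0123456789abcdef01234567")

# The integer 1 and the string "1" no longer collide once encoded.
assert encode_id(int_id) != encode_id(str_id)

# Each value round-trips back to its original type.
assert decode_id(encode_id(int_id)) == 1
assert decode_id(encode_id(str_id)) == "1"
assert decode_id(encode_id(oid)) == oid
```

With pymongo installed, `bson.json_util.dumps` and `bson.json_util.loads` provide this kind of type-preserving encoding for real BSON types, so a hand-rolled encoder like the one above would likely be unnecessary. As noted earlier in the thread, target systems that already hold plain string ids would not be readable by such a scheme without migration, which is why a change like this would probably be breaking.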