"CollectionInvalid: collection <CC> already exists" exception when catching up the op-log
My use case is moving a replica set from A to B.
- I take an EBS snapshot of A, at time t1.
- Seed B from that snapshot, finishing at time t2.
- Use mongo-connector to bring B into sync with A via A’s oplog and keep it in sync. Use noDump to kick that off.
When step 3 starts, the oplog is big and its oldest entry is from time t0. Replay is slower than one might like if t0 is much earlier than t1, but that’s OK.
But I’m encountering a problem where mongo-connector fails:
2017-04-11 20:53:38.814307500 2017-04-11 20:53:38,813 [ERROR] mongo_connector.util:106 - Fatal Exception
2017-04-11 20:53:38.814321500 Traceback (most recent call last):
2017-04-11 20:53:38.814322500 File "/home/bhyde/.pyenv/versions/2.7.11/lib/python2.7/site-packages/mongo_connector/util.py", line 104, in wrapped
2017-04-11 20:53:38.814323500 func(*args, **kwargs)
2017-04-11 20:53:38.814324500 File "/home/bhyde/.pyenv/versions/2.7.11/lib/python2.7/site-packages/mongo_connector/oplog_manager.py", line 283, in run
2017-04-11 20:53:38.814325500 timestamp)
2017-04-11 20:53:38.814325500 File "/home/bhyde/.pyenv/versions/2.7.11/lib/python2.7/site-packages/mongo_connector/util.py", line 35, in wrapped
2017-04-11 20:53:38.814326500 return f(*args, **kwargs)
2017-04-11 20:53:38.814327500 File "/home/bhyde/.pyenv/versions/2.7.11/lib/python2.7/site-packages/mongo_connector/doc_managers/mongo_doc_manager.py", line 150, in handle_command
2017-04-11 20:53:38.814328500 self.mongo[new_db].create_collection(coll)
2017-04-11 20:53:38.814329500 File "/home/bhyde/.pyenv/versions/2.7.11/lib/python2.7/site-packages/pymongo/database.py", line 342, in create_collection
2017-04-11 20:53:38.814329500 raise CollectionInvalid("collection %s already exists" % name)
2017-04-11 20:53:38.814330500 CollectionInvalid: collection <CC> already exists
I presume this means that <CC> was created during the interval t0…t1, so it is already present in the snapshot when the oplog replay tries to create it again.
Mumble … idempotent … mumble.
I guess the workaround is to catch the exception, delete the collection, and try again. Or preflight the operation and delete the collection before creating it.
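A minimal sketch of the two workarounds described above, written against a pymongo `Database` object. The helper names `create_after_catch` and `create_after_preflight` are hypothetical and are not part of mongo-connector; the fallback exception class exists only so the sketch runs where pymongo is not installed. Note that both variants discard whatever documents the existing collection holds, which is only safe if the oplog replay will re-insert them.

```python
try:
    from pymongo.errors import CollectionInvalid
except ImportError:
    # Assumption: stand-in so the sketch can run without pymongo installed.
    class CollectionInvalid(Exception):
        """Mimics pymongo.errors.CollectionInvalid."""


def create_after_catch(db, name, **options):
    """Try to create; on 'already exists', drop the collection and retry once."""
    try:
        db.create_collection(name, **options)
    except CollectionInvalid:
        # The collection came from the snapshot: remove it, then replay the create.
        db.drop_collection(name)
        db.create_collection(name, **options)


def create_after_preflight(db, name, **options):
    """Check first, drop the collection if present, then create it."""
    if name in db.list_collection_names():
        db.drop_collection(name)
    db.create_collection(name, **options)
```

The catch variant costs nothing extra when the collection is absent, while the preflight variant issues one additional listing call per create. Either one would wrap the `create_collection` call shown at the bottom of the traceback.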
Issue Analytics
- State:
- Created 6 years ago
- Comments:5 (1 by maintainers)
Progress on the issue seems to have stalled, but I am still facing it. I would suggest that a --continue-on-error flag would be useful for the oplog as well (--continue-on-oplog-error?), or at least something specific to the "collection already exists" error.
In my use case we don’t delete collections or tinker with their assorted properties; we only add 'em in the most vanilla way. So I think I’m fine ignoring the error.
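Ignoring the error, as described above, could look like the sketch below. The helper name `create_if_missing` is hypothetical, and the fallback exception class is only there so the sketch runs without pymongo installed. It assumes, as in this use case, that an existing collection has the same options as the one being created.

```python
try:
    from pymongo.errors import CollectionInvalid
except ImportError:
    # Assumption: stand-in so the sketch can run without pymongo installed.
    class CollectionInvalid(Exception):
        """Mimics pymongo.errors.CollectionInvalid."""


def create_if_missing(db, name, **options):
    """Create the collection; treat 'already exists' as success.

    Returns True if the collection was created, False if it was already there.
    Existing documents and options are left untouched.
    """
    try:
        db.create_collection(name, **options)
        return True
    except CollectionInvalid:
        return False
```

Unlike dropping and recreating, this keeps the data seeded from the snapshot in place.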
In the general case, the correct approach is to catch the error, delete the collection, and then recreate it. At least I’d think so.
Seeding from the snapshot plus noDump is a huge win if the database is huge: for example, you don’t need to unwind the compression, serialize, move it over wires, etc.