Users should be able to set the timeout and the chunk_size for Elasticsearch bulk requests.
While dumping large amounts of data to Elasticsearch, mongo-connector often crashes with a connection timeout. The default timeout is 10 seconds and cannot be changed. It should be an option of the mongo-connector command so that users can change it when necessary.
2015-10-20 01:48:04,992 [CRITICAL] mongo_connector.oplog_manager:543 - Exception during collection dump
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/mongo_connector/oplog_manager.py", line 495, in do_dump
    upsert_all(dm)
  File "/usr/lib/python2.6/site-packages/mongo_connector/oplog_manager.py", line 479, in upsert_all
    dm.bulk_upsert(docs_to_dump(namespace), mapped_ns, long_ts)
  File "/usr/lib/python2.6/site-packages/mongo_connector/util.py", line 32, in wrapped
    return f(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/mongo_connector/doc_managers/elastic_doc_manager.py", line 190, in bulk_upsert
    for ok, resp in responses:
  File "/usr/lib/python2.6/site-packages/elasticsearch/helpers/__init__.py", line 138, in streaming_bulk
    raise e
ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host=u'172.31.1.254', port=9200): Read timed out. (read timeout=10))
Thanks,
Issue Analytics
- State:
- Created 8 years ago
- Comments:5
I made the following change (line 4) in elastic_doc_manager to increase the timeout to 60 seconds, and it solves my issue:
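The snippet from that comment is not preserved here. As a rough sketch of this kind of change, and not the commenter's actual patch: the code below assumes the elasticsearch-py releases of that era, where `elasticsearch.helpers.streaming_bulk` forwards extra keyword arguments such as `request_timeout` to the underlying bulk call. The names `bulk_options` and `bulk_upsert` and the default values are hypothetical and chosen only for illustration.

```python
def bulk_options(request_timeout=60, chunk_size=500):
    """Validate and collect the keyword arguments for a bulk call.

    Both names and defaults are illustrative assumptions, not
    mongo-connector's real settings.
    """
    if request_timeout <= 0:
        raise ValueError("request_timeout must be positive")
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return {"request_timeout": request_timeout, "chunk_size": chunk_size}


def bulk_upsert(client, actions, request_timeout=60, chunk_size=500):
    """Send actions in chunks, waiting up to request_timeout seconds per chunk."""
    # Imported lazily so the sketch loads without the client library installed.
    from elasticsearch.helpers import streaming_bulk

    options = bulk_options(request_timeout, chunk_size)
    # Assumption: extra keyword arguments are passed on to the bulk request.
    for ok, resp in streaming_bulk(client, actions, **options):
        if not ok:
            raise RuntimeError("bulk action failed: %r" % (resp,))
```

A smaller `chunk_size` keeps each request short, while a larger `request_timeout` gives a slow cluster more time to answer; in practice the two are usually tuned together.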
I hope request_timeout can become a parameter of the mongo-connector command, or at least that the default timeout is raised to a larger number; 10 seconds is too short.
Please close this issue. Both the timeout and chunk_size can be configured in the configuration file as follows:
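The configuration from that comment is not preserved here. The sketch below shows what such a file might contain, built and printed from Python. The key names `bulkSize` and `args.clientOptions.timeout` are assumptions recalled from the mongo-connector documentation and should be checked against the installed version; the target URL, the values, and the helper name `doc_manager_config` are placeholders.

```python
import json


def doc_manager_config(target_url, timeout_seconds, bulk_size):
    """Build a config dict with one Elasticsearch doc manager entry.

    Assumed key names (verify against your mongo-connector version):
    "bulkSize" for the chunk size, "args.clientOptions.timeout" for the
    client timeout in seconds.
    """
    return {
        "docManagers": [
            {
                "docManager": "elastic_doc_manager",
                "targetURL": target_url,
                "bulkSize": bulk_size,
                "args": {"clientOptions": {"timeout": timeout_seconds}},
            }
        ]
    }


# Placeholder values; adjust to your cluster.
config = doc_manager_config("localhost:9200", 60, 500)
print(json.dumps(config, indent=2))
```

The printed JSON can be saved to a file and handed to mongo-connector as its configuration file.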
Thanks!