Bulk Insert is failing silently

NEST/Elasticsearch.Net version: 6.3.0

Elasticsearch version: 6.2.3

Description of the problem including expected versus actual behavior: Bulk insert is failing silently. I have a BulkAllObserver with an onError handler that logs any errors that occur during batching, but no errors are being output. This has been hiding data loading errors for weeks in production and we only just noticed. This seems to be true for all error conditions, but I will provide a single case that should be testable.

Steps to reproduce:

  1. Create a new, blank index and define a data schema
  2. Create a bulk operation to insert a ton of documents into this index but have some of the object properties violate the schema (e.g. try to insert Foo : string when the schema requires Foo : int)
  3. Subscribe a BulkAllObserver to output errors during the operation
  4. Verify that no errors were logged out
  5. Verify that the index contains no documents

Here is the code I am currently using:

var operationId = Guid.NewGuid();
Logger.Information("Beginning bulk insert operation to ElasticSearch. Operation ID: {operationId}", this, operationId);

var waitHandle = new CountdownEvent(1);
var bulkInsert = _client.BulkAll(vehicles, b => b
	.Index("inventory")
	.BackOffRetries(2)
	.BackOffTime("30s")
	.RefreshOnCompleted()
	.MaxDegreeOfParallelism(4)
	.Size(100));

bulkInsert.Subscribe(new BulkAllObserver(
	onError: ex =>
	{
		Logger.Fatal("Bulk insert error {message} on operation {operationId}", this, ex, ex.Message, operationId);

		// Release the waiting thread on failure too; otherwise Wait() below blocks forever.
		waitHandle.Signal();
	},
	onCompleted: () =>
	{
		Logger.Information(
			"Successfully inserted {count} documents. Operation ID: {operationId}",
			this,
			vehicles.Count,
			operationId);

		waitHandle.Signal();
	}));

waitHandle.Wait();

Provide ConnectionSettings (if relevant):

Provide DebugInformation (if relevant):

Issue Analytics

  • State:closed
  • Created 5 years ago
  • Comments:6 (2 by maintainers)

Top GitHub Comments

1 reaction
Mpdreamz commented, Sep 14, 2018

Hi @dasjestyr thank you for reporting this.

Agreed, this is completely unexpected behaviour. When this was initially written, the idea was to keep trying to succeed, but nothing in the contract or docs hints at this. I'm working on a PR now that will halt the bulk operation and feed into OnError by default, and provide an opt-in flag to continue on dropped documents, with a callback to feed these documents into another system/index/DLQ.
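
For readers landing here later, the opt-in behaviour described in this comment could look roughly like the sketch below. It assumes a later NEST release in which BulkAllDescriptor exposes ContinueAfterDroppedDocuments and DroppedDocumentCallback, and in which the BulkAll observable has a blocking Wait helper; check these names against the NEST version you have installed. The Vehicle class, the BulkLoader wrapper and the 30-minute timeout are hypothetical stand-ins, not part of the original report.

```csharp
using System;
using System.Collections.Generic;
using Nest;

// Hypothetical document type standing in for the reporter's vehicle objects.
public class Vehicle
{
    public string Id { get; set; }
    public int Foo { get; set; }
}

public static class BulkLoader
{
    // Returns the documents Elasticsearch rejected so the caller can send
    // them to another system, index, or dead-letter queue.
    public static List<Vehicle> Load(IElasticClient client, IEnumerable<Vehicle> vehicles)
    {
        var dropped = new List<Vehicle>();

        var bulkAll = client.BulkAll(vehicles, b => b
            .Index("inventory")
            .BackOffRetries(2)
            .BackOffTime("30s")
            .RefreshOnCompleted()
            .MaxDegreeOfParallelism(4)
            .Size(100)
            // Assumption: opt-in flag that keeps the bulk running when a
            // document is rejected instead of halting and raising an error.
            .ContinueAfterDroppedDocuments()
            // Assumption: callback invoked once per rejected document.
            .DroppedDocumentCallback((item, vehicle) =>
            {
                // Batches may run in parallel, so guard the shared list.
                lock (dropped)
                {
                    dropped.Add(vehicle);
                }
                Console.Error.WriteLine($"Dropped document {item.Id}: {item.Error?.Reason}");
            }));

        // Assumption: Wait blocks until the operation completes and rethrows
        // any failure, so no manual CountdownEvent is needed.
        bulkAll.Wait(TimeSpan.FromMinutes(30), response => { });

        return dropped;
    }
}
```

Without the opt-in flag, the default described above would apply: the bulk halts on the first dropped document and the failure surfaces through OnError (or as an exception from Wait).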

0 reactions
balukantu commented, Jul 11, 2022

Hi Team,

Is there a way to insert a document if the document doesn’t exist or else update the document in Elasticsearch using BulkAll?

How can we avoid duplicating records if we run the import twice?
