Ingesting stream of entities in parallel

First of all, thanks a lot for the great work! When saving entities, the PgBulkInsert and BulkProcessor classes both use synchronized blocks. In particular, when saving a parallel stream with PgBulkInsert, the saveEntitySynchronized method seems to be a bottleneck for the consumers. Would it make sense to have several threads/connections copying data into the database in parallel? If so, what would be the recommended way to do that?
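
One possible shape for the approach asked about here, as a minimal sketch and not the library's own method: split the entities into batches and hand each batch to a worker thread, so that no single synchronized writer is shared. The class `ParallelIngest` and the `batchWriter` callback are hypothetical names introduced for illustration. In a real setup the callback is assumed to obtain its own database connection and run one bulk copy per batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical helper, not part of PgBulkInsert.
class ParallelIngest {

    // Splits the input into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    // Runs batchWriter once per batch on a fixed pool of worker threads and
    // returns the number of batches written. batchWriter is an assumption:
    // it stands in for "open a connection and bulk-copy this batch".
    static <T> int ingest(List<T> items, int batchSize, int workers,
                          Consumer<List<T>> batchWriter) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (List<T> batch : partition(items, batchSize)) {
                pending.add(CompletableFuture.runAsync(() -> batchWriter.accept(batch), pool));
            }
            for (CompletableFuture<Void> task : pending) {
                task.join(); // rethrows a writer failure as CompletionException
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }
}
```

Each batch is written independently, so a failure in one batch does not roll back the others. Whether that is acceptable depends on the import, and it is a design choice of this sketch rather than something the issue prescribes.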

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 22 (22 by maintainers)

Top GitHub Comments

2 reactions
bchapuis commented, Jan 6, 2019

@bytefish sure, with pleasure, let me just clean it and make it a bit more generic. 😄

1 reaction
bytefish commented, Jan 7, 2019

Great job! I am currently also experimenting with importing a large-scale dataset in parallel with .NET: https://github.com/bytefish/GermanWeatherDataExample. It’s a little different in C#, but… I have the problem that the database writes the data fast enough, while the CSV reader and mapping are too slow, no matter how much I optimize them. If I find a solution that scales, I will let you know.

Top Results From Across the Web

Ingesting stream of entities in parallel · Issue #33 - GitHub
The reason I want to delete the saveAll(PGConnection connection, Stream stream) is because a Parallel Stream doesn't give any benefit, when it comes...

parallel processing with infinite stream in Java - Stack Overflow
Stream.iterate returns 'an infinite sequential ordered Stream'. Therefore, making a sequential stream parallel is not too useful. According to ...

Alternating between Java streams and parallel streams at ...
StreamSupport.stream creates a new sequential or parallel Stream from a Spliterator (which in turn can be obtained from any Collection).

Self‐adaptation on parallel stream processing: A systematic ...
Self-adaptation can be broadly defined as the capability of the systems/environments to be autonomous, deciding and changing their behavior in ...

Parallelization of Structured Streaming Jobs Using Delta Lake
In conclusion, we will discuss an advanced topic on running a parallel streaming backfill job and the nuances in handling failure and recovery....
