Fail fast on creating mutations that cannot be stored

This is in the context of Beam CloudBigtableIO. The client has a bulk mutation limit of 100k. As of #1497, this means that as long as each individual Put is under that limit, everything will be fine.

A Dataflow job can have large variability in the amount of data per row. It is also natural to write processElement using a single Put, especially since that is what the examples guide new users to do.

If a single Put is larger than 100k mutations then it is impossible to store. The exception it causes is thrown later, and from a different thread.

The client should throw the TooManyMutations exception much earlier, preferably so that the user code that caused the 100,001st mutation is still in the stack trace.
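A minimal sketch of the fail-fast behavior requested above, using only the Java standard library. `GuardedRowMutation` and `TooManyMutationsException` are hypothetical stand-ins for illustration; they are not HBase or Bigtable client classes, and the 100k constant is simply the limit quoted in this issue. The point is that the check runs in the same call that adds the offending cell, so the caller's frame is still on the stack when the exception is thrown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical exception type; the name echoes the issue text, not a confirmed client class. */
class TooManyMutationsException extends RuntimeException {
    TooManyMutationsException(String message) {
        super(message);
    }
}

/** Hypothetical stand-in for a single-row Put that accumulates cell mutations. */
final class GuardedRowMutation {
    /** Assumed per-row limit, taken from the issue description (100k mutations). */
    static final int MAX_MUTATIONS = 100_000;

    private final List<String> cells = new ArrayList<>();

    /** Adds one cell, failing immediately if the limit would be exceeded. */
    GuardedRowMutation addColumn(String qualifier) {
        if (cells.size() >= MAX_MUTATIONS) {
            // Thrown on the caller's thread, so the user code that added
            // the extra cell appears in the stack trace.
            throw new TooManyMutationsException(
                "Row mutation would contain " + (cells.size() + 1)
                    + " cells; limit is " + MAX_MUTATIONS);
        }
        cells.add(qualifier);
        return this;
    }

    /** Number of cells added so far. */
    int size() {
        return cells.size();
    }
}
```

With this shape, a `processElement` implementation that builds one oversized row would fail at the `addColumn` call that crosses the limit, rather than later during a bulk flush on another thread.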

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
sduskis commented, Aug 30, 2018

We definitely should throw exceptions earlier. We can also add an option (with a big Caveat Emptor) to allow multiple Puts in this case.
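As a rough illustration of the "multiple Puts" option mentioned above, the sketch below splits one oversized list of cells into chunks that each stay under a limit. `MutationChunker` and its `split` method are hypothetical names, not part of any client library. The caveat is presumably that the resulting chunks would be written as separate mutations, so the row would no longer be updated in a single atomic operation; that reading is an assumption, not something the comment states.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits one large set of cells into limit-sized chunks. */
final class MutationChunker {
    private MutationChunker() {}

    /** Returns consecutive chunks of at most maxPerChunk elements, preserving order. */
    static <T> List<List<T>> split(List<T> cells, int maxPerChunk) {
        if (maxPerChunk <= 0) {
            throw new IllegalArgumentException("maxPerChunk must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < cells.size(); start += maxPerChunk) {
            int end = Math.min(start + maxPerChunk, cells.size());
            // Copy the sublist so each chunk is independent of the input list.
            chunks.add(new ArrayList<>(cells.subList(start, end)));
        }
        return chunks;
    }
}
```

Each chunk could then be turned into its own Put for the same row key, with the trade-off noted above.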

0 reactions
sduskis commented, Sep 30, 2018

@alugowski, thanks for the context.

FYI, Put is owned by HBase, which doesn’t have the 100K limitation, so we couldn’t add an exception there. We added the warning in our implementation of BufferedMutator.mutate, which is called under the covers in CloudBigtableIO.

We were exploring possibilities of how to improve the experience. A note in BigQueryBigtableTransfer would help. We may also fix our import job, where users can’t control the code themselves.
