Fail fast on creating mutations that cannot be stored
This is in the context of Beam CloudBigtableIO. The client has a bulk mutation limit of 100k. As of #1497 this means as long as each individual Put is under that limit everything will be fine.
A Dataflow job can have large variability in the amount of data per row. It is also natural to write processElement using a single Put, especially since that is what the examples guide new users to do.
If a single Put is larger than 100k mutations then it is impossible to store. The exception it causes is thrown later, and from a different thread.
The client should throw the TooManyMutations exception much earlier, preferably so that the user code that caused the 100,001st mutation is still in the stack trace.
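Until the client does this itself, user code could approximate the fail-fast behavior with a check of its own. The following is a minimal sketch, not the client's implementation: the class `MutationLimitGuard`, its method, and the exception type are hypothetical, and the 100,000 limit is simply the figure quoted in this issue. In a pipeline it would presumably be called inside processElement with the cell count of the Put (HBase's `Mutation.size()` should report this) before the Put is emitted, so the failing user code stays on the stack.

```java
/** Hypothetical helper for illustration; not part of the Cloud Bigtable client. */
final class MutationLimitGuard {
    /** Per-row mutation limit as quoted in this issue (an assumption here). */
    static final int MAX_MUTATIONS_PER_ROW = 100_000;

    private MutationLimitGuard() {}

    /**
     * Throws immediately, on the caller's thread, if a single row mutation
     * carries more cells than the limit allows.
     *
     * @param rowKey        row key, used only for the error message
     * @param mutationCount number of cells in the mutation
     */
    static void checkMutationCount(String rowKey, int mutationCount) {
        if (mutationCount > MAX_MUTATIONS_PER_ROW) {
            throw new IllegalArgumentException(
                "Row '" + rowKey + "' has " + mutationCount
                    + " mutations; the limit is " + MAX_MUTATIONS_PER_ROW);
        }
    }
}
```

Because the check runs synchronously, the resulting stack trace points at the code that built the oversized Put rather than at a background flush thread.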
Issue Analytics
- State:
- Created 5 years ago
- Comments:7 (4 by maintainers)
We definitely should throw exceptions earlier. We can also add an option (with a big Caveat Emptor) to allow multiple Puts in this case.
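As a rough illustration of what "allowing multiple Puts" could mean, the sketch below splits one row's cells into chunks that each stay under a limit. It is an assumption about the shape of such an option, not the client's behavior: `MutationChunker` and `split` are hypothetical names, and cells are modeled as a generic list rather than HBase types. The caveat is that each chunk would be sent as its own mutation, so the row would no longer be written atomically.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Cloud Bigtable client. */
final class MutationChunker {
    private MutationChunker() {}

    /**
     * Splits the cells of one row into consecutive chunks of at most
     * maxPerChunk elements, preserving order.
     */
    static <T> List<List<T>> split(List<T> cells, int maxPerChunk) {
        if (maxPerChunk <= 0) {
            throw new IllegalArgumentException("maxPerChunk must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < cells.size(); start += maxPerChunk) {
            int end = Math.min(start + maxPerChunk, cells.size());
            // Copy the view so each chunk is independent of the input list.
            chunks.add(new ArrayList<>(cells.subList(start, end)));
        }
        return chunks;
    }
}
```

Each returned chunk could then be turned into a separate Put for the same row key, accepting that a failure partway through leaves the row partially written.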
@alugowski, thanks for the context.
FYI, Put is owned by HBase, which doesn't have the 100K limitation, so we couldn't add an exception there. We added the warning in our implementation of BufferedMutator.mutate, which is called under the covers in CloudBigtableIO.
We were exploring ways to improve the experience. A note in BigQueryBigtableTransfer would help. We may also fix our import job, where users can't control the code themselves.