isLiveTable check breaks Kinesis like streams

PR https://github.com/apache/pinot/pull/7756/files introduced an isLiveTable check in the code path that adds new partitionGroups. The branch below was added to setupNewPartitionGroup; before this PR, only the else part existed.

if (isLiveTable) {
    startOffset = getPartitionGroupSmallestOffset(streamConfig, partitionGroupId);
} else {
    startOffset = partitionGroupMetadata.getStartOffset();
}

For a new table we take the else branch; for an existing table that detects a new partitionGroup we take the if branch. Within the if branch, a call is made to getNewPartitionGroupMetadataList(streamConfig, Collections.emptyList()). For Kinesis-like streams, the response from this call depends on the current list passed in. Given an empty list, such a stream returns only the very first parent shards. This is because Kinesis shards have a lineage: if we started with shard 0 and split it into 1 and 2, then 1 and 2 are returned only when the current state shows that 0 has finished ingesting.

As a result of this PR change, no new shards can be detected in Kinesis.
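
The shard-lineage behaviour described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Pinot's Kinesis connector: the class ShardDiscoverySketch, the Shard type, and the methods discoverNewShards and exampleShards are all hypothetical names invented for illustration, and the "current state" is simplified to a map from shard id to a finished flag. It shows why an empty current list can only ever surface root shards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of lineage-aware shard discovery; not the Pinot implementation.
public class ShardDiscoverySketch {

    /** Hypothetical shard model: parentId is null for a root shard. */
    static final class Shard {
        final String id;
        final String parentId;

        Shard(String id, String parentId) {
            this.id = id;
            this.parentId = parentId;
        }
    }

    /**
     * Returns ids of shards that are not yet tracked and are eligible to start:
     * root shards, or children whose parent is tracked and has finished ingesting.
     * currentState maps a tracked shard id to whether it has finished ingesting.
     */
    static List<String> discoverNewShards(List<Shard> allShards, Map<String, Boolean> currentState) {
        List<String> result = new ArrayList<>();
        for (Shard shard : allShards) {
            if (currentState.containsKey(shard.id)) {
                continue; // already tracked, so not "new"
            }
            boolean isRoot = shard.parentId == null;
            boolean parentDone = !isRoot && Boolean.TRUE.equals(currentState.get(shard.parentId));
            if (isRoot || parentDone) {
                result.add(shard.id);
            }
        }
        return result;
    }

    /** Shard 0 was split into shards 1 and 2, as in the example above. */
    static List<Shard> exampleShards() {
        List<Shard> shards = new ArrayList<>();
        shards.add(new Shard("shard-0", null));
        shards.add(new Shard("shard-1", "shard-0"));
        shards.add(new Shard("shard-2", "shard-0"));
        return shards;
    }

    public static void main(String[] args) {
        List<Shard> shards = exampleShards();

        // Empty current state: only the root shard is visible.
        System.out.println(discoverNewShards(shards, Collections.emptyMap()));

        // Current state says shard-0 is done: its children become visible.
        Map<String, Boolean> current = new HashMap<>();
        current.put("shard-0", true);
        System.out.println(discoverNewShards(shards, current));
    }
}
```

In this model, the first call prints [shard-0] and the second prints [shard-1, shard-2]. Passing an empty current state is the analogue of Collections.emptyList() in the call above: the child shards stay hidden no matter how far ingestion has actually progressed.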

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
npawar commented, Jan 21, 2022

Regarding getPartitionGroupSmallestOffset(), I also agree that this should be fixed for Kinesis regardless, as it is used in the validation manager. But that’s a much smaller and less frequent case, so I think we should first prioritize fixing the bigger case of broken new partitionGroups.

0 reactions
npawar commented, Jan 21, 2022

Yup, 7743 should work, as that one doesn’t use an empty list for current.
