Reader group initialization should respect truncation point


Problem description

When a reader group is created (or reset) with a configuration that does not contain a start position, the start position is implicitly set to the head of the stream. The underlying calculation does not respect segment truncation: it simply assumes an offset of zero for each available segment, whereas in reality the offset may be non-zero.

The issue causes a truncation exception such as the one shown below (following a reset of the reader group named hulksmallScaleforgetful0). Observe that the initial offset of hulk/smallScale/53 is 0, yet the subsequent SegmentIsTruncated response makes clear that startOffset is 1189328.

2018-05-30 00:36:13,589 10806536 [pool-8-thread-4] INFO  c.e.n.h.t.w.readers.PravegaReader - Added Reader HulkSmallScale7d669bdb9f2f4e7fbd9c35969f9ec4f9 To ReaderGroup hulksmallScaleforgetful0

2018-05-30 00:36:13,602 10806549 [pool-8-thread-4] INFO  i.p.c.s.impl.EventStreamReaderImpl - EventStreamReaderImpl( id=HulkSmallScale7d669bdb9f2f4e7fbd9c35969f9ec4f9) acquiring segments {hulk/smallScale/76=0, hulk/smallScale/79=0, hulk/smallScale/74=0, hulk/smallScale/70=0, hulk/smallScale/96=0, hulk/smallScale/97=0, hulk/smallScale/98=0, hulk/smallScale/67=0, hulk/smallScale/92=0, hulk/smallScale/93=0, hulk/smallScale/94=0, hulk/smallScale/95=0, hulk/smallScale/88=0, hulk/smallScale/89=0, hulk/smallScale/90=0, hulk/smallScale/58=0, hulk/smallScale/91=0, hulk/smallScale/84=0, hulk/smallScale/53=0, hulk/smallScale/86=0, hulk/smallScale/87=0, hulk/smallScale/81=0, hulk/smallScale/82=0}

2018-05-30 00:36:14,028 10806975 [epollEventLoopGroup-5-4] INFO  i.p.c.s.i.AsyncSegmentInputStreamImpl - Received segmentIsTruncated WireCommands.SegmentIsTruncated(type=SEGMENT_IS_TRUNCATED, requestId=0, segment=hulk/smallScale/53, startOffset=1189328)
2018-05-30 00:36:14,028 10806975 [epollEventLoopGroup-5-4] WARN  i.p.c.s.i.AsyncSegmentInputStreamImpl - Exception while reading from Segment : hulk/smallScale/53
io.pravega.client.segment.impl.SegmentTruncatedException: null

Problem location

ReaderGroupImpl::getSegmentsForStreams -> ControllerService::getSegmentsAtTime

Suggestions for an improvement

  • Adjust ControllerService::getSegmentsAtTime to return the startOffset of the respective segment.
  • Undo the duplicate logic that was added to BatchClient in #2551.
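
The first suggestion can be sketched as follows. This is a minimal, self-contained illustration and not Pravega's actual code: the class `HeadOffsets`, the method `initialPositions`, and both input collections are hypothetical names. It only shows the idea of seeding each segment with its truncation start offset instead of assuming zero. The offset for hulk/smallScale/53 is taken from the log above; treating hulk/smallScale/76 as untruncated is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HeadOffsets {
    /**
     * Builds the initial read position for each segment (hypothetical helper).
     * startOffsets holds the truncation point of every truncated segment;
     * a segment missing from that map is assumed to start at offset 0.
     */
    public static Map<String, Long> initialPositions(List<String> segments,
                                                     Map<String, Long> startOffsets) {
        Map<String, Long> positions = new LinkedHashMap<>();
        for (String segment : segments) {
            // Use the segment's real start offset rather than a hard-coded zero.
            positions.put(segment, startOffsets.getOrDefault(segment, 0L));
        }
        return positions;
    }

    public static void main(String[] args) {
        Map<String, Long> positions = initialPositions(
                List.of("hulk/smallScale/53", "hulk/smallScale/76"),
                Map.of("hulk/smallScale/53", 1189328L));
        System.out.println(positions);
    }
}
```

With positions built this way, a reader acquiring hulk/smallScale/53 would begin at 1189328 and would not request data that has already been truncated.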

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 16 (16 by maintainers)

Top GitHub Comments

2 reactions
shrids commented, Jun 7, 2018

How do we get the offset information to feed it to the reader?

There is no need to provide offset information to the reader. Invoking readNextEvent() again after receiving a TruncatedDataException reads the next available data.

The flow below describes the steps to perform when resetReaderGroup is invoked.

  1. Reset the reader group.

rg.resetReaderGroup(ReaderGroupConfig.builder().disableAutomaticCheckpoints()
                                             .stream(Stream.of(SCOPE, STREAM3))
                                             .build());

  2. After the resetReaderGroup operation, every existing reader throws a ReinitializationRequiredException on existingReader.readNextEvent(). On this exception, close the existing reader with existingReader.close() and create new readers.

  3. Create a new reader.

EventStreamReader<String> reader = clientFactory.createReader("readerId2", "group", serializer,
                ReaderConfig.builder().build());

  4. Invoke reader.readNextEvent(timeout).
  • Step 4 throws a TruncatedDataException if the reader observes that the segment it is reading from is truncated.
  5. Invoke reader.readNextEvent(timeout) again to read from the next available data. That is, if no action needs to be taken on receiving a TruncatedDataException, the exception can be ignored and reader.readNextEvent() invoked again to fetch the next available data.
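
The retry described in the final step above might look like the sketch below. To keep it compilable without the Pravega client on the classpath, `TruncatedDataException` and `Reader` here are local stand-ins, not the real Pravega types, and `readSkippingTruncation` and `maxAttempts` are hypothetical names. The stand-in reader returns the event directly for brevity, which is a simplification of the real reader API.

```java
public class TruncationRetry {
    /** Local stand-in for Pravega's TruncatedDataException (assumption for this sketch). */
    public static class TruncatedDataException extends RuntimeException { }

    /** Local stand-in for the single reader method this sketch needs. */
    public interface Reader<T> {
        T readNextEvent(long timeoutMillis);
    }

    /**
     * Reads the next event, ignoring truncation notifications.
     * After a TruncatedDataException the next call resumes at the next available data,
     * so the loop simply calls readNextEvent again, up to maxAttempts times.
     */
    public static <T> T readSkippingTruncation(Reader<T> reader, long timeoutMillis, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return reader.readNextEvent(timeoutMillis);
            } catch (TruncatedDataException e) {
                // Nothing to repair here: the caller chose to ignore the truncation and retry.
            }
        }
        throw new IllegalStateException("Still truncated after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated reader: the first call reports truncation, the second returns data.
        Reader<String> simulated = timeout -> {
            if (calls[0]++ == 0) {
                throw new TruncatedDataException();
            }
            return "event-after-truncation";
        };
        System.out.println(readSkippingTruncation(simulated, 1000L, 3));
    }
}
```

Bounding the number of attempts is a design choice of this sketch, so that a reader that keeps reporting truncation cannot spin forever. An application that needs to know data was skipped should log or count the exception rather than swallow it.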
1 reaction
fpj commented, Jun 7, 2018

Despite the design goal mentioned by Flavio, Pravega does not reliably produce a truncated exception when the client attempts to read from the head of a truncated stream. For example, if a ‘clean’ truncation occurred wherein old segments were dropped but no segments were partially truncated, then the client would not receive a truncation exception; it would simply read from the head.

Throwing the exception should be consistent; otherwise the semantics are broken.

I agree with @jkhalack’s statement that, in the absence of a specific start point, the semantics should be ‘earliest available’ as opposed to ‘absolute zero’.

That’s a possibility, but not the only one. In fact, that’s the reason why systems like Kafka have configuration parameters like auto.offset.reset.
