FasterLog based Pubsub hangs if log file deleted
In the code below, after I delete the log file and restart the producer and consumer, I expect the consumer to miss everything stored in the log file during iteration 1, but to be able to consume the messages produced during the 2nd iteration. However, the consumer hangs completely and has no way to get out. Even if I wire up cancellation tokens, the iterations just keep getting cancelled at the iterator.WaitAsync() call. How can we get out of this state?
Version: 1.9.10
namespace FasterLogPlayground
{
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Runtime.CompilerServices;
    using System.Text;
    using System.Threading.Tasks;
    using FASTER.core;

    internal class Program
    {
        private static IDevice device;
        private static FasterLog log;
        private static FasterLogScanIterator iterator;

        static async Task Main(string[] args)
        {
            // Produce some logs and consume them partially
            device = Devices.CreateLogDevice("hlog.log", useIoCompletionPort: true);
            log = new FasterLog(new FasterLogSettings { LogDevice = device, MemorySizeBits = 26, PageSizeBits = 20, MutableFraction = 0.5, SegmentSizeBits = 20 });
            iterator = log.Scan(log.BeginAddress, long.MaxValue, "logtest1", true, ScanBufferingMode.DoublePageBuffering, true);
            TaskCompletionSource<bool> firstIterationTaskCompletionSource = new TaskCompletionSource<bool>();
            Task.Run(() => CommitterAsync(log, TimeSpan.FromMilliseconds(100), firstIterationTaskCompletionSource));
            var tasks = new List<Task>();
            tasks.Add(ProducerAsync(log, "Message ", 100));
            tasks.Add(ConsumerAsync(iterator, log, "Consumer1", 10));
            await Task.WhenAll(tasks).ConfigureAwait(false);
            firstIterationTaskCompletionSource.SetResult(true);

            // Give some delay for commit to complete
            await Task.Delay(500).ConfigureAwait(false);
            iterator.Dispose();
            log.Dispose();
            device.Dispose();

            // Delete the log file
            File.Delete("hlog.log.0");

            // Now try to produce and consume again.
            // Expected behavior: My expectation is that we will recover and start to consume whatever is produced starting at this point.
            // Actual behavior: The producer produces everything, but consumer is stuck forever
            device = Devices.CreateLogDevice("hlog.log", useIoCompletionPort: true);
            log = new FasterLog(new FasterLogSettings { LogDevice = device, MemorySizeBits = 26, PageSizeBits = 20, MutableFraction = 0.5, SegmentSizeBits = 20 });
            iterator = log.Scan(log.BeginAddress, long.MaxValue, "logtest1", true, ScanBufferingMode.DoublePageBuffering, true);
            TaskCompletionSource<bool> secondIterationTaskCompletionSource = new TaskCompletionSource<bool>();
            Task.Run(() => CommitterAsync(log, TimeSpan.FromMilliseconds(100), secondIterationTaskCompletionSource));
            tasks = new List<Task>();
            tasks.Add(ProducerAsync(log, "AnotherMessage ", 100));
            tasks.Add(ConsumerAsync(iterator, log, "Consumer2", 10));
            await Task.WhenAll(tasks).ConfigureAwait(false);
            secondIterationTaskCompletionSource.SetResult(true);
            Console.WriteLine("Done");
        }

        static async Task CommitterAsync(FasterLog log, TimeSpan delay, TaskCompletionSource<bool> tcs)
        {
            while (!tcs.Task.IsCompleted)
            {
                await Task.Delay(delay).ConfigureAwait(false);
                await log.CommitAsync().ConfigureAwait(false);
            }
        }

        static async Task ProducerAsync(FasterLog log, string prefix, int numberOfIterations)
        {
            var i = 0;
            while (i < numberOfIterations)
            {
                try
                {
                    await log.EnqueueAsync(Encoding.UTF8.GetBytes(prefix + i.ToString())).ConfigureAwait(false);
                    await log.RefreshUncommittedAsync().ConfigureAwait(false);
                }
                catch (Exception ex)
                {
                    Console.WriteLine($"Enqueue failed in Iteration {i}: {ex}");
                }
                finally
                {
                    i++;
                }
            }
            Console.WriteLine($"Producer {prefix} complete");
        }

        static async Task ConsumerAsync(FasterLogScanIterator iterator, FasterLog log, string name, int numberOfIterations)
        {
            var i = 0;
            while (i < numberOfIterations)
            {
                try
                {
                    byte[] result;
                    int length;
                    long nextAddress;
                    while (!iterator.GetNext(out result, out length, out _, out nextAddress))
                    {
                        // THIS WAITASYNC BELOW HANGS
                        if (!await iterator.WaitAsync().ConfigureAwait(false))
                        {
                            throw new InvalidOperationException("InMemoryQueueWithPersistence has been shutdown and cannot dequeue any more.");
                        }
                    }
                    iterator.CompleteUntil(nextAddress);
                    log.TruncateUntil(nextAddress);
                    Console.WriteLine($"Consumer {name} consumed: {Encoding.UTF8.GetString(result, 0, length)}");
                }
                catch (Exception ex)
                {
                    Console.WriteLine($"Dequeue failed in Iteration {i}: {ex}");
                }
                finally
                {
                    i++;
                }
            }
            Console.WriteLine($"Consumer {name} complete");
        }
    }
}
Top GitHub Comments
Linked PR makes recovery throw an exception that is caught by the upstream FasterLog recovery logic, which in turn will not recover to the specified commit. Thus, we start with a clean, unrecovered slate in this case, which is the best we can do. In v2, the user can catch this exception by explicitly calling fasterlog.Recover() instead of setting logSettings.TryRecoverLatest. Then, the user can do any other custom repair.
Much appreciated, @badrishc. May I know when the next release is scheduled? I would like this fix along with the one you made recently that deletes old segments that are not in memory.
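The explicit-recovery approach described in the maintainer's comment might look roughly like the sketch below. It assumes FASTER v2, where the comment indicates that TryRecoverLatest is a property of FasterLogSettings and that Recover() can be called directly on the log. The RecoverySketch class and its Open and TryRecover helpers are hypothetical names for illustration, and the exception type thrown on failed recovery is not stated in the thread, so the sketch catches Exception broadly.

```csharp
using System;
using FASTER.core;

internal static class RecoverySketch
{
    // Hypothetical helper (not part of FASTER): creates a log without
    // automatic recovery so the caller can drive recovery explicitly.
    internal static FasterLog Open(IDevice device)
    {
        var settings = new FasterLogSettings
        {
            LogDevice = device,
            // Assumption: v2 setting named in the comment above; disabling it
            // leaves recovery to the explicit Recover() call below.
            TryRecoverLatest = false
        };
        return new FasterLog(settings);
    }

    // Hypothetical helper: returns true if recovery succeeded, false if it threw.
    internal static bool TryRecover(FasterLog log)
    {
        try
        {
            log.Recover();
            return true;
        }
        catch (Exception ex)
        {
            // The thread does not name the exception type, so this catches broadly.
            Console.WriteLine($"Recovery failed: {ex.Message}");
            // Application-specific repair would go here, for example removing
            // stale commit metadata before creating a fresh log.
            return false;
        }
    }
}
```

A caller would check the result of TryRecover before starting producers and consumers. Whether the same FasterLog instance remains usable after a failed Recover() is not confirmed in the thread, so disposing it and creating a new instance after repair may be the safer choice.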