Activity calls from CreateRetryableClient are non-deterministic

Issue

It appears that proxy objects generated by OrchestrationContext.CreateRetryableClient invoke activities in a non-deterministic way. In particular, there appear to be threading problems with these clients, which often result in stuck orchestrations or orchestrations that fail with errors like the following:

DurableTask.Core.Exceptions.NonDeterministicOrchestrationException: Non-Deterministic workflow detected: TaskScheduledEvent: 4 TaskScheduled Method5 
   at DurableTask.Core.TaskOrchestrationContext.HandleTaskScheduledEvent(TaskScheduledEvent scheduledEvent) in C:\GitHub\durabletask\src\DurableTask.Core\TaskOrchestrationContext.cs:line 268
   at DurableTask.Core.TaskOrchestrationExecutor.ProcessEvent(HistoryEvent historyEvent) in C:\GitHub\durabletask\src\DurableTask.Core\TaskOrchestrationExecutor.cs:line 141
   at DurableTask.Core.TaskOrchestrationExecutor.ExecuteCore(IEnumerable`1 eventHistory) in C:\GitHub\durabletask\src\DurableTask.Core\TaskOrchestrationExecutor.cs:line 82
   at DurableTask.Core.TaskOrchestrationContext.HandleTaskScheduledEvent(TaskScheduledEvent scheduledEvent) in C:\GitHub\durabletask\src\DurableTask.Core\TaskOrchestrationContext.cs:line 268

Previous maintainers of this repo have mentioned that this API is buggy.

The issue may be related to the use of Dynamitey for dynamically invoking activity functions.

Workarounds

  • Use OrchestrationContext.ScheduleWithRetry - unfortunately this is not a type-safe way to call activity tasks.
  • Use OrchestrationContext.CreateClient - unfortunately this doesn’t support automatic retries.
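
As a rough sketch of the first workaround, assuming the DurableTask.Core package, retries can be configured with RetryOptions and passed to ScheduleWithRetry directly. GreetActivity and GreetOrchestration are hypothetical names used only for illustration. The activity is referenced by type and its input is passed as a loosely typed parameter, which is why this route is less type-safe than a generated client.

```csharp
using System;
using System.Threading.Tasks;
using DurableTask.Core;

// Hypothetical activity, used only for illustration.
public sealed class GreetActivity : TaskActivity<string, string>
{
    protected override string Execute(TaskContext context, string input) => $"Hello, {input}!";
}

// Hypothetical orchestration that calls the activity without a generated proxy client.
public sealed class GreetOrchestration : TaskOrchestration<string, string>
{
    public override async Task<string> RunTask(OrchestrationContext context, string input)
    {
        // Wait 5 seconds before the first retry; give up after 3 attempts in total.
        var retryOptions = new RetryOptions(TimeSpan.FromSeconds(5), 3);

        // The input is passed as params object[], so the compiler cannot check
        // that it matches the activity's input type.
        return await context.ScheduleWithRetry<string>(typeof(GreetActivity), retryOptions, input);
    }
}
```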

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 17 (2 by maintainers)

Top GitHub Comments

2 reactions
aueloka commented, Jun 20, 2022

@davidmrdavid, in the meantime, I’ve opened a PR with the changes mentioned here.

https://github.com/Azure/durabletask/pull/750

2 reactions
aueloka commented, Jun 20, 2022

@cgillum, I ran a test with the Castle.Core library, which seems a much better alternative to the current Dynamitey. Castle.Core is also used internally by the Moq library.

Castle.Core supports creating proxy objects from interfaces, as well as from abstract and non-sealed classes. In addition, it mitigates a separate edge-case bug in Dynamitey, where proxy method return types cannot be private or less visible than the proxy target. With Castle.Core, this is what RetryProxy.cs would look like:

internal class RetryProxy : IInterceptor
{
    private readonly OrchestrationContext context;
    private readonly RetryOptions retryOptions;

    /// <summary>
    /// Initializes a new instance of the <see cref="RetryProxy"/> class.
    /// </summary>
    /// <param name="context">The orchestration context.</param>
    /// <param name="retryOptions">The retry options.</param>
    public RetryProxy(OrchestrationContext context, RetryOptions retryOptions)
    {
        this.context = context;
        this.retryOptions = retryOptions;
    }

    /// <inheritdoc/>
    public void Intercept(IInvocation invocation)
    {
        var returnType = invocation.Method.ReturnType;

        if (!typeof(Task).IsAssignableFrom(returnType))
        {
            throw new InvalidOperationException($"Invoked method must return a task. Current return type is {invocation.Method.ReturnType}");
        }

        if (returnType == typeof(Task))
        {
            invocation.ReturnValue = this.InvokeWithRetry<object>(invocation);
            return;
        }

        returnType = invocation.Method.ReturnType.GetGenericArguments().Single();

        MethodInfo? invokeMethod = this.GetType().GetMethod(nameof(this.InvokeWithRetry), BindingFlags.Instance | BindingFlags.NonPublic);

        Debug.Assert(invokeMethod != null, "RetryProxy.InvokeWithRetry was not found via reflection.");

        MethodInfo genericInvokeMethod = invokeMethod.MakeGenericMethod(returnType);
        invocation.ReturnValue = genericInvokeMethod.Invoke(this, new object?[] { invocation });

        return;
    }

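    private async Task<TReturnType?> InvokeWithRetry<TReturnType>(IInvocation invocation)
    {
        // Capture the position in the interceptor chain before the first await. Calling
        // invocation.Proceed() after Intercept has returned (as happens on a retry) would
        // restart the chain from this interceptor instead of continuing to the target.
        // CaptureProceedInfo is available in recent Castle.Core versions.
        IInvocationProceedInfo proceedInfo = invocation.CaptureProceedInfo();

        Task<TReturnType> RetryCall()
        {
            proceedInfo.Invoke();
            return (Task<TReturnType>)invocation.ReturnValue;
        }

        var retryInterceptor = new RetryInterceptor<TReturnType>(this.context, this.retryOptions, RetryCall);
        return await retryInterceptor.Invoke();
    }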
}

Similarly, the ScheduleProxy.cs would become:

internal sealed class ScheduleProxy : IInterceptor
{
    private readonly OrchestrationContext context;
    private readonly bool useFullyQualifiedMethodNames;

    /// <summary>
    /// Initializes a new instance of the <see cref="ScheduleProxy"/> class.
    /// </summary>
    /// <param name="context">The orchestration context.</param>
    public ScheduleProxy(OrchestrationContext context)
        : this(context, false)
    {
    }

    /// <summary>
    /// Initializes a new instance of the <see cref="ScheduleProxy"/> class.
    /// </summary>
    /// <param name="context">The orchestration context.</param>
    /// <param name="useFullyQualifiedMethodNames">A flag indicating whether to use fully qualified method names.</param>
    public ScheduleProxy(OrchestrationContext context, bool useFullyQualifiedMethodNames)
    {
        this.context = context;
        this.useFullyQualifiedMethodNames = useFullyQualifiedMethodNames;
    }

    /// <inheritdoc/>
    public void Intercept(IInvocation invocation)
    {
        var returnType = invocation.Method.ReturnType;

        if (!typeof(Task).IsAssignableFrom(returnType))
        {
            throw new InvalidOperationException($"Invoked method must return a task. Current return type is {invocation.Method.ReturnType}");
        }

        Type[] genericArgumentValues = invocation.GenericArguments;
        List<object?> arguments = new(invocation.Arguments);

        foreach (var typeArg in genericArgumentValues)
        {
            arguments.Add(new TypeMetadata(typeArg.Assembly.FullName!, typeArg.FullName!));
        }

        object[] args = arguments.ToArray()!;

        string normalizedMethodName = NameVersionHelper.GetDefaultName(invocation.Method, this.useFullyQualifiedMethodNames);

        if (returnType == typeof(Task))
        {
            invocation.ReturnValue = this.context.ScheduleTask<object>(normalizedMethodName, NameVersionHelper.GetDefaultVersion(invocation.Method), args);
            return;
        }

        returnType = invocation.Method.ReturnType.GetGenericArguments().Single();

        MethodInfo scheduleMethod = typeof(OrchestrationContext).GetMethod(
            "ScheduleTask",
            new[] { typeof(string), typeof(string), typeof(object[]) }) ??
            throw new Exception($"Method 'ScheduleTask' not found. Type Name: {nameof(OrchestrationContext)}");

        MethodInfo genericScheduleMethod = scheduleMethod.MakeGenericMethod(returnType);

        var result = genericScheduleMethod.Invoke(this.context, new object[]
        {
            normalizedMethodName,
            NameVersionHelper.GetDefaultVersion(invocation.Method),
            args!,
        });

        invocation.ReturnValue = result;
        return;
    }
}
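
Both Intercept methods above rely on the same reflection step: the result type is only known at run time from the intercepted method's Task<T> return type, so a generic method is closed over that type with MakeGenericMethod and then invoked. The following is a minimal sketch of that step in isolation, using only the .NET base class library. GenericDispatchDemo, Produce and Dispatch are hypothetical names, and Produce merely stands in for a generic method such as ScheduleTask<T>.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Threading.Tasks;

public static class GenericDispatchDemo
{
    // Stand-in for a generic method whose type argument is not known at compile time.
    private static Task<T> Produce<T>(object value) => Task.FromResult((T)value);

    // Calls Produce<T>, where T is taken from a Task<T> return type at run time.
    public static object Dispatch(Type taskReturnType, object value)
    {
        if (!typeof(Task).IsAssignableFrom(taskReturnType))
        {
            throw new InvalidOperationException($"Expected a task type but got {taskReturnType}.");
        }

        // A non-generic Task carries no result type, so fall back to object.
        Type resultType = taskReturnType == typeof(Task)
            ? typeof(object)
            : taskReturnType.GetGenericArguments().Single();

        MethodInfo openMethod = typeof(GenericDispatchDemo).GetMethod(
            nameof(Produce), BindingFlags.Static | BindingFlags.NonPublic)
            ?? throw new MissingMethodException(nameof(GenericDispatchDemo), nameof(Produce));

        // Close the generic method over the run-time type and invoke it.
        return openMethod.MakeGenericMethod(resultType).Invoke(null, new[] { value })!;
    }
}
```

For example, Dispatch(typeof(Task<int>), 42) returns a completed Task<int> whose result is 42. Reflection-based dispatch like this costs more than a direct call, so a real implementation might cache the closed MethodInfo per result type.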

With these changes, CreateClient<T>() and CreateRetryableClient<T>() would change as follows:

public T CreateClient<T>(bool useFullyQualifiedMethodNames)
    where T : class
{
    if (!typeof(T).IsInterface)
    {
        throw new InvalidOperationException("Pass in an interface.");
    }

    var proxyGenerator = new ProxyGenerator();  // This should be a static readonly instance

    IInterceptor scheduleProxy = new ScheduleProxy(context, useFullyQualifiedMethodNames);
    return proxyGenerator.CreateInterfaceProxyWithoutTarget<T>(scheduleProxy);
}

public T CreateRetryableClient<T>(RetryOptions retryOptions, bool useFullyQualifiedMethodNames)
    where T : class
{
    if (!typeof(T).IsInterface)
    {
        throw new InvalidOperationException("Pass in an interface.");
    }

    var proxyGenerator = new ProxyGenerator(); // This should be a static readonly instance

    IInterceptor scheduleProxy = new ScheduleProxy(context, useFullyQualifiedMethodNames);
    IInterceptor retryProxy = new RetryProxy(context, retryOptions);

    T scheduleInstance = proxyGenerator.CreateInterfaceProxyWithoutTarget<T>(scheduleProxy);
    return proxyGenerator.CreateInterfaceProxyWithTarget(scheduleInstance, retryProxy);
}
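
To show how the two proxies in CreateRetryableClient<T>() fit together, here is a small sketch that depends only on Castle.Core and leaves DurableTask out entirely. IGreeter, FakeScheduleInterceptor, RecordingInterceptor and ProxyChainDemo are hypothetical names. The inner, target-less proxy plays the role of ScheduleProxy by producing the return value itself, and the outer proxy plays the role of RetryProxy by forwarding to it with Proceed().

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Castle.DynamicProxy;

public interface IGreeter
{
    Task<string> Greet(string name);
}

// Inner interceptor: there is no real target, so it must set the return value itself.
public sealed class FakeScheduleInterceptor : IInterceptor
{
    public void Intercept(IInvocation invocation)
    {
        invocation.ReturnValue = Task.FromResult($"Hello, {invocation.Arguments[0]}!");
    }
}

// Outer interceptor: records the call, then forwards it to the target (the inner proxy).
public sealed class RecordingInterceptor : IInterceptor
{
    public List<string> Calls { get; } = new List<string>();

    public void Intercept(IInvocation invocation)
    {
        Calls.Add(invocation.Method.Name);
        invocation.Proceed();
    }
}

public static class ProxyChainDemo
{
    // One shared generator, so generated proxy types are cached and reused.
    private static readonly ProxyGenerator Generator = new ProxyGenerator();

    public static (IGreeter Client, RecordingInterceptor Outer) Create()
    {
        var outer = new RecordingInterceptor();
        IGreeter inner = Generator.CreateInterfaceProxyWithoutTarget<IGreeter>(new FakeScheduleInterceptor());
        IGreeter client = Generator.CreateInterfaceProxyWithTarget(inner, outer);
        return (client, outer);
    }
}
```

Calling Greet("Ada") on the returned client passes through RecordingInterceptor first and then reaches FakeScheduleInterceptor, which supplies the result. In the proposal above, the same layering is what lets the retry logic wrap each scheduled activity call.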