Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple levels of SubOrchestrations - an error gets thrown when reading a SubOrchestrationCompleteEvent of a nested one #1011

Open
preslavnd opened this issue Dec 4, 2023 · 0 comments

Comments

@preslavnd
Copy link

The situation I end up with is I have an Orchestration, which creates and waits on a SubOrchestration, which does also create and wait on another SubOrchestration.

When the inner-most one completes, I get the following exception:

Backing off for 1 seconds until 5 successful operations
      System.InvalidCastException: Unable to cast object of type 'DurableTask.Core.History.ExecutionStartedEvent' to type 'DurableTask.Core.History.SubOrchestrationInstanceCreatedEvent'.
         at DurableTask.Core.TaskOrchestrationDispatcher.ReconcileMessagesWithState(TaskOrchestrationWorkItem workItem, String dispatcher, ErrorPropagationMode errorPropagationMode, LogHelper logHelper) in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 844
         at DurableTask.Core.TaskOrchestrationDispatcher.<OnProcessWorkItemAsync>d__25.MoveNext() in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 345
      --- End of stack trace from previous location where exception was thrown ---
         at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
         at DurableTask.Core.TaskOrchestrationDispatcher.<OnProcessWorkItemAsync>d__25.MoveNext() in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 594
      --- End of stack trace from previous location where exception was thrown ---
         at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
         at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
         at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
         at DurableTask.Core.TaskOrchestrationDispatcher.<OnProcessWorkItemSessionAsync>d__24.MoveNext() in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 211
      --- End of stack trace from previous location where exception was thrown ---
         at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
         at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
         at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
         at DurableTask.Core.WorkItemDispatcher`1.<ProcessWorkItemAsync>d__45.MoveNext() in /_/src/DurableTask.Core/WorkItemDispatcher.cs:line 375

What happens next is, looking at the database data (I am using SqlServer provider), the SubOrchestrationCompleted event of the inner-most orchestration is lost (NewEvents does not contain it, nor it gets applied to the instance Histories is not showing it being received). A SubOrchestrationCompleted event for the parent is waiting inside NewEvents but cannot be processed as well and generates this exception everytime the application is trying to pick new events. By looking at the code, it seems this error probably comes from

if (Activity.Current != null)
                {
                    HistoryEvent historyEvent = message.Event;
                    if (historyEvent is TimerFiredEvent timerFiredEvent)
                    {
                        // We immediately publish the activity span for this timer by creating the activity and immediately calling Dispose() on it.
                        TraceHelper.EmitTraceActivityForTimer(workItem.OrchestrationRuntimeState.OrchestrationInstance, workItem.OrchestrationRuntimeState.Name, message.Event.Timestamp, timerFiredEvent);
                    }
                    else if (historyEvent is SubOrchestrationInstanceCompletedEvent subOrchestrationInstanceCompletedEvent)
                    {
                        SubOrchestrationInstanceCreatedEvent subOrchestrationCreatedEvent = (SubOrchestrationInstanceCreatedEvent)workItem.OrchestrationRuntimeState.Events.FirstOrDefault(x => x.EventId == subOrchestrationInstanceCompletedEvent.TaskScheduledId);

                        // We immediately publish the activity span for this sub-orchestration by creating the activity and immediately calling Dispose() on it.
                        TraceHelper.EmitTraceActivityForSubOrchestrationCompleted(workItem.OrchestrationRuntimeState.OrchestrationInstance, subOrchestrationCreatedEvent);
                    }
                    else if (historyEvent is SubOrchestrationInstanceFailedEvent subOrchestrationInstanceFailedEvent)
                    {
                        SubOrchestrationInstanceCreatedEvent subOrchestrationCreatedEvent = (SubOrchestrationInstanceCreatedEvent)workItem.OrchestrationRuntimeState.Events.FirstOrDefault(x => x.EventId == subOrchestrationInstanceFailedEvent.TaskScheduledId);

                        // We immediately publish the activity span for this sub-orchestration by creating the activity and immediately calling Dispose() on it.
                        TraceHelper.EmitTraceActivityForSubOrchestrationFailed(workItem.OrchestrationRuntimeState.OrchestrationInstance, subOrchestrationCreatedEvent, subOrchestrationInstanceFailedEvent, errorPropagationMode);
                    }
                }

where a wrong event type SubOrchestrationInstanceCreatedEvent is assumed to be returned?
The end result is that the Orchestration and it's SubOrchestration are stuck and can never progress/finish.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant