Controller is not being called after options.WatcherHttpTimeout has elapsed.
Describe the bug
After the watcher HTTP timeout has elapsed, the watcher is not recreated. This looks to be related to the fix for https://github.com/buehler/dotnet-operator-sdk/issues/460 in commit https://github.com/buehler/dotnet-operator-sdk/commit/a2290ff9edc41bbbb3a7b47a13a00c99e469f8f9?diff=split
To Reproduce
I’m not sure whether the original issue was ever reliably reproduced, but I was getting the original Microsoft.Rest.SerializationException. If you were getting it too, I expect you should be able to reproduce this issue as well.
Expected behavior
Microsoft.Rest.SerializationException should not break the watcher.
Screenshots
I’d attach logs, but I haven’t been able to get any logs out of the watcher class. I tried setting Logging__LogLevel__Microsoft=Trace on the pod.
Additional context
Prior to a2290ff9edc41bbbb3a7b47a13a00c99e469f8f9, the SerializationException would result in a call to restart the watcher: WatchResource().ConfigureAwait(false);. We now catch this exception and do not restart the watcher. Eventually we hit our WatcherHttpTimeout and enter the OnClose method, where the watcher is not restarted because the cancellation token was already cancelled in the earlier exception handling.
I’m unsure how we should handle this. Calling WatchResource().ConfigureAwait(false); would restart the watcher, but we really don’t want to be calling it every time a SerializationException is thrown.
Issue Analytics
- Created a year ago
- Comments: 6 (2 by maintainers)
Top GitHub Comments
I think your latest proposal should be fine, @tomasfabian. Thank you for your thoughts! What do you think, @buehler?
Hello @buehler,
in https://github.com/buehler/dotnet-operator-sdk/commit/a2290ff9edc41bbbb3a7b47a13a00c99e469f8f9 you wanted to make the ResourceWatcher ignore exceptions due to empty responses (see https://github.com/buehler/dotnet-operator-sdk/issues/460#issuecomment-1210262006). In https://github.com/buehler/dotnet-operator-sdk/blob/master/src/KubeOps/Operator/Kubernetes/ResourceWatcher{TEntity}.cs#L163 the watcher is disposed of in any case, independently of the type of exception. In the case of a SerializationException, the method returns immediately without recreating the watcher: neither directly, as in the case of a TaskCanceledException, nor with an exponential backoff, as for all other exceptions.
In order to ensure that the watcher is restarted even after a SerializationException, I would suggest either removing the return statement in https://github.com/buehler/dotnet-operator-sdk/blob/master/src/KubeOps/Operator/Kubernetes/ResourceWatcher{TEntity}.cs#L183 or removing the special case for this type of exception completely (if a specific log entry for it is not needed) and handling it together with all other exceptions below.
Am I missing something?
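To make the suggestion concrete, here is a minimal sketch of the error-handling flow described in this comment. It is not the actual ResourceWatcher code: WatcherSketch, StandInSerializationException, RestartAsync, Backoff, and the injected delay function are hypothetical names made up for illustration, and the backoff formula is an assumption. The only point it shows is that a serialization error gets its own log line but no early return, so it still reaches the backoff restart.

```csharp
using System;
using System.Threading.Tasks;

// Stand-in for Microsoft.Rest.SerializationException, so the sketch
// compiles without any package references (hypothetical type).
public sealed class StandInSerializationException : Exception { }

public sealed class WatcherSketch
{
    private int _retries;

    // Counts how often the (hypothetical) watch connection was re-created.
    public int Restarts { get; private set; }

    // Hypothetical restart hook; a real watcher would re-open the watch here.
    private Task RestartAsync()
    {
        Restarts++;
        return Task.CompletedTask;
    }

    // Assumed backoff: 2^retries seconds, with the exponent capped at 5 (32 s).
    public static TimeSpan Backoff(int retries) =>
        TimeSpan.FromSeconds(Math.Pow(2, Math.Min(retries, 5)));

    // delay is injected so that callers (and tests) control the waiting.
    public async Task OnErrorAsync(Exception e, Func<TimeSpan, Task> delay)
    {
        if (e is TaskCanceledException)
        {
            // Cancelled or timed-out connection: restart directly, no backoff.
            await RestartAsync();
            return;
        }

        if (e is StandInSerializationException)
        {
            // Keep the specific log entry, but do NOT return here;
            // execution continues to the backoff restart below.
            Console.WriteLine("Serialization error in watcher, restarting with backoff.");
        }

        _retries++;
        await delay(Backoff(_retries));
        await RestartAsync();
    }
}
```

Calling `OnErrorAsync(new StandInSerializationException(), d => Task.Delay(d))` would wait for the computed backoff and then restart once. Whether a dedicated log entry for the serialization case is worth keeping is the open question raised above; dropping that `if` block entirely gives the second variant of the proposal.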