Cache Webhook "disabled" by default in Azure

What happened:

Mutating webhooks are currently used for components such as the cache-server. There was previously an issue with the Knative webhook, so the label “control-plane” was attached to prevent the webhook from triggering all the time (see https://github.com/kubeflow/kubeflow/issues/4511). However, Azure by default adds the namespace selector below to mutating webhooks so that they do not apply to AKS internal namespaces (https://github.com/Azure/AKS/issues/1771):

namespaceSelector:
    matchExpressions:
    - key: control-plane
      operator: DoesNotExist

Because the Kubeflow namespace carries the label “control-plane: kubeflow”, this selector excludes it, so the cache server fails to mutate any pods in Kubeflow.

What did you expect to happen:

It seems unfair to expect Kubeflow to fix this issue, as this dependency is inherently caused by Azure upstream. Perhaps we can update the Azure docs / default deploy to tell the users that these components won’t work as intended?

Environment:

Azure

How did you deploy Kubeflow Pipelines (KFP)?

KFP version: 1.2

KFP SDK version: 1.4

/kind bug

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 10 (2 by maintainers)

Top GitHub Comments

5 reactions
andre-lx commented, May 19, 2021

On Azure, the problem is that Istio sidecars are not being injected into the pods in Kubeflow.

AKS automatically edits the istio-sidecar-injector MutatingWebhookConfiguration in Kubeflow to add the following match expression to the namespaceSelector:

matchExpressions:
- key: control-plane
  operator: DoesNotExist

So the MutatingWebhookConfiguration looks like this:

namespaceSelector:
  matchExpressions:
  - key: control-plane
    operator: DoesNotExist
  matchLabels:
    istio-injection: enabled

This excludes the kubeflow namespace, since the namespace has the label:

labels:
    control-plane: kubeflow
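
To make the exclusion concrete, here is a minimal Python sketch of how a namespaceSelector is evaluated against namespace labels. It is not taken from the thread: the function name `matches_selector` and the two example label sets are hypothetical (in particular, the assumption that the kubeflow namespace also carries `istio-injection: enabled`), and it only mirrors the standard label selector rules rather than calling any Kubernetes API.

```python
from typing import Any, Dict


def matches_selector(labels: Dict[str, str], selector: Dict[str, Any]) -> bool:
    """Return True if the namespace labels satisfy the label selector.

    All matchLabels entries and all matchExpressions must hold (logical AND).
    """
    # matchLabels: every key/value pair must be present on the namespace.
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False

    # matchExpressions: evaluate each set-based requirement.
    for expr in selector.get("matchExpressions", []):
        key = expr["key"]
        operator = expr["operator"]
        values = expr.get("values", [])
        if operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        elif operator == "In":
            ok = key in labels and labels[key] in values
        elif operator == "NotIn":
            ok = key not in labels or labels[key] not in values
        else:
            raise ValueError(f"unknown operator: {operator}")
        if not ok:
            return False
    return True


# Selector as it looks after AKS adds its match expression (from the comment above).
AKS_PATCHED_SELECTOR = {
    "matchExpressions": [{"key": "control-plane", "operator": "DoesNotExist"}],
    "matchLabels": {"istio-injection": "enabled"},
}

# Hypothetical label sets, for illustration only.
KUBEFLOW_NS_LABELS = {"control-plane": "kubeflow", "istio-injection": "enabled"}
OTHER_NS_LABELS = {"istio-injection": "enabled"}

print(matches_selector(KUBEFLOW_NS_LABELS, AKS_PATCHED_SELECTOR))  # False
print(matches_selector(OTHER_NS_LABELS, AKS_PATCHED_SELECTOR))     # True
```

The kubeflow namespace fails the `DoesNotExist` requirement because the `control-plane` key is present, whatever its value, so the webhook is never called for pods there.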

To solve this issue, you need to disable the AKS admissions enforcer by adding the following annotation to the MutatingWebhookConfiguration:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  annotations:
    admissions.enforcer/disabled: "true"
  name: istio-sidecar-injector
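
If you would rather apply the annotation from a script than edit the manifest, something like the following might work. This is a sketch, not something from the thread: it assumes the official `kubernetes` Python client is installed and that your kubeconfig points at the AKS cluster, and the helper names `build_disable_enforcer_patch` and `apply_patch` are made up for illustration. Only the annotation key, its value, and the webhook name come from the YAML above.

```python
from typing import Any, Dict

# Annotation key from the YAML above.
ENFORCER_ANNOTATION = "admissions.enforcer/disabled"


def build_disable_enforcer_patch() -> Dict[str, Any]:
    """Build a patch body that sets the enforcer annotation.

    Annotation values must be strings, hence "true" rather than True.
    """
    return {"metadata": {"annotations": {ENFORCER_ANNOTATION: "true"}}}


def apply_patch(webhook_name: str = "istio-sidecar-injector") -> None:
    """Patch the named MutatingWebhookConfiguration on the current cluster.

    Assumption: the `kubernetes` package is installed and a kubeconfig is
    available. The import is kept local so the builder above works without it.
    """
    from kubernetes import client, config

    config.load_kube_config()
    api = client.AdmissionregistrationV1Api()
    api.patch_mutating_webhook_configuration(
        name=webhook_name,
        body=build_disable_enforcer_patch(),
    )


if __name__ == "__main__":
    print(build_disable_enforcer_patch())
```

Calling `apply_patch()` only adds the annotation. Whether a match expression that AKS already injected on an existing cluster also has to be removed by hand is not settled in this thread.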

I believe the following issues are all related to this, so you don’t need to disable Istio (by changing the DestinationRules from ISTIO_MUTUAL to DISABLE):

https://github.com/kubeflow/pipelines/issues/4469
https://github.com/kubeflow/kubeflow/issues/5271
https://github.com/kubeflow/kubeflow/issues/5277
https://github.com/Azure/AKS/issues/1771

+info: https://docs.microsoft.com/en-us/azure/aks/faq#can-i-use-admission-controller-webhooks-on-aks

2 reactions
andre-lx commented, May 28, 2021

Hi @danishsamad. Not sure, actually.

In my case, deleting the match expression didn’t solve the issue, since AKS automatically adds that match expression back.

I’m not sure if I did the manual delete or if the annotation deleted it.

Maybe I added the annotation and eliminated the match expression manually.

Anyway, in new clusters the annotation worked as expected: the match expression is not added and the Istio sidecars are injected into the pipelines pods.
