Canary SMI strategy generates invalid TrafficSplit resource
When the task is used to do a canary deployment, it uses an SMI (Service Mesh Interface) TrafficSplit to manage which traffic goes to the stable vs. canary version of the service. As part of that, it automatically looks up the SMI custom resource version currently deployed so it can generate its manifest.
The manifest it generates uses XXXXm format weights. That is, you might get a manifest like this:
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: my-service-azure-pipelines-rollout
  namespace: my-ns
spec:
  backends:
  - service: my-service-stable
    weight: 1000m
  - service: my-service-baseline
    weight: 0m
  - service: my-service-canary
    weight: 0m
  service: my-service
Unfortunately, while the XXXXm format is noted in the very first version of the TrafficSplit spec, it was removed as of version 2 of the spec. The official “SMI SDK for Go” has a detailed custom resource definition and validates the weight as a number. Further, the current SMI adapter for Istio uses that SDK, so it entirely fails to read and validate TrafficSplit resources generated during canary.
The simplest solution is to stop suffixing the weights with m. Weights expressed as whole/relative numbers or percentages are compatible with all versions of the spec. A correct TrafficSplit should look like this:
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: my-service-azure-pipelines-rollout
  namespace: my-ns
spec:
  backends:
  - service: my-service-stable
    weight: 1000
  - service: my-service-baseline
    weight: 0
  - service: my-service-canary
    weight: 0
  service: my-service
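As a rough illustration of the proposed fix, the sketch below builds the TrafficSplit body with plain integer weights instead of m-suffixed strings. It is written in Python for brevity and is not the task's actual code; `build_traffic_split` and its parameters are hypothetical names chosen for this example.

```python
import json


def build_traffic_split(name, namespace, service, weights):
    """Build a TrafficSplit manifest (as a dict) with plain integer weights.

    `weights` maps a backend service name to its relative weight.
    This helper is hypothetical and only illustrates the shape of the fix.
    """
    backends = []
    for backend, weight in weights.items():
        # Reject anything that is not a non-negative whole number, so a
        # value such as "1000m" can never reach the manifest.
        if isinstance(weight, bool) or not isinstance(weight, int) or weight < 0:
            raise ValueError(f"weight for {backend!r} must be a non-negative integer")
        backends.append({"service": backend, "weight": weight})
    return {
        "apiVersion": "split.smi-spec.io/v1alpha2",
        "kind": "TrafficSplit",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"backends": backends, "service": service},
    }


manifest = build_traffic_split(
    "my-service-azure-pipelines-rollout",
    "my-ns",
    "my-service",
    {"my-service-stable": 1000, "my-service-baseline": 0, "my-service-canary": 0},
)
print(json.dumps(manifest, indent=2))
```

Because the weights are serialized as JSON numbers rather than strings, a consumer that parses the field as an unsigned integer should be able to read them.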
The SMI Adapter for Istio generates logs like this to reflect that issue:
E0902 14:45:23.035319 1 reflector.go:134] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:126: Failed to list *v1alpha2.TrafficSplit: v1alpha2.TrafficSplitList.Items: []v1alpha2.TrafficSplit: v1alpha2.TrafficSplit.Spec: v1alpha2.TrafficSplitSpec.Backends: []v1alpha2.TrafficSplitBackend: v1alpha2.TrafficSplitBackend.Weight: readUint64: unexpected character: �, error found in #10 byte of ...|"weight":"1000m"},{"|..., bigger context ...|":[{"service":"accounts-service-stable","weight":"1000m"},{"service":"accounts-service-baseline","we|...
E0902 14:45:24.038301 1 reflector.go:134] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:126: Failed to list *v1alpha2.TrafficSplit: v1alpha2.TrafficSplitList.Items: []v1alpha2.TrafficSplit: v1alpha2.TrafficSplit.Spec: v1alpha2.TrafficSplitSpec.Backends: []v1alpha2.TrafficSplitBackend: v1alpha2.TrafficSplitBackend.Weight: readUint64: unexpected character: �, error found in #10 byte of ...|"weight":"1000m"},{"|..., bigger context ...|":[{"service":"products-service-stable","weight":"1000m"},{"service":"products-service-baseline","we|...
E0902 14:45:25.042071 1 reflector.go:134] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:126: Failed to list *v1alpha2.TrafficSplit: v1alpha2.TrafficSplitList.Items: []v1alpha2.TrafficSplit: v1alpha2.TrafficSplit.Spec: v1alpha2.TrafficSplitSpec.Backends: []v1alpha2.TrafficSplitBackend: v1alpha2.TrafficSplitBackend.Weight: readUint64: unexpected character: �, error found in #10 byte of ...|"weight":"1000m"},{"|..., bigger context ...|":[{"service":"accounts-service-stable","weight":"1000m"},{"service":"accounts-service-baseline","we|...
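One way to spot affected resources before the adapter trips over them is to scan a parsed manifest for weights that are not plain integers. The sketch below does that; `find_invalid_weights` is a hypothetical helper, and its check only approximates the unsigned-integer parsing that the log messages point to.

```python
def find_invalid_weights(manifest):
    """Return (service, weight) pairs whose weight is not a non-negative integer."""
    invalid = []
    for backend in manifest.get("spec", {}).get("backends", []):
        weight = backend.get("weight")
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_whole_number = isinstance(weight, int) and not isinstance(weight, bool)
        if not is_whole_number or weight < 0:
            invalid.append((backend.get("service"), weight))
    return invalid


# Same shape as the manifest the task generates, with m-suffixed weights.
generated = {
    "spec": {
        "backends": [
            {"service": "my-service-stable", "weight": "1000m"},
            {"service": "my-service-baseline", "weight": "0m"},
            {"service": "my-service-canary", "weight": "0m"},
        ]
    }
}

# The corrected form, with plain integer weights.
corrected = {
    "spec": {
        "backends": [
            {"service": "my-service-stable", "weight": 1000},
            {"service": "my-service-baseline", "weight": 0},
            {"service": "my-service-canary", "weight": 0},
        ]
    }
}

print(find_invalid_weights(generated))
print(find_invalid_weights(corrected))
```

The first call reports all three backends, and the second returns an empty list.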
I’m guessing this logic came from the original KubernetesManifest@V0 Azure DevOps task, which is actually where I discovered it. I’ve filed a corresponding issue there. We’re on AzDO right now but will be moving to GitHub Actions shortly (in a few months?), and it’d be cool to see it fixed in both places.
Top GitHub Comments
Created a new release v1.5 fixing this.