Kubernetes observability

Kubernetes is undoubtedly one of the most popular container orchestration platforms today. In fact, many cloud-native technologies now assume Kubernetes as the de facto deployment model.

Unfortunately, adopting Kubernetes comes at a cost. Aside from its steep learning curve, it has many moving parts that need additional observability to keep the cluster performing well. Given the popularity of Kubernetes and the complexity it brings, focusing on observability is critical to a successful Kubernetes infrastructure.

Components of Kubernetes observability

At the core, the key components of observability do not change with Kubernetes. The three pillars of observability (logs, metrics, and traces) must all be collected and analyzed to understand the state of the applications running on Kubernetes as well as the health of Kubernetes itself.

The challenge with Kubernetes observability is the number of data sources and the volume of data they produce.

First, all applications running inside Kubernetes must be monitored. This includes not only the application container itself, but also the helper containers running alongside it. These helper containers could be sidecars, init containers, or hooks introduced by other tools such as Istio, CI/CD pipelines, or the observability tools themselves.
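As a rough illustration of how helper containers widen the monitoring surface, the sketch below lists every container declared in a pod manifest. It assumes the manifest has already been loaded into a Python dict. The function name `containers_to_monitor` and the example pod are hypothetical, while `initContainers`, `containers`, and `ephemeralContainers` are fields of the Kubernetes pod spec.

```python
from typing import Any

# Pod spec fields that can each declare containers.
CONTAINER_FIELDS = ("initContainers", "containers", "ephemeralContainers")


def containers_to_monitor(pod: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (spec field, container name) for every container in the pod."""
    spec = pod.get("spec") or {}
    found = []
    for field in CONTAINER_FIELDS:
        # A field may be missing or null in a manifest, so default to empty.
        for container in spec.get(field) or []:
            found.append((field, container["name"]))
    return found


# Hypothetical pod: one application container plus helpers of the kind a
# service mesh such as Istio typically injects.
example_pod = {
    "metadata": {"name": "checkout-7d9f"},
    "spec": {
        "initContainers": [{"name": "istio-init"}],
        "containers": [{"name": "app"}, {"name": "istio-proxy"}],
    },
}

monitored = containers_to_monitor(example_pod)
```

In this example a single-application pod yields three containers to watch, which is the multiplication effect described above.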

Next, we need to monitor the infrastructure layer. This includes the Kubernetes control plane components (if your provider exposes them), along with other critical Kubernetes components such as the autoscaler, DNS, and ingress controllers. Depending on the Kubernetes distribution or provider, some of these may be managed for you or not exposed at all.

Finally, access logs and policy information must also be monitored for security and audit purposes.
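One possible way to act on this is to scan the API server audit log for rejected requests. The sketch below assumes audit events are available as JSON lines in the `audit.k8s.io/v1` Event shape, with `verb`, `user.username`, `objectRef.resource`, and `responseStatus.code` fields. The function name `denied_requests` and the sample events are hypothetical.

```python
import json
from typing import Iterable

# HTTP status codes that indicate an unauthenticated or unauthorized request.
DENIED_CODES = (401, 403)


def denied_requests(lines: Iterable[str]) -> list[dict]:
    """Return a short summary of every audit event that was rejected."""
    denied = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        code = (event.get("responseStatus") or {}).get("code")
        if code in DENIED_CODES:
            denied.append({
                "user": (event.get("user") or {}).get("username"),
                "verb": event.get("verb"),
                "resource": (event.get("objectRef") or {}).get("resource"),
                "code": code,
            })
    return denied


# Hypothetical audit log excerpt: one allowed and one forbidden request.
sample_log = [
    json.dumps({"verb": "get", "user": {"username": "alice"},
                "objectRef": {"resource": "pods"},
                "responseStatus": {"code": 200}}),
    json.dumps({"verb": "delete", "user": {"username": "bob"},
                "objectRef": {"resource": "secrets"},
                "responseStatus": {"code": 403}}),
]

rejected = denied_requests(sample_log)
```

Whether audit logging is enabled, and where the log ends up, depends on the distribution or provider, so treat the input source as an assumption.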

Tools for Kubernetes observability

Fortunately, Kubernetes integrates well with the cloud-native observability tool stack, including Prometheus, Grafana, Jaeger, Zipkin, and OpenTelemetry-compliant frameworks.

Most modern observability platforms also support Kubernetes observability natively. These include Lightrun, Datadog, New Relic, Dynatrace, Sumo Logic, and Honeycomb.

Best practices for Kubernetes observability

Since Kubernetes has many components to keep track of, it is important to decide up front which logs, metrics, and traces drive observability for each component, instead of simply resorting to monitoring and alerting.

Use Kubernetes labels and annotations to tag and filter Kubernetes components effectively. Consistent tagging also helps separate application data from infrastructure-layer data, which makes it easier to triage and troubleshoot potential issues.
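A minimal sketch of label-based filtering follows, assuming telemetry records already carry the labels of the object that produced them. It implements only equality-based matching, the simplest form of Kubernetes label selector. The `tier` label, its values, and the record layout are hypothetical, while `app.kubernetes.io/name` is one of the recommended Kubernetes labels.

```python
def matches_selector(labels: dict[str, str], selector: dict[str, str]) -> bool:
    """True when every key/value pair in the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def split_by_selector(records: list[dict], selector: dict[str, str]):
    """Partition records into (matching, non-matching) lists."""
    matched, rest = [], []
    for record in records:
        labels = record.get("labels") or {}
        target = matched if matches_selector(labels, selector) else rest
        target.append(record)
    return matched, rest


# Hypothetical telemetry records tagged with their source object's labels.
records = [
    {"source": "checkout-7d9f",
     "labels": {"app.kubernetes.io/name": "checkout", "tier": "application"}},
    {"source": "coredns-5c4b",
     "labels": {"app.kubernetes.io/name": "coredns", "tier": "infrastructure"}},
    {"source": "ingress-nginx-x1",
     "labels": {"app.kubernetes.io/name": "ingress-nginx", "tier": "infrastructure"}},
]

infra, apps = split_by_selector(records, {"tier": "infrastructure"})
```

Here `infra` holds the DNS and ingress records and `apps` holds the checkout record, so each group can be routed to its own dashboard or alert policy.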

Also, given the large volume of logs and metrics, regularly review the data sources and implement optimization techniques such as sampling to reduce cost and surface only critical information for humans to respond to.
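Sampling can take many forms. The sketch below shows one simple variant under stated assumptions: it always keeps warnings and errors, and keeps a fixed fraction of everything else by hashing a trace identifier, so all records from one trace are kept or dropped together. The field names `level` and `trace_id`, the `ALWAYS_KEEP` set, and the default rate are hypothetical choices, not settings from any particular tool.

```python
import zlib

# Severity levels that bypass sampling entirely (an assumed policy).
ALWAYS_KEEP = {"warning", "error"}
BUCKETS = 10_000


def keep_record(record: dict, sample_rate: float = 0.1) -> bool:
    """Decide whether a record survives sampling.

    The decision is deterministic: the same trace_id always lands in the
    same bucket, so repeated runs and related records agree.
    """
    if record.get("level") in ALWAYS_KEEP:
        return True
    bucket = zlib.crc32(record["trace_id"].encode("utf-8")) % BUCKETS
    return bucket < sample_rate * BUCKETS


def sample(records: list[dict], sample_rate: float = 0.1) -> list[dict]:
    """Return only the records that survive sampling."""
    return [r for r in records if keep_record(r, sample_rate)]


# Hypothetical log records.
logs = [
    {"level": "info", "trace_id": "trace-1", "msg": "request served"},
    {"level": "error", "trace_id": "trace-2", "msg": "upstream timeout"},
    {"level": "info", "trace_id": "trace-3", "msg": "request served"},
]

errors_only = sample(logs, sample_rate=0.0)
```

Hashing instead of calling a random number generator is a design choice: it keeps the outcome reproducible, at the price of depending on how evenly the identifiers hash.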