
K8snetlook – Root-Causing K8s Network Problems in an Automated Way

2021-10-13

Authors:   Arun Sriraman


Summary

The presentation covers troubleshooting techniques for Kubernetes networking issues and introduces k8snetlook, a tool for automated debugging.
  • Two common reactions to a network issue are the 'big hammer' approach of restarting or deleting components, and asking specific groups or individuals for help
  • Identifying the type of problem and the traffic path involved is crucial in troubleshooting
  • k8snetlook automates the debugging process by performing connectivity and path MTU checks
  • The tool is not CNI-aware and does not provide automation for external reports
  • Contributors are welcome to improve the tool
The presenter demonstrates how k8snetlook can be used to troubleshoot a node issue caused by a network policy that denies all traffic. The tool runs its connectivity and path MTU checks and prints debug information that helps identify the problem and its solution.
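
To make the two kinds of checks mentioned above concrete, here is a minimal Python sketch. It is not k8snetlook's actual implementation: the function names `tcp_reachable` and `icmp_payload_for_mtu` are hypothetical, and the header sizes assume IPv4 without options. The first function tests whether a TCP handshake to an endpoint completes; the second computes the largest ICMP echo payload that fits in a single unfragmented packet for a given MTU, which is the probe size a path MTU check would typically send with the Don't Fragment bit set.

```python
import socket

# Assumed header sizes: IPv4 without options, ICMP echo request.
IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP handshake to host:port completes within timeout.

    Hypothetical helper for illustration; any socket-level failure
    (refused, unreachable, timed out) is reported as unreachable.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
```

For a standard Ethernet MTU of 1500 bytes, `icmp_payload_for_mtu(1500)` gives 1472. A call such as `tcp_reachable("10.0.0.12", 8080)` (an illustrative address, not one from the talk) returning False only shows that the handshake failed; deciding whether a network policy, routing, or the application itself is responsible still requires looking at the traffic path.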

Abstract

More and more applications in production call Kubernetes their home. As the density of workloads on a Kubernetes cluster increases, so does the probability of downtime due to an underlying network issue. Some of the most common complaints we hear from users are "I can’t connect to my service A running within a K8s cluster" and "my service A seems to not be responding some % of the time". What do you do in these situations? Do you call the network gurus to help out, or kubectl delete the application and let Kubernetes self-heal? What if you could identify an issue without needing to master the internals of K8s networking? Arun will go over the various issues seen in the data plane, from DNS and external traffic to internal app-to-app communication, and then discuss open source tools available to identify these issues in real time. We will look at k8snetlook, a simple open source tool that empowers every Kubernetes user, expert or otherwise, to root-cause these issues in an automated way.

Materials: