Increase Kubernetes Reliability: A Best Practices Guide for Readiness Probes

Kubernetes, released by Google as an open source container orchestration system in 2014 under Apache License 2.0, is written in the programming language Go. Today, Kubernetes is maintained by the Cloud Native Computing Foundation (CNCF). It offers impressive flexibility, enabling organizations of all sizes to deploy, scale, and manage production-grade, containerized workloads. Kubernetes can be deployed on a variety of infrastructure providers such as AWS, Azure, and Google Cloud, and even in on-premises data centers. It’s also quite complex and introduces a lot of unfamiliar terms and technologies for development and platform teams to learn.

Readiness probes and liveness probes are two terms you need to understand as you begin deploying more mature applications to Kubernetes. This guide discusses running readiness probes in Kubernetes clusters as a kind of health check to determine whether a container is ready to serve traffic. When a container is ready, Kubernetes routes Service traffic to its pod. If a container is not ready, Kubernetes does not route traffic to it. Ensuring that only healthy containers serve traffic can significantly improve the reliability of your applications.

Importance of Kubernetes probes

If you’re using Kubernetes, it’s partly because you want your apps and services to be reliable, available, and scalable. That doesn’t always happen. Sometimes apps and services fail due to temporary connection issues, configuration errors, missing dependencies, or application errors. Before you can work out why an application became unreliable, you first need to know that an issue has occurred or is occurring. Probes can help you troubleshoot Kubernetes by monitoring applications and services for issues. Even better, they can also help you plan and manage resources by showing you when an application is experiencing resource contention.

Probes are an excellent way to periodically check the health of an application. You usually configure probes in a YAML manifest, under the container entry of the pod configuration (for example, in the spec.containers.readinessProbe attribute), and apply the manifest with the command-line client. The three types of probes in Kubernetes are:

  • Startup probes: Verify whether an application within a container has started. They run only at startup, and Kubernetes holds off liveness and readiness probes until the startup probe succeeds. If a container fails this probe, the container is killed and follows the restartPolicy for the pod. You can configure startup probes in the spec.containers.startupProbe attribute. A significant reason to include startup probes is that some legacy applications require additional startup time when first initialized, which can make it difficult to set liveness probe parameters. Make sure you use the same protocol as the application when you configure a startupProbe, and ensure that failureThreshold * periodSeconds is long enough to cover the worst-case startup time.

  • Readiness probes: Verify that a container is ready to serve requests. If the probe returns a failed state, Kubernetes removes the IP address for the pod from the endpoints of all services that select it. Readiness probes enable you to advise Kubernetes that a running container should not receive traffic until additional tasks are completed. Those tasks include loading files, warming caches, and establishing network connections. You configure readiness probes in the spec.containers.readinessProbe attribute of the pod configuration. These probes run periodically for the life of the container, at an interval defined by the periodSeconds attribute.

  • Liveness probes: Use these liveness checks to assess whether an application running in a container is in a healthy state. If the liveness probe fails, Kubernetes kills the container and attempts to restart it. Liveness probes are useful when you want to ensure your application is not deadlocked or silently unresponsive. A common pitfall is a restart loop: if the container is still starting when the liveness probe begins running, and the initial delay is too short, the probe can exceed its failure threshold and Kubernetes kills the container before it ever becomes healthy. To mitigate this, you should use a startup probe and set its threshold high enough. Configure liveness probes in the spec.containers.livenessProbe attribute of the pod configuration. Similar to readiness probes, liveness probes also run periodically.
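To make the three probe types concrete, here is a minimal sketch of a pod that sets all of them. The pod name, image, port, and the /healthz and /ready paths are hypothetical placeholders, not values from a real application; substitute whatever your own service exposes.

```yaml
# Hypothetical pod: name, image, port, and endpoint paths are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
    - name: app
      image: registry.example.com/legacy-app:1.0
      ports:
        - containerPort: 8080
      # Runs first; liveness and readiness probes wait until it succeeds.
      # Startup budget: failureThreshold * periodSeconds = 30 * 10 = 300 seconds.
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
      # After startup: a failure here removes the pod from Service endpoints.
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5
      # After startup: repeated failures here restart the container.
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
```

With these assumed values, a slow application gets up to about five minutes to start, while a hung container is still restarted reasonably quickly once it is running.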

What are readiness probes in Kubernetes?

Readiness probes are a feature in Kubernetes that allows a container to indicate when it is ready to start serving requests. If you have them configured, Kubernetes can perform this diagnostic check to determine whether a container is ready to receive traffic.

When a container is first created or restarted, it may take time to fully initialize before it is ready to accept requests. Without a readiness probe, Kubernetes may send requests to the container during this time, which could cause errors or delays in the application. Readiness probes ensure that Kubernetes waits until the container is fully ready before routing traffic to it.

You can configure readiness probes in a variety of ways, such as by specifying an HTTP endpoint to check or by executing a custom script or command. If the probe fails, Kubernetes temporarily removes the pod from the list of available endpoints until the container becomes ready again. This helps ensure that only healthy containers are serving traffic.

Types of readiness probes

  • HTTP probe. HTTP probes are the simplest and most common type of readiness probe. An HTTP probe makes an HTTP GET request to the container and checks the response status code. If the status code is at least 200 and less than 400, the container is considered ready. If the status code is anything else, or no response arrives within the timeout, the container isn’t ready. An HTTP probe can also send custom request headers, defined in the httpHeaders field as a list of name/value pairs.

  • TCP probe. A TCP probe is like an HTTP probe, but it only attempts to open a TCP connection to the container. If the connection is successful, the container is considered in a ready state. If the connection fails, the container isn’t ready. You specify the port by adding the tcpSocket.port field to your readiness probe configuration. The probe fails if the socket cannot be opened.

  • Script probe. A script probe runs a script inside the container and checks the exit code of the script. Kubernetes has no separate script probe field; you run the script through the exec probe described next. If the exit code is 0, the container is considered ready. If the exit code is anything else, the container isn’t ready.

  • Exec probe. Executes a command inside the container and allows you to check for a variety of conditions, such as whether a file exists, a service is running, or a database is connected. To use an exec probe, you need to specify the command that you want to execute in the exec field of the probe configuration. You can also specify the timeoutSeconds and failureThreshold fields to control how long the probe takes to run and how many failures are allowed before the container is considered unhealthy. For example, this exec probe checks whether the file /tmp/healthy exists:

    readinessProbe:
      exec:
        command:
          - cat
          - /tmp/healthy

    If the file exists, cat exits with status 0 and the container is considered ready. If not, cat exits with a nonzero status and the container isn’t ready.

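  • Combination of probes. You can also combine the previous probe mechanisms to get a more accurate picture of the health of your workload. Each individual probe uses a single mechanism, but the readiness, liveness, and startup probes on a container can each use a different one, and every container in a pod has its own probes. For example, you could use an HTTP probe to check the health of the application itself and a TCP probe to check the health of the underlying infrastructure.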
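As a rough illustration of the HTTP and TCP mechanisms side by side, the sketch below defines a pod with two containers, each with its own readiness probe. The pod name, container names, and the X-Probe header are made up for this example; the ports are the defaults the stock nginx and redis images listen on.

```yaml
# Hypothetical pod showing one HTTP and one TCP readiness probe.
apiVersion: v1
kind: Pod
metadata:
  name: probe-types-demo
spec:
  containers:
    - name: web
      image: nginx
      readinessProbe:
        # Ready when GET / returns a status code from 200 to 399.
        httpGet:
          path: /
          port: 80
          httpHeaders:
            - name: X-Probe      # illustrative header name
              value: readiness
    - name: cache
      image: redis
      readinessProbe:
        # Ready when a TCP connection to port 6379 can be opened.
        tcpSocket:
          port: 6379
```

A pod is reported as Ready only when all of its containers are ready, so in this sketch a failure of either probe takes the whole pod out of the Service endpoints.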

Configure readiness probes

When you are configuring a readiness probe, there are several options you need to specify.

  • path: The path to the HTTP endpoint that will be used to check the health of the container. Set under httpGet.

  • port: The port that the endpoint is listening on.

  • initialDelaySeconds: The number of seconds to wait after the container starts before running the first health check. Defaults to 0.

  • timeoutSeconds: The number of seconds to wait for a response from the health check endpoint. Defaults to 1.

  • periodSeconds: The number of seconds between health checks. Defaults to 10.

  • failureThreshold: The number of consecutive failures that must occur before the container is considered unhealthy. Defaults to 3.

  • successThreshold: The number of consecutive successful probes required to mark a container healthy/ready again after a failure. Defaults to 1.

  • httpHeaders: Custom headers to set in the request. Set under httpGet. HTTP allows repeated headers.

It is also important to monitor the readiness probes of your containers to ensure that they are working properly. You can use the Kubernetes dashboard to do this or use a third-party monitoring tool.

You can configure readiness probes in your PodSpec. The following is an example of a PodSpec that uses an HTTP readiness probe:

apiVersion: v1
kind: Pod
metadata:
  name: my-fun-pod
spec:
  containers:
  - name: my-fun-container
    image: nginx
    readinessProbe:
      httpGet:
        # The path to the endpoint that will be used to check the health of the container.
        path: /
        # The port that the endpoint is listening on (the stock nginx image listens on 80).
        port: 80
        # Custom headers to set in the request. HTTP allows repeated headers.
        httpHeaders:
        - name: some-header
          value: Running
      # The timing fields below belong to readinessProbe itself, not to httpGet.
      # The number of seconds to wait before starting to check the health of the container.
      initialDelaySeconds: 35
      # The number of seconds to wait for a response from the health check endpoint.
      timeoutSeconds: 2
      # Required number of successful probes to mark a container healthy/ready.
      successThreshold: 1
      # The number of consecutive failures that must occur before the container is considered unhealthy.
      failureThreshold: 4

When you create a Pod with a readiness probe, the kubelet starts checking the health of the container at the specified interval. If the container fails the health check failureThreshold times in a row, the kubelet marks the container as not ready, and Kubernetes removes the pod from Service endpoints so that traffic is no longer routed to it. The kubelet continues checking the health of the container until it passes the health check. Then the kubelet marks the container as ready and Kubernetes begins routing traffic to it again.

Best practices for using readiness probes in Kubernetes

Readiness probes can help you improve the health and performance of your applications in Kubernetes. Here are a few best practices you can follow for using readiness probes in Kubernetes:

  • Use a variety of probes. There are different types of readiness probes available, including HTTP probes, TCP probes, and exec probes. Each type of probe has different strengths and weaknesses, so you need to choose the right one for your application. For example, an HTTP probe is a good choice for web applications, while a TCP probe is a good choice for backend services.

  • Configure your probes carefully. When you’re configuring your readiness probes, it’s important to set appropriate values for the initialDelaySeconds, periodSeconds, timeoutSeconds, failureThreshold, and successThreshold parameters. These parameters control how often the probe is run, how long it is allowed to take, and how many failures are allowed before the container is considered unhealthy.

  • Use readiness probes and liveness probes. Liveness probes are like readiness probes, but they determine whether a container is still healthy rather than whether it can accept traffic. If a container fails its liveness probe, Kubernetes will restart it. When you use both readiness and liveness probes, it helps you ensure that your applications stay in a healthy state.

  • Monitor your readiness probes. Monitor your readiness probes to make sure they are working properly. If a readiness probe fails, it could mean that your application is not healthy. Use the Kubernetes dashboard or the kubectl command-line tool to monitor your readiness probes.
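The practices above might come together as something like the following Deployment, which pairs a readiness probe with a liveness probe and sets each timing parameter explicitly. This is a sketch under stated assumptions: the Deployment name, labels, image, port, and the /ready and /healthz paths are hypothetical, and the timing values are starting points to tune, not recommendations for every workload.

```yaml
# Hypothetical Deployment: names, image, port, and paths are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/api:1.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 2
            failureThreshold: 3   # roughly 3 * 5 = 15 seconds of failures before traffic stops
            successThreshold: 2   # two consecutive passes before traffic resumes
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 10
            timeoutSeconds: 2
            failureThreshold: 6   # roughly 6 * 10 = 60 seconds of failures before a restart
```

Giving the liveness probe a longer failure window than the readiness probe is a deliberate choice in this sketch: a struggling container stops receiving traffic quickly but is only restarted if it stays unhealthy for much longer.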

Advantages of using readiness probes in Kubernetes

Using readiness probes in Kubernetes can help you improve your application availability and performance because they help you prevent traffic from reaching unhealthy pods. Unhealthy containers can cause applications to fail, because a container may be unable to start up correctly or may crash while running. Unhealthy containers that are serving traffic can also use resources, such as CPU, memory, and storage, that other containers in the application need in order to run. These issues with unhealthy containers could even cause other applications in the cluster to fail. And when readiness probes are in place, you can use them for troubleshooting: if any of the readiness probes for containers in a service are failing, that can help you identify the cause of a problem when services aren’t performing as expected.

In some ways, Kubernetes can create a disconnect, because while your pods may appear healthy, your users may not actually be able to access your apps and services, resulting in errors and outages. Readiness probes help you make certain that your applications, containers, and pods are running as designed. When used with liveness probes, they can also help you ensure that Kubernetes is restarting containers when they become unhealthy.

You can use open source tools, such as Polaris, to check whether a pod is ready to receive traffic. Polaris may provide comments specific to liveness probes and readiness probes that prompt users to make updates appropriate to the context of their application. To get started, you can watch this video, which walks through some basic examples of setting readiness probes across clusters to ensure reliability using Fairwinds Insights.

Readiness probes, liveness probes, and startup probes all help you make certain that your Kubernetes services are built on a good foundation, which helps your DevOps teams deliver better reliability and increased uptime for your apps and services. If you’re having trouble getting started with these health checks, check out this tutorial on readiness, liveness, and startup probes on the Kubernetes website. Please remember that you need both readiness probes and liveness probes, because they aren’t interchangeable. Together, readiness probes and liveness probes can help you increase the reliability of your applications in Kubernetes.

Fairwinds Insights enables your developers to identify potential problems early and reduce downtime or disruptions in Kubernetes. Try the free tier today!

*** This is a Security Bloggers Network syndicated blog from Fairwinds | Blog authored by Munib Ali. Read the original post at: https://www.fairwinds.com/blog/increase-kubernetes-reliability-a-best-practices-guide-for-readiness-probes