Horizontal Pod Autoscaler – KubeCon + CloudNativeCon NA 2021

Thanks to all those who attended! As promised, here are the basic steps on scaling your application in Kubernetes.

This demo is based on the official documentation, so you can refer to it for more details.

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/

While using the above demo as a base, our demo highlights the issues/errors that can happen along the way.

Assumptions

  • 1 control plane node (we use labels to limit to one node for simplicity; a quick label check is shown after this list)
  • Install jq (if you want to use it), else parse results another way
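
If you want to confirm the control-plane label we rely on later, something like this works on a default kubeadm install (a sketch; the exact label name can vary by Kubernetes version):

# Show every node with its labels; kubeadm normally sets
# node-role.kubernetes.io/control-plane="" on the control plane node.
kubectl get nodes --show-labels

# Or list just the role-related label keys using jq.
kubectl get nodes -o json | jq '.items[].metadata.labels | keys[]' -r | grep role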

Setup

Showing failures

If you installed a single control plane node using kubeadm, the following commands fail out of the box.

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/"

kubectl top pods

n.b. If you add -v=6 (or higher), you can see the actual REST calls, including HTTP errors such as 404.
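
For example, rerunning the same calls with verbosity raised shows the underlying HTTP requests and their response codes:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/" -v=6

kubectl top pods -v=6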

Install metrics server

While there is a router (the API server), there is no server to respond (i.e., a metrics server). We need to install one.

curl -sLO https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.1/components.yaml

Review the metrics manifest file you just downloaded (vi components.yaml). Notice it is mostly security (RBAC) setup plus the configuration to launch the metrics server and set up a ClusterIP service.

kubectl apply -f components.yaml
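
You can watch the new objects come up in the kube-system namespace (the official manifest names the Deployment metrics-server and labels its pod k8s-app=metrics-server; adjust if your copy differs):

kubectl -n kube-system get deployment metrics-server

kubectl -n kube-system get pods -l k8s-app=metrics-server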

Show startup issues

While it may appear that the metrics server is set up, it is not fully operational yet. If you describe the pod, you will notice many lines complaining about TLS/https.
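
For example (this assumes the default k8s-app=metrics-server label from the manifest you applied):

kubectl -n kube-system describe pod -l k8s-app=metrics-server

kubectl -n kube-system logs -l k8s-app=metrics-server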

Due to our limited time, we will disable the TLS check. Change this section:

    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        image: k8s.gcr.io/metrics-server/metrics-server:v0.5.1

to this (add --kubelet-insecure-tls):

    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        image: k8s.gcr.io/metrics-server/metrics-server:v0.5.1
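
If you prefer not to edit the file by hand, a one-line edit like the following should also work (a sketch that assumes GNU sed and that the --metric-resolution=15s line appears exactly once):

# Append --kubelet-insecure-tls directly after the --metric-resolution line,
# reusing the captured indentation so the YAML stays valid.
sed -i 's/^\(\s*\)- --metric-resolution=15s/&\n\1- --kubelet-insecure-tls/' components.yaml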

Reapply your manifest file; this takes a minute but will replace the broken pod with a working one (again, don’t run this setup in production).

kubectl apply -f components.yaml
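
You can wait for the replacement rollout to finish and confirm the new pod is Running:

kubectl -n kube-system rollout status deployment metrics-server

kubectl -n kube-system get pods -l k8s-app=metrics-server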

Show metrics working

If all goes well, these commands:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/"

kubectl top pods

are now reporting stats back!
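
Since jq is available, you can also pull the raw node and pod metrics and pretty-print them, for example:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods" | jq '.items[].metadata.name' -r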

Demo load

We next create an application and deploy it. This will be our target to scale up and down.

Show code

Make sure the following PHP code is in the “code” subdirectory, as Docker needs it for the build context.

~$ cat ./code/index.php 
<?php
  $x = 0.0001;
  for ($i = 0; $i <= 1000000; $i++) {
    $x += sqrt($x);
  }
  echo "OK!";
?>

Build Docker image

cat << EOF | sudo docker image build -t myphpapp:v0.1 -f- ./code
FROM php:5-apache
COPY index.php /var/www/html/index.php
RUN chmod a+rx index.php
EOF
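
A quick check that the image landed in the local Docker image cache (the deployment below relies on the image already being present on the node, since we never push it to a registry):

sudo docker image ls myphpapp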

Deploy application

Using the following deployment manifest, we deploy our new image. HPA works with resources that expose a replica count (such as Deployments). Here we start with 1 replica.

~$ cat deployment.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      containers:
      - name: php-apache
        image: myphpapp:v0.1 
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache

kubectl apply -f deployment.yaml
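
Confirm the rollout and the service before moving on:

kubectl rollout status deployment php-apache

kubectl get deployment,service php-apache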

Wait a minute, then check that the new pod is reporting metrics (which the HPA will use to make scaling decisions).

kubectl top pod

Monitoring

In two new terminals, run the following commands to monitor our growth.

( sh -c "kubectl get horizontalpodautoscalers.autoscaling -w" & sh -c "i=0;while sleep 15; do echo hb:\$i; i=\$((i+1)); done" )

kubectl get pods -w

The first command shows HPA changes along with a 15s heartbeat (which is the decision cycle of HPA by default). The second command simply watches pods scaling up and down.

Create HPA

We are now ready to launch our HPA, with a minimum of 1 pod and a maximum of 10. First we dry-run to review.

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10 --dry-run=client -o yaml
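
The dry-run output should look roughly like this (the exact apiVersion and fields depend on your kubectl version; this sketch shows the autoscaling/v1 form):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  targetCPUUtilizationPercentage: 50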

We now launch the HPA.

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

kubectl get horizontalpodautoscalers.autoscaling

Back to monitoring

Back in the monitoring windows, not much happens. This is because there is no load; we need to attack the CPU of our pod.

Attack the Pod CPU

We launched our app via a Deployment, fronted by a Service. Let’s use the service as the entry point to attack the pod(s).

kubectl get service php-apache -o json | jq '.spec.clusterIP' -r
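
You can also capture the IP in a variable so you don’t have to paste it into the loops below (our cluster handed out 10.107.222.7; yours will differ):

SVC_IP=$(kubectl get service php-apache -o json | jq -r '.spec.clusterIP')
echo $SVC_IP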

Below are three levels of attack: slower, faster, fastest. Run each for a couple of minutes to see the different growth patterns in the monitoring windows (substitute your own ClusterIP from the command above).

( echo slower; sh -c "while sleep 1; do curl -i 10.107.222.7; done" )

( echo faster; sh -c "while sleep .5; do curl -s -o /dev/null -w \"%{http_code}\n\" 10.107.222.7; done" )

( echo fastest; sh -c "while true; do curl -s -o /dev/null -w \"%{http_code}\n\" 10.107.222.7; done" )

Do we ever max out?

In the case of ‘fastest’, in our test, we never hit 10, only 8 or 9. You can see why by reviewing the events.

kubectl get events

Turns out, our single box can’t handle 10 pods based on the Pod resource requirements (in the deployment manifest).
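
You can sanity-check this yourself: each pod requests 200m of CPU, so 10 replicas need roughly 2 full cores on top of everything else running on the node. Compare that against what the node can actually allocate (a sketch assuming the single-node setup above):

# Allocatable CPU on the (single) node.
kubectl get nodes -o jsonpath='{.items[0].status.allocatable.cpu}{"\n"}'

# Per-pod CPU request declared in the deployment manifest.
kubectl get deployment php-apache -o jsonpath='{.spec.template.spec.containers[0].resources.requests.cpu}{"\n"}'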

Conclusion

For those who are new to HPA, you have seen the basics. For those who had some experience, you see the errors and how to (potentially) fix them. There is a lot more to scaling though: how the metrics are interpreted, what metrics are used, multi-metrics, etc.

Visit https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/ for more, and don’t forget to check out our Kubernetes-related classes here: https://rx-m.com/training/.

Feel free to contact me with any questions, including technical ones; I will help if I can.

Ronald Petty

ronald.petty@rx-m.com 

https://www.linkedin.com/in/ronaldpetty/