Automatically Scaling Jenkins Pods on Kubernetes

Introduction

Jenkins is one of the most popular CI/CD tools, but it doesn’t scale out of the box. You could add more permanent nodes and connect them to the Jenkins Master, but those nodes would consume infrastructure resources even while sitting idle. In this tutorial, you will learn how to configure Kubernetes and Jenkins to dynamically provision Jenkins slaves as Kubernetes Pods while a job is executing. After the job is finished, the pods are deleted, freeing the resources they were consuming during the build.

Essentials

  1. A running Kubernetes Cluster. Follow the “How to install a Kubernetes Cluster” guide to deploy one.
  2. A running Jenkins Master. Refer to the “Jenkins installation on Kubernetes” guide for instructions.

Step 1 – Jenkins Kubernetes plugin Installation

For Jenkins to communicate with the Kubernetes API, we need to install two plugins:

  1. The Kubernetes plugin
  2. The Kubernetes Client API Plugin

After you log in to the Jenkins user interface, navigate to Manage Jenkins -> Manage Plugins -> Available, and filter the results by typing “Kubernetes plugin” and “Kubernetes Client API Plugin”. Select both and install them.

Optionally, also install the Kubernetes CLI Plugin, which lets you execute kubectl commands from within your pipelines.
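
If you would rather script the plugin installation than click through the UI, the following is a minimal sketch. It assumes your Jenkins Master is built from the official jenkins/jenkins Docker image, which bundles the jenkins-plugin-cli tool; the helper name write_plugins_file and the plugins.txt file name are made up for illustration.

```shell
# write_plugins_file is a hypothetical helper, not part of Jenkins.
# It writes the IDs of the plugins used in this tutorial, one per line,
# in the format that jenkins-plugin-cli accepts as a plugin file.
write_plugins_file() {
    target="${1:-plugins.txt}"
    cat > "$target" <<'EOF'
kubernetes
kubernetes-client-api
kubernetes-cli
EOF
}

# Usage, inside a container based on the official image (assumption):
#   write_plugins_file plugins.txt
#   jenkins-plugin-cli --plugin-file plugins.txt
```

The last ID (kubernetes-cli) is the optional Kubernetes CLI Plugin; remove that line if you do not need it.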

Step 2 – Plugin Configuration

Now that the essential plugins are installed, you will have to follow the steps below:

  • Go to Manage Jenkins -> Manage Nodes and Clouds -> Configure Clouds
  • You will see a drop-down menu named “Add New Cloud”. Click on it and choose “Kubernetes”
  • Click on “Kubernetes Cloud details”
  • Fill out the plugin values:
    • Name: kubernetes
    • Kubernetes URL: https://kubernetes.default
    • Kubernetes Namespace: jenkins
    • Credentials -> Add -> Jenkins (choose the Kubernetes service account option with Global scope, then save)
    • Test Connection. It should succeed even without credentials, since our Jenkins Master already runs under the “jenkins” service account in the “jenkins” namespace and all permissions were taken care of during the initial Jenkins installation.
    • Jenkins URL: http://jenkins:8080
    • Tunnel: jenkins:50000 (host and port only, without the http:// prefix)
    • Apply cap only on alive pods: yes
    • Add Kubernetes Pod Template
      • Name: jenkins-slave
      • Namespace: jenkins
      • Labels: jenkins-slave (you will need to use this label on all jobs)
      • Containers > Add Template
        • Name: jnlp
        • Docker Image: admintuts/jnlp-agent-docker:alpine
        • Working Directory: /home/jenkins/agent
        • Command to run: (leave empty)
        • Arguments to pass to the command: (leave empty)
        • Allocate pseudo-TTY: yes
        • Add Volume
          • HostPath Volume
          • HostPath: /var/run/docker.sock
          • Mount Path: /var/run/docker.sock
      • Timeout in seconds for Jenkins connection: 300
      • Service Account: jenkins
  • Save
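
The Test Connection step above relies on the “jenkins” service account being allowed to manage pods in the “jenkins” namespace. A rough way to double-check this from the command line is sketched below. The helper name check_agent_permissions is hypothetical, the list of verbs is an assumption about what the agent lifecycle needs, and the --as flag only works if your own kubectl user is allowed to impersonate service accounts (a cluster admin usually is).

```shell
# check_agent_permissions is a hypothetical helper, not part of Jenkins or kubectl.
# It asks the API server whether the given service account may manage pods
# and returns non-zero if any of the checked verbs is denied.
check_agent_permissions() {
    namespace="${1:-jenkins}"
    account="${2:-jenkins}"
    status=0
    for verb in create delete get list watch; do
        # "kubectl auth can-i" prints "yes" or "no"
        answer="$(kubectl auth can-i "$verb" pods \
            --namespace "$namespace" \
            --as "system:serviceaccount:${namespace}:${account}" 2>/dev/null || true)"
        echo "${verb} pods: ${answer:-unknown}"
        [ "$answer" = "yes" ] || status=1
    done
    return "$status"
}

# Usage:
#   check_agent_permissions jenkins jenkins
```

If any verb is reported as “no”, revisit the role and role binding created during the Jenkins installation before moving on.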

That’s all the configuration needed. Now you will create a Jenkins job to test everything.

Step 3 – Test Jenkins Slave Agent

Now it’s time to see our work finally taking shape.

  1. Create a new freestyle project with the name Test Slave Agent and hit OK.
  2. Click on Build Environment -> Build -> Add Build Step -> Execute Shell
  3. Paste the code:
    echo "The slave agent will run for 30 seconds..."
    sleep 30
    echo "Time for the agent to die...Bye Jenkins Agent!"

    Now click Save, and then Build Now.

Verifying the Slave Agent Pod is Running on Kubernetes

This is what you will see when you click on Build Now:

  1. (pending—‘Jenkins’ is reserved for jobs with matching label expression)
  2. A loading indicator that confirms the job is running on a Kubernetes slave pod.

This can also be confirmed by executing:

kubectl get pods -n jenkins

The output should look similar to this:

NAME                  READY   STATUS    RESTARTS   AGE
jenkins-0             1/1     Running   11         2d22h
jenkins-slave-pccvz   1/1     Running   0          13s

The jenkins-0 pod is the Jenkins Master, and jenkins-slave-pccvz is our dynamically spawned slave pod.
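
To follow the agent’s lifecycle without scanning the whole pod list by eye, a small helper along these lines may be handy. It is only a sketch: list_agent_pods is a hypothetical name, and it assumes agent pods can be recognised by the jenkins-slave- name prefix that comes from the pod template name configured earlier.

```shell
# list_agent_pods is a hypothetical helper, not part of kubectl.
# It prints "NAME STATUS" for every pod whose name starts with the given
# prefix (assumed to match the pod template name from Step 2).
list_agent_pods() {
    namespace="${1:-jenkins}"
    prefix="${2:-jenkins-slave-}"
    # Columns of "kubectl get pods": NAME READY STATUS RESTARTS AGE
    kubectl get pods --namespace "$namespace" --no-headers 2>/dev/null |
        awk -v prefix="$prefix" 'index($1, prefix) == 1 { print $1, $3 }'
}

# Usage:
#   list_agent_pods jenkins
```

Calling it a few times while the build is running and again after it finishes should show the agent pod appear and then disappear.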

Now let’s check how the Jenkins Master is handling the slave operations. Run the command:

kubectl logs jenkins-0 -n jenkins

Output:

2020-05-30 08:52:01.525+0000 [id=730]	INFO	o.c.j.p.k.KubernetesClientProvider#gracefulClose: Not closing io.fabric8.kubernetes.client.DefaultKubernetesClient@211a9e92: there are still running (1) or queued (0) calls
2020-05-30 08:54:23.462+0000 [id=733]	INFO	hudson.model.AsyncPeriodicWork#lambda$doRun$0: Started DockerContainerWatchdog Asynchronous Periodic Work
2020-05-30 08:54:23.463+0000 [id=733]	INFO	c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog has been triggered
2020-05-30 08:54:23.463+0000 [id=733]	INFO	c.n.j.p.d.DockerContainerWatchdog$Statistics#writeStatisticsToLog: Watchdog Statistics: Number of overall executions: 24, Executions with processing timeout: 0, Containers removed gracefully: 0, Containers removed with force: 0, Containers removal failed: 0, Nodes removed successfully: 0, Nodes removal failed: 0, Container removal average duration (gracefully): 0 ms, Container removal average duration (force): 0 ms, Average overall runtime of watchdog: 0 ms, Average runtime of container retrieval: 0 ms
2020-05-30 08:54:23.463+0000 [id=733]	INFO	c.n.j.p.d.DockerContainerWatchdog#loadNodeMap: We currently have 0 nodes assigned to this Jenkins instance, which we will check
2020-05-30 08:54:23.464+0000 [id=733]	INFO	c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog check has been completed
2020-05-30 08:54:23.464+0000 [id=733]	INFO	hudson.model.AsyncPeriodicWork#lambda$doRun$0: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms
2020-05-30 08:54:31.524+0000 [id=735]	INFO	o.c.j.p.k.KubernetesClientProvider#gracefulClose: Not closing io.fabric8.kubernetes.client.DefaultKubernetesClient@211a9e92: there are still running (1) or queued (0) calls
2020-05-30 08:55:09.370+0000 [id=736]	INFO	hudson.model.AsyncPeriodicWork#lambda$doRun$0: Started Periodic background build discarder
2020-05-30 08:55:09.372+0000 [id=736]	INFO	hudson.model.AsyncPeriodicWork#lambda$doRun$0: Finished Periodic background build discarder. 2 ms
2020-05-30 08:57:01.522+0000 [id=739]	INFO	o.c.j.p.k.KubernetesClientProvider#gracefulClose: Not closing io.fabric8.kubernetes.client.DefaultKubernetesClient@211a9e92: there are still running (1) or queued (0) calls
2020-05-30 08:59:04.218+0000 [id=35]	INFO	o.c.j.p.k.KubernetesCloud#provision: Excess workload after pending Kubernetes agents: 1
2020-05-30 08:59:04.218+0000 [id=35]	INFO	o.c.j.p.k.KubernetesCloud#provision: Template for label null: jenkins-slave
2020-05-30 08:59:04.250+0000 [id=35]	INFO	o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-05-30 08:59:14.229+0000 [id=36]	INFO	hudson.slaves.NodeProvisioner#lambda$update$6: jenkins-slave-pccvz provisioning successfully completed. We have now 2 computer(s)
2020-05-30 08:59:14.248+0000 [id=745]	INFO	o.c.j.p.k.KubernetesLauncher#launch: Created Pod: jenkins/jenkins-slave-pccvz
2020-05-30 08:59:14.268+0000 [id=749]	INFO	o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-05-30 08:59:14.278+0000 [id=745]	INFO	o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-05-30 08:59:15.270+0000 [id=751]	INFO	h.TcpSlaveAgentListener$ConnectionHandler#run: Accepted JNLP4-connect connection #8 from /10.244.1.84:57544
2020-05-30 08:59:15.778+0000 [id=745]	INFO	o.c.j.p.k.KubernetesLauncher#launch: Pod is running: jenkins/jenkins-slave-pccvz
2020-05-30 08:59:23.463+0000 [id=800]	INFO	hudson.model.AsyncPeriodicWork#lambda$doRun$0: Started DockerContainerWatchdog Asynchronous Periodic Work
2020-05-30 08:59:23.464+0000 [id=800]	INFO	c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog has been triggered
2020-05-30 08:59:23.464+0000 [id=800]	INFO	c.n.j.p.d.DockerContainerWatchdog$Statistics#writeStatisticsToLog: Watchdog Statistics: Number of overall executions: 25, Executions with processing timeout: 0, Containers removed gracefully: 0, Containers removed with force: 0, Containers removal failed: 0, Nodes removed successfully: 0, Nodes removal failed: 0, Container removal average duration (gracefully): 0 ms, Container removal average duration (force): 0 ms, Average overall runtime of watchdog: 0 ms, Average runtime of container retrieval: 0 ms
2020-05-30 08:59:23.465+0000 [id=800]	INFO	c.n.j.p.d.DockerContainerWatchdog#loadNodeMap: We currently have 1 nodes assigned to this Jenkins instance, which we will check
2020-05-30 08:59:23.465+0000 [id=800]	INFO	c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog check has been completed
2020-05-30 08:59:23.465+0000 [id=800]	INFO	hudson.model.AsyncPeriodicWork#lambda$doRun$0: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms
2020-05-30 08:59:31.523+0000 [id=802]	INFO	o.c.j.p.k.KubernetesClientProvider#gracefulClose: Not closing io.fabric8.kubernetes.client.DefaultKubernetesClient@211a9e92: there are still running (1) or queued (0) calls
2020-05-30 08:59:48.580+0000 [id=807]	INFO	o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent jenkins-slave-pccvz
2020-05-30 08:59:48.624+0000 [id=807]	INFO	o.c.j.p.k.KubernetesSlave#deleteSlavePod: Terminated Kubernetes instance for agent jenkins/jenkins-slave-pccvz
Terminated Kubernetes instance for agent jenkins/jenkins-slave-pccvz
2020-05-30 08:59:48.624+0000 [id=807]	INFO	o.c.j.p.k.KubernetesSlave#_terminate: Disconnected computer jenkins-slave-pccvz
Disconnected computer jenkins-slave-pccvz

The most important lines in the log above are:

  1. jenkins-slave-pccvz provisioning successfully completed. We have now 2 computer(s)
  2. Accepted JNLP4-connect connection #8 from /10.244.1.84:57544

If the JNLP4 connection is never established, the slave pod cannot receive job requests from the master, so the job will never run on it. Checking for this line is a good first step when debugging.
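
The two log lines highlighted above can also be checked mechanically. The sketch below assumes the log format shown in this tutorial; check_agent_log is a hypothetical helper name. Note that the JNLP4 message does not contain the agent name, so the second check only confirms that some agent connected.

```shell
# check_agent_log is a hypothetical helper, not part of Jenkins.
# It reads a Jenkins Master log on stdin and reports whether the
# provisioning message for the given agent and a JNLP4 connection
# message are present. Returns non-zero if either one is missing.
check_agent_log() {
    agent="$1"
    log="$(cat)"
    status=0
    if printf '%s\n' "$log" | grep -F "${agent} provisioning successfully completed" >/dev/null; then
        echo "provisioned: yes"
    else
        echo "provisioned: no"
        status=1
    fi
    if printf '%s\n' "$log" | grep -F "Accepted JNLP4-connect connection" >/dev/null; then
        echo "JNLP4 connection: yes"
    else
        echo "JNLP4 connection: no"
        status=1
    fi
    return "$status"
}

# Usage (assumes the master pod is jenkins-0, as in the output above):
#   kubectl logs jenkins-0 -n jenkins | check_agent_log jenkins-slave-pccvz
```

If the pod was provisioned but no JNLP4 connection shows up, the Jenkins URL and Tunnel values from Step 2 are the first settings to re-check.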