Category Archives: Virtualization

How to overprovision the EKS cluster?

In this article, I will guide you through the process of overprovisioning an EKS cluster with a detailed step-by-step approach. In the later section, we will explore how to test that overprovisioning is working.

EKS Cluster overprovisioning!

If you want to understand what overprovisioning is, I recommend referring to my previously published article on overprovisioning.

Let’s get started.

Prechecks

Ensure your setup adheres to the below prerequisites. It should, unless you are running ancient infrastructure 🙂

  • Ensure you are running Kubernetes 1.14 or later, since Pod priority and preemption graduated to stable (GA) in version 1.14.
  • Verify the Cluster Autoscaler’s priority cutoff is set to -10; it has been the default since version 1.12.

The manifests provided in this article carry bare-minimum specifications. You need to modify them depending on your requirements: a non-default namespace, custom labels or annotations, etc. The method of deploying these manifests also varies; the simple way is kubectl apply -f manifest.yaml, while more involved setups use Helm charts or ArgoCD apps, etc.

Defining the PriorityClass 

In Kubernetes, we can set a custom priority for pods using a PriorityClass. To configure overprovisioning, you need a PriorityClass with a value lower than zero, because the default pod priority is zero. This assigns the lower priority to the pause pods and ensures that these pods are preempted when the time comes. To deploy this custom PriorityClass on your cluster, use the following simple manifest:

apiVersion: scheduling.k8s.io/v1
description: This priority class is for overprovisioning pods only.
globalDefault: false
kind: PriorityClass
metadata:
  name: overprovisioning
value: -1

Define Autoscaler strategy

A ConfigMap is utilized to define the autoscaler policy for the overprovisioning deployment. The process of calculation is explained here. Please refer to the below manifest:

apiVersion: v1
data:
  linear: |-
    {
      "coresPerReplica": 1,
      "nodesPerReplica": 1,
      "min": 1,
      "max": 50,
      "preventSinglePointFailure": true,
      "includeUnschedulableNodes": true
    }
kind: ConfigMap
metadata:
  name: overprovisioning-cm

RBAC Config

Next is the RBAC configuration, with three components: ServiceAccount, ClusterRole, and ClusterRoleBinding. These components give the autoscaler deployment the necessary access to adjust the size of the pause pod deployment based on the required scaling. Please refer to the manifest:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: overprovisioning-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: overprovisioning-cr
rules:
  - apiGroups:
      - ''
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ''
    resources:
      - replicationcontrollers/scale
    verbs:
      - get
      - update
  - apiGroups:
      - extensions
      - apps
    resources:
      - deployments/scale
      - replicasets/scale
    verbs:
      - get
      - update
  - apiGroups:
      - ''
    resources:
      - configmaps
    verbs:
      - get
      - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: overprovisioning-rb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: overprovisioning-cr
subjects:
  - kind: ServiceAccount
    name: overprovisioning-sa
    namespace: default # a ServiceAccount subject requires a namespace; change it if you deploy elsewhere

Pause pod deployment

Creating pause pods is an easy task. You can use a custom image to set up a healthy pod that acts as a placeholder in the cluster. The pod's CPU and memory configurations can be adjusted based on your needs. Make sure to calculate the appropriate size so that the pause pods effectively block cluster resources.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: overprovisioning
  name: overprovisioning
spec:
  selector:
    matchLabels:
      app: overprovisioning
  template:
    metadata:
      labels:
        app: overprovisioning
    spec:
      containers:
        - image: nginx # or any custom placeholder image
          name: pause
          resources:
            limits:
              cpu: Ym
              memory: YMi
            requests:
              cpu: Xm
              memory: XMi
      priorityClassName: overprovisioning

Autoscaler deployment

Proceed with the deployment of the autoscaler. These pods supervise the replica count of the above pause pod deployment, based on the linear strategy employed by the autoscaler. This mechanism expands or shrinks the replicas and thus reserves cluster resources through the pause pods. Deploy it using the manifest below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning-as
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overprovisioning-as
  template:
    metadata:
      labels:
        app: overprovisioning-as
    spec:
      containers:
        - command:
            - /cluster-proportional-autoscaler
            - '--namespace=XYZ'
            - '--configmap=overprovisioning-cm'
            - '--target=deployment/overprovisioning'
            - '--logtostderr=true'
            - '--v=2'
          image: gcr.io/google_containers/cluster-proportional-autoscaler-amd64:{LATEST_RELEASE}
          name: autoscaler
      serviceAccountName: overprovisioning-sa

You now have an overprovisioning mechanism that allows you to allocate more resources than necessary to your cluster. To verify if it’s working correctly, you can perform the below test.

Testing the functionality

To prevent the need for scaling the entire cluster, execute the tests on a single node within the cluster. Identify the node with running pause pods and direct the creation of new test pods to this specific node through node affinity or a nodeSelector, as sketched below.
Do not define any PriorityClass in this test deployment; Kubernetes will automatically assign it a priority of 0. Meanwhile, our pause pods are configured with a priority of -1, i.e. a lower priority than regular workload pods or these test pods.
Creating this deployment should trigger the eviction of the pause pods on the designated node, making room for the new, higher-priority test pods.
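
A minimal test deployment might look like the sketch below. The node name, replica count, image, and resource requests are placeholders/assumptions; size the requests high enough that the node can only fit the test pods if the pause pods are preempted:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning-test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: overprovisioning-test
  template:
    metadata:
      labels:
        app: overprovisioning-test
    spec:
      nodeSelector:
        kubernetes.io/hostname: <node-with-pause-pods> # hypothetical; use your node's name
      containers:
        - name: test
          image: nginx
          resources:
            requests:
              cpu: 200m
              memory: 200Mi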

The pause pods should be terminated, and the new test pods should swiftly transition into the Running state on the designated node. The evicted pause pods will be re-created as Pending by their respective ReplicaSet and will search for a place on another node to run.

Basics of Overprovisioning in EKS Cluster

This article talks about the fundamental concepts of overprovisioning within a Kubernetes Cluster. We will explore the definition of overprovisioning, its necessity, and how to calculate various aspects related to it. So, without further delay, let’s dive right in.

Overprovisioning basics!

Need of Overprovisioning

It’s a methodology for preparing your cluster for future demands from hosted applications to prevent potential bottlenecks.

Let’s consider a scenario in which the Kubernetes-hosted application needs to increase the number of pods (horizontal scaling) beyond the cluster’s available resources. As a result, additionally spawned pods end up in a pending state because there are not enough resources on the cluster to schedule them. Even if you are using the Elastic Kubernetes Service (EKS) Cluster Autoscaler (referred to as CA), there is a minimum 10-second delay for CA to recognize the need for more capacity and communicate this requirement to the Auto Scaling Group (ASG). Furthermore, there is an additional delay as the ASG scales out, launches a new EC2 instance, goes through the boot-up process, executes necessary bootstrap scripts, and is marked as READY by Kubernetes in the cluster. This entire process typically takes a minute or two, during which time application pods remain in a pending state.

To avoid these delays and ensure immediate capacity availability for unscheduled pods, overprovisioning can be employed. This is accomplished through the use of pause pods.

Concept of pause pods

Pause pods are non-essential, low-priority pods that are created to reserve cluster resources, such as CPU, memory, or IP addresses. When critical pods require this reserved capacity, the scheduler evicts these low-priority pause pods, allowing the critical pods to utilize the freed-up resources. But, what happens to these evicted pause pods?

After being evicted, these pause pods are automatically re-created by their respective replica set and initially start in a pending state. At this point, the Cluster Autoscaler (CA) intervenes, as explained earlier, to provide the additional capacity required. Since pause pods do not serve any specific applications, it is acceptable for them to remain in a pending state for a certain period. Once the new capacity becomes available, these pause pods consume it, effectively reserving it for future requirements.

How does scale-in work with Pause pods?

Now that we’ve grasped how pause pods assist in scenarios requiring cluster scale-out, the next question arises: could these pause pods potentially hold onto resources unnecessarily and block your cluster’s scale-in actions? Here’s the scenario: when the Cluster Autoscaler (CA) identifies nodes with light utilization (perhaps containing only pause pods), it proceeds to evict these low-priority pause pods as part of the node termination process (a scale-in action). Subsequently, these evicted pods are re-created in a pending state. However, during this period, the node count has decreased by one, and the cluster-proportional-autoscaler (CPA) recalculates the new required number of pause pods. This number is typically lower, resulting in the termination of the newly pending pause pods.

Pause pod calculations

The pause pod deployment should be scaled by the cluster-proportional-autoscaler (CPA). Set it to use linear mode by defining the below configuration in the respective ConfigMap:

linear:
  {
    "coresPerReplica": 1,
    "nodesPerReplica": 1,
    "min": 1,
    "max": 50,
    "preventSinglePointFailure": true,
    "includeUnschedulableNodes": true
  }

This configuration means:

  • coresPerReplica: one pause pod per CPU core in the cluster.
  • nodesPerReplica: one pause pod per node.
  • min: at least one pause pod.
  • max: a maximum of 50 pause pods.

When both coresPerReplica and nodesPerReplica are used, the system calculates both values and selects the greater of the two. Let’s calculate for a cluster with 4 nodes, each using the m7g.xlarge instance type, which has 4 cores per node:

  • 4 nodes, meaning 4 pause pods (one per node).
  • 16 cores, which equates to 16 pause pods (one per core).

So, in this case, the cluster-proportional-autoscaler will spawn a total of 16 pause pods for the cluster.
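
In general, the replica count that cluster-proportional-autoscaler computes in linear mode (before the min/max clamps are applied) is:

replicas = max( ceil( cores / coresPerReplica ), ceil( nodes / nodesPerReplica ) )

For this example: max( ceil(16/1), ceil(4/1) ) = max(16, 4) = 16 pause pods.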

Now, let’s explore the process of calculating the CPU request configuration for Pause pods and, as a result, determine the overprovisioned capacity of the cluster.

Let’s say each individual pause pod is set to request 200 milliCPU (mCPU); from the cluster’s computing-resources point of view, that amounts to 20% of a single CPU core’s capacity. Given that we are using one pause pod per CPU core, this effectively overprovisions 20% of the entire cluster’s computational resources.
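
In the pause pod deployment, that sizing translates into a request block like the below sketch (the memory figure is an arbitrary assumption):

          resources:
            requests:
              cpu: 200m
              memory: 200Mi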

Depending on the criticality and frequency of spikes in the applications running on the cluster, you can assess the overprovisioning capacity and compute the corresponding configurations for the pause pods.

What is PDB in Kubernetes?

Ever wondered what PDB, i.e. Pod Disruption Budget, is in the Kubernetes world? Then this small post is just for you!

PDB foundation!

PDB, i.e. Pod Disruption Budget, is a method to make sure a minimum number of Pods is always available for a certain application in the Kubernetes cluster. That is a kind of one-liner explanation of PDB 🙂 Let’s dive deeper and understand what PDB is, what PDB offers, whether you should define a PDB for your applications, etc.

What is Pod Disruption Budget?

The ReplicaSet in Kubernetes helps us keep multiple replicas of the same Pod to handle load or to add an extra layer of availability to containerized applications. But those replicas can be tossed around during cluster maintenance or scaling actions if you don’t tell the control plane (Kubernetes master/Kubernetes API server) how they should be terminated.

The PDB is a way to let the control plane know how the Pods of a certain ReplicaSet should be terminated. The PDB is a Kubernetes kind that is typically associated with a Deployment.

How is PDB defined?

It’s a very small kind and offers only three fields to configure:

  • spec.selector: Defines the Pods to which the PDB applies.
  • spec.minAvailable: An absolute number or percentage. The number of Pods that should always remain in a running state during evictions.
  • spec.maxUnavailable: An absolute number or percentage. The maximum number of Pods that can be unavailable during evictions.
  • You can specify only one of spec.minAvailable or spec.maxUnavailable.

A sample Kubernetes manifest for PDB looks like this –

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: sample-pdb
  namespace: <namespace> #optional
  annotations:           #optional
    key: value 
  labels:                #optional
    key: value
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: web

Here –

  • metadata: The PDB name, the namespace in which the PDB lives, and annotations and labels that are applied to the PDB itself.
  • spec: The PDB config we discussed above.

How does PDB work?

Let’s look at how this configuration takes effect. For better understanding, we will consider a simple application that has 5 replicas.

Case 1: PDB is configured with minAvailable to be 3.

This means we are telling the control plane that it can evict at most (5 running – 3 minAvailable) 2 Pods at a time. In other words, we are allowing 2 disruptions at a time; this value is also called disruptionsAllowed. So, in a situation where the control plane needs to move all 5 Pods, it will evict 2 Pods first; once those 2 evicted Pods respawn on a new node and go into the Running state, it will evict the next 2, and lastly 1. In the process, it makes sure there are always 3 Pods in the Running state.

Case 2: PDB is configured with maxUnavailable to be 2

It has the same effect as above! Basically, you are telling the control plane that at any given point in time 2 Pods can be evicted, meaning 5-2 = 3 Pods should be running!

The allowed disruptions value is calculated on the fly, and it considers only the Pods in the Running state. Continuing with the above example: if out of 5 Pods, 2 are not in a Running state (for whatever reason), then disruptionsAllowed is calculated as 3 running – 3 minAvailable = 0. Only 3 Pods are in the Running state, and none of them may be evicted, since the PDB wants a minimum of 3 Pods in the Running state at all times.

In a nutshell: disruptionsAllowed = Number of RUNNING Pods – minAvailable value

How to check Pod Disruption Budget?

One can use the below command to check the PDB –

$ kubectl get poddisruptionbudgets -n <namespace>

Then, kubectl describe can be used to get the details of each PDB fetched in the output of the previous command.
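
For the sample PDB above, the output looks something like this (values are illustrative):

$ kubectl get poddisruptionbudgets -n <namespace>
NAME         MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
sample-pdb   1               N/A               1                     5m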

Should I define PDB for my applications?

Yes, you should! It’s a good practice to calculate and properly define the PDB to make your application resilient to Cluster maintenance/scaling activities.

The bare minimum is to have minAvailable as 1 with 2 replicas; more generally, make sure that minAvailable is always less than the replica count. A wrongly configured PDB will not allow Pod evictions and may disturb cluster activities. Obviously, cluster admins can force their way in, but that means downtime for your applications.

You can also implement cluster constraints for PDBs so that new applications won’t be allowed to deploy unless they include a PDB manifest in the code as well.

Running a pod in Kubernetes

In this article, we will look at the pod concept in Kubernetes.

pods in K8s.

What is a pod in Kubernetes?

The pod is the smallest execution unit in Kubernetes. It’s a single container or a group of containers that serve a running process in the K8s cluster. Read "what is a container?" if you are not familiar with containerization.

Each pod has a single IP address that is shared by all the containers within it. The port space, too, is shared by all the containers inside.

You can view running pods in K8s by using the below command –

$ kubectl get pods
NAME        READY   STATUS    RESTARTS   AGE
webserver   1/1     Running   0          10s

View pod details in K8s

To get more detailed information on a pod, you can run the below command, supplying the pod name as an argument –

$ kubectl describe pods webserver
Name:         webserver
Namespace:    default
Priority:     0
Node:         node01/172.17.0.9
Start Time:   Sun, 05 Jul 2020 13:50:41 +0000
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           10.244.1.3
IPs:
  IP:  10.244.1.3
Containers:
  webserver:
    Container ID:   docker://8b260effa4ada1ff80e106fb12cf6e2da90eb955321bbe3b9e302fdd33b6c0d8
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:21f32f6c08406306d822a0e6e8b7dc81f53f336570e852e25fbe1e3e3d0d0133
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 05 Jul 2020 13:50:50 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bjcwg (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-bjcwg:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bjcwg
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  25s   default-scheduler  Successfully assigned default/webserver to node01
  Normal  Pulling    23s   kubelet, node01    Pulling image "nginx"
  Normal  Pulled     17s   kubelet, node01    Successfully pulled image "nginx"
  Normal  Created    16s   kubelet, node01    Created container webserver
  Normal  Started    16s   kubelet, node01    Started container webserver

pod configuration file

One can create a pod configuration file, i.e. a yml file that has all the details to start a pod. K8s reads this file and spins up your pod according to the specifications. See the sample file below –

$ cat my_webserver.yml
apiVersion: v1
kind: Pod
metadata:
  name: webserver
spec:
  containers:
    - name: webserver
      image: nginx
      ports:
        - containerPort: 80

It's a single-container pod file since we specified the spec for only one container in it.

Single container pod

A single-container pod can be run without using a yml file, with a simple command –

$ kubectl run single-c-pod --image=nginx
pod/single-c-pod created
$ kubectl get pods
NAME           READY   STATUS    RESTARTS   AGE
single-c-pod   1/1     Running   0          35s
webserver      1/1     Running   0          2m52s

You can also spin up the single-container pod using the simple yml file stated above.
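
For example (delete the earlier webserver pod first, since the manifest reuses that name) –

$ kubectl create -f my_webserver.yml
pod/webserver created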

Multiple container pod

For a multiple-container pod, let’s edit the above yml file and add another container's specs as well.

$ cat << EOF >web-bash.yml
apiVersion: v1
kind: Pod
metadata:
  name: web-bash
spec:
  containers:
    - name: apache
      image: httpd
      ports:
        - containerPort: 80
    - name: linux
      image: ubuntu
      command: ["/bin/bash", "-ec", "while true; do echo '.'; sleep 1 ; done"]
EOF

In the above file, we are spinning up a pod that has one webserver container and another Ubuntu Linux container.

$ kubectl create -f web-bash.yml
pod/web-bash created
$ kubectl get pods
NAME       READY   STATUS    RESTARTS   AGE
web-bash   2/2     Running   0          12s

How to delete pod

It's a simple delete command –

$ kubectl delete pods web-bash
pod "web-bash" deleted

How to view pod logs in Kubernetes

I am running a single-container pod of Nginx. We will then check the pod logs to view its startup messages.

$ kubectl run single-c-pod --image=nginx
pod/single-c-pod created
$ kubectl logs single-c-pod
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Configuration complete; ready for start up

Lab setup for Ansible testing

Quick lab setup for learning Ansible using containers on an Oracle VirtualBox Linux VM.

Setting up a lab for learning Ansible

In this article, we will be setting up our lab using Docker containers for testing Ansible. We are using Oracle VirtualBox so that you can spin up a VM from a readymade OVA file in a minute; this saves the effort of installing the OS from scratch. Secondly, we will be spinning up a couple of containers to be used as Ansible clients. Since we only need to test Ansible by running a few remote commands/modules, it’s best to have containers working as clients rather than spinning up complete Linux VMs. This also saves a lot of resources, and you can run this Ansible lab on your desktop/laptop for practicing Ansible.

Without further delay, let's dive into setting up a lab on your desktop/laptop for learning Ansible. Roughly, it’s divided into the below sections –

  1. Download Oracle Virtualbox and OVA file
  2. Install Oracle Virtualbox and spin VM from OVA file
  3. Run containers to work as ansible clients
  4. Test connectivity via passwordless SSH access from Ansible worker to clients

Step 1. Download Oracle Virtualbox & OEL7 with Docker readymade OVA file

Go to VirtualBox downloads and download VirtualBox for your OS.

Go to Oracle Downloads and download the Oracle Linux 7 with Docker 1.12 Hands-On Lab Appliance file. This will help us spin up a VM in Oracle VirtualBox without much hassle.

Step 2. Install Oracle Virtualbox and start VM from OVA file

Install Oracle VirtualBox. It's a pretty standard setup procedure, so I am not getting into it. Once you download the above OVA file, open it in Oracle VirtualBox and it will open the Import Virtual Appliance menu like below –

Import Virtual Appliance menu

Click Import. Agree to the software license agreement shown, and it will start importing the OVA as a VM. After the import finishes, you will see a VM named DOC-1002902 (the same name as the OVA file) created in your Oracle VirtualBox.

Start that VM and log in with the user whose credential details are mentioned in the documentation link on the download page of the OVA file.

Step 3. Running containers

For running containers, you need to set up Docker Engine first on the VM. All steps are listed in the same documentation I mentioned above, where you looked up your first login credentials. Also, you can follow our Docker installation guide if you want.

Then create a key pair on your VM, i.e. the Ansible worker/server, so that the public key can be used within the containers for passwordless SSH. We will be using ansible-usr as the Ansible user in our setup, so you will see this user henceforth. Read how to configure the Ansible default user.

[root@ansible-srv .ssh]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
98:42:9a:82:79:ac:74:7f:f9:31:71:2a:ec:bb:af:ee root@ansible-srv.kerneltalks.com
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|    .            |
|.o +   o         |
|+.=.. o S. .     |
|.+. ... . +      |
|.    . = +       |
|      o o o      |
|      oE=o       |
+-----------------+

Now that we have the key pair ready, let's move on to containers.

Once your Docker Engine is installed and started, create a custom Docker image using the Dockerfile mentioned below, which we will use to spin up multiple containers (Ansible clients). The below Dockerfile is taken from this link and modified a bit for setting up passwordless SSH. This Dockerfile also answers the question of how to configure passwordless SSH for containers!

FROM ubuntu:16.04

# install and prepare the SSH server
RUN apt-get update && apt-get install -y openssh-server
RUN mkdir /var/run/sshd
RUN echo 'root:password' | chpasswd
RUN sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config

# SSH login fix. Otherwise user is kicked off after login
RUN sed 's@session\s*required\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshd

ENV NOTVISIBLE "in users profile"
RUN echo "export VISIBLE=now" >> /etc/profile

# create the ansible user and authorize the host's public key for passwordless SSH
RUN useradd -m -d /home/ansible-usr ansible-usr
RUN mkdir /home/ansible-usr/.ssh
COPY .ssh/id_rsa.pub /home/ansible-usr/.ssh/authorized_keys
RUN chown -R ansible-usr:ansible-usr /home/ansible-usr/.ssh
RUN chmod 700 /home/ansible-usr/.ssh
RUN chmod 640 /home/ansible-usr/.ssh/authorized_keys

EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]

Keep the above file as Dockerfile in /root and then run the below command while you are in /root. If you are in some other directory, make sure you adjust the relative path in the COPY instruction in the above Dockerfile.

[root@ansible-srv ~]# docker build -t eg_sshd .

This command will create a custom Docker image named eg_sshd. Now you are ready to spin up containers using this custom Docker image.

We will start containers in below format –

  1. Webserver
    1. k-web1
    2. k-web2
  2. Middleware
    1. k-app1
    2. k-app2
  3. Database
    1. k-db1

So, in total, 5 containers spread across different groups with different hostnames, which we can use for testing different configs/actions in Ansible.

I am listing the command for the first container only. Repeat it for the rest of the 4 servers.

[root@ansible-srv ~]# docker run -d -P --hostname=k-web1 --name k-web1 eg_sshd
e70d825904b8c130582c0c52481b6e9ff33b18e0ba8ab47f12976a568587087b

It is working!

Now, spin up all 5 containers. Verify all containers are running and note down their ports.

[root@ansible-srv ~]# docker container ls -a
CONTAINER ID        IMAGE               COMMAND               CREATED              STATUS              PORTS                   NAMES
2da32a4706fb        eg_sshd             "/usr/sbin/sshd -D"   5 seconds ago        Up 3 seconds        0.0.0.0:32778->22/tcp   k-db1
75e2a4bb812f        eg_sshd             "/usr/sbin/sshd -D"   39 seconds ago       Up 33 seconds       0.0.0.0:32776->22/tcp   k-app2
40970c69348f        eg_sshd             "/usr/sbin/sshd -D"   50 seconds ago       Up 47 seconds       0.0.0.0:32775->22/tcp   k-app1
4b733ce710e4        eg_sshd             "/usr/sbin/sshd -D"   About a minute ago   Up About a minute   0.0.0.0:32774->22/tcp   k-web2
e70d825904b8        eg_sshd             "/usr/sbin/sshd -D"   4 minutes ago        Up 4 minutes        0.0.0.0:32773->22/tcp   k-web1

Step 4. Passwordless SSH connectivity between Ansible server and clients

This is an important step for the smooth & hassle-free functioning of Ansible. You need to create the ansible user on the Ansible server & clients, then configure passwordless SSH (using keys) for that user.

Now you need to get the IP addresses of your containers. You can inspect a container and extract that information –

[root@ansible-srv ~]# docker inspect k-web1 |grep IPAddress
            "SecondaryIPAddresses": null,
            "IPAddress": "172.17.0.2",
                    "IPAddress": "172.17.0.2",

Now we have an IP address, let’s test the passwordless connectivity –

[root@ansible-srv ~]# ssh ansible-usr@172.17.0.2
Welcome to Ubuntu 16.04.6 LTS (GNU/Linux 4.1.12-37.5.1.el7uek.x86_64 x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

Last login: Wed Jan 15 18:57:38 2020 from 172.17.0.1
$ hostname
k-web1
$ exit
Connection to 172.17.0.2 closed.

It’s working! Go ahead and test it for all the rest, so that each client's authenticity is confirmed and its RSA fingerprint is saved to the known-hosts list. Now we have all 5 client containers running, with passwordless SSH set up between the Ansible server and clients for user ansible-usr. They can be described in a hypothetical inventory as shown below.
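
A sketch of such an inventory (the container IPs beyond k-web1 are assumptions; use the addresses you extracted with docker inspect):

[web]
k-web1 ansible_host=172.17.0.2
k-web2 ansible_host=172.17.0.3

[app]
k-app1 ansible_host=172.17.0.4
k-app2 ansible_host=172.17.0.5

[db]
k-db1 ansible_host=172.17.0.6

[all:vars]
ansible_user=ansible-usr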

Now you have a full lab setup ready on your desktop/laptop within Oracle VirtualBox for learning Ansible! The lab has a VM running in Oracle VirtualBox, which is your main Ansible server/worker, and 5 containers running within it acting as Ansible clients. This setup fulfills the prerequisite of configuring passwordless SSH for Ansible.

Kubernetes installation and configuration

Step by step guide for Kubernetes installation and configuration along with sample outputs.

Kubernetes installation guide

Pre-requisite

  • The basic requirement to run Kubernetes is that your machine should not have SWAP configured; if it is, you need to turn it off using swapoff -a (see the command sketch after the ports table below).
  • You will need Docker installed on your machine.
  • You will need to set SELinux to permissive mode to enable kubelet network communication. You can write an SELinux policy for Kubernetes later and re-enable it normally.
  • Your machine should have at least 2 CPUs.
  • Kubernetes ports should be open between master and nodes for cluster communication. All are TCP ports and need to be open for inbound traffic:
Ports          Description
10250          Kubelet API (for master and nodes)
10251          kube-scheduler
10252          kube-controller-manager
6443           Kubernetes API server
2379-2380      etcd server client API
30000-32767    NodePort Services (only for nodes)
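
For reference, the SWAP and SELinux prerequisites can be handled with commands along the below lines (a sketch; also remove the swap entry from /etc/fstab to make the change permanent):

# turn off swap
swapoff -a

# set SELinux to permissive mode
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config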

Installation of Kubernetes master node Kubemaster

First step is to install three pillar packages of Kubernetes which are :

  • kubeadm – Bootstraps the Kubernetes cluster
  • kubectl – The CLI for managing the cluster
  • kubelet – The service running on all nodes, which helps manage the cluster by performing tasks

To download these packages, you need to configure the repo for them. Below are the repo file contents for the respective distributions.

For RedHat, CentOS, or Fedora (YUM based) –

root@kerneltalks # cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
root@kerneltalks # yum install -y kubectl kubeadm kubelet

For Ubuntu, Suse or Debian (APT based)-

sudo apt-get update && sudo apt-get install -y apt-transport-https gnupg2
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubectl kubeadm kubelet

Once you have configured the repo, install the kubeadm, kubectl, and kubelet packages using your distribution's package manager.

Enable and start kubelet service

root@kerneltalks # systemctl enable kubelet.service
root@kerneltalks # systemctl start kubelet

Configuration of Kubernetes master node Kubemaster

Now you need to make sure both Docker and Kubernetes use the same cgroup driver. By default, it's cgroupfs for both; if you haven't changed it for Docker, then you don't have to do anything for Kubernetes either. But if you are using a different cgroup driver in Docker, you need to specify it for Kubernetes in the below file –

root@kerneltalks # cat /etc/default/kubelet
KUBELET_KUBEADM_EXTRA_ARGS=--cgroup-driver=<value>

This file is picked up by kubeadm while starting up. But if Kubernetes is already running, you need to reload this configuration using –

root@kerneltalks # systemctl daemon-reload
root@kerneltalks # systemctl restart kubelet

Now you are ready to bring up the Kubernetes master and then add worker nodes or minions to it as slaves in the cluster.

You have installed and adjusted settings to bring up the Kubemaster. You can start it using the command kubeadm init, but you need to provide the pod network CIDR the first time:

  • --pod-network-cidr= : For pod network
  • --apiserver-advertise-address= : Optional. To be used when multiple IP addresses/subnets assigned to the machine.

Refer to the below output for starting up the Kubernetes master node. There are a few warnings, which can be corrected with basic sysadmin tasks.

# kubeadm init --apiserver-advertise-address=172.31.81.44 --pod-network-cidr=192.168.1.0/16
[init] using Kubernetes version: v1.11.3
[preflight] running pre-flight checks
I0912 07:57:56.501790 2443 kernel_validator.go:81] Validating kernel version
I0912 07:57:56.501875 2443 kernel_validator.go:96] Validating kernel config
        [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.05.0-ce. Max validated version: 17.03
        [WARNING Hostname]: hostname "kerneltalks" could not be reached
        [WARNING Hostname]: hostname "kerneltalks" lookup kerneltalks1 on 172.31.0.2:53: no such host
        [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[preflight] Activating the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [kerneltalks1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.31.81.44]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [kerneltalks1 localhost] and IPs [127.0.0.1 ::1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [kerneltalks1 localhost] and IPs [172.31.81.44 127.0.0.1 ::1]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] this might take a minute or longer if the control plane images have to be pulled
[apiclient] All control plane components are healthy after 46.002127 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.11" in namespace kube-system with the configuration for the kubelets in the cluster
[markmaster] Marking the node kerneltalks1 as master by adding the label "node-role.kubernetes.io/master=''"
[markmaster] Marking the node kerneltalks1 as master by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "kerneltalks1" as an annotation
[bootstraptoken] using token: 8lqimn.2u78dcs5rcb1mggf
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node as root:

  kubeadm join 172.31.81.44:6443 --token 8lqimn.2u78dcs5rcb1mggf --discovery-token-ca-cert-hash sha256:de6bfdec100bb979d26ffc177de0e924b6c2fbb71085aa065fd0a0854e1bf360

In the above output there are two key things you get –

  • Commands to enable the regular user to administer Kubemaster
  • Command to run on slave node to join Kubernetes cluster

That’s it. You have successfully started the Kubemaster node and brought up your Kubernetes cluster. The next task is to install and configure your secondary nodes in this cluster.

Installation of Kubernetes slave node or minion

The installation process remains the same. Follow the steps for disabling SWAP, installing Docker, and installing the 3 Kubernetes packages.

Configuration of Kubernetes slave node minion

There is not much to do on this node. You already have the command to run on this node for joining the cluster, which was spat out by the kubeadm init command.

Let's see how to join a node to the Kubernetes cluster using the kubeadm command –

[root@minion ~]# kubeadm join 172.31.81.44:6443 --token 8lqimn.2u78dcs5rcb1mggf --discovery-token-ca-cert-hash sha256:de6bfdec100bb979d26ffc177de0e924b6c2fbb71085aa065fd0a0854e1bf360
[preflight] running pre-flight checks
I0912 08:19:56.440122 1555 kernel_validator.go:81] Validating kernel version
I0912 08:19:56.440213 1555 kernel_validator.go:96] Validating kernel config
[discovery] Trying to connect to API Server "172.31.81.44:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://172.31.81.44:6443"
[discovery] Failed to request cluster info, will try again: [Get https://172.31.81.44:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: net/http: TLS handshake timeout]
[discovery] Requesting info from "https://172.31.81.44:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "172.31.81.44:6443"
[discovery] Successfully established connection with API Server "172.31.81.44:6443"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.11" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "minion" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to master and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

And here you go. The node has joined the cluster successfully. Thus, you have completed the Kubernetes cluster installation and configuration!

Check the node status from the Kubemaster.

[root@kerneltalks ~]# kubectl get nodes
NAME           STATUS     ROLES     AGE       VERSION
kerneltalks1   Ready      master    2h        v1.11.3
minion         Ready      <none>    1h        v1.11.3

Once you see all statuses as Ready, you have a steady cluster up and running.

Difference between Docker swarm and Kubernetes

Learn the difference between Docker Swarm and Kubernetes: a comparison of the two container orchestration platforms in tabular form.

Docker Swarm v/s Kubernetes

When you are on the learning curve of application containerization, there comes a stage when you encounter orchestration tools for containers. If you started your learning with Docker, then Docker Swarm is the first cluster management tool you must have learned, and then Kubernetes. So it's time to compare Docker Swarm and Kubernetes. In this article, we will quickly see what Docker Swarm is, what Kubernetes is, and then a comparison between the two.

What is Docker swarm?

Docker Swarm is a tool native to Docker, aimed at cluster management of Docker containers. Docker Swarm enables you to build a cluster of multiple VMs or physical machines running the Docker Engine. In turn, you can run containers on multiple machines to facilitate a highly available, fault-tolerant environment. It's pretty simple to set up and native to Docker.

What is Kubernetes?

It’s a platform to manage containerized applications, i.e. containers, in a cluster environment, along with automation. It does almost the same job swarm mode does, but in a different and enhanced way. It was developed by Google in the first place, and the project was later handed over to the CNCF. It works with container runtimes like Docker and rkt. Kubernetes installation is a bit more complex than Swarm's.

Compare Docker and Kubernetes

If someone asks you for a comparison between Docker and Kubernetes, that's not a valid question in the first place. You cannot differentiate between Docker and Kubernetes: Docker is an engine that runs containers (and colloquially refers to the containers themselves), while Kubernetes is an orchestration platform that manages Docker containers in a cluster environment. So one cannot compare Docker and Kubernetes.

Difference between Docker Swarm and Kubernetes

I added a comparison of Swarm and Kubernetes in the below table for easy readability.

Docker Swarm                                                        | Kubernetes
--------------------------------------------------------------------|--------------------------------------------------------------------
Docker's own orchestration tool                                     | Google's open-source orchestration tool
Younger than Kubernetes                                             | Older than Swarm
Simple to set up, being a tool native to Docker                     | A bit complex to set up, but once done offers more functionality than Swarm
Smaller community around it, but Docker has excellent documentation | Being Google's product and older, it has huge community support
Simple application deployment in the form of services               | A bit more complex application deployment through pods, deployments, and services
Has only a command-line interface for management                    | Offers a GUI in addition to the CLI
Monitoring is available using third-party applications              | Offers native and third-party options for monitoring and logging
Much faster than Kubernetes                                         | Being a complex system, its deployments are a bit slower than Swarm's

DCA – Docker Certified Associate Certification guide

A small guide to help aspirants prepare for the Docker Certified Associate certification.

Docker Certified Associate Certification guide

I recently cleared the DCA – Docker Certified Associate – certification and wanted to share my experience here on my blog. This might be helpful for folks who are going to appear for the examination soon, and may inspire containerization aspirants to take it.

DCA details :

Complete certification details and FAQs can be found here on their official website.

  • Duration: 90 minutes
  • Type: Multiple choice questions
  • Number of questions: 55
  • Mode of exam: Remotely proctored
  • Cost: $195 (For India residents, add 18% GST, which comes to roughly 16-17K INR.)

Preparation

The Docker Certified Associate certification aims at certifying professionals having enterprise-level experience with Docker for a minimum of a year. When you start learning Docker, you mostly start off with CE (Community Edition), which comes free, or you practice on Play with Docker, which also serves CE Docker. You should not attempt this certification with knowledge or experience of CE only. This certification is designed to test your knowledge of the Enterprise Edition of Docker, which is fully feature-packed and has a paid license tagged to it.

So it is expected that you have a minimum of 6 months to a year of experience with Docker EE in an enterprise environment before you attempt the certification. Docker also offers a trial EE license, which you can use to start off with EE Docker learning. Only attempt the certification once you are through with all the syllabus mentioned on the website and well versed in the Docker enterprise world.

You can register for the examination from the website, which will redirect you to their vendor Examity's website. There you need to register for the exam in an available time slot and make the payment online. You can even book a time which is within 24 hours, but such slots are not always available. Make sure your computer meets all the pre-requisites so that you can take the exam without any hassle. You can even connect with the exam vendor well before the exam and get your computer checked for compatibility with the exam software/plugin.

Docker’s official study guide walks you through the syllabus so that you can prepare yourself accordingly.

During Exam

You can take this exam from anywhere, provided you have a good internet connection and your surroundings abide by the rules mentioned on the certification website, like an empty room, clean desk, etc. As this exam is remotely proctored, there will be an executive monitoring your screen, webcam, and mic remotely in real time. So make sure you have a calm place and an empty room before you start the exam. You should not eat, use a cellphone or similar electronic devices, talk, etc. during the exam.

Exam questions are carefully designed by professionals to test your knowledge in all areas. Do not expect only command- and option-type questions. There is a good mix of logical, conceptual, and practical application questions. Some questions may have multiple answers, so keep an eye on such questions and do not miss selecting more than one answer.

After exam

Your examination scorecard will be displayed immediately and the result will be shown to you. You can have it by email. The actual certificate takes 3 minutes before it hits your inbox! Do check spam if you don't receive it before you escalate to the Docker Certification Team (certification@docker.com).

All the best! Do share your success stories in the comments below.

How Docker container DNS works

Learn about Docker DNS. How does Docker container DNS work? How do you change the nameserver in a Docker container to use an external DNS?

Docker DNS

Docker has an inbuilt DNS that automatically resolves container names to IPs in user-defined networks. But what if you want to use an external DNS in a container for some project need? Or use an external DNS in all the containers run on a host? In this article, we will walk you through the below points:

  1. Docker native DNS
  2. Nameservers in Docker
  3. How to use external DNS in the container while starting it
  4. How to use external DNS in all the containers on a docker host

Docker native DNS

In a user-defined docker network, DNS resolution of container names happens automatically. You don't have to do anything: if your containers are on your own defined docker network, they can find each other by hostname automatically.

We have 2 Nginx containers running on my newly created docker network named kerneltalks, set up along the lines of the sketch below. Both Nginx containers have the ping utility installed.
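
For reference, such a setup can be created with the standard commands below (a sketch; the ping utility was installed inside the containers separately):

$ docker network create kerneltalks
$ docker container run -d --network kerneltalks --name nginx1 nginx
$ docker container run -d --network kerneltalks --name nginx2 nginx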

$ docker container ls
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
1b1bb99559ac        nginx               "nginx -g 'daemon of…"   27 minutes ago      Up 27 minutes       80/tcp              nginx2
239c662d3945        nginx               "nginx -g 'daemon of…"   27 minutes ago      Up 27 minutes       80/tcp              nginx1

$ docker network inspect kerneltalks
"Containers": {
"1b1bb99559ac21e29ae671c23d46f2338336203c96874ac592431f60a2e6a5de": {
"Name": "nginx2",
"EndpointID": "4141f56fe878275e322b9283476508d1135e813d12ea2b7d87a5c3d0db527f79",
"MacAddress": "02:42:ac:13:00:05",
"IPv4Address": "172.19.0.5/16",
"IPv6Address": ""
},
"239c662d3945031413e4c69b99e3ddde57832004bd6193bdbc30bd5e6ca6f4e2": {
"Name": "nginx1",
"EndpointID": "376da79e6746cc80d178f4363085e521a9d45c65df08248b77c1bc744b495ae4",
"MacAddress": "02:42:ac:13:00:04",
"IPv4Address": "172.19.0.4/16",
"IPv6Address": ""
},

And they can ping each other without any extra DNS effort, since user-defined networks have an inbuilt DNS that resolves container names to IP addresses.

$ docker exec -it nginx1 ping nginx2
PING nginx2 (172.19.0.5) 56(84) bytes of data.
64 bytes from nginx2.kerneltalks (172.19.0.5): icmp_seq=1 ttl=64 time=0.151 ms
64 bytes from nginx2.kerneltalks (172.19.0.5): icmp_seq=2 ttl=64 time=0.053 ms

$ docker exec -it nginx2 ping nginx1
PING nginx1 (172.19.0.4) 56(84) bytes of data.
64 bytes from nginx1.kerneltalks (172.19.0.4): icmp_seq=1 ttl=64 time=0.088 ms
64 bytes from nginx1.kerneltalks (172.19.0.4): icmp_seq=2 ttl=64 time=0.054 ms

But in the default docker bridge network (which installs with the docker daemon), automatic DNS resolution is disabled to maintain container isolation. You can add inter-container communication just by using the --link option while running a container (when on the default bridge network).

--link is a legacy feature and may be removed in upcoming releases. So it is always advisable to use user-defined networks rather than the default docker networks.

DNS nameservers in Docker

Docker is coded in a smart way. When you run a new container on the docker host without any DNS-related option in the command, it simply copies the host's /etc/resolv.conf into the container. While copying, it filters out all localhost IP addresses from the file. That's pretty obvious, since those won't be reachable from the container network, so there is no point in keeping them. If, after this filtering, no nameserver is left to add to the container's /etc/resolv.conf, the Docker daemon smartly adds Google's public nameservers 8.8.8.8 and 8.8.4.4 to the file and uses it within the container.

Also, the host's and container's /etc/resolv.conf are always kept in sync. The Docker daemon, with help from a file-change notifier, makes the necessary changes in the container's resolv file when the host's file changes. The only catch is that these changes are made only if the container is not running; to pick up the changes, you need to stop and start the container again. All stopped containers are updated immediately after the host's file changes.

How to use external DNS in container while starting it

If you want to use an external DNS in a container, other than docker's native one or what's in the host's resolv.conf file, you need to use the --dns switch with the docker container run command.

$ docker container run -d --dns 10.2.12.2 --name nginx5 nginx
fbe29f22bd5f78213163532f2529c5cd98bc04573a626d0e864e670f96c5dc7a

$ docker exec -it nginx5 cat /etc/resolv.conf
search 51ur3jppi0eupdptvsj42kdvgc.bx.internal.cloudapp.net
nameserver 10.2.12.2
options ndots:0

In the above example, we chose to have nameserver 10.2.12.2 in the container we ran. And you can see /etc/resolv.conf inside the container has this new nameserver in it. Make a note that whenever you use the --dns switch, it wipes out all existing nameserver entries within the container and keeps only the one you supply.

This is the way to use a custom DNS in a single container. But if you want this custom DNS in all containers run on your docker host, you need to define it in the config file. We will see this in the next point.

How to use external DNS in all the containers on docker host

You need to define the external DNS IPs in the docker daemon configuration file /etc/docker/daemon.json as below –

{
    "dns": ["10.2.12.2", "3.4.5.6"]
}

Once the changes are saved in the file, you need to restart the docker daemon to pick them up.

root@kerneltalks # systemctl restart docker

and it's done! Now any container you run fresh on your docker host will have these two DNS nameservers in it by default.

$ docker container run -d --name nginx7 nginx
200d024ac8930c5bfe59fdbc90a1d4d0e8cd6d865f82096c985e23f1e022d548

$ docker exec -it nginx7 cat /etc/resolv.conf
search 51ur3jppi0eupdptvsj42kdvgc.bx.internal.cloudapp.net
options ndots:0

nameserver 10.2.12.2
nameserver 3.4.5.6

If you have any queries/feedback/corrections, let us know in the comment box below.

Docker swarm cheat sheet

Docker swarm cheat sheet: a list of all commands to create, run, and manage a container cluster environment with Docker Swarm!

Docker swarm cheat-sheet

Docker swarm is a cluster environment for Docker containers. A swarm is created from a number of machines running docker daemons, which are collectively managed by a master node to run a clustered environment for containers!

In this article, we list all the currently available docker swarm commands in a very short overview. This is a cheat sheet you can glance through to brush up your swarm knowledge, or use as a quick reference for any swarm management command. We cover the most used or useful switches for the below commands; more switches are available for each command, and you can get them with --help.

Read all docker or containerization related articles here from KernelTalk’s archives.

Docker swarm commands for swarm management

This set of commands is used mainly to start and manage the swarm cluster as a whole. For node management within the cluster, we have a different set of commands following this section.

  • docker swarm init : Initiate a swarm cluster
    • --advertise-addr: Advertised address on which the swarm lives
    • --autolock: Locks the manager and displays the key which will be needed to unlock a stopped manager
    • --force-new-cluster: Create a new cluster from a backup and don't attempt to connect to old known nodes
  • docker swarm join-token: Lists the join security token for joining another node to the swarm as a worker or manager
    • --quiet: Only display the token. By default, it displays the complete command to be used along with the token.
    • --rotate: Rotate (change) the token for security reasons.
  • docker swarm join: Join an already running swarm as a worker or manager
    • --token: Security token to join the swarm
    • --availability: Mark the node’s status as active/drain/pause after joining
  • docker swarm leave: Leave the swarm. To be run from the node itself
    • -f: Leave forcefully, ignoring all warnings.
  • docker swarm unlock: Unlocks the swarm by providing the key after a manager restarts
  • docker swarm unlock-key: Display the swarm unlock key
    • -q: Only display the token.
    • --rotate: Rotate (change) the token for security reasons.
  • docker swarm update: Updates swarm configurations
    • --autolock: true/false. Turns locking on or off if not done while initiating.
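
For instance, a typical cluster bootstrap using these commands looks like the below sketch (the IP address is a hypothetical example):

$ docker swarm init --advertise-addr 192.168.1.10
$ docker swarm join-token worker
# run the printed 'docker swarm join --token <token> 192.168.1.10:2377' command on each worker node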

Docker swarm node commands for swarm node management

A node is a server participating in a Docker swarm. A node can be either a worker or a manager in the swarm. A manager node has the ability to manage swarm nodes and services along with serving workloads, while worker nodes can only serve workloads.

  • docker node ls : Lists nodes in the swarm
    • -q : Only display node IDs
    • --format : Format output using a Go template
    • --filter : Apply filters to the output
  • docker node ps : Display tasks running on nodes
    • All the switches above apply here too.
  • docker node promote : Promote a node to the manager role
  • docker node demote : Demote a node from manager to worker role
  • docker node rm : Remove a node from the swarm. Run from a manager node.
    • -f : Force remove
  • docker node inspect : Detailed information about the node
    • --format : Format output using a Go template
    • --pretty : Print in a human-readable, friendly format
  • docker node update : Update node configs
    • --role : worker/manager. Update the node role
    • --availability : active/pause/drain. Set the node state.

Docker swarm service commands for swarm service management

Docker services are used to create and spawn workloads on swarm nodes.

  • docker service create : Start a new service in the Docker swarm
    • Switches of the docker container run command like -i (interactive), -t (pseudo terminal), -d (detached), -p (publish port), etc. are supported here.
  • docker service ls : List services
    • The --filter, --format, and -q (quiet) switches we saw above are supported with this command.
  • docker service ps : Lists tasks of services
    • The --filter, --format, and -q (quiet) switches we saw above are supported with this command.
  • docker service logs : Display logs of a service or task
  • docker service rm : Remove a service
    • -f : Force remove
  • docker service update : Update service config
    • Most of the parameters defined in the service create command can be updated here.
  • docker service rollback : Revert changes made to a service config.
  • docker service scale : Scale one or more replicated services.
    • servicename=number format
  • docker service inspect : Detailed information about a service.
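
To tie these together, a minimal service lifecycle might look like the below sketch (the service name web is a hypothetical example):

$ docker service create --name web --replicas 3 -p 80:80 nginx
$ docker service ls
$ docker service scale web=5
$ docker service ps web
$ docker service rm web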