DCA - Section 4 Orchestration
ISSUED BY
Zeal Vora
Module 1: Overview of Container Orchestration
Container orchestration is all about managing the life cycles of containers, especially in large,
dynamic environments.
Container orchestration can be used to perform a wide variety of tasks.
There are many container orchestration solutions available; some of the popular ones include:
● Docker Swarm
● Kubernetes
● Apache Mesos
● Elastic Container Service (AWS ECS)
There are also various managed container orchestration platforms available, such as Amazon EKS (Elastic Kubernetes Service).
Module 2: Overview of Docker Swarm
Docker Swarm is a container orchestration tool that is natively supported by Docker.
To deploy your application to a swarm, you submit a service definition to a manager node.
The manager node dispatches units of work called tasks to worker nodes.
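For example, a minimal sketch of this flow (the IP address, service name, and image are placeholders):

# Initialize a swarm on the manager node
docker swarm init --advertise-addr 192.168.1.10

# Submit a service definition to the manager; it dispatches tasks to worker nodes
docker service create --name web --replicas 3 --publish 80:80 nginx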
There are two ways in which you can scale a service in a swarm: replicated services and global services.
For a replicated service, you specify the number of identical tasks you want to run. For example,
you decide to deploy an NGINX service with two replicas, each serving the same content.
For a global service, each time you add a node to the swarm, the orchestrator creates a task and the scheduler assigns the task to the new node.
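As a hedged illustration (service names and images are placeholders), the two modes look like this:

# Replicated service: run exactly 2 identical NGINX tasks
docker service create --name web --replicas 2 nginx

# Global service: run one task on every node, including nodes added later
docker service create --name agent --mode global nginx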
With Compose, you use a YAML file to configure your application’s services.
We can start all the services with a single command - docker compose up
We can stop all the services with a single command - docker compose down
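A minimal docker-compose.yml sketch (service names and images are illustrative):

# docker-compose.yml
version: "3.8"
services:
  web:
    image: nginx
    ports:
      - "8080:80"
  cache:
    image: redis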
Whenever we make use of docker service, it is typically for a single container image.
A stack is a group of interrelated services that share dependencies and can be orchestrated and
scaled together.
A stack is deployed from a Compose-style YAML file, similar to the one we define for Docker Compose.
We can define everything within the YAML file that we might define while creating a Docker
Service.
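For example, assuming the Compose-style file above and mystack as a placeholder stack name:

# Deploy the stack from a Compose-style YAML file
docker stack deploy -c docker-compose.yml mystack

# List the services that belong to the stack
docker stack services mystack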
If your Swarm is compromised and data is stored in plain text, an attacker can get access to all of the sensitive information.
There are multiple reasons why a service might go into a pending state:
If all nodes are drained, and you create a service, it is pending until a node becomes available.
You can reserve a specific amount of memory for a service. If no node in the swarm has the
required amount of memory, the service remains in a pending state until a node is available
which can run its tasks.
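A hedged sketch of such a reservation (the service name, image, and size are placeholders):

# Reserve 8 GB of memory per task; the task stays pending
# until a node with that much free memory is available
docker service create --name bigapp --reserve-memory 8GB nginx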
Swarm services provide a few different ways for you to control the scale and placement of
services on different nodes.
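For instance, a sketch combining scale and placement (the node label zone is an assumption, not a built-in label):

# Run 4 replicas, only on worker nodes carrying the custom label zone=east
docker service create --name web --replicas 4 \
  --constraint node.role==worker \
  --constraint node.labels.zone==east \
  nginx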
The overlay network driver creates a distributed network among multiple Docker daemon hosts.
An overlay network allows containers connected to it to communicate securely when encryption is enabled.
If the containers are communicating with each other, it is recommended to secure the
communication.
To enable encryption, when you create an overlay network pass the --opt encrypted flag:
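# Create an encrypted overlay network (the network name is a placeholder)
docker network create --driver overlay --opt encrypted my-overlay-network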
When you enable overlay encryption, Docker creates IPSEC tunnels between all the nodes
where tasks are scheduled for services attached to the overlay network.
These tunnels also use the AES algorithm in GCM mode and manager nodes automatically
rotate the keys every 12 hours.
● --hostname
● --mount
● --env
Using a Raft implementation, the managers maintain a consistent internal state of the entire
swarm and all the services running on it
Docker recommends you implement an odd number of manager nodes according to your organization’s high-availability requirements.
By default manager nodes also act as worker nodes. This means the scheduler can assign
tasks to a manager node.
For small and non-critical swarms assigning tasks to managers is relatively low-risk as long as
you schedule services using resource constraints for CPU and memory.
To avoid interference with manager node operation, you can drain manager nodes to make
them unavailable as worker nodes:
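# Drain a manager so it no longer receives tasks (the node name is a placeholder)
docker node update --availability drain manager-1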
When you drain a node, the scheduler reassigns any tasks running on the node to other
available worker nodes in the swarm. It also prevents the scheduler from assigning tasks to the
node.
Swarm is resilient to failures and the swarm can recover from any number of temporary node
failures (machine reboots or crash with restart) or other transient errors. However, a swarm
cannot automatically recover if it loses a quorum.
The best way to recover from losing the quorum is to bring the failed nodes back online. If you
can’t do that, the only way to recover from this state is to use the --force-new-cluster action
from a manager node.
This removes all managers except the manager the command was run from. The quorum is
achieved because there is now only one manager. Promote nodes to be managers until you
have the desired number of managers.
Sample Command:
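# Run on the surviving manager; the advertise address is a placeholder
docker swarm init --force-new-cluster --advertise-addr 192.168.1.10:2377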
Kubernetes was originally designed by Google and is now maintained by the Cloud Native Computing Foundation.
Module 20: Pods
A Pod in Kubernetes represents a group of one or more application containers and some
shared resources for those containers.
Containers within a Pod share an IP address and port space and can find each other via localhost.
A Pod always runs on a Node.
Once you create the object, the Kubernetes system will constantly work to ensure that object
exists.
YAML is designed to be human-friendly and works well with other programming languages.
Module 22: Creating First Pod Configuration in YAML
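A minimal Pod definition sketch (the names, labels, and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    env: prod
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80

The Pod can then be created from the file with kubectl apply -f pod.yaml.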
We can easily roll out new updates to our application using deployments.
Deployments perform updates in a rolling manner to ensure that your app is not down.
Deployment ensures that only a certain number of Pods are down while they are being updated.
By default, it ensures that at least 75% of the desired number of Pods are up (25% max unavailable).
maxUnavailable=10% and maxSurge=0 << Update with no extra capacity. In-place updates.
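A hedged sketch of a Deployment manifest using such a strategy (the names and image are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 4
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: "10%"   # at most 10% of the desired Pods may be down during the update
      maxSurge: 0             # no extra Pods beyond the desired count (in-place update)
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.25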
Elaborating on the types:
i) Generic, which can be created from:
● a file (--from-file)
● a directory (--from-file)
● a literal value (--from-literal)
ConfigMaps allow you to decouple configuration artifacts from image content to keep
containerized applications portable.
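For example (the names, keys, and paths are placeholders), ConfigMaps can be created from literals, files, or directories:

# From a literal value
kubectl create configmap app-config --from-literal=APP_MODE=production

# From a single file or an entire directory
kubectl create configmap app-files --from-file=./config/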
Whenever you create a Pod, the containers created will have private IP addresses.
[Diagram: high-level overview of how a Service works]
We can make use of a Service, which acts as a gateway and connects us to the right set of Pods.
A Service is an abstract way of exposing an application running in Pods as a network service. There are four types of Services:
● NodePort
● ClusterIP
● LoadBalancer
● ExternalName
With ClusterIP, an internal cluster IP is assigned, so the Service is only reachable from within the cluster.
If the Service type is NodePort, then Kubernetes allocates a port (default range: 30000-32767) on every worker node.
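A minimal NodePort Service sketch (the names, ports, and selector are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80          # cluster-internal Service port
    targetPort: 80    # container port on the matching Pods
    nodePort: 30080   # port opened on every worker node (must fall in 30000-32767)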
The Kubernetes networking model imposes the following constraints:
● Pods on a node can communicate with all Pods on all nodes without NAT.
● All nodes can communicate with all Pods without NAT.
● The IP that a Pod sees itself as is the same IP that others see it as.
Based on the constraints set, there are four different networking challenges that need to be
solved:
● Container-to-Container Networking
● Pod-to-Pod Networking
● Pod-to-Service Networking
● Internet-to-Service Networking
A Kubernetes Service acts as an abstraction that provides a single IP address and DNS name through which Pods can be accessed.
Endpoints track the IP addresses of the objects that the Service can send traffic to.
Kubernetes Ingress is a collection of routing rules that govern how external users access the services running within the Kubernetes cluster.
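A hedged Ingress sketch that routes an external hostname to a Service (the host and Service names are placeholders, and an Ingress controller must already be installed):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-svc
            port:
              number: 80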
Module 32: Liveness Probe
Many applications running for long periods of time eventually transition to broken states, and
cannot recover except by being restarted.
For example, the application is running but it is still loading its large configuration files from external vendors.
In such a case, we don’t want to kill the container; however, we also do not want it to serve traffic (this is the purpose of a readiness probe, as sketched below).
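A sketch showing both probes on one container (the paths, port, and timings are assumptions): the liveness probe restarts a broken container, while the readiness probe keeps a not-yet-ready container out of Service traffic without killing it.

apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: app
    image: nginx
    livenessProbe:            # restart the container if this check keeps failing
      httpGet:
        path: /healthz
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 5
    readinessProbe:           # stop sending traffic until this check passes
      httpGet:
        path: /ready
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5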
In order to schedule a Pod onto a tainted worker node, the Pod needs a special pass, known as a toleration.
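A minimal sketch, assuming a node named node-1 and an arbitrary key/value of dedicated=gpu:

# Taint the node so that only Pods with a matching toleration are scheduled on it
kubectl taint nodes node-1 dedicated=gpu:NoSchedule

# The matching toleration (the "special pass") added to the Pod spec
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"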
Example of a label selector query: show me all the objects that have the label env: prod.
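A hedged example of such a query with kubectl (the label is the one shown above):

# List all Pods carrying the label env=prod
kubectl get pods -l env=prod

# Or list common object types carrying that label
kubectl get all -l env=prod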
Module 37: Requests and Limits
Requests and limits are two ways in which we can control the amount of resources (such as CPU and memory) that can be assigned to a Pod.
Kubernetes Scheduler decides the ideal node to run the pod depending on the requests and
limits.
If your Pod requests 8 GB of RAM but no node within your cluster has 8 GB of RAM available, then your Pod will never get scheduled.
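A hedged sketch of requests and limits on a container (the values are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: limited-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "8Gi"    # the scheduler only picks a node with at least 8 GiB allocatable
        cpu: "500m"
      limits:
        memory: "8Gi"    # the container is killed if it exceeds this amount
        cpu: "1"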