Docker and Kubernetes
Container: A standardized unit of software, i.e. a package of code together with the dependencies needed to run that code. Docker, in the end, is a tool that simplifies the creation and management of these containers.
Reference: https://learn.microsoft.com/en-us/windows/wsl/install
WSL Commands:
wsl -l -v : To view the installed OS distributions and their state.
wsl -d <name> : To connect to a specific distribution.
Images and Containers: A container is a running unit of software, whereas an image is the template or blueprint for containers.
We first build an image with the required code and dependencies. Using that single image, we can then run multiple containers, where each one is a running instance of that image.
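A rough sketch of the idea (docker build is covered below; the tag my-app is just a placeholder for an image that runs a long-lived process):
docker build -t my-app .          # build one image from a Dockerfile
docker run -d --name app1 my-app  # first container based on that image
docker run -d --name app2 my-app  # second container based on the same image
docker ps                         # lists both running containers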
Using and Running External (Pre-Built) Images:
https://hub.docker.com/ : Here we can find pre-built images.
docker run node : This creates a node container, but it won't do anything visible because the container runs in isolation.
docker run -it node : With the -it option, we run the node container in an interactive way and can execute basic node commands.
Using pre-built images as a base, we can also write our own Dockerfile and create a custom image.
Creating a Dockerfile:
FROM <base_image> : Imports the base image (from Docker Hub or from our local machine) into the current image.
COPY <source> <dest>
COPY . . -> Specifies which files are to be copied into the image. The first " . " refers to all files in the folder containing the Dockerfile, and the second " . " is the destination path inside the image's file system where they will be copied.
Note: If the destination folder doesn't exist, Docker simply creates it.
RUN <command> : Runs the given command while the image is being built. The default working directory is the root folder of the container's file system, so unless we change it, RUN commands are executed in /.
Eg:
FROM node
COPY . /app
RUN npm install
RUN node server.js
In the above example, npm install would actually be executed in the root directory (/), not in /app, because the default working directory was never changed. To run commands inside /app, we set the working directory explicitly with WORKDIR.
WORKDIR <path> : Sets the default working directory of the container. All subsequent instructions will be executed in the path given here.
Also, in the above example, RUN node server.js is INCORRECT because it would be executed while the image is being built. The image is just a template for the container; the image is not what we run in the end. We run a container based on the image, and we only want to start the server when a container is started from the image. That is what the CMD instruction in the Dockerfile below is for.
EXPOSE <port> : This OPTIONAL instruction documents the port that the container will listen on. We specify the port here according to our server.js file.
Note: The EXPOSE 80 in the Dockerfile below is optional. It documents that a process in the container will listen on this port, but you still need to actually publish the port with -p when running docker run. So technically, -p is the only required part when it comes to listening on a port. Still, it is a best practice to also add EXPOSE in the Dockerfile to document this behavior.
FROM node
WORKDIR /app
COPY . /app
RUN npm install
EXPOSE 80
CMD ["node", "server.js"]
Image Layers:
Once an image is built, and we build it again without any changes, the new build uses the cache because Docker detects that nothing has changed. When we build an image for the first time, Docker caches the result of every instruction and reuses that cache on later builds when nothing has changed. This is called the layered structure.
When a specific layer is changed, Docker identifies the corresponding layer: it takes all the previous layers from the cache and rebuilds only from the changed layer onwards. Docker doesn't know what impact a changed layer has on the subsequent layers, so it re-executes all the layers that come after the changed one.
FROM node
WORKDIR /app
COPY package.json /app
RUN npm install
COPY . /app
EXPOSE 80
CMD ["node", "server.js"]
In the above Dockerfile, npm install only needs to be re-executed if package.json changes; for other code changes it is not required. So we can order our layers accordingly and optimize our builds.
To restart a stopped container, we can run docker start <container_name | container_id>. We can then check whether the container is running with docker ps.
With the docker run command, we are stuck with that process in the terminal and cannot enter any other commands. Here, the container is running in "attached" mode.
With docker start, detached mode is the default, while with docker run, attached mode is the default behaviour.
In attached mode, we can see the log statements (if there are any) in our terminal. To run in detached mode, we add the -d option to the docker run command. Similarly, to start in attached mode, we add the -a option to the docker start command.
We can also re-attach to a detached container by running docker container attach <container_name | container_id> and see the console logs.
We can also check the logs of a detached container using docker logs <container_name>.
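Putting these together, a quick sketch (the container and image names are placeholders from the build example above):
docker run -d --name my-node-app node-app   # start a container in detached mode
docker logs my-node-app                     # print the logs collected so far
docker logs -f my-node-app                  # follow the logs, similar to attached mode
docker container attach my-node-app         # re-attach to the running container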
Interactive Mode:
Docker images are not just for web applications; they can also be used to run simple Python programs or code in any other language. Now, when we build a Docker image that expects user input and simply use docker run, even though the container is in attached mode, it does not accept our input and fails with an error.
So, in order to run Docker images that expect an input, we need to run them in interactive mode (-i option). Also, we can allocate a pseudo-terminal with the -t option, which is OPTIONAL.
Once the program has executed, we can observe that the container is stopped. Now, if we simply restart the container, it is in the RUNNING state, but it is not taking any INPUT and not giving us the required results.
We need to restart the container in interactive and attached mode again to give the input and execute the program.
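A short sketch (the names input-app and input-demo are placeholders for any image whose program reads user input):
docker run -it --name input-demo input-app   # run interactively with a pseudo-terminal
docker start -a -i input-demo                # restart the stopped container, attached and interactive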
docker image prune : Removes dangling (untagged) images; with the -a option it removes all images not used by any container.
Docker Hub: To log in to Docker Hub, we can use docker login.
Since our Docker Hub repo is public, we can log out and try docker pull <repo_name> to download the image from Docker Hub to our local machine.
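A quick sketch (the repository name <dockerhub_username>/node-app is a placeholder):
docker login                                  # authenticate against Docker Hub
docker logout                                 # log out again
docker pull <dockerhub_username>/node-app     # pulling still works because the repo is public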
Kubernetes helps you manage, scale and deploy containers across a cluster of servers.
Nodes: These are the EC2 instances that will run your Docker containers (pods).
Pods: These are the smallest deployable units in K8s. Each pod runs one or more containers.
Services: These ensure that your application is accessible and load-balanced across different pods.
Kubectl: This is the K8s command line tool.
Create a deployment.yml and a service.yml file and apply them with kubectl apply.
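A minimal sketch of what these files could look like (the app name node-app, the image <dockerhub_username>/node-app and port 80 are placeholder assumptions):
deployment.yml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-app-deployment
spec:
  replicas: 3                      # keep 3 pods running at all times
  selector:
    matchLabels:
      app: node-app
  template:
    metadata:
      labels:
        app: node-app
    spec:
      containers:
        - name: node-app
          image: <dockerhub_username>/node-app
          ports:
            - containerPort: 80
service.yml:
apiVersion: v1
kind: Service
metadata:
  name: node-app-service
spec:
  type: LoadBalancer               # load-balanced, externally reachable endpoint
  selector:
    app: node-app                  # route traffic to pods carrying this label
  ports:
    - port: 80
      targetPort: 80
Apply both files:
kubectl apply -f deployment.yml -f service.yml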
Horizontal Pod Autoscaler: Automatically scales the number of pods based on CPU/memory utilization. If the existing nodes cannot accommodate the additional pods, the Cluster Autoscaler kicks in to add new EC2 instances.
Cluster Autoscaler: This automatically scales the number of worker nodes (EC2 instances) in the cluster.
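A hedged sketch of an HPA targeting the deployment above (the name and the 70% CPU target are assumptions):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: node-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: node-app-deployment      # the deployment whose pod count is scaled
  minReplicas: 3
  maxReplicas: 15
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add/remove pods to keep average CPU around 70%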
Replicas in deployment.yml file - replicas: 3 means you want 3 pods running at any time across
the available nodes. Ex: If you have 2 nodes and 3 replicas, K8s will balance them across the
nodes (2 pods on one node, 1 pod on the other)
Health monitoring: K8s continuously monitors the health of each pod using readiness and liveness probes (a minimal probe sketch follows this list).
● Readiness probe - Checks if the pod is ready to serve traffic. A pod could be alive, but not yet ready (e.g., waiting for a DB connection).
● Liveness probe - Checks if the pod is alive. If a pod fails the liveness check, K8s will automatically restart it.
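A minimal sketch of both probes on the container from the deployment above (the /ready and /health paths are assumptions about the app):
        - name: node-app
          image: <dockerhub_username>/node-app
          ports:
            - containerPort: 80
          readinessProbe:
            httpGet:
              path: /ready         # hypothetical endpoint that confirms dependencies (e.g., the DB) are reachable
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health        # hypothetical endpoint; repeated failures trigger a container restart
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 20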
Kubelet service: A critical K8s component that runs on every node (EC2 instance in EKS). It is responsible for managing the pods on its node and interacts with the K8s control plane to ensure that the containers are running properly.
● Pod Management: The kubelet watches the API server for pods scheduled on its node
and ensures that the containers in those pods are running.
● Health Checks: It performs the health checks (liveness and readiness probes) and
ensures the containers are restarted if necessary.
● Container Runtime Interface (CRI): The kubelet communicates with the container
runtime (e.g., Docker, containerd) to start and stop containers.
● Node Status: It reports the status of the node (resource utilization, health, etc.) to the
control plane.
● Triggering the Cluster Autoscaler: When the HPA increases the number of pods and the
existing nodes cannot accommodate them, the Cluster Autoscaler automatically
provisions new EC2 instances to handle the increased workload.
For example:
○ Your nodes (EC2 instances) have enough resources to run 10 pods.
○ The HPA scales up the number of pods to 15.
○ If the existing nodes can't handle the additional 5 pods, the Cluster Autoscaler
will add a new EC2 instance to the cluster to provide more capacity.
Kubernetes Volumes:
A K8s volume is a directory accessible by the containers in a pod.
● K8s volumes are scoped to a pod, meaning they exist as long as the pod exists.
● Different volume types handle storage differently - some are ephemeral (data is lost when the pod is deleted), while others are persistent (data is retained even if the pod is deleted).
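A minimal sketch of an ephemeral emptyDir volume mounted into a pod's container (the names and the mount path are assumptions):
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
    - name: node-app
      image: <dockerhub_username>/node-app
      volumeMounts:
        - name: cache-volume
          mountPath: /app/cache    # directory inside the container backed by the volume
  volumes:
    - name: cache-volume
      emptyDir: {}                 # ephemeral: created with the pod, deleted when the pod is deleted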