Linux Containers
Basic Concepts
Lucian Carata
FRESCO Talklet, 3 Oct 2014
Underlying kernel mechanisms
cgroups: manage resources for groups of processes
namespaces: per-process resource isolation
seccomp: limit available system calls (see the sketch below)
capabilities: limit available privileges
CRIU: checkpoint/restore (with kernel support)
These mechanisms are orthogonal and are used in conjunction to implement actual container functionality.
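To make the seccomp line above concrete, here is a minimal sketch (my example, not from the talk) of seccomp strict mode, which restricts the calling process to read(), write(), _exit() and sigreturn(); any other system call kills it:

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/seccomp.h>

int main(void)
{
    /* enter strict mode: only read, write, _exit, sigreturn allowed */
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0) == -1)
        return 1;

    write(1, "still alive\n", 12);

    /* returning from main would call exit_group(), which is not on the
     * strict-mode whitelist, so exit via the plain exit syscall */
    syscall(SYS_exit, 0);
}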
cgroups user space view
low-level filesystem interface similar to sysfs (/sys) and procfs (/proc)
new filesystem type “cgroup”, default location in /sys/fs/cgroup
[Diagram: subsystems (controllers) such as freezer and perf attach to cgroup hierarchies, e.g. a top-level cgroup (mount) /opus under /sys/fs/cgroup with children /normal and /experiment_1; each subsystem can be used at most once*; some controllers can be built as kernel modules]
* or, if a new top-level cgroup is created with an already existing combination of subsystems, the previous top-level cgroup will be used behind the scenes
● issues with systemd pre-mounting directories with certain controllers, which makes new hierarchies (with different controller combinations) difficult to achieve
● each process can appear at most once within a cgroup hierarchy (from the top level towards descendants)
cgroups user space view
[Diagram: cgroup hierarchy with top-level cgroup /opus and children /normal and /experiment_1]
● by default, the top-level cgroup contains all running tasks. A cgroup created as a subdirectory starts with no tasks; those must be added manually by writing PIDs to its “tasks” file (see the sketch after this list)
● release_agent is only present at the top-level cgroup and contains a command to be run when the last process of a cgroup terminates. notify_on_release must be set in a particular cgroup for that command to actually execute
● cpu controller: by default, the kernel scheduler aims to give equal cpu time to all processes. cgroups can be used for fair sharing between arbitrary sets of processes (e.g. 30 apache processes vs. 10 postgres processes)
● net_cls: interface for tagging network packets with a class identifier (so that rules based on packet class can be added later)
● memory controller: has hierarchical support and allows soft limits (a cgroup can use as much memory as needed, provided there is no memory contention and the hard limit is not exceeded)
○ hierarchical support means that child cgroups contribute to the memory usage of their ancestors; if an ancestor exceeds a limit, memory is reclaimed from the ancestor and all its children
● cpuset is also hierarchical
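A minimal user-space sketch of the workflow above (my example, not from the talk), assuming the cpu controller is mounted at /sys/fs/cgroup/cpu and the program runs as root; the cgroup name experiment_1 mirrors the example hierarchy:

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    /* creating a subdirectory in the mounted hierarchy creates a cgroup */
    if (mkdir("/sys/fs/cgroup/cpu/experiment_1", 0755) == -1)
        perror("mkdir");                     /* may already exist */

    /* a fresh cgroup starts with no tasks; attach the current process
     * by writing its PID to the cgroup's "tasks" file */
    FILE *f = fopen("/sys/fs/cgroup/cpu/experiment_1/tasks", "w");
    if (!f) { perror("fopen"); return 1; }
    fprintf(f, "%d\n", getpid());
    fclose(f);
    return 0;
}

Every child forked from this process will then start in the same cgroup.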
cgroups kernel space view
[Diagram: include/linux/cgroup.h. Each task_struct holds a css_set *cgroups pointer and a list_head cg_list; the css_set holds a list_head tasks (the list of all tasks using the same css_set) and an array cgroup_subsys_state *subsys[CGROUP_SUBSYS_COUNT]. The kernel code for attaching/detaching a task to/from a css_set lives in init/main.c and in the fork()/exit() paths]
on initialization, a default css_set (init_css_set) is created, containing the initial tasks at system boot
a css_set contains all the tasks that are under the same state configuration for all enabled controllers (they share cgroups in all hierarchies)
the cgroup hierarchy is not directly accessible from a given task (this is not needed as often)
cgroups kernel space view
[Diagram: as above, each task_struct points to a css_set; the cgroup_subsys_state entries in its subsys[] array belong to subsystems described by struct cgroup_subsys (include/linux/cgroup_subsys.h), with instances such as cpuset_subsys, freezer_subsys and mem_cgroup_subsys. A cgroup_subsys defines callbacks like int (*attach)(...), void (*fork)(...), void (*exit)(...), void (*bind)(...), plus fields const char *name, cgroupfs_root *root and cftype *base_cftypes]
cgroups kernel space view
include/linux/cgroup_subsys.h
cgroup_subsys:
  int (*attach)(...)
  void (*fork)(...)
  void (*exit)(...)
  void (*bind)(...)
  ...
  const char *name;
  cgroupfs_root *root;
  cftype *base_cftypes;
Example instance: cgroup_subsys cpuset_subsys, with .base_cftypes = files (the control files the cpuset controller exposes through cgroupfs)
cgroups summary
[Diagram, repeated: subsystems (controllers) such as freezer and perf attach to cgroup hierarchies mounted under /sys/fs/cgroup (top-level cgroup /opus with children /normal and /experiment_1); each subsystem can be used at most once; some controllers can be built as kernel modules]
● (demo) show /sys/fs/cgroup in a terminal
namespaces user space view
Namespaces limit the scope of kernel-side names and data structures, at process granularity:
  mnt (mount points, filesystems): CLONE_NEWNS
  pid (processes): CLONE_NEWPID
  net (network stack): CLONE_NEWNET
  ipc (System V IPC): CLONE_NEWIPC
  uts (UNIX time-sharing: hostname, domain name, etc.): CLONE_NEWUTS
  user (UIDs): CLONE_NEWUSER
The main purpose of a namespace is to isolate whatever it contains from the other namespaces running on the same kernel.
namespaces user space view
Three system calls for namespace management:
  clone(): new process, new namespace, attach the process to the namespace
  unshare(): new namespace, attach the current process to it
  setns(int fd, int nstype): join an existing namespace
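A minimal sketch of unshare() from the list above (my example, not from the talk): move the calling process into a fresh uts namespace and change the hostname there; the change is invisible outside the namespace. Needs CAP_SYS_ADMIN (run as root):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/utsname.h>

int main(void)
{
    /* detach from the parent's uts namespace */
    if (unshare(CLONE_NEWUTS) == -1) { perror("unshare"); return 1; }

    /* only this namespace sees the new hostname */
    const char *name = "inside-ns";
    if (sethostname(name, strlen(name)) == -1) { perror("sethostname"); return 1; }

    struct utsname u;
    uname(&u);
    printf("hostname inside the new uts namespace: %s\n", u.nodename);
    return 0;
}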
namespaces user space view
each namespace is identified by a unique inode
six entries (inodes) are added under /proc/<pid>/ns/
two processes are in the same namespace if they see the same inode for the corresponding namespace type (mnt, net, user, ...); see the sketch below
User space utilities
* iproute2 (ip netns add, etc.)
* unshare, nsenter (part of util-linux)
* shadow / shadow-utils (for the user namespace)
nsenter is a wrapper around setns()
unshare has support for all six namespaces
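The inode rule above can be checked directly (my sketch, not from the talk): stat() the ns entries of two processes and compare st_ino, here for the current process against PID 1:

#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    struct stat self, init;
    /* reading another process's /proc/<pid>/ns entries may need privileges */
    if (stat("/proc/self/ns/uts", &self) == -1 ||
        stat("/proc/1/ns/uts", &init) == -1) {
        perror("stat");
        return 1;
    }
    printf("own uts ns inode:   %lu\n", (unsigned long) self.st_ino);
    printf("pid 1 uts ns inode: %lu\n", (unsigned long) init.st_ino);
    puts(self.st_ino == init.st_ino ? "same namespace" : "different namespaces");
    return 0;
}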
namespaces kernel space view
include / linux / nsproxy.h include / linux / cred.h
task_struct nsproxy cred
atomic_t count
struct nsproxy *nsproxy ...
struct cred *cred struct uts_namespace *uts_ns struct user_namespace *user_ns
struct ipc_namespace *ipc_ns
struct mnt_namespace *mnt_ns
struct pid_namespace *pid_ns_for_children
struct net *net_ns
include / linux / nsproxy.h
nsproxy* task_nsproxy(struct task_struct *tsk)
For each namespace type, a default namespace exists (the global namespace)
struct nsproxy is shared by all tasks with the same set of namespaces
namespaces kernel space view
Example for the uts namespace
[Diagram: task_struct -> struct nsproxy -> struct uts_namespace *uts_ns, which embeds a struct new_utsname (include/uapi/linux/utsname.h) with fields char sysname[], char nodename[], char release[], char version[], char machine[], char domainname[]]
global access to hostname: system_utsname.nodename
namespace-aware access to hostname: current->nsproxy->uts_ns->name.nodename
namespaces kernel space view
Example for the net namespace
[Diagram: task_struct -> struct nsproxy -> struct net *net_ns (include/net/net_namespace.h). struct net is a logical copy of the network stack:
  loopback device
  all network tables (routing, etc.)
  all sockets
  /procfs and /sysfs entries]
a network device belongs to exactly one network namespace
a socket belongs to exactly one network namespace
a new network namespace only includes the loopback device (see the sketch below)
communication between namespaces is possible using veth pairs or Unix sockets
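A quick way to see the loopback-only rule (my sketch, not from the talk): unshare the network namespace and enumerate the interfaces that remain; needs root:

#define _GNU_SOURCE
#include <sched.h>
#include <net/if.h>
#include <stdio.h>

int main(void)
{
    /* detach from the parent's network namespace */
    if (unshare(CLONE_NEWNET) == -1) { perror("unshare"); return 1; }

    /* only the loopback device should be listed */
    struct if_nameindex *ifs = if_nameindex();
    if (!ifs) { perror("if_nameindex"); return 1; }
    for (struct if_nameindex *i = ifs; i->if_index != 0; i++)
        printf("%u: %s\n", i->if_index, i->if_name);
    if_freenameindex(ifs);
    return 0;
}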
namespaces summary
Namespaces limit the scope of kernel-side names and data structures, at process granularity:
  mnt (mount points, filesystems)
  pid (processes)
  net (network stack)
  ipc (System V IPC)
  uts (UNIX time-sharing: hostname, domain name, etc.)
  user (UIDs)
The main purpose of a namespace is to isolate whatever it contains from the other namespaces running on the same kernel.
Containers
A lightweight form of resource virtualization based on kernel mechanisms
A container is a userspace construct
Multiple containers run on top of the same kernel, each under the illusion that it is the only one using resources (cpu, memory, disk, network)
Some implementations offer support for:
  container templates
  deployment / migration
  union filesystems
[Diagram taken from the Docker documentation]
Container solutions
Mainline
  Google containers (lmctfy)
    uses cgroups only, offers CPU & memory isolation
    no isolation for: disk I/O, network, filesystem, checkpoint/restore
    adds some cgroup files: cpu.lat, cpuacct.histogram
  LXC: userspace containerisation tools
  Docker
  systemd-nspawn
Forks
  Linux-VServer, OpenVZ
Container solutions: LXC
An LXC container is a userspace process created with the clone() system call (see the sketch below)
  with its own pid namespace
  with its own mnt namespace
  net namespace (configurable via lxc.network.type)
Offers container templates in /usr/share/lxc/templates (shell scripts)
  lxc-create -t ubuntu -n containerName
  this also creates the cgroup /sys/fs/cgroup/<controller>/lxc/containerName
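The core of what LXC does at process-creation time, reduced to one namespace (my sketch, not LXC code): clone() a child into a new pid namespace, where it sees itself as pid 1; needs root. LXC combines this with mnt/net/... namespaces and cgroups:

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static char child_stack[1024 * 1024];

static int child(void *arg)
{
    /* the first process in a new pid namespace gets pid 1 */
    printf("child: my pid is %d\n", getpid());
    return 0;
}

int main(void)
{
    pid_t p = clone(child, child_stack + sizeof(child_stack),
                    CLONE_NEWPID | SIGCHLD, NULL);
    if (p == -1) { perror("clone"); return 1; }
    printf("parent: child pid is %d\n", p);   /* pid in the parent's namespace */
    waitpid(p, NULL, 0);
    return 0;
}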
Container solutions: Docker
A Linux container engine
  multiple backend drivers
  application-centric rather than machine-centric
  app build tools
  diff-based deployment of updates (AUFS)
  versioning (git-like) and reuse
  links (tunnels) between containers
[Diagram taken from the Docker documentation]
Questions?
Thank you! Lucian Carata
[email protected]
More details
cgroups: http://media.wix.com/ugd/295986_d73d8d6087ed430c34c21f90b0b607fd.pdf
namespaces: http://lwn.net/Articles/531114/ (and the rest of that series)