DevOps Shack
100 Linux Interview Questions and Answers
🧾 Table of Contents
✅ Section 1: Linux Basics and Fundamentals
1. What is Linux and how is it different from Unix?
3. What are runlevels? How are they configured in modern distros?
4. What are the major components of the Linux operating system?
8. What are hard links and soft links? When should you use each?
9. How does Linux handle file permissions? Explain chmod, chown, and
umask.
17. Explain how groups work in Linux. Difference between primary and
secondary groups?
25. How does the cron scheduler work? How to schedule jobs with
examples?
29. How do you monitor disk space usage with df, du, and ncdu?
30. What is the difference between ext3, ext4, xfs, and btrfs?
✅ Section 8: System Monitoring and Performance Tuning
40. What are cgroups and how do they help in resource management?
42. What are the differences between apt, yum, dnf, and zypper?
44. What is a .deb and .rpm file? How are they created?
59. How to search for patterns recursively using grep and find?
60. Explain piping (|) and redirection (>, >>, 2>) with examples.
95. How do you use cron jobs for automated health checks?
100. What are your go-to tools for Linux troubleshooting and why?
📘 Introduction
Linux is the backbone of modern computing—from cloud servers and
supercomputers to embedded systems and smartphones. Whether you're
applying for a system administrator role, DevOps engineer, cloud architect,
or security analyst, a strong grasp of Linux is essential.
This guide covers 100 of the most commonly asked and technically
detailed Linux interview questions, ranging from beginner concepts to
complex real-world production scenarios. Each question is followed by a
deep-dive explanation, usage examples, relevant command syntax, and
real-life use cases so that you're not just memorizing facts—you’re gaining
insight.
This guide is ideal for:
● System administrators
● Linux enthusiasts
● Interview preparation
1. What is Linux and how is it different from Unix?
Answer:
Linux is a free, open-source, Unix-like operating system built around the Linux kernel and typically bundled with GNU userland tools. Unix is an older family of mostly proprietary systems (AIX, HP-UX, Solaris); Linux was written independently, is open source, runs on nearly any hardware, and evolves through community and vendor distributions.
2. Explain the Linux boot process step by step.
Answer:
1. BIOS/UEFI
Firmware initializes hardware and locates the boot device.
2. MBR/GPT
The partition table points the firmware at the bootloader.
3. GRUB
The bootloader loads the kernel and initramfs into memory.
4. Kernel
The kernel initializes drivers, mounts the root filesystem, and starts PID 1.
5. Init/Systemd
The init system takes over and begins starting services based on the
runlevel or target. On most modern systems, it's systemd.
Interview Tip: Be ready to explain what each stage does, especially the
difference between BIOS vs UEFI, MBR vs GPT, and GRUB vs
systemd-boot.
3. What are runlevels? How are they configured in modern distros?
Answer:
Runlevels are states of the machine that define what services are running.
In traditional SysVinit systems, they were used to control system
behavior:
● 0 – Halt
● 1 – Single-user mode
● 2 – Multi-user, without networking
● 3 – Multi-user with networking
● 4 – Unused/custom
● 5 – Multi-user with GUI
● 6 – Reboot
In systemd, runlevels map to targets:
● runlevel3.target → multi-user.target
● runlevel5.target → graphical.target
Check the default target:
systemctl get-default
To switch immediately:
systemctl isolate multi-user.target
4. What are the major components of the Linux operating system?
Answer:
1. Kernel
Core part of the OS. It manages hardware, memory, processes, I/O, and system calls. It abstracts hardware so apps don't need to deal with it directly.
2. Shell
The command interpreter (bash, zsh, etc.) between the user and the kernel.
3. System libraries
Functions (such as glibc) that applications use to request kernel services.
4. System utilities and daemons
Tools and background services that handle day-to-day tasks.
5. What is the difference between a process and a thread?
Answer:
Both processes and threads are used to execute code but they differ in
how they operate and share system resources.
Process:
● Has its own memory space, file descriptors, and address space.
Thread:
● Runs inside a process and shares its memory space and file descriptors.
● Cheaper to create and switch between than processes.
Example:
A browser can be a process. Each tab can be a thread. They share
memory (like bookmarks, cache) but perform tasks independently (load
different web pages).
Commands:
ps -eLf   # list processes with their threads
top -H    # show individual threads in top
6. Explain the Linux directory structure (filesystem hierarchy).
Answer:
● /bin – Essential binary commands (e.g., ls, cp, mv, cat). Available in single-user mode.
● /etc – System-wide configuration files.
● /home – User home directories.
● /var – Variable data: logs, spools, caches.
● /usr – User programs, libraries, and documentation.
● /tmp – Temporary files, typically cleared on reboot.
7. What is an inode in Linux?
Answer:
An inode stores a file's metadata:
● Size
● Owner and group
● Permissions
● Timestamps
● Link count and pointers to the data blocks
File name is not stored in inode, it's stored in the directory entry, which
maps the name to an inode number.
View inode numbers:
ls -li
Check inode usage per filesystem:
df -i
8. What are hard links and soft (symbolic) links? When should you use
each?
Answer:
Hard Link:
● Both the original file and the hard link point to the same inode.
● Deleting the original file does not delete the content as long as a hard
link exists.
Create a hard link:
ln original.txt hardlink.txt
Soft (Symbolic) Link:
● Points to a path, not an inode; it breaks if the target is deleted or moved.
● Can cross filesystems and link to directories.
Create a soft link:
ln -s /path/to/original symlink.txt
Use a hard link when:
● You want multiple names for the same data on the same filesystem.
Use a soft link when you need to link across filesystems or to directories.
Identify links (symlinks show as type l with their target):
ls -l
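Because both names of a hard link share an inode, you can verify it with ls -li (file names illustrative):
ln original.txt hardlink.txt
ls -li original.txt hardlink.txt   # identical inode numbers, link count 2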
9. How does Linux handle file permissions? Explain chmod, chown, and
umask.
Answer:
File permissions in Linux control access at the user, group, and others
level.
Permissions:
● r – read
● w – write
● x – execute
Example: -rw-r--r--
● Owner: read, write
● Group: read
● Others: read
Symbolic:
chmod u+x script.sh
Numeric:
chmod 644 file.txt
Change ownership with chown:
chown john:developers file.txt
umask (user mask): Defines default permissions for new files and
directories by subtracting from the base modes (666 for files, 777 for directories).
Check it:
umask
So:
With the common umask of 022, new files get 644 and new directories get 755.
10. What are the sticky bit, SUID, and SGID permissions?
Answer:
Sticky bit:
● Used on directories.
● Only the owner (or root) can delete files within the directory, even if
others have write access.
● Common on /tmp.
Set it:
chmod +t /shared
ls -ld /shared
# drwxrwxrwt
SUID (Set User ID):
● Executes a file with the file owner's permissions, not the user's.
● Classic example: /usr/bin/passwd.
Set it:
chmod u+s file
ls -l
# -rwsr-xr-x
SGID (Set Group ID):
● On directories, new files inherit the directory's group.
Set it:
chmod g+s /project_dir
Example:
ls -ld /project_dir
# drwxrwsr-x
11. Explain how to write and debug a basic shell script in Linux.
Answer:
Basic structure:
#!/bin/bash
echo "Hello, world"
Make it executable and run it:
chmod +x script.sh
./script.sh
Debugging techniques:
bash -x script.sh
This shows each command before it is executed (great for catching logic
errors).
You can also enable tracing from the shebang itself:
#!/bin/bash -x
Check syntax without running the script:
bash -n script.sh
Real-World Tip: Always check syntax with -n before shipping, and make sure your scripts work with set -e to abort on any command failure—especially in CI/CD pipelines.
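Putting those pieces together, a minimal sketch of a production-style script (names and paths illustrative):
#!/bin/bash
set -euo pipefail                             # stop on errors, unset vars, pipe failures
trap 'echo "Error on line $LINENO" >&2' ERR   # point at the failing line
backup_dir=/tmp/demo
mkdir -p "$backup_dir"
echo "Backup dir ready: $backup_dir"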
12. What are environment variables? How do you manage them in Linux?
Answer:
Environment variables are key-value pairs that configure the shell and the processes it starts.
View them:
printenv
env
Set environment variables (temporarily):
export VAR_NAME=value
echo $VAR_NAME
Example:
export JAVA_HOME=/usr/lib/jvm/java-11
export PATH=$PATH:$JAVA_HOME/bin
Unset a variable:
unset VAR_NAME
echo "$HOME"
13. What is the difference between cron, at, and systemd timers?
Answer:
All three are used for task scheduling, but they serve different use cases.
cron: Recurring jobs
● Syntax (minute hour day month weekday):
* * * * * /path/to/script.sh
● Manage using:
crontab -e
at: One-time jobs
at 10:30 AM
at> /path/to/script.sh
<Ctrl+D>
● View jobs:
atq
● Remove jobs:
atrm <job_number>
systemd timers: Modern and powerful
Example:
# mytask.service
[Service]
ExecStart=/path/to/script.sh
# mytask.timer
[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true
[Install]
WantedBy=timers.target
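Activate the timer (standard systemd workflow; unit names match the example above):
sudo systemctl daemon-reload
sudo systemctl enable --now mytask.timer
systemctl list-timers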
Choose: cron for simple recurring jobs, at for one-off tasks, and systemd timers when you need logging, dependencies, or catch-up runs (Persistent=true).
14. What is the shebang (#!) and why does it matter?
Answer:
The shebang is the first line of a script; it tells the kernel which interpreter should run the file.
● Without it, the script will be run using the current shell, which may not
behave consistently (e.g., sh, dash, zsh, etc.).
Example:
#!/bin/bash
With the execute bit set, the declared interpreter is used:
./script.sh
Running it as sh script.sh ignores the shebang and uses sh instead.
Interview Tip: Mention portability and how omitting the shebang can cause
subtle bugs across systems.
15. How do you add a new user in Linux?
Answer:
Managing users is a core Linux sysadmin task. Here's how it’s done:
Create a user:
useradd john
passwd john
Or:
Creates:
id john
Files involved:
16. How do you create, delete, and manage users in Linux?
Answer:
We touched on the basics earlier, but let’s go deeper into managing Linux
users, especially from a production and troubleshooting perspective.
Create a user with custom options:
useradd -m -d /home/john -s /bin/bash -G developers john
(-m creates the home directory, -d sets its path, -s sets the login shell, -G adds secondary groups)
Set password:
passwd john
Force password reset at next login:
chage -d 0 john
View user info:
id john
Delete a user along with their home directory:
userdel -r john
Pro Tip:
In environments with LDAP or Active Directory integration, users may not
exist in /etc/passwd but can still be seen via getent passwd
username.
17. Explain how groups work in Linux. What's the difference between
primary and secondary groups?
Answer:
Primary Group:
● Applied at login and to new files the user creates.
● Defined in /etc/passwd.
Secondary Groups:
● Grant additional permissions beyond the primary group.
● Stored in /etc/group.
Check groups:
id john
groups john
Add user to secondary group:
usermod -aG developers john
18. How to set password aging policies and enforce them in Linux?
Answer:
chage -M 90 -m 7 -W 10 john
(-M: maximum password age in days, -m: minimum days between changes, -W: warning period before expiry)
In /etc/login.defs:
PASS_MAX_DAYS 90
PASS_MIN_DAYS 7
PASS_WARN_AGE 10
Force password expiration immediately:
chage -d 0 john
Audit tip:
Check for accounts with never expire status — those are potential
vulnerabilities:
chage -l john
19. What do /etc/passwd, /etc/shadow, and /etc/group contain?
Answer:
/etc/passwd:
● Format:
username:x:UID:GID:comment:home:shell
● Example:
john:x:1001:1001:John Dev:/home/john:/bin/bash
/etc/shadow:
● Format:
username:$6$hashedPassword:lastChange:min:max:warn:inactive:expire
/etc/group:
● Format:
groupname:x:GID:member1,member2
Important:
If /etc/passwd is misconfigured, you may not be able to login or
escalate privileges. Always back it up before editing.
20. What is sudo and how do you configure it?
Answer:
Sudo allows users to run commands as root or another user, without giving
away the root password.
Grant sudo access (Debian/Ubuntu use the sudo group; RHEL-based systems use wheel):
usermod -aG sudo john
List a user's sudo privileges:
sudo -l -U john
Edit the sudoers file safely:
visudo
A typical sudoers entry:
john ALL=(ALL) ALL
Run a privileged command:
sudo command
Real-world tips:
● Grant access via groups rather than per-user entries.
● Always edit with visudo; it validates syntax and prevents lockouts.
21. Explain ps, top, htop, nice, renice, and kill commands.
Answer:
ps aux
● a: all users
● u: user-oriented format
ps -p 1234 -o pid,ppid,cmd,%mem,%cpu
top (interactive view). Key shortcuts:
● k: kill process
● r: renice process
● P: sort by CPU
● M: sort by memory
htop is a friendlier, scrollable top. Install via:
sudo apt install htop
Start a process with lower priority:
nice -n 10 ./my_script.sh
Change the priority of a running process:
renice -n -5 -p 1234
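The question also covers kill; a typical escalation (1234 is an illustrative PID):
kill -15 1234   # SIGTERM: ask the process to exit cleanly
kill -9 1234    # SIGKILL: force-kill only if SIGTERM is ignored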
Interview Tip: Always try SIGTERM (-15) before SIGKILL (-9) to allow
cleanup.
22. What is the difference between foreground and background processes?
Answer:
Foreground process:
Runs interactively and occupies the terminal.
ping google.com
Stop it with:
● Ctrl+C: terminate
● Ctrl+Z: pause/suspend
Background process:
Start directly in the background:
./long_script.sh &
Resume a suspended job (after Ctrl+Z) in the background:
bg
List background jobs:
jobs
Bring to foreground:
fg %1
Kill a job by number:
kill %1
Use case:
Run long-running scripts in the background so you can continue working in
the terminal.
23. What is a zombie process?
Answer:
● The process has completed but still has an entry in the process table, because the parent hasn't collected its exit status.
Identify them:
ps aux | awk '$8 ~ /Z/'
Or:
ps -el | grep Z
If persistent:
Restart or signal the parent process so it reaps the child; if the parent dies, init/systemd adopts and cleans up the zombie.
They usually don’t cause harm, but can indicate poor application behavior.
Find the zombie's parent:
ps -o ppid= -p <zombie_pid>
Prevention:
Use wait() or waitpid() in scripts to collect child exit statuses and
avoid zombies.
24. What is strace and how do you debug using it?
Answer:
strace traces the system calls a program makes, which is invaluable for debugging hangs, missing files, and permission errors.
● Trace a command from the start:
strace ls
● Attach to a running process:
strace -p 1234
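Two useful variations (standard strace flags):
strace -e trace=openat,read -o trace.log cat /etc/hosts   # trace selected syscalls into a file
strace -c ls                                              # summary of syscall counts and times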
25. How does the cron scheduler work? How to schedule jobs with
examples?
Answer:
cron runs commands on a schedule defined in crontab files.
Edit current user's cron jobs:
crontab -e
List them:
crontab -l
Each entry has five time fields followed by the command:
* * * * * command_to_execute
| | | | |
minute, hour, day of month, month, day of week
Sample entries:
● Daily at 1:00 AM:
0 1 * * * /home/user/backup.sh
● Every 15 minutes:
*/15 * * * * /home/user/check_status.sh
● Every Monday at 10:00 AM:
0 10 * * 1 /home/user/report.sh
Best practices:
Set SHELL and PATH explicitly at the top of the crontab — cron runs with a minimal environment:
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
26. How to create and manage partitions using fdisk or parted in Linux?
Answer:
Partitioning divides a physical disk into logical sections. Two of the most
common tools are:
Using fdisk (for MBR disks):
sudo fdisk /dev/sdb
Inside fdisk:
● m – help
● n – new partition
● d – delete partition
● w – write changes
Format and mount it:
sudo mkfs.ext4 /dev/sdb1
sudo mount /dev/sdb1 /mnt/data
Using parted (supports GPT):
sudo parted /dev/sdb
Commands:
● mklabel gpt – create a GPT label
● mkpart – create a partition
● print – show the table
● quit – exit
Re-read the partition table without rebooting:
sudo partprobe
27. How do you create, extend, and reduce LVM volumes?
Answer:
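Creation starts below the filesystem level (device and names illustrative):
sudo pvcreate /dev/sdb
sudo vgcreate myvg /dev/sdb
sudo lvcreate -n mydata -L 10G myvg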
Create a filesystem on the logical volume:
mkfs.ext4 /dev/myvg/mydata
● To extend:
lvextend -L +5G /dev/myvg/mydata
resize2fs /dev/myvg/mydata
● To reduce (shrink the filesystem before the volume):
umount /mnt/lvmdata
e2fsck -f /dev/myvg/mydata
resize2fs /dev/myvg/mydata 5G
lvreduce -L 5G /dev/myvg/mydata
LVM advantages:
● Flexible, online resizing
● Snapshot support
28. How do you mount a filesystem permanently using /etc/fstab?
Answer:
Verify current mounts and usage:
df -h
Add an entry to /etc/fstab:
UUID=<uuid> /mnt/data ext4 defaults 0 2
Find UUID:
blkid /dev/sdb1
Test the entry without rebooting:
sudo mount -a
29. How do you monitor disk space usage with df, du, and ncdu?
Answer:
df reports free space per mounted filesystem:
df -h
du -sh /var/log
● -s: summary
● -h: human-readable
ncdu provides an interactive, sortable disk usage browser:
sudo ncdu /
Install via:
sudo apt install ncdu
● Check inodes:
df -i
If inodes are full, clean up many small files (e.g., mail queues, cache).
30. What is the difference between ext3, ext4, xfs, and btrfs?
Answer:
● ext3 – Older journaled filesystem; reliable but slower, no extents.
● ext4 – Successor to ext3: extents, larger volumes, faster fsck; the default on many distros.
● xfs – High performance at scale; no shrinking support; common in RHEL/CentOS.
● btrfs – Copy-on-write with snapshots, checksums, and built-in RAID.
Use cases: ext4 for general-purpose servers, xfs for large files and high throughput, btrfs when snapshots and integrity checking matter.
31. How do you check and assign IP addresses in Linux?
Answer:
Show addresses:
ip addr show
or
ip a
Assign IP temporarily (lost on reboot):
sudo ip addr add 192.168.1.10/24 dev eth0
● On RHEL/CentOS: Edit
/etc/sysconfig/network-scripts/ifcfg-eth0
Subnetting Example (192.168.1.10/29):
● Network: 192.168.1.0
● Broadcast: 192.168.1.7
● Usable hosts: 192.168.1.1 – 192.168.1.6
Verify with ipcalc:
ipcalc 192.168.1.10/29
32. How do you configure a static IP address?
Answer:
Inspect interfaces:
ip link
ip addr
Bring interface up/down:
sudo ip link set eth0 up
sudo ip link set eth0 down
Debian/Ubuntu (Netplan), e.g. /etc/netplan/01-netcfg.yaml:
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: no
      addresses: [192.168.0.100/24]
      gateway4: 192.168.0.1
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4]
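Apply the configuration (standard Netplan workflow):
sudo netplan apply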
RHEL/CentOS (ifcfg), in /etc/sysconfig/network-scripts/ifcfg-eth0:
BOOTPROTO=static
IPADDR=192.168.0.100
NETMASK=255.255.255.0
GATEWAY=192.168.0.1
DNS1=8.8.8.8
33. How do you inspect open ports and connections (netstat, ss, ip)?
Answer:
netstat -tulnp
● -t: TCP
● -u: UDP
● -l: listening sockets
● -n: numeric output (no DNS)
● -p: owning process
netstat -plnt
ss (replacement for netstat, faster):
ss -tulnp
ip addr
ip route
ip link
ifconfig (deprecated, but still present on many systems):
ifconfig -a
Tip: ip and ss are faster than the tools they replace and are preferred for scripting and automation.
34. How do you troubleshoot networking with ping, traceroute, nmap, and tcpdump?
Answer:
These tools help isolate and diagnose connectivity, routing, port issues, and
firewall blocks.
ping:
Tests reachability.
ping 8.8.8.8
ping -c 4 google.com
traceroute: shows every hop toward a destination.
traceroute google.com
Helps detect where the connection breaks (e.g., ISP routing issue).
nmap: scans hosts for open ports.
nmap -sT 192.168.1.10
tcpdump: captures packets for offline analysis.
sudo tcpdump -i eth0 port 80 -w capture.pcap
Analyze in Wireshark.
Real-world flow: ping the gateway → ping 8.8.8.8 → ping a domain name (tests DNS) → traceroute to find where it breaks → check ports with ss or nmap → capture packets with tcpdump.
35. How do you manage firewall rules with iptables and firewalld?
Answer:
iptables (classic packet filter):
Chains: INPUT, OUTPUT, FORWARD
List rules:
sudo iptables -L -n -v
Allow SSH:
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT
Block an IP:
sudo iptables -A INPUT -s 203.0.113.5 -j DROP
Flush all rules:
sudo iptables -F
Save rules:
sudo iptables-save > /etc/iptables.rules
firewalld (modern systemd-based interface):
Check status:
sudo firewall-cmd --state
Allow a service:
sudo firewall-cmd --permanent --add-service=https
Add port:
sudo firewall-cmd --permanent --add-port=8080/tcp
sudo firewall-cmd --reload
Key Differences: iptables applies static rules immediately and individually; firewalld is zone-based, separates runtime from permanent config, and reloads without dropping established connections.
36. How to monitor system performance with top, vmstat, iostat, sar,
and dstat?
Answer:
top (live CPU, memory, and process view):
top
vmstat 2 5
● Key columns: r (run queue), si/so (swap in/out), us/sy/id/wa (CPU user/system/idle/iowait)
iostat -xz 2 5
● Important metrics: %util (device saturation), await (average I/O latency), r/s and w/s (read/write ops per second)
sar (historical stats from the sysstat package). Install via:
sudo apt install sysstat
sar -r 1 5 # Memory
sar -u -f /var/log/sysstat/sa10
dstat (All-in-one performance viewer)
dstat -cdngy
Install:
sudo apt install dstat
38. How do you analyze memory usage in Linux?
Answer:
free -h
Columns: total, used, free, shared, buff/cache, and available.
Important: used may appear high, but Linux aggressively caches. Always
check available for real usage.
top memory view: press M inside top to sort processes by %MEM.
vmstat 1 5
Fields to watch: si/so (swap in/out; sustained nonzero values indicate memory pressure).
Install:
sudo apt install smem
smem -r
Check swap usage:
swapon --show
39. How do you view and search system logs?
Answer:
Follow the main system log:
tail -f /var/log/syslog
Search logs:
grep "error" /var/log/syslog
journalctl (for systemd systems):
journalctl -u nginx --since "1 hour ago"
40. What are cgroups and how do they help in resource management?
Answer:
cgroups (control groups) are a Linux kernel feature that limits, accounts
for, and isolates resource usage (CPU, memory, I/O, etc.) of a set of
processes.
Key use cases:
● Limiting CPU, memory, and I/O per service (systemd slices)
● Enforcing container resource limits (Docker, Kubernetes)
See which cgroups your shell belongs to:
cat /proc/self/cgroup
To manually create a cgroup (v1 example):
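A sketch assuming the legacy v1 memory controller is mounted at /sys/fs/cgroup/memory (group name and limit illustrative):
sudo mkdir /sys/fs/cgroup/memory/mygroup
echo $((100 * 1024 * 1024)) | sudo tee /sys/fs/cgroup/memory/mygroup/memory.limit_in_bytes
echo <PID> | sudo tee /sys/fs/cgroup/memory/mygroup/cgroup.procs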
41. How do you install packages on Debian-based systems?
Answer:
Debian-based systems (like Ubuntu) use .deb packages and the dpkg tool under the hood. For ease, they use apt or apt-get.
Example:
sudo apt update
sudo apt install nginx
Install a local .deb directly:
sudo dpkg -i package.deb
or:
sudo apt install ./package.deb
42. What are the differences between apt, yum, dnf, and zypper?
Answer:
apt is the package manager for Debian/Ubuntu systems: friendly syntax, strong dependency handling, .deb packages underneath.
yum is the older tool for RHEL-based systems. It handles repositories and dependencies well but has been replaced by dnf.
dnf (Dandified Yum) is the next-gen replacement for yum. It's faster, supports parallel downloads, better dependency resolution, and has a cleaner architecture.
zypper is the SUSE/openSUSE equivalent, built on RPM with strong repository management.
In practice: use apt on Debian/Ubuntu, dnf on Fedora and RHEL 8+, yum on legacy RHEL/CentOS 7, and zypper on SUSE.
43. How do you install software from source?
Answer:
1. Download:
wget https://example.com/tool.tar.gz
2. Extract:
tar -xzf tool.tar.gz
cd tool
3. Configure:
./configure
4. Compile:
make
5. Install:
sudo make install
Or use checkinstall instead of make install to produce a package the package manager can track and remove:
checkinstall
44. What is a .deb and .rpm file? How are they created?
Answer:
Both .deb and .rpm files are precompiled software packages used by their respective systems.
They contain:
● Executable binaries
● Configuration files
● Metadata (name, version, dependencies)
● Optional pre/post-install scripts
Minimal .deb layout:
myapp/
├── DEBIAN/
│ └── control
├── usr/
│ └── local/
│ └── bin/
│ └── myapp
Package: myapp
Version: 1.0
Architecture: amd64
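With that layout in place, build the package (standard dpkg-deb usage):
dpkg-deb --build myapp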
To create an .rpm file, use the rpmbuild tool and a .spec file with build and install scripts.
45. How do you fix broken or missing package dependencies?
Answer:
For Debian/Ubuntu: apt handles dependencies pretty well, but if you use dpkg directly and it complains:
sudo apt -f install
On RHEL-based systems, dnf/yum resolve dependencies automatically. If a raw rpm -i install fails:
sudo dnf install ./package.rpm
Or use:
sudo yum localinstall package.rpm
Other tools:
● ldd binary – shows which shared libraries are needed and if any
are missing.
● alien – converts between .rpm and .deb if needed, but use with
caution.
46. What is systemd and how do you manage services with systemctl?
Answer:
systemd is the modern system and service manager used in most major Linux distributions today. It's the replacement for the traditional SysVinit and Upstart systems.
Verify it runs as PID 1:
ps -p 1 -o comm=
Start a service:
sudo systemctl start nginx
Stop a service:
sudo systemctl stop nginx
Restart it:
sudo systemctl restart nginx
Enable on boot:
sudo systemctl enable nginx
Disable autostart:
sudo systemctl disable nginx
Check status:
systemctl status nginx
List failed units:
systemctl --failed
Let’s say you have a custom script or application you want to run as a
managed service.
Create /etc/systemd/system/myapp.service:
[Unit]
Description=MyApp custom service
After=network.target
[Service]
ExecStart=/usr/local/bin/myapp.sh
Restart=on-failure
User=myuser
[Install]
WantedBy=multi-user.target
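Then load and start it (standard systemd workflow):
sudo systemctl daemon-reload
sudo systemctl enable --now myapp.service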
This is how you “daemonize” your own scripts and run them reliably like a
native Linux service.
Boot failures are critical, and knowing how to troubleshoot them is a key
sysadmin skill.
Step-by-step approach:
1. Watch for kernel panic or GRUB errors during boot. If you see
something like “grub rescue>”, then GRUB is corrupted.
You may need to boot from a Live CD/USB, chroot into the installed system, and repair with:
grub-install /dev/sda
update-grub
2. Check the previous boot's logs once the system is reachable: journalctl -b -1 -p err.
3. Look for bad /etc/fstab entries, a common reason systems drop into emergency mode.
This process often involves booting into recovery mode and applying fixes
from there.
49. How do you query systemd logs with journalctl?
Answer:
View everything:
journalctl
Recent entries with explanations:
journalctl -xe
Logs for one unit:
journalctl -u nginx.service
Follow live (like tail -f):
journalctl -f
List recorded boots:
journalctl --list-boots
Logs from the previous boot:
journalctl -b -1
Key benefit: Unlike traditional logs, systemd journals include rich metadata
like process ID, user ID, session ID, boot ID, and can be filtered in powerful
ways.
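That metadata doubles as filter syntax, e.g. by PID and time window:
journalctl _PID=1234 --since "10 minutes ago"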
50. What are the key Linux security best practices?
Answer:
● Always disable root SSH access. Instead, use sudo with individual
accounts.
● Keep the system updated regularly using apt update && apt
upgrade or yum update.
Security is not about one tool—it’s about creating layers. Even if one layer
fails (like a user getting access), the attacker shouldn't be able to do much.
51. What are SELinux and AppArmor?
Answer:
SELinux (common on RHEL/CentOS/Fedora) enforces mandatory access control via policies and file contexts.
● Modes: Enforcing, Permissive, Disabled
Check mode:
getenforce
Switch mode:
setenforce 0 # Permissive
setenforce 1 # Enforcing
View SELinux file contexts:
ls -Z
AppArmor (common on Debian/Ubuntu) uses per-application profiles. Check its status:
sudo aa-status
52. How do you configure a basic firewall with ufw?
Answer:
Firewalls are the first line of defense in a Linux server. You should only allow what's needed and block everything else by default.
Allow SSH first (so you don't lock yourself out):
sudo ufw allow ssh
Enable it:
sudo ufw enable
Allow services:
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
Deny something:
sudo ufw deny 23
Check status:
sudo ufw status verbose
53. How do you audit system activity (auditd, AIDE)?
Answer:
Start with the auditd service. It records events like file access, permission changes, and system calls.
To monitor a file:
sudo auditctl -w /etc/passwd -p wa -k passwd_watch
Check logs:
ausearch -k passwd_watch
AIDE (Advanced Intrusion Detection Environment) detects changes to critical files.
Install AIDE:
sudo apt install aide
Initialize database:
sudo aideinit
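Periodic checks then compare the system against that baseline (standard AIDE usage):
sudo aide --check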
This helps you know if any critical file was altered. It’s especially useful for
detecting rootkits or tampering after a breach.
54. How do you harden SSH access?
Answer:
SSH is the most common attack vector on Linux servers, and hardening it
is essential.
Key settings live in /etc/ssh/sshd_config:
1. Disable root login:
PermitRootLogin no
2. Disable password authentication in favor of SSH keys:
PasswordAuthentication no
3. Optionally move SSH off the default port:
Port 2222
4. Restart sshd after any change: sudo systemctl restart sshd
5. Use Fail2Ban: installs filters to block IPs after failed login attempts.
Install it:
sudo apt install fail2ban
6. Use firewall rules to restrict SSH: allow only known IPs to access your SSH port.
Monitor login activity:
journalctl -u ssh
Or check /var/log/auth.log
These measures make it much harder for attackers to get in, even if they
know your IP or scan for open ports.
55. How do you process text with grep, sed, awk, and cut?
Answer:
Use regex with grep:
grep -E "fail|error" /var/log/syslog
Search recursively:
grep -r "error" /var/log/
Edit in place with sed:
sed -i 's/old/new/g' file.txt
Sum a column with awk:
awk '{sum += $3} END {print sum}' data.txt
Get usernames with cut:
cut -d: -f1 /etc/passwd
These tools together let you transform data in files or pipelines without ever
opening a GUI. In scripts, they’re indispensable.
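A pipeline combining them — counting failed SSH logins per source IP (the awk field offset is illustrative and depends on your log format):
grep "Failed password" /var/log/auth.log | awk '{print $(NF-3)}' | sort | uniq -c | sort -rn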
56. What is the difference between find and locate?
Answer:
Both are used to search for files, but they work very differently.
find searches the filesystem in real time. It’s accurate and powerful but
slower.
Example:
find /etc -name "nginx.conf"
Search by type:
find /var/log -type f -name "*.log"
locate uses a prebuilt index. Example:
locate nginx.conf
It’s lightning fast, but may show outdated results if the file was created or
deleted after the last index update.
Update the index manually:
sudo updatedb
Use find for precision and advanced logic. Use locate for speed when
accuracy is less critical.
57. What is xargs and when do you use it?
Answer:
xargs is used to take input from one command and pass it as arguments to another. It's especially useful when you need to process output line by line and feed it to commands like rm, mv, cp, or even curl and ssh.
Example:
find . -name "*.log" | xargs rm
This deletes all .log files. You could do the same without xargs, but xargs improves performance by batching the input.
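For filenames containing spaces or newlines, the null-delimited form is safer:
find . -name "*.log" -print0 | xargs -0 rm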
59. How to search for patterns recursively using grep and find?
Answer:
Recursive grep on its own:
grep -rn "pattern" /etc
Or combine find with grep:
find /etc -name "*.conf" -exec grep -l "pattern" {} +
This approach gives more control—filter files with find, then search their
content with grep.
60. Explain piping (|) and redirection (>, >>, 2>) with examples
Piping and redirection are how Linux glues together commands and
handles outputs and errors.
The pipe (|) connects the output of one command to the input of another:
ps aux | grep nginx
Redirection:
● > overwrites a file with stdout
● >> appends stdout to a file
● 2> redirects stderr
Merge stderr into stdout:
command 2>&1
A useful pattern (send everything to one log file):
command > output.log 2>&1
Mastering redirection and pipes means you can script anything, debug
anything, and write efficient command-line workflows.
61. What is the role of the Linux kernel?
The Linux kernel is the core component of the operating system. It acts as
a bridge between the hardware and user applications. Everything that
touches your CPU, memory, disk, or network goes through the kernel.
Without the kernel, your Linux system is just a bunch of files. It's what boots
your machine, handles hardware, and enforces boundaries between user
applications.
Check the running kernel version:
uname -r
62. What are kernel modules and how do you manage them?
Answer:
Kernel modules are pieces of code that can be loaded into the kernel as
needed. They extend the kernel’s functionality without needing to reboot or
recompile.
List loaded modules:
lsmod
To load a module:
sudo modprobe <module_name>
To unload a module:
sudo modprobe -r <module_name>
Show module details:
modinfo <module_name>
See which driver a device uses:
lspci -k
You can also blacklist modules you don’t want the system to load
automatically. To do this, add a line like this in
/etc/modprobe.d/blacklist.conf:
blacklist bluetooth
Kernel modules are essential for customizing behavior without bloating the
base kernel. You only load what you need.
63. How do you troubleshoot hardware issues with dmesg?
Answer:
View the kernel ring buffer:
dmesg | less
Most recent messages:
dmesg | tail
Filter for disk events:
dmesg | grep sd
Typical boot-log contents:
● Kernel version
● CPU initialization
● Memory detection
● Filesystem mounting
● Driver loading
It’s a vital tool for any hardware issue—disks not mounting, network cards
not appearing, or USBs not being detected.
64. How do you update the kernel safely?
Answer:
On Debian/Ubuntu:
sudo apt update && sudo apt upgrade
On RHEL/CentOS:
sudo dnf update kernel
List installed kernels:
rpm -q kernel
sudo reboot
uname -r
If the system fails to boot with the new kernel, you can select an older
kernel from the GRUB menu during boot.
Always test kernel updates in staging or with snapshotting (if using LVM or
Btrfs) before applying in production.
65. How do you compile a custom kernel?
Answer:
Compiling your own kernel gives you control over features, modules, and
performance optimizations. This is usually done in embedded systems,
custom hardware setups, or when you want bleeding-edge features.
1. Download the source:
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.x.y.tar.xz
2. Extract and enter the tree:
tar -xf linux-6.x.y.tar.xz
cd linux-6.x.y
3. Configure:
make menuconfig
4. Compile (using all CPU cores):
make -j$(nproc)
5. Install modules and the kernel:
sudo make modules_install
sudo make install
This places the kernel under /boot, updates GRUB, and prepares the
initramfs.
6. Update bootloader: Usually done automatically, but you can force it:
sudo update-grub
sudo reboot
Compiling a kernel can take time and CPU. Always back up and know how
to revert in case something breaks. This is not common in day-to-day
admin work but highly relevant in kernel development, embedded systems,
and advanced performance tuning.
66. Where are the major log files stored in Linux?
In Linux, most logs live under /var/log. This directory is where the system, kernel, services, and applications write their activity logs. Common locations:
● /var/log/syslog or /var/log/messages – general system activity
● /var/log/auth.log or /var/log/secure – authentication events
● /var/log/kern.log – kernel messages
● /var/log/dmesg – boot-time hardware messages
Understanding what lives where helps you pinpoint issues fast.
To inspect logs:
less /var/log/syslog
tail -f /var/log/auth.log
Also, journal-based systems store logs using systemd’s journal, which can
be queried with journalctl. For example:
journalctl -xe
67. How do you investigate a system crash?
Answer:
Start with:
dmesg | less
This will show kernel-level messages. Look for lines like "Kernel panic" or
OOM kill events.
Check a specific service:
journalctl -u nginx
Then check resources:
● Disk space:
df -h
● RAM/swap:
free -m
● Inodes:
df -i
The key is to correlate the exact time of the crash with relevant logs across
dmesg, journalctl, and /var/log.
68. How does logrotate work and how do you configure it?
Answer:
Linux systems generate logs constantly. Without management, logs can fill
up your disk and create performance issues. logrotate is the tool that
handles rotating, compressing, and removing old log files automatically.
A typical config (e.g., /etc/logrotate.d/nginx):
/var/log/nginx/*.log {
daily
missingok
rotate 7
compress
delaycompress
notifempty
postrotate
systemctl reload nginx > /dev/null 2>&1 || true
endscript
}
Dry-run to debug the config:
logrotate -d /etc/logrotate.conf
Force a rotation:
logrotate -f /etc/logrotate.conf
This keeps your /var/log clean and ensures services don’t fail due to full
disks.
69. How do you set up file auditing rules with auditd?
Answer:
Add a watch rule:
sudo auditctl -w /etc/passwd -p rwa -k passwd-watch
This sets up a watch on /etc/passwd for write (w), attribute (a), and read (r) operations with the tag passwd-watch.
Search matching events:
ausearch -k passwd-watch
You can filter logs by user, PID, event type, or time range.
Logs are written to:
/var/log/audit/audit.log
Persistent rules live in:
/etc/audit/rules.d/audit.rules
70. How do you check and repair a filesystem with fsck?
Answer:
Unmount the filesystem first (never run fsck on a mounted filesystem):
umount /dev/sdb1
Then run:
fsck /dev/sdb1
If it's your root filesystem and you can't boot, you'll need to run fsck from a live USB or rescue environment.
Automated repair:
fsck -y /dev/sdb1
The -y flag answers "yes" to all fix prompts, which is useful in automation or recovery scripts.
In case of journaled filesystems like ext3/ext4, this can recover from most
soft errors. If the disk is physically damaged, consider cloning it with dd first
to avoid data loss.
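The cloning step mentioned above might look like this (device names illustrative; the target must be at least as large as the source):
sudo dd if=/dev/sdb of=/dev/sdc bs=4M conv=noerror,sync status=progress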
71. How is Linux used in cloud platforms like AWS, Azure, and GCP?
Answer:
Linux is the standard OS for virtual machines in cloud platforms like AWS,
Azure, and GCP. Whether you're spinning up EC2 instances on AWS or
Compute Engine VMs on GCP, you're interacting with cloud-optimized
Linux distributions.
Example: launching an instance with the AWS CLI:
aws ec2 run-instances \
--image-id ami-xyz \
--count 1 \
--instance-type t2.micro \
--key-name mykey \
--security-groups my-sg
In practice, once your Linux VM is up, you'll:
– Use ssh to connect
– Install packages via apt or yum
– Configure web servers, databases, monitoring agents, etc.
– Set up firewall rules with cloud-specific tools or iptables
Linux is also the foundation for container hosts, autoscaling groups, and
most cloud-native services. As a DevOps engineer, working in the terminal
on cloud Linux VMs becomes second nature.
72. What is cloud-init and how is it used?
Answer:
cloud-init is a tool that runs during the first boot of a cloud instance. It's
used to initialize settings like hostname, users, SSH keys, and even run
commands or install software.
It pulls metadata from the cloud provider (like AWS or GCP) and executes
based on what's defined in the user-data script.
A sample user-data file (the runcmd entry is illustrative):
#cloud-config
hostname: devserver
users:
  - name: devops
    groups: sudo
    shell: /bin/bash
    ssh-authorized-keys:
      - ssh-rsa AAAA...
packages:
  - nginx
  - git
runcmd:
  - systemctl enable --now nginx
Debug first-boot issues:
cat /var/log/cloud-init.log
73. How is Linux used in CI/CD pipelines and Docker workflows?
Answer:
CI/CD runners execute build steps as shell scripts on Linux, for example:
#!/bin/bash
set -e
npm install
npm test
For containers, the Docker daemon runs as a Linux service:
sudo dockerd
– Run a container:
docker run -d --name web nginx
– List running containers:
docker ps
– Remove one:
docker rm <id>
Linux lets you dig deeper into Docker internals:
– Check container processes via ps aux on the host
– Inspect container network bridges using ip a
– View container mount points using mount, df -h, or lsns
74. What Linux components run on a Kubernetes worker node?
Answer:
A Kubernetes worker node is a Linux machine running:
– kubelet: the agent that communicates with the control plane and runs pods
– containerd or CRI-O: the container runtime
– kube-proxy: handles networking rules
– cni0, flannel.1, or other virtual interfaces for pod networking
– systemd or Docker to manage services like kubelet
If you SSH into a Linux worker node, you can inspect pod processes with:
ps aux | grep kube
Check kubelet logs:
journalctl -u kubelet
Inspect containers with the CRI tool:
crictl ps
crictl info
75. How do you troubleshoot high CPU or memory usage?
Answer:
When your system is slow or laggy, the first signs usually show up as
spikes in CPU or memory. To figure out what’s happening, you start by
identifying which process is responsible.
For CPU:
top
htop
If one process is hogging the CPU, you’ll see it at the top. If it’s a runaway
script or service, you can inspect it further using:
strace -p <pid>
This helps you see what syscalls it’s making. If it’s stuck in a loop or waiting
on I/O, you’ll know.
For memory:
free -m
Shows how much RAM is being used. If swap is high and RAM is full,
you're under memory pressure.
Also, watch out for zombie processes (Z status in ps). If memory leak is
suspected, tools like valgrind, smem, or application-specific profilers
come in handy.
If the issue is regular and not just a one-time thing, consider setting up
alerts via Prometheus, CloudWatch, or another monitoring tool.
76. How do you debug an application crash?
Answer:
When an application crashes, you want to figure out:
– What caused it
– When it happened
– If it can be reproduced
Check the service's logs first:
journalctl -u myapp.service
If it failed with a segmentation fault or core dump, make sure core dumps
are enabled:
ulimit -c unlimited
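On systemd distributions with systemd-coredump, dumps can then be inspected via coredumpctl:
coredumpctl list
coredumpctl info <PID>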
Also check for file descriptor leaks, full disks, permission issues, or bad
configuration files.
If it’s a service that starts and then immediately stops, it might be missing
environment variables, dependencies, or files it expects at runtime.
Always correlate with logs and context: Was there a deployment? A config
change? A spike in traffic?
77. How do you perform Root Cause Analysis (RCA) after an incident?
Answer:
Root Cause Analysis (RCA) is the post-mortem where you identify what
broke, why it broke, and how to prevent it next time.
1. Timeline analysis – When did the issue start? Correlate logs, metrics, and alerts.
2. Immediate cause – What event or change directly triggered the failure?
3. Scope – Was it one host, one service, or the entire system?
4. Impact – Which users, systems, or SLAs were affected?
5. Underlying cause – Why was that change made? Why did the safeguard fail?
Tools that help during RCA:
– journalctl or log aggregation tools (ELK, Loki)
– Monitoring dashboards (Prometheus, Grafana, Datadog)
– Deployment logs from CI/CD
– Command histories (history | grep kube, etc.)
Once you’ve found the cause, document the fix and add a check or alert to
prevent recurrence. That’s how you mature systems.
78. How do you investigate suspicious ports or malware activity?
Answer:
First, to see what ports are open and which processes are using them:
ss -tulnp
Or with netstat:
netstat -tulnp
Now, if you suspect malware, look for strange ports, especially high ones
(like 1337, 6667, etc.), or odd services listening.
Check top resource consumers:
ps aux --sort=-%cpu
List processes with open network connections:
lsof -i
Scan for rootkits:
chkrootkit
rkhunter
You can also use auditd to monitor what processes are being launched,
or run tripwire or AIDE to look for modified system files.
79. How do you find out who accessed or modified a file?
Answer:
If you want to know who accessed or modified a file, you need either
auditing tools or proper logging.
With an auditd watch in place (see the auditctl example earlier):
ausearch -k passwd_watch
You can filter by PID, UID, or time range to see exactly who did what.
Beyond auditd, you can use:
– inotifywait for real-time file change tracking
– ls -ltu to check last access times
– find /etc -amin -10 to find files accessed in the last 10 minutes
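A real-time watch with inotifywait (from the inotify-tools package):
inotifywait -m -e modify,create,delete /etc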
81. How to use rsync, tar, dd, and scp for backups?
These are four of the most versatile tools for backing up and moving data in
Linux.
Start with rsync, the most efficient for syncing files across directories or
remote systems. It copies only the differences between source and
destination.
Basic usage:
rsync -avz /source/ user@remote:/backup/
tar is used to compress and archive files into a single .tar or .tar.gz
file.
Create a backup:
tar -czvf backup.tar.gz /etc
Extract:
tar -xzvf backup.tar.gz
dd works at the block level. It's often used to make complete disk or partition images:
sudo dd if=/dev/sda of=/backup/disk.img bs=4M status=progress
Restore it:
sudo dd if=/backup/disk.img of=/dev/sda bs=4M
scp is simple and secure for copying files between systems over SSH.
Basic use:
scp file.txt user@remote:/path/
To copy a directory:
scp -r mydir/ user@remote:/path/
These tools together give you the flexibility to automate backups, sync
large data sets, create snapshots, or move critical files between hosts.
82. How do you write and schedule a simple backup script?
Answer:
#!/bin/bash
DATE=$(date +%F)
BACKUP_DIR="/backups"
TARGET="/etc"
tar -czf "$BACKUP_DIR/etc-$DATE.tar.gz" "$TARGET"
Make it executable:
chmod +x backup.sh
Schedule it:
crontab -e
0 2 * * * /usr/local/bin/backup.sh
Or, with logging:
0 2 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
83. How do you restore individual files from a snapshot?
Answer:
Mount the snapshot, then copy the file back:
cp /mnt/snap/etc/important.conf /etc/
In the cloud, tools like AWS Backup or EBS snapshots can restore full volumes:
– Create a snapshot
– Restore it to a new volume
– Mount the new volume on another instance
– Copy the file back using scp, rsync, or just plain cp
If you're preparing for full system disaster recovery, a complete disk image
is your best bet. You can make the system bootable again using tools like
dd, Clonezilla, or creating your own recovery ISO.
84. How do you restore a full system from a disk image?
Answer:
To restore, write the image back and reinstall the bootloader:
sudo dd if=/backup/disk.img of=/dev/sda bs=4M
grub-install /dev/sda
update-grub
You can also use tools like Clonezilla to create full disk images and restore
them interactively. It supports compression, encryption, and partition
cloning.
The key is to have both system files and boot records backed up—without
that, restoring won’t bring the system back online.
85. How do you implement offsite backups?
Answer:
Offsite backups are critical for disaster recovery. Fire, hardware failure, or
ransomware can take out your primary storage, but if you’ve got offsite
backups, you're protected.
A simple approach is syncing to remote storage with rsync over SSH or a cloud CLI (e.g., aws s3 sync /backups s3://mybucket/backups).
For even more security:
– Encrypt files using gpg or openssl before uploading
– Use restic or duplicity with encryption enabled
– Store backups in a different region or provider
86. What is the difference between containers and virtual machines?
Both containers and virtual machines are used to isolate workloads, but the
way they achieve that isolation is fundamentally different.
Virtual Machines (VMs) virtualize the hardware. They run full operating
systems on top of a hypervisor like KVM, VMware, or VirtualBox. Each VM
has its own kernel, filesystem, and virtual hardware. This makes them
heavy in terms of resource usage and slower to boot.
Containers, on the other hand, use the host's kernel. They isolate
applications using Linux kernel features like namespaces and cgroups.
Because containers don’t need to boot an entire OS, they start in
milliseconds and use far fewer resources.
Think of it like this:
– A VM is like a full house with plumbing, power, and walls.
– A container is like a room inside a shared apartment with private access but shared infrastructure.
Containers are ideal for microservices and cloud-native apps. VMs are
better when you need full OS-level separation, e.g., for legacy apps,
custom kernels, or multiple OS types on one host.
87. How does chroot work and how is it different from containers?
A minimal example:
mkdir /mychroot
# copy in a shell and the libraries it needs, then:
sudo chroot /mychroot /bin/bash
But here's the thing: chroot only restricts the filesystem. It doesn't isolate network, user IDs, or processes. That's why it's not secure by itself.
88. What are namespaces and cgroups?
Answer:
These are the two core features that make containers possible in Linux.
Namespaces isolate what a process can see: PID, network, mount points, hostname (UTS), IPC, and user IDs. Each container has its own set of these, which means it can't see or interact with the host or other containers.
cgroups control how much a process can use: CPU, memory, block I/O, and more.
Example: Docker uses cgroups to make sure one container can’t consume
all system memory and bring the node down.
Inspect your own cgroup:
cat /proc/self/cgroup
Or use:
systemd-cgls
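You can experiment with namespaces directly via unshare (util-linux); this starts a shell in fresh PID and mount namespaces:
sudo unshare --pid --fork --mount-proc /bin/bash
ps aux   # inside, only this shell and ps are visible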
89. What are Podman and containerd, and how do they compare to Docker?
Answer:
While Docker is the most popular container tool, alternatives like Podman
and containerd have become important in production, especially in
Kubernetes environments.
Podman can be run as a normal user, which improves security, and works
well in rootless containers and CI/CD pipelines.
containerd is the lower-level runtime underneath Docker and most Kubernetes nodes. It ships its own CLIs:
ctr
crictl
90. How do you inspect a running container?
Answer:
List running containers:
docker ps
Or inspect metadata:
docker inspect <id>
You’ll see IP address, mount points, image info, restart policy, and more.
To view logs:
docker logs <id>
See filesystem changes:
docker diff <id>
Shows what files were added or changed since the container started.
Live resource usage:
docker stats
On Kubernetes nodes, the CRI equivalent:
crictl ps
91. How to write an interactive shell script?
An interactive shell script prompts the user for input, processes it, and
responds accordingly. These scripts are great for tools that require user
confirmation or setup, like installers, CLI utilities, or interactive DevOps
scripts.
#!/bin/bash
read -p "Project name: " project
if [ ! -d "$project" ]; then
mkdir "$project"
echo "Created $project"
else
echo "$project already exists"
fi
A menu loop with select:
select option in start stop quit; do
case $option in
start) echo "Starting...";;
stop) echo "Stopping...";;
quit) break;;
esac
done
Interactive scripts are useful for internal tooling or when building helper
utilities for your team.
cron is the go-to scheduler for recurring tasks in Linux. To set it up:
Write the script (e.g., /usr/local/bin/mybackup.sh):
#!/bin/bash
tar -czf /backups/etc-$(date +%F).tar.gz /etc
Make it executable:
chmod +x /usr/local/bin/mybackup.sh
Schedule it:
crontab -e
0 2 * * * /usr/local/bin/mybackup.sh
Redirect output so you can verify runs later:
0 2 * * * /usr/local/bin/mybackup.sh >> /var/log/backup.log 2>&1
Example: loop over hosts (hostnames illustrative):
#!/bin/bash
for server in web1 web2 web3; do
ping -c 1 "$server"
done
Run it:
./check_servers.sh
To make it safer, abort on the first failure:
for server in web1 web2 web3; do
if ! ping -c 1 "$server" > /dev/null; then
echo "$server unreachable"
exit 1
fi
done
Functions break your script into logical blocks. This makes your scripts more maintainable, reusable, and testable.
#!/bin/bash
log() {
echo "[$(date '+%F %T')] $*"
}
backup() {
log "Starting backup"
tar -czf /backups/etc-$(date +%F).tar.gz /etc
}
cleanup() {
log "Removing backups older than 7 days"
find /backups -name "*.tar.gz" -mtime +7 -delete
}
# Main execution
backup
cleanup
Functions can also take arguments:
deploy_app() {
env=$1
log "Deploying to $env"
}
deploy_app staging
It’s a solid pattern for growing from quick scripts to real automation tools.
95. How do you handle errors and logging in shell scripts?
Answer:
Handling errors cleanly is what separates good scripts from fragile ones. You don't want your script to fail silently or continue on an error.
Start with:
set -e
Also add:
set -o pipefail
Add a timestamped logger:
log() {
echo "[$(date '+%F %T')] $*" | tee -a /var/log/myscript.log
}
Solid error handling and logging makes your scripts production-grade and
ready to be scheduled, monitored, and trusted.
96. How would you approach troubleshooting a slow Linux system?
Start with:
uptime
top
free -m
If it’s disk:
iostat -xz 1
Or:
iotop
If it’s network-related:
ss -s
iftop
nload
journalctl -xe
dmesg | tail
97. How do you manage Linux servers at scale?
Answer:
You can't SSH into every box. Managing at scale means treating
infrastructure like code, automating everything, and having centralized
control.
Start with configuration management (Ansible, Puppet, or Salt) so changes are repeatable code. Next, use SSH key management via tools like:
– HashiCorp Vault
– AWS Systems Manager (SSM)
– LDAP or centralized auth (FreeIPA)
Use orchestration and grouping:
– Host tagging (web, db, staging, etc.)
– Inventory management via dynamic sources (EC2, GCP, etc.)
The real trick is visibility and automation — one mistake across 100 boxes
is a disaster if you don't have checks and balances in place.
99. How do you keep your Linux skills sharp and production-ready?
Linux is too vast to know everything at once, but staying sharp is about
consistent exposure and real projects.
If you're always trying to understand, not just use, you'll evolve naturally
from user to engineer to architect.
100. What advice would you give someone new to Linux aiming for DevOps
roles?
Start with the why, not just the commands. Linux powers cloud,
automation, containers, and CI/CD. Every DevOps pipeline sits on a Linux
box somewhere.
Most importantly, don’t just learn Linux — live in it. Use it daily. Break it.
Fix it. That’s how you become production-ready.