Lecture 5
02/03/2023
Linux Rsync and Inodes
CS307: Systems Practicum
Prof. Varun Dutt, IIT Mandi
Rsync (Remote synchronization)
● Rsync: remote sync (or remote synchronization)
● A flexible network-enabled remote and local file synchronization tool.
● It is similar in function and invocation to rdist (rdist -c)
● It minimizes the amount of data copied.
○ How? Because it only moves the portions of files that have changed.
● It is typically used for synchronizing files and directories between two different
systems.
○ For example, when rsync local-file user@remote-host:remote-file is run, rsync uses SSH to connect as user to
remote-host.
Rsync - Understanding the syntax
● To sync the contents of dir1 to dir2 on the same system,
use the -r flag, which stands for “recursive”
● The trailing slash signifies the contents of dir1. Without
the trailing slash, dir1, including the directory, would be
placed within dir2. The outcome would create a
hierarchy like the following: ~/dir2/dir1/[files]
*Please note that there is a trailing slash (/) at the end of
the first argument.
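The effect of the trailing slash can be checked with a small local experiment. This is a minimal sketch, assuming rsync is installed; the scratch directory created by mktemp and the names a.txt, with_slash and without_slash are placeholders chosen for illustration.

```shell
# Scratch area so nothing in the home directory is touched (hypothetical layout).
work=$(mktemp -d)
mkdir -p "$work/dir1" "$work/with_slash" "$work/without_slash"
echo "hello" > "$work/dir1/a.txt"

# Trailing slash: copy the *contents* of dir1 into the destination.
rsync -r "$work/dir1/" "$work/with_slash"

# No trailing slash: copy dir1 itself, giving without_slash/dir1/a.txt.
rsync -r "$work/dir1" "$work/without_slash"

# Show both resulting hierarchies side by side.
ls -R "$work/with_slash" "$work/without_slash"
```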
Rsync - Options
● Another option is to use the -a flag, which is a
combination flag and stands for “archive”. This flag syncs
recursively and preserves symbolic links, special and
device files, modification times, groups, owners, and
permissions. It’s more commonly used than -r and is the
recommended flag to use.
● Another tip is to double-check your arguments before
executing an rsync command. Rsync provides a method
for doing this by passing the -n or --dry-run options. The
-v flag, which means “verbose”, is also necessary to get
the appropriate output.
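A dry run followed by the real sync might look like the sketch below. It assumes rsync is installed and uses a throwaway directory; the file name notes.txt and the variable files_after_dry_run are illustrative only.

```shell
work=$(mktemp -d)
mkdir -p "$work/dir1" "$work/dir2"
echo "draft" > "$work/dir1/notes.txt"

# -a: archive mode, -n: dry run, -v: list what would be transferred.
rsync -anv "$work/dir1/" "$work/dir2"

# The dry run copies nothing, so dir2 is still empty at this point.
files_after_dry_run=$(ls -A "$work/dir2" | wc -l)
echo "files after dry run: $files_after_dry_run"

# Same command without -n performs the actual sync.
rsync -av "$work/dir1/" "$work/dir2"
```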
Rsync - Options
● To compress file data during the transfer, use the -z
flag.
● The -P flag combines the flags --progress and
--partial. The first flag provides a progress bar for
the transfers, and the second flag allows you to resume
interrupted transfers.
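The two flags are usually combined with -a. The sketch below assumes rsync is installed and runs locally on a scratch directory, so compression brings no real benefit here; it only shows the invocation. The 1 MiB file big.bin is a placeholder.

```shell
work=$(mktemp -d)
mkdir -p "$work/dir1" "$work/dir2"
head -c 1048576 /dev/zero > "$work/dir1/big.bin"   # 1 MiB placeholder file

# -a: archive mode, -z: compress data in transit, -P: --progress plus --partial.
rsync -azP "$work/dir1/" "$work/dir2"
```

Over a slow network link, -z reduces the bytes sent, and --partial keeps a partly transferred file so a rerun of the same command can continue from it.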
Rsync - Options
● In order to keep two directories truly in sync, it’s
necessary to delete files from the destination directory if
they are removed from the source. By default, rsync does
not delete anything from the destination directory.
● You can change this behavior with the --delete option.
● If you have specific patterns to include or exclude, you
can use the --include and --exclude flags. Quote the patterns so
that the shell does not expand them before rsync sees them.
$ rsync -a --include='*.txt' --include='*.log' --exclude='*' /source/ /destination/
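Both behaviours can be tried on scratch directories. This is a minimal sketch, assuming rsync is installed; the directory names source, filtered and mirror and all file names are made up for the example.

```shell
work=$(mktemp -d)
mkdir -p "$work/source" "$work/filtered" "$work/mirror"
echo "keep" > "$work/source/report.txt"
echo "log"  > "$work/source/run.log"
echo "skip" > "$work/source/image.png"
echo "old"  > "$work/mirror/stale.txt"      # exists only in the destination

# Copy only *.txt and *.log; the final --exclude='*' drops everything else.
rsync -a --include='*.txt' --include='*.log' --exclude='*' \
    "$work/source/" "$work/filtered/"

# Mirror the source: stale.txt is not in the source, so --delete removes it.
rsync -a --delete "$work/source/" "$work/mirror/"

ls "$work/filtered" "$work/mirror"
```

Because --delete removes files, it is worth previewing such a command with -n first.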
Using Rsync to Sync with a Remote System
● To use rsync to sync with a remote system, you only need
SSH access configured between your local and remote
machines, as well as rsync installed on both systems. Once
you have SSH access verified between the two machines,
you can sync the dir1 folder from the previous section to a
remote machine by using the following syntax. This process
is called a push operation because it “pushes” a directory
from the local system to a remote system.
● The opposite operation is pull, and is used to sync a remote
directory to the local system. If the dir1 directory were on
the remote system instead of your local system, the syntax
would be the following:
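The push and pull forms can be sketched as two shell functions. Everything identifying the remote side is hypothetical: username, remote_host, destination_directory and place_to_sync_on_local_machine are placeholders to replace with real values, and the functions are only defined here, not called, because they need a reachable SSH host.

```shell
# Hypothetical connection details; replace with your own.
REMOTE="username@remote_host"

# Push: copy the local ~/dir1 into destination_directory on the remote host.
push_dir1() {
    rsync -a "$HOME/dir1" "$REMOTE:destination_directory"
}

# Pull: copy dir1 from the remote host into a local directory.
pull_dir1() {
    rsync -a "$REMOTE:/home/username/dir1" "place_to_sync_on_local_machine"
}
```

In both directions the source is the first argument and the destination is the second; only the side carrying the user@host: prefix changes.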
Inode (Index Node)
● A data structure in a Linux filesystem that stores the
metadata of a file or directory.
● Identified by an inode number that is unique within its
filesystem, used by the filesystem to locate and access the file.
● When a file is created, a filename and an inode
number are assigned to it.
[Figure: The relationship between the directory entry, an inode, and the blocks of an allocated file]
Inode (Index Node)
● The inode space is used to track the files stored on the disk.
● Keeping metadata in inodes makes accessing and modifying files faster and more efficient.
● An inode does not contain the file's content, but it holds pointers to the data blocks.
● All inodes are stored in a special on-disk array known as the inode table.
What Information Does Inode Contain?
An inode contains the following information:
● File type (regular file, directory, block special, etc.)
● Permissions (read, write, etc.)
● Owner and group
● File size
● File access, change, and modification times
● File deletion time
● Number of hard links
● Access Control List
What is not in an Inode?
● File Name: There is no file name entry in the inode. The name is stored in a separate directory
entry that maps it to the inode number. The reason for this is to support links to files: several
file names can point to the same inode
● Parent Directory: This is for the same reason file names are not included in the inode.
Multiple directories can have directory entries that point to the same file, so no single
parent directory can be recorded in the inode
● Processes that have the file open: The list of processes would have to be implemented as a
linked list, which leads to bad performance and security issues
● Inodes don’t contain all of the metadata of a file. Some metadata must be stored in another
location to allow specific features to be implemented
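That several names can point to one inode is easy to observe with a hard link. A minimal sketch on a scratch directory; the file names original.txt and alias.txt are illustrative.

```shell
work=$(mktemp -d)
echo "data" > "$work/original.txt"

# A hard link adds a second directory entry for the same inode.
ln "$work/original.txt" "$work/alias.txt"

# Both names are listed with the same inode number in the first column.
ls -i "$work/original.txt" "$work/alias.txt"

inode_original=$(ls -i "$work/original.txt" | awk '{print $1}')
inode_alias=$(ls -i "$work/alias.txt" | awk '{print $1}')
```

Removing one of the two names leaves the data reachable through the other; the inode is freed only when its link count drops to zero and no process has the file open.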
How to see Inode Contents?
An inode contains information such as the owner, permissions, timestamps, and the location of the file's data blocks on
the disk.
We can see the inode contents of a file using the stat command.
Using the ls -l -i command, we can see some of the inode information of files and directories, including the inode number.
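Both commands can be tried on a throwaway file. This sketch assumes GNU coreutils, since the -c format option of stat is GNU-specific; sample.txt and the summary variable are illustrative.

```shell
work=$(mktemp -d)
echo "sample" > "$work/sample.txt"

stat "$work/sample.txt"        # full inode details: size, links, permissions, times
ls -l -i "$work/sample.txt"    # the inode number is the first column

# GNU stat format: inode number, hard-link count, octal permissions, owner, size in bytes.
summary=$(stat -c '%i %h %a %U %s' "$work/sample.txt")
echo "$summary"
```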
Monitoring inode usage
About 1% of the total disk space is allocated to the inode table. A sufficient number of inodes is normally
associated with a file system, but running out of inodes is always a possibility, for example when it holds a
very large number of small files. To monitor this, one can use the df command.
df command - report file system disk space usage
The df --inodes or df -i command is used to check how many inodes are used and how many are still free in the
filesystem.
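A quick check of inode usage might look like this. The sketch assumes a df that supports -i (as GNU df does) and that the output columns are Filesystem, Inodes, IUsed, IFree, IUse%, Mounted on; the variable inode_use is illustrative.

```shell
# Inode totals for every mounted filesystem.
df -i

# Only the filesystem that holds the current directory.
df -i .

# Pull out the IUse% column (fifth field of the first data row).
inode_use=$(df -i . | awk 'NR==2 {print $5}')
echo "inode usage here: $inode_use"
```

Some filesystems do not report inode counts, in which case the column shows a dash instead of a percentage.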
Deleting Special Character filename using Inode
Sometimes, when deleting files whose names contain spaces and/or special characters, the bash shell interprets these
characters as shell syntax instead of as part of the filename. To delete this kind of file we can use its inode number.
Let’s suppose we have to delete a file named ‘>file’.
The rm command fails to delete the file because of the special character in the filename (an unquoted > is treated as
output redirection). The file can be deleted using its inode number: first we get the inode number, and then we delete
the file using the find command.
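The two steps can be sketched as follows, on a scratch directory. It assumes a find that supports -maxdepth and -inum (GNU and BSD find both do); the variable names are illustrative.

```shell
work=$(mktemp -d)
touch "$work/>file"        # quoted, so the shell does not treat > as redirection

# Step 1: look up the inode number (first column of ls -i).
inode=$(ls -i "$work/>file" | awk '{print $1}')
echo "inode: $inode"

# Step 2: let find locate the file by inode number and remove it.
# -maxdepth 1 restricts the search to this one directory.
find "$work" -maxdepth 1 -inum "$inode" -exec rm {} \;
```

Replacing rm with rm -i makes find ask for confirmation before each deletion, which is safer on a real system.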