Linux Magazine - RAID and LVM - Part One
You should first understand what each technology does. RAID and LVM collectively solve three problems associated with
disk storage: speed, reliability, and flexibility. RAID enhances speed and reliability by enabling you to combine partitions
from two or more disks into one virtual partition. Depending on how the space is combined, disk access speed may be
improved, reliability may be improved, or both may be improved. LVM enhances flexibility by enabling creation of
partition-like units that can be more easily resized. Combining these two technologies can be a big help with your disk
problems. Unfortunately, both RAID and LVM require configuration effort, and if you have an existing Linux installation,
you’ll need to jump through some hoops to get everything working.
RAID Basics
On the surface, RAID is the more complex of the two technologies, because several different types of RAID arrays exist,
many of which are supported by the Linux kernel:
* LINEAR (APPEND) MODE. This technique creates a single virtual partition from two or more input partitions. Data isn’t
interleaved or duplicated, so linear mode provides none of RAID’s benefits, aside from the ability to create partitions larger
than your biggest physical hard disk.
* RAID 0 (STRIPING). Level 0, the lowest level of RAID, interleaves data from the constituent partitions. When reading a
large block of (apparently) contiguous data from a RAID 0 array split across two disks, an application reads a few blocks
from disk 1, a few blocks from disk 2, a few blocks from disk 1, and so on. This access pattern results in better overall
throughput than reading everything from one disk, because the interleaved access reduces the impact of disk speed
bottlenecks. RAID 0 provides no data integrity checking, though, and so it doesn’t improve reliability. In fact, if anything it
could be said to degrade reliability, because the failure of either constituent disk destroys the entire array.
* RAID 1 (MIRRORING). This approach creates an exact copy of the data on one disk on one or more additional disks to
improve reliability. If one disk fails, the computer can read the data from the second disk. Unfortunately, RAID 1 degrades
write performance, at least when implemented in the Linux kernel, since the system must write data twice.
* RAID 4/5. This type of RAID attempts to gain the benefits of both RAID 0 and RAID 1. RAID 4/5 stripes data, much as
in RAID 0, but adds checksums, which can be used to regenerate data in the event of a disk failure. RAID 4 stores the
checksums on a single dedicated drive, whereas RAID 5 distributes them across all of the component drives. In either event, RAID 4/5
improves both disk access time and reliability, but at the cost of the need for an extra drive: N+1 drives store the data
that could be stored on N drives without RAID 4/5. Therefore, three drives are the minimum practical configuration for a
RAID 4/5 array.
* RAID 6. RAID 4/5 is great, but what if two drives fail? You lose data. RAID 6 exists to provide still more redundancy,
but at the cost of the need to use yet another drive (N+2 drives are required to obtain the capacity of N drives).
* RAID 10. This level of RAID, like levels 4, 5, and 6, combines the benefits of RAID 0 and RAID 1, but in a different
way. This support is experimental as of the 2.6.14 kernel, so you should avoid it on production systems.
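To make these levels concrete, here is a minimal sketch of creating arrays with the mdadm tool discussed later in this column; the device and array names are hypothetical examples:

    # A RAID 1 mirror built from two partitions on separate disks
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

    # A RAID 5 array across three partitions; one drive's worth of space holds checksums
    mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda2 /dev/sdb2 /dev/sdc2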
In most cases, RAID arrays are constructed from identical or near-identical drives. Although this isn’t strictly required,
using drives of disparate size or speed typically results in lost storage space or degraded performance compared to using a
single fast drive. An exception might be if you want to leave significant space in a non-RAID configuration (say, partitions
devoted to a non-Linux OS on a multi-boot computer); you might then reasonably use drives of different sizes.
RAID is also best implemented using SCSI disks, because SCSI handles transfers to and from multiple devices on the same
chain well. When using ATA disks, RAID arrays should use disks on different physical cables — placing two disks used for
RAID on a single cable means Linux won’t be able to transfer data from two disks simultaneously, resulting in a speed hit.
Some disk controllers advertise that they support RAID directly. In some cases this means that the controller has enough
smarts to handle the disks and present a single virtual disk to the host OS. Such controllers can be worth using,
particularly with RAID 1 or higher, because they can help offload the computational and disk access work of implementing
RAID, thus improving performance. Frequently, though, so-called RAID controllers are nothing but ordinary disk controllers
with a few hooks and software RAID drivers for Windows. Such controllers offer no benefits to Linux.
LVM Basics
LVM’s purpose is to improve partitioning flexibility. LVM begins with one or more partitions, each of which is known as a
physical volume in LVM terminology. These physical volumes are combined into a volume group, which can then
be re-allocated as logical volumes. This process may seem convoluted and pointless, but its advantage is that you can
easily re-allocate the space among your logical volumes.
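As a minimal sketch of that layering, assuming two spare partitions (all partition, group, and volume names here are hypothetical):

    pvcreate /dev/sda5 /dev/sdb5       # mark the partitions as physical volumes
    vgcreate vg0 /dev/sda5 /dev/sdb5   # pool them into a volume group
    lvcreate -L 10G -n usr vg0         # carve out a 10GB logical volume for /usr
    lvcreate -L 20G -n home vg0        # and a 20GB logical volume for /home

The resulting devices appear as /dev/vg0/usr and /dev/vg0/home, and they can hold filesystems just as ordinary partitions do.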
For instance, consider an installation that devotes fixed-size partitions to /usr and /home. This arrangement might work well initially, but what happens if you discover that you need more space for /usr and less space for /home? Resizing the partitions is possible using tools such as GNU Parted (described in the April 2003 “Guru Guidance,” available online at http://www.linux-mag.com/2003-04/guru_01.html), but using such tools is tricky and usually requires rebooting the computer.
LVM is more flexible. It enables you to shrink the /home logical volume and increase the size of the /usr logical volume
without moving the volumes. If the filesystem you use permits it, you can even resize your logical volumes without
unmounting them! Presently ext2fs and ext3fs must be unmounted to be resized; ReiserFS can be resized when mounted
or unmounted; and JFS and XFS must be mounted to be resized. XFS partitions may only be grown, not shrunk.
Experimental tools to enable resizing ext2fs and ext3fs while mounted are under development. If you’ve ever run out of
space on a partition and wanted to resize it quickly, you no doubt see the appeal of this feature.
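For example, growing the hypothetical /usr volume from the earlier sketch by 5GB might look like this (as just noted, an ext2 or ext3 filesystem must be unmounted first):

    umount /usr
    lvextend -L +5G /dev/vg0/usr   # enlarge the logical volume
    resize2fs /dev/vg0/usr         # enlarge the filesystem to fill it
    mount /usr

When shrinking, the order reverses: shrink the filesystem first, then the logical volume, or you’ll truncate data.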
In addition, Linux’s LVM tools provide a feature that’s similar to RAID 0: You can tell the system to interleave accesses
from two or more component partitions, thus improving access speed if those partitions are on separate physical drives.
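In sketch form, lvcreate’s -i (number of stripes) and -I (stripe size, in kilobytes) options request this interleaving; the names below are again hypothetical:

    # Stripe a 20GB logical volume across two of the volume group's physical volumes
    lvcreate -i 2 -I 64 -L 20G -n fastlv vg0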
For maximum flexibility and performance, you can combine RAID and LVM. The most common way to do this is to create a
RAID array and then apply LVM to the RAID array. This approach gives you whichever RAID benefits apply to the RAID
version you use, and gives you LVM’s resizing flexibility. In theory, it’s also possible to apply LVM to your disks and then
create RAID arrays atop the logical volumes; however, this approach lets you resize only the constituent RAID volumes, not the Linux filesystems stored on them. In practice, RAID may not respond well to having its volumes resized.
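A minimal sketch of the recommended LVM-on-RAID arrangement simply treats the RAID device as the physical volume (device names hypothetical):

    mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda2 /dev/sdb2 /dev/sdc2
    pvcreate /dev/md0      # the whole RAID array becomes one physical volume
    vgcreate vg0 /dev/md0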
RAID and LVM both rely on kernel support. (A partial exception is hardware RAID implemented by a RAID controller. You
don’t need Linux’s kernel RAID support to use such hardware, but you probably do need a kernel driver for the RAID
controller itself.) Because of the need for kernel support, you should be cautious about placing the most basic and
fundamental partitions in a RAID array or LVM configuration. In particular, the root (/), /boot, /etc, /root, /bin, /sbin,
/mnt, and /dev directories are best kept out of RAID and LVM. Although these directories can be placed in RAID or LVM
configurations, doing so means that you won’t be able to easily access them from emergency tools that lack the
appropriate support. Booting from a kernel stored in a RAID or LVM array can also be complicated.
If you want to configure a Linux system so that the entire disk is in a RAID array, one tip is to place the root (/) partition,
including all the specified directories, in a RAID 1 partition. Emergency tools can then access the constituent partitions
individually, and you can easily configure a boot loader to point to just one of the relevant kernels in the duplicated
partitions. To truly maximize your RAID experience, configure only the /boot partition as RAID 1 and put everything else in
a higher RAID level.
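For instance, a GRUB (legacy) boot entry for such a system might resemble the following sketch; the kernel version, BIOS disk, and root device here are assumptions for illustration:

    title Linux (reading the kernel from one member of the mirror)
        root (hd0,0)
        kernel /vmlinuz-2.6.14 root=/dev/md0 ro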
Adding RAID and/or LVM to an existing system involves three broad steps:
1. Ensure that your kernel includes RAID and LVM support, recompiling it if necessary.
2. Install the RAID and LVM software packages.
3. Reconfigure your system’s partitions in preparation for using RAID and/or LVM.
The remainder of this month’s column is devoted to the first and second steps. (The third step can actually be performed before the first, but that ordering is most common when you implement RAID or LVM as part of a fresh installation.)
The RAID and LVM kernel options are found under the Device Drivers > Multi-device Support (RAID and LVM) category in
the kernel configuration menus, as shown in Figure One. Activate the “RAID Support” option for RAID and the “Device
Mapper Support” option for LVM. (These options apply to the 2.6.9 and later kernels. Older kernels provided another LVM
option, but this option was used for older Linux LVM packages. The “Device Mapper Support” option is used for the current
LVM2 software.)
In addition to the main “RAID Support” and “Device Mapper Support” options, you should select appropriate sub-options.
Specifically, you must activate support for the RAID level you want to use. Figure One shows RAID 1 being built into the
kernel and RAID 4/5 being built as a module. No sub-options of the “Device Mapper Support” option are required for a
basic LVM configuration, at least as of the 2.6.14 kernel, but you might want to read the descriptions of the sub-options in
case one of the features appeals to you.
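In .config terms, the settings shown in Figure One correspond roughly to the following; these option names are taken from 2.6-era kernels and may vary slightly between versions:

    CONFIG_MD=y           # Multi-device Support (RAID and LVM)
    CONFIG_BLK_DEV_MD=y   # RAID Support
    CONFIG_MD_RAID1=y     # RAID 1 built into the kernel
    CONFIG_MD_RAID5=m     # RAID 4/5 built as a module
    CONFIG_BLK_DEV_DM=y   # Device Mapper Support (used by LVM2)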
It’s recommended to compile the RAID and LVM kernel options into your main kernel file. This will simplify the boot
procedure, particularly if you decide to place your root (/) filesystem or other critical directories on a RAID or LVM device.
If you compile these options as kernel modules instead, you must be able to load the modules before you’ll be able to
access RAID or LVM devices, which means the modules (typically stored in /lib/modules/) must be on a conventional
partition or you must use a boot-time RAM disk.
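If you do go the module route, a boot-time RAM disk that preloads the modules can be built with a command such as the following on a Red Hat-style system; mkinitrd options and the image name vary by distribution, so treat this as a sketch:

    mkinitrd --with=raid1 --with=dm-mod /boot/initrd-2.6.14.img 2.6.14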
Many distributions ship with RAID and LVM support in their stock kernels, or at least available as kernel modules. Thus,
you may not need to recompile your kernel to add this support. Check your kernel’s configuration to be sure it’s present,
though. For RAID, you can type cat /proc/mdstat. If the file exists, RAID support exists in your kernel and the file
contains information on the RAID levels that are available to you.
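On a kernel with RAID 1 and RAID 4/5 support loaded but no arrays yet defined, the output typically resembles:

    $ cat /proc/mdstat
    Personalities : [raid1] [raid5]
    unused devices: <none>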
Once you’ve built your new kernel, you can install it, add it to your boot loader, and reboot your computer to test it. If
you’re modifying a working kernel, you shouldn’t have any problems with the new kernel. If it won’t boot, reboot using
your old kernel and try again.
After you reboot with the new kernel, you should install the RAID and LVM packages. In most cases, the RAID tools ship in
a package called mdadm, while the LVM2 software ships in a package called lvm2. (Older RAID and LVM packages —
raidtools and lvm-user — are also available, but are not described here.) If you can’t find these packages on your
distribution media or if you prefer to go to the original sites, check http://www.cse.unsw.edu.au/~neilb/source/mdadm/
for mdadm or http://sources.redhat.com/lvm2/ for lvm2.
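On most distributions, the package tool can fetch both in one step; for example, on a Debian-style system:

    apt-get install mdadm lvm2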
With the software installed, you can begin reconfiguring your partitions. This task can be tedious, particularly if you want
to reconfigure a working system without adding new hardware. Be sure to back up your data before proceeding!
Reconfiguring existing hard disks requires deleting one or more of their partitions, creating new partitions, and restoring
data. This process won’t be complete until you’ve configured a working RAID and/or LVM system, as described in the next
two months, so don’t jump into this process until you’ve read the relevant future columns. If you intend to move your
system to new hard disk(s), you can begin preparing them now and complete the transition later.
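When you create the new partitions, remember to set their type codes, which fdisk’s t command handles (the device name below is hypothetical):

    fdisk /dev/sdb
    # Within fdisk: create each partition with "n", then use "t" to set its type:
    # "fd" (Linux raid autodetect) for RAID members, or "8e" (Linux LVM) for
    # partitions used directly as physical volumes. Finish with "w" to save.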
The configuration described here is suitable for an LVM-on-RAID system: It uses low-level RAID partitions to store LVM
logical volumes. As a safeguard and to permit minimal access to the system in the event of a problem with the RAID or
LVM configuration, the root (/) filesystem is stored in a conventional partition, while the bulk of the data in /usr and /home
is stored in RAID/LVM volumes. The basic partition layout looks like Figure Two.
The list of partitions in Figure Two omits some non-Linux partitions for simplicity’s sake, but it demonstrates the fact that a
Linux RAID/LVM configuration can coexist with other operating systems — DOS, Windows, and FreeBSD partitions exist on
these disks along with the Linux partitions. Also, both / and /boot exist as standard partitions, although both are small in
size. They could be moved within the RAID/LVM configuration, but at the cost of greater complexity, particularly for /boot.
One swap partition exists outside of the RAID/LVM configuration and (as described in subsequent columns) another exists
within it, although this RAID/LVM swap space isn’t apparent in the partition list. This setup has no particular advantage,
unless perhaps you want swap space for a small emergency Linux system without RAID/LVM support. Indeed, the example
system is configured as such just to illustrate that swap space can exist in or out of a RAID/LVM system.
Next Month
Next month looks at the RAID side of the configuration in more detail. If you want to implement a RAID system only (with
no LVM features), you should be able to do so after reading next month’s column. If you want a complete RAID/LVM
configuration, though, you’ll have to wait for the next two months’ columns to finish the job.
Roderick W. Smith is the author or co-author of over a dozen books, including Linux in a Windows World and Linux
Power Tools. He can be reached at [email protected].