Friday, September 14, 2012

RAID in Linux

From: NextStep4IT


RAID is an acronym for Redundant Array of Independent Disks. It is a way of combining the storage available across multiple disks and presenting it to users as a single, unified virtual device.
RAID can be used to provide:

  • data integrity
  • fault tolerance
  • improved performance
  • greater storage capacity

Hard disks are mechanical devices with moving parts, and unfortunately they tend to fail over time. There are also physical limits to the speed at which data can be read from or written to a disk. RAID helps mitigate both problems: it protects the data stored on hard disks and improves disk performance by writing the data to multiple physical locations according to several different schemes, known as "RAID levels". Furthermore, RAID can be provided either by dedicated, specialized hardware or by the operating system in a software layer.



Hardware RAID vs. software RAID?

Hardware RAID solutions operate as dedicated devices, usually as PCI expansion cards or as controllers built into the motherboard. The independent disks attach to the hardware interface. In a true hardware RAID, the operating system simply writes data to the RAID controller, which handles the multiple reads and writes to the associated disks. Other so-called hardware RAIDs rely on special operating system drivers; in practice these act more like software RAIDs. With current technology, hardware RAID configurations are generally chosen for very large RAIDs.

Additionally, some operating systems, including Linux®, provide RAID functionality within a software layer. RAID partitions are logically combined, and a single virtual device appears to higher layers of the operating system in place of the multiple constituent devices. This solution is often a high-performance and inexpensive alternative for RAID users.

 

RAID levels

There are many RAID levels, and it is not practical to list them all here. This section covers the most common and most important RAID types, all of which are fully supported by Linux.

 

RAID0 (Striping)

 

This level is achieved by grouping 2 or more hard disks into a single unit with the total size equaling that of all disks used.

Practical example: 3 disks, each 80GB in size can be used in a 240GB RAID0 configuration.

RAID0 works by breaking data into fragments and writing them to all disks simultaneously. This significantly improves read and write performance.

On the other hand, no single disk contains the entire information for any bit of data committed. This means that if one of the disks fails, the entire RAID is rendered inoperable, with unrecoverable loss of data.
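The striping idea can be illustrated with a small Python sketch. It is only a toy model: the function names `stripe` and `unstripe` and the 4-byte chunk size are assumptions made for illustration, and Python lists stand in for disks. It is not how the Linux kernel implements RAID0.

```python
def stripe(data, n_disks, chunk_size=4):
    """Split data into fixed-size chunks and deal them round-robin across disks."""
    disks = [[] for _ in range(n_disks)]
    for start in range(0, len(data), chunk_size):
        chunk_index = start // chunk_size
        disks[chunk_index % n_disks].append(data[start:start + chunk_size])
    return disks


def unstripe(disks):
    """Reassemble the data by reading chunks back in round-robin order."""
    chunks = []
    rows = max(len(disk) for disk in disks)
    for row in range(rows):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


data = b"ABCDEFGHIJKLMNOPQRSTUVWX"
disks = stripe(data, n_disks=3)
restored = unstripe(disks)
```

In this model every disk holds only every third chunk, so dropping any one of the three lists leaves gaps throughout the data that the remaining two cannot fill.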

RAID0 is suitable for non-critical operations that require good performance, like the system partition or the /tmp partition where lots of temporary data is constantly written. It is not suitable for storing data that you cannot afford to lose.

Usable space in RAID level 0 = (smallest disk) * (no. of disks)

RAID1 (Mirroring)

 

This level is achieved by grouping 2 or more hard disks into a single unit with a total size equal to that of the smallest disk used.

 

This is because RAID1 replicates every bit of data on each of its devices in exactly the same fashion, creating identical clones. Hence the name, mirroring. Practical example: 2 disks, each 80GB in size, can be used in an 80GB RAID1 configuration.

 

Usable space in RAID level 1 = (smallest disk)

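On a side note, in logical terms the availability of RAID0 is an AND function (the array works only if every disk works), whereas RAID1 is an OR (the array works as long as any one disk works). Because of its configuration, RAID1 reduces write performance, as every chunk of data has to be written n times, once on each of the mirrored devices. Read performance for a single sequential stream is about the same as that of a single disk, although concurrent reads can be spread across the mirrors. Redundancy is improved, as normal operation of the system can be maintained as long as any one disk is functional.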
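One way to read the AND/OR comparison is in terms of which disks must be healthy for the array to keep working: a striped array needs every member, while a mirror needs at least one. A minimal sketch, with hypothetical function names and a list of booleans standing in for disk health:

```python
def raid0_available(disk_ok):
    """A striped array works only if ALL member disks work (logical AND)."""
    return all(disk_ok)


def raid1_available(disk_ok):
    """A mirrored array works if ANY member disk works (logical OR)."""
    return any(disk_ok)


# One healthy disk and one failed disk
health = [True, False]
raid0_state = raid0_available(health)
raid1_state = raid1_available(health)
```

With one of the two disks failed, the striped array is down while the mirror keeps running.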

RAID 1 is suitable for data storage, especially with non-intensive I/O tasks.

 

 

RAID5 

 

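This is a more complex solution, with a minimum of three devices used. Data is striped across the devices as in RAID0, but in every stripe one chunk holds parity information computed from the other chunks. The parity chunks are distributed across all devices rather than kept on one dedicated parity disk (a dedicated parity disk is the RAID4 layout). If one of the devices malfunctions, the array will continue operating, reconstructing the missing data from the remaining data and parity. The failure will be transparent to the user, save for the reduced performance.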

RAID 5 improves read performance as well as redundancy, and is useful in mission-critical scenarios where both good throughput and data integrity are important. Writes are slower than in RAID0 because parity has to be calculated and updated with every write, which also induces a slight CPU penalty.
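Parity in RAID 5 is conventionally a byte-wise XOR of the data chunks in a stripe, which is what makes rebuilding a lost chunk possible. The sketch below shows the arithmetic for a single stripe only. The helper name `xor_blocks` and the sample data are made up for illustration, and a real array also has to decide which device holds the parity for each stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data chunks plus one parity chunk
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the second chunk, then rebuild it from the survivors and parity
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, XOR-ing the surviving chunks with the parity chunk yields exactly the chunk that was lost. The same calculation cannot recover two lost chunks, which is why RAID 5 tolerates only a single device failure.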

Usable space in RAID level 5 = (smallest disk) * (no. of disks - 1)
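The usable-space formulas given for RAID0, RAID1 and RAID5 can be collected into one small helper. This is only a sketch of the arithmetic; the function name `usable_space` and its parameters are assumptions, and sizes are plain numbers in GB.

```python
def usable_space(level, disk_sizes_gb):
    """Usable capacity in GB for RAID level 0, 1 or 5, per the formulas above."""
    smallest = min(disk_sizes_gb)
    count = len(disk_sizes_gb)
    if level == 0:
        return smallest * count
    if level == 1:
        return smallest
    if level == 5:
        if count < 3:
            raise ValueError("RAID5 needs at least three disks")
        return smallest * (count - 1)
    raise ValueError("unsupported RAID level: %r" % (level,))


raid0_gb = usable_space(0, [80, 80, 80])
raid1_gb = usable_space(1, [80, 80])
raid5_gb = usable_space(5, [80, 80, 80])
```

For the 80GB disks used in the examples, this gives 240GB for a three-disk RAID0, 80GB for a two-disk RAID1 and 160GB for a three-disk RAID5.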

 

Linear RAID

 

This is a less common level, although fully usable. Linear is similar to RAID0, except that data is written sequentially rather than in parallel. Linear RAID is a simple grouping of several devices into a larger volume, the total size of which is the sum of all members. For instance, three disks of 40, 60 and 250GB can be grouped into a linear RAID with a total size of 350GB.

 

Linear RAID provides no read/write performance improvement, nor does it provide redundancy; the loss of any member will render the entire array unusable. It merely increases size. It's very similar to LVM. Linear RAID is suitable when a data set larger than any individual disk or partition must be stored.
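To make the sequential layout concrete, the following sketch maps a logical offset in a linear array to the member disk that holds it, using the 40, 60 and 250GB example. The function name `locate` and the use of whole-GB offsets are simplifications chosen for illustration.

```python
def locate(offset_gb, member_sizes_gb):
    """Return (member_index, offset_within_member) for a logical offset."""
    if offset_gb < 0:
        raise ValueError("offset must not be negative")
    start = 0
    for index, size in enumerate(member_sizes_gb):
        if offset_gb < start + size:
            return index, offset_gb - start
        start += size
    raise ValueError("offset is beyond the end of the array")


members = [40, 60, 250]
total_gb = sum(members)
first = locate(0, members)
middle = locate(45, members)
last = locate(349, members)
```

The first disk fills up completely before the second one is touched, so a single request is normally served by one disk at a time, which is why linear mode brings no speed-up.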


 

Nested RAID Levels

 

RAID0+1

 

Example of RAID Level 0+1

 

RAID 0+1 (also called RAID01) is a RAID level used for both replicating and sharing data among disks. The minimum number of disks required to implement this level of RAID is 3 (first, even numbered chunks on all disks are built, like in RAID0, and then every odd chunk number is mirrored with the next higher even neighbour), but it is more common to use a minimum of 4 disks.

 

The difference between RAID0+1 and RAID1+0 is the order in which the two RAID levels are nested. RAID0+1 is a mirror of stripes, although some manufacturers (e.g. Digital/Compaq/HP) use RAID0+1 to describe striped mirrors. Because of this ambiguity the older terms are now deprecated, and both RAID0+1 and RAID1+0 are replaced by RAID10, whose definition describes the correct and safe layout, i.e. striped mirrors.

RAID1 + 0

RAID1+0, sometimes called RAID1&0 or RAID10, is similar to RAID0+1 with the exception that the order of the RAID levels is reversed: RAID10 is a stripe of mirrors.
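The practical difference between the two nestings shows up when two disks fail. The sketch below counts, for a four-disk array, how many of the possible two-disk failures each layout survives. It is a simplified model under stated assumptions: disks 0 and 1 form one group and disks 2 and 3 the other, and in the mirror-of-stripes case a stripe set with a failed member is treated as lost entirely. The function names are hypothetical.

```python
from itertools import combinations

GROUPS = [(0, 1), (2, 3)]


def raid10_survives(failed):
    """Stripe of mirrors: every mirror pair must keep at least one disk."""
    return all(any(disk not in failed for disk in pair) for pair in GROUPS)


def raid01_survives(failed):
    """Mirror of stripes: at least one stripe set must be fully intact."""
    return any(all(disk not in failed for disk in group) for group in GROUPS)


two_disk_failures = [set(pair) for pair in combinations(range(4), 2)]
raid10_ok = sum(raid10_survives(failed) for failed in two_disk_failures)
raid01_ok = sum(raid01_survives(failed) for failed in two_disk_failures)
```

Under these assumptions the stripe of mirrors survives 4 of the 6 possible two-disk failures, while the mirror of stripes survives only 2.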

