Electronic Design

An Introduction To RAID

Low-cost, high-performance, 3.5-in. hard-disk drives dominate the landscape, but storage and reliability requirements for high-end enterprise systems exceed what a single drive can offer. That's why redundant-array-of-inexpensive-disks (RAID) systems are essentially synonymous with high-end systems.

The dropping cost and rising capacity of high-end 3.5-in. drives, plus the falling cost of RAID controllers, have brought RAID to low-end servers. Motherboards with integrated SCSI and ATA RAID controllers are readily available.

RAID controllers support one or more standard RAID configurations specified as RAID 0 through RAID 6. Proprietary or experimental configurations have been given designations above RAID 6, and some configurations like RAID 10 are a combination of RAID 1 and RAID 0.

All of the configurations except RAID 0 provide some form of redundancy. RAID controllers with redundancy often support hot-swappable drives, allowing the removal and replacement of a failed drive. Typically, information on a replacement drive must be rebuilt from information on the other drives before the system will be redundant and able to handle another drive failure.

Let's look at how a RAID system works (see the figure). The letters A through I indicate where data is placed on a disk along with the parity and error-correction code (ECC) support. The diagrams show the minimum disk configuration for each approach, with the exception of RAID 0, which can operate with only two drives.

Also known as striping, RAID 0 uses any number of drives. Data is written sequentially by sector or block across the drives. This provides high read and write performance because many operations can be performed simultaneously, even when a sequential block of information is being processed. The downside is the lack of redundancy. Striping is commonly combined with other RAID architectures to provide high performance and redundancy.
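
The block-by-block placement described above can be sketched in a few lines. This is a minimal illustration, assuming one block per disk per stripe and a simple round-robin order; the function name stripe_location is hypothetical and not taken from any controller.

```python
def stripe_location(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes round-robin striping with one block per disk per stripe.
    """
    if num_disks < 1:
        raise ValueError("need at least one disk")
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    disk = logical_block % num_disks      # which drive holds the block
    offset = logical_block // num_disks   # which stripe (row) on that drive
    return disk, offset


# Six consecutive blocks spread over three drives.
layout = [stripe_location(block, 3) for block in range(6)]
```

Consecutive blocks land on different drives, which is why a sequential transfer can keep every drive busy at once.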

Called mirroring or duplexing, RAID 1 has 100% overhead, requiring two drives to store information normally stored on only one. Although writing speed is the same as with a single drive, it's possible to have twice the read transfer rate because information is duplicated and the drives can be accessed independently. Also, there's no loss of write performance when a drive is lost and no rebuild delay as with other RAID architectures. Note that RAID 1 is the most expensive approach when it comes to disk usage.
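
A toy model of mirroring makes the trade-off concrete. The sketch below assumes an in-memory list stands in for each drive; the class name Mirror and its methods are hypothetical, chosen only for illustration.

```python
class Mirror:
    """Toy two-way mirror: every write goes to all healthy drives,
    and reads alternate between healthy drives."""

    def __init__(self, num_blocks: int, copies: int = 2):
        self.drives = [[None] * num_blocks for _ in range(copies)]
        self.failed = set()
        self._next = 0  # round-robin counter for reads

    def fail(self, drive: int) -> None:
        """Mark a drive as failed."""
        self.failed.add(drive)

    def write(self, block: int, data: bytes) -> None:
        # Each healthy copy must be written, so writes gain no speed.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def read(self, block: int) -> bytes:
        healthy = [i for i in range(len(self.drives)) if i not in self.failed]
        if not healthy:
            raise IOError("all mirror copies have failed")
        # Alternating reads is what allows up to twice the read rate.
        choice = healthy[self._next % len(healthy)]
        self._next += 1
        return self.drives[choice][block]


mirror = Mirror(num_blocks=4)
mirror.write(0, b"A")
mirror.fail(0)              # lose one drive
survivor = mirror.read(0)   # data is still available from the other copy
```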

The RAID 2 configuration employs Hamming Code ECC to provide redundancy. It has a simple controller design, but no commercial implementations exist, partly due to the high ratio of ECC disks to data disks.
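
The high ratio of ECC disks to data disks follows from how a Hamming code is sized. As a rough sketch, assuming a single-error-correcting Hamming code with one bit per disk, the number of check disks r must satisfy 2^r >= m + r + 1 for m data disks. The function name below is hypothetical.

```python
def hamming_check_disks(data_disks: int) -> int:
    """Smallest r with 2**r >= data_disks + r + 1 (single-error-correcting
    Hamming code, assuming one bit of each word is stored per disk)."""
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    r = 0
    while 2 ** r < data_disks + r + 1:
        r += 1
    return r


# ECC disks needed for a few array sizes.
ratios = {m: hamming_check_disks(m) for m in (4, 10, 32)}
```

Under this assumption, four data disks need three ECC disks, while a single parity disk is enough for the parity-based configurations.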

Striped data with the addition of a parity disk is used by the RAID 3 configuration. It has a high read/write transfer rate, but controller design is complex and difficult to implement as software RAID. Plus, the transaction rate of the system is the same as that of a single disk drive.
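
Parity in these configurations is conventionally the bytewise XOR of the data blocks in a stripe. The following is a minimal sketch of that calculation, assuming equal-length blocks; the helper name xor_parity is illustrative only.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the bytewise XOR of equal-length data blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data blocks and the parity block that would go on the parity disk.
stripe = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = xor_parity(stripe)
```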

RAID 4 utilizes independent disks with shared parity. This configuration has a very high read transaction and aggregate read rate. Furthermore, it has a low ratio of ECC disks to data disks. Unfortunately, it has a poor write transaction rate and a complex controller design. Most implementations prefer RAID 3 over RAID 4.
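
The poor write transaction rate comes from the parity update that accompanies every small write. A common way to do it, sketched here under the assumption of XOR parity and a read-modify-write cycle, is to read the old data and old parity, then write the new data and new parity. Every such write touches the one shared parity disk. The function name update_parity is hypothetical.

```python
def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Compute new parity after one data block changes, without
    reading the other data disks: P' = P xor D_old xor D_new."""
    if not (len(old_parity) == len(old_data) == len(new_data)):
        raise ValueError("blocks must have the same length")
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# One stripe of three single-byte data blocks; parity is their XOR.
data = [b"\x0f", b"\xf0", b"\x33"]
parity = bytes([data[0][0] ^ data[1][0] ^ data[2][0]])

# Overwrite the middle block and update parity incrementally.
new_block = b"\xaa"
new_parity = update_parity(parity, data[1], new_block)
```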

The RAID 5 configuration is a set of independent disks with distributed parity. It is similar to RAID 4, except that the parity information is spread across all of the disks instead of being concentrated on a single disk. RAID 5 has high read performance with a medium write transaction rate. It has a good aggregate transfer rate, making it one of the most common approaches to RAID, even with a software RAID solution.
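
One way to picture distributed parity is to rotate the parity block to a different disk on each stripe. The sketch below assumes a simple rotation that starts on the last disk and moves left; real controllers use various layouts, and the function name is hypothetical.

```python
def raid5_stripe_layout(stripe: int, num_disks: int) -> list[str]:
    """Return the contents of one stripe, e.g. ['D0', 'D1', 'P'].

    Assumes parity rotates from the last disk toward the first,
    one position per stripe (an illustrative layout, not a standard).
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    if stripe < 0:
        raise ValueError("stripe must be non-negative")
    parity_disk = (num_disks - 1) - (stripe % num_disks)
    data_index = stripe * (num_disks - 1)  # first data block in this stripe
    row = []
    for disk in range(num_disks):
        if disk == parity_disk:
            row.append("P")
        else:
            row.append(f"D{data_index}")
            data_index += 1
    return row


# First three stripes on a three-disk array.
rows = [raid5_stripe_layout(s, 3) for s in range(3)]
```

Because no single disk holds all the parity, parity updates from independent writes are spread over every drive.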

This configuration, however, has one of the most complex controller designs, and the individual block transfer rate is the same as that of a single disk. Rebuilding a replacement drive is difficult and time-consuming. Some controllers, however, perform the rebuild in the background, with degraded application throughput until rebuilding has been completed.
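
With XOR parity, rebuilding a lost block amounts to XORing the corresponding blocks from all surviving drives, parity included. The sketch below assumes equal-length blocks and a single failed drive; the name rebuild_block is illustrative.

```python
def rebuild_block(surviving: list[bytes]) -> bytes:
    """Recover the missing block of a stripe by XORing every surviving
    block (data and parity) at the same stripe position."""
    if not surviving:
        raise ValueError("need at least one surviving block")
    size = len(surviving[0])
    if any(len(block) != size for block in surviving):
        raise ValueError("all blocks must have the same length")
    result = bytearray(size)
    for block in surviving:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Two data blocks plus parity; then the drive holding d1 fails.
d0, d1 = b"RA", b"ID"
parity = bytes(a ^ b for a, b in zip(d0, d1))
recovered = rebuild_block([d0, parity])
```

Every block on the replacement drive needs this treatment, and each one requires a read from every other drive, which is why a rebuild takes so long and slows the application while it runs.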

Essentially, RAID 6 is RAID 5 with a second, independent set of parity information, at the cost of one more drive's worth of capacity. The two independent parity sets provide high fault tolerance, allowing the array to survive two simultaneous drive failures. Controller design is very complex, even compared to RAID 5. Furthermore, RAID 6 has very poor write performance because two parity changes must be made for each write.
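
The write penalty can be estimated by counting disk operations for one small write. This is a rough sketch, assuming a read-modify-write cycle with no caching: the controller reads the old data and each old parity block, then writes the new data and each new parity block. The function name is hypothetical.

```python
def small_write_ios(parity_blocks_per_stripe: int) -> int:
    """Disk operations for one small write under read-modify-write,
    assuming no cache absorbs any of the reads or writes."""
    if parity_blocks_per_stripe < 0:
        raise ValueError("parity block count must be non-negative")
    reads = 1 + parity_blocks_per_stripe    # old data + each old parity
    writes = 1 + parity_blocks_per_stripe   # new data + each new parity
    return reads + writes


raid5_cost = small_write_ios(1)  # single parity
raid6_cost = small_write_ios(2)  # two independent parities
```

Under these assumptions, a small write costs four disk operations with one parity block and six with two.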

Many RAID controllers place disks on independent channels to allow simultaneous access to multiple drives. Likewise, large memory caches significantly improve controller performance.
