
What is RAID technology and configuration overview

What is RAID technology

RAID (originally Redundant Array of Inexpensive Disks, today commonly referred to as Redundant Array of Independent Disks) is a data storage technology that combines multiple physical disks into a single logical unit for the purposes of data redundancy, performance improvement, or both.

Data distribution across the disks is standardized into RAID levels. Which level you choose depends on the required level of redundancy, the required level of performance, and the amount of money you can spend. In this article you’ll learn about the basic RAID configurations (and two nested ones), with an overview of each RAID level and its key points.

Before we begin, here is the legend of the coloring and notation used in this article: blocks (blocks on disk), ECC (Hamming error correction code), bytes (8 bits), bits (binary digits), and parity info:

RAID configurations legend

RAID technology was invented by David Patterson, Garth Gibson, and Randy Katz in 1987 (University of California, Berkeley). It was designed as a cheap alternative to the DASD technology that was then widespread on mainframe computers.

What is RAID level 0

RAID level 0 requires a minimum of 2 disks. Data is striped across the blocks on these two disks in the following way: block A is located on disk 1, block B on disk 2, block C on disk 1 again, and so on:

RAID level 0 configuration diagram

Key points:

  • excellent performance, as blocks are striped,
  • no redundancy, so use this configuration only when performance is required and losing the stored data is acceptable,
  • failure of one disk causes the loss of the entire RAID 0 volume,
  • the capacity of a RAID 0 volume is the sum of the capacities of the disks in the set (assuming all disks are the same size).
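
The striping rule above can be sketched in a few lines of Python. This is only an illustration of the block-to-disk mapping described in the text; the function name `raid0_locate` and the block labels are made up for this example and do not come from any RAID controller.

```python
def raid0_locate(block_index, num_disks):
    """Return (disk_number, stripe_row) for a logical block.

    Disks are numbered from 1, as in the article; rows are numbered from 0.
    """
    disk = block_index % num_disks + 1
    row = block_index // num_disks
    return disk, row


# Blocks A..F striped over two disks: A -> disk 1, B -> disk 2, C -> disk 1, ...
layout = {name: raid0_locate(i, 2) for i, name in enumerate("ABCDEF")}
for name, (disk, row) in layout.items():
    print(f"block {name}: disk {disk}, row {row}")
```

Because consecutive blocks land on different disks, a large sequential transfer keeps both disks busy at the same time, which is where the performance gain comes from.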

What is RAID level 1

RAID level 1 also requires a minimum of 2 disks. All blocks from disk 1 are mirrored to disk 2:

RAID level 1 configuration diagram

Key points:

  • good performance (reads can be served by either disk, while every write has to go to both),
  • excellent redundancy; data is written identically to two or more disks, so if one disk fails, all data is still available on the other disk,
  • the array can only be as big as the smallest member disk,
  • this configuration is good for systems where storage reliability is of the utmost importance.
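
To make the mirroring behavior concrete, here is a toy model in Python. It is a sketch under simple assumptions (disks are modeled as dictionaries, and the class name `Raid1` is hypothetical); a real controller works on raw blocks and also has to resynchronize a replaced disk.

```python
class Raid1:
    """Toy mirror: every write goes to all working disks, reads use any survivor."""

    def __init__(self, num_disks=2):
        self.disks = [{} for _ in range(num_disks)]  # block name -> data
        self.failed = set()

    def write(self, block, data):
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def fail(self, index):
        """Simulate a disk failure: its contents are gone."""
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, block):
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise IOError("all mirrored disks have failed")


array = Raid1()
array.write("A", b"payload")
array.fail(0)            # disk 1 dies
print(array.read("A"))   # still served from the surviving mirror
```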

What is RAID level 2

RAID level 2 requires 2 groups of disks: data disks and ECC disks; this configuration consists of bit-level striping (not block-level striping) with dedicated Hamming-code parity:

RAID 2 configuration diagram

Key points:

  • Hamming error correction code (ECC) is used, and that information is stored in the redundancy disks. ECC code is calculated “on the fly”, when data is written to the disks,
  • when reading data from the disks, the corresponding ECC code from the redundancy disks is also read, in order to check whether there are any errors (whether the data is consistent). Possible corrections are also made “on the fly”,
  • groups don’t have to contain an equal number of disks; for example, you can set a configuration where 10 data disks and 4 ECC disks are engaged. Or a configuration where 4 data disks and 3 ECC disks are engaged,
  • although this configuration has its significance in theory, it is not used in practice because it is expensive (a lot of disks are used), and the RAID controller implementation is complex. Also, ECC doesn’t have any significant advantage over parity.
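
The “4 data disks and 3 ECC disks” example corresponds to the classic Hamming(7,4) code: 4 data bits are protected by 3 check bits. The Python sketch below assumes that each of the 7 bit positions lives on its own disk; this is an illustration of the code itself, not of how any particular RAID 2 controller was built, and the function names are made up for this example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into 7 bits laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(code_bits):
    """Return (data_bits, error_position); position 0 means no error found."""
    c = list(code_bits)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # the syndrome points at the bad bit
    if position:
        c[position - 1] ^= 1          # flip it back "on the fly"
    return [c[2], c[4], c[5], c[6]], position


stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1                        # one "disk" returns a wrong bit
print(hamming74_correct(stored))
```

Any single flipped bit among the 7 is located and corrected, which is the property the ECC disks provide.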

What is RAID level 3

RAID level 3 consists of data disks plus one disk dedicated to parity; it uses byte-level striping (not block-level or bit-level striping). Look at the diagram of RAID 3 below:

RAID 3 configuration diagram

Key points:

  • the spindle rotation of all disks has to be synchronized, and data is striped in such a way that each sequential byte is located on a different disk,
  • in order to process an R/W request, all disks are engaged (and spin in sync),
  • only one disk for parity is required, no matter how big the array is. Instead of calculating ECC, only parity is calculated across corresponding bytes and stored on a dedicated parity disk (if one disk fails, data can be reconstructed by using parity disk combined with all other data disks),
  • although implementations exist, RAID 3 is not commonly used in practice (because it generally cannot service multiple R/W requests simultaneously).
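
Byte-level striping with a dedicated parity disk can be sketched as follows. The parity here is a plain XOR across corresponding bytes, which is the usual meaning of parity in this context; the function names and the zero-padding of the last stripe are assumptions made for the example.

```python
from functools import reduce


def xor_columns(chunks):
    """XOR corresponding bytes of equally long byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def raid3_write(data, num_data_disks):
    """Stripe data byte by byte and compute the parity disk's contents."""
    padding = (-len(data)) % num_data_disks
    data += bytes(padding)                      # pad the last stripe with zeros
    disks = [data[i::num_data_disks] for i in range(num_data_disks)]
    return disks, xor_columns(disks)


def raid3_rebuild(disks, parity, failed):
    """Reconstruct the failed data disk from the survivors plus parity."""
    survivors = [d for i, d in enumerate(disks) if i != failed]
    return xor_columns(survivors + [parity])


disks, parity = raid3_write(b"HELLO RAID 3", 3)
print(disks)
print(raid3_rebuild(disks, parity, failed=1) == disks[1])
```

Every read or write touches all data disks, which matches the point above that the whole array is engaged for each request.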

What is RAID level 4

RAID level 4 also consists of data disks plus one disk dedicated to parity; it uses block-level striping (not byte-level or bit-level striping). At least 3 disks are required (two data disks and one parity disk). Look at the diagram of RAID 4 below:

RAID 4 configuration diagram

Key points:

  • good random reads, as the data blocks are striped,
  • bad random writes, as for every write, parity also needs to be written to the parity disk,
  • the main advantage of RAID 4 over RAID 2 and RAID 3 is I/O parallelism. In RAID 2 and RAID 3, a single read/write operation requires reading the whole group of data drives, while in RAID 4 one read/write operation does not have to spread across all data drives; as a result, more I/O operations can be executed in parallel, improving the performance of small transfers. For example, block A and block E can be simultaneously read (parallelism isn’t possible when multiple requests are on just one disk),
  • this level is also not commonly used in practice, because it uses a single parity disk, which every write has to touch, instead of distributed parity (see RAID 5).
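
Why random writes are slow can be shown with the usual read-modify-write update: new parity = old parity XOR old data XOR new data. The sketch below assumes three data disks with tiny 4-byte blocks; all names and sizes are illustrative only.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equally sized byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks with two blocks each, plus one dedicated parity disk.
data_disks = [
    [b"AAAA", b"DDDD"],
    [b"BBBB", b"EEEE"],
    [b"CCCC", b"FFFF"],
]
parity_disk = [xor_blocks(*(disk[row] for disk in data_disks)) for row in range(2)]


def small_write(disk, row, new_block):
    """Update one block; returns the number of disk I/Os performed."""
    old_block = data_disks[disk][row]                                # read old data
    old_parity = parity_disk[row]                                    # read old parity
    parity_disk[row] = xor_blocks(old_parity, old_block, new_block)  # write parity
    data_disks[disk][row] = new_block                                # write data
    return 4


ios = small_write(disk=1, row=0, new_block=b"bbbb")
recomputed = xor_blocks(*(disk[0] for disk in data_disks))
print(ios, parity_disk[0] == recomputed)
```

One logical write costs four disk operations, and two of them always hit the same parity disk, so concurrent writes to different data disks still queue up there.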

What is RAID level 5

Besides RAID 0 and RAID 1, here is another RAID level that is commonly used in practice. To configure RAID 5, at least 3 disks are needed. Unlike RAID 4, which keeps all parity on one dedicated disk, RAID 5 uses block-level striping with distributed parity, which can be seen in the diagram below:

RAID 5 configuration diagram

Key points:

  • good performance, as blocks are striped,
  • good redundancy, as it uses distributed parity,
  • good ratio when it comes to performance, redundancy and cost,
  • write operations are slow, as for each write, parity needs to be calculated and written to a disk. It’s an excellent configuration for heavily read oriented tasks.
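
Distributed parity simply means that the parity block moves to a different disk in each stripe row. The sketch below prints one possible rotation; the exact order varies between implementations, so the rotation rule and the `D`/`P` labels here are assumptions for illustration and may not match the diagram above.

```python
from itertools import count


def raid5_layout(num_disks, num_rows):
    """Return rows of block labels: D<n> for data, P<row> for that row's parity."""
    next_block = count()
    rows = []
    for row in range(num_rows):
        # Assumed rotation: parity starts on the last disk and moves left.
        parity_disk = (num_disks - 1 - row) % num_disks
        rows.append([
            f"P{row}" if disk == parity_disk else f"D{next(next_block)}"
            for disk in range(num_disks)
        ])
    return rows


for row in raid5_layout(num_disks=3, num_rows=3):
    print("  ".join(row))
```

Since no single disk holds all the parity, parity updates are spread over the whole array instead of queuing on one disk as in RAID 4.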

What is RAID level 6

The main difference between RAID level 6 and RAID 5 is that RAID 6 consists of block-level striping with double distributed parity. RAID 6 requires a minimum of four disks – see the diagram below:

RAID 6 configuration diagram

Key points:

  • delivers better redundancy than RAID 5, because two disks can fail and the array still won’t be broken (there is no data loss),
  • complex RAID controller implementation, as it has to calculate two independent parity values for each stripe of data blocks,
  • it’s practical especially for high-availability systems: large-capacity drives take longer to restore, and during that time the array has to survive a possible second disk failure,
  • write operations are slow, as for each write, dual parity needs to be calculated and written to the disks.
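
The article does not say how the two parity values are computed. One widely used scheme, assumed here, keeps P as a plain XOR and computes Q as a weighted sum over the finite field GF(2^8), which gives two independent equations and therefore allows two lost data disks to be rebuilt. The Python sketch below is a minimal illustration of that idea (field polynomial 0x11d, generator 2); all function names are made up for this example.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    return gf_pow(a, 254)   # a^255 == 1 for every non-zero a


def compute_pq(data):
    """data: list of equally sized bytes objects, one per data disk."""
    p, q = bytearray(len(data[0])), bytearray(len(data[0]))
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)
        for k, byte in enumerate(block):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data, p, q, x, y):
    """Rebuild data disks x and y from the surviving disks plus P and Q."""
    survivors = [(i, b) for i, b in enumerate(data) if i not in (x, y)]
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for k in range(len(p)):
        pxy, qxy = p[k], q[k]
        for i, block in survivors:
            pxy ^= block[k]                          # leaves Dx ^ Dy
            qxy ^= gf_mul(gf_pow(2, i), block[k])    # leaves g^x*Dx ^ g^y*Dy
        dx[k] = gf_mul(qxy ^ gf_mul(gy, pxy), inv)
        dy[k] = pxy ^ dx[k]
    return bytes(dx), bytes(dy)


data = [b"RAID", b"six!", b"dual", b"par."]
p, q = compute_pq(data)
print(recover_two(data, p, q, 0, 2) == (data[0], data[2]))
```

The extra field arithmetic for Q is the reason the controller implementation is more complex than for RAID 5.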

What is RAID level 10

Besides the standard RAID levels, we’ll mention two nested RAID levels. Nested RAID levels (or hybrid RAID) combine two or more of the standard RAID levels to gain performance, additional redundancy, or both, by combining the properties and benefits of different standard RAID layouts. The first one is RAID 10, which requires a minimum of 4 disks. Data is striped across disks 1 and 3, and at the same time mirrored: 1–>2 and 3–>4.

RAID 10 configuration diagram

Key points:

  • excellent redundancy, as blocks are mirrored,
  • excellent performance, as blocks are striped,
  • it is the best option for I/O intensive applications such as database, email, and web servers, as well as for any other use requiring high disk performance,
  • according to tests, in most cases RAID 10 provides better throughput and latency than all other RAID levels except RAID 0 (which wins in throughput).
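
The RAID 10 placement described above (striping across disks 1 and 3, mirroring 1–>2 and 3–>4) can be written as a small mapping function. This is a sketch of that description only; the function name and the pair numbering are assumptions for the example.

```python
def raid10_locate(block_index, num_pairs=2):
    """Return ((primary_disk, mirror_disk), stripe_row); disks are numbered from 1."""
    pair = block_index % num_pairs      # striping picks the mirrored pair
    row = block_index // num_pairs
    primary = 2 * pair + 1              # disks 1, 3, 5, ...
    return (primary, primary + 1), row  # each primary is mirrored to its neighbor


for index, name in enumerate("ABCD"):
    disks, row = raid10_locate(index)
    print(f"block {name}: disks {disks}, row {row}")
```

Each block exists on two disks, and consecutive blocks go to different pairs, so the layout combines the redundancy of RAID 1 with the striping of RAID 0.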

What is RAID level 0+1

The second nested RAID level to be mentioned is RAID 0+1. It requires a minimum of 3 disks, but in most cases it is implemented with a minimum of 4 disks (for more on other nested levels, such as RAID 03, 50, 60 and 100, search Wikipedia):

RAID 01 or RAID 0+1 configuration diagram

Key points:

  • data is striped across disks 1 and 2, and at the same time mirrored: 1–>3 and 2–>4; disks 1 and 2 form one group (as RAID 0), which is mirrored to the disks of the other group (RAID 1 mirrors one group to another). Hence the name RAID 01 or RAID 0+1,
  • the performance and storage capacity of RAID 01 are the same as with RAID 10 (with the same number of disks),
  • the main reason why RAID 01 is used less in practice is its lower fault tolerance compared to RAID 10,
  • assuming that we have more than 4 disks, e.g. 6 disks: RAID 10 would have more groups (3 groups, where every group consists of two mirrored disks), and RAID 01 would have fewer groups (2 groups, where each group consists of three striped disks). If one disk fails in every group, RAID 10 will not be broken, but RAID 01 will be broken and the whole array will fail.
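
The difference in fault tolerance can be checked by brute force for the 6-disk example. The sketch below uses the simple model from the list above: RAID 10 is lost only when both disks of a mirrored pair fail, and RAID 01 is lost as soon as both striped groups contain a failed disk. The function names are hypothetical.

```python
from itertools import combinations


def raid10_survives(failed, num_disks):
    """Mirrored pairs (0,1), (2,3), ...: lost only if a whole pair has failed."""
    return all(not {2 * p, 2 * p + 1} <= failed for p in range(num_disks // 2))


def raid01_survives(failed, num_disks):
    """Two striped groups mirrored to each other: at least one group must be intact."""
    half = num_disks // 2
    group_a, group_b = set(range(half)), set(range(half, num_disks))
    return not (failed & group_a) or not (failed & group_b)


def survival_count(survives, num_disks, num_failures):
    """Return (surviving_cases, total_cases) over all failure combinations."""
    cases = [set(c) for c in combinations(range(num_disks), num_failures)]
    return sum(survives(f, num_disks) for f in cases), len(cases)


for failures in (1, 2, 3):
    print(failures,
          survival_count(raid10_survives, 6, failures),
          survival_count(raid01_survives, 6, failures))
```

With two failed disks out of six, RAID 10 survives 12 of the 15 possible combinations in this model, while RAID 01 survives only 6.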

RAID levels in practice – with disks that have different capacity

A RAID controller is a hardware device (integrated on a motherboard or as a separate PCI card) or software program used to manage hard disk drives (HDDs) or solid-state drives (SSDs) in a computer/server or storage array so they work as a logical unit.

Although it’s recommended to use disks of equal capacity (and the same model), disks of different capacities may be used, but note that the results may not be what you expect. Here is an example of RAID from a corporate environment – I wanted to test the capacity of the logical RAID disk on a PowerEdge server when different RAID levels are selected, with a setup of 4 physical disks.

In order to configure RAID on a PowerEdge server, power on the server and press the key combination CTRL + R. That starts the Configuration Utility:

Power Edge Expandable RAID Controller BIOS

Four SAS disks are in the server: two 300 GB disks, and two 150 GB disks. The utility shows less capacity than the manufacturer specifies, largely because it reports binary units (GiB) rather than decimal gigabytes, so the real capacities are 278.87 GiB and 136.12 GiB:

PERC H700 Integrated BIOS Configuration Utility

We already mentioned that the physical disks are used to create a logical RAID volume; in this configuration utility the term Virtual Disk is used instead:

Create new VD item in BIOS: RAID Virtual Disk configuration (logical disk)

We covered all standard and two nested RAID levels, but see which levels are usually used in practice – the RAID controller of the PowerEdge server supports only RAID-0, RAID-1, RAID-5, RAID-6 and RAID-10:

How to define RAID level configuration in Power Edge Server

Here are the results of the Virtual Disk’s capacity when the different RAID levels are applied (keep in mind that the layout of the physical disks doesn’t affect the capacity of the virtual disk):

  • RAID 0: 544.50 GiB (capacity of bigger disks is trimmed to match the smaller ones),
  • RAID 1: 278.87 GiB (RAID 1 is here possible just with two disks, so we configured it with the two bigger ones),
  • RAID 5: 408.37 GiB (if you want redundancy – this level provides the largest capacity, in this case),
  • RAID 6: 272.25 GiB (sum of two smaller disks),
  • RAID 10: 272.25 GiB (sum of two smaller disks).
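
These numbers follow a simple pattern: every disk is treated as if it were as small as the smallest member, and the level then decides how many of those equal slices hold data. The Python sketch below encodes that pattern; the formulas are inferred from the results listed above rather than taken from the controller’s documentation, and the function name is made up for this example.

```python
def virtual_disk_capacity(level, sizes_gib):
    """Estimate usable capacity, assuming every disk is trimmed to the smallest one."""
    n, smallest = len(sizes_gib), min(sizes_gib)
    if level == 0:
        return n * smallest          # all slices hold data
    if level == 1:
        return smallest              # one copy's worth of data
    if level == 5:
        return (n - 1) * smallest    # one slice per stripe goes to parity
    if level == 6:
        return (n - 2) * smallest    # two slices per stripe go to parity
    if level == 10:
        return (n // 2) * smallest   # half of the slices are mirrors
    raise ValueError(f"unsupported RAID level: {level}")


disks = [278.87, 278.87, 136.12, 136.12]
for level in (0, 5, 6, 10):
    print(f"RAID {level}: {virtual_disk_capacity(level, disks):.2f} GiB")
# RAID 1 was configured with the two bigger disks only.
print(f"RAID 1: {virtual_disk_capacity(1, disks[:2]):.2f} GiB")
```

The estimates differ from the utility’s figures by a hundredth or two of a GiB, which is consistent with the per-disk sizes being displayed rounded to two decimals.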

Author
www.CreativForm.com