Redundant Array of Independent (originally Inexpensive) Disks (RAID) is a term used for computer data storage systems that spread and/or replicate data across multiple drives. RAID technology has revolutionised enterprise data storage and was designed with two key goals: to increase data reliability and to increase I/O (input/output) performance.
Unfortunately, RAID storage isn’t a perfect technology, and data loss can still occur when using these systems. In this post we’ll explore how RAID levels work and how data can be stored (and lost!) with this type of storage.
How does RAID work?
A RAID combines physical disks into a single logical unit by using either special hardware or software. Hardware RAID solutions come in a variety of styles, from controllers built onto the motherboard or add-in cards, up to large enterprise NAS or SAN servers. With these setups the operating system (OS) is unaware of the technical workings of the RAID. Software solutions are typically implemented within the OS.
RAID is traditionally used on servers, but can also be used on workstations. The latter is especially true of storage-intensive computers such as those used for video and audio editing, where high storage capacities and data transfer speeds are required.
Commonly used RAID terms
Before we go into any further detail, let’s take a look at some of the technical terms that are commonly used to describe aspects of RAID storage:
RAID
A technology that supports various hard drive configurations in order to achieve greater performance, reliability and larger volume sizes by consolidating disk resources and using parity calculations.
Parity
Distributed information which allows the recreation of data stored within a RAID array, even if one disk fails.
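To make the parity idea a little more concrete, here is a minimal sketch, assuming the simple XOR parity used by single-parity levels such as RAID 5. The block contents and the `xor_blocks` helper are made up for illustration; real controllers work on much larger blocks and may use other schemes.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per data disk.
data = [b"RAID", b"DATA", b"LOSS"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data)

# Pretend the disk holding data[1] has failed: XOR the surviving
# data blocks with the parity block to recreate the missing one.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```

Because XOR-ing a value with itself cancels out, combining every surviving block with the parity leaves only the missing block. This only works when a single block per stripe is lost.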
Mirroring
Data from one or more hard drives is duplicated onto one or more other physical disks.
Striping
A method where data can be written across multiple disks. The data is written across the drives in sequential order until the last drive is reached; it then jumps back to the first drive and starts a second stripe, and so on.
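The sequential "fill a stripe, then wrap around" pattern can be sketched in a few lines. This assumes a plain striped layout with no parity and numbers both disks and stripes from zero; the function name `locate_block` is hypothetical.

```python
def locate_block(logical_block, num_disks):
    """Return (disk, stripe) for a logical block in a plain striped layout."""
    stripe, disk = divmod(logical_block, num_disks)
    return disk, stripe

# Where do the first seven logical blocks land on a three-disk array?
layout = [locate_block(n, 3) for n in range(7)]
print(layout)
# [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (0, 2)]
```

Blocks 0 to 2 fill the first stripe, block 3 jumps back to the first disk and starts the second stripe, and so on.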
Block
A block is the logical space on each disk where the data is written; the amount of space is set by the RAID controller.
Left / right symmetry
Symmetry in a RAID controls how the data and parity are distributed across the drives. There are 4 main styles of symmetry – which one is used depends on the RAID vendor. Some companies also make proprietary styles depending on their business needs.
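As a rough illustration, the sketch below assumes the four styles are the left/right and symmetric/asymmetric layouts commonly used for single-parity arrays (for example by Linux software RAID). The function name and the `D0`, `D1`, `P` labels are made up for this example, and vendor-specific layouts may differ.

```python
def stripe_layout(stripe, num_disks, direction="left", symmetric=True):
    """Return one label per disk ('P' or 'D<n>') for the given stripe."""
    data_disks = num_disks - 1
    if direction == "left":
        # Parity starts on the last disk and moves left each stripe.
        parity_disk = data_disks - (stripe % num_disks)
    else:
        # Parity starts on the first disk and moves right each stripe.
        parity_disk = stripe % num_disks

    row = [None] * num_disks
    row[parity_disk] = "P"
    for d in range(data_disks):
        if symmetric:
            # Data starts on the disk after parity and wraps around.
            disk = (parity_disk + 1 + d) % num_disks
        else:
            # Data fills the disks in order, skipping the parity disk.
            disk = d if d < parity_disk else d + 1
        row[disk] = f"D{stripe * data_disks + d}"
    return row

for s in range(3):
    print(stripe_layout(s, 3, direction="left", symmetric=True))
# ['D0', 'D1', 'P']
# ['D3', 'P', 'D2']
# ['P', 'D4', 'D5']
```

Changing `direction` or `symmetric` produces the other three layouts. Knowing which one an array uses matters during recovery, because it tells you which block on each disk is data and which is parity.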
Hot spare
There are a few different methods for dealing with drive failures within a RAID; one is the use of a ‘hot spare’. This is a spare disk which can be used in place of a failed one.
Degraded mode
This happens when a drive in the RAID becomes unreadable; the drive is then considered bad and is withdrawn from the RAID. New data and parity are then written to the remaining drives within the RAID. If any data is requested from the failed drive, it is worked out from the data and parity on the other drives. This degrades the performance of the RAID.
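The toy model below sketches what a read looks like once a drive has been withdrawn, and how a spare disk could then be filled. It assumes a single stripe protected by XOR parity; the `TinyArray` class and its method names are hypothetical and not any real controller's API.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

class TinyArray:
    """Toy single-stripe array: one block per disk, last disk holds parity."""

    def __init__(self, data_blocks):
        self.disks = list(data_blocks) + [xor_blocks(data_blocks)]
        self.failed = None  # index of the withdrawn disk, if any

    def fail(self, disk):
        if self.failed is not None:
            raise RuntimeError("single parity cannot survive a second failure")
        self.failed = disk
        self.disks[disk] = None  # contents are no longer readable

    def read(self, disk):
        if disk != self.failed:
            return self.disks[disk]
        # Degraded read: every surviving disk must be read and XOR-ed,
        # which is why performance drops while a drive is missing.
        survivors = [b for i, b in enumerate(self.disks) if i != disk]
        return xor_blocks(survivors)

    def rebuild_onto_spare(self):
        """Write the reconstructed block to a spare and leave degraded mode."""
        if self.failed is not None:
            self.disks[self.failed] = self.read(self.failed)
            self.failed = None

array = TinyArray([b"AAAA", b"BBBB", b"CCCC"])
array.fail(1)
degraded_value = array.read(1)  # rebuilt from disks 0, 2 and the parity disk
array.rebuild_onto_spare()      # the spare takes over slot 1
```

Note that a single degraded read touches every remaining disk, whereas a healthy read touches only one.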
Still with me? Now that we’ve defined the key terms, in our next article we will take a look at the three key concepts in RAID: mirroring, striping and error correction. We’ll also look at different RAID levels, how modern arrays work and what challenges lie ahead if data is lost. See you next time!
Until then, if you’ve got a question or comment, why not let us know below?