It’s not often in the IT world that a technology developed decades ago is still widely used and still matters to administrators and other users. Yet even modern servers and storage systems run RAID technology inside, mostly in enterprises but increasingly in consumer NAS systems as well. RAID has survived, and as we celebrate its 30th birthday this year, it still plays a major role. But why is that so? And what are the benefits and disadvantages of the concept?
David Patterson, Garth A. Gibson, and Randy Katz from the University of California, Berkeley coined the term RAID in 1987. The following year, they presented their paper “A Case for Redundant Arrays of Inexpensive Disks (RAID)” at the SIGMOD conference in June 1988. At the time, hard disks were still quite expensive, and keeping data storage “lean” was not only common but a necessity. Additionally, companies were using huge mainframe computers, while desktop computers had not yet been widely introduced in the workplace. However, this started to change as the acceptance and usage of personal computers grew.
Hard drives for these first non-mainframe computers were already much cheaper than those used in mainframe systems, and this is what led the three researchers to develop the concept of RAID. They argued that several connected, less expensive hard disks would beat a single top-of-the-line mainframe hard disk in terms of performance. And even though using many hard disks meant that the chance of some disk failing would rise, the disks could be configured for redundancy so that the reliability of such an array could far exceed that of any large single mainframe drive.
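The reliability argument can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the original paper: it assumes disks fail independently and uses a made-up annual failure probability of 3% per disk. The function names are hypothetical, and the mirrored-pair figure ignores rebuild time and repairs.

```python
def p_any_failure(p_disk: float, n_disks: int) -> float:
    """Probability that at least one of n independent disks fails."""
    return 1.0 - (1.0 - p_disk) ** n_disks


def p_mirror_pair_loss(p_disk: float) -> float:
    """Probability that both disks of a mirrored pair fail.

    Simplification: no rebuild or replacement happens in between.
    """
    return p_disk ** 2


if __name__ == "__main__":
    p = 0.03  # assumed, illustrative annual failure probability of one disk
    print(f"single disk fails:         {p:.4f}")
    print(f"any of 8 disks fails:      {p_any_failure(p, 8):.4f}")
    print(f"both mirrored disks fail:  {p_mirror_pair_loss(p):.4f}")
```

With more disks, the chance that some disk fails goes up, but the chance of losing data that is stored redundantly can be far lower than for a single drive.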
RAID is based on the idea of spreading or replicating data across multiple inexpensive (or, as the term is often read today, independent) drives. Drives within the system are configured so that data can be divided or replicated over two or more drives, either to distribute load or to help recover data if a drive fails. There are two technical ways to achieve that: a hardware solution in the form of a dedicated RAID controller, or a software solution, which is already included in most modern operating systems. Hardware-based systems manage the RAID independently from the host computer using the RAID controller, so the operating system is unaware of the technical workings of the RAID and sees the whole storage system as if it were a single volume connected to the host computer.
Besides these technical implementations, the RAID concept is based on these fundamental principles:
- Parity is a way of distributing information across a RAID system which allows data to be restored in the case of a drive failure.
- Redundancy is the duplication of critical components in the system architecture to increase reliability and act as a fail-safe. In essence, it allows one or more components to fail before the whole system fails. In the case of RAID systems, the components are the drives.
- Mirroring is when the same data is duplicated from one disk to another.
- Striping is another method, in which data is split up and written across multiple disks.
Different RAID setups use one or more of these techniques, depending on system requirements.
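Striping and parity can be illustrated in a few lines of code. This is a minimal sketch assuming XOR parity over a single stripe, which is the textbook way to explain parity. The function names `stripe` and `xor_blocks` and the sample data are hypothetical, and real controllers work on much larger blocks.

```python
from functools import reduce


def stripe(data: bytes, n_disks: int, chunk: int) -> list[bytes]:
    """Split data into one chunk per disk, zero-padding the last chunk."""
    if len(data) > n_disks * chunk:
        raise ValueError("data does not fit into a single stripe")
    padded = data.ljust(n_disks * chunk, b"\0")
    return [padded[i * chunk:(i + 1) * chunk] for i in range(n_disks)]


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte position by byte position."""
    if not blocks:
        raise ValueError("need at least one block")
    if any(len(b) != len(blocks[0]) for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    blocks = stripe(b"RAID at 30", n_disks=3, chunk=4)  # three data disks
    parity = xor_blocks(blocks)                         # parity block

    # Simulate losing the second data disk and rebuilding it from the rest.
    survivors = [blocks[0], blocks[2], parity]
    rebuilt = xor_blocks(survivors)
    print(rebuilt == blocks[1])
```

Because XOR is its own inverse, combining the surviving data blocks with the parity block yields exactly the missing block. The same property means a single parity block cannot compensate for two lost blocks.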
Based on these principles, the following standard RAID levels have been developed:
- RAID 0 uses ‘striping’ and is the most basic RAID level. It offers no redundancy, but it does increase performance. Data is striped across at least two disks, and with every disk added, read/write performance and storage capacity increase compared to a single drive. If one drive fails, the RAID controller has no way of rebuilding its data.
- RAID 1 uses ‘mirroring’, which, as the name suggests, mirrors the same data across two disks and thereby provides the most basic level of RAID redundancy. RAID 1 can double read performance over a single drive, but it gives no increase in write speed. This level allows for one drive to fail.
- RAID 5 is a common configuration that offers a decent compromise between reliability and performance. It provides a gain in read speeds but no increase in write performance. RAID 5 introduces ‘parity’, which takes up the space of one disk in total. This level can handle one disk failure. A hot spare, for example a fifth drive added to a four-disk array, can sit idle in the system with no data saved to it. If one disk fails, its data can be rebuilt onto the hot spare using the remaining data and the parity on the other drives. Once the rebuild has finished, you can remove the failed drive and replace it with a new one, which becomes the new hot spare.
- RAID 6 takes the concept of RAID 5 and adds further redundancy with dual-parity. This allows for data to be recreated even if two disks fail within the array. The dual-parity is spread across all the disks and takes the space of two drives.
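The capacity and fault-tolerance trade-offs of these four levels can be summarized in a small calculation. This is a rough sketch under stated assumptions: all disks have the same size, RAID 1 is treated as an n-way mirror (the text above describes the two-disk case), and the function name `raid_summary` is hypothetical.

```python
# Minimum number of disks per level (RAID 0 and 1: two, RAID 5: three, RAID 6: four).
MIN_DISKS = {0: 2, 1: 2, 5: 3, 6: 4}


def raid_summary(level: int, n_disks: int, disk_tb: float) -> tuple[float, int]:
    """Return (usable capacity in TB, number of disk failures tolerated)."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if n_disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 0:
        return n_disks * disk_tb, 0        # striping only, no redundancy
    if level == 1:
        return disk_tb, n_disks - 1        # assumption: every disk is a full mirror
    if level == 5:
        return (n_disks - 1) * disk_tb, 1  # parity uses the space of one disk
    return (n_disks - 2) * disk_tb, 2      # RAID 6: dual parity uses two disks


if __name__ == "__main__":
    for level in (0, 1, 5, 6):
        capacity, tolerated = raid_summary(level, n_disks=4, disk_tb=2.0)
        print(f"RAID {level}: {capacity:.1f} TB usable, survives {tolerated} failure(s)")
```

A hot spare does not appear in this calculation because it holds no data until a rebuild starts.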
Over the last 30 years, many more RAID levels have been developed, mainly by RAID system manufacturers. Today, we have RAID levels ranging from RAID 0 all the way to RAID 61 and beyond, with larger companies creating custom RAID levels to support different applications and infrastructure requirements.
Drive failures and the dangers of RAID
If a disk fails in a RAID 1 or RAID 5 configuration, the user shouldn’t replace the failed drive before making sure that all data on the remaining disks is backed up. In many cases, especially when the disks used came from the same production batch, the likelihood that another disk will also fail soon is quite high. And this is where the danger of this concept lies:
Even with all the benefits RAID offers, including better performance and data security, users tend to forget that RAID is not a backup! RAID can be used in combination with backups, which makes the whole storage system much more secure, but a RAID must never be used instead of a backup. On the contrary, when a RAID system fails, for example due to a malfunctioning hardware RAID controller, getting the RAID up and running again and recovering lost data becomes much more complicated.
NAS systems have become more affordable to home users. They use the built-in RAID configurations in combination with other advanced storage technologies, like deduplication, to get as much space as possible out of their system. However, this comes at a price. In many cases these systems are set up incorrectly, and when a failure arises, the whole system breaks down.
Before setting up a RAID array, regardless of whether you’re a home user or an enterprise IT administrator, carefully consider which RAID level suits your needs, or whether RAID is even necessary at all. Remember, negligence in the beginning can result in serious problems, high costs, and possible data loss. Even as new ways to store data continue to be explored and invented, it’s likely that RAID won’t vanish anytime soon.