What is RAID?

RAID is short for Redundant Array of Independent Disks.

The basic idea is to combine multiple relatively inexpensive hard drives into a disk array whose performance matches or even exceeds that of a single expensive hard drive.

Because RAID combines multiple hard disks into one logical volume, the operating system treats the array as a single hard disk or logical storage unit.

Depending on the version chosen, RAID has one or more of the following benefits over a single hard drive:

Improved data integration
Enhanced fault tolerance
Increased throughput or capacity

RAID versions

RAID technology was first proposed at the University of California, Berkeley in 1988. After many years of development it now comes in many versions, broadly divided into standard RAID (RAID 0 to RAID 6) and hybrid RAID (JBOD, RAID 7, RAID 10/01, RAID 50, etc.).
Only standard RAID is briefly introduced here.

In practice, RAID 0, RAID 1, RAID 5, and RAID 6 are the most common, while RAID 2, 3, and 4 see little practical use. Because RAID 5 already covers the functions they provide, RAID 2, 3, and 4 are mostly implemented only in research settings. RAID 4 does appear in some commercial machines: the NAS systems designed by NetApp, for example, are built on the RAID 4 design concept.

RAID 0

RAID 0 is also known as a stripe set. It joins two or more disks in parallel to form one large-capacity disk. Data is split into segments that are spread across these disks when stored. Because reads and writes can proceed in parallel, RAID 0 is the fastest of all the levels. However, RAID 0 has neither redundancy nor fault tolerance: if any one physical disk is damaged, all data is lost, so the risk is on a par with JBOD.
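
The striping described above can be sketched in a few lines of Python. This is only an illustration under assumed parameters: the function names stripe and unstripe, the chunk size of 2 bytes, and the three in-memory "disks" are all hypothetical, and real controllers use much larger stripe units.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list:
    """Split data into chunks and deal them round-robin onto the disks."""
    disks = [bytearray() for _ in range(num_disks)]
    for i in range(0, len(data), chunk_size):
        chunk_index = i // chunk_size
        disks[chunk_index % num_disks] += data[i:i + chunk_size]
    return disks


def unstripe(disks: list, chunk_size: int, total_len: int) -> bytes:
    """Reassemble the data by reading chunks back in round-robin order."""
    out = bytearray()
    offsets = [0] * len(disks)
    disk = 0
    while len(out) < total_len:
        start = offsets[disk]
        out += disks[disk][start:start + chunk_size]
        offsets[disk] += chunk_size
        disk = (disk + 1) % len(disks)
    return bytes(out)


data = b"ABCDEFGHIJKL"
disks = stripe(data, num_disks=3, chunk_size=2)
print(disks)                          # each disk holds one third of the data
print(unstripe(disks, 2, len(data)))  # b'ABCDEFGHIJKL'
```

Every disk holds only part of each file, which is why losing any single disk makes the whole array unreadable.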

RAID0 storage schematic


RAID 1

Two or more disks mirror each other, which gives good read speed on operating systems that support multi-threading. In theory, read speed scales with the number of disks, while write speed drops slightly. The array keeps working as long as one disk is healthy, so reliability is the highest. The principle is that whenever data is stored on the primary disk, the same data is also stored on the mirror disk. When the primary (physical) disk is damaged, the mirror disk takes over its work. Because the mirror disk acts as a backup of the data, RAID 1 offers the best data security of all RAID levels. However, no matter how many disks are used, the usable capacity is that of a single disk, which is the lowest disk utilization of all RAID levels.
If you build RAID 1 from two disks of different sizes, the available space is that of the smaller one. The leftover space on the larger disk can be partitioned off as a separate area so that it is not wasted.
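
A minimal sketch of the mirroring behaviour, assuming in-memory byte arrays stand in for disks. The class name Mirror and its methods are hypothetical and exist only for this illustration.

```python
class Mirror:
    """Toy RAID 1 set: every write goes to all members, a read needs only one."""

    def __init__(self, disk_sizes):
        # Usable capacity is limited by the smallest member disk.
        self.capacity = min(disk_sizes)
        self.disks = [bytearray(self.capacity) for _ in disk_sizes]
        self.failed = set()

    def write(self, offset, payload):
        if offset + len(payload) > self.capacity:
            raise ValueError("write exceeds usable capacity")
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[offset:offset + len(payload)] = payload

    def read(self, offset, length):
        # Any healthy member can serve the read.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return bytes(disk[offset:offset + length])
        raise IOError("all mirror members have failed")

    def fail(self, index):
        self.failed.add(index)


mirror = Mirror([500, 1000])   # two disks of different sizes
mirror.write(0, b"hello")
mirror.fail(0)                 # the primary disk is damaged
print(mirror.capacity)         # 500
print(mirror.read(0, 5))       # b'hello'
```

The data survives the loss of one member, but the array offers only the capacity of its smallest disk.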

RAID1 storage schematic


RAID 2

This is a modified version of RAID 0. The data is encoded with a Hamming code, split into individual bits, and written across the hard disks. Because error correction code (ECC) is added to the data, the stored data is larger than the original data. RAID 2 requires at least three disk drives to operate.
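
To give a feel for how a Hamming code lets the array correct a wrong bit, here is a small sketch using the Hamming(7,4) code. The choice of Hamming(7,4) is an assumption made for illustration (it would spread each codeword over seven disks, one bit per disk); the text above does not say which Hamming code a given RAID 2 array uses, and the function names are hypothetical.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_pos = s1 + 2 * s2 + 4 * s3   # 0 means no error, else 1-based position
    if error_pos:
        c[error_pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


codeword = hamming74_encode([1, 0, 1, 1])
print(codeword)                    # [0, 1, 1, 0, 0, 1, 1]
codeword[4] ^= 1                   # one "disk" returns a wrong bit
print(hamming74_decode(codeword))  # [1, 0, 1, 1]
```

Seven stored bits carry four data bits, which is the capacity overhead the paragraph above refers to.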

RAID2 storage schematic


RAID 3

RAID 3 uses bit interleaving (data interleaving): the data is encoded, split at the bit level, and spread across the hard disks, while the parity bits are stored on a separate, dedicated disk. Because the bits of any piece of data are scattered over different disks, reading even a small piece of data may require all the disks to work, so this level is better suited to reading large amounts of data.
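
The sketch below illustrates why a small read keeps every disk busy. For simplicity it interleaves whole bytes rather than individual bits, which is an assumption of this example; the function names and the four-disk layout are likewise hypothetical.

```python
from functools import reduce


def raid3_write(data: bytes, data_disks: int):
    """Interleave bytes across the data disks and keep XOR parity on one extra disk."""
    # Pad so that every disk receives the same number of bytes.
    padded = data + b"\x00" * (-len(data) % data_disks)
    disks = [bytearray(padded[i::data_disks]) for i in range(data_disks)]
    parity = bytearray(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))
    return disks, parity


def disks_touched(offset: int, length: int, data_disks: int):
    """Return which data disks hold the bytes of a logical read."""
    return sorted({pos % data_disks for pos in range(offset, offset + length)})


disks, parity = raid3_write(b"ABCDEFGH", data_disks=4)
print(disks)                    # consecutive bytes land on different disks
print(disks_touched(0, 4, 4))   # [0, 1, 2, 3]
```

Even a four-byte read involves all four data disks, so the array behaves like one large, fast disk for sequential transfers but gains little for many small, independent reads.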

RAID3 storage schematic


RAID 4

RAID 4 differs from RAID 3 in that data is split across the hard disks in units of blocks (block interleaving). However, every data access must also fetch the corresponding parity data from the dedicated parity disk for checking; because that disk is used so frequently, its wear may increase.
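
A rough way to see the load on the parity disk is to count physical writes per disk. The sketch assumes that every logical block write is a small write that also updates the parity block, and the disk labels and function name are hypothetical.

```python
from collections import Counter


def raid4_write_load(block_writes, data_disks):
    """Count physical writes per disk for a sequence of logical block numbers.

    Data blocks are striped round-robin over the data disks; the parity for
    every stripe lives on one dedicated disk, labelled "parity" here.
    """
    load = Counter()
    for block in block_writes:
        load[f"data{block % data_disks}"] += 1   # the data block itself
        load["parity"] += 1                      # its parity must be updated too
    return load


load = raid4_write_load(range(12), data_disks=3)
print(load["data0"], load["data1"], load["data2"], load["parity"])  # 4 4 4 12
```

With three data disks the parity disk absorbs three times as many writes as any single data disk, which matches the heavier wear described above.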

RAID4 storage schematic


RAID 5

RAID 5 is a storage solution that balances storage performance, data security, and storage cost. It uses disk striping and requires at least three hard disks. RAID 5 does not keep a backup copy of the stored data. Instead, it stores the data and the corresponding parity information across the member disks, with each piece of parity kept on a different disk from the data it protects. When the data on one disk is damaged, the remaining data and the corresponding parity information are used to recover it.

RAID 5 can be understood as a compromise between RAID 0 and RAID 1. It protects the system's data, but less strongly than mirroring, while its disk space utilization is higher than mirroring. Read speed is close to that of RAID 0. Because parity information must also be written, writes are slightly slower than writing to a single hard disk; using a write-back cache improves write performance considerably. And because several data blocks share one parity block, RAID 5 uses disk space more efficiently than RAID 1, so storage cost is relatively low.
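
The recovery step can be illustrated with XOR parity. This is a sketch under assumptions: the parity rotation order used here is just one possible layout, all blocks must be the same size, the number of data blocks must fill whole stripes, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def raid5_layout(data_blocks, num_disks):
    """Place data blocks plus one parity block per stripe, rotating the parity disk."""
    per_stripe = num_disks - 1
    stripes = []
    for s in range(0, len(data_blocks), per_stripe):
        chunk = list(data_blocks[s:s + per_stripe])
        parity_disk = (num_disks - 1 - len(stripes)) % num_disks
        stripe = chunk[:]
        stripe.insert(parity_disk, xor_blocks(chunk))
        stripes.append(stripe)
    return stripes


def rebuild(stripes, failed_disk):
    """Recompute the failed disk's block in every stripe from the surviving blocks."""
    return [xor_blocks([b for i, b in enumerate(st) if i != failed_disk])
            for st in stripes]


blocks = [b"A1", b"A2", b"B1", b"B2", b"C1", b"C2"]
stripes = raid5_layout(blocks, num_disks=3)
lost = [stripe[1] for stripe in stripes]   # what disk 1 held before it failed
print(rebuild(stripes, failed_disk=1) == lost)   # True
```

Because the XOR of all blocks in a stripe is zero, any single missing block, data or parity, equals the XOR of the others.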

RAID5 storage schematic


RAID 6

Compared with RAID 5, RAID 6 adds a second independent parity block. The two parity schemes use different algorithms, so data reliability is very high: even if any two disks fail at the same time, data integrity is not affected. In exchange, RAID 6 needs more disk space for parity information and performs additional parity calculations, so its I/O and computational load is larger than that of RAID 5. Its write performance depends strongly on the specific implementation. RAID 6 is usually implemented in hardware or firmware rather than in software.
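
The text does not say which two algorithms are used. One common choice, assumed here purely for illustration, is a plain XOR parity (P) plus a second parity (Q) computed with Reed-Solomon style arithmetic in GF(2^8). The sketch works on one byte per data disk, and all function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    """Multiplicative inverse: a^255 == 1 for every non-zero a, so a^254 == 1/a."""
    return gf_pow(a, 254)


def pq_parity(data):
    """Return (P, Q): P is the plain XOR, Q weights disk i by 2^i."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data, x, y, p, q):
    """Recover the bytes of failed data disks x and y (their entries are ignored)."""
    pxy, qxy = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            pxy ^= d
            qxy ^= gf_mul(gf_pow(2, i), d)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(gf_mul(gy, pxy) ^ qxy, gf_inv(gx ^ gy))
    return dx, pxy ^ dx


data = [0x11, 0x22, 0x33, 0x44]
p, q = pq_parity(data)
damaged = [0x11, None, 0x33, None]         # disks 1 and 3 have failed
print(recover_two(damaged, 1, 3, p, q) == (0x22, 0x44))   # True
```

The Q calculation is noticeably more expensive than a simple XOR, which is the extra computational load mentioned above.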

At most two disks may fail in the same array. After a failed disk is replaced, its data is recalculated and written to the new disk. By design, RAID 6 needs at least four disks to work.

The usable capacity is the total number of hard disks minus 2, multiplied by the capacity of the smallest disk. Likewise, the capacity devoted to data protection is the capacity of the smallest disk multiplied by 2.
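
The capacity rule is easy to check with a small helper. The function name and the disk sizes in gigabytes are made up for this example.

```python
def raid6_capacity(disk_sizes_gb):
    """Return (usable, protection) capacity in GB for a RAID 6 array."""
    if len(disk_sizes_gb) < 4:
        raise ValueError("RAID 6 needs at least four disks")
    smallest = min(disk_sizes_gb)
    usable = (len(disk_sizes_gb) - 2) * smallest
    protection = 2 * smallest
    return usable, protection


# Five disks: four of 4000 GB and one of 6000 GB.
print(raid6_capacity([4000, 4000, 4000, 4000, 6000]))   # (12000, 8000)
```

The extra 2000 GB on the larger disk does not count, because every member contributes only as much as the smallest disk.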

Among the features offered by hardware RAID cards, RAID 6 is also the most commonly seen array level.

RAID6 storage schematic


Hard RAID, full soft RAID, and semi-soft RAID

Hardware RAID

Simply put, any RAID whose functions are implemented entirely in hardware is hard RAID. RAID built with a RAID card, or with the RAID support on the motherboard, is hard RAID.
In other words, hard RAID connects the hard disks to the computer through a dedicated RAID controller (RAID card). The controller is responsible for combining all the member disks into one virtual RAID volume. The operating system can only see the virtual disk presented by the RAID controller; it cannot identify the individual member disks that make up the array.

Hard RAID has its own RAID control/processing chip and I/O processing chip, and often an array buffer as well, so it has the greatest advantage in CPU usage and overall performance.

Software RAID

Simply put, soft RAID is RAID whose functions are carried out by the operating system, for example building RAID 5 from three hard disks under Linux.
That is, no RAID controller (known in the industry as a RAID co-processor) or I/O chip is used, and RAID is implemented directly in the software layer. All functions are performed by the operating system (OS) and the CPU, so this is, as one would expect, the least efficient kind of RAID.
Unlike hard RAID, every member disk of a soft RAID is visible to the operating system. The operating system, however, does not present the member disks to the user; it presents only the virtual RAID volume configured in the software layer, so users can use the RAID volume just like a normal disk.

Hardware-assisted RAID (semi-soft RAID)

Unlike hard RAID and full soft RAID, semi-soft RAID needs both a RAID card and the driver provided by the manufacturer.
However, semi-soft RAID lacks its own I/O processing chip, so that work is still done by the CPU and the driver. Moreover, the RAID control/processing chip it uses is generally weak and cannot support high RAID levels.
This RAID is easier to migrate to other computers.

RAID card

There are many kinds of RAID cards: besides the hard RAID that the motherboard can provide, there are also various dedicated RAID cards. They are generally divided into hard RAID cards and soft RAID cards.

Hard RAID is RAID implemented in hardware: both a standalone RAID card and a RAID chip integrated on the motherboard count as hard RAID.
A soft RAID card works through software and relies on the CPU to perform the RAID calculations, so software RAID consumes a lot of CPU resources.
Most server equipment uses hardware RAID.

A hardware RAID card has its own processor and does not need the server's CPU. The advantages are that read and write performance is the fastest, no server resources are consumed, and it works with any operating system. With a backup battery unit (BBU) and non-volatile memory (NVRAM), the card can also protect in-flight I/O: when the system loses power, the remaining read and write jobs recorded in the disk journal are first saved in memory. Once power is restored, the journal data is retrieved from NVRAM and the remaining jobs are completed safely, which ensures read and write integrity. The backup battery unit is usually used together with the card's Write-Back cache mode, which caches read and write operations in memory to achieve higher performance. For a hardware RAID card without a backup battery unit, however, enabling Write-Back cache mode is not recommended, to avoid losing data in a power outage. In addition, because the card carries its own processor, it can operate on the hard disks independently of the system, and restore operations are faster than with software RAID. The downside is the high price, so it is usually used only for RAID 5 and RAID 6.
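
The interaction between Write-Back caching and the backup battery can be modelled with a toy controller. Everything here is a simplified assumption for illustration: the class and method names are hypothetical, and a real card's journal and cache are far more involved.

```python
class WriteBackController:
    """Toy controller: writes are acknowledged from cache and flushed to disk later."""

    def __init__(self, has_battery):
        self.has_battery = has_battery
        self.disk = {}      # block number -> data that actually reached the disk
        self.journal = []   # pending writes; survives power loss only with a battery

    def write(self, block, data):
        self.journal.append((block, data))   # fast path: no disk I/O yet

    def flush(self):
        for block, data in self.journal:
            self.disk[block] = data
        self.journal.clear()

    def power_loss(self):
        if not self.has_battery:
            self.journal.clear()             # unflushed writes are lost

    def power_restored(self):
        self.flush()                         # replay whatever survived


for has_battery in (True, False):
    card = WriteBackController(has_battery)
    card.write(7, b"payload")
    card.power_loss()
    card.power_restored()
    print(has_battery, card.disk)
```

With the battery the pending write is replayed after power returns; without it the write silently disappears, which is why Write-Back mode is discouraged on cards that lack a BBU.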

The differences between motherboard-integrated RAID and RAID on an external RAID card:

Performance

Motherboard-integrated RAID relies on the CPU and memory of the motherboard for its performance and speed. It takes up part of the motherboard's bandwidth and so affects the performance of the whole machine. An external RAID card has its own CPU and memory, handles most of its data processing itself, and does not slow down the CPU and memory on the motherboard. Overall, RAID on an external RAID card is much faster than motherboard-integrated RAID.

Safety

The security of motherboard-integrated RAID cannot be guaranteed. Suppose, for example, that we build a SATA RAID on a P8SCT motherboard. Whatever RAID level is chosen, it is set up by changing options in the motherboard BIOS. If the motherboard is damaged, the motherboard's CMOS battery runs out, or the BIOS settings are changed by accident, the RAID is lost. A RAID built by the motherboard cannot be restored once it is lost, and the consequences are very serious. A RAID built on an external RAID card is not affected by motherboard damage or a dead CMOS battery. An external RAID card is therefore much safer than motherboard-integrated RAID.

Advantages and disadvantages

Soft RAID depends on the OS, while hard RAID is independent of it, so hard RAID generally offers better performance and data security.

Advantages:
Hard RAID:

CPU usage and overall performance are the best of the three types.
The array can be rebuilt when a hard disk fails. If the RAID card itself is damaged, it can be replaced.
Soft RAID:

Low cost, only need motherboard support, no disk array card required
Simple implementation
Semi-soft RAID:

Performance and stability are much better than with soft RAID
Easier to migrate to other computers
Disadvantages:
Hard RAID:

Equipment cost is the highest of the three types
Need to have some technical knowledge
Soft RAID:

Consumes more CPU resources for the RAID calculations, which can cause problems such as heat and makes it less stable.
Depends on the operating system, and the operating system. . . .
If the motherboard is damaged, it may be difficult to purchase the same motherboard to rebuild the RAID.
Semi-soft RAID:

More than enough