Wednesday, November 2, 2011

Redundant Array of Inexpensive Disks (RAID)


  • ChatGPT
RAID (Redundant Array of Independent Disks) is a technology that combines multiple hard drives into a single unit to improve performance, redundancy, or both. Different RAID levels offer different benefits and trade-offs. Here’s an explanation of RAID 0, RAID 1, and RAID 5, along with examples of how they work.
RAID 0 (Striping)

Definition: RAID 0, also known as striping, splits data evenly across two or more disks. It does not provide redundancy or fault tolerance but improves performance by allowing multiple disks to work simultaneously.

Purpose:

    Increase Performance: Provides faster read and write speeds by splitting data across multiple disks.
    Maximize Disk Utilization: Uses the full capacity of all disks in the array.

Example:
Imagine you have two 1 TB hard drives (Disk A and Disk B). With RAID 0, data is split into stripes and written across both disks. For example:

Disk A: [Stripe 1] [Stripe 3] [Stripe 5]
Disk B: [Stripe 2] [Stripe 4] [Stripe 6]

When you save a file, it's split into chunks and written across both disks. For a read operation, data can be read from both disks simultaneously, increasing speed.
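The round-robin distribution described above can be sketched in a few lines of Python. This is only an illustration: the function names `stripe` and `unstripe` and the tiny stripe size are assumptions made for the example, and a real controller works on disk blocks rather than Python lists.

```python
def stripe(data: bytes, num_disks: int, stripe_size: int) -> list:
    """Cut data into fixed-size stripes and deal them round-robin across disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), stripe_size):
        chunk = data[offset:offset + stripe_size]
        stripe_number = offset // stripe_size
        disks[stripe_number % num_disks].append(chunk)
    return disks


def unstripe(disks: list) -> bytes:
    """Reassemble the data by reading the stripes back in round-robin order."""
    out = []
    for row in range(max(len(d) for d in disks)):
        for disk in disks:
            if row < len(disk):  # the last row may be only partly filled
                out.append(disk[row])
    return b"".join(out)


disk_a, disk_b = stripe(b"ABCDEFGHIJKL", num_disks=2, stripe_size=2)
print(disk_a)  # stripes 1, 3, 5
print(disk_b)  # stripes 2, 4, 6
```

Because every stripe lives on exactly one disk, `unstripe` needs all of the disks; losing either list makes the file unrecoverable.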

Pros:

    High performance for both read and write operations.
    Maximizes disk capacity.

Cons:

    No redundancy or fault tolerance. If one disk fails, all data is lost.

RAID 1 (Mirroring)

Definition: RAID 1, also known as mirroring, duplicates data across two or more disks. Each disk in the array contains an exact copy of the data.

Purpose:

    Provide Redundancy: Ensures data is not lost if one disk fails.
    Improve Read Performance: Can improve read speed as data can be read from any disk in the array.

Example:
If you have two 1 TB hard drives (Disk A and Disk B), RAID 1 will store identical copies of the data on both disks:

Disk A: [Data Copy]
Disk B: [Data Copy]

When data is written to the array, it's simultaneously written to both disks. If Disk A fails, Disk B still has the data.
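A toy model of this behaviour, written as a sketch rather than a real driver: the class name `MirroredArray` and its methods are hypothetical, and each "disk" is just a Python dictionary mapping a block number to its contents.

```python
class MirroredArray:
    """Toy RAID 1: every write goes to all working disks, reads use any one."""

    def __init__(self, num_disks: int = 2):
        self.disks = [dict() for _ in range(num_disks)]  # block number -> data
        self.failed = set()

    def write(self, block: int, data: bytes) -> None:
        # The same data is written to every disk that is still working.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def fail(self, index: int) -> None:
        # Simulate a dead drive: its contents are gone.
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, block: int) -> bytes:
        # Any surviving mirror can serve the read.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise IOError("all mirrors have failed")


array = MirroredArray()
array.write(0, b"payroll records")
array.fail(0)          # Disk A dies
print(array.read(0))   # still served from Disk B
```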

Pros:

    High redundancy; data is safe if one disk fails.
    Improved read performance.

Cons:

    Reduced storage capacity (effective capacity is the size of one disk).
    Write performance is not improved.

RAID 5 (Striping with Parity)

Definition: RAID 5 uses striping (like RAID 0) but includes parity data to provide redundancy. Parity data is distributed across all disks in the array and is used to reconstruct data in case of a disk failure.

Purpose:

    Balance Performance and Redundancy: Provides a good mix of improved performance and data protection.
    Efficient Use of Storage: Requires only one disk’s worth of space for parity, regardless of the number of disks.

Example:
If you have three 1 TB hard drives (Disk A, Disk B, Disk C), RAID 5 distributes data and parity information across all disks. For instance:

Disk A: [Data 1] [Data 3] [Parity]
Disk B: [Data 2] [Parity] [Data 4]
Disk C: [Parity] [Data 5] [Data 6]

If one disk fails, the missing data can be reconstructed using the remaining data and parity information.
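Parity in RAID 5 is normally a bytewise XOR of the data blocks in a stripe, which is what makes the reconstruction possible. A minimal sketch, assuming two data blocks and one parity block per stripe (as with three disks); the helper name `xor_blocks` and the sample bytes are made up for the example.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data1 = b"\x0f\xf0"                # block stored on one disk
data2 = b"\x33\x55"                # block stored on a second disk
parity = xor_blocks(data1, data2)  # block stored on the third disk

# Suppose the disk holding data1 fails: XOR the survivors to rebuild it.
recovered = xor_blocks(data2, parity)
print(recovered == data1)
```

The same XOR works for any number of data blocks, which is why only one disk's worth of space is needed for parity.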

Pros:

    Good balance of performance and redundancy.
    Efficient use of disk space; only one disk’s worth of space is used for parity.

Cons:

    Write performance can be slower due to parity calculations.
    If more than one disk fails, data is lost.

Summary

    RAID 0: Stripes data across multiple disks to improve performance. No redundancy. Example: Two 1 TB disks give 2 TB total storage.
    RAID 1: Mirrors data across multiple disks for redundancy. Example: Two 1 TB disks give 1 TB total usable storage.
    RAID 5: Stripes data and distributes parity across multiple disks for both performance and redundancy. Example: Three 1 TB disks give 2 TB usable storage (one disk’s worth of space is used for parity).

Each RAID level has its own use case and benefits, so the choice depends on your specific needs for performance, redundancy, and storage capacity.
  • RAID (Redundant Array of Inexpensive Disks or Drives, or Redundant Array of Independent Disks) is a data storage virtualization technology that combines multiple physical disk drive components into one or more logical units for the purposes of data redundancy, performance improvement, or both.

https://en.wikipedia.org/wiki/RAID



RAID originally stood for Redundant Array of Inexpensive Disks, but disk vendors did not like the word "inexpensive" because of its cost implications, so the name was changed to Redundant Array of Independent Disks.

  • RAID 0

block-level striping without parity or mirroring
has no redundancy
RAID 0 combines unused disk space on two or more hard drives into a single logical volume, with data being written in equally sized stripes across all the disks
RAID 0 implements a striped disk array: the data is broken down into fragments called blocks, and each block is written to a separate disk drive
The number of blocks is dictated by the stripe size, which is a configuration parameter of the array.
The blocks are written to their respective drives simultaneously.
This allows smaller sections of the entire chunk of data to be read off the drives in parallel, increasing bandwidth

Advantages
disk access is faster, making RAID 0 significantly faster than a single hard disk and generally faster than RAID levels that must also write parity or mirror copies
I/O performance is greatly improved by spreading the I/O load across many channels and drives
Best performance is achieved when data is striped across multiple controllers with only one drive per controller
No parity calculation overhead is involved
Very simple design
Easy to implement
All storage capacity is used; there is no disk overhead
By using multiple disks, reads and writes are performed simultaneously across all drives
RAID 0 is simply data striped over several disks. This gives a performance advantage, as it is possible to read parts of a file in parallel. However, not only is there no data protection, the array is actually less reliable than a single disk, as all the data is lost if any single disk in the stripe fails.
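A rough way to see why a stripe set is less reliable than a single disk: if each disk fails independently with the same probability over some period, the array is lost as soon as any one of them fails. The 5% figure below is an arbitrary illustrative assumption, not a measured failure rate, and the function name is hypothetical.

```python
def raid0_failure_probability(p_disk: float, num_disks: int) -> float:
    """Probability that at least one of num_disks independent disks fails."""
    return 1 - (1 - p_disk) ** num_disks


# Assumed 5% chance that any one disk fails during the period of interest.
for n in (1, 2, 4):
    print(n, "disk(s):", round(raid0_failure_probability(0.05, n), 4))
```

The risk grows with every disk added to the stripe, which is the price paid for the extra speed and capacity.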


Disadvantages

Not a "True" RAID because it is NOT fault-tolerant
The failure of just one drive will result in all data in an array being lost
Should never be used in mission critical environments
The downside of RAID 0 is that if any disk in the array fails, the data is lost and must be restored from backup
If two disks of different sizes are used, for example 100 GB and 1 TB, the array storage will equal only 200 GB. Using disks with different capacities is therefore uneconomical, because the extra space on the larger disk cannot be used.
The whole capacity is equal to the number of disks multiplied by the capacity of the smallest one, e.g. with two HDDs of 250 GB and 500 GB, the size of the array will be 500 GB.

Recommended Applications

Video Production and Editing
Image Editing
Pre-Press Applications
Any application requiring high bandwidth
RAID 0 is ideal for non-critical storage of data that have to be read/written at a high speed, such as on a Photoshop image retouching station

  • RAID 1
For highest performance, the controller must be able to perform two concurrent separate reads per mirrored pair or two duplicate writes per mirrored pair.
RAID Level 1 requires a minimum of 2 drives to implement
data mirroring
mirroring without parity or striping
implementing RAID 1 with a separate controller for each drive in order to perform simultaneous reads (and writes) is sometimes called multiplexing (or duplexing when there are only 2 drives)
Data are stored twice by writing them to both the data disk (or set of data disks) and a mirror disk (or set of disks)
RAID 1 systems are often combined with RAID 0 to improve performance
If a disk fails, the data do not have to be rebuilt; they just have to be copied to the replacement disk
When information is written to the hard disk, it is automatically and simultaneously written to the second hard disk. Both of the hard disks in the mirrored configuration use the same hard disk controller; the partitions used on the hard disk need to be approximately the same size to establish the mirror.
An extension of RAID 1 is disk duplexing. Disk duplexing is the same as mirroring with the exception of one key detail: It places the hard disks on separate hard disk controllers, eliminating the single point of failure.
usable storage is equal to the capacity of the smallest disk, e.g. in an array composed of 3 disks (250 GB, 500 GB and 1 TB), the usable space will be 250 GB

Disadvantages
Highest disk overhead of all RAID types (100%) - inefficient
The main disadvantage is that the effective storage capacity is only half of the total disk capacity because all data get written twice
Software RAID 1 solutions do not always allow a hot swap of a failed disk (meaning it cannot be replaced while the server keeps running). Ideally a hardware controller is used.
two 100GB hard drives only provide 100GB of storage space
When both mirrored disks share one hard disk controller (plain mirroring rather than duplexing), that controller is a single point of failure. If it were to fail, the data would be inaccessible on either drive.

Recommended Applications
Accounting
Payroll
Financial
Any application requiring very high availability


  • RAID 5

block-level striping with distributed parity
a single drive failure results in reduced performance of the entire array until the failed drive has been replaced and the associated data rebuilt.
Upon drive failure, any subsequent reads can be calculated from the distributed parity such that the drive failure is masked from the end user.
parity information is spread across all the drives
if you have three 40GB hard disks, you have 80GB of storage space with the other 40GB used for parity
RAID 5 suffers from poor write performance because the parity has to be calculated and then written across several disks



Recommended Applications

File and Application servers
Database servers
Web, E-mail, and News servers
Intranet servers
Most versatile RAID level


References:
http://www.raid.com/raidedu/0
http://en.wikipedia.org/wiki/RAID
http://www.brainbell.com/tutorials/Networking/RAID_0_Stripe_Set_Without_Parity.html
http://www.prepressure.com/library/technology/raid
http://blog.open-e.com/what-is-raid-0/
http://www.lascon.co.uk/d008005.htm
http://www.youtube.com/watch?v=6yDpTj2lePI&feature=fvwrel
http://www.youtube.com/watch?v=PP0iQs8qBNU&feature=related
http://www.youtube.com/watch?v=LTq4pGZtzho&feature=related



  • RAID Structure


The general idea behind RAID is to employ a group of hard drives together with some form of duplication, either to increase reliability or to speed up operations, or sometimes both.


  • RAID

Redundant Array of Inexpensive Disks or Redundant Array of Independent Disks

Mirroring provides reliability but is expensive

Striping improves performance, but does not improve reliability.






  • RAID Levels


RAID Level 0 - This level includes striping only, with no mirroring.
RAID Level 1 - This level includes mirroring only, no striping.
RAID Level 5 - This level is similar to level 4 (block-level striping with a dedicated parity disk), except the parity blocks are distributed over all disks, thereby more evenly balancing the load on the system.
For any given block on the disk(s), one of the disks will hold the parity information for that block and the other N-1 disks will hold the data.
Note that the same disk cannot hold both data and parity for the same block, as both would be lost in the event of a disk crash.
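One way to picture the distributed parity is to print which disk holds the parity block for each stripe. The rotation rule used below (parity starts on the last disk and moves one disk to the left per stripe) is an assumption chosen to match the three-disk diagram earlier in this post; real controllers offer several different rotation schemes. The function name `raid5_layout` is hypothetical.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list:
    """Return one row per stripe; 'P' marks the disk holding that stripe's parity."""
    rows = []
    block = 1
    for stripe in range(num_stripes):
        # Assumed rotation: last disk first, then one disk to the left each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(num_disks=3, num_stripes=3):
    print(row)
```

Each row contains exactly one parity block and N-1 data blocks, and no disk holds both data and parity for the same stripe.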





  • Nested RAID levels


There are also two RAID levels which combine RAID levels 0 and 1 (striping and mirroring) in different combinations, designed to provide both performance and reliability at the expense of increased cost.

RAID level 0 + 1: disks are first striped, and then the striped set is mirrored to another set. This level generally provides better performance than RAID level 5.


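RAID level 1 + 0 mirrors disks in pairs, and then stripes the mirrored pairs. The storage capacity, performance, etc. are all the same, but there is an advantage to this approach in the event of multiple disk failures: the array keeps working as long as each mirrored pair still has one good disk, whereas in 0 + 1 a single failed disk takes its whole striped set out of service.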
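The difference under multiple failures can be checked by brute force. The sketch below assumes four disks numbered 0 to 3, grouped as (0, 1) and (2, 3), and the simple textbook model in which a striped set is unusable once any of its disks has failed. All names are illustrative.

```python
from itertools import combinations

GROUPS = [(0, 1), (2, 3)]  # mirrored pairs for 1+0, striped sets for 0+1


def raid10_survives(failed: set) -> bool:
    # 1+0: every mirrored pair must keep at least one working disk.
    return all(any(d not in failed for d in pair) for pair in GROUPS)


def raid01_survives(failed: set) -> bool:
    # 0+1: at least one striped set must be completely intact.
    return any(all(d not in failed for d in stripe_set) for stripe_set in GROUPS)


def count_survivable(survives, num_disks: int = 4, num_failures: int = 2) -> int:
    """Count how many combinations of failed disks the array survives."""
    return sum(
        survives(set(failed))
        for failed in combinations(range(num_disks), num_failures)
    )


print("1+0 survives", count_survivable(raid10_survives), "of 6 two-disk failures")
print("0+1 survives", count_survivable(raid01_survives), "of 6 two-disk failures")
```

Under these assumptions 1 + 0 survives four of the six possible two-disk failures and 0 + 1 only two, while both survive any single failure.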


http://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/12_MassStorage.html





  • All About RAID



mirroring
multiple disks contain identical data

striping
sequential blocks of data are split among multiple disks

fault tolerance
parity data is stored to recover if a disk fails

RAID 0 (striping)
usable capacity is roughly the same as the physical capacity of the drives

RAID 1 (mirroring)
usable capacity is roughly half of the physical capacity

RAID 5 (parity across disks)
min 3 drives
usable capacity is physical capacity minus one drive

RAID 6 (double parity)
similar to RAID 5
min 4 drives
usable capacity is physical capacity minus two drives

RAID 10 (combination of RAID 1 and RAID 0)
usable capacity is half of the physical capacity
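These capacity rules are easy to turn into a small calculator. This is a sketch under two assumptions: every disk contributes only as much space as the smallest disk in the array (the rule quoted earlier for mismatched disks), and RAID 10 is built from two-disk mirrors. The function name `usable_capacity` is made up for the example.

```python
def usable_capacity(level: int, disk_sizes_gb: list) -> int:
    """Usable capacity in GB for RAID 0, 1, 5, 6 or 10."""
    n = len(disk_sizes_gb)
    smallest = min(disk_sizes_gb)  # larger disks are truncated to the smallest
    if level == 0 and n >= 2:
        return n * smallest
    if level == 1 and n >= 2:
        return smallest                 # every disk holds the same copy
    if level == 5 and n >= 3:
        return (n - 1) * smallest       # one disk's worth of parity
    if level == 6 and n >= 4:
        return (n - 2) * smallest       # two disks' worth of parity
    if level == 10 and n >= 4 and n % 2 == 0:
        return (n // 2) * smallest      # half the disks hold mirror copies
    raise ValueError("unsupported level or too few disks")


print(usable_capacity(0, [250, 500]))    # two mismatched disks, striped
print(usable_capacity(5, [40, 40, 40]))  # three 40 GB disks with parity
print(usable_capacity(10, [1000] * 4))   # four 1 TB disks, mirrored and striped
```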




  • RAID 5 & RAID 10 Tutorial & Explanation (NCIX Tech Tips #79)



RAID 6
any 2 drives out of 4 can fail without losing data

RAID 10
it is like RAID 1 and RAID 0 combined
practical up to 4 drives
4 drives = 4 TB = 1 TB x 4
4 TB of raw capacity gives 2 TB of actual storage
it is like writing through all 4 drives and reading through all 4 drives
fault tolerance
you can lose 2 drives out of 4 and still survive, as long as they are not both in the same mirrored pair
no parity calculation


  • what is RAID



  • Raid 0 - 1 - 5 - 0+1 - 1+0
Understanding Different RAID Levels
CompTIA A+ Video Training - RAID and SCSI