People often ask: Should I raid my disks?
The question is simple; unfortunately, the answer is not. So here is another guide to help you decide when a RAID array is advantageous and how to go about it. Note that this guide also applies to SSDs, with the exception of the parts about mechanical failure.
What is a RAID?
RAID is the acronym for "Redundant Array of Inexpensive Disks". The concept originated at the University of California, Berkeley in 1987 and was intended to create large storage capacity from smaller disks, without the need for large, highly reliable disks, which were very expensive at that time, often ten times the price of smaller disks. Today hard disk prices have fallen so much that it is often more attractive to buy a single 1 TB disk than two 500 GB disks. That is why RAID is now often described as "Redundant Array of Independent Disks".
The idea behind RAID is to have a number of disks co-operate in such a way that it looks like one big disk. Note that 'Spanning' is not in any way comparable to RAID. It is just a way, like inverse partitioning, to extend the base partition across multiple disks, without changing the method of reading and writing to that extended partition.
Why use a RAID?
With today's lower disk prices, why would a video editor consider a RAID array? There are two reasons:
1. Redundancy (or security)
2. Performance
Note that it can be a combination of both reasons; it is not an 'either/or' choice.
Does a video editor need RAID?
No, if the above two reasons, redundancy and performance are not relevant. Yes if either or both reasons are relevant.
Re 1. Redundancy
Every mechanical disk will eventually fail, sometimes on the first day of use, sometimes only after several years of usage. When that happens, all data on that disk are lost and the only solution is to get a new disk and recreate the data from a backup (if you have one) or through tedious and time-consuming work. If that does not bother you and you can spare the time to recreate the data that were lost, then redundancy is not an issue for you. Keep in mind that disk failures often occur at inconvenient moments, on a weekend when the shops are closed and you can't get a replacement disk, or when you have a tight deadline.
Re 2. Performance
Opponents of RAID will often say that any modern disk is fast enough for video editing, and they are right, but only to a certain extent. As the fill rate of a disk goes up, performance goes down, sometimes by 50%. As the number of simultaneous activities on the disk goes up, such as accessing (reading or writing) the pagefile, media cache, previews, media, project file and output file, performance goes down the drain. The more tracks you have in your project, the more strain is put on your disk: 10 tracks require 10 times the bandwidth of a single track. The more applications you have open, the more your pagefile is used. This is especially apparent on systems with limited memory.
The following chart shows how fill rates on a single disk will impact performance:
Remember that I said previously the idea behind RAID is to have a number of disks co-operate in such a way that it looks like one big disk. That means a RAID will not fill up as fast as a single disk and not experience the same performance degradation.
RAID basics
Now that we have established the reasons why people may consider RAID, let's have a look at some of the basics.
Single or Multiple?
There are three methods to configure a RAID array: mirroring, striping and parity check. These are called levels, and levels are subdivided into single or multiple levels, depending on the method used. A single-level RAID0 is striping only, while a multiple-level RAID15 is a combination of mirroring (1) and parity check (5). Multiple levels are designated by combining two single levels, like RAID10, which is a combination of single-level RAID0 with single-level RAID1.
Hardware or Software?
The difference is quite simple: hardware RAID controllers have their own processor and usually their own cache. Software RAID controllers use the CPU and the RAM on the motherboard. Hardware controllers are faster but also more expensive. For RAID levels without parity check, like RAID0, RAID1 and RAID10, software controllers are quite good with a fast PC.
The common Promise and Highpoint cards are all software controllers that (mis)use the CPU and RAM. Real hardware RAID controllers all use their own IOP (I/O Processor) and cache (ever wondered why these hardware controllers are expensive?).
There are two kinds of software RAIDs. One is controlled by the BIOS/drivers (like Promise/Highpoint) and the other is solely OS dependent. The first kind can be booted from; the second can only be accessed after the OS has started. In performance terms they do not differ significantly.
For the technically inclined: Cluster size, Block size and Chunk size
In short: Cluster size applies to the partition and Block or Stripe size applies to the array.
With a cluster size of 4 KB, data are distributed across the partition in 4 KB parts. Suppose you have a 10 KB file: three full clusters will be occupied, 4 KB - 4 KB - 2 KB. The remaining 2 KB is called slack space and cannot be used by other files. With a block size (stripe) of 64 KB, data are distributed across the array disks in 64 KB parts. Suppose you have a 200 KB file: the first 64 KB is located on disk A, the second 64 KB on disk B, the third 64 KB on disk C and the remaining 8 KB on disk D. Here there is no slack space, because the block size is subdivided into clusters. When working with audio/video material, a large block size is faster than a small one. When working with smaller files, a smaller block size is preferred.
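The arithmetic in the two examples above can be checked with a small Python sketch. The function names `cluster_usage` and `stripe_layout` are made up for illustration, and the round-robin placement across disks A to D simply follows the description above; a real controller may lay out data differently.

```python
import math

def cluster_usage(file_kb, cluster_kb=4):
    """Return (clusters occupied, slack space in KB) for one file."""
    clusters = math.ceil(file_kb / cluster_kb)
    slack_kb = clusters * cluster_kb - file_kb
    return clusters, slack_kb

def stripe_layout(file_kb, stripe_kb=64, disks=("A", "B", "C", "D")):
    """Return (disk, KB written) pairs, placing one stripe per disk in turn."""
    layout = []
    remaining = file_kb
    index = 0
    while remaining > 0:
        part = min(stripe_kb, remaining)
        layout.append((disks[index % len(disks)], part))
        remaining -= part
        index += 1
    return layout

print(cluster_usage(10))   # (3, 2): three clusters, 2 KB of slack space
print(stripe_layout(200))  # [('A', 64), ('B', 64), ('C', 64), ('D', 8)]
```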
Sometimes you have an option to set 'Chunk size', depending on the controller. It is the minimal size of a data request from the controller to a disk in the array and is only useful when striping is used. Suppose you have a block size of 16 KB and you want to read a 1 MB file. The controller needs to read a 16 KB block 64 times. With a chunk size of 32 KB, the first two blocks will be read from the first disk, the next two blocks from the next disk, and so on. If the chunk size is 128 KB, the first 8 blocks will be read from the first disk, the next 8 blocks from the second disk, etcetera. Smaller chunks are advisable with smaller files; larger chunks are better for larger (audio/video) files.
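To make the chunk size example concrete, here is a minimal Python sketch of which blocks end up on which disk. The helper name `blocks_per_disk` and the four-disk array are assumptions for illustration only; the block and chunk sizes are the ones used above.

```python
def blocks_per_disk(file_kb, block_kb, chunk_kb, n_disks):
    """Return, per disk, the list of block numbers read from that disk."""
    blocks_total = file_kb // block_kb
    blocks_per_chunk = chunk_kb // block_kb
    per_disk = [[] for _ in range(n_disks)]
    for block in range(blocks_total):
        # Each chunk holds several consecutive blocks; chunks rotate over the disks.
        disk = (block // blocks_per_chunk) % n_disks
        per_disk[disk].append(block)
    return per_disk

# 1 MB file (1024 KB), 16 KB blocks, hypothetical 4-disk striped array
small_chunks = blocks_per_disk(1024, 16, 32, 4)
large_chunks = blocks_per_disk(1024, 16, 128, 4)

print(small_chunks[0][:2], small_chunks[1][:2])  # [0, 1] [2, 3]
print(large_chunks[0][:8])                       # [0, 1, 2, 3, 4, 5, 6, 7]
```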
RAID Levels
For a full explanation of various RAID levels, look here: http://www.acnc.com/04_01_00/html
What are the benefits of each RAID level for video editing and what are the risks and benefits of each level to help you achieve better redundancy and/or better performance? I will try to summarize them below.
RAID0
The Band AID of RAID. There is no redundancy! The risk of losing all data is multiplied by the number of disks in the array. A 2-disk array carries twice the risk of a single disk; an X-disk array carries X times the risk of losing it all.
A RAID0 is perfectly OK for data that you will not worry about losing, like the pagefile, media cache, previews or rendered files. It may be a hassle if you have media files on it, because a failure requires recapturing, but it is not the end of the world. It will be disastrous for project files.
Performance-wise, a RAID0 is almost X times as fast as a single disk, X being the number of disks in the array.
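The "X times the risk" rule for RAID0 can be compared with the exact figure in a short Python sketch. It assumes disks fail independently, and the 3% yearly failure probability is a purely hypothetical number chosen for illustration. Under those assumptions the exact chance of losing the array is 1 - (1 - p)^X, which stays close to X times p as long as p is small.

```python
def array_loss_probability(disk_failure_prob, disks):
    """Chance that at least one of `disks` independent drives fails."""
    return 1 - (1 - disk_failure_prob) ** disks

p = 0.03  # hypothetical yearly failure probability of a single disk
for n in (1, 2, 4, 8):
    exact = array_loss_probability(p, n)
    rule_of_thumb = n * p  # the "X times the risk" approximation
    print(f"{n} disks: exact {exact:.3f}, X-times rule {rule_of_thumb:.3f}")
```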
RAID1
The RAID level for the paranoid. It gives no performance gain whatsoever. It gives you redundancy, at the cost of a disk. If you are meticulous about backups and make them all the time, RAID1 may be a better solution, because you can never forget to make a backup, you can restore instantly. Remember backups require a disk as well. This RAID1 level can only be advised for the C drive IMO if you do not have any trust in the reliability of modern-day disks. It is of no use for video editing.
RAID3
The RAID level for video editors. There is redundancy! There is only a small performance hit when rebuilding an array after a disk failure, due to the dedicated parity disk. There is quite a performance gain achievable, but the drawback is that it requires a hardware controller from Areca. You could do worse, but apart from it being the Rolls-Royce amongst the hardware controllers, it is expensive like the car.
Performance-wise, it will achieve around 0.85 x (X-1) times the speed of a single disk on reads and 0.60 x (X-1) times on writes, with X being the number of disks in the array. So with a 6-disk array in RAID3, you get around 0.85 x (6-1) = 4.25, or 425% of the performance of a single disk, on reads, and 0.60 x (6-1) = 3.0, or 300%, on writes.
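The rule of thumb above is easy to try for other array sizes. The following Python sketch only restates the 85% and 60% factors from this guide; the function name `raid3_speedup` is invented for illustration, and the results are rough estimates, not measurements.

```python
def raid3_speedup(disks, read_factor=0.85, write_factor=0.60):
    """Estimate RAID3 read/write speed as a multiple of a single disk."""
    data_disks = disks - 1  # one disk is reserved for parity
    return read_factor * data_disks, write_factor * data_disks

for disks in (4, 6, 8):
    reads, writes = raid3_speedup(disks)
    print(f"{disks} disks: reads {reads:.0%}, writes {writes:.0%}")
# The 6-disk line matches the example above: reads 425%, writes 300%
```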
RAID5 & RAID6
The RAID level for non-video applications with distributed parity. This makes for a somewhat severe hit in performance in case of a disk failure. The double parity in RAID6 makes it ideal for NAS applications.
The performance gain is slightly lower than with a RAID3. RAID6 requires a dedicated hardware controller. RAID5 can be run on a software controller, but the CPU overhead negates the performance gain to a large extent.
RAID10
The RAID level for paranoids in a hurry. It delivers the same redundancy as RAID1, but since it is a multilevel RAID, combined with a RAID0, a four-disk array delivers twice the performance of a single disk at four times the cost, apart from the controller. The main advantage is that you can have two disk failures at the same time without losing data, provided the two failed disks are not in the same mirrored pair, but what are the chances of that happening?
RAID30, 50 & 60
These are just striped arrays of RAID 3, 5 or 6, which doubles the speed while keeping redundancy at the same level.
EXTRAS
RAID level 0 is striping, RAID level 1 is mirroring and RAID levels 3, 5 & 6 are parity check methods. For parity check methods, dedicated controllers offer the possibility of defining a hot-spare disk. A hot-spare disk is an extra disk that does not belong to the array, but is instantly available to take over from a failed disk in the array. Suppose you have a 6-disk RAID3 array with a single hot-spare disk and assume one disk fails. What happens? The data on the failed disk can be reconstructed onto the hot-spare in the background, while you keep working with negligible impact on performance. In mere minutes your system is back at the performance level you had before the disk failure. Sometime later you take out the failed drive, replace it with a new drive and define that as the new hot-spare.
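How a parity check lets a controller reconstruct a failed disk can be illustrated with XOR parity, the usual basis for single-parity levels such as RAID3 and RAID5. The Python sketch below is a simplified illustration, not how any particular controller is implemented: the four-byte blocks and the helper name `xor_blocks` are made up, and a real array works on far larger stripes.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical 6-disk RAID3: five data disks plus one dedicated parity disk
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE"]
parity = xor_blocks(data)

# Disk 2 fails: rebuild its contents from the surviving disks and the parity
failed = 2
survivors = [blk for i, blk in enumerate(data) if i != failed] + [parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[failed])  # True
```

With a hot-spare defined, the rebuilt data would be written to the spare disk instead of being printed.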
As stated earlier, dedicated hardware controllers use their own IOP and their own cache instead of using the memory on the mobo. The larger the cache on the controller, the better the performance, but the main benefits of cache memory come when handling random R+W activities. For sequential activities, like video editing, it does not pay to use more than 2 GB of cache.
REDUNDANCY
(or security)
Not using RAID entails the risk of a drive failing and losing all data. The same applies to using RAID0 (or better said AID0), only multiplied by the number of disks in the array.
RAID1 or 10 overcomes that risk by offering a mirror, an instant backup in case of failure at high cost.
RAID3, 5 or 6 offer protection against disk failure by reconstructing the lost data in the background (1 disk for RAID3 & 5, 2 disks for RAID6) while you continue your work. This is enhanced even further by the use of hot-spares (a double assurance).
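The trade-off between the cost of redundancy and the protection it buys can be summarized in a small Python sketch. The function name `raid_summary` and the 1 TB disk size are assumptions for illustration. The second number is the count of disk failures an array is guaranteed to survive; a RAID10 may survive more if the failed disks are in different mirrored pairs.

```python
def raid_summary(level, disks, disk_tb):
    """Return (usable TB, guaranteed number of survivable disk failures)."""
    if level == "RAID0":
        return disks * disk_tb, 0
    if level == "RAID1":
        return disk_tb, disks - 1
    if level in ("RAID3", "RAID5"):
        return (disks - 1) * disk_tb, 1
    if level == "RAID6":
        return (disks - 2) * disk_tb, 2
    if level == "RAID10":
        return (disks // 2) * disk_tb, 1
    raise ValueError(f"unknown level: {level}")

# Hypothetical arrays built from 1 TB disks
for level, disks in [("RAID0", 2), ("RAID1", 2), ("RAID3", 6),
                     ("RAID5", 6), ("RAID6", 6), ("RAID10", 4)]:
    usable, failures = raid_summary(level, disks, 1)
    print(f"{level} with {disks} disks: {usable} TB usable, survives {failures} failure(s)")
```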
PERFORMANCE
RAID0 offers the best performance increase over a single disk, followed by RAID3, then RAID5 and finally RAID6. RAID1 does not offer any performance increase.
Hardware RAID controllers offer the best performance and the best options (like adjustable block/stripe size and hot-spares), but they are costly.
SUMMARY
If you only have 3 or 4 disks in total, forget about RAID. Set them up as individual disks, or the better alternative, get more disks for better redundancy and better performance. What does it cost today to buy an extra disk when compared to the downtime you have when a single disk fails?
If you have room for 4 or more disks, apart from the OS disk, consider a RAID3 if you have an Areca controller; otherwise consider a RAID5.
If you have even more disks, consider a multilevel array by striping a parity check array to form a RAID30, 50 or 60.
If you can afford the investment, get an Areca controller with battery backup module (BBM) and 2 GB of cache. Avoid the use of software RAIDs as much as possible, especially under Windows.
RAID, if properly configured, will give you added redundancy (or security) to protect you from disk failure while you continue working, and will give you increased performance.
Look carefully at this chart to see what a properly configured RAID can do to performance and compare it to the earlier single disk chart to see the performance difference, while taking into consideration that you can have one disk (in each array) fail without data loss:
Hope this helps in deciding whether RAID is worthwhile for you.
WARNING: If you have a power outage without a UPS, all bets are off.
A power outage can destroy the contents of all your disks if you don't have a proper UPS. A BBM may not be sufficient to help in that case.
Today is hotter in my office, and the hottest drive was 50 degrees C.
I am not happy with the performance of my RAID system. What do y'all know about the Raid Rocket 4320 Hardware PCIe card? I am using (4) 74 GB SAS 10k drives in RAID3 (I think that's correct). The RAID3 is for my D drive and I boot C off of another larger SAS 10k drive. Too often, it appears like my disk system goes off to think for a few seconds before anything changes on the display!
Has it gotten to the point that RAID is not required as much due to the larger & faster drives coming out these days?
Suggestions please.
Thanks,
William
Hi Harm, your articles have been informative. Could you offer some suggestions configuring a RAID with the following:
Asus Rampage III Formula: 2 Sata 3 ports and 6 Sata 2 ports. Would you put the OS on one of the Sata 2 or Sata 3?
http://www.asus.com/Motherboards/Intel_Socket_1366/Rampage_III_Formula/#specifications
I have 6 Hitachi 7k3000. (plan to add more and work around these)
There is room for 9 HDD in the case.
I've noticed in your set up you have 2 drives in Raid 0 off the MB, and a Raid card for another raid.
I probably want to add a Raid card when I do this. So would it be best to have the OS on an Intel Sata II plug (1 of 6), two Hitachi in Raid 0 on Sata 6 (if it can do raid?) and the remaining 5 Sata II ports for Raid 3 with a raid card?
I suppose another option is OS on Sata III port and configure the remaining 5 or 6 Sata II ports for Raid 3. That would leave one Sata III port.
That's the limit of my understanding. I haven't configured any Raid, so I don't know if I'm underthinking this or overthinking it.
So far, in my experience, the BIOS offers IDE, AHCI or Raid for the 6 Sata II ports. But I've read that it can be set up for both AHCI and Raid. I'm not sure how to set this up. Or the Sata III ports. For example, can a Raid 3 be on both the Sata II and Sata III ports? I've read that only the Sata II ports (Intel) are Raid.
Jim also suggested an Intel utility for switching IDE, AHCI, Raid but I couldn't find it.
Also not sure what the limitations are, for example can you have C: on Sata III, 4 Sata II ports for Raid 3 with Raid card, and 2 Sata ports for Raid 0 on MB?
Anyway, I'm trying to figure out how the Sata II ports and Sata III ports are compatible with Raid on MB and Raid Card for editing with PP5.5.
Thanks if anyone can offer any suggestions.
I've found most of the solutions in the drive set up thread: http://forums.adobe.com/message/3023501#3023501
With 6-8 disks: one disk for the OS and the rest in Raid 3 or 5.
Maybe this thread was created before that one, because most of the posts here are about a 4 disk set up and not about raid. I also haven't found any info about the compatibility between Sata II and Sata III ports.
I'll be installing 5.5 later this week and look forward to working with the program and sort out the set up issues and conflicts as best I can.
Is it a bad idea to use hard drives that are the same make and model but purchased a few months apart for a raid 0 or 5 array?
I've got 3 Samsung HD103SJ's and I'm thinking about getting a few more, plus a 150 GB VelociRaptor, to experiment with some software raids before I take the dive into some hardware raid.
Thanks.
No, so long as they're the same make and model. I have 8x WD2003FYYS disks in an 8-member RAID6 that I bought over a month or two spread from different vendors. They run great!
Dear Harm,
In my editing in Premiere, I find huge lag times when making color changes to clips in my project, which contains several dozen clips (around 8 minutes of HD XAVCS footage in total). All of my videos are stored on an external WD 'My Passport for Mac' drive. I am beginning to think that this arrangement is the cause of the lag. Would getting an external SSD solve this issue? Would the connection require USB 3.0 or Thunderbolt to actually pay off?
Any help is appreciated!
Have a look at Tweakers Page - External Drives and the other pages at Tweakers Page
Those may help to answer your question.