Copyright © 2015 Kendal Van Dyke. All rights reserved.
Kendal is a database strategist, community advocate, public speaker, and blogger. A practiced IT professional with over 15 years of SQL Server experience, Kendal excels at disaster recovery, high availability planning/implementation, & debugging/troubleshooting mission critical SQL Server environments. Kendal is a Senior Consultant on the Microsoft Premier Developer Support team and President of MagicPASS, the Orlando, FL based chapter of PASS. Before joining Microsoft, Kendal was a SQL Server/Data Platform MVP from 2011-2016.
15 comments
Would these figures indicate that RAID 5 is better than its reputation?
Say your load is approximately 70% read and 30% write. Wouldn't that give somewhat equal results?
I believe the word today is: if writes are more than 5% (or 10%), use RAID 10.
The recommendation could then be to use RAID 5 for data files and RAID 1 or 10 for logs.
Perhaps, but in my experience the decision to go with RAID 5 is based more on storage needs than on performance. That said, I don't think there's a magic number where you can draw the line. It's really subjective: if write performance is acceptable to the people who use the server, then they couldn't care less whether it's RAID 5 or RAID 10 under the covers.
I spent several hours this week trying to find an honest and unbiased answer to the Raid 5 vs Raid 10 question -- you answered it well, thank you.
Based on the data you provided to compare RAID 5 and RAID 10, I drew a little chart for the 8K random read/write results. It shows that with more than 62% reads, RAID 5 outperforms RAID 10.
Thanks for the data. Do you have measurements of the performance impact when the array is missing one drive?
Will RAID 5 take a smaller hit than RAID 10?
This is the best comparison I have seen to date on my quest to figure out which raid to use on my new MD3000. Thank you.
My question: I plan on setting up a VM server farm for Microsoft development systems (Visual Studio/SQL Server/IIS). There is also a minimal-use web site and a SQL Server app running that are not of concern today.
Does anyone have a good recommendation for a Hyper-V VM farm running on an MD3000? I can get more drives; that is not the issue. My lack of knowledge is, however, and I am looking for reasons to use RAID 10, RAID 5, or something else that will give optimal performance when using VMs and their corresponding 25-40 GB virtual hard drives.
RAID 5 has horrible performance when it has lost one drive. It also has horrible performance when rebuilding the array after losing one drive as it has to read every single block of every single drive in the entire array to regenerate the lost disk. If you lose 2 drives in a RAID 5, you are guaranteed to lose all of your data! It is simply unsafe for critical enterprise storage. Just say no to RAID 5!!!
With RAID-10 you may still lose ALL of your data when 2 drives suddenly die at the same time. Is that actually a whole lot safer, having the odds of losing all your data or none of your data in the fairly extreme case that 2 disks die at the same time?
If you use proper hardware (good controllers and good drives), RAID5 is a perfectly fine solution that will benefit you a lot for applications that need many heavy read actions. A good example would be streaming media servers.
>> With RAID-10 you may still lose ALL of your data when 2 drives suddenly die at the same time. Is that actually a whole lot safer, having the odds of losing all your data or none of your data in the fairly extreme case that 2 disks die at the same time?
This is only true if both drives were in the same RAID 1 set in the stripe, which is possible but highly unlikely. If you have disks A/B, C/D, E/F, and G/H in RAID 1 pairs and stripe across those pairs for 1+0, you can lose one disk from each of the RAID 1 arrays, up to 4 disks, and the array survives with no data loss and no performance loss either. That is the benefit of RAID 10. Disk A or B could fail along with disk C or D, E or F, and G or H, but A and B, or C and D, or E and F, or G and H failing together would result in loss of the array.
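The pair layout described here can be checked with a small sketch. It assumes the four two-disk mirrors named in the comment (A/B, C/D, E/F, G/H) and, for the counting part, that every two-disk combination is equally likely to fail, which real hardware does not guarantee. The function names are made up for illustration.

```python
from itertools import combinations

# Mirror pairs as named in the comment: A/B, C/D, E/F, G/H.
MIRROR_PAIRS = [("A", "B"), ("C", "D"), ("E", "F"), ("G", "H")]


def raid10_survives(failed, pairs=MIRROR_PAIRS):
    """Return True if every mirror pair still has at least one working disk."""
    failed = set(failed)
    return all(not set(pair) <= failed for pair in pairs)


def fatal_two_disk_failures(pairs=MIRROR_PAIRS):
    """Count two-disk failure combinations that destroy the array.

    Returns (fatal_combinations, all_combinations).
    """
    disks = [disk for pair in pairs for disk in pair]
    combos = list(combinations(disks, 2))
    fatal = [combo for combo in combos if not raid10_survives(combo, pairs)]
    return len(fatal), len(combos)


print(raid10_survives({"A", "C", "E", "G"}))  # one disk lost from each pair
print(raid10_survives({"A", "B"}))            # both disks of one pair lost
print(fatal_two_disk_failures())              # fatal vs. total two-disk combos
```

With this layout, 4 of the 28 possible two-disk combinations take out a whole mirror pair, which is the "possible but unlikely" case.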
And what about the no-RAID configuration, in some specific cases ?
Let's imagine that:
- you focus only on write performance, and don't care about redundancy,
- you have 5 SATA drives,
- you must write 20 data streams almost equivalent in throughput, each of them in a different directory.
Wouldn't it be a better solution not to configure a RAID array, but instead just create one partition per disk and write a given data flow to a given disk/partition: flows 1 to 5 to disk 1, flows 6 to 10 to disk 2, etc.?
The scaling of the charts in this article leads to incredibly misleading bars. For example, the first graph, IOs/sec read, shows a RAID 5 column more than twice as high as the RAID 10, yet the difference between the two is less than 9%. In contrast, the same chart for IOs/sec write (4th) shows a RAID 10 bar that is only slightly higher than RAID 5, yet the RAID 10 number is 59% higher! The use of different scales on each of the charts leads to meaningless visual information.
I tested my systems with SQLIO. Both systems used the same disks and SAN (EVA4400). The RAID5 disks were divided on 42 disks, while the RAID10 disks were divided between 72 disks. The performance of the RAID10 disks was way better than the performance of the RAID5 disks for both reads and writes (random and sequential).
"The RAID5 disks were divided on 42 disks, while the RAID10 disks were divided between 72 disks. "
The question is: RAID 10 on 42 disks versus RAID 5 on 72 disks, which one is better?!
Thanks a lot for sharing your results. I have a question for you. I have a 4-disk RAID 10 on a Dell PERC 6i. Will adding two more disks increase performance?
The scaling of the graphs is dangerously misleading. The graphs that show speed improvements in RAID 5 are scaled ridiculously small, where graphs with improvements in RAID 10 are scaled ridiculously large.
On RAID 5 vs 10 performance: A four disk RAID 5 set can only ever use 3 disks for reading and writing. The last disk only stores parity information and is only used in the event of disk failure. Writes take a significant hit due to computation of the parity information. Reads take a hit because all 3 disks are needed for each read. A four disk RAID 10 (not 0+1) set is able to use all 4 disks for read operations, as only one disk from each mirror set is required, allowing two simultaneous reads to occur. In the event of a failure, RAID 5 takes a read hit of up to 70% across the set, whereas RAID 10 only loses one set of drives (in a 4 drive set, 50% reads, 0% writes, in a 6 drive set, 33% reads, 0% writes, and so on) resulting in a much improved and more graceful failure situation.
On Redundancy: RAID 5 can only ever survive the loss of one drive; losing a second drive is a total loss event. Even if 10 drives are used in the set, the loss of any 2 will result in a total loss. This makes RAID 5 sets statistically more likely to result in a total loss as you add more drives. In a RAID 10 set, a total loss event can only occur if all drives from a single mirror set are lost. Otherwise, a RAID 10 set can sustain the loss of all but one drive from each mirror set. In a 4-disk set configured with two mirrored sets: Min to TL: 2, Max to TL: 3. In a 6-disk set: Min to TL: 3, Max to TL: 5. In an 8-disk set: Min to TL: 4, Max to TL: 7. This makes a total loss event in RAID 10 statistically less likely for each pair of drives added to the RAID set.
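The "Min to TL" and "Max to TL" figures in that paragraph follow from simple counting, sketched below. The sketch assumes what the numbers imply: the drives are always split into two mirrored sets, so a 6-disk set is two three-way mirrors and an 8-disk set is two four-way mirrors. The function name is hypothetical.

```python
def failures_to_total_loss(mirror_sets, drives_per_set):
    """Return (min, max) drive failures that cause total loss in RAID 10.

    min: every drive in one mirror set fails (worst luck).
    max: all but one drive in every set fail, then one more drive fails.
    """
    min_to_tl = drives_per_set
    max_to_tl = mirror_sets * (drives_per_set - 1) + 1
    return min_to_tl, max_to_tl


# Two mirrored sets, as assumed by the figures quoted in the comment.
for total_disks in (4, 6, 8):
    print(total_disks, failures_to_total_loss(2, total_disks // 2))

# The same 8 disks arranged as four two-way mirror pairs instead.
print(8, failures_to_total_loss(4, 2))
```

Note that with the more common two-way mirror pairs the minimum stays at 2 however many pairs are added. Only the maximum grows.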
RAID 5’s evolution only ever occurred due to the high cost of drives and the low density of drive bays available then in servers. RAID 10 is the superior choice for all situations regarding performance and redundancy, and should be used over RAID 5 whenever possible. The author’s conclusions are misinformed and should not be considered when building a server deployment strategy. For a factual and accurate comparison, find Microsoft’s SQL Server deployment white paper and read the section on disk subsystems. That will clearly show the performance characteristics on the two RAID types. Spoiler Alert: RAID 5 doesn’t win.
One error to correct in the previous anonymous comment. "A four disk RAID 5 set can only ever use 3 disks for reading and writing" is a common misconception. In fact there is one disk's worth of parity data, but it's striped across all the disks in the array along with the data. So in this example all four disks are still spinning simultaneously.
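A rough sketch of what "striped across all the disks" means in practice. It assumes a simple rotation where the parity block moves back one disk per stripe. Real controllers use various layouts, so treat `parity_disk` as an illustration only. The XOR step shows how a lost block is rebuilt from the surviving blocks of its stripe.

```python
from functools import reduce


def parity_disk(stripe, num_disks):
    """Disk index holding parity for a stripe (assumed rotation, one disk per stripe)."""
    return (num_disks - 1 - stripe) % num_disks


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# In a four-disk set, parity lands on a different disk for each stripe.
for stripe in range(4):
    print("stripe", stripe, "parity on disk", parity_disk(stripe, 4))

# One stripe: three data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# If the disk holding the second block dies, XOR of the survivors rebuilds it.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because the parity position rotates, no single disk is a dedicated parity disk, and every disk holds data for most stripes.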