CentOS 6.3 on Hyper-V – Storage performance and comparison
by siv on Feb.27, 2013, under Hyper-V, Linux
One of the first things I noticed is the very good storage performance of the box. Windows Server 2012 introduced support for CentOS and made the Integration Services available as an installable RPM.
The physical hardware is not bad at all, but it is no miracle either.
For this test I am using a Quanta barebone with one Xeon E5-1650 CPU, loaded with 4× 3.5-inch SATA 7.2k drives (Hitachi Ultrastar A2000).
The system has an LSI 9271 controller and uses the active backplane of the barebone. This introduces an additional bottleneck that could limit the performance of SSDs, but with SATA drives we are on the safe side with 6Gbps SAS. I have configured the drives in RAID5 and installed the OS on a separate RAID1 SSD array (hosted on a separate controller) to minimize interference.
The Linux OS drive is placed on the SSD array as well, which leaves just the drive to be tested on the RAID5 SATA array. Some info: sdb is a Hyper-V VHD of 1900 GB (why the odd number, I will explain in a different post), which is presented as an LVM member for flexibility purposes. The filesystem is ext4 in all tests.
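For completeness, the LVM layout underneath is nothing exotic. The volume group and logical volume names (main1/app) match the atop output further down; the commands below are a sketch of that setup, not a transcript of exactly what I ran:

# present the VHD to LVM and carve it into a single logical volume, then format with ext4
pvcreate /dev/sdb
vgcreate main1 /dev/sdb
lvcreate -n app -l 100%FREE main1
mkfs.ext4 /dev/main1/app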
I am using a relatively simple test:
Write:
time sh -c "dd if=/dev/zero of=ddfile2 bs=8k count=1M"
Read:
time sh -c "cat ddfile > /dev/null"
Pure sequential read/write, which corresponds to the best-case scenario when dealing with HDDs. This should expose raw bottlenecks, if any.
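If you want to repeat the test and keep the Linux page cache from flattering the numbers, a slightly stricter variation along these lines should work (my own suggestion, not what produced the results below):

# flush data to disk before the clock stops
time sh -c "dd if=/dev/zero of=ddfile bs=8k count=1M conv=fdatasync"
# drop the page cache so the read really hits the array (run as root)
sync; echo 3 > /proc/sys/vm/drop_caches
# sequential read back
time sh -c "dd if=ddfile of=/dev/null bs=8k"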
If I wanted to go deeper with the testing methodology, I would need to create a threaded test and monitor how performance scales with multiple threads. But that is a "to do", not the subject of this post; see the sketch below for where I would start. I was also thinking about running bonnie++, since it displays easily understandable results, but time pushed me, so that will be the basis for a future post.
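For reference, a threaded run could be scripted with fio; something like the line below would be my starting point (the target directory and parameters are illustrative, and this was not part of the tests in this post):

# four parallel sequential readers, 8k blocks, 2 GB file each, aggregated result
fio --name=seqread --directory=/app --rw=read --bs=8k --size=2g --numjobs=4 --group_reporting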
I used the very good system monitoring tool atop (ready-made RPMs are available from the CentOS Extras repo). Here is a snippet of atop output on Hyper-V 2012.
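If you want to follow along, getting atop onto the box is a one-liner (assuming the repo is already enabled), and it takes a sampling interval in seconds as an argument:

yum install atop
atop 2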
Test 1
Read:
LVM | main1-app | busy 90% | read 37834 | write 0 | MBr/s 472.88 | MBw/s 0.00 | avio 0.23 ms |
DSK | sdb | busy 85% | read 37834 | write 0 | MBr/s 472.88 | MBw/s 0.00 | avio 0.21 ms |
Write:
LVM | main1-app | busy 73% | read 38 | write 1179003 | MBr/s 0.01 | MBw/s 460.73 | avio 0.01 ms |
DSK | sdb | busy 71% | read 38 | write 47864 | MBr/s 0.01 | MBw/s 460.80 | avio 0.15 ms |
What does this tell us?
Reading: LVM issues 37834 IOs/sec, and the average read speed is 472.88 MB/s, or 484229.12 KB/s, so the average IO size is 12.79 KB.
Moreover, IO response time is below 2 ms, which lets the kernel think: OK, I am on a high-performance storage array, don't throttle down. As a general rule of thumb, everything below 10 ms is really good, so 2 ms is a great result.
Writing: LVM requests 1179003 blocks to be written, but the kernel knows how to optimize those and translates the blocks into 47864 hardware IOs at the HDD level. This means the ratio between LVM writes and HDD writes is about 24:1.
Caching helps here, so we see near-zero latency. The average IO size is 9.85 KB. This leads me to believe that the hardware stripe size is 16k and is better suited to relatively small IO sizes. Real-life analysis shows that about 30% of IO requests are smaller than 16k, so this setting will surely help towards better-performing storage for people who don't specifically tune their storage subsystem.
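For anyone who wants to redo the arithmetic, the average IO size is simply throughput divided by the request rate; bc does the job with the numbers taken from the atop lines above:

echo "scale=2; 472.88 * 1024 / 37834" | bc   # read: ~12.79 KB per IO
echo "scale=2; 460.80 * 1024 / 47864" | bc   # write: ~9.85 KB per IO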
Attempting the same 100% read/write operations on another hardware box yields somewhat different results, despite identical spinners.
Test 2
This is a 6× 3.5-inch SATA 7.2k HDD array with a hardware RAID controller (LSI 9260) with 512 MB of cache.
Read:
DSK | cciss/c0d1 | busy 102% | read 19037 | write 31 | avio 0 ms
Write:
DSK | cciss/c0d1 | busy 101% | read 11 | write 5917 | avio 1 ms |
Performance is roughly two times lower, despite the array having 50% more drives.
Let's attempt the same on a VMware box, same hardware as in Test 1. Just plug in a VMware USB flash drive, reformat the datastore to VMFS, and let's see what it's got.
Test 3
Read:
LVM | test-lv_home | busy 104% | read 15892 | write 6 | MBr/s 198.61 | MBw/s 0.00 | avio 0.63 ms |
DSK | sda | busy 104% | read 15892 | write 9 | MBr/s 198.61 | MBw/s 0.00 | avio 0.63 ms |
Write:
LVM | test-lv_home | busy 105% | read 3 | write 631219 | MBr/s 0.00 | MBw/s 246.57 | avio 0.02 ms |
DSK | sda | busy 105% | read 3 | write 4947 | MBr/s 0.00 | MBw/s 246.67 | avio 2.02 ms |
Reading: LVM issues 15892 IOs/sec, and the average read speed is 198.61 MB/s, or 203376.64 KB/s, so the average IO size is 12.79 KB. Hmm, that number is something we have seen before, isn't it? But raw performance is again very low compared to Hyper-V.
Why? (I am led to believe VMware tells everybody on SATA drives just "Because F* you, that is why.")
Ask VMware. The hardware is 100% VMware certified, and no third-party drivers have been installed.
Writing: LVM requests 631219 blocks to be written, but the kernel knows how to optimize those and translates the blocks into 4947 hardware IOs at the HDD level. This means the ratio between LVM writes and HDD writes is about 127:1. Caching helps here, so we see near-zero latency. Compared to Hyper-V the number of writes is very low, but so is performance. The average IO size in this test is 51 KB. A whopping 51 KB, which partially explains why VMware is so slow: it most likely utilizes a 64k stripe size in its VMFS to decrease the number of IOPS and thus offload the CPU. But the most expensive resource for a small business is not the CPU. It's the IO!
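If you want to check what block size a VMFS datastore actually uses, the ESXi shell can tell you; the datastore name below is just a placeholder:

vmkfstools -Ph /vmfs/volumes/datastore1   # the output includes the VMFS block size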
Conclusion: If you want or need a single-server virtualization solution (not even a clustered one), or are just starting with virtualization, try Hyper-V. If you will be running LARGE infrastructures and have 10k+ EUR per node, then VMware is a solution too.
But most importantly: know your hypervisor and size your IO subsystem correctly.