Thursday, August 23, 2007

Parallels Desktop benchmarked

Just recently I wrote about virtualization benchmarking. My point was that a good benchmark has to

  • Be based on real-world use cases
  • Be vendor and technology neutral

VMmark was certainly neither – based on an artificial workload mix and designed to work best with ESX and not work at all with Virtuozzo.

CNET recently did a performance comparison between Parallels Desktop and VMware Fusion. Surprisingly enough, their benchmark also suffered from unrealistic scenarios and vendor bias.

Pretty much everything about this test is upside down:

  • It does not make sense to use an exclusive monster 8-core desktop for benchmarking Windows on Mac. Most people use laptops with 2 CPU cores and 2GB of RAM at most.
  • Since Mac OS is the primary OS, it does not make sense to give both cores to a Windows VM that runs Word, Excel and Outlook. Office applications don't benefit from multiple CPUs, which is perhaps why the default configuration of Fusion is a single-CPU VM. Come on – even most of the VMmark (server benchmark) workloads run in a single-core VM, yet this desktop benchmark is run with dual cores. That does not make much sense to me.
  • It does not make sense to use Vista inside a VM. Most people run XP because the Vista license only allows the most expensive Vista SKUs to run inside a VM, not to mention application compatibility issues.
  • It does not make sense to run QuickTime and Photoshop inside a Windows VM. Why use Mac OS in the first place if not for running Mac OS-native multimedia apps?

Actually, some people noted that the benchmark looked strange. Some even took time to run some tests for themselves.

Their results show that Parallels actually performs better than Fusion on most tests and on the overall benchmark. And, as you already know from the CNET test, Fusion could not even run the 3D gaming test. Still, I'm sure my colleagues at Parallels are already working hard, and the next Desktop update will improve responsiveness and performance so that even artificial tests will pass well :)

Finally, I wanted to remind you that Parallels has always been focused on real-world productivity. It is the time wasted on extra clicks, not raw speed, that affects productivity the most. And it is the real-world scenarios – running real software applications on real hardware configurations – that need to be reflected in the benchmarks.

Wednesday, August 15, 2007

Citrix acquires XenSource

Citrix just announced the acquisition of XenSource for $500 million. XenSource was obviously for sale – trying to avoid being caught between a rock (VMware) and a hard place (Microsoft) – but the buyer and the price were a big surprise to me. This raises a few questions:

  • Is Citrix desperate? – The core business of Citrix – Presentation Server – is being threatened by VDI (Virtual Desktop Infrastructure), and the other Citrix businesses are still small in comparison. Citrix needed a hypervisor to make its VDI stack complete. Did it need one so desperately that it justified paying half a billion dollars for a company with tiny revenue?
  • Will Citrix cross Microsoft? – A good relationship with Microsoft was very important for the success of Presentation Server, but it is much less important for the success of the VDI business, especially when Citrix owns the hypervisor.

    Citrix could have partnered with Microsoft and used the Viridian hypervisor, which will be out in about a year. Even though it is not as advanced as Xen, it will surely work wonderfully with Windows, and Citrix could not care less about Linux. Instead, with the Xen acquisition, Citrix followed VMware and entered the platform game, where it will be competing with Microsoft directly, not just as a VDI vendor but as a platform vendor.

  • Is Citrix committed to the Xen community? – Citrix is a proprietary Windows company. Xen, on the other hand, has relied on and enjoyed strong support from the open-source Linux community. The two philosophies are vastly different, and making them work together will be interesting to watch from the sidelines.

Looking forward to your comments!

VMmark – comments

VMware recently announced availability of VMmark – the first virtualization benchmark that measures… well, some characteristics of workloads running in virtual machines.

Overall, VMmark is a step in the right direction. However, it has a long way to go before it becomes a solid benchmark. Here are some of the issues:

  • Virtualization platform scalability is not measured – The largest VM in the tile is a 2-CPU/2GB SQL Server VM, which is much smaller than the physical server itself. VMware ESX supports a VM with 4 virtual CPUs and 16GB of memory, yet – why??? – these capabilities are not used in the test.
  • Workload mix is not realistic – Different virtualized workloads will always share the same physical machine, but in VMmark every workload in the tile is different. Such a mix could be realistic for a small/medium business consolidation scenario, but not for the typical enterprise datacenter deployment.
  • Performance of individual workloads is not measured – With such a wild workload mix, it is difficult to find out how well a server is suited for consolidating several instances of a certain workload, such as Exchange Server, SQL Server or a file server.
  • Mixing platforms is unnecessary – Mixing platforms on the same machine prevents running VMmark using Virtuozzo – not that I suggest it was done on purpose :) In real life, enterprises typically run Windows and Linux on different boxes, and the majority of the consolidated workloads are Windows-based.
  • Aggregate score has no physical meaning – All workloads in the tile are throttled and never run at full capacity. And, if the server is powerful enough that a single tile cannot load it fully, you are supposed to add another tile, which doubles the number of VMs and probably skews the results.

A better approach would be a more traditional methodology that runs workloads serially:

For each workload, measure maximum aggregate performance with 1, 2, 4, 8, 16 and 32 virtual environments. Normalize against the same workload on the same hardware without virtualization and average the results.

The benefits of this approach:

  • A single-VE test will measure scalability and performance of the virtualization platform – its ability to use multiple CPUs, large memory, fast storage subsystems and network interfaces.
  • 16/32 VE tests will measure realistic density limits for consolidating specific workloads on the tested virtualization platform.
  • It will measure for each workload how virtualization overhead grows with density.
  • Finally, Virtuozzo will be able to run the benchmark because only one OS is used for a single benchmark run.
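
The normalization step above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a benchmark harness: the function names, the dictionary layout, and the choice of a plain arithmetic mean for "average the results" are my assumptions, and actually measuring throughput is left out entirely.

```python
from statistics import mean

# VE counts from the proposed methodology.
VE_COUNTS = (1, 2, 4, 8, 16, 32)


def normalized_scores(native, virtualized):
    """Normalize aggregate throughput against the native (non-virtualized) run.

    native:      {workload: throughput on the same hardware without virtualization}
    virtualized: {workload: {ve_count: maximum aggregate throughput across all VEs}}
    Returns      {workload: {ve_count: virtualized / native}}
    (Hypothetical data layout, chosen for this sketch.)
    """
    scores = {}
    for workload, by_count in virtualized.items():
        baseline = native[workload]
        if baseline <= 0:
            raise ValueError(f"native baseline for {workload!r} must be positive")
        scores[workload] = {
            n: by_count[n] / baseline for n in VE_COUNTS if n in by_count
        }
    return scores


def overall_score(scores):
    """Arithmetic mean of every normalized result (an assumed averaging rule)."""
    return mean(v for by_count in scores.values() for v in by_count.values())
```

A score of 1.0 for a given workload and VE count would mean the virtualized setup matches native throughput; watching how the per-workload numbers change from 1 to 32 VEs shows how overhead grows with density.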

Let me know what you think about pros and cons of each approach.
