
Scalable Multiprocessor Virtual Machines

We support virtual machines with multiple virtual processors, and we schedule those virtual processors flexibly on the physical processors. For example, in a VM with four virtual processors, possibly only one is actively using a physical processor while the remaining virtual processors are preempted. Or one of the virtual processors runs at 2 GHz while the others have limited access to physical processors and thus appear to run at slower speeds, such as 1 GHz.

Multiprocessor Virtual Machines
Diagram: flexible mapping of virtual processors to physical processors.
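
The apparent speed of a virtual processor follows from the share of physical processor time it receives. A minimal sketch of that arithmetic, assuming a 2 GHz physical processor and hypothetical time shares (the name `apparent_speed_ghz` is illustrative only and not part of any L4Ka interface):

```python
def apparent_speed_ghz(physical_ghz: float, time_share: float) -> float:
    """Speed a virtual processor appears to run at, given its share of a physical processor."""
    if not 0.0 <= time_share <= 1.0:
        raise ValueError("time_share must be between 0 and 1")
    return physical_ghz * time_share


# Hypothetical four-processor VM: one virtual processor owns a full 2 GHz
# physical processor, the other three each receive half of one.
shares = [1.0, 0.5, 0.5, 0.5]
speeds = [apparent_speed_ghz(2.0, share) for share in shares]
print(speeds)
```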

Free scheduling introduces a problem: a virtual processor may try to acquire a kernel lock held by a preempted virtual processor. Spin locks are designed for short spin times and statistical fairness; in Linux 2.4, 95% of all lock hierarchies are released within 20 microseconds. Preempting a lock holder is likely to stretch the lock hold time to multiple time slices, which is in the millisecond range and thus over 1000 times longer than the original design expects. Other virtual processors that want to acquire the preempted lock then wait for multiple time slices rather than several microseconds.

Free Scheduling
Diagram: the problem of lock preemption, and how it increases spin time on preempted locks.
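
The "over 1000 times" figure can be checked with simple arithmetic. The sketch below assumes a 10 ms time slice and a lock holder that stays preempted for two slices; both values are illustrative assumptions, not numbers taken from our measurements.

```python
EXPECTED_HOLD_US = 20    # from the text: most locks are released within 20 microseconds
TIME_SLICE_US = 10_000   # assumption: a 10 ms scheduler time slice


def hold_time_inflation(slices_preempted: int) -> float:
    """Factor by which a preempted hold exceeds the 20 microsecond expectation."""
    preempted_hold_us = EXPECTED_HOLD_US + slices_preempted * TIME_SLICE_US
    return preempted_hold_us / EXPECTED_HOLD_US


# Assumption: the holder loses the processor for two full time slices.
inflation = hold_time_inflation(2)
print(inflation)
```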

When running an Apache 2 workload, we found that it was subject to a 39.2% probability of kernel lock holder preemption. Apache 2 uses the sendfile() Linux system call, which offloads file sending to the Linux kernel.

Apache 2 Lock Trace
Graph: a recursive lock acquisition trace for Apache 2, over time.
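
One way to estimate such a probability from a lock trace is to count how many preemption points fall inside an interval where at least one kernel lock is held. The sketch below is only an illustration of that idea with made-up toy data; the page does not state how the 39.2% figure was obtained, and the function name and trace format are assumptions.

```python
from bisect import bisect_right


def holder_preemption_probability(hold_intervals, preemption_times):
    """Fraction of preemptions that land inside a lock-hold interval.

    hold_intervals: sorted, non-overlapping (start, end) pairs in microseconds,
    each covering a period in which at least one kernel lock is held.
    """
    if not preemption_times:
        return 0.0
    starts = [start for start, _ in hold_intervals]
    hits = 0
    for t in preemption_times:
        i = bisect_right(starts, t) - 1  # last interval starting at or before t
        if i >= 0 and t < hold_intervals[i][1]:
            hits += 1
    return hits / len(preemption_times)


# Toy trace, not measured data.
holds = [(0, 15), (40, 60), (100, 130)]
preemptions = [10, 30, 50, 90, 140]
probability = holder_preemption_probability(holds, preemptions)
print(probability)
```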

Although there is a 39.2% chance of preempting a kernel lock holder in the Apache 2 workload, we must ask whether other virtual processors actually try to acquire the preempted locks. We found that with more than two processors, 20% of the execution time was spent on extended lock waiting. Extended lock waiting means that a virtual processor spends more than a millisecond waiting for a (preempted) lock.

Apache 2 Extended Lock Wait Time (≥ 1ms)
Graph: preempted locks cause extended lock waiting for Apache 2 that is around 20% of the benchmark execution time.
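
The metric in this graph can be computed from a list of per-acquisition wait times. The following is a minimal sketch with toy numbers; the threshold follows the figure title (≥ 1 ms), and the function name and input format are assumptions made for illustration.

```python
EXTENDED_WAIT_US = 1_000  # waits of at least 1 ms count as extended


def extended_wait_fraction(wait_times_us, execution_time_us):
    """Share of the execution time spent in lock waits of 1 ms or longer."""
    if execution_time_us <= 0:
        raise ValueError("execution_time_us must be positive")
    extended = sum(w for w in wait_times_us if w >= EXTENDED_WAIT_US)
    return extended / execution_time_us


# Toy data, not measurements: most waits are a few microseconds,
# two are stretched to milliseconds by a preempted lock holder.
waits = [5, 12, 8_000, 3, 12_000, 900]
fraction = extended_wait_fraction(waits, 100_000)
print(fraction)
```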

We developed two solutions for avoiding lock holder preemption. One solution is intrusive and applicable to paravirtualization. The other is non-intrusive and thus applicable to a fully virtualized environment. We compared our solutions to the original Linux spin locks, and to the obvious yet naive way of handling lock holder preemption, which is to yield the time slice. Our solutions eliminate nearly all extended lock waiting time.

Apache 2 Extended Lock Wait Time (≥ 1ms)
Graph: using our techniques to avoid lock holder preemption, we nearly eliminate extended lock waiting time.
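
For comparison, the naive baseline of yielding the time slice instead of spinning indefinitely can be sketched in user space. This is a toy illustration with Python threads, not the kernel or hypervisor code that was evaluated; `SpinThenYieldLock` and `spin_limit` are hypothetical names.

```python
import threading
import time


class SpinThenYieldLock:
    """Toy lock: spin briefly, then give up the CPU instead of burning the time slice."""

    def __init__(self, spin_limit: int = 100):
        self._lock = threading.Lock()
        self._spin_limit = spin_limit

    def acquire(self) -> None:
        spins = 0
        while not self._lock.acquire(blocking=False):
            spins += 1
            if spins >= self._spin_limit:
                time.sleep(0)  # yield so that a preempted holder gets a chance to run
                spins = 0

    def release(self) -> None:
        self._lock.release()


counter = 0
lock = SpinThenYieldLock()


def worker(iterations: int) -> None:
    global counter
    for _ in range(iterations):
        lock.acquire()
        try:
            counter += 1
        finally:
            lock.release()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(counter)
```

Yielding keeps waiters from wasting whole time slices, but it does nothing to prevent the holder from being preempted in the first place, which is the case the two solutions above are meant to avoid.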

Our solution for avoiding lock holder preemption improved the Apache 2 throughput by 28% in some cases.

Apache 2 Bandwidth Relative to Native
Graph: our lock holder preemption techniques improve Apache 2 throughput.

Besides the problem of lock holder preemption, free scheduling introduces variable-speed virtual processors, whose speeds can change independently of one another. Commodity operating systems expect all processors to operate at the same speed, and thus their load balancers become confused in a freely scheduled virtual machine environment. Our solution, called time ballooning, helps the guest's native scheduler make appropriate scheduling decisions in a VM environment.

Time Ballooning in Action

Graph: time ballooning used to rebalance the scheduler, versus time.
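
The idea can be illustrated with a small calculation. The sketch assumes that a balloon is artificial load placed on a slower virtual processor, so that a guest load balancer which equalizes load per processor ends up giving each processor real work in proportion to its speed. This page does not spell out the mechanism, so treat the model, the function name `balloon_loads`, and the numbers as assumptions.

```python
def balloon_loads(speeds, total_work):
    """Return (real_work, balloon) per virtual processor.

    real_work is proportional to each processor's speed; balloon is the
    artificial load that makes every processor look equally loaded.
    """
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty list of positive values")
    total_speed = sum(speeds)
    real_work = [total_work * s / total_speed for s in speeds]
    target = max(real_work)  # the fastest processor needs no balloon
    balloons = [target - work for work in real_work]
    return real_work, balloons


# Hypothetical VM: one virtual processor at 2 GHz, three at 1 GHz, 10 units of work.
real, balloons = balloon_loads([2.0, 1.0, 1.0, 1.0], 10.0)
print(real, balloons)
```

With these inputs every processor carries the same apparent load (real work plus balloon), while the 2 GHz processor receives twice the real work of each 1 GHz processor.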

For more information about our solutions for multiprocessor virtual machine scalability, see our publication Towards Scalable Multiprocessor Virtual Machines.

   
 
 
 
© 2000-2010 University of Karlsruhe