Scalable Multiprocessor Virtual Machines
We support virtual machines with multiple virtual processors, and we
flexibly schedule those virtual processors on physical processors. For
example, in a VM with four virtual processors, possibly only one is
actively using a physical processor while the remaining virtual processors
are preempted. Or one of the virtual processors runs at 2 GHz, while
the others have limited access to physical processors and thus
seem to run at slower speeds, such as 1 GHz.
Multiprocessor Virtual Machines
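As a rough illustration of the variable-speed case, the speed a guest perceives for one virtual processor can be estimated as the physical clock rate scaled by the fraction of physical processor time that virtual processor receives. The function name and the 50% share below are assumptions chosen for illustration, not measurements.

```python
def apparent_speed_ghz(physical_ghz: float, cpu_share: float) -> float:
    """Estimate the speed a guest perceives for one virtual processor.

    cpu_share is the fraction (0.0 to 1.0) of a physical processor's
    time that the virtual processor actually receives.
    """
    if not 0.0 <= cpu_share <= 1.0:
        raise ValueError("cpu_share must be between 0.0 and 1.0")
    return physical_ghz * cpu_share


# A virtual processor with exclusive use of a 2 GHz physical processor.
full = apparent_speed_ghz(2.0, 1.0)

# A virtual processor that receives half of a 2 GHz physical processor
# (the 50% share is an assumed value).
half = apparent_speed_ghz(2.0, 0.5)
```

Under these assumptions the second virtual processor seems to run at 1 GHz, even though the hardware underneath is identical.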
Free scheduling introduces the problem that virtual processors could
try to acquire a kernel lock held by a preempted virtual processor.
Spin locks are designed for short spin times and statistical fairness;
in Linux 2.4, 95% of all lock hierarchies are released before 20 microseconds.
Preempting a lock holder will likely increase the lock hold time to the
order of multiple time slices, in the millisecond range, and thus to over
1000 times longer than expected by the original design.
When other virtual processors want to acquire a lock whose holder is
preempted, they wait multiple time slices rather than several microseconds.
Free Scheduling
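The scale of the problem can be checked with a little arithmetic. The sketch below compares the expected hold time with the wait behind a preempted holder. The 20 microsecond figure comes from the text; the 10 ms time slice and the two-slice preemption are assumed values, and the function name is hypothetical.

```python
def wait_inflation(expected_hold_us: float,
                   time_slice_ms: float,
                   slices_preempted: int) -> float:
    """Ratio of the wait behind a preempted lock holder to the
    hold time the spin lock was designed for."""
    preempted_wait_us = time_slice_ms * 1000.0 * slices_preempted
    return preempted_wait_us / expected_hold_us


# 20 us is taken from the text; a 10 ms time slice and a holder that
# stays preempted for 2 slices are assumptions for illustration.
ratio = wait_inflation(expected_hold_us=20.0,
                       time_slice_ms=10.0,
                       slices_preempted=2)
```

With these assumed values the waiter spins for a factor of 1000 longer than the design expects, which matches the order of magnitude described above.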
When running an Apache 2 workload, we found that it was subject to a 39.2%
probability of kernel lock holder preemption. Apache 2 uses the
sendfile() Linux system call, which offloads file sending
to the Linux kernel.
Apache 2 Lock Trace
Although there is a 39.2% chance of preempting a kernel lock holder in the
Apache 2 workload, we must ask whether other virtual processors actually try to
acquire the locks held by preempted processors. We found that with more than
two processors, 20% of the execution time was spent on extended lock waiting.
Extended lock waiting occurs when a virtual processor spends more than a
millisecond waiting for a lock (held by a preempted virtual processor).
Apache 2 Extended Lock Wait Time (≥ 1ms)
We developed two solutions for avoiding lock holder preemption. One
solution is intrusive and applicable to paravirtualization. The other
solution is non-intrusive, and thus applicable to a fully virtualized
environment. We compared our solutions to the original Linux spin locks,
and to the obvious yet naive solution for handling lock holder preemption,
which is to yield the time slice. Our solutions nearly eliminate all
extended lock waiting time.
Apache 2 Extended Lock Wait Time (≥ 1ms)
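The naive baseline, yielding the time slice, can be pictured with a user-level analogy. The sketch below is not the kernel or hypervisor mechanism and not one of our two solutions: it is a minimal spin-then-yield lock in which a waiter that has spun too long gives up the processor so a possibly preempted holder can run. The class name, the spin threshold, and the use of `time.sleep(0)` as the yield are all assumptions made for illustration.

```python
import threading
import time


class SpinThenYieldLock:
    """Spin briefly, then yield instead of burning the rest of the time slice."""

    def __init__(self, spin_limit=100):
        self._flag = threading.Lock()   # used only as an atomic test-and-set flag
        self._spin_limit = spin_limit   # assumed threshold, not from the source

    def acquire(self):
        spins = 0
        # A non-blocking acquire plays the role of the test-and-set instruction.
        while not self._flag.acquire(blocking=False):
            spins += 1
            if spins >= self._spin_limit:
                # The holder may have been preempted: give up the processor
                # so that it gets a chance to run and release the lock.
                time.sleep(0)
                spins = 0

    def release(self):
        self._flag.release()


def run_demo(workers=4, iterations=200):
    """Increment a shared counter under the lock from several threads."""
    lock = SpinThenYieldLock(spin_limit=10)
    counter = [0]

    def work():
        for _ in range(iterations):
            lock.acquire()
            counter[0] += 1
            lock.release()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

Yielding bounds the time wasted spinning, but the yielding processor cannot steer the scheduler toward the preempted lock holder, so the wait may still last until the holder happens to be scheduled again.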
Our solution for avoiding lock holder preemption improved the Apache 2
throughput by 28% in some cases.
Apache 2 Bandwidth Relative to Native
Besides the problem of lock holder preemption, free scheduling introduces
variable speed virtual processors, where the virtual processors can adjust
speed independently. Commodity operating
systems expect all processors to operate at the same speed, and thus their
load balancers become confused in a freely scheduled virtual machine
environment. We have a solution called time ballooning, which helps
the native scheduler to make appropriate scheduling decisions in a VM
environment.
Time Ballooning in Action
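To see why an equal-speed assumption misleads a load balancer, the sketch below compares completion time when work is split evenly with completion time when work is split in proportion to each virtual processor's apparent speed. The speeds reuse the 2 GHz and 1 GHz figures mentioned in the introduction; the amount of work and all function names are hypothetical. It illustrates the problem only and is not the time ballooning mechanism itself.

```python
def completion_time(work_per_cpu, speeds):
    """Time until the slowest virtual processor finishes its assigned work."""
    return max(w / s for w, s in zip(work_per_cpu, speeds))


def split_evenly(total_work, speeds):
    """What a balancer does if it believes all processors are equally fast."""
    return [total_work / len(speeds)] * len(speeds)


def split_by_speed(total_work, speeds):
    """Assign work in proportion to each processor's apparent speed."""
    total_speed = sum(speeds)
    return [total_work * s / total_speed for s in speeds]


speeds = [2.0, 1.0, 1.0, 1.0]   # one virtual processor at 2 GHz, three at 1 GHz
total_work = 100.0              # hypothetical units of work

even_time = completion_time(split_evenly(total_work, speeds), speeds)
weighted_time = completion_time(split_by_speed(total_work, speeds), speeds)
```

In this toy setting the even split finishes only when the slow virtual processors are done, while the speed-aware split finishes earlier because the fast virtual processor takes a larger share.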
For more information about our solutions for multiprocessor virtual machine
scalability, see our publication
"Towards Scalable Multiprocessor Virtual Machines".