| Commit message | Author | Age | Files | Lines |
|
Right now, when a VCPU wakes up, we check whether it should preempt
what is running on the PCPU, and whether the waking VCPU can be
migrated (by tickling some idlers). However, this can result in
suboptimal or even wrong behaviour, as explained here:
http://lists.xen.org/archives/html/xen-devel/2012-10/msg01732.html
This change, instead, when deciding which PCPU(s) to tickle upon
VCPU wake-up, considers both what is likely to happen on the PCPU
where the wakeup occurs, and whether there are idlers where the
woken-up VCPU can run. In fact, if there are, we can avoid
interrupting the running VCPU. Only when there are no such PCPUs
are preemption and migration the way to go.
This has been tested (on top of the previous change) by running
the following benchmarks inside 2, 6 and 10 VMs, concurrently, on
a shared host, each with 2 VCPUs and 960 MB of memory (host had 16
ways and 12 GB RAM).
1) All VMs had 'cpus="all"' in their config file.
$ sysbench --test=cpu ... (time, lower is better)
| VMs | w/o this change | w/ this change |
| 2 | 50.078467 +/- 1.6676162 | 49.673667 +/- 0.0094321 |
| 6 | 63.259472 +/- 0.1137586 | 61.680011 +/- 1.0208723 |
| 10 | 91.246797 +/- 0.1154008 | 90.396720 +/- 1.5900423 |
$ sysbench --test=memory ... (throughput, higher is better)
| VMs | w/o this change | w/ this change |
| 2 | 485.56333 +/- 6.0527356 | 487.83167 +/- 0.7602850 |
| 6 | 401.36278 +/- 1.9745916 | 409.96778 +/- 3.6761092 |
| 10 | 294.43933 +/- 0.8064945 | 302.49033 +/- 0.2343978 |
$ specjbb2005 ... (throughput, higher is better)
| VMs | w/o this change | w/ this change |
| 2 | 43150.63 +/- 1359.5616 | 43275.427 +/- 606.28185 |
| 6 | 29274.29 +/- 1024.4042 | 29716.189 +/- 1290.1878 |
| 10 | 19061.28 +/- 512.88561 | 19192.599 +/- 605.66058 |
2) All VMs had their VCPUs statically pinned to the host's PCPUs.
$ sysbench --test=cpu ... (time, lower is better)
| VMs | w/o this change | w/ this change |
| 2 | 47.8211 +/- 0.0215504 | 47.826900 +/- 0.0077872 |
| 6 | 62.689122 +/- 0.0877173 | 62.764539 +/- 0.3882493 |
| 10 | 90.321097 +/- 1.4803867 | 89.974570 +/- 1.1437566 |
$ sysbench --test=memory ... (throughput, higher is better)
| VMs | w/o this change | w/ this change |
| 2 | 550.97667 +/- 2.3512355 | 550.87000 +/- 0.8140792 |
| 6 | 443.15000 +/- 5.7471797 | 454.01056 +/- 8.4373466 |
| 10 | 313.89233 +/- 1.3237493 | 321.81167 +/- 0.3528418 |
$ specjbb2005 ... (throughput, higher is better)
| VMs | w/o this change | w/ this change |
| 2 | 49591.057 +/- 952.93384 | 49594.195 +/- 799.57976 |
| 6 | 33538.247 +/- 1089.2115 | 33671.758 +/- 1077.6806 |
| 10 | 21927.870 +/- 831.88742 | 21891.131 +/- 563.37929 |
Numbers show that the change has either no or very limited impact
(specjbb2005 case) or, where it does have some impact, that impact
is a real performance improvement (sysbench-memory case).
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
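The tickling decision described above can be modelled with a short sketch. All names here are hypothetical, chosen just for illustration; the real logic lives in Xen's sched_credit.c:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VCpu:
    name: str
    prio: int                 # higher value = more urgent
    affinity: frozenset       # pCPU ids this vCPU is allowed to run on

def pcpus_to_tickle(waking, wake_pcpu, running, idle_pcpus):
    """Decide which pCPU(s) to tickle when `waking` wakes up on
    `wake_pcpu`, where `running` is the vCPU currently on `wake_pcpu`.

    Prefer idlers in the waking vCPU's affinity, so `running` is not
    interrupted; preempt `wake_pcpu` only when no suitable idler exists."""
    suitable_idlers = idle_pcpus & waking.affinity
    if suitable_idlers:
        # An idler can pick the woken vCPU up: no preemption needed.
        return suitable_idlers
    if waking.prio > running.prio:
        # No suitable idlers: preempt the local pCPU.
        return {wake_pcpu}
    return set()
```

For example, with pCPU 2 idle and inside the waking vCPU's affinity, the sketch tickles pCPU 2 and leaves the vCPU running on the wake pCPU alone.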
|
Moving some of them from sched_credit.c to generic scheduler code.
This also allows the other schedulers to use perf counters equally
easily.
This change is mainly preparatory work for what is stated above. In
fact, it mostly does s/CSCHED_STAT/SCHED_STAT/ and, in general,
implies no functional changes.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
This patch can improve Xen performance:
1. Basically, the "delay method" achieves an 11% overall performance
boost for SPECvirt over the original credit scheduler.
2. We have tried a 1ms delay and a 10ms delay; there is no big
difference between these two configurations (1ms is enough to
achieve good performance).
3. We have compared response time/latency at different load levels
(low, high, peak); the "delay method" did not increase response
time by much.
4. A 1ms delay reduces context switches by 30% at peak performance,
which is where the benefit comes from (int sched_ratelimit_us = 1000
is the recommended setting).
Signed-off-by: Hui Lv <hui.lv@intel.com>
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
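A minimal sketch of the rate-limiting idea (hypothetical names; the actual knob is Xen's sched_ratelimit_us): a higher-priority wakeup does not preempt the running vCPU until that vCPU has held the pCPU for at least the rate-limit interval, which is what cuts the context-switch count.

```python
RATELIMIT_US = 1000  # the recommended sched_ratelimit_us value

def should_preempt(now_us, current_started_us, current_prio, waking_prio):
    """Return True if the waking vCPU may preempt the current one.

    Even a higher-priority waker waits until the current vCPU has run
    for at least RATELIMIT_US microseconds."""
    if waking_prio <= current_prio:
        return False
    ran_for_us = now_us - current_started_us
    return ran_for_us >= RATELIMIT_US
```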
|
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
Signed-off-by: Xiaowei Yang <xiaowei.yang@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
Signed-off-by: Keir Fraser <keir@xensource.com>
|
them with atomic (locked) ops.
Conversion here isn't complete in the sense that many places still use
the old per-CPU accessors (which are now redundant). Since the patch
is already rather big, I'd prefer replacing those in a subsequent
patch.
While doing this, I also converted x86's multicall macros to no longer
require inclusion of asm-offsets.h in the respective C file (on IA64
the use of asm-offsets.h in C sources seems more widespread, hence
there I rather used IA64_ prefixes for the otherwise conflicting
performance counter indices).
On x86, a few counter increments get moved a little, to avoid
duplicate counting of preempted hypercalls.
Also, a few counters are being added.
IA64 changes only compile-tested, hence somebody doing active IA64
work may want to have a close look at those changes.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
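The trade-off this conversion makes can be sketched as follows. This is a Python model under assumed semantics, not the Xen code: per-CPU slots make increments race-free without locking (each CPU owns a slot) but reads must sum all slots, while a single counter updated with a locked operation (modelled here with a mutex) is exact at any moment.

```python
import threading

class PerCpuCounter:
    """Old scheme: one slot per CPU, incremented without any locking."""
    def __init__(self, ncpus):
        self.slots = [0] * ncpus
    def incr(self, cpu):
        self.slots[cpu] += 1          # only ever touched by `cpu` itself
    def read(self):
        return sum(self.slots)        # reader must walk every slot

class AtomicCounter:
    """New scheme: one shared counter, every update is a locked op."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()
    def incr(self):
        with self._lock:              # stands in for a CPU locked add
            self._value += 1
    def read(self):
        return self._value
```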
|
Signed-off-by: Steven Hand <steven@xensource.com>
|
Creates asm-ia64/perfc_defn.h (empty).
Includes asm/perfc_defn.h in xen/perfc_defn.h
Signed-off-by: Tristan Gingold <tristan.gingold@bull.net>
Signed-off-by: Keir Fraser <keir@xensource.com>
|
Signed-off-by: Tom Woller <thomas.woller@amd.com>
|
(Appendix A1) indicates that there are 44 potential exit reason codes.
Based upon this, increase the size of the PERFCOUNTER_ARRAY for vmexits.
Signed-off-by: Ben Thomas (bthomas@virtualiron.com)
|
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ben Thomas <ben@virtualiron.com>
|
rename the core function ___shadow_status() to take its
place.
The 'fast path quick test' was ridiculously bloated and
didn't work for PGT_fl1_shadow pages.
Signed-off-by: Keir Fraser <keir@xensource.com>
|
uses of max_page variable to use the mfn_valid() predicate.
Signed-off-by: Keir Fraser <keir@xensource.com>
|
meaningless and unnecessary.
Rename rem_timer -> stop_timer.
Signed-off-by: Keir Fraser <keir@xensource.com>
|
Renamed gpfn_to_mfn_foreign() to get_mfn_from_pfn_foreign(), making it more
consistent with get_mfn_from_pfn().
|
Clean up the domain_page.h interfaces. One common header file
<xen/domain_page.h> and map_domain_mem() -> map_domain_page(), takes
a pfn rather than a paddr.
Signed-off-by: Keir Fraser <keir@xensource.com>
|
Enabling light-weight shadows (especially shadow_mode_dirty).
Light-weight shadows leave all the page ref counts based on the guest p.t. pages,
while heavy-weight shadows do all their ref counts based on the shadow's p.t. pages.
shadow_mode_refcounts(dom) == 1 implies heavy-weight shadows.
|
Patch to run a domU in shadow test mode.
|
Rename translate_gpfn_to_mfn to gpfn_to_mfn_foreign.
Minor bug fix from prior merges.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
First attempt at cleanup after merge of shadow code with unstable.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Initial attempt at merging shadow code with head of unstable tree.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Tidy the x86 emulator interface, and use it from within the
writable pagetable algorithm to deal with otherwise unhandleable cases.
For example: L1 mapped at multiple L2 slots; L1 that maps itself; L1
that also maps the code making the update, or the kernel stack.
This provides a proof-of-concept for the emulator that can be picked
up for the VMX code to improve the device-model emulation.
Signed-off-by: Keir Fraser <keir@xensource.com>
|
Snapshots of L1 page table pages now only snapshot the active portion of
the page.
Improved the tlb flushing of shadow mode somewhat...
Fixed a bug in the shadow_min_max encoding stuff.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Keep a list of pre-zeroed L1 shadow pages.
Avoid the cost of zeroing them upon allocation.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
HL2's are now filled in on demand, rather than by doing the entire thing
on creation. Also fixed a bug in hl2 ref counting. hl2 entries don't
take a writable ref to the guest pages, as they are xen mappings, not
guest mappings. Also fixed a tlb flushing bug with hl2 entries.
Bug fix for shadow table ref counting: CR3's shadow table could, in
theory, get released while CR3 was still pointing at it. Fixed.
Bug fix for shadow code with tlb flushes from hypervisor calls.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Cleanup after merge
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Merge with Rolf's tree
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Added prediction of where to find the last writable PTE for a given page;
greatly speeds up promotion of a page to be used as a page table.
Removed some broken concepts of write protecting PDEs and higher level
entries. To write protect a page, all we need to do is write protect all
L1 entries that point at it.
Fixed a bug with translated IO pages; gotta check that MFNs are really backed
by RAM before we go looking in the frame_table for them...
Signed-off-by: michael.fetterman@cl.cam.ac.uk
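The write-protection rule stated above can be sketched as follows. This is a toy model with made-up structures, not the shadow code itself: each L1 table is a list of (mfn, writable) pairs, and `predicted` is the remembered location of the last writable PTE:

```python
def write_protect_page(target_mfn, l1_tables, predicted=None):
    """Clear the writable bit on every L1 entry mapping target_mfn.

    The predicted (table, index) location handles the common case
    where the page has a single writable mapping, avoiding a full
    scan of all L1 tables."""
    def cleared(entry):
        mfn, writable = entry
        return (mfn, False) if mfn == target_mfn else entry

    if predicted is not None:
        t, i = predicted
        l1_tables[t][i] = cleared(l1_tables[t][i])
        # If no writable mapping remains, the prediction sufficed.
        if not any(m == target_mfn and w
                   for tab in l1_tables for m, w in tab):
            return
    # Fall back to scanning every L1 entry.
    for tab in l1_tables:
        for i, entry in enumerate(tab):
            tab[i] = cleared(entry)
```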
|
dom0 runs well in shadow translate mode!
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Initial commit for trying to get a translated dom0 up and running.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
fixed manual merge error
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
manual merge with michael's latest
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Added unshadowing of L2s that contain entries which are both
not present and non-zero. This is a hack, but ought to work OK
for linux domains.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Added extra shadow_sync_mfn() in do_update_va_mapping to deal
with a shortcoming of the checking code in _check_pagetable.
Better to have a few more flushes and checking code that can
still be used. It would be even better to have smarter checking
code, but that will take more time.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Make validate_(pte|pde)_changes a little smarter.
Avoid some unnecessary calls to __shadow_status.
Added an early out for __shadow_status.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Added support for mapping other domain's memory from a privileged
shadowed domain. Should hopefully enable a shadowed dom0 to start
up other domains.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Initial fullshadow checkin.
Things still to do:
- reuse snapshots intelligently.
- minimize tlb flushes during resync.
- figure out when to free up no-longer-used L2 shadows, and
generally deal with out-of-memory kinds of problems.
Some basic guidelines:
- With fullshadow on, you cannot trust
linear_pg_table unless you have first checked whether the VA
in which you are interested is out-of-sync or not.
- Significant new functions/macros include:
page_out_of_sync(mfn): returns true if page is out of sync.
shadow_mark_out_of_sync: make a page be out of sync (allocating
any necessary snapshots, etc)
shadow_out_of_sync(va): returns true if the current mappings
involved in va are out-of-sync.
shadow_sync_va(): bring the pages involved in mapping a particular
va back into sync. Currently calls shadow_sync_all().
shadow_sync_all(): bring all pages back in-sync.
Signed-off-by: michael.fetterman@cl.cam.ac.uk
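A toy model of the interfaces listed above, for illustration only (the real machinery lives in Xen's shadow code; the bookkeeping structures here are invented):

```python
class ShadowOutOfSync:
    """Toy model of the out-of-sync machinery described above."""
    def __init__(self):
        self._oos = set()        # mfns whose guest page has diverged
        self._snapshots = {}     # mfn -> snapshot taken when marked

    def page_out_of_sync(self, mfn):
        return mfn in self._oos

    def shadow_mark_out_of_sync(self, mfn, current_entries):
        # Snapshot the page so a later resync can diff old vs. new.
        self._snapshots[mfn] = list(current_entries)
        self._oos.add(mfn)

    def shadow_out_of_sync(self, va_to_mfns, va):
        # True if any mapping involved in `va` is out of sync.
        return any(m in self._oos for m in va_to_mfns(va))

    def shadow_sync_all(self):
        # Bring every page back in sync and drop the snapshots.
        self._oos.clear()
        self._snapshots.clear()

    def shadow_sync_va(self, va):
        # As the message notes, currently just resyncs everything.
        self.shadow_sync_all()
```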
|
michael's initial shadow code
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Performance counters for hypercalls and exceptions. Perfctr histograms
for pagetable updates.
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@cl.cam.ac.uk>
|
Keep a separate shadow and "hl2" shadow of each guest L2 page.
Still doing excessive clearing of these shadows, though...
Signed-off-by: michael.fetterman@cl.cam.ac.uk
|
Hand merge
|
Hand merge
|
Bug fix for shadow code.
When update_va_mapping() updates an entry, the corresponding shadow
entry may not be reachable via the shadow_linear_pg_table, even though
it is currently shadowed, as the corresponding spde has not necessarily
been faulted into place yet.
|
Improved check_pagetable checking.
Added check_all_pagetables as an alternative to check_pagetable().
|
Some VT-x software perf counters.
Signed-off-by: ian.pratt@cl.cam.ac.uk
|
Replace pseudo-4GB-segment instruction emulation with a segment-type
trick plus instruction replay. Simpler and more robust but actually
somewhat slower (we fault more times, as we can fault on both +ve and
-ve accesses).
|
Initial Xen support for 4GB segments through instruction emulation.
The instruction decoder needs some refactoring, as there is a lot of
duplicated crufty code in there right now. Also, the TLS libraries
hit the emulator a lot, but mainly with one or two instructions;
probably we need to patch those within Linux.
|
Removed old I/O world and cleaned up.
|