path: root/xen/include/xen/perfc_defn.h
Commit message [Author, Age, Files, Lines]
* xen: sched_credit: improve tickling of idle CPUs [Dario Faggioli, 2012-12-18, 1 file, -4/+1]
    Right now, when a VCPU wakes up, we check whether it should preempt
    what is running on the PCPU, and whether or not the waking VCPU can be
    migrated (by tickling some idlers). However, this can result in
    suboptimal or even wrong behaviour, as explained here:

      http://lists.xen.org/archives/html/xen-devel/2012-10/msg01732.html

    This change, instead, when deciding which PCPU(s) to tickle upon VCPU
    wake-up, considers both what is likely to happen on the PCPU where the
    wakeup occurs, and whether or not there are idlers where the woken-up
    VCPU can run. In fact, if there are, we can avoid interrupting the
    running VCPU. Only if there are no such PCPUs are preemption and
    migration the way to go.

    This has been tested (on top of the previous change) by running the
    following benchmarks inside 2, 6 and 10 VMs, concurrently, on a shared
    host, each with 2 VCPUs and 960 MB of memory (host had 16 ways and
    12 GB RAM).

    1) All VMs had 'cpus="all"' in their config file.

    $ sysbench --test=cpu ... (time, lower is better)
    | VMs | w/o this change         | w/ this change          |
    |   2 | 50.078467 +/- 1.6676162 | 49.673667 +/- 0.0094321 |
    |   6 | 63.259472 +/- 0.1137586 | 61.680011 +/- 1.0208723 |
    |  10 | 91.246797 +/- 0.1154008 | 90.396720 +/- 1.5900423 |

    $ sysbench --test=memory ... (throughput, higher is better)
    | VMs | w/o this change         | w/ this change          |
    |   2 | 485.56333 +/- 6.0527356 | 487.83167 +/- 0.7602850 |
    |   6 | 401.36278 +/- 1.9745916 | 409.96778 +/- 3.6761092 |
    |  10 | 294.43933 +/- 0.8064945 | 302.49033 +/- 0.2343978 |

    $ specjbb2005 ... (throughput, higher is better)
    | VMs | w/o this change         | w/ this change          |
    |   2 | 43150.63 +/- 1359.5616  | 43275.427 +/- 606.28185 |
    |   6 | 29274.29 +/- 1024.4042  | 29716.189 +/- 1290.1878 |
    |  10 | 19061.28 +/- 512.88561  | 19192.599 +/- 605.66058 |

    2) All VMs had their VCPUs statically pinned to the host's PCPUs.

    $ sysbench --test=cpu ... (time, lower is better)
    | VMs | w/o this change         | w/ this change          |
    |   2 | 47.8211 +/- 0.0215504   | 47.826900 +/- 0.0077872 |
    |   6 | 62.689122 +/- 0.0877173 | 62.764539 +/- 0.3882493 |
    |  10 | 90.321097 +/- 1.4803867 | 89.974570 +/- 1.1437566 |

    $ sysbench --test=memory ... (throughput, higher is better)
    | VMs | w/o this change         | w/ this change          |
    |   2 | 550.97667 +/- 2.3512355 | 550.87000 +/- 0.8140792 |
    |   6 | 443.15000 +/- 5.7471797 | 454.01056 +/- 8.4373466 |
    |  10 | 313.89233 +/- 1.3237493 | 321.81167 +/- 0.3528418 |

    $ specjbb2005 ... (throughput, higher is better)
    | VMs | w/o this change         | w/ this change          |
    |   2 | 49591.057 +/- 952.93384 | 49594.195 +/- 799.57976 |
    |   6 | 33538.247 +/- 1089.2115 | 33671.758 +/- 1077.6806 |
    |  10 | 21927.870 +/- 831.88742 | 21891.131 +/- 563.37929 |

    Numbers show how the change has either no or very limited impact
    (specjbb2005 case) or, when it does have some impact, that is a real
    improvement in performance (sysbench-memory case).

    Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
    Acked-by: George Dunlap <george.dunlap@citrix.com>
    Committed-by: Keir Fraser <keir@xen.org>
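
    The decision described in this commit message can be sketched roughly
    as follows. This is a simplified illustration written from the prose
    above, not the code in sched_credit.c: the struct, its field names and
    the 32-bit masks (so PCPU numbers must be below 32) are all
    assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a wake-up; not Xen's real data structures. */
struct wakeup {
    unsigned int cpu;      /* PCPU where the wake-up occurs (must be < 32) */
    int new_pri;           /* priority of the waking VCPU */
    int cur_pri;           /* priority of the VCPU running on `cpu` */
    uint32_t idlers;       /* bitmask of idle PCPUs */
    uint32_t new_affinity; /* PCPUs the waking VCPU may run on */
    uint32_t cur_affinity; /* PCPUs the running VCPU may run on */
};

/* Return the bitmask of PCPUs to tickle for this wake-up. */
static uint32_t cpus_to_tickle(const struct wakeup *w)
{
    uint32_t idle_for_new = w->idlers & w->new_affinity;

    /* An idler can take the waking VCPU: leave the running VCPU alone. */
    if (idle_for_new != 0)
        return idle_for_new;

    /* No usable idler and no priority advantage: nothing to tickle. */
    if (w->new_pri <= w->cur_pri)
        return 0;

    /* Preempt here, and poke idlers that could pick up the preempted VCPU. */
    return (UINT32_C(1) << w->cpu) | (w->idlers & w->cur_affinity);
}
```

    The point of the ordering is that preemption is only considered once
    no idle PCPU is available to the waking VCPU.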
* xen: sched: generalize scheduling related perfcounter macros [Dario Faggioli, 2012-10-23, 1 file, -5/+7]
    Moving some of them from sched_credit.c to generic scheduler code.
    This also allows the other schedulers to use perf counters equally
    easily.

    This change is mainly preparatory work for what is stated above. In
    fact, it mostly does s/CSCHED_STAT/SCHED_STAT/, and, in general, it
    implies no functional changes.

    Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
    Committed-by: Keir Fraser <keir@xen.org>
* sched_credit: Use delay to control scheduling frequency [Hui Lv, 2012-01-17, 1 file, -0/+1]
    This patch can improve Xen performance:
    1. Basically, the "delay method" can achieve an 11% overall
       performance boost for SPECvirt over the original credit scheduler.
    2. We have tried a 1ms delay and a 10ms delay; there is no big
       difference between these two configurations. (1ms is enough to
       achieve good performance.)
    3. We have compared response time/latency at different load levels
       (low, high, peak); the "delay method" didn't increase response
       time very much.
    4. A 1ms delay can reduce context switches by 30% at peak
       performance, which is where the benefit comes from.
       (int sched_ratelimit_us = 1000 is the recommended setting)

    Signed-off-by: Hui Lv <hui.lv@intel.com>
    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
    Committed-by: Keir Fraser <keir@xen.org>
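
    A rough sketch of what such a delay check might look like, assuming
    the scheduler knows when the current VCPU started running. The helper
    and its parameter names are hypothetical, and treating a limit of 0 as
    "disabled" is an assumption of this example; only the 1000 us value
    comes from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper; parameter names are illustrative, not Xen's. */
static bool keep_running(int64_t now_us, int64_t started_us,
                         bool still_runnable, int64_t ratelimit_us)
{
    int64_t runtime_us = now_us - started_us;

    /* A limit of 0 disables rate limiting in this sketch. */
    if (ratelimit_us <= 0 || !still_runnable)
        return false;

    /* Too soon since the last switch: skip the context switch. */
    return runtime_us < ratelimit_us;
}
```

    With ratelimit_us = 1000, a VCPU that has run for 500 us would keep the
    PCPU, while one that has run for 1500 us would be subject to the normal
    scheduling decision.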
* Fix the perfc=y build. [Keir Fraser, 2009-03-09, 1 file, -0/+1]
    Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
* scheduler: Use perf_counter subsystem for stats [Keir Fraser, 2009-03-09, 1 file, -0/+34]
    Signed-off-by: Xiaowei Yang <xiaowei.yang@intel.com>
    Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
* xen: Remove legacy references to explicitly per-cpu perf counters. [kfraser@localhost.localdomain, 2007-03-27, 1 file, -6/+6]
    Signed-off-by: Keir Fraser <keir@xensource.com>
* xen: Make all performance counters per-cpu, avoiding the need to update [kfraser@localhost.localdomain, 2007-03-27, 1 file, -0/+3]
    them with atomic (locked) ops.

    Conversion here isn't complete in the sense that many places still use
    the old per-CPU accessors (which are now redundant). Since the patch
    is already rather big, I'd prefer replacing those in a subsequent
    patch.

    While doing this, I also converted x86's multicall macros to no longer
    require inclusion of asm-offsets.h in the respective C file (on IA64
    the use of asm-offsets.h in C sources seems more widespread, hence
    there I rather used IA64_ prefixes for the otherwise conflicting
    performance counter indices).

    On x86, a few counter increments get moved a little, to avoid
    duplicate counting of preempted hypercalls.

    Also, a few counters are being added.

    IA64 changes only compile-tested, hence somebody doing active IA64
    work may want to have a close look at those changes.

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
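
    The idea of trading locked updates for per-CPU storage can be
    illustrated with a minimal sketch. Every name below is hypothetical and
    the layout is deliberately naive; it only shows why a plain increment
    is enough when each CPU writes its own slot.

```c
#include <stddef.h>

#define SKETCH_NR_CPUS 4   /* hypothetical CPU count */

enum sketch_counter {
    SKETCH_HYPERCALLS,
    SKETCH_EXCEPTIONS,
    SKETCH_NUM_COUNTERS
};

/* One row per CPU: each CPU only ever writes its own row. */
static unsigned int sketch_perfc[SKETCH_NR_CPUS][SKETCH_NUM_COUNTERS];

/* Plain increment: no locked instruction, because the row is CPU-private. */
static void sketch_perfc_incr(unsigned int cpu, enum sketch_counter c)
{
    sketch_perfc[cpu][c]++;
}

/* Readers pay the cost instead, summing the per-CPU values. */
static unsigned long sketch_perfc_total(enum sketch_counter c)
{
    unsigned long sum = 0;

    for (size_t cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        sum += sketch_perfc[cpu][c];
    return sum;
}
```

    Updates are frequent and reads are rare, so moving the aggregation
    cost to the reader is usually the cheaper arrangement.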
* Remove redundant performance counters. [Steven Hand, 2006-08-24, 1 file, -1/+0]
    Signed-off-by: Steven Hand <steven@xensource.com>
* Move x86 perf counters declarations to asm-x86/perfc_defn.h [kfraser@localhost.localdomain, 2006-07-25, 1 file, -137/+1]
    Creates asm-ia64/perfc_defn.h (empty).
    Includes asm/perfc_defn.h in xen/perfc_defn.h

    Signed-off-by: Tristan Gingold <tristan.gingold@bull.net>
    Signed-off-by: Keir Fraser <keir@xensource.com>
* [SVM] Add perfcounter svmexits array with correct size creation. [kaf24@firebug.cl.cam.ac.uk, 2006-07-14, 1 file, -0/+2]
    Signed-off-by: Tom Woller <thomas.woller@amd.com>
* According to the April 2005 Intel Virtualization Technology Specification [kaf24@firebug.cl.cam.ac.uk, 2006-03-09, 1 file, -1/+1]
    (Appendix A1), there are 44 potential exit reason codes. Based upon
    this, increase the size of the PERFCOUNTER_ARRAY for vmexits.

    Signed-off-by: Ben Thomas (bthomas@virtualiron.com)
* Fix Xen builds with perfc=y and perfc_arrays=y. [kaf24@firebug.cl.cam.ac.uk, 2006-02-16, 1 file, -0/+25]
    Signed-off-by: Chris Wright <chrisw@sous-sol.org>
    Signed-off-by: Ben Thomas <ben@virtualiron.com>
* Delete old 'shortcut' function __shadow_status() and [kaf24@firebug.cl.cam.ac.uk, 2006-02-09, 1 file, -1/+1]
    rename the core function ___shadow_status() to take its place. The
    'fast path quick test' was ridiculously bloated and didn't work for
    PGT_fl1_shadow pages.

    Signed-off-by: Keir Fraser <keir@xensource.com>
* Fix some more pfn/mfn/gmfn/gpfn inconsistencies. Fix some direct [kaf24@firebug.cl.cam.ac.uk, 2006-02-02, 1 file, -1/+1]
    uses of max_page variable to use the mfn_valid() predicate.

    Signed-off-by: Keir Fraser <keir@xensource.com>
* Rename ac_timer_* interfaces -> timer_*. The ac_ is [kaf24@firebug.cl.cam.ac.uk, 2006-01-12, 1 file, -1/+1]
    meaningless and unnecessary. Rename rem_timer -> stop_timer.

    Signed-off-by: Keir Fraser <keir@xensource.com>
* Allow __gpfn_to_mfn() to automatically deal with translated domains != current. [Michael.Fetterman@cl.cam.ac.uk, 2005-11-28, 1 file, -1/+1]
    Renamed gpfn_to_mfn_foreign() to get_mfn_from_pfn_foreign(), making it
    more consistent with get_mfn_from_pfn().
* Fix perfc_defn.h to allow multiple inclusion. [kaf24@firebug.cl.cam.ac.uk, 2005-08-06, 1 file, -3/+4]
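
    A definitions header that is meant to be included more than once is
    commonly an "X-macro" list, expanded with a different macro definition
    each time. The sketch below assumes that is the motivation here; the
    commit itself does not say so. It uses a list macro in place of a
    repeated #include so that it fits in one file, and the counter names
    and descriptions are made up for the example.

```c
/* Hypothetical counter list; a real definitions header would hold these
 * lines and be #included once per expansion instead of using a list macro. */
#define SKETCH_PERFC_LIST(X)            \
    X(hypercalls, "hypercalls")         \
    X(exceptions, "exceptions")

/* Expansion 1: one enum index per counter. */
#define AS_INDEX(name, desc) SKETCH_PERFC_##name,
enum { SKETCH_PERFC_LIST(AS_INDEX) SKETCH_PERFC_COUNT };
#undef AS_INDEX

/* Expansion 2: a parallel table of descriptions. */
#define AS_DESC(name, desc) desc,
static const char *const sketch_perfc_desc[] = { SKETCH_PERFC_LIST(AS_DESC) };
#undef AS_DESC
```

    Because the indices and the description table are generated from the
    same list, they stay in step when a counter is added or removed.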
* bitkeeper revision 1.1689 (42a58901_lkUvZPbAZcV8H9a9NNmtg) [kaf24@firebug.cl.cam.ac.uk, 2005-06-07, 1 file, -103/+118]
    Clean up the domain_page.h interfaces. One common header file
    <xen/domain_page.h> and map_domain_mem() -> map_domain_page(), takes
    a pfn rather than a paddr.

    Signed-off-by: Keir Fraser <keir@xensource.com>
* bitkeeper revision 1.1385.1.7 (427f6405sUeICnIzUJ_HaXbYnLds4A) [mafetter@fleming.research, 2005-05-09, 1 file, -2/+4]
    Enabling light-weight shadows (especially shadow_mode_dirty).
    Light-weight shadows leave all the page ref counts based on the guest
    p.t. pages, while heavy-weight shadows do all their ref counts based
    on the shadow's p.t. pages. shadow_mode_refcounts(dom) == 1 implies
    heavy-weight shadows.
* bitkeeper revision 1.1272.1.1 (425b95a8cRhux_vtKXZDHhnHcViWRg) [mafetter@fleming.research, 2005-04-12, 1 file, -0/+1]
    Patch to run a domU in shadow test mode.
* bitkeeper revision 1.1266 (42511c5bb2cYQH5revHQWgKY0haDHg) [mafetter@fleming.research, 2005-04-04, 1 file, -1/+1]
    Rename translate_gpfn_to_mfn to gpfn_to_mfn_foreign.
    Minor bug fix from prior merges.

    Signed-off-by: michael.fetterman@cl.cam.ac.uk
* bitkeeper revision 1.1265 (42435d13hIiIzrasNZHbz13uy4ZTKg) [mafetter@fleming.research, 2005-03-25, 1 file, -1/+1]
    First attempt at cleanup after merge of shadow code with unstable.

    Signed-off-by: michael.fetterman@cl.cam.ac.uk
* bitkeeper revision 1.1264 (4243449d-JwBVsSinjAWdYveMNhEjQ) [mafetter@fleming.research, 2005-03-24, 1 file, -0/+1]
|\
    Initial attempt at merging shadow code with head of unstable tree.

    Signed-off-by: michael.fetterman@cl.cam.ac.uk
| * bitkeeper revision 1.1236.34.3 (4237063cE2rat5RdEGCsTzuaC6XCcA) [kaf24@firebug.cl.cam.ac.uk, 2005-03-15, 1 file, -0/+1]
      Tidy the x86 emulator interface, and use it from within the
      writable pagetable algorithm to deal with otherwise unhandleable
      cases. For example: L1 mapped at multiple L2 slots; L1 that maps
      itself; L1 that also maps the code making the update, or the kernel
      stack.

      This provides a proof-of-concept for the emulator that can be
      picked up for the VMX code to improve the device-model emulation.

      Signed-off-by: Keir Fraser <keir@xensource.com>
* | bitkeeper revision 1.1261 (4242ef64dXDbGRaZN94_Vg02rxL1tg) [mafetter@fleming.research, 2005-03-24, 1 file, -12/+23]
      Snapshots of L1 page table pages now only snapshot the active
      portion of the page.

      Improved the tlb flushing of shadow mode somewhat...

      Fixed a bug in the shadow_min_max encoding stuff.

      Signed-off-by: michael.fetterman@cl.cam.ac.uk
* | bitkeeper revision 1.1260 (4242b380EoY-OHIALnp_JwJHYfsozA) [mafetter@fleming.research, 2005-03-24, 1 file, -19/+16]
      Keep a list of pre-zero'ed L1 shadow pages.
      Avoid the cost of zero'ing them upon allocation.

      Signed-off-by: michael.fetterman@cl.cam.ac.uk
* | bitkeeper revision 1.1254 (42405db8CeBSiHgkIfnk7WeA9Pvjmw) [mafetter@fleming.research, 2005-03-22, 1 file, -0/+1]
      HL2's are now filled in on demand, rather than by doing the entire
      thing on creation. Also fixed a bug in hl2 ref counting. hl2
      entries don't take a writable ref to the guest pages, as they are
      xen mappings, not guest mappings. Also fixed a tlb flushing bug
      with hl2 entries.

      Bug fix for shadow table ref counting. CR3's shadow table could, in
      theory, get released while it's still pointing at it. Fixed.

      Bug fix for shadow code with tlb flushes from hypervisor calls.

      Signed-off-by: michael.fetterman@cl.cam.ac.uk
* | bitkeeper revision 1.1252 (423f0601ZuS2OaJ71fHZxF3wK4zUrQ) [mafetter@fleming.research, 2005-03-21, 1 file, -0/+14]
      Cleanup after merge

      Signed-off-by: michael.fetterman@cl.cam.ac.uk
* | bitkeeper revision 1.1251 (423ef939YAuSbyU77UivO6Ybvl5Yzw) [mafetter@fleming.research, 2005-03-21, 1 file, -1/+0]
|\ \
      Merge with Rolf's tree

      Signed-off-by: michael.fetterman@cl.cam.ac.uk
| * | bitkeeper revision 1.1236.32.14 (423eb7a0HqJL37tAErMbIXIQw6Q3Jg) [mafetter@fleming.research, 2005-03-21, 1 file, -1/+6]
        Added prediction of where to find the last writable PTE for a
        given page; greatly speeds up promotion of a page to be used as a
        page table.

        Removed some broken concepts of write protecting PDEs and higher
        level entries. To write protect a page, all we need to do is
        write protect all L1 entries that point at it.

        Fixed a bug with translated IO pages; gotta check that MFNs are
        really backed by RAM before we go looking in the frame_table for
        them...

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
| * | bitkeeper revision 1.1236.32.11 (423b097bvEBDPFFtDR44bf9tw_JCqg) [mafetter@fleming.research, 2005-03-18, 1 file, -1/+7]
        dom0 runs well in shadow translate mode!

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
| * | bitkeeper revision 1.1236.32.10 (4239772aZ9Ayf3Cwr_6ubXtSI1oZ9Q) [mafetter@fleming.research, 2005-03-17, 1 file, -0/+2]
        Initial commit for trying to get a translated dom0 up and running.

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
* | | bitkeeper revision 1.1250 (42388967abs8cSqOtVzsPvhEiltK5Q) [rneugeba@wyvis.research.intel-research.net, 2005-03-16, 1 file, -0/+1]
        fixed manual merge error

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
* | | bitkeeper revision 1.1249 (42387345w4RJ2RC5ifMnONI8xxsgWA) [rneugeba@wyvis.research.intel-research.net, 2005-03-16, 1 file, -3/+8]
|\| |
        manual merge with michaels latest

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
| * | bitkeeper revision 1.1236.32.9 (42378931ytaSYjOpR6-Ss599yO6Zjg) [mafetter@fleming.research, 2005-03-16, 1 file, -0/+1]
        Added unshadowing of L2s that contain entries which are both not
        present and non-zero. This is a hack, but ought to work OK for
        linux domains.

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
| * | bitkeeper revision 1.1236.32.8 (4237887fr1Mo71Tp0RoJHmt875tSBg) [mafetter@fleming.research, 2005-03-16, 1 file, -0/+1]
        Added extra shadow_sync_mfn() in do_update_va_mapping to deal
        with a shortcoming of the checking code in _check_pagetable.
        Better to have a few more flushes and checking code that can
        still be used. It would be even better to have smarter checking
        code, but that will take more time.

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
| * | bitkeeper revision 1.1236.33.2 (4236b517THiLxPjnIZVybs7stl7QFQ) [mafetter@fleming.research, 2005-03-15, 1 file, -3/+6]
        Make validate_(pte|pde)_changes a little smarter.
        Avoid some unnecessary calls to __shadow_status.
        Added an early out for __shadow_status.

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
| * | bitkeeper revision 1.1236.33.1 (42369984aBV0c2ogV4Bh1SA0FxWSLA) [mafetter@fleming.research, 2005-03-15, 1 file, -0/+1]
        Added support for mapping other domain's memory from a privileged
        shadowed domain. Should hopefully enable a shadowed dom0 to start
        up other domains.

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
| * | bitkeeper revision 1.1236.32.2 (42360b33-HudAOddVBt3ez4shMiyOw) [mafetter@fleming.research, 2005-03-14, 1 file, -16/+36]
        Initial fullshadow checkin.

        Things still to do:
        - reuse snapshots intelligently.
        - minimize tlb flushes during resync.
        - figure out when to free up no-longer-used L2 shadows, and
          generally deal with out-of-memory kinds of problems.

        Some basic guidelines:
        - With fullshadow on, you cannot trust linear_pg_table unless you
          have first checked whether the VA in which you are interested
          is out-of-sync or not.
        - Significant new functions/macros include:
            page_out_of_sync(mfn): returns true if page is out of sync.
            shadow_mark_out_of_sync: make a page be out of sync
              (allocating any necessary snapshots, etc)
            shadow_out_of_sync(va): returns true if the current mappings
              involved in va are out-of-sync.
            shadow_sync_va(): bring the pages involved in mapping a
              particular va back into sync. Currently calls
              shadow_sync_all().
            shadow_sync_all(): bring all pages back in-sync.

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
* | | bitkeeper revision 1.1247 (42386d3dpoPovazcjxeV5wadySvQoA) [rneugeba@wyvis.research.intel-research.net, 2005-03-16, 1 file, -0/+30]
| |/
|/|
        michael's initial shadow code

        Signed-off-by: michael.fetterman@cl.cam.ac.uk
* | bitkeeper revision 1.1236.1.41 (4224ab34YunoDc0_FV3T0OZPcJ0Pcw) [kaf24@scramble.cl.cam.ac.uk, 2005-03-01, 1 file, -2/+10]
|/
      Performance counters for hypercalls and exceptions. Perfctr
      histograms for pagetable updates.

      Signed-off-by: Rolf Neugebauer <rolf.neugebauer@intel.com>
      Signed-off-by: Keir Fraser <keir.fraser@cl.cam.ac.uk>
* bitkeeper revision 1.1236.3.3 (421f3ac7eVdbco19D20ncC6UepUAYw) [maf46@burn.cl.cam.ac.uk, 2005-02-25, 1 file, -0/+1]
    Keep a separate shadow and "hl2" shadow of each guest L2 page.
    Still doing excessive clearing of these shadows, though...

    Signed-off-by: michael.fetterman@cl.cam.ac.uk
* bitkeeper revision 1.1195 (420e5ff3SFUc-sHp8lfCe-xCoUlk-A) [mafetter@fleming.research, 2005-02-12, 1 file, -1/+5]
|\
    Hand merge
| * bitkeeper revision 1.1159.261.3 (420e3c341h1fbkH3NCtXo63yPlvjGg) [mafetter@fleming.research, 2005-02-12, 1 file, -1/+2]
| |\
      Hand merge
| | * bitkeeper revision 1.1159.260.1 (420e07e16YlSevQI9RYNGLwarPr2gQ) [mafetter@fleming.research, 2005-02-12, 1 file, -1/+2]
        Bug fix for shadow code.

        When update_va_mapping() updates an entry, the corresponding
        shadow entry may not be reachable via the shadow_linear_pg_table,
        even though it is currently shadowed, as the corresponding spde
        has not necessarily been faulted into place yet.
| * | bitkeeper revision 1.1159.261.1 (420e14ceCymFrPEpDCTaPJueMUTvsg) [mafetter@fleming.research, 2005-02-12, 1 file, -0/+3]
| |/
        Improved check_pagetable checking.
        Added check_all_pagetables as an alternative to check_pagetable().
* / bitkeeper revision 1.1159.259.1 (420d50b3Mu97o7HHZsGGVPlv3ORCOw) [iap10@freefall.cl.cam.ac.uk, 2005-02-12, 1 file, -0/+5]
|/
      Some VT-x software perf counters.

      Signed-off-by: ian.pratt@cl.cam.ac.uk
* bitkeeper revision 1.1159.1.244 (417679e8xXMjFVu9LO2SfkqXqR2RjA) [kaf24@freefall.cl.cam.ac.uk, 2004-10-20, 1 file, -1/+1]
    Replace pseudo-4GB-segment instruction emulation with a segment-type
    trick plus instruction replay. Simpler and more robust but actually
    somewhat slower (we fault more times, as we can fault on both +ve and
    -ve accesses).
* bitkeeper revision 1.1104.1.1 (40f923322G2jO4f0TVh9AXW3jpr9bQ) [kaf24@scramble.cl.cam.ac.uk, 2004-07-17, 1 file, -0/+2]
    Initial Xen support for 4GB segments thru instruction emulation.
    The instruction decoder needs some refactoring as there is lots of
    duplicated crufty code in there right now. Also, the TLS libraries
    hit the emulator a LOT, but mainly with one or two instructions.
    Probably we need to patch those within Linux.
* bitkeeper revision 1.950 (40c86066TdQwTUVQtZ0q0py10MTUgg) [kaf24@scramble.cl.cam.ac.uk, 2004-06-10, 1 file, -6/+0]
    Removed old I/O world and cleaned up.