| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
| |
Both writing of certain MSRs and VCPUOP_get_physid make sense also for
dynamically (perhaps temporarily) pinned vcpus.
Likely a couple of other MSR writes (MSR_K8_HWCR, MSR_AMD64_NB_CFG,
MSR_FAM10H_MMIO_CONF_BASE) would make sense to be restricted by an
is_pinned() check too, possibly also some MSR reads.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
| |
... decreasing cache footprint. As a prerequisite this requires making
cmdline_parse() a little more flexible.
Also remove a few variables altogether, and adjust sections
annotations for several others.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
| |
clash.
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
| |
If the guest sleeps in hypervisor context, it should not be destroyed
until execution reaches a safe point (i.e., guest context). This is
not implemented yet. :-) But the next patch will rely on it, to allow
an HVM guest to execute hypercalls that indirectly invoke __hvm_copy()
within an rcu_lock_current_domain() region.
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
| |
This is a prerequisite for allowing guest descheduling within a
hypercall.
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
| |
This cuts the size of the domain structure by around 30kB! It is now a
little over a page in size.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
| |
1. Try to allocate from nodes containing CPUs which a guest can be
scheduled on.
2. Remember which node we allocated from last, and round-robin
allocations among above-mentioned nodes.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
| |
Otherwise vcpu_periodic_timer_work() can think the next timer is in
the future (and re-issue it unchanged) while timer_softirq_action()
thinks it's in the past (and fires it immediately), leading to
livelock.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With IRQs getting bound to the CPU the binding vCPU currently runs on
there can result quite a bit of extra cross CPU traffic as soon as
that vCPU moves to a different pCPU. Likewise, when a domain re-binds
an event channel associated with a pIRQ, that IRQ's affinity should
also be adjusted.
The open issue is how to break ties for interrupts shared by multiple
domains - currently, the last request (at any point in time) is being
honored.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
| |
Each domain is allowed to set, reset and disable its timers; when any
timer runs out the domain is killed.
Patch from Christian Limpach <Christian.Limpach@citrix.com>
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
| |
Also remove pointless tasklet from mem_event notify path.
Signed-off-by: John Byrne <john.l.byrne@hp.com>
|
|
|
|
| |
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
interface
...along with a new sysctl to call it directly. This is in order to
support DornerWorks' new ARINC653 scheduler.
Based on code from Josh Holtrop and Kathy Hadley at DornerWorks, Ltd
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@eu.citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
| |
...rather than in softirq context. This is expected to avoid a lot of
subtle deadlocks relating to the fact that softirqs can interrupt a
scheduled vcpu.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
| |
This is preparation for implementing tasklets in vcpu context rather
than softirq context. There is no change to the implementation of
tasklets in this patch.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
| |
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when VCPU is polling on event channel, it usually has urgent task
running, e.g. spin_lock, in this case, it is better for cpuidle driver
not to enter deep C state.
This patch fix the issue that SLES 11 SP1 domain0 hangs in the box of
large number of CPUs (>= 64 CPUs).
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
| |
Avoids crash on dereference of poll_mask after domain_kill().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Grzegorz Milos <Grzegorz.Milos@citrix.com>
|
|
|
|
| |
Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This attempts to address all the concerns raised in
http://lists.xensource.com/archives/html/xen-devel/2009-11/msg01195.html,
but I'm nevertheless still not convinced that all aspects of the
injection handling really work reliably. In particular, while the
patch here on top of the fixes for the problems menioned in the
referenced mail also adds code to keep send_guest_trap() from
injecting multiple events at a time, I don't think the is the right
mechanism - it should be possible to handle NMI/MCE nested within
each other.
Another fix on top of the ones for the earlier described problems is
that the vCPU affinity restore logic didn't account for software
injected NMIs - these never set cpu_affinity_tmp, but due to it most
likely being different from cpu_affinity it would have got restored
(to a potentially random value) nevertheless.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
|
|
|
|
|
|
|
| |
Adds new tool xenlockprof to run from dom0.
From: Juergen Gross <juergen.gross@ts.fujitsu.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
| |
This patch makes HVM's VTd page table dynamic just like what PV guest
does, so that avoid the overhead of maintaining page table until a PCI
device is truly assigned to the HVM guest.
Signed-Off-By: Zhai, Edwin <edwin.zhai@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the shared info layout is fixed, guests are required to use
VCPUOP_register_vcpu_info prior to booting any vCPU beyond the
traditional limit of 32.
MAX_VIRT_CPUS, being an implemetation detail of the hypervisor, is no
longer being exposed in the public headers.
The tools changes are clearly incomplete (and done only so things
would
build again), and the current state of the tools (using scalar
variables all over the place to represent vCPU bitmaps) very likely
doesn't permit booting DomU-s with more than the traditional number of
vCPU-s. Testing of the extended functionality was done with Dom0 (96
vCPU-s, as well as 128 vCPU-s out of which the kernel elected - by way
of a simple kernel side patch - to use only some, resulting in a
sparse
bitmap).
ia64 changes only to make things build, and build-tested only (and the
tools part only as far as the build would go without encountering
unrelated problems in the blktap code).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
| |
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... splitting it into global nr_irqs (determined at boot time) and
per- domain nr_pirqs (derived from nr_irqs and a possibly command line
specified value, which probably should later become a per-domain
config setting).
This has the (desirable imo) side effect of reducing the size of
struct hvm_irq_dpci from requiring an order-3 page to order-2 (on
x86-64), which nevertheless still is too large.
However, there is now a variable size bit array on the stack in
pt_irq_time_out() - while for the moment this probably is okay, it
certainly doesn't look nice. However, replacing this with a static
(pre-)allocation also seems less than ideal, because that would
require at least min(d->nr_pirqs, NR_VECTORS) bit arrays of
d->nr_pirqs bits, since this bit array is used outside of the
serialized code region in that function, and keeping the domain's
event lock acquired across pirq_guest_eoi() doesn't look like a good
idea either.
The IRQ- and vector-indexed arrays hanging off struct hvm_irq_dpci
could in fact be changed further to dynamically use the smaller of the
two ranges for indexing, since there are other assumptions about a
one-to-one relationship between IRQs and vectors here and elsewhere.
Additionally, it seems to me that struct hvm_mirq_dpci_mapping's
digl_list and gmsi fields could really be overlayed, which would yield
significant savings since this structure gets always instanciated in
form of d->nr_pirqs (as per the above could also be the smaller of
this and NR_VECTORS) dimensioned arrays.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tmem, when called from a tmem-capable (paravirtualized) guest, makes
use of otherwise unutilized ("fallow") memory to create and manage
pools of pages that can be accessed from the guest either as
"ephemeral" pages or as "persistent" pages. In either case, the pages
are not directly addressible by the guest, only copied to and fro via
the tmem interface. Ephemeral pages are a nice place for a guest to
put recently evicted clean pages that it might need again; these pages
can be reclaimed synchronously by Xen for other guests or other uses.
Persistent pages are a nice place for a guest to put "swap" pages to
avoid sending them to disk. These pages retain data as long as the
guest lives, but count against the guest memory allocation.
Tmem pages may optionally be compressed and, in certain cases, can be
shared between guests. Tmem also handles concurrency nicely and
provides limited QoS settings to combat malicious DoS attempts.
Save/restore and live migration support is not yet provided.
Tmem is primarily targeted for an x86 64-bit hypervisor. On a 32-bit
x86 hypervisor, it has limited functionality and testing due to
limitations of the xen heap. Nearly all of tmem is
architecture-independent; three routines remain to be ported to ia64
and it should work on that architecture too. It is also structured to
be portable to non-Xen environments.
Tmem defaults off (for now) and must be enabled with a "tmem" xen boot
option (and does nothing unless a tmem-capable guest is running). The
"tmem_compress" boot option enables compression which takes about 10x
more CPU but approximately doubles the number of pages that can be
stored.
Tmem can be controlled via several "xm" commands and many interesting
tmem statistics can be obtained. A README and internal specification
will follow, but lots of useful prose about tmem, as well as Linux
patches, can be found at http://oss.oracle.com/projects/tmem .
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
| |
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Yang Xiaowei <xiaowei.yang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
cpuidle can collaborate with scheduler to reduce unnecessary timer
interrupt. For example, credit scheduler accounting timer
doesn't need to be active at idle time, so it can be stopped at
cpuidle entry and resumed at cpuidle exit. This patch implements this
function by adding two ops in scheduler: tick_suspend/tick_resume, and
implement them for credit scheduler
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
|
|
|
|
|
|
|
|
|
| |
Current scheduler only care performance, thus always picks pCPU from
the most idle package. This knob provides another option to pick pCPU from
least idle package, for user who want performance power balance.
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
|
|
|
|
|
|
| |
Pointed out by Jan Beulich.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Xiaowei Yang <xiaowei.yang@intel.com>
|
|
|
|
|
|
| |
Based on a patch by Joe Jin <joe.jin@oracle.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a per-domain flag to specify whether a domain will be
S3 integrity protected when Xen is launched using tboot/TXT.
The tools now support an integer domain configuration parameter called
's3_integrity', which defaults to 1, to enable S3 integrity protection.
The struct arch_domain structure has been extended to have an
's3_integrity' field that represents this setting.
Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the following compilation error.
Since struct page_list_head is defined in mm.h, sched.h needs mm.h.
Other circular inclusions are sorted out.
> In file included from xen/include/asm-ia64/linux-xen/asm/smp.h:50,
> from xen/include/linux/smp.h:5,
> from xen/include/asm-ia64/linux/topology.h:33,
> from xen/include/asm-ia64/linux-xen/linux/gfp.h:6,
> from xen/include/asm/mm.h:11,
> from xen/include/xen/mm.h:90,
> from viosapic.c:35:
> xen/include/xen/sched.h:174: error: field page_list has incomplete
> type
> xen/include/xen/sched.h:175: error: field xenpage_list has
> incomplete type
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unless more than 16Tb are going to ever be supported in Xen, this will
allow reducing the linked list entries in struct page_info from 16 to
8 bytes.
This doesn't modify struct shadow_page_info, yet, so in order to meet
the constraints of that 'mirror' structure the list entry gets
artificially forced to be 16 bytes in size. That workaround will be
removed in a subsequent patch.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
| |
This requires a bzImage v2.08 or later kernel.
xen/common/inflate.c is taken unmodified from Linux v2.6.28.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
|