| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Otherwise, with arch_compat_vcpu_op() calling arch_do_vcpu_op() to
handle it, it results in -ENOSYS after 6ff9e4f7 ("xen: move
VCPUOP_register_vcpu_info to common code") for 32-bit x86 domains.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
| |
.. as the root page table validation (and the dropping of an eventual
old one) can require meaningful amounts of time.
This is part of CVE-2013-1918 / XSA-45.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
|
|
| |
When a domain's shared info field "max_pfn" is zero,
domain_get_maximum_gpfn() so far returned ULONG_MAX, which
do_memory_op() in turn converted to -1 (i.e. -EPERM). Make the former
always return a sensible number (i.e. zero if the field was zero) and
have the latter no longer truncate return values.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
| |
A shift with a negative count was erroneously used here, yielding
undefined behavior.
Reported-by: Xi Wang <xi@mit.edu>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
| |
The emacs variable to set the C style from a local variable block is
c-file-style, not c-set-style.
Signed-off-by: David Vrabel <david.vrabel@citrix.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
- use the variants not validating the VA range when writing back
structures/fields to the same space that they were previously read
from
- when only a single field of a structure actually changed, copy back
just that field where possible
- consolidate copying back results in a few places
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
| |
Failure should always be detected and handled.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ever since its existence (3.0.3 iirc) the handler for this has been
using non address range checking guest memory accessors (i.e.
the ones prefixed with two underscores) without first range
checking the accessed space (via guest_handle_okay()), allowing
a guest to access and overwrite hypervisor memory.
This is XSA-29 / CVE-2012-5513.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson.citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
c/s 20281:95ea2052b41b, which introduces Grant Table version 2
hypercalls introduces a vulnerability whereby the compat hypercall
handler can fall into an infinite loop.
If the watchdog is enabled, Xen will die after the timeout.
This is a security problem, XSA-24 / CVE-2012-4539.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
More substitutions in this patch, not as obvious as the ones in the
previous patch.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: these changes don't make any difference on x86.
Replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when it is used as
an hypercall argument.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
Add a trace record for every hypercall inside a multicall. These use
a new event ID (with a different sub-class ) so they may be filtered
out if only the calls into hypervisor are of interest.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
| |
for common and x86 ones at least, to address the problem of storing
zero-extended values into the multicall result field otherwise.
Reported-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reinstates the XENMEM_remove_from_physmap hypercall
which was removed in 19041:ee62aaafff46 because it was not used.
However, is now needed in order to support xenstored stub domains.
The xenstored stub domain is not priviliged like dom0 and so cannot
unilaterally map the xenbus page of other guests into it's address
space. Therefore, before creating a domU the domain builder needs to
seed its grant table with a grant ref allowing the xenstored stub
domain to access the new domU's xenbus page.
At present domU's do not start with their grant table mapped.
Instead it gets mapped when the guest requests a grant table from
the hypervisor.
In order to seed the grant table, the domain builder first needs to
map it into dom0 address space. But the hypercall to do this
requires a gpfn (guest pfn), which is an mfn for PV guest, but a pfn
for HVM guests. Therfore, in order to seed the grant table of an
HVM guest, dom0 needs to *temporarily* map it into the guest's
"physical" address space.
Hence the need to reinstate the XENMEM_remove_from_physmap hypercall.
Signed-off-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
| |
provided that they are not currently active.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
| |
is_initialised under the per-domain lock.
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
| |
Just like with the __RING_SIZE() macro, the compat mode type checking
macros also need changing in order to work with gcc 4.5.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
| |
Signed-off-by: Steven Smith <steven.smith@citrix.com>
|
|
|
|
|
|
| |
isn't in the way when we introduce struct grant_entry_v2.
Signed-off-by: Steven Smith <steven.smith@citrix.com>
|
|
|
|
|
|
|
|
|
| |
Eliminate the hard-coded, arbitrarily chosen limit of 512 grant table
ops a domain may submit at a time, and instead check for necessary
preemption after each individual element got processed, invoking the
hypercall continuation logic when necessary.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
| |
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
| |
This is useful from tools in the same way /proc/cmdline is useful for
the domain 0 kernel.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the shared info layout is fixed, guests are required to use
VCPUOP_register_vcpu_info prior to booting any vCPU beyond the
traditional limit of 32.
MAX_VIRT_CPUS, being an implemetation detail of the hypervisor, is no
longer being exposed in the public headers.
The tools changes are clearly incomplete (and done only so things
would
build again), and the current state of the tools (using scalar
variables all over the place to represent vCPU bitmaps) very likely
doesn't permit booting DomU-s with more than the traditional number of
vCPU-s. Testing of the extended functionality was done with Dom0 (96
vCPU-s, as well as 128 vCPU-s out of which the kernel elected - by way
of a simple kernel side patch - to use only some, resulting in a
sparse
bitmap).
ia64 changes only to make things build, and build-tested only (and the
tools part only as far as the build would go without encountering
unrelated problems in the blktap code).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tmem, when called from a tmem-capable (paravirtualized) guest, makes
use of otherwise unutilized ("fallow") memory to create and manage
pools of pages that can be accessed from the guest either as
"ephemeral" pages or as "persistent" pages. In either case, the pages
are not directly addressible by the guest, only copied to and fro via
the tmem interface. Ephemeral pages are a nice place for a guest to
put recently evicted clean pages that it might need again; these pages
can be reclaimed synchronously by Xen for other guests or other uses.
Persistent pages are a nice place for a guest to put "swap" pages to
avoid sending them to disk. These pages retain data as long as the
guest lives, but count against the guest memory allocation.
Tmem pages may optionally be compressed and, in certain cases, can be
shared between guests. Tmem also handles concurrency nicely and
provides limited QoS settings to combat malicious DoS attempts.
Save/restore and live migration support is not yet provided.
Tmem is primarily targeted for an x86 64-bit hypervisor. On a 32-bit
x86 hypervisor, it has limited functionality and testing due to
limitations of the xen heap. Nearly all of tmem is
architecture-independent; three routines remain to be ported to ia64
and it should work on that architecture too. It is also structured to
be portable to non-Xen environments.
Tmem defaults off (for now) and must be enabled with a "tmem" xen boot
option (and does nothing unless a tmem-capable guest is running). The
"tmem_compress" boot option enables compression which takes about 10x
more CPU but approximately doubles the number of pages that can be
stored.
Tmem can be controlled via several "xm" commands and many interesting
tmem statistics can be obtained. A README and internal specification
will follow, but lots of useful prose about tmem, as well as Linux
patches, can be found at http://oss.oracle.com/projects/tmem .
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
| |
Never used by a guest OS (except in IA64 hcall translation layer) and
obsoleted in the tools for ages. Recent usage by qemu-dm is now
removed.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
| |
GMFN==INVALID (~0UL).
Found by Diego Ongaro <diego.ongaro@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
| |
- handle VCPUOP_register_vcpu_info and VCPUOP_get_physid (and add
respective layout checks)
- add missing structure size check for struct vcpu_info
- add missing layout check for vcpu_set_periodic_timer
- handle VCPUOP_set_singleshot_timer via argument translation as the
structure sizes differ (due to padding in 64-bits)
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
| |
While trying to run a 32-bit PV domU on a 64-bit hypervisor, I
triggered an assert in the hypervisor. The assert dealt with the
maximum number of grants that a domU can have. I made the hypervisor
a bit more graceful by returning an error rather than asserting.
Signed-off-by: Michael Abd-El-Malek <mabdelmalek@cmu.edu>
|
| |
|
|
|
|
|
|
|
| |
Add an explicit kexec_load_unload_compat() using the same method
that was used to create kexec_range_compat()
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unless I am mistaken, the compat functions are provided a stable ABI.
This includes providing a stable version of xen_kexec_range_t in the
form of compat_kexec_range_t. However, internally it doesn't really
matter how xen represents the data.
Currently the code provides for the creation of a compat version of
all kexec range functions, which use the compat_kexec_range_t
function. This is difficult to extend if range code exists outside of
xen/common/kexec.c.
The existence of "#ifdef CONFIG_X86_64" in the code suggests that some
of the range code might be better off in architecture specific code.
Furthermore, subsequent patches will introduce ia64-specific range
handling code, which really would be much better off somewhere in
arch/ia64/.
With this in mind, the handling of compat_kexec_range_t is changed
such that the code which reads and returns data from user-space
translates between compat_kexec_range_t and xen_kexec_range_t. As,
padding aside, the two structures are currently the same this is quite
easy. Things may get more tricky in the future, but I don't believe
this change is likely to make things significantly worse (or better)
in that regard. In any case, refactoring can occur again as required.
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
|
|
|
| |
From: George Dunlap <George.Dunlap@eu.citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xensource.com>
|
|
|
|
| |
Signed-off-by: Ian Campbell <ian.campbell@xensource.com>
|
|
|
|
| |
Signed-off-by: Tim Deegan <Tim.Deegan@xensource.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xensource.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xensource.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mcs->call (struct multicall_entry) always needs to be translated into
mcs->compat_call (struct compat_multicall_entry) when a multicall is
preempted in compat mode. Previously this translation only occured for
those hypercalls which explicitly called hypercall_xlat_continuation()
which doesn't cover all hypercalls which could potentially be
preempted.
Change hypercall_xlat_continuation() to only translate only the
hypercall arguments themselves and not the multicall_entry
layout. Translate the layout for all hypercalls in in
compat_multicall() instead.
Signed-off-by: Ian Campbell <ian.campbell@xensource.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The operation unmap_and_replace is an extension of unmap_grant_ref.
A new argument in the form of a virtual address (for PV) is given.
Instead of modifying the PTE for the mapped grant table entry to
null, we change it to the PTE for the new address. In turn we
point the new address to null.
As it stands grant table entries once mapped cannot be
remapped by the guest OS (it can however perform a new
mapping on the same entry but that is within our control).
Therefore it's safe to manipulate the mapped PTE entry to
redirect it to a normal page where we've copied the contents.
It's intended to be used as follows:
1) map_grant_ref to v1
2) ...
3) alloc page at v2
4) copy the page at v1 to v2
5) unmap_and_replace v1 with v2
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Added compat integration (PAE-on-64).
Signed-off-by: Keir Fraser <keir@xensource.com>
|
|\ |
|
| |
| |
| |
| | |
Signed-off-by: Keir Fraser <keir@xensource.com>
|
|/
|
|
|
|
| |
in a single batch for a 32-on-64 domain.
Signed-off-by: Steven Smith <sos22@cam.ac.uk>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xensource.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of kludging a max_gpfn estimate in shared_info, add a new
XENMEM command to discover the actual maximum gpfn value as known by
the shadow code.
This needs to be more robust when we support HVM ballooning in future
anyway. One interesting point is that max_gpfn may be close to 4GB
even for small-memory HVM guests since for example SVGA LFB is mapped
into the I/O hole. We may need to special case the I/O hole somehow,
or provide some finer-grained way to find out which parts of the GPFN
space are actually used (e.g., get Xen to fill in a bitmap with 1 bit
per 1024 pages, or similar).
Signed-off-by: Keir Fraser <keir@xensource.com>
|