| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Xen currently makes no strong distinction between the SMP barriers (smp_mb
etc) and the regular barrier (mb etc). In Linux, where we inherited these
names from having imported Linux code which uses them, the SMP barriers are
intended to be sufficient for implementing shared-memory protocols between
processors in an SMP system while the standard barriers are useful for MMIO
etc.
On x86 with the stronger ordering model there is not much practical difference
here but ARM has weaker barriers available which are suitable for use as SMP
barriers.
Therefore ensure that common code uses the SMP barriers when that is all which
is required.
On both ARM and x86 both types of barrier are currently identical so there is
no actual change. A future patch will change smp_mb to a weaker barrier on
ARM.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using tmem with Xen 4.3 (and debug build) we end up with:
(XEN) Xen BUG at domain_page.c:143
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
..
(XEN) Xen call trace:
(XEN) [<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
(XEN) [<ffff82c4c01373de>] cli_get_page+0x15e/0x17b
(XEN) [<ffff82c4c01377c4>] tmh_copy_from_client+0x150/0x284
(XEN) [<ffff82c4c0135929>] do_tmem_put+0x323/0x5c4
(XEN) [<ffff82c4c0136510>] do_tmem_op+0x5a0/0xbd0
(XEN) [<ffff82c4c022391b>] syscall_enter+0xeb/0x145
(XEN)
A bit of debugging revealed that the map_domain_page and unmap_domain_page
are meant for short life-time mappings. And that those mappings are finite.
In the 2 VCPU guest we only have 32 entries and once we have exhausted those
we trigger the BUG_ON condition.
The two functions - tmh_persistent_pool_page_[get,put] are used by the xmem_pool
when xmem_pool_[alloc,free] are called. These xmem_pool_* function are wrapped
in macro and functions - the entry points are via: tmem_malloc
and tmem_page_alloc. In both cases the users are in the hypervisor and they
do not seem to suffer from using the hypervisor virtual addresses.
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Despite the changes below, tmem still has code assuming to be able to
directly access all memory, or mapping arbitrary amounts of not
directly accessible memory. I cannot see how to fix this without
converting _all_ its domheap allocations to xenheap ones. And even then
I wouldn't be certain about there not being other cases where the "all
memory is always mapped" assumption would be broken. Therefore, tmem
gets disabled by the next patch for the time being if the full 1:1
mapping isn't always visible.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: these changes don't make any difference on x86.
Replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when it is used as
an hypercall argument.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
| |
- one more case of checking for a specific rather than any error
- drop no longer needed first parameter from cli_put_page()
- drop a redundant cast
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
|
| |
This implies adjusting callers to deal with errors other than -EFAULT
and removing some comments which would otherwise become stale.
Reported-by: Tim Deegan <tim@xen.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is not permitted, not even for buffers coming from Dom0 (and it
would also break the moment Dom0 runs in HVM mode). An implication from
the changes here is that tmh_copy_page() can't be used anymore for
control operations calling tmh_copy_{from,to}_client() (as those pass
the buffer by virtual address rather than MFN).
Note that tmemc_save_get_next_page() previously didn't set the returned
handle's pool_id field, while the new code does. It need to be
confirmed that this is not a problem (otherwise the copy-out operation
will require further tmh_...() abstractions to be added).
Further note that the patch removes (rather than adjusts) an invalid
call to unmap_domain_page() (no matching map_domain_page()) from
tmh_compress_from_client() and adds a missing one to an error return
path in tmh_copy_from_client().
Finally note that the patch adds a previously missing return statement
to cli_get_page() (without which that function could de-reference a
NULL pointer, triggerable from guest mode).
This is part of XSA-15 / CVE-2012-3497.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
| |
Signed-off-by: Tim Deegan <tim@xen.org>
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
|
|
|
|
|
|
|
|
|
|
|
| |
It retains IA64-specific bits in code imported from elsewhere (e.g.
ACPI, EFI) as well as in the public headers.
It also doesn't touch the tools, mini-os, and unmodified_drivers
sub-trees.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
| |
Include few missing header files; introduce defined(CONFIG_ARM) where
required.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
This undoes a single change from c/s 24136:3622d7fae14d
(common/grant_table.c) and several from c/s 24100:be8daf78856a
(common/memory.c). It also completes the former with two previously
missing ia64 specific code adjustments. Authors Cc-ed.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Callers of lookups into the p2m code are now variants of get_gfn. All
callers need to call put_gfn. The code behind it is a no-op at the
moment, but will change to proper locking in a later patch.
This patch does not change functionality. Only naming, and adds
put_gfn's.
set_p2m_entry retains its name because it is always called with
p2m_lock held.
This patch is humongous, unfortunately, given the dozens of call sites
involved.
After this patch, anyone using old style gfn_to_mfn will not succeed
in compiling their code. This is on purpose: adapt to the new API.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As part of the nested HVM patch series, many p2m functions were changed
to take pointers to p2m tables rather than to domains. This patch
reverses that for almost all of them, which:
- gets rid of a lot of "p2m_get_hostp2m(d)" in code which really
shouldn't have to know anything about how gfns become mfns.
- ties sharing and paging interfaces to a domain, which is
what they actually act on, rather than a particular p2m table.
In developing this patch it became clear that memory-sharing and nested
HVM are unlikely to work well together. I haven't tried to fix that
here beyond adding some assertions around suspect paths (as this patch
is big enough with just the interface changes)
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
|
|
| |
Although one major source of order>0 allocations has been removed,
others still remain, so re-disable tmem until the issue can be fixed
properly.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
| |
... decreasing cache footprint. As a prerequisite this requires making
cmdline_parse() a little more flexible.
Also remove a few variables altogether, and adjust sections
annotations for several others.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
| |
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
| |
Mfns for PV domains were not properly checked, potentially
allowing a buggy or malicious PV guest to crash Xen. Also,
use get_page/put_page to claim a reference to the pages
so they can't disappear out from under tmem's feet.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
| |
Change p2m infrastructure to operate on per-p2m instead of per-domain.
This allows us to use multiple p2m tables per-domain.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Acked-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
must still be tmem-enabled to use this functionality (e.g.
won't work for Windows), but upstream Linux tmem (aka
cleancache and frontswap) patches apply cleanly on top
of PV on HVM patches.
Also, fix up some ASSERTS and code used only when bad guest
mfns are passed to tmem. Previous code could crash Xen
if a buggy/malicious guest passes bad gmfns.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
|
| |
Obtaining a domain reference count is neither necessary nor
sufficient. Instead we simply check whether a domain is already dying
when it first becomes a client of tmem. If it is not then we will
correctly clean up later via tmem_destroy() called from domain_kill().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
| |
The significant remaining culprits for x86 are credit2, hpet, and
percpu-area subsystems. To be dealt with in a separate patch.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Late in the 4.0 release it was discovered that certain order>0
allocations could fail and had no fallback. This conflicted with
tmem especially when combined with aggressive ballooning.
A hack-y workaround patch was added in time for 4.0 that has
reduced (but not completely eliminated) the problem but
tmem was left disabled-by-default for the 4.0 release.
Re-enable it in xen-unstable by default to help identify cases
where the workaround is insufficient. Tmem can be
disabled with the no-tmem Xen boot option. Please report
failures (that are fixed with the no-tmem option) to me.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
trailing-zero-elimination
Add "page deduplication" capability (with optional compression
and trailing-zero elimination) to Xen's tmem.
(Transparent to tmem-enabled guests.) Ephemeral pages
that have the exact same content are "combined" so that only
one page frame is needed. Since ephemeral pages are essentially
read-only, no C-O-W (and thus no equivalent of swapping) is
necessary. Deduplication can be combined with compression
or "trailing zero elimination" for even more space savings.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Besides two unlikely/rarely hit ones in x86 code, the main offender
was tmh_client_from_cli_id(), which didn't even have a counterpart
(albeit it had a comment correctly saying that it causes d->refcnt to
get incremented). Unfortunately(?) this required a bit of code
restructuring (as I needed to change the code anyway, I also fixed
a couple os missing bounds checks which would sooner or later be
reported as security vulnerabilities), so I would hope Dan could give
it his blessing before it gets applied.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
| |
This reverts 20758:4e56f809ddbf and 20655:3c5b5c4c1d79
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
| |
While tmem has gotten limited testing with a 32-bit Xen, it
has severe limitations due to 32-bit heap restrictions.
So, turn it off by default for 32-bit so nobody accidentally
runs into this.
Signed-off by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
|
|
| |
Tmem has been in-tree for about seven months, but disabled
by default. Enabling it should be entirely harmless
unless a running PV domain has been tmem-modified.
I'd like to confirm that by enabling it now, so that
it can be enabled by default for the 4.0.0 release.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
| |
Various warnings appeared since 3.4 - eliminate at least some of them.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
| |
Tmem double-frees a high-level data structure causing memory
corruption under certain circumstances.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
| |
scrubbing"
Fix incorrect page_list macro choice from page-scrub code cleanup.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
| |
Tmem fails to put_domain so a dying domain never gets
properly shut down. Also, fix race condition when
domain is dying by not allowing any new ops to succeed.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a variant of map_domain_page() directly getting passed a
struct page_info * argument, based on the observation that in many
places the argument to this function so far simply was the result of
page_to_mfn(). This is meaningful for the x86-64 case where
map_domain_page() really just is an invocation of mfn_to_virt(), and
hence the combined mfn_to_virt(page_to_mfn()) now represents a
needless round trip conversion compressed -> uncompressed ->
compressed of the MFN representation.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Expose tmem "freeable" memory for use by management tools.
Management tools looking for a machine with available
memory often look at free_memory to determine if there
is enough physical memory to house a new or migrating
guest. Since tmem absorbs much or all free memory,
and since "ephemeral" tmem memory can be synchronously
freed, management tools need more data -- not only how
much memory is "free" but also how much memory is
"freeable" by tmem if tmem is told (via an already
existing tmem hypercall) to relinquish freeable memory.
This patch provides that extra piece of data (in MB).
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Attached patch implements save/restore/migration/livemigration
for transcendent memory ("tmem"). Without this patch, domains
using tmem may in some cases lose data when doing save/restore
or migrate/livemigrate. Also included in this patch is
support for a new (privileged) hypercall for authorizing
domains to share pools; this provides the foundation to
accomodate upstream linux requests for security for shared
pools.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
|
|
|
|
| |
... to be more precise in naming, and also to match Linux.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since mixing data that only gets setup once and then (perhaps
frequently) gets read by remote CPUs with data that the local CPU may
modify (again, perhaps frequently) still causes undesirable cache
protocol related bus traffic, separate the former class of objects
from the latter.
These objects converted here are just picked based on their write-once
(or write-very-rarely) properties; perhaps some more adjustments may
be desirable subsequently. The primary users of the new sub-section
will result from the next patch.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
| |
The original user for this was domain destruction. Now that this is
preemptible all the way back up to dom0 userspace, asynchrony is
better iontroduced at that level, if at all, imo.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
| |
- don't mis-use guest handle for passing an MFN value
- eliminate unnecessary (and misplaced) use of XEN_GUEST_HANDLE_64
- use copy_from_guest() instead of __copy_from_guest() for loading the
argument structure
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
| |
Reset a counter when all tmem pages are released. This
only affects status reporting (as displayed by xm tmem-list
or the just patched xenballoon-monitor) but the incorrectly
reported result is misleading.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|
|
Tmem, when called from a tmem-capable (paravirtualized) guest, makes
use of otherwise unutilized ("fallow") memory to create and manage
pools of pages that can be accessed from the guest either as
"ephemeral" pages or as "persistent" pages. In either case, the pages
are not directly addressible by the guest, only copied to and fro via
the tmem interface. Ephemeral pages are a nice place for a guest to
put recently evicted clean pages that it might need again; these pages
can be reclaimed synchronously by Xen for other guests or other uses.
Persistent pages are a nice place for a guest to put "swap" pages to
avoid sending them to disk. These pages retain data as long as the
guest lives, but count against the guest memory allocation.
Tmem pages may optionally be compressed and, in certain cases, can be
shared between guests. Tmem also handles concurrency nicely and
provides limited QoS settings to combat malicious DoS attempts.
Save/restore and live migration support is not yet provided.
Tmem is primarily targeted for an x86 64-bit hypervisor. On a 32-bit
x86 hypervisor, it has limited functionality and testing due to
limitations of the xen heap. Nearly all of tmem is
architecture-independent; three routines remain to be ported to ia64
and it should work on that architecture too. It is also structured to
be portable to non-Xen environments.
Tmem defaults off (for now) and must be enabled with a "tmem" xen boot
option (and does nothing unless a tmem-capable guest is running). The
"tmem_compress" boot option enables compression which takes about 10x
more CPU but approximately doubles the number of pages that can be
stored.
Tmem can be controlled via several "xm" commands and many interesting
tmem statistics can be obtained. A README and internal specification
will follow, but lots of useful prose about tmem, as well as Linux
patches, can be found at http://oss.oracle.com/projects/tmem .
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
|