path: root/xen/common/tmem_xen.c
Commit log (newest first); each entry shows the subject line followed by (author, date; files changed, lines removed/added).

* use SMP barrier in common code dealing with shared memory protocols (Ian Campbell, 2013-07-04; 1 file, -5/+5)

Xen currently makes no strong distinction between the SMP barriers (smp_mb etc) and the regular barriers (mb etc). In Linux, from which we inherited these names by importing Linux code that uses them, the SMP barriers are intended to be sufficient for implementing shared-memory protocols between processors in an SMP system, while the standard barriers are useful for MMIO etc.

On x86, with its stronger ordering model, there is not much practical difference here, but ARM has weaker barriers available which are suitable for use as SMP barriers. Therefore ensure that common code uses the SMP barriers when that is all that is required.

On both ARM and x86 both types of barrier are currently identical, so there is no actual change. A future patch will change smp_mb to a weaker barrier on ARM.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>

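As an illustrative sketch of where this is heading (the exact ARM instruction choice is an assumption about the future patch, not part of this change), the two flavours can diverge like so:

    /* mb() must also order MMIO, so it uses the heavyweight full-system
     * barrier; smp_mb() only needs to order normal memory between CPUs,
     * so a weaker barrier suffices. */
    #define mb()      asm volatile ("dsb" ::: "memory")
    #define smp_mb()  asm volatile ("dmb" ::: "memory")
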
* tmem: Don't use map_domain_page for long-life-time pages (Konrad Rzeszutek Wilk, 2013-06-13; 1 file, -3/+2)

When using tmem with Xen 4.3 (and debug build) we end up with:

(XEN) Xen BUG at domain_page.c:143
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
..
(XEN) Xen call trace:
(XEN) [<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
(XEN) [<ffff82c4c01373de>] cli_get_page+0x15e/0x17b
(XEN) [<ffff82c4c01377c4>] tmh_copy_from_client+0x150/0x284
(XEN) [<ffff82c4c0135929>] do_tmem_put+0x323/0x5c4
(XEN) [<ffff82c4c0136510>] do_tmem_op+0x5a0/0xbd0
(XEN) [<ffff82c4c022391b>] syscall_enter+0xeb/0x145
(XEN)

A bit of debugging revealed that map_domain_page and unmap_domain_page are meant for short-lifetime mappings, and that those mappings are finite. In a 2-VCPU guest we only have 32 entries, and once we have exhausted those we trigger the BUG_ON condition.

The two functions tmh_persistent_pool_page_[get,put] are used by the xmem_pool when xmem_pool_[alloc,free] are called. These xmem_pool_* functions are wrapped in macros and functions; the entry points are tmem_malloc and tmem_page_alloc. In both cases the users are in the hypervisor and they do not seem to suffer from using the hypervisor virtual addresses.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

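A minimal sketch of the shape of the fix (simplified from the real patch, and assuming the hypervisor's 1:1 mapping covers the allocated page, which is exactly the constraint the 16Tb-support entry below deals with):

    static void *tmh_persistent_pool_page_get(unsigned long size)
    {
        struct page_info *pi = alloc_domheap_pages(NULL, 0, 0);

        if ( pi == NULL )
            return NULL;
        return page_to_virt(pi);    /* was: __map_domain_page(pi),
                                       which consumes a finite slot */
    }
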
* tmem: partial adjustments for x86 16Tb support (Jan Beulich, 2013-01-23; 1 file, -18/+8)

Despite the changes below, tmem still has code that assumes it can directly access all memory, or map arbitrary amounts of not-directly-accessible memory. I cannot see how to fix this without converting _all_ its domheap allocations to xenheap ones. And even then I wouldn't be certain about there not being other cases where the "all memory is always mapped" assumption would be broken.

Therefore, tmem gets disabled by the next patch for the time being if the full 1:1 mapping isn't always visible.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Acked-by: Keir Fraser <keir@xen.org>

* xen: replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when appropriate (Stefano Stabellini, 2012-10-17; 1 file, -4/+4)

Note: these changes don't make any difference on x86.

Replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when it is used as a hypercall argument.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>

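The distinction, sketched (the struct name is hypothetical): handles embedded in ABI structures keep the fixed-width XEN_GUEST_HANDLE type, while handles passed directly in hypercall argument registers use the _PARAM flavour, which may be register-width on non-x86 architectures such as ARM:

    struct ctrl_buf {                     /* hypothetical ABI struct */
        XEN_GUEST_HANDLE(char) buf;       /* embedded field: keep as-is */
    };

    long do_tmem_op(XEN_GUEST_HANDLE_PARAM(tmem_op_t) uops);  /* argument */
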
* xen: Remove x86_32 build target. (Keir Fraser, 2012-09-12; 1 file, -17/+0)

Signed-off-by: Keir Fraser <keir@xen.org>

* tmem: cleanup (Jan Beulich, 2012-09-11; 1 file, -10/+10)

- one more case of checking for a specific rather than any error
- drop the no longer needed first parameter from cli_put_page()
- drop a redundant cast

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* tmem: detect arithmetic overflow in tmh_copy_{from,to}_client() (Jan Beulich, 2012-09-11; 1 file, -0/+8)

This implies adjusting callers to deal with errors other than -EFAULT and removing some comments which would otherwise become stale.

Reported-by: Tim Deegan <tim@xen.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>

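A sketch of the kind of check involved (off and len are assumed names for a page offset and a copy length): bounding each operand individually before testing the sum means the sum itself cannot wrap.

    if ( off > PAGE_SIZE || len > PAGE_SIZE || off + len > PAGE_SIZE )
        return -EINVAL;    /* callers must now expect more than -EFAULT */
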
* tmem: don't access guest memory without using the accessors intended for this (Jan Beulich, 2012-09-11; 1 file, -27/+64)

This is not permitted, not even for buffers coming from Dom0 (and it would also break the moment Dom0 runs in HVM mode).

An implication from the changes here is that tmh_copy_page() can't be used anymore for control operations calling tmh_copy_{from,to}_client() (as those pass the buffer by virtual address rather than MFN).

Note that tmemc_save_get_next_page() previously didn't set the returned handle's pool_id field, while the new code does. It needs to be confirmed that this is not a problem (otherwise the copy-out operation will require further tmh_...() abstractions to be added).

Further note that the patch removes (rather than adjusts) an invalid call to unmap_domain_page() (no matching map_domain_page()) from tmh_compress_from_client() and adds a missing one to an error return path in tmh_copy_from_client().

Finally note that the patch adds a previously missing return statement to cli_get_page() (without which that function could de-reference a NULL pointer, triggerable from guest mode).

This is part of XSA-15 / CVE-2012-3497.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>

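The difference in pattern, sketched (clibuf stands for a guest handle and to for a hypervisor buffer; both names are assumed):

    /* Unsafe: treats a guest-supplied virtual address as mappable. */
    memcpy(to, cli_va, len);

    /* Safe: the accessor validates and maps the guest buffer itself. */
    if ( copy_from_guest(to, clibuf, len) )
        return -EFAULT;
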
* common: Use get_page_from_gfn() instead of get_gfn()/put_gfn. (Tim Deegan, 2012-05-17; 1 file, -16/+10)

Signed-off-by: Tim Deegan <tim@xen.org>
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

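The conversion this commit performs, as a sketch:

    /* Before: lookup and reference as separate steps, with a mandatory
     * put_gfn() on every exit path. */
    mfn = get_gfn(d, gfn, &t);
    /* ... */
    put_gfn(d, gfn);

    /* After: one call that returns an already-referenced page, or NULL. */
    page = get_page_from_gfn(d, gfn, &t, P2M_ALLOC);
    if ( page )
    {
        /* ... */
        put_page(page);
    }
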
* remove ia64 (Jan Beulich, 2012-04-03; 1 file, -1/+1)

It retains IA64-specific bits in code imported from elsewhere (e.g. ACPI, EFI) as well as in the public headers. It also doesn't touch the tools, mini-os, and unmodified_drivers sub-trees.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>

* ia64: fix build (once more) (Jan Beulich, 2012-03-08; 1 file, -1/+2)

Signed-off-by: Jan Beulich <jbeulich@suse.com>

* arm: compile tmem (Stefano Stabellini, 2012-02-09; 1 file, -1/+3)

Include a few missing header files; introduce defined(CONFIG_ARM) where required.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>

* ia64: build fixes (again) (Jan Beulich, 2011-11-24; 1 file, -1/+1)

This undoes a single change from c/s 24136:3622d7fae14d (common/grant_table.c) and several from c/s 24100:be8daf78856a (common/memory.c). It also completes the former with two previously missing ia64-specific code adjustments.

Authors Cc-ed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

* Modify naming of queries into the p2m (Andres Lagar-Cavilla, 2011-11-11; 1 file, -7/+14)

Callers of lookups into the p2m code are now variants of get_gfn. All callers need to call put_gfn. The code behind it is a no-op at the moment, but will change to proper locking in a later patch.

This patch does not change functionality; only naming, and it adds put_gfn's. set_p2m_entry retains its name because it is always called with p2m_lock held.

This patch is humongous, unfortunately, given the dozens of call sites involved. After this patch, anyone using old-style gfn_to_mfn will not succeed in compiling their code. This is on purpose: adapt to the new API.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Keir Fraser <keir@xen.org>

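The required adaptation, sketched:

    /* Old style; no longer compiles after this patch: */
    mfn = gfn_to_mfn(d, gfn, &t);

    /* New style: every lookup pairs with a put_gfn() once done. */
    mfn = get_gfn(d, gfn, &t);
    /* ... use mfn while the gfn is held ... */
    put_gfn(d, gfn);
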
* x86/mm/p2m: Make p2m interfaces take struct domain arguments. (Tim Deegan, 2011-06-02; 1 file, -1/+1)

As part of the nested HVM patch series, many p2m functions were changed to take pointers to p2m tables rather than to domains. This patch reverses that for almost all of them, which:

- gets rid of a lot of "p2m_get_hostp2m(d)" in code which really shouldn't have to know anything about how gfns become mfns.
- ties sharing and paging interfaces to a domain, which is what they actually act on, rather than a particular p2m table.

In developing this patch it became clear that memory-sharing and nested HVM are unlikely to work well together. I haven't tried to fix that here beyond adding some assertions around suspect paths (as this patch is big enough with just the interface changes).

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>

* Disable tmem by default for 4.1 release. (Keir Fraser, 2011-01-19; 1 file, -1/+1)

Although one major source of order>0 allocations has been removed, others still remain, so re-disable tmem until the issue can be fixed properly.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>

* Use bool_t for various boolean variables (Keir Fraser, 2010-12-24; 1 file, -6/+6)

... decreasing cache footprint. As a prerequisite this requires making cmdline_parse() a little more flexible.

Also remove a few variables altogether, and adjust section annotations for several others.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir@xen.org>

* ia64: fix build in grant table and tmem code (Keir Fraser, 2010-10-24; 1 file, -2/+2)

Signed-off-by: Jan Beulich <jbeulich@novell.com>

* tmem: disallow bad gmfns from PV domains (Keir Fraser, 2010-09-22; 1 file, -40/+89)

MFNs for PV domains were not properly checked, potentially allowing a buggy or malicious PV guest to crash Xen. Also, use get_page/put_page to claim a reference to the pages so they can't disappear out from under tmem's feet.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

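A sketch of the validate-and-reference pattern (simplified; names follow the Xen APIs of the era):

    unsigned long mfn = gmfn_to_mfn(cli_domain, gmfn);
    struct page_info *page;

    if ( !mfn_valid(mfn) )
        return NULL;               /* bad gmfn: reject, don't crash */
    page = mfn_to_page(mfn);
    if ( !get_page(page, cli_domain) )
        return NULL;               /* page not owned by this client */
    /* ... tmem uses the page; put_page(page) when finished ... */
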
* Nested Virtualization: p2m infrastructure (Keir Fraser, 2010-08-09; 1 file, -1/+1)

Change p2m infrastructure to operate on per-p2m instead of per-domain. This allows us to use multiple p2m tables per domain.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Acked-by: Tim Deegan <Tim.Deegan@citrix.com>

* Enable tmem functionality for PV on HVM guests (Keir Fraser, 2010-06-21; 1 file, -1/+1)

Guest kernel must still be tmem-enabled to use this functionality (e.g. won't work for Windows), but upstream Linux tmem (aka cleancache and frontswap) patches apply cleanly on top of PV on HVM patches.

Also, fix up some ASSERTs and code used only when bad guest mfns are passed to tmem. Previous code could crash Xen if a buggy/malicious guest passes bad gmfns.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* tmem: Fix domain lifecycle synchronisation. (Keir Fraser, 2010-06-10; 1 file, -1/+1)

Obtaining a domain reference count is neither necessary nor sufficient. Instead we simply check whether a domain is already dying when it first becomes a client of tmem. If it is not, then we will correctly clean up later via tmem_destroy() called from domain_kill().

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>

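The check, sketched:

    /* Refuse to create a tmem client for a dying domain; otherwise
     * tmem_destroy() from domain_kill() reliably cleans up later. */
    if ( d->is_dying )
        return NULL;
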
* Remove many uses of cpu_possible_map and iterators over NR_CPUS. (Keir Fraser, 2010-05-14; 1 file, -19/+71)

The significant remaining culprits for x86 are the credit2, hpet, and percpu-area subsystems. To be dealt with in a separate patch.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>

* tmem: (re-)enable by default (Keir Fraser, 2010-04-19; 1 file, -1/+1)

Late in the 4.0 release it was discovered that certain order>0 allocations could fail and had no fallback. This conflicted with tmem, especially when combined with aggressive ballooning. A hacky workaround patch was added in time for 4.0 that has reduced (but not completely eliminated) the problem, but tmem was left disabled-by-default for the 4.0 release.

Re-enable it in xen-unstable by default to help identify cases where the workaround is insufficient. Tmem can be disabled with the no-tmem Xen boot option. Please report failures (that are fixed with the no-tmem option) to me.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* tmem: add page deduplication with optional compression or trailing-zero-elimination (Keir Fraser, 2010-04-06; 1 file, -3/+30)

Add "page deduplication" capability (with optional compression and trailing-zero elimination) to Xen's tmem. (Transparent to tmem-enabled guests.) Ephemeral pages that have the exact same content are "combined" so that only one page frame is needed. Since ephemeral pages are essentially read-only, no C-O-W (and thus no equivalent of swapping) is necessary. Deduplication can be combined with compression or "trailing zero elimination" for even more space savings.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

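The dedup path, sketched (function and field names hypothetical):

    /* Match by content hash, then share a single frame among ephemeral
     * pages with identical data. */
    pcd = pcd_find(hash_page(data));
    if ( pcd != NULL && !memcmp(pcd->data, data, PAGE_SIZE) )
        pcd->refcount++;           /* identical content: reuse frame */
    else
        pcd = pcd_create(data);    /* first copy: store it, optionally
                                      compressed or with trailing
                                      zeroes elided */
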
* Fix domain reference leaks (Keir Fraser, 2010-02-10; 1 file, -4/+3)

Besides two unlikely/rarely-hit ones in x86 code, the main offender was tmh_client_from_cli_id(), which didn't even have a counterpart (albeit it had a comment correctly saying that it causes d->refcnt to get incremented).

Unfortunately(?) this required a bit of code restructuring (as I needed to change the code anyway, I also fixed a couple of missing bounds checks which would sooner or later be reported as security vulnerabilities), so I would hope Dan could give it his blessing before it gets applied.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

* tmem: Disable by default: enable with Xen boot param 'tmem' (Keir Fraser, 2010-02-10; 1 file, -4/+0)

This reverts 20758:4e56f809ddbf and 20655:3c5b5c4c1d79.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>

* tmem: Only enable by default for x86_64 (Keir Fraser, 2010-01-06; 1 file, -0/+4)

While tmem has gotten limited testing with a 32-bit Xen, it has severe limitations due to 32-bit heap restrictions. So, turn it off by default for 32-bit so nobody accidentally runs into this.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* Turn tmem (transcendent memory) support on by default. (Keir Fraser, 2009-12-16; 1 file, -1/+1)

Tmem has been in-tree for about seven months, but disabled by default. Enabling it should be entirely harmless unless a running PV domain has been tmem-modified. I'd like to confirm that by enabling it now, so that it can be enabled by default for the 4.0.0 release.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* ia64: eliminate build warnings (Keir Fraser, 2009-11-30; 1 file, -0/+1)

Various warnings appeared since 3.4 - eliminate at least some of them.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

* tmem: fix double-free bug (Keir Fraser, 2009-11-23; 1 file, -2/+1)

Tmem double-frees a high-level data structure, causing memory corruption under certain circumstances.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* tmem: fix regression from c/s 19886 "Remove page-scrub lists and async scrubbing" (Keir Fraser, 2009-11-23; 1 file, -2/+3)

Fix incorrect page_list macro choice from the page-scrub code cleanup.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* tmem: fix domain shutdown problem/race (Keir Fraser, 2009-11-14; 1 file, -0/+1)

Tmem fails to put_domain, so a dying domain never gets properly shut down. Also, fix a race condition when a domain is dying by not allowing any new ops to succeed.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* Introduce new flavour of map_domain_page() (Keir Fraser, 2009-09-22; 1 file, -1/+1)

Introduce a variant of map_domain_page() directly getting passed a struct page_info * argument, based on the observation that in many places the argument to this function so far simply was the result of page_to_mfn(). This is meaningful for the x86-64 case, where map_domain_page() really just is an invocation of mfn_to_virt(), and hence the combined mfn_to_virt(page_to_mfn()) now represents a needless round-trip conversion compressed -> uncompressed -> compressed of the MFN representation.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

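A sketch of how such a flavour can be expressed:

    /* Takes the page directly; the implementation is then free to avoid
     * the compressed -> uncompressed -> compressed MFN round trip. */
    #define __map_domain_page(pg)  map_domain_page(page_to_mfn(pg))
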
* tmem: expose freeable memory (Keir Fraser, 2009-08-10; 1 file, -0/+2)

Expose tmem "freeable" memory for use by management tools.

Management tools looking for a machine with available memory often look at free_memory to determine if there is enough physical memory to house a new or migrating guest. Since tmem absorbs much or all free memory, and since "ephemeral" tmem memory can be synchronously freed, management tools need more data -- not only how much memory is "free" but also how much memory is "freeable" by tmem if tmem is told (via an already existing tmem hypercall) to relinquish freeable memory. This patch provides that extra piece of data (in MB).

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

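A management tool's placement test then becomes (a sketch with hypothetical names):

    /* Memory tmem can synchronously relinquish counts as available. */
    static int host_can_fit(unsigned long free_mb, unsigned long freeable_mb,
                            unsigned long guest_mb)
    {
        return free_mb + freeable_mb >= guest_mb;
    }
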
* tmem: save/restore/migrate/livemigrate and shared pool authentication (Keir Fraser, 2009-08-06; 1 file, -18/+32)

Attached patch implements save/restore/migration/livemigration for transcendent memory ("tmem"). Without this patch, domains using tmem may in some cases lose data when doing save/restore or migrate/livemigrate.

Also included in this patch is support for a new (privileged) hypercall for authorizing domains to share pools; this provides the foundation to accommodate upstream Linux requests for security for shared pools.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* Rename for_each_cpu() to for_each_possible_cpu() (Keir Fraser, 2009-07-15; 1 file, -1/+1)

... to be more precise in naming, and also to match Linux.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

* Introduce and use a per-CPU read-mostly sub-section (Keir Fraser, 2009-07-13; 1 file, -2/+2)

Since mixing data that only gets set up once and then (perhaps frequently) gets read by remote CPUs with data that the local CPU may modify (again, perhaps frequently) still causes undesirable cache-protocol-related bus traffic, separate the former class of objects from the latter.

The objects converted here are just picked based on their write-once (or write-very-rarely) properties; perhaps some more adjustments may be desirable subsequently. The primary users of the new sub-section will result from the next patch.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

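Usage, sketched (dstmem is modeled on tmem's per-CPU scratch buffer; the second declaration is hypothetical, for contrast):

    /* Written once at init, read by many CPUs afterwards. */
    static DEFINE_PER_CPU_READ_MOSTLY(unsigned char *, dstmem);

    /* Modified frequently by its owning CPU: keep in the plain per-CPU
     * section so it doesn't share cache lines with the above. */
    static DEFINE_PER_CPU(unsigned long, op_count);
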
* Remove page-scrub lists and async scrubbing. (Keir Fraser, 2009-07-02; 1 file, -4/+6)

The original user for this was domain destruction. Now that this is preemptible all the way back up to dom0 userspace, asynchrony is better introduced at that level, if at all, imo.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>

* tmem: cleanups (Keir Fraser, 2009-06-16; 1 file, -4/+1)

- don't mis-use a guest handle for passing an MFN value
- eliminate unnecessary (and misplaced) use of XEN_GUEST_HANDLE_64
- use copy_from_guest() instead of __copy_from_guest() for loading the argument structure

Signed-off-by: Jan Beulich <jbeulich@novell.com>

* tmem: fix minor accounting error (Keir Fraser, 2009-06-16; 1 file, -0/+1)

Reset a counter when all tmem pages are released. This only affects status reporting (as displayed by xm tmem-list or the just-patched xenballoon-monitor), but the incorrectly reported result is misleading.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

* Transcendent memory ("tmem") for Xen. (Keir Fraser, 2009-05-26; 1 file, -0/+334)

Tmem, when called from a tmem-capable (paravirtualized) guest, makes use of otherwise unutilized ("fallow") memory to create and manage pools of pages that can be accessed from the guest either as "ephemeral" pages or as "persistent" pages. In either case, the pages are not directly addressable by the guest; they are only copied to and fro via the tmem interface. Ephemeral pages are a nice place for a guest to put recently evicted clean pages that it might need again; these pages can be reclaimed synchronously by Xen for other guests or other uses. Persistent pages are a nice place for a guest to put "swap" pages to avoid sending them to disk. These pages retain data as long as the guest lives, but count against the guest's memory allocation.

Tmem pages may optionally be compressed and, in certain cases, can be shared between guests. Tmem also handles concurrency nicely and provides limited QoS settings to combat malicious DoS attempts. Save/restore and live-migration support is not yet provided.

Tmem is primarily targeted for an x86 64-bit hypervisor. On a 32-bit x86 hypervisor, it has limited functionality and testing due to limitations of the xen heap. Nearly all of tmem is architecture-independent; three routines remain to be ported to ia64 and it should work on that architecture too. It is also structured to be portable to non-Xen environments.

Tmem defaults off (for now) and must be enabled with a "tmem" xen boot option (and does nothing unless a tmem-capable guest is running). The "tmem_compress" boot option enables compression, which takes about 10x more CPU but approximately doubles the number of pages that can be stored.

Tmem can be controlled via several "xm" commands and many interesting tmem statistics can be obtained. A README and internal specification will follow, but lots of useful prose about tmem, as well as Linux patches, can be found at http://oss.oracle.com/projects/tmem .

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>

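Conceptually, a guest uses the two pool types like this (function names hypothetical; the real interface is a single tmem hypercall taking an operation structure):

    /* Ephemeral: Xen may discard the page at any time, so a later get
     * can legitimately miss and the guest must keep its disk path. */
    tmem_put_page(eph_pool, oid, index, pfn);
    if ( tmem_get_page(eph_pool, oid, index, pfn) != 0 )
        read_from_disk(pfn);

    /* Persistent: retained for the guest's lifetime and counted against
     * its allocation, so a "swapped" page need not touch the disk. */
    tmem_put_page(pers_pool, oid, index, pfn);
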