xen/xen - xen

	Commit message (Collapse)	Author	Age	Files	Lines
*	Add DOMCTL to limit the number of event channels a domain may use	David Vrabel	2013-10-14	1	-0/+13
	Add XEN_DOMCTL_set_max_evtchn which may be used during domain creation to set the maximum event channel port a domain may use. This may be used to limit the amount of Xen resources (global mapping space and xenheap) that a domain may use for event channels. A domain that does not have a limit set may use all the event channels supported by the event channel ABI in use. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Keir Fraser <keir@xen.org>
*	xen: allow for explicitly specifying node-affinity	Dario Faggioli	2013-04-17	1	-0/+13
	Make it possible to pass the node-affinity of a domain to the hypervisor from the upper layers, instead of always being computed automatically. Note that this also required generalizing the Flask hooks for setting and getting the affinity, so that they now deal with both vcpu and node affinity. Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com> Acked-by: Keir Fraser <keir@xen.org>
*	xen, libxc: rename xenctl_cpumap to xenctl_bitmap	Dario Faggioli	2013-04-17	1	-1/+1
	More specifically: 1. replaces xenctl_cpumap with xenctl_bitmap; 2. provides bitmap_to_xenctl_bitmap and the reverse; 3. re-implements cpumask_to_xenctl_bitmap with bitmap_to_xenctl_bitmap and the reverse. Other than #3, no functional changes. Interface only slightly affected. This is in preparation for introducing NUMA node-affinity maps. Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com> Acked-by: Keir Fraser <keir@xen.org>
*	mmu: Introduce XENMEM_claim_pages (subop of memory ops)	Dan Magenheimer	2013-03-11	1	-1/+2
	When guests' memory consumption is volatile (multiple guests ballooning up/down) we are presented with the problem of being able to determine exactly how much memory there is for allocation of new guests without negatively impacting existing guests. Note that the existing models (xapi, xend) drive the memory consumption from the tool-stack and assume that the guest will eventually hit the memory target. Other models, such as the dynamic memory utilized by tmem, do this differently - the guest drives the memory consumption (up to the d->max_pages ceiling). With the dynamic memory model, the guest frequently can balloon up and down as it sees fit. This presents the problem to the toolstack that it does not know atomically how much free memory there is (as the information gets stale the moment the d->tot_pages information is provided to the tool-stack), and hence starting a guest can fail during the memory creation process, especially if the process is done in parallel. In a nutshell what we need is an atomic value of all domains' tot_pages during the allocation of guests. Naturally holding a lock for such a long time is unacceptable. Hence the goal of this hypercall is to attempt to atomically and very quickly determine if there are sufficient pages available in the system and, if so, "set aside" that quantity of pages for future allocations by that domain.
Unlike an existing hypercall such as increase_reservation or populate_physmap, specific physical pageframes are not assigned to the domain because this cannot be done sufficiently quickly (especially for very large allocations in an arbitrarily fragmented system) and so the existing mechanisms result in classic time-of-check-time-of-use (TOCTOU) races. One can think of claiming as similar to a "lazy" allocation, but subsequent hypercalls are required to do the actual physical pageframe allocation. Note that one of the effects of this hypercall is that from the perspective of other running guests - suddenly there is a new guest occupying X amount of pages. This means that when they try to balloon up they will hit the system-wide ceiling of available free memory (if the total sum of the existing d->max_pages >= host memory). This is OK - as that is part of the overcommit. What we DO NOT want to do is dictate what their ceiling should be (d->max_pages) as that is risky and can lead to guests OOM-ing. It is something the guest needs to figure out. In order for a toolstack to "get" information about whether a domain has a claim and, if so, how large, and also for the toolstack to measure the total system-wide claim, a second subop has been added and exposed through domctl and libxl (see "xen: XENMEM_claim_pages: xc"). == Alternative solutions == There has been a variety of discussion whether the problem this hypercall is solving can be solved in user-space, such as: - For all the existing guests, set their d->max_pages temporarily to d->tot_pages and create the domain. This forces those domains to stay at their current consumption level (fyi, this is what the tmem freeze call is doing). The disadvantage of this is that it needlessly forces the guests to stay at their current memory usage instead of allowing them to decide the optimal target. - Account only using d->max_pages of how much free memory there is. This ignores ballooning changes and any over-commit scenario.
This is similar to the scenario where the sum of all d->max_pages (and the one to be allocated now) on the host is smaller than the available free memory. As such it ignores the over-commit problem. - Provide a ring/FIFO along with an event channel to notify a userspace daemon of guests' memory consumption. This daemon can then provide up-to-date information to the toolstack of how much free memory there is. This duplicates what the hypervisor is already doing and introduces latency issues and leaves the toolstack little time to catch its breath, as there might be millions of these updates on a heavily used machine. There might not be any quiescent state ever and the toolstack will heavily consume CPU cycles and not ever provide up-to-date information. It has been noted that this claim mechanism solves the underlying problem (slow failure of domain creation) for a large class of domains but not all, specifically not handling (but also not making the problem worse for) PV domains that specify the "superpages" flag, and 32-bit PV domains on large RAM systems. These will be addressed at a later time. Code overview: Though the hypercall simply does arithmetic within locks, some of the semantics in the code may be a bit subtle. The key variables (d->unclaimed_pages and total_unclaimed_pages) start at zero if no claim has yet been staked for any domain. (Perhaps a better name is "claimed_but_not_yet_possessed" but that's a bit unwieldy.) If no claim hypercalls are executed, there should be no impact on existing usage. When a claim is successfully staked by a domain, it is like a watermark but there is no record kept of the size of the claim. Instead, d->unclaimed_pages is set to the difference between d->tot_pages and the claim. When d->tot_pages increases or decreases, d->unclaimed_pages atomically decreases or increases. Once d->unclaimed_pages reaches zero, the claim is satisfied and d->unclaimed_pages stays at zero -- unless a new claim is subsequently staked.
The systemwide variable total_unclaimed_pages is always the sum of d->unclaimed_pages, across all domains. A non-domain-specific heap allocation will fail if total_unclaimed_pages exceeds free (plus, on tmem enabled systems, freeable) pages. Claim semantics could be modified by flags. The initial implementation had three flags, which discerned whether the caller would like tmem freeable pages to be considered in determining whether or not the claim can be successfully staked. These were removed in later patches and there are now no flags. A claim can be cancelled by requesting a claim with the number of pages being zero. A second subop returns the total outstanding claimed pages systemwide. Note: Save/restore/migrate may need to be modified, else it can be documented that all claims are cancelled. This patch of the proposed XENMEM_claim_pages hypercall/subop takes into account review feedback from Jan and Keir and IanC and Matthew Daley, plus some fixes found via runtime debugging. Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Keir Fraser <keir@xen.org>
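The claim accounting described in the message above can be sketched roughly as follows. This is a toy model, not the hypervisor code: the struct, the toy_ function names, and the free_pages counter are hypothetical stand-ins, locking is omitted, and tmem freeable pages are ignored. It only illustrates the bookkeeping of d->unclaimed_pages and total_unclaimed_pages as the message describes it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for per-domain state. */
struct toy_domain {
    unsigned long tot_pages;        /* pages the domain currently owns */
    unsigned long max_pages;        /* the domain's ceiling */
    unsigned long unclaimed_pages;  /* claimed but not yet possessed */
};

/* Hypothetical system-wide counters (the real code guards these with locks). */
static unsigned long total_unclaimed_pages;
static unsigned long free_pages;

/* Stake a claim for 'pages' total pages; a claim of zero cancels. */
static bool toy_claim_pages(struct toy_domain *d, unsigned long pages)
{
    /* Claims held by all other domains. */
    unsigned long others = total_unclaimed_pages - d->unclaimed_pages;
    unsigned long need;

    assert(total_unclaimed_pages >= d->unclaimed_pages);

    if (pages == 0) {
        total_unclaimed_pages = others;
        d->unclaimed_pages = 0;
        return true;
    }
    if (pages > d->max_pages || pages < d->tot_pages)
        return false;

    /* Only the difference from what is already owned is recorded. */
    need = pages - d->tot_pages;
    if (others > free_pages || need > free_pages - others)
        return false;

    d->unclaimed_pages = need;
    total_unclaimed_pages = others + need;
    return true;
}

/* Allocate n pages to d; allocations draw down d's own claim first. */
static bool toy_alloc_pages(struct toy_domain *d, unsigned long n)
{
    unsigned long used = n < d->unclaimed_pages ? n : d->unclaimed_pages;

    if (n > free_pages || d->tot_pages + n > d->max_pages)
        return false;
    /* What is left afterwards must still cover every outstanding claim. */
    if (free_pages - n < total_unclaimed_pages - used)
        return false;

    free_pages -= n;
    d->tot_pages += n;
    d->unclaimed_pages -= used;
    total_unclaimed_pages -= used;
    return true;
}
```

In this model a domain with a staked claim can always allocate up to its claim, while allocations by other domains fail once they would eat into claimed memory, which is the fast-failure behaviour the message is after.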
*	Fix emacs local variable block to use correct C style variable.	David Vrabel	2013-02-21	1	-1/+1
	The emacs variable to set the C style from a local variable block is c-file-style, not c-set-style. Signed-off-by: David Vrabel <david.vrabel@citrix.com>
*	X86/vMCE: handle broken page with regard to migration	Liu Jinsong	2012-12-06	1	-0/+9
	At the sender: xc_domain_save has a key point: 'to query the types of all the pages with xc_get_pfn_type_batch'. 1) If a broken page occurs before the key point, migration will be fine, since the proper pfn_type and pfn number will be transferred to the target, which then takes appropriate action; 2) if a broken page occurs after the key point, the whole system will crash and there is no need to care about migration any more. At the target: the target will populate pages for the guest. In the case of a broken page, we prefer to keep the type of the page for the sake of seamless migration. The target will set the p2m type to p2m_ram_broken for a broken page. If the guest accesses the broken page again it will kill itself as expected. Suggested-by: George Dunlap <george.dunlap@eu.citrix.com> Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Ian Campbell <ian.campbell@citrix.com>
*	tools: Synchronize privcmd header constants	Andres Lagar-Cavilla	2012-11-12	1	-1/+0
	Since Linux's git commit ceb90fa0a8008059ecbbf9114cb89dc71a730bb6, the privcmd.h interface between Linux and libxc specifies two new constants, PRIVCMD_MMAPBATCH_MFN_ERROR and PRIVCMD_MMAPBATCH_PAGED_ERROR. These constants represent the error codes encoded in the top nibble of an mfn slot passed to the legacy MMAPBATCH ioctl. In particular, libxenctrl checks for the equivalent of the latter constant when dealing with paged out frames that might be the target of a foreign map. Previously, the relevant constant was defined in the domctl hypervisor interface header (XEN_DOMCTL_PFINFO_PAGEDTAB). Because this top-nibble encoding is a contract between the dom0 kernel and libxc, a domctl.h definition is misplaced. - Sync the privcmd.h header to that now available in upstream Linux - Update libxc appropriately - Remove the unnecessary constant in domctl.h Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Ian Campbell <ian.campbelL@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
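As a rough illustration of the top-nibble encoding the message above refers to, the sketch below classifies a 32-bit frame slot by its top four bits. The TOY_ names are hypothetical, and the two error values (0xf and 0x8 in the top nibble) are an assumption about what the privcmd.h constants look like, so check the real header before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values are assumed, not copied from privcmd.h. */
#define TOY_NIBBLE_SHIFT 28
#define TOY_NIBBLE_MASK  (0xfU << TOY_NIBBLE_SHIFT)
#define TOY_MFN_ERROR    (0xfU << TOY_NIBBLE_SHIFT)  /* mapping failed */
#define TOY_PAGED_ERROR  (0x8U << TOY_NIBBLE_SHIFT)  /* frame is paged out */

enum toy_slot_status { TOY_SLOT_OK, TOY_SLOT_PAGED, TOY_SLOT_ERROR };

/* Tag a frame number with an error code in the top nibble. */
static uint32_t toy_mark_slot(uint32_t frame, uint32_t code)
{
    /* The frame number itself must fit below the nibble. */
    assert((frame & TOY_NIBBLE_MASK) == 0);
    return frame | code;
}

/* Decide what the top nibble of a returned slot says about the mapping. */
static enum toy_slot_status toy_classify_slot(uint32_t slot)
{
    uint32_t top = slot & TOY_NIBBLE_MASK;

    if (top == TOY_MFN_ERROR)
        return TOY_SLOT_ERROR;
    if (top == TOY_PAGED_ERROR)
        return TOY_SLOT_PAGED;
    return TOY_SLOT_OK;
}

/* Recover the frame number by clearing the top nibble. */
static uint32_t toy_slot_frame(uint32_t slot)
{
    return slot & ~TOY_NIBBLE_MASK;
}
```

A caller in this model would retry slots classified as paged out and give up on slots classified as errors; because only the kernel and the library need to agree on these bits, the constants belong in their shared header rather than in the hypervisor interface.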
*	fix inclusion style in public/domctl.h	Jan Beulich	2012-10-04	1	-1/+1
	Public headers should include one another only via self-relative include directives (violated by 25955:07d0d5b3a005). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: vMCE save and restore	Liu, Jinsong	2012-09-26	1	-1/+9
	This patch provides vMCE save/restore on migration. 1. MCG_CAP is well-defined. However, considering future cap extension, we keep the save/restore logic that Jan implemented at c/s 24887; 2. MCi_CTL2 is initialized by the guest OS when booting, so it needs save/restore, otherwise the guest would be surprised; 3. Other MSRs do not need save/restore since they are either error-related and pointless to save/restore, or unified among all vMCE platforms. Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> - fix handling of partial data in XEN_DOMCTL_set_ext_vcpucontext - fix adjustment of xen_domctl_ext_vcpucontext Signed-off-by: Jan Beulich <jbeulich@suse.com> Committed-by: Jan Beulich <jbeulich@suse.com>
*	tools: drop ia64 support	Ian Campbell	2012-09-12	1	-24/+1
	Removed support from libxc and mini-os. This also took me under xen/include/public via various symlinks. Dropped tools/debugger/xenitp entirely, it was described upon commit as: "Xenitp is a low-level debugger for ia64" and doesn't appear to be linked into the build anywhere. 99 files changed, 14 insertions(+), 32361 deletions(-) Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Committed-by: Ian Campbell <ian.campbell@citrix.com>
*	domctl.h: document non-standard error codes for enabling paging/access	Olaf Hering	2012-04-03	1	-0/+12
	The domctl to enable paging and access returns some non-standard error codes after failure. This can be used in the tools to print specific error messages. xenpaging recognizes these errno values and shows them if the init function fails. Document the return codes in the public header file. Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
*	x86/mm: wire up sharing ring	Tim Deegan	2012-03-08	1	-1/+19
	Now that we have an interface close to finalizing, do the necessary plumbing to set up a ring for reporting failed allocations in the unshare path. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
*	Use a reserved pfn in the guest address space to store mem event rings	Tim Deegan	2012-03-08	1	-1/+0
	This solves a long-standing issue in which the pages backing these rings were pages belonging to dom0 user-space processes. Thus, if the process would die unexpectedly, Xen would keep posting events to a page now belonging to some other process. We update all API-consumers in tree (xenpaging and xen-access). This is an API/ABI change, so please speak up if it breaks your assumptions. The patch touches tools, hypervisor x86/hvm bits, and hypervisor x86/mm bits. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Tim Deegan <tim@xen.org>
*	Tools: Remove shared page from mem_event/access/paging interfaces	Tim Deegan	2012-03-08	1	-2/+2
	Don't use the superfluous shared page, return the event channel directly as part of the domctl struct, instead. In-tree consumers (xenpaging, xen-access) updated. This is an ABI/API change, so please voice any concerns. Known pending issues: - pager could die and its ring page could be used by some other process, yet Xen retains the mapping to it. - use a saner interface for the paging_load buffer. This change also affects the x86/mm bits in the hypervisor that process the mem_event setup domctl. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Tim Deegan <tim@xen.org>
*	x86/vMCE: save/restore MCA capabilities	Jan Beulich	2012-02-24	1	-1/+2
	This allows migration to a host with fewer MCA banks than the source host had, while without this patch accesses to the excess banks' MSRs caused #GP-s in the guest after migration (and it depended on the guest kernel whether this would be fatal). A fundamental question is whether we should also save/restore MCG_CTL and MCi_CTL, as the HVM save record would better be defined to cover the complete state that needs saving from the beginning (I'm unsure whether the save/restore logic allows for future extension of an existing record). Of course, this change is expected to make migration from new to older Xen impossible (again I'm unsure what the save/restore logic does with records it doesn't even know about). The (trivial) tools side change may seem unrelated, but the code should have been that way from the beginning to allow the hypervisor to look at currently unused ext_vcpucontext fields without risking to read garbage when those fields get a meaning assigned in the future. This isn't being enforced here - should it be? (Obviously, for backwards compatibility, the hypervisor must assume these fields to be clear only when the extended context's size exceeds the old original one.) A future addition to this change might be to allow configuration of the number of banks and other MCA capabilities for a guest before it starts (i.e. to not inherit the values seen on the first host it runs on). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	Use memops for mem paging, sharing, and access, instead of domctls	Andres Lagar-Cavilla	2012-02-10	1	-71/+19
	Per page operations in the paging, sharing, and access tracking subsystems are all implemented with domctls (e.g. a domctl to evict one page, or to share one page). Under heavy load, the domctl path reveals a lack of scalability. The domctl lock serializes dom0's vcpus in the hypervisor. When performing thousands of per-page operations on dozens of domains, these vcpus will spin in the hypervisor. Beyond the aggressive locking, an added inefficiency of blocking vcpus in the domctl lock is that dom0 is prevented from re-scheduling any of its other work-starved processes. We retain the domctl interface for setting up and tearing down paging/sharing/mem access for a domain. But we migrate all the per page operations to use the memory_op hypercalls (e.g. XENMEM_*). Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla> Signed-off-by: Adin Scannell <adin@scannell.ca> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Tim Deegan <tim@xen.org>
*	xen: allow global VIRQ handlers to be delegated to other domains	Daniel De Graaf	2012-01-28	1	-0/+8
	This patch sends global VIRQs to a domain designated as the VIRQ handler instead of sending all global VIRQ events to dom0. This is required in order to run xenstored in a stubdom, because VIRQ_DOM_EXC must be sent to xenstored for domain destruction to work properly. This patch was inspired by the xenstored stubdomain patch series sent to xen-devel by Alex Zeffertt in 2009. Signed-off-by: Diego Ongaro <diego.ongaro@citrix.com> Signed-off-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Keir Fraser <keir@xen.org>
*	x86/mm: New domctl: add a shared page to the physmap	Andres Lagar-Cavilla	2012-01-26	1	-1/+2
	This domctl is useful to, for example, populate parts of a domain's physmap with shared frames, directly. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Adin Scannell <adin@scannell.ca> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
*	x86/mm: Update mem sharing interface to (re)allow sharing of grants	Andres Lagar-Cavilla	2012-01-26	1	-0/+13
	Previously, the mem sharing code would return an opaque handle to index shared pages (and nominees) in its global hash table. By removing the hash table, the new interface requires a gfn and a version. However, when sharing grants, the caller provides a grant ref and a version. Update interface to handle this case. The use case for grant sharing is when sharing from within a backend (e.g. memshr + blktap2), in which case the backend is only exposed to grant references. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Adin Scannell <adin@scannell.ca> Committed-by: Tim Deegan <tim@xen.org> Acked-by: Tim Deegan <tim@xen.org>
*	x86/mm: Eliminate hash table in sharing code as index of shared mfns	Andres Lagar-Cavilla	2012-01-26	1	-0/+3
	Eliminate the sharing hash table mechanism by storing a list head directly in the page info for the case when the page is shared. This does not add any extra space to the page_info and serves to remove significant complexity from sharing. Signed-off-by: Adin Scannell <adin@scannell.ca> Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
*	x86/mm: remove 0x55 debug pattern from M2P table	Tim Deegan	2011-12-02	1	-2/+1
	It's not really any more useful than explicitly setting new M2P entries to the invalid value. Signed-off-by: Tim Deegan <tim@xen.org> Committed-by: Keir Fraser <keir@xen.org>
*	After preparing a page for page-in, allow immediate fill-in of the page contents	Andres Lagar-Cavilla	2011-12-01	1	-2/+6
	p2m_mem_paging_prep ensures that an mfn is backing the paged-out gfn, and transitions to the next state in the paging state machine for that page. Foreign mappings of the gfn will now succeed. This is the key idea, as it allows the pager to now map the gfn and fill in its contents. Unfortunately, it also allows any other foreign mapper to map the gfn and read its contents. This is particularly dangerous when the populate is launched by a foreign mapper in the first place, which will be actively retrying the map operation and might race with the pager. Qemu-dm is a prime example. Fix the race by allowing a buffer to be optionally passed in the prep operation, and having the hypervisor memcpy from that buffer into the newly prepped page before promoting the gfn type. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
*	x86/mm: Rework stale p2m auditing	Andres Lagar-Cavilla	2011-12-01	1	-0/+12
	The p2m audit code doesn't even compile, let alone work. It also partially supports ept. Make it: - compile - lay groundwork for eventual ept support - move out of the way of all calls and turn it into a domctl. It's obviously not being used by anybody presently. - enable it via said domctl Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Signed-off-by: Tim Deegan <tim@xen.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
*	xenpaging: track number of paged pages in struct domain	Olaf Hering	2011-09-26	1	-1/+2
	The toolstack should know how many pages are paged-out at a given point in time so it could make smarter decisions about how many pages should be paged or ballooned. Add a new member to xen_domctl_getdomaininfo and bump interface version. Use the new member in xc_dominfo_t. The SONAME of libxc should be changed if this patch gets applied. Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
*	PCI multi-seg: adjust domctl interface	Jan Beulich	2011-09-22	1	-3/+3
	Again, a couple of directly related functions at once get adjusted to account for the segment number. Signed-off-by: Jan Beulich <jbeulich@suse.com>
*	mem_event: use different ringbuffers for share, paging and access	Olaf Hering	2011-09-16	1	-20/+23
	Up to now a single ring buffer was used for mem_share, xenpaging and xen-access. Each helper would have to cooperate and pull only its own requests from the ring. Unfortunately this was not implemented. And even if it was, it would make the whole concept fragile because a crash or early exit of one helper would stall the others. What happened up to now is that active xenpaging + memory_sharing would push memsharing requests in the buffer. xenpaging is not prepared for such requests. This patch creates an independent ring buffer for mem_share, xenpaging and xen-access and also adds new functions to enable xenpaging and xen-access. The xc_mem_event_enable/xc_mem_event_disable functions will be removed. The various XEN_DOMCTL_MEM_EVENT_* macros were cleaned up. Due to the removal the API changed, so the SONAME will be changed too. Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Tim Deegan <tim@xen.org>
*	tools: Introduce "allocate-only" page type for migration	George Dunlap	2011-05-26	1	-0/+1
	To detect presence of superpages on the receiver side, we need to have strings of sequential pfns sent across on the first iteration through the memory. However, as we go through the memory, more and more of it will be marked dirty, making it wasteful to send those pages. This patch introduces a new PFINFO type, "XALLOC". Like PFINFO_XTAB, it indicates that there is no corresponding page present in the subsequent page buffer. However, unlike PFINFO_XTAB, it contains a pfn which should be allocated. This new type is only used for migration; but it's placed in xen/public/domctl.h so that the value isn't reused. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson.citrix.com>
*	mem_access: access listener can be required	Joe Epstein	2011-01-07	1	-0/+12
	* Adds the ability to specify that a domain requires an access listener; that is, the VCPU is paused if there is no memory event listener. Signed-off-by: Joe Epstein <jepstein98@gmail.com> Acked-by: Keir Fraser <keir@xen.org>
*	mem_access: mem event additions for access	Joe Epstein	2011-01-07	1	-1/+14
	* Adds an ACCESS memory event type, with RESUME as the action. * Refactors the bits in the memory event to store whether the memory event was a read, write, or execute (for access memory events only). I used bits sparingly to keep the structure somewhat the same size. * Modified VMX to report the needed information in its nested page fault. SVM is not implemented in this patch series. Signed-off-by: Joe Epstein <jepstein98@gmail.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Tim Deegan <Tim.Deegan@citrix.com>
*	ARINC 653 scheduler	Keir Fraser	2010-12-01	1	-0/+1
	From: Josh Holtrop <Josh.Holtrop@dornerworks.com> Signed-off-by: Keir Fraser <keir@xen.org> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
*	x86: xsave save/restore support for both PV and HVM guests (version 2)	Keir Fraser	2010-11-08	1	-0/+28
	I have tested the patch in the following scenarios: 1> Non-xsave platform 2> Xsave-capable platform, guest does not support xsave, Xen supports xsave 3> Xsave-capable platform, guest does support xsave, Xen supports xsave 4> Guest (non-xsave) saved on platform without xsave, restored on an xsave-capable system. All passed. Signed-off-by: Shan Haitao <haitao.shan@intel.com> Signed-off-by: Han Weidong <weidong.han@intel.com>
*	Revert 22347:16093532f384 "x86: xsave save/restore support"	Keir Fraser	2010-11-04	1	-28/+0
	Completely broken when xsave is not enabled or supported on the host. Signed-off-by: Keir Fraser <keir@xen.org>
*	x86: xsave save/restore support for both PV and HVM guests.	Keir Fraser	2010-11-03	1	-0/+28
	Signed-off-by: Shan Haitao <haitao.shan@intel.com> Signed-off-by: Han Weidong <weidong.han@intel.com>
*	x86: add CMCI software injection interface	Keir Fraser	2010-06-09	1	-5/+0
	A new command is added. The user can set the target CPU map, since the CMCI can be triggered on some specific CPUs. Please note that the xenctl_cpumap structure is moved from domctl.h to xen.h. Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
*	cpupool: Control interface should be a sysctl rather than a domctl.	Keir Fraser	2010-05-04	1	-27/+0
	Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
*	cpupools [1/6]: hypervisor changes	Keir Fraser	2010-04-21	1	-2/+29
	Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
*	credit2: Add credit2 scheduler to hypervisor	Keir Fraser	2010-04-14	1	-0/+4
	This is the core credit2 patch. It adds the new credit2 scheduler to the hypervisor, as the non-default scheduler. It should be emphasized that this is still in the development phase, and is probably still unstable. It is known to be suboptimal for multi-socket systems. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
*	hvm: Add ACPI fixed sleep button	Keir Fraser	2010-01-20	1	-0/+1
	Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
*	x86: add and use XEN_DOMCTL_getpageframeinfo3	Keir Fraser	2010-01-13	1	-0/+10
	To support wider than 28-bit MFNs, add XEN_DOMCTL_getpageframeinfo3 (with the type replacing the passed in MFN rather than getting or-ed into it) to properly back xc_get_pfn_type_batch(). With xc_get_pfn_type_batch() only used internally to libxc, move its prototype from xenctrl.h to xc_private.h. This also fixes a couple of bugs in pre-existing code: - the failure path for init_mem_info() leaked minfo->pfn_type, - one error path of the XEN_DOMCTL_getpageframeinfo2 handler used put_domain() where rcu_unlock_domain() was meant, and - the XEN_DOMCTL_getpageframeinfo2 handler could call xsm_getpageframeinfo() with an invalid struct page_info pointer. Signed-off-by: Jan Beulich <jbeulich@novell.com>
*	domctl: Fix command-number clashes and place all #defines together to	Keir Fraser	2010-01-04	1	-62/+118
	avoid the problem in future. From: Juergen Gross <juergen.gross@ts.fujitsu.com> Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
*	domctl/sysctl: Clean up definitions	Keir Fraser	2009-12-22	1	-30/+32
	- Use fixed-width types only - Use named unions only - Bump domctl version number Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
*	Domctls defined for all relevant memory sharing operations.	Keir Fraser	2009-12-17	1	-0/+49
	Signed-off-by: Grzegorz Milos <Grzegorz.Milos@citrix.com>
*	Memory paging domctl support, which is a sub-operation of the generic memory	Keir Fraser	2009-12-17	1	-0/+12
	event domctl support. Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>
*	domctl support for generic memory event handling.	Keir Fraser	2009-12-17	1	-0/+26
	Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>
*	Replace tsc_native config option with tsc_mode config option	Keir Fraser	2009-11-25	1	-6/+17
	(NOTE: pvrdtscp mode not finished yet, but all other modes have been tested so sooner seemed better than later to submit this fairly major patch so we can get more mileage on it before next release.) New tsc_mode config option supersedes tsc_native and offers a more intelligent default and an additional option for intelligent apps running on PV domains ("pvrdtscp"). For PV domains, default mode will determine if the initial host has a "safe" TSC (meaning it is always synchronized across all physical CPUs). If so, all domains will execute all rdtsc instructions natively; if not, all domains will emulate all rdtsc instructions while providing the TSC hertz rate of the initial machine. After being restored or live-migrated, all PV domains will emulate all rdtsc instructions. Hence, this default mode guarantees correctness while providing native performance in most conditions. For PV domains, tsc_mode==1 will always emulate rdtsc and tsc_mode==2 will never emulate rdtsc. For tsc_mode==3, rdtsc will never be emulated, but information is provided through pvcpuid instructions and rdtscp instructions so that an app can obtain "safe" pvclock-like TSC information across save/restore and live migration. (Will be completed in a follow-on patch.) For HVM domains, the default mode and "always emulate" mode do the same as tsc_native==0; the other two modes do the same as tsc_native==1. (HVM domains since 3.4 have implemented a tsc_mode=default-like functionality, but also can preserve native TSC across save/restore and live-migration IFF the initial and target machines have a common TSC cycle rate.) All newer AMD machines, and Nehalem and future Intel machines have "Invariant TSC"; many newer Intel machines have "Constant TSC" and do not support deep-C sleep states; these and all single-processor machines are "safe".
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
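The PV-domain rules in the tsc_mode message above can be condensed into a small decision function. This is only a sketch of the described behaviour: the enumerator and function names are hypothetical, the default mode is assumed to be numbered 0 (the message only numbers modes 1 to 3), and HVM handling is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Mode numbers 1-3 follow the message; names and the value 0 are assumptions. */
enum toy_tsc_mode {
    TOY_TSC_DEFAULT = 0,
    TOY_TSC_ALWAYS_EMULATE = 1,
    TOY_TSC_NEVER_EMULATE = 2,
    TOY_TSC_PVRDTSCP = 3
};

/* Returns true if rdtsc would be emulated for a PV domain in this model. */
static bool toy_pv_emulates_rdtsc(enum toy_tsc_mode mode,
                                  bool host_tsc_safe,
                                  bool restored_or_migrated)
{
    assert((int)mode >= TOY_TSC_DEFAULT && (int)mode <= TOY_TSC_PVRDTSCP);

    switch (mode) {
    case TOY_TSC_ALWAYS_EMULATE:
        return true;
    case TOY_TSC_NEVER_EMULATE:
    case TOY_TSC_PVRDTSCP:
        /* pvrdtscp exposes extra information instead of emulating rdtsc. */
        return false;
    case TOY_TSC_DEFAULT:
    default:
        /* Native only on a "safe" initial host that the domain never left. */
        return restored_or_migrated || !host_tsc_safe;
    }
}
```

The default branch captures the trade-off the message describes: native speed where the TSC is known to be synchronized, and emulation as soon as that guarantee can no longer be assumed.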
*	Remove unused XEN_DOMINF_cpu{mask,shift} definitions.	Keir Fraser	2009-10-21	1	-3/+0
	Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
*	Add nomigrate config option to disable migration/restore	Keir Fraser	2009-10-20	1	-0/+7
	The new nomigrate option can be set to non-zero in vm.cfg (for both hvm and pvm) to disallow a guest from being migrated or restored. (Save is still allowed for the purpose of checkpointing.) The option persists into a save file and is also communicated into the hypervisor, the latter for the purposes of a to-be-added hypercall for communicating to guests that migration is disallowed (which will be used initially for userland TSC-related sensing, but may find other uses). Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
*	Per-domain switch to disable oos shadow page tables	Keir Fraser	2009-10-19	1	-0/+3
	Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
*	gdbsx: a gdbserver stub for xen.	Keir Fraser	2009-10-15	1	-0/+26
	It should be run on dom0 on gdbsx enabled hypervisor. For details, please see tools/debugger/gdbsx/README Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
*	x86: Allow TSC mode (emulate vs native) to be configured per domain.	Keir Fraser	2009-09-28	1	-1/+6
	The default is to emulate. Old saved images will be restored with legacy behaviour however (native TSC, no emulation). Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: Keir Fraser <keir.fraser@citrix.com>