path: root/xen/include/asm-x86/domain.h
Commit message | Author | Age | Files | Lines
* x86: use {rd,wr}{fs,gs}base when available (Jan Beulich, 2013-10-11; 1 file, -3/+3)

    ... as being intended to be faster than MSR reads/writes. In the case
    of emulate_privileged_op() also use these in favor of the cached (but
    possibly stale) addresses from arch.pv_vcpu. This allows entirely
    removing the code that was the subject of XSA-67.

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Acked-by: Keir Fraser <keir@xen.org>
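    The instruction-based path can be sketched as below; this is a
    minimal illustration, not the exact Xen helper (the function name and
    the fallback structure are assumptions):

        /* Sketch: read the FS base via RDFSBASE when CR4.FSGSBASE is
         * enabled, falling back to the (slower) MSR path otherwise. */
        static inline unsigned long read_fs_base(void)
        {
            unsigned long base;

            if ( read_cr4() & X86_CR4_FSGSBASE )
                asm volatile ( "rdfsbase %0" : "=r" (base) );
            else
                rdmsrl(MSR_FS_BASE, base);

            return base;
        }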
* x86/xsave: fix migration from xsave-capable to xsave-incapable host (Jan Beulich, 2013-09-09; 1 file, -3/+3)

    With CPUID features suitably masked this is supposed to work, but was
    completely broken (i.e. the case wasn't even considered when the
    original xsave save/restore code was written).

    First of all, xsave_enabled() wrongly returned the value of
    cpu_has_xsave, i.e. not even taking into consideration attributes of
    the vCPU in question. Instead this function ought to check whether the
    guest ever enabled xsave support (by writing a [non-zero] value to
    XCR0). As a result of this, a vCPU's xcr0 and xcr0_accum must no
    longer be initialized to XSTATE_FP_SSE (since that's a valid value a
    guest could write to XCR0), and the xsave/xrstor as well as the
    context switch code need to suitably account for this (by always
    enforcing at least this part of the state to be saved/loaded).

    This involves undoing large parts of c/s 22945:13a7d1f7f62c ("x86: add
    strictly sanity check for XSAVE/XRSTOR") - we need to cleanly
    distinguish between hardware capabilities and vCPU used features.

    Next both HVM and PV save code needed tweaking to not always save the
    full state supported by the underlying hardware, but just the parts
    that the guest actually used. Similarly the restore code should bail
    not just on state being restored that the hardware cannot handle, but
    also on inconsistent save state (inconsistent XCR0 settings or size of
    saved state not in line with XCR0).

    And finally the PV extended context get/set code needs to use slightly
    different logic than the HVM one, as here we can't just key off of
    xsave_enabled() (i.e. avoid doing anything if a guest doesn't use
    xsave) because the tools use this function to determine host
    capabilities as well as read/write vCPU state. The set operation in
    particular needs to be capable of cleanly dealing with input that
    consists of only the xcr0 and xcr0_accum values (if they're both zero
    then no further data is required).

    While for things to work correctly both sides (saving _and_ restoring
    host) need to run with the fixed code, afaict no breakage should occur
    if either side isn't up to date (other than the breakage that this
    patch attempts to fix).

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
    Acked-by: Keir Fraser <keir@xen.org>
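    The corrected predicate boils down to keying off the vCPU's own
    accumulated XCR0 rather than the host capability; a sketch (the exact
    function body in Xen differs and carries extra sanity checks):

        /* Sketch: a guest "uses xsave" only if it ever wrote a non-zero
         * value to XCR0; xcr0_accum accumulates all such writes. */
        bool_t xsave_enabled(const struct vcpu *v)
        {
            if ( !cpu_has_xsave )
                return 0;

            return !!v->arch.xcr0_accum;
        }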
* xen: move VCPUOP_register_vcpu_info to common code (Stefano Stabellini, 2013-05-08; 1 file, -3/+0)

    Move the implementation of VCPUOP_register_vcpu_info from x86 specific
    to common code. Move vcpu_info_mfn from an arch specific vcpu
    sub-field to the common vcpu struct. Move the initialization of
    vcpu_info_mfn to common code. Move unmap_vcpu_info and the call to
    unmap_vcpu_info at domain destruction time to common code.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: Keir Fraser <keir@xen.org>
* x86: make vcpu_destroy_pagetables() preemptible (Jan Beulich, 2013-05-02; 1 file, -0/+1)

    ... as it may take significant amounts of time. The function, being
    moved to mm.c as the better home for it anyway, and to avoid having to
    make a new helper function there non-static, is given a "preemptible"
    parameter temporarily (until, in a subsequent patch, its other caller
    is also being made capable of dealing with preemption).

    This is part of CVE-2013-1918 / XSA-45.

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Tim Deegan <tim@xen.org>
* x86: allow VCPUOP_register_vcpu_info to work again on PVHVM guests (Konrad Rzeszutek Wilk, 2013-04-17; 1 file, -3/+3)

    For details on the hypercall please see commit
    c58ae69360ccf2495a19bf4ca107e21cf873c75b (VCPUOP_register_vcpu_info)
    and c/s 23143 (git commit 6b063a4a6f44245a727aa04ef76408b2e00af9c7)
    (x86: move pv-only members of struct vcpu to struct pv_vcpu) that
    introduced the regression.

    The current code allows the PVHVM guest to make this hypercall. But
    for a PVHVM guest it always returns -EINVAL (-22) for Xen 4.2 and
    above. Xen 4.1 and earlier worked. The reason is that the check in
    map_vcpu_info would fail at:

        if ( v->arch.vcpu_info_mfn != INVALID_MFN )

    because the vcpu_info_mfn for PVHVM guests ends up by default with the
    value of zero (introduced by c/s 23143). The code in vcpu_initialise
    which initialized vcpu_info_mfn to a valid value (INVALID_MFN) would
    never be called for PVHVM:

        if ( is_hvm_domain(d) )
        {
            rc = hvm_vcpu_initialise(v);
            goto done;
        }

        v->arch.pv_vcpu.vcpu_info_mfn = INVALID_MFN;

    while previously it would be:

        v->arch.vcpu_info_mfn = INVALID_MFN;

    [right at the start of the function in Xen 4.1]

    This fixes the problem with Linux advertising this error:

        register_vcpu_info failed: err=-22

    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
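    One way to express the fix is to move the sentinel assignment ahead of
    the HVM early exit; a sketch of the ordering, not the literal patch:

        /* Initialize the sentinel before the HVM path returns early, so
         * map_vcpu_info()'s INVALID_MFN check also holds for PVHVM. */
        v->arch.pv_vcpu.vcpu_info_mfn = INVALID_MFN;

        if ( is_hvm_domain(d) )
        {
            rc = hvm_vcpu_initialise(v);
            goto done;
        }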
* x86/mm: warn if we ever run out of shadow/hap pool for p2m/lgd ops. (Tim Deegan, 2013-03-14; 1 file, -0/+2)

    Even if the error propagates up through the p2m ops to the caller,
    it'll look like ENOMEM, which won't be obviously a shadow-pool
    problem. Warn on the console, once per domain.

    Reported-by: Jan Beulich <jbeulich@suse.com>
    Signed-off-by: Tim Deegan <tim@xen.org>
    Acked-by: Jan Beulich <jbeulich@suse.com>
* x86/mm: use bool_t for flags in shadow-pagetable structs (Tim Deegan, 2013-03-14; 1 file, -11/+11)

    and reshuffle the domain struct to pack a little better.

    Signed-off-by: Tim Deegan <tim@xen.org>
    Acked-by: Jan Beulich <jbeulich@suse.com>
* x86: use linear L1 page table for map_domain_page() page table manipulation (Jan Beulich, 2013-02-28; 1 file, -3/+1)

    This saves allocation of a Xen heap page for tracking the L1 page
    table pointers.

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Keir Fraser <keir@xen.org>
* x86: rework hypercall argument translation area setup (Jan Beulich, 2013-02-28; 1 file, -3/+0)

    ... using the new per-domain mapping management functions, adding
    destroy_perdomain_mapping() to the previously introduced pair. Rather
    than using an order-1 Xen heap allocation, use (currently 2)
    individual domain heap pages to populate space in the per-domain
    mapping area.

    Also fix a benign off-by-one mistake in is_compat_arg_xlat_range().

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Keir Fraser <keir@xen.org>
* x86: introduce create_perdomain_mapping() (Jan Beulich, 2013-02-28; 1 file, -11/+5)

    ... as well as free_perdomain_mappings(), and use them to carry out
    the existing per-domain mapping setup/teardown. This at once makes the
    setup of the first sub-range PV domain specific (with idle domains
    also excluded), as the GDT/LDT mapping area is needed only for those.

    Also fix an improperly scaled BUILD_BUG_ON() expression in
    mapcache_domain_init().

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Keir Fraser <keir@xen.org>
* Fix emacs local variable block to use correct C style variable. (David Vrabel, 2013-02-21; 1 file, -1/+1)

    The emacs variable to set the C style from a local variable block is
    c-file-style, not c-set-style.

    Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* x86: properly use map_domain_page() during domain creation/destruction (Jan Beulich, 2013-01-23; 1 file, -8/+6)

    This involves no longer storing virtual addresses of the per-domain
    mapping L2 and L3 page tables.

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Keir Fraser <keir@xen.org>
* x86: properly use map_domain_page() when building Dom0 (Jan Beulich, 2013-01-23; 1 file, -0/+1)

    This requires a minor hack to allow the correct page tables to be used
    while running on Dom0's page tables (as they can't be determined from
    "current" at that time).

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Keir Fraser <keir@xen.org>
* x86: re-introduce map_domain_page() et al (Jan Beulich, 2013-01-23; 1 file, -11/+17)

    This is being done mostly in the form previously used on x86-32,
    utilizing the second L3 page table slot within the per-domain mapping
    area for those mappings. It remains to be determined whether that
    concept is really suitable, or whether instead re-implementing at
    least the non-global variant from scratch would be better.

    Also add the helpers {clear,copy}_domain_page() as well as initial
    uses of them.

    One question is whether, to exercise the non-trivial code paths, we
    shouldn't make the trivial shortcuts conditional upon NDEBUG being
    defined. See the debugging patch at the end of the series.

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Keir Fraser <keir@xen.org>
* x86: vMCE emulation (Liu, Jinsong, 2012-09-26; 1 file, -5/+2)

    This patch provides virtual MCE support to guests. It emulates a
    simple and clean MCE MSR interface by faking caps to the guest if
    needed and masking caps if unnecessary:

    1. Providing a well-defined MCG_CAP to the guest, filtering out
       unnecessary caps and providing only the caps the guest needs;
    2. Disabling MCG_CTL to avoid model-specific behavior;
    3. Sticking all 1's into MCi_CTL for the guest to avoid model-specific
       behavior;
    4. Enabling the CMCI cap but never really injecting it, to prevent the
       guest from polling periodically;
    5. Masking the MSCOD field of MCi_STATUS to avoid model-specific
       behavior;
    6. Keeping natural semantics by using per-vcpu instead of per-domain
       variables;
    7. Using bank1 and reserving bank0 to work around the 'bank0 quirk' of
       some very old processors;
    8. Cleaning up some vMCE# injection logic which is shared by Intel and
       AMD but useless under the new vMCE implementation;
    9. Keeping compatibility with the old Xen version that has been
       backported to SLES11 SP2, so that old vMCE would not be blocked
       when migrating to the new vMCE.

    Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>

    - make printing consistent (and non-exploitable)
    - fix return values of intel_mce_{rd,wr}msr() for out of range banks
    - miscellaneous cleanup

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Committed-by: Jan Beulich <jbeulich@suse.com>
* x86: Remove CONFIG_COMPAT ifdef'ery from arch/x86 -- it is always defined. (Keir Fraser, 2012-09-12; 1 file, -2/+0)

    Signed-off-by: Keir Fraser <keir@xen.org>
* x86: We can assume CONFIG_PAGING_LEVELS==4. (Keir Fraser, 2012-09-12; 1 file, -2/+0)

    Signed-off-by: Keir Fraser <keir@xen.org>
* xen: Remove x86_32 build target. (Keir Fraser, 2012-09-12; 1 file, -55/+0)

    Signed-off-by: Keir Fraser <keir@xen.org>
* x86/mm/sharing: Clean ups for relinquishing shared pages on destroy (Andres Lagar-Cavilla, 2012-04-18; 1 file, -0/+1)

    When a domain is destroyed, its pages are freed in
    relinquish_resources in a preemptible mode, in the context of a
    synchronous domctl. P2m entries pointing to shared pages are, however,
    released during p2m cleanup in an RCU callback, and in non-preemptible
    mode. This is an O(n) operation for a very large n, which may include
    actually freeing shared pages for which the domain is the last holder.

    To improve responsiveness, move this operation to the preemptible
    portion of domain destruction, during the synchronous domain_kill
    hypercall. And remove the bulk of the work from the RCU callback.

    Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
    Acked-by: Tim Deegan <tim@xen.org>
    Committed-by: Tim Deegan <tim@xen.org>
* x86/vMCE: save/restore MCA capabilities (Jan Beulich, 2012-02-24; 1 file, -0/+2)

    This allows migration to a host with fewer MCA banks than the source
    host had, while without this patch accesses to the excess banks' MSRs
    caused #GP-s in the guest after migration (and it depended on the
    guest kernel whether this would be fatal).

    A fundamental question is whether we should also save/restore MCG_CTL
    and MCi_CTL, as the HVM save record would better be defined to the
    complete state that needs saving from the beginning (I'm unsure
    whether the save/restore logic allows for future extension of an
    existing record).

    Of course, this change is expected to make migration from new to older
    Xen impossible (again I'm unsure what the save/restore logic does with
    records it doesn't even know about).

    The (trivial) tools side change may seem unrelated, but the code
    should have been that way from the beginning to allow the hypervisor
    to look at currently unused ext_vcpucontext fields without risking to
    read garbage when those fields get a meaning assigned in the future.
    This isn't being enforced here - should it be? (Obviously, for
    backwards compatibility, the hypervisor must assume these fields to be
    clear only when the extended context's size exceeds the old original
    one.)

    A future addition to this change might be to allow configuration of
    the number of banks and other MCA capabilities for a guest before it
    starts (i.e. to not inherit the values seen on the first host it runs
    on).

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Keir Fraser <keir@xen.org>
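    The shape of such a per-vCPU save record is essentially just the
    guest-visible capability word; a sketch (field layout illustrative):

        /* Sketch: the guest's MCG_CAP travels with the domain so the
         * target host can honour the configured number of banks. */
        struct hvm_vmce_vcpu {
            uint64_t caps;
        };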
* xen: introduce PHYSDEVOP_pirq_eoi_gmfn_v2 (Stefano Stabellini, 2012-01-28; 1 file, -0/+3)

    PHYSDEVOP_pirq_eoi_gmfn changes the semantics of PHYSDEVOP_eoi. In
    order to improve the interface this patch:

    - renames PHYSDEVOP_pirq_eoi_gmfn to PHYSDEVOP_pirq_eoi_gmfn_v1;
    - introduces PHYSDEVOP_pirq_eoi_gmfn_v2, which is like
      PHYSDEVOP_pirq_eoi_gmfn_v1 but doesn't modify the behaviour of
      another hypercall;
    - bumps __XEN_LATEST_INTERFACE_VERSION__;
    - #defines PHYSDEVOP_pirq_eoi_gmfn to PHYSDEVOP_pirq_eoi_gmfn_v1 or
      PHYSDEVOP_pirq_eoi_gmfn_v2 depending on __XEN_INTERFACE_VERSION__.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Committed-by: Keir Fraser <keir@xen.org>
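    The compatibility shim in the public headers then looks roughly like
    this (the exact version cutoff shown is an assumption):

        /* Older consumers keep the v1 semantics; newer ones get v2. */
        #if __XEN_INTERFACE_VERSION__ < 0x00040200
        #define PHYSDEVOP_pirq_eoi_gmfn PHYSDEVOP_pirq_eoi_gmfn_v1
        #else
        #define PHYSDEVOP_pirq_eoi_gmfn PHYSDEVOP_pirq_eoi_gmfn_v2
        #endif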
* x86/mm: Enforce ordering constraints for the page alloc lock in the PoD code (Tim Deegan, 2011-11-10; 1 file, -0/+3)

    The page alloc lock is sometimes used in the PoD code, with an
    explicit expectation of ordering. Use our ordering constructs in the
    mm layer to enforce this. The additional book-keeping variables are
    kept in the arch_domain sub-struct, as they are x86-specific to the
    whole domain.

    Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
    Acked-by: Tim Deegan <tim@xen.org>
    Committed-by: Tim Deegan <tim@xen.org>
* x86: consistently serialize CMOS/RTC accesses on rtc_lock (Jan Beulich, 2011-07-19; 1 file, -0/+1)

    Since RTC/CMOS accesses aren't atomic, there are possible races
    between code paths setting the index register and subsequently
    reading/writing the data register. This is supposed to be dealt with
    by acquiring rtc_lock, but two places up to now lacked respective
    synchronization: accesses to the EFI time functions and
    smpboot_{setup,restore}_warm_reset_vector().

    This in turn requires no longer directly passing through guest writes
    to the index register, but instead using a mechanism similar to that
    for PCI config space method 1 accesses.

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
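    A serialized CMOS read then has the familiar shape below (a minimal
    sketch; the helper name is illustrative):

        /* The index write and the data read must sit under the same
         * lock, or another path may retarget the index in between. */
        static unsigned char cmos_read(unsigned char index)
        {
            unsigned long flags;
            unsigned char val;

            spin_lock_irqsave(&rtc_lock, flags);
            outb(index, 0x70);   /* select the CMOS register */
            val = inb(0x71);     /* read its contents */
            spin_unlock_irqrestore(&rtc_lock, flags);

            return val;
        }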
* replace d->nr_pirqs sized arrays with radix tree (Jan Beulich, 2011-06-23; 1 file, -3/+0)

    With this it is questionable whether retaining struct domain's
    nr_pirqs is actually necessary - the value now only serves for bounds
    checking, and this boundary could easily be nr_irqs.

    Note that ia64, the build of which is broken currently anyway, is only
    being partially fixed up.

    v2: adjustments for split setup/teardown of translation data
    v3: re-sync with radix tree implementation changes

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
* Merge (Tim Deegan, 2011-06-06; 1 file, -6/+8)
|\
| * x86: Enable Supervisor Mode Execution Protection (SMEP) (Keir Fraser, 2011-06-03; 1 file, -6/+8)

    Intel's new CPUs support SMEP (Supervisor Mode Execution Protection).
    SMEP prevents software operating with CPL < 3 (supervisor mode) from
    fetching instructions from any linear address with a valid translation
    for which the U/S flag (bit 2) is 1 in every paging-structure entry
    controlling the translation for the linear address.

    This patch enables SMEP in Xen to protect the Xen hypervisor from
    executing pv guest instructions, whose translation paging-structure
    entries' U/S flags are all set.

    Signed-off-by: Yang Wei <wei.y.yang@intel.com>
    Signed-off-by: Shan Haitao <haitao.shan@intel.com>
    Signed-off-by: Li Xin <xin.li@intel.com>
    Signed-off-by: Keir Fraser <keir@xen.org>
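    Enabling boils down to a single CR4 bit once the feature is
    advertised; a sketch (the real code sits in Xen's CPU setup path):

        /* Turn on CR4.SMEP when the CPU supports it, so supervisor mode
         * can no longer fetch instructions from user-accessible pages. */
        if ( boot_cpu_has(X86_FEATURE_SMEP) )
            write_cr4(read_cr4() | X86_CR4_SMEP);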
* | x86/mm: merge the shadow, hap and log-dirty locks into a single paging lock. (Tim Deegan, 2011-06-02; 1 file, -7/+3)

    This will allow us to simplify the locking around calls between
    hap/shadow and log-dirty code. Many log-dirty paths already need the
    shadow or HAP lock so it shouldn't increase contention that much.

    Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
* | x86/mm: dedup the various copies of the shadow lock functions (Tim Deegan, 2011-06-02; 1 file, -12/+5)
|/
    Define the lock and unlock functions once, and list all the locks in
    one place so (a) it's obvious what the locking discipline is and (b)
    none of the locks are visible to non-mm code. Automatically enforce
    that these locks never get taken in the wrong order.

    Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
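    The single-definition pattern can be sketched as macro-generated lock
    wrappers with an ordering level baked in (names illustrative, modelled
    on Xen's mm-locks.h):

        /* Each mm lock is declared once with a level; taking it checks
         * that no higher-level mm lock is already held on this CPU. */
        declare_mm_lock(paging)
        #define paging_lock(d)   mm_lock(paging, &(d)->arch.paging.lock)
        #define paging_unlock(d) mm_unlock(&(d)->arch.paging.lock)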
* x86/fpu: create lazy and non-lazy FPU restore functions (Wei Huang, 2011-05-09; 1 file, -1/+4)

    Currently Xen relies on #NM (via CR0.TS) to trigger FPU context
    restore. But not all FPU state is tracked by the TS bit. This patch
    creates two FPU restore functions: vcpu_restore_fpu_lazy() and
    vcpu_restore_fpu_eager(). vcpu_restore_fpu_lazy() is still used when
    #NM is triggered. vcpu_restore_fpu_eager(), as a comparison, is called
    for a vcpu which is being scheduled in on every context switch. To
    minimize restore overhead, it creates a flag, nonlazy_xstate_used, to
    control non-lazy restore.

    Signed-off-by: Wei Huang <wei.huang2@amd.com>
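    At context switch time the split then looks roughly like this (a
    sketch of the call site, not the literal scheduler code):

        /* Eagerly restore the state that CR0.TS cannot guard; everything
         * else waits for the #NM fault and vcpu_restore_fpu_lazy(). */
        if ( next->arch.nonlazy_xstate_used )
            vcpu_restore_fpu_eager(next);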
* x86: replace nr_irqs sized per-domain arrays with radix trees (Jan Beulich, 2011-05-09; 1 file, -3/+3)

    It would seem possible to fold the two trees into one (making e.g. the
    emuirq bits stored in the upper half of the pointer), but I'm not
    certain that's worth it as it would make deletion of entries more
    cumbersome. Unless pirq-s and emuirq-s were mutually exclusive...

    v2: Split setup/teardown into two stages - (de-)allocation (tree node
    (de-)population) is done with just d->event_lock held (and hence
    interrupts enabled), while actual insertion/removal of translation
    data gets done with irq_desc's lock held (and interrupts disabled).

    Signed-off-by: Jan Beulich <jbeulich@novell.com>

    Fix up for new radix-tree implementation. In particular, we should
    never insert NULL into a radix tree, as that means empty slot (which
    can be reclaimed during a deletion). Make use of
    radix_tree_int_to_ptr() (and its inverse) to hide some of these
    details.

    Signed-off-by: Keir Fraser <keir@xen.org>
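    Storing small integers in the tree therefore goes through the int/ptr
    conversion helpers, along these lines (tree field name per the commit;
    error handling elided):

        /* NULL means "empty slot" to the tree, so wrap the integer. */
        rc = radix_tree_insert(&d->arch.irq_pirq, irq,
                               radix_tree_int_to_ptr(pirq));

        /* ... and unwrap it again on lookup. */
        pirq = radix_tree_ptr_to_int(
                   radix_tree_lookup(&d->arch.irq_pirq, irq));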
* Revert 23295:4891f1f41ba5 and 23296:24346f749826 (Keir Fraser, 2011-05-02; 1 file, -2/+5)

    Fails current lock checking mechanism in spinlock.c in debug=y builds.

    Signed-off-by: Keir Fraser <keir@xen.org>
* replace d->nr_pirqs sized arrays with radix tree (Jan Beulich, 2011-05-01; 1 file, -3/+0)

    With this it is questionable whether retaining struct domain's
    nr_pirqs is actually necessary - the value now only serves for bounds
    checking, and this boundary could easily be nr_irqs.

    Another thing to consider is whether it's worth storing the pirq
    number in struct pirq, to avoid passing the number and a pointer to
    quite a number of functions.

    Note that ia64, the build of which is broken currently anyway, is only
    partially fixed up.

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: replace nr_irqs sized per-domain arrays with radix trees (Jan Beulich, 2011-05-01; 1 file, -3/+3)

    It would seem possible to fold the two trees into one (making e.g. the
    emuirq bits stored in the upper half of the pointer), but I'm not
    certain that's worth it as it would make deletion of entries more
    cumbersome. Unless pirq-s and emuirq-s were mutually exclusive...

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: make the pv-only e820 array be dynamic. (Keir Fraser, 2011-04-13; 1 file, -1/+2)

    During creation of the PV domain we allocate the E820 structure to
    have the number of E820 entries on the machine, plus three. This
    allows the tool stack to fill the E820 with more than three entries.
    Specifically, the use case is one where the toolstack retrieves the
    E820, sanitizes it, and then sets it for the PV guest (for PCI
    passthrough); for that, this dynamic number of E820 entries is just
    right.

    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Signed-off-by: Keir Fraser <keir@xen.org>
* Implement Nested-on-Nested. (cegger, 2011-04-05; 1 file, -0/+9)

    This allows the guest to run a nested guest with hap enabled.

    Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
    Acked-by: Tim Deegan <Tim.Deegan@citrix.com>
    Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
* x86: split struct domain (Jan Beulich, 2011-04-05; 1 file, -10/+19)

    This is accomplished by converting a couple of embedded arrays (in one
    case a structure containing an array) into separately allocated
    pointers, and (just as for struct arch_vcpu in a prior patch)
    overlaying some PV-only fields with HVM-only ones.

    One particularly noteworthy change in the opposite direction is that
    of PITState - this field so far lived in the HVM-only portion, but is
    being used by PV guests too, and hence needed to be moved out of
    struct hvm_domain.

    The change to XENMEM_set_memory_map (and hence libxl__build_pre() and
    the movement of the E820 related pieces to struct pv_domain) are
    subject to a positive response to a query sent to xen-devel regarding
    the need for this to happen for HVM guests (see
    http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01848.html).

    The protection of arch.hvm_domain.irq.dpci accesses by is_hvm_domain()
    is subject to confirmation that the field is used for HVM guests only
    (see
    http://lists.xensource.com/archives/html/xen-devel/2011-03/msg02004.html).

    In the absence of any reply to these queries, and given the early
    state of 4.2 development, I think it should be acceptable to take the
    risk of having to later undo/redo some of this.

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
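    The overlay itself is the classic anonymous-union trick; a sketch of
    the resulting shape of the header, heavily abridged:

        /* A domain is only ever PV or HVM, so the two halves can share
         * storage; only the matching half may be touched. */
        struct arch_domain
        {
            /* ... fields common to PV and HVM ... */
            union {
                struct pv_domain pv_domain;
                struct hvm_domain hvm_domain;
            };
        };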
* x86: move pv-only members of struct vcpu to struct pv_vcpu (Jan Beulich, 2011-04-05; 1 file, -28/+27)

    ... thus further shrinking overall size of struct arch_vcpu.

    This has a minor effect on XEN_DOMCTL_{get,set}_ext_vcpucontext - for
    HVM guests, some meaningless fields will no longer get stored or
    retrieved: reads will now return zero, and writes are required to be
    (mostly) zero (the same as was already done on x86-32).

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: split struct vcpu (Jan Beulich, 2011-04-05; 1 file, -6/+50)

    This is accomplished by splitting the guest_context member, which by
    itself is larger than a page on x86-64. Quite a number of fields of
    this structure are completely meaningless for HVM guests, and thus a
    new struct pv_vcpu gets introduced, which is being overlaid with
    struct hvm_vcpu in struct arch_vcpu. The one member that is mostly
    responsible for the large size is trap_ctxt, which now gets allocated
    separately (unless fitting on the same page as struct arch_vcpu, as is
    currently the case for x86-32), and only for non-hvm, non-idle
    domains.

    This change pointed out a latent problem in arch_set_info_guest(),
    which is permitted to be called on already initialized vCPU-s, but so
    far copied the new state into struct arch_vcpu without (in this case)
    actually going through all the necessary accounting/validation steps.
    The logic gets changed so that the pieces that bypass accounting will
    at least be verified to be no different from the currently active
    bits, and the whole change will fail in case they are. The logic does
    *not* get adjusted here to do full error recovery, that is, partially
    modified state continues to not get unrolled in case of failure.

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
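    The separate trap_ctxt allocation can be pictured as follows (a
    sketch; the real code paths and cleanup differ):

        /* Only PV, non-idle vCPUs carry a trap table, so allocate it on
         * demand instead of embedding it in struct arch_vcpu. */
        if ( !is_hvm_domain(d) && !is_idle_domain(d) )
        {
            v->arch.pv_vcpu.trap_ctxt =
                xmalloc_array(struct trap_info, NR_VECTORS);
            if ( v->arch.pv_vcpu.trap_ctxt == NULL )
                return -ENOMEM;
            memset(v->arch.pv_vcpu.trap_ctxt, 0,
                   NR_VECTORS * sizeof(*v->arch.pv_vcpu.trap_ctxt));
        }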
* x86: add strictly sanity check for XSAVE/XRSTOR (Wei Gang, 2011-02-21; 1 file, -1/+1)

    Replace most checks on cpu_has_xsave with checks on the new fn
    xsave_enabled(), and do additional sanity checks in the new fn.

    Signed-off-by: Wei Gang <gang.wei@intel.com>
    Signed-off-by: Keir Fraser <keir.xen@gmail.com>
* Interrupt remapping to PIRQs in HVM guests (Keir Fraser, 2010-11-19; 1 file, -0/+3)

    This patch allows HVM guests to remap interrupts and MSIs into pirqs;
    once the mapping is in place the guest will receive the interrupt (or
    the MSI) as an event. The interrupt to be remapped can either be an
    interrupt of an emulated device or an interrupt of a passthrough
    device, and we keep track of that.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* x86/mm: Allocate log-dirty bitmaps from shadow/HAP memory. (Keir Fraser, 2010-11-19; 1 file, -0/+4)

    Move the p2m alloc and free functions back into the per-domain paging
    assistance structure and allow them to be called from the log-dirty
    code. This makes it less likely that log-dirty code will run out of
    memory populating the log-dirty bitmap.

    Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
* x86_64: Make 32-bit-hypercall translate area per-vcpu. (Keir Fraser, 2010-11-16; 1 file, -0/+4)

    This is a prerequisite for allowing guest descheduling within a
    hypercall.

    Signed-off-by: Keir Fraser <keir@xen.org>
* x86: do away with the boot time low-memory 1:1 mapping (Keir Fraser, 2010-11-09; 1 file, -10/+0)

    By doing so, we're no longer restricted to be able to place all boot
    loader modules into the low 1Gb/4Gb (32-/64-bit) of memory, nor is
    there a dependency anymore on where the boot loader places the
    modules. We're also no longer restricted to copy the modules into a
    place below 4Gb, nor to put them all together into a single piece of
    memory. Further it allows even the 32-bit Dom0 kernel to be loaded
    anywhere in physical memory (except if it doesn't support
    PAE-above-4G).

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: Xsave support for PV guests. (Keir Fraser, 2010-11-03; 1 file, -1/+19)

    Signed-off-by: Shan Haitao <haitao.shan@intel.com>
    Signed-off-by: Han Weidong <weidong.han@intel.com>
* x86: enable support for {rd,wr}{fs,gs}base instructions (Keir Fraser, 2010-10-24; 1 file, -1/+1)

    ... so that once in a while Xen knows of a new CPU feature before
    Linux starts making use of it. While (obviously) I wasn't able to test
    this, it seemed straightforward enough to enable anyway.

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86 shadow: allocate all shadow memory in single pages (Tim Deegan, 2010-09-01; 1 file, -2/+1)

    now that multi-page shadows need not be contiguous.

    Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
* gdbsx: Always build -- remove build-time config option. (Keir Fraser, 2010-07-02; 1 file, -2/+0)

    Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
* x86 hvm: implement HVMOP_pagetable_dying (Keir Fraser, 2010-06-21; 1 file, -0/+4)

    This patch implements HVMOP_pagetable_dying: a hypercall for guests to
    notify Xen that a pagetable is about to be destroyed, so that Xen can
    use it as a hint to unshadow the pagetable soon and unhook the
    top-level user-mode shadow entries right away.

    Gianluca Guida is the original author of this patch.

    Signed-off-by: Gianluca Guida <glguida@gmail.com>
    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
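    Guest-side usage is a single hvm_op call at pagetable teardown, along
    these lines (a sketch of how a kernel might invoke it; taking the
    address from mm->pgd is an assumption):

        struct xen_hvm_pagetable_dying a;

        a.domid = DOMID_SELF;
        a.gpa = __pa(mm->pgd);   /* guest-physical address of the PT root */

        (void)HYPERVISOR_hvm_op(HVMOP_pagetable_dying, &a);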
* x86 hvm: implement vector callback for evtchn delivery (Keir Fraser, 2010-05-25; 1 file, -0/+4)

    Signed-off-by: Sheng Yang <sheng@linux.intel.com>
    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
* p2m: move phystable into p2m (Keir Fraser, 2010-05-20; 1 file, -3/+0)

    Moves phys_table from struct domain to struct p2m_domain.

    Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
    Acked-by: Tim Deegan <Tim.Deegan@citrix.com>