path: root/xen/arch/x86/hvm/vmx
Commit log (each entry: subject, author, date, files changed, -deleted/+added lines)
* x86: make hvm_cpuid() tolerate NULL pointers (Jan Beulich, 2013-10-04; 1 file, -5/+5)
  Now that other HVM code started making more extensive use of
  hvm_cpuid(), let's not force every caller to declare dummy variables
  for output not cared about.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
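  A minimal sketch of the resulting pattern (illustrative only; the
  shared "dummy" scratch variable is an assumption, not the exact patch):

      void hvm_cpuid(unsigned int input, unsigned int *eax,
                     unsigned int *ebx, unsigned int *ecx,
                     unsigned int *edx)
      {
          unsigned int dummy;

          /* Redirect NULL outputs to scratch space so the leaf handling
           * below can write through all four pointers unconditionally. */
          if ( !eax ) eax = &dummy;
          if ( !ebx ) ebx = &dummy;
          if ( !ecx ) ecx = &dummy;
          if ( !edx ) edx = &dummy;
          /* ... existing leaf handling ... */
      }

  A caller interested only in ECX can then simply write
  hvm_cpuid(1, NULL, NULL, &ecx, NULL).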
* Nested VMX: fix IA32_VMX_CR4_FIXED1 MSR emulation (Yang Zhang, 2013-10-04; 1 file, -4/+51)
  Currently a hard-coded value is used for IA32_VMX_CR4_FIXED1. This is
  wrong. We should check the guest's CPUID to know which CR4 bits the
  guest may write, and allow the guest to set a bit only when it has the
  corresponding feature.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Cleanup.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
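  A hedged sketch of the approach (only a subset of the feature/bit
  pairs the real table covers; the set of unconditionally writable bits
  is illustrative):

      unsigned int eax, ebx, ecx, edx;
      u64 data = X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE;

      hvm_cpuid(1, &eax, &ebx, &ecx, &edx);
      if ( edx & cpufeat_mask(X86_FEATURE_PAE) )
          data |= X86_CR4_PAE;        /* writable only with the feature */
      if ( ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
          data |= X86_CR4_OSXSAVE;
      /* ... likewise for PSE, PGE, MCE, SMEP, and the rest ... */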
* VMX: clean up capability checks (Jan Beulich, 2013-10-04; 1 file, -17/+27)
  VMCS size validation on APs should check against BP's size. No need
  for a separate cpu_has_vmx_ins_outs_instr_info variable anymore. Use
  proper symbolics.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* Nested VMX: check VMX capability before reading VMX-related MSRs (Yang Zhang, 2013-10-04; 2 files, -0/+24)
  The VMX MSRs are only available when the CPU supports the VMX feature.
  In addition, the VMX_TRUE* MSRs are only available when bit 55 of the
  VMX_BASIC MSR is set.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Cleanup.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
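  A sketch of the second gate (MSR names per Xen's msr-index.h; the
  bit 55 test follows the SDM's definition of IA32_VMX_BASIC):

      rdmsrl(MSR_IA32_VMX_BASIC, vmx_basic);
      if ( msr >= MSR_IA32_VMX_TRUE_PINBASED_CTLS &&
           msr <= MSR_IA32_VMX_TRUE_ENTRY_CTLS &&
           !(vmx_basic & (1ULL << 55)) )
          return 0;    /* MSR not exposed; the caller injects #GP */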
* x86: properly handle hvm_copy_from_guest_{phys,virt}() errors (Jan Beulich, 2013-09-30; 1 file, -3/+3)
  Ignoring them generally implies using uninitialized data and, in all
  but two of the cases dealt with here, potentially leaking hypervisor
  stack contents to guests.
  This is CVE-2013-4355 / XSA-63.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Tim Deegan <tim@xen.org>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
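  The pattern being enforced, as a sketch:

      switch ( hvm_copy_from_guest_virt(&data, addr, sizeof(data), pfec) )
      {
      case HVMCOPY_okay:
          break;                     /* 'data' is fully initialised */
      default:
          return X86EMUL_EXCEPTION;  /* or RETRY/UNHANDLEABLE, as fits */
      }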
* Nested VMX: Expose unrestricted guest feature to guest (Yang Zhang, 2013-09-30; 1 file, -0/+3)
  With the virtual unrestricted guest feature, the L2 guest is allowed
  to run with CR0.PG clear. Also allow CR4.PAE to be clear during
  virtual vmexit emulation.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Acked-by: Eddie.Dong@intel.com
* VMX: also use proper instruction mnemonic for VMREAD (Jan Beulich, 2013-09-23; 6 files, -148/+233)
  ... when the assembler supports it, following commit cfd54835 ("VMX:
  use proper instruction mnemonics if assembler supports them"). This
  merely got split off from the earlier change because of the
  significant number of call sites needing to be changed.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* VMX: fix failure path in construct_vmcs (George Dunlap, 2013-09-18; 1 file, -0/+3)
  If the allocation fails, make sure to call vmx_vmcs_exit().
  This is a candidate for backport.
  Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
  Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
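  The shape of the fix, as a sketch (the msr_bitmap allocation stands in
  for whichever allocation fails):

      vmx_vmcs_enter(v);
      /* ... */
      msr_bitmap = alloc_xenheap_page();
      if ( msr_bitmap == NULL )
      {
          vmx_vmcs_exit(v);    /* previously missing on this path */
          return -ENOMEM;
      }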
* hvm/vpmu: Prevent dump handlers from incorrectly mutating state (Andrew Cooper, 2013-09-16; 1 file, -9/+10)
  Discovered by Coverity, CID 1055181.
  core2_vpmu_dump() was incorrectly setting VPMU_CONTEXT_LOADED when it
  was intending to check for it.
  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
  This would have been avoided if the dump function declared all its
  pointers "const" - doing this now (also in SVM). Also fixing some
  indentation issues at once.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
* Nested VMX: Clear bit 31 of IA32_VMX_BASIC MSR (Yang Zhang, 2013-09-10; 1 file, -1/+1)
  Bit 31 of revision_id is set to 1 if VMCS shadowing is enabled, but
  according to the Intel SDM, bit 31 of the IA32_VMX_BASIC MSR is always
  0. So we cannot copy revision_id into the low 32 bits of
  IA32_VMX_BASIC directly; bit 31 must be cleared first.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
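  In code terms (a sketch; msr_content holds the value returned for
  IA32_VMX_BASIC reads):

      /* Keep the high half, but mask bit 31 out of the revision ID. */
      msr_content = (msr_content & ~0xffffffffULL) |
                    (vmcs_revision_id & 0x7fffffffU);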
* x86/xsave: fix migration from xsave-capable to xsave-incapable host (Jan Beulich, 2013-09-09; 1 file, -2/+1)
  With CPUID features suitably masked this is supposed to work, but was
  completely broken (i.e. the case wasn't even considered when the
  original xsave save/restore code was written).
  First of all, xsave_enabled() wrongly returned the value of
  cpu_has_xsave, i.e. not even taking into consideration attributes of
  the vCPU in question. Instead this function ought to check whether the
  guest ever enabled xsave support (by writing a [non-zero] value to
  XCR0). As a result of this, a vCPU's xcr0 and xcr0_accum must no
  longer be initialized to XSTATE_FP_SSE (since that's a valid value a
  guest could write to XCR0), and the xsave/xrstor as well as the
  context switch code need to suitably account for this (by always
  enforcing at least this part of the state to be saved/loaded). This
  involves undoing large parts of c/s 22945:13a7d1f7f62c ("x86: add
  strictly sanity check for XSAVE/XRSTOR") - we need to cleanly
  distinguish between hardware capabilities and vCPU used features.
  Next both HVM and PV save code needed tweaking to not always save the
  full state supported by the underlying hardware, but just the parts
  that the guest actually used. Similarly the restore code should bail
  not just on state being restored that the hardware cannot handle, but
  also on inconsistent save state (inconsistent XCR0 settings or size of
  saved state not in line with XCR0).
  And finally the PV extended context get/set code needs to use slightly
  different logic than the HVM one, as here we can't just key off of
  xsave_enabled() (i.e. avoid doing anything if a guest doesn't use
  xsave) because the tools use this function to determine host
  capabilities as well as read/write vCPU state. The set operation in
  particular needs to be capable of cleanly dealing with input that
  consists of only the xcr0 and xcr0_accum values (if they're both zero
  then no further data is required).
  While for things to work correctly both sides (saving _and_ restoring
  host) need to run with the fixed code, afaict no breakage should occur
  if either side isn't up to date (other than the breakage that this
  patch attempts to fix).
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
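  The core of the first fix, as a sketch (per the description above:
  key off what the vCPU ever enabled, not off raw hardware capability):

      bool_t xsave_enabled(const struct vcpu *v)
      {
          if ( !cpu_has_xsave )
              return 0;
          return !!v->arch.xcr0_accum;
      }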
* VMX: use proper instruction mnemonics if assembler supports them (Jan Beulich, 2013-09-09; 2 files, -9/+6)
  With the hex byte emission we were taking away a good part of
  flexibility from the compiler, as for simplicity reasons these were
  built using fixed operands. All halfway modern build environments
  would allow using the mnemonics (but we can't disable the hex variants
  yet, since the binutils around at the time gcc 4.1 got released didn't
  support these yet).
  I didn't convert __vmread() yet because that would, just like for
  __vmread_safe(), imply converting to a macro so that the output
  operand can be the caller supplied variable rather than an
  intermediate one. As that would require touching all invocation points
  of __vmread() (of which there are quite a few), I'd first like to be
  certain the approach is acceptable; the main question being whether
  the now conditional code might be considered to cause future
  maintenance issues, and the second being that of parameter/argument
  ordering (here I made __vmread_safe() match __vmwrite(), but one could
  also take the position that read and write should use the inverse
  order of one another, in line with the actual instruction operands).
  Additionally I was quite puzzled to find that all the asm()-s involved
  here have memory clobbers - what are they needed for? Or can they be
  dropped at least in some cases?
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Reviewed-by: Tim Deegan <tim@xen.org>
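  A sketch of the resulting dual-form inline (HAVE_GAS_VMX stands for
  the build-time assembler probe; the hex-byte macros are the
  pre-existing fallback):

      static inline void __vmwrite(unsigned long field, unsigned long value)
      {
          asm volatile (
      #ifdef HAVE_GAS_VMX
                         "vmwrite %1, %0\n"
      #else
                         VMWRITE_OPCODE MODRM_EAX_ECX
      #endif
                         /* CF==1 or ZF==1 --> crash (ud2) */
                         "ja 1f ; ud2 ; 1:\n"
                         :
      #ifdef HAVE_GAS_VMX
                         : "r" (field) , "r" (value)
      #else
                         : "a" (field) , "c" (value)
      #endif
                         : "memory" );
      }

  Note how the mnemonic form lets the compiler pick any two registers,
  while the hex-byte form pins the operands to EAX/ECX.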
* VMX: streamline entry.S code (Jan Beulich, 2013-09-09; 2 files, -62/+33)
  - move stuff easily/better done in C into C code
  - re-arrange code paths so that no redundant GET_CURRENT() would
    remain on the fast paths
  - move long latency operations earlier
  - slightly defer disabling interrupts on the VM entry path
  - use ENTRY() instead of open coding it
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Reviewed-by: Tim Deegan <tim@xen.org>
* x86/Intel: add support for Haswell CPU models (Jan Beulich, 2013-08-27; 2 files, -2/+7)
  ... according to their most recent public documentation.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* VMX: convert EOI exit bitmap to a proper bitmap (Jan Beulich, 2013-08-27; 2 files, -32/+19)
  ... allowing bitmap operations to be used on it, making things
  consistent with struct pi_desc's pir field, and shrinking overall
  source code size.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* Nested VMX: Update APIC-v (RVI/SVI) on vmexit to L1 (Yang Zhang, 2013-08-22; 3 files, -8/+43)
  If APIC-v is enabled, all interrupts to L1 are delivered through
  APIC-v. But when L2 is running, an external interrupt causes an L1
  vmexit with reason "external interrupt", and L1 picks up the interrupt
  through vmcs12. When L1 acks the interrupt, the APIC-v hardware will
  still perform vEOI updating, since APIC-v is enabled while L1 runs.
  The problem is that the interrupt was not delivered through the APIC-v
  hardware, meaning SVI/RVI/vPPR were never set, yet the hardware
  requires them when doing the vEOI update.
  The solution is that when L1 picks up the interrupt from vmcs12, the
  hypervisor helps update SVI/RVI/vPPR, so that the subsequent vEOI and
  vPPR updates work correctly. Also, since the interrupt is delivered
  through vmcs12, the APIC-v hardware will not clear vIRR, so the
  hypervisor needs to clear it before L1 runs.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
* Nested VMX: Clear APIC-v control bit in vmcs02 (Yang Zhang, 2013-08-22; 1 file, -0/+12)
  There is no virtualized APIC-v (APIC-v for L2) support, so mask the
  APIC-v control bits when constructing vmcs02.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
* Nested VMX: Check whether interrupt is blocked by TPR (Yang Zhang, 2013-08-22; 1 file, -0/+5)
  If an interrupt is blocked by L1's TPR, L2 should not see it and
  should keep running. Add the check before L2 retrieves an interrupt.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
* VMX: add boot parameter to enable/disable APIC-v dynamically (Yang Zhang, 2013-08-13; 1 file, -2/+5)
  Add a boot parameter to enable/disable APIC-v dynamically. APIC-v is
  enabled by default; users can pass apicv=0 to disable it.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
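  The usual Xen pattern for such a switch, as a sketch (the variable
  name is illustrative):

      static bool_t __read_mostly opt_apicv_enabled = 1;
      boolean_param("apicv", opt_apicv_enabled);

  vmx_init_vmcs_config() can then decline to enable the APIC-v related
  VM-execution controls when the flag has been cleared.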
* Intel/VPMU: Add support for full-width PMC writes (Boris Ostrovsky, 2013-08-07; 1 file, -5/+36)
  A recent Linux commit (069e0c3c405814778c7475d95b9fff5318f39834) added
  support for full-width writes to performance counter registers, making
  these registers the default for perf. Since the current Xen VPMU does
  not support these new MSRs, perf will fail to initialise in guests.
  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
  Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Acked-by: Keir Fraser <keir@xen.org>
* VMX: suppress pointless indirect calls (Jan Beulich, 2013-07-17; 1 file, -10/+8)
  Get the other virtual interrupt delivery related actors in sync with
  the newly added handle_eoi() one: Clear the respective pointers (thus
  avoiding the call from generic code) when the feature is unavailable
  instead of checking feature availability in the actors.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
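  A sketch of the setup-time form (hook names as used by this series):

      if ( !cpu_has_vmx_virtual_intr_delivery )
      {
          vmx_function_table.update_eoi_exit_bitmap = NULL;
          vmx_function_table.process_isr = NULL;
          vmx_function_table.handle_eoi = NULL;
      }

  Generic code already skips hooks whose pointer is NULL, so the
  per-call feature checks inside the actors can go away.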
* VMX: fix interaction of APIC-V and Viridian emulation (Jan Beulich, 2013-07-17; 1 file, -1/+14)
  Viridian using a synthetic MSR for issuing EOI notifications bypasses
  the normal in-processor handling, which would clear
  GUEST_INTR_STATUS.SVI. Hence we need to do this in software in order
  for future interrupts to get delivered.
  Based on analysis by Yang Z Zhang <yang.z.zhang@intel.com>.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
* x86: Special case __HYPERVISOR_iret rather more when writing hypercall pages (Andrew Cooper, 2013-07-16; 1 file, -0/+3)
  In all cases when a hypercall page is written, __HYPERVISOR_iret is
  first written as a regular hypercall, then subsequently rewritten in
  its special case.
  For VMX and SVM, this means that following the ud2a instruction are 3
  bytes of an imm32 parameter. For a ring3 kernel, this means that
  following the syscall instruction is the second half of 'pop %r11'.
  For a ring1 kernel, the iret case ends up as the same number of bytes
  as the rest of the hypercalls, but it is pointless writing it twice,
  and is changed for consistency.
  Therefore, skip the loop iteration which would write the incorrect
  __HYPERVISOR_iret hypercall. This removes junk machine code from the
  tail and makes disassemblers rather more happy when looking at the
  hypercall page.
  Also, a miscellaneous whitespace fix in the comment for ring3 kernels.
  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
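  A sketch of the shared loop shape (write_stub is a hypothetical helper
  standing in for the per-ABI code emission):

      for ( i = 0; i < (PAGE_SIZE / 32); i++ )
      {
          if ( i == __HYPERVISOR_iret )
              continue;   /* leave the slot for the special case below */
          write_stub(hypercall_page + (i * 32), i);
      }
      /* ... then emit the bespoke __HYPERVISOR_iret sequence ... */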
* x86/HVM: key handler registration functions can be __init (Jan Beulich, 2013-07-10; 1 file, -1/+1)
  This applies to both SVM and VMX.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* nested vmx: Fix the booting of L2 PAE guests (Dongxiao Xu, 2013-06-27; 1 file, -12/+15)
  When doing virtual VM entry and virtual VM exit, we need to
  synchronize the PAE PDPTR related VMCS registers. With this fix, we
  can boot a 32-bit PAE L2 guest (Win7 & RHEL 6.4) in a "Xen on Xen"
  environment.
  Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
  Tested-by: Yongjie Ren <yongjie.ren@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
* x86: fix XCR0 handling (Jan Beulich, 2013-06-04; 1 file, -4/+2)
  - both VMX and SVM ignored the ECX input to XSETBV
  - both SVM and VMX used the full 64-bit RAX when calculating the input
    mask to XSETBV
  - faults on XSETBV did not get recovered from
  Also consolidate the handling for PV and HVM into a single function,
  and make the per-CPU variable "xcr0" static to xstate.c.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
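  A sketch of the consolidated handler's argument handling (per the list
  above; XCR_XFEATURE_ENABLED_MASK is XCR0's index):

      int handle_xsetbv(u32 index, u64 new_bv)
      {
          if ( index != XCR_XFEATURE_ENABLED_MASK )
              return -EINVAL;        /* only XCR0 is defined */
          /* ... validate new_bv, recover from a faulting xsetbv ... */
      }

      /* exit-handler side: index from ECX, value from EDX:EAX only */
      handle_xsetbv(regs->ecx,
                    ((uint64_t)regs->edx << 32) | (uint32_t)regs->eax);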
* Currently only a few Intel models have the VPMU workaround turned on (Boris Ostrovsky, 2013-05-31; 1 file, -11/+5)
  It appears, however, that this issue exists on more models than are
  covered by check_pmc_quirk(). Since we don't know exactly which CPUs
  are affected, we should turn this workaround on for all family 6
  processors.
  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
  Sort-of-Acked-by: "Auld, Will" <will.auld@intel.com>
* x86: handle paged gfn in wrmsr_hypervisor_regs (Olaf Hering, 2013-05-07; 1 file, -1/+10)
  If xenpaging is started very early for a guest, the gfn for the
  hypercall page may already be paged out. This leads to a guest crash:

      ...
      (XEN) HVM10: Allocated Xen hypercall page at 169ff000
      (XEN) traps.c:654:d10 Bad GMFN 169ff (MFN 3e900000000) to MSR 40000000
      (XEN) HVM10: Detected Xen v4.3
      (XEN) io.c:201:d10 MMIO emulation failed @ 0008:c2c2c2c2: 18 7c 55 6d 03 83 ff ff 10 7c
      (XEN) hvm.c:1253:d10 Triple fault on VCPU0 - invoking HVM shutdown action 1.
      (XEN) HVM11: HVM Loader
      ...

  Update the return codes of wrmsr_hypervisor_regs and update callers to
  deal with the new codes:
      0:       not handled
      1:       handled
      -EAGAIN: retry
  Currently wrmsr_hypervisor_regs will not return the following error;
  it will be added in a separate patch:
      -EINVAL: error during handling
  Also update the gdprintk to handle a page value of NULL, to avoid
  printing a bogus MFN value, and fix the computation of the MSR value
  in the gdprintk (the idx was always zero).
  Signed-off-by: Olaf Hering <olaf@aepfle.de>
  Acked-by: Keir Fraser <keir@xen.org>
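  A sketch of how an MSR-write path can consume the new return codes
  (the mapping onto emulator status values is illustrative):

      switch ( wrmsr_hypervisor_regs(regs->ecx, msr_content) )
      {
      case -EAGAIN:
          return X86EMUL_RETRY;   /* page paged out: replay the write */
      case 1:
          return X86EMUL_OKAY;    /* handled */
      case 0:
          break;                  /* not ours: try other MSR handlers */
      }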
* x86/HVM: move per-vendor function tables into .init.data (Jan Beulich, 2013-04-29; 1 file, -2/+2)
  hvm_enable() copies the table contents rather than storing the
  pointer, so there's no need to keep these tables post-boot.
  Also constify the return values of the per-vendor initialization
  functions, making clear that once the per-vendor initialization is
  complete, the vendor specific tables won't get modified anymore.
  Finally, in hvm_enable(), use the returned pointer for all read
  accesses as being more efficient than global variable accesses. Writes
  of course still need to go to the global variable.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
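  A sketch of the arrangement on the VMX side:

      static struct hvm_function_table __initdata vmx_function_table = {
          .name = "VMX",
          /* ... */
      };

      const struct hvm_function_table * __init start_vmx(void)
      {
          /* ... */
          return &vmx_function_table;
      }

      /* hvm_enable(): copy the contents, read via 'fns' from here on */
      fns = start_vmx();
      hvm_funcs = *fns;
      printk("HVM: %s enabled\n", fns->name);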
* VMX: adjust correct table when there's no posted interrupt support (Jan Beulich, 2013-04-19; 1 file, -2/+2)
  The caller of start_vmx() will overwrite hvm_funcs, so there's no
  point in adjusting that table in start_vmx().
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
* VMX: Add posted interrupt support (Yang Zhang, 2013-04-18; 1 file, -0/+62)
  Add support for using posted interrupts to deliver interrupts.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
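  The descriptor at the heart of the feature (layout as introduced by
  this series; the 64-byte alignment is an architectural requirement):

      struct pi_desc {
          DECLARE_BITMAP(pir, NR_VECTORS); /* posted-interrupt requests */
          u32 control;                     /* bit 0: outstanding notification */
          u32 rsvd[7];
      } __attribute__ ((aligned (64)));

  To post a vector: set its bit in pir, set the notification bit in
  control, and send the notification vector to the target CPU, which
  then moves pir into vIRR without causing a VM exit.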
* VMX: Turn on posted interrupt bit in vmcs (Yang Zhang, 2013-04-18; 2 files, -0/+11)
  Turn on posted interrupts for a vCPU if the feature is available.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
* VMX: Detect posted interrupt capability (Yang Zhang, 2013-04-18; 1 file, -1/+11)
  Check whether the hardware supports the posted interrupt capability.
  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
* x86/VPMU: Save/restore VPMU only when necessary (Boris Ostrovsky, 2013-04-15; 2 files, -5/+22)
  The VPMU doesn't always need to be saved during a context switch. If
  we are coming back to the same processor and no other vCPU has run
  here, we can simply continue running. This is especially useful on
  Intel processors, where the Global Control MSR is stored in the VMCS,
  so we don't have to stop the counters during the save operation. On
  AMD we need to explicitly stop the counters, but we don't need to
  save them.
  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
  Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
* x86/VPMU: Factor out VPMU common code (Boris Ostrovsky, 2013-04-15; 1 file, -29/+1)
  Factor out common code from SVM and VMX into VPMU.
  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
  Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* x86/VPMU: Add Haswell support (Boris Ostrovsky, 2013-04-15; 1 file, -0/+1)
  Initialize VPMU on Haswell CPUs.
  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
  Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
* vpmu intel: Dump vpmu infos in 'q' keyhandler (Dietmar Hahn, 2013-04-08; 1 file, -3/+60)
  This patch extends the printout of the vCPU info by the 'q'
  keyhandler. If vPMU is enabled and active on a vCPU, lines like the
  following are printed (when running an HVM openSUSE 12.3 guest with
  'perf top'):

      (XEN) vPMU running
      (XEN) general_0: 0x000000ffffff3ae1 ctrl: 0x000000000053003c
      (XEN) fixed_1: 0x000000ff90799188 ctrl: 0xb

  This means general counter 0 and fixed counter 1 are running, showing
  their contents and the contents of their configuration MSRs.
  Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* vpmu intel: Use PMU defines instead of numerals and bit masks (Dietmar Hahn, 2013-04-08; 1 file, -24/+37)
  This patch uses the new defines in Intel vPMU to replace existing
  numerals and bit masks.
  Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* vpmu intel: Better names and replacing numerals with defines (Dietmar Hahn, 2013-04-08; 1 file, -21/+21)
  This patch renames core2_counters to core2_fix_counters for a better
  understanding of the code, and substitutes two numerals with defines
  in the fixed counter handling.
  Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* VMX: Always disable SMEP when guest is in non-paging mode (Stefan Bader, 2013-04-04; 1 file, -2/+5)
  Commit e7dda8ec9fc9020e4f53345cdbb18a2e82e54a65 ("VMX: disable SMEP
  feature when guest is in non-paging mode") disabled the SMEP bit if a
  guest VCPU was using HAP and was not in paging mode. However I could
  observe VCPUs getting stuck in the trampoline after the following
  patch in the Linux kernel changed the way CR4 gets set up:
  "x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline"
  The change will set CR4 from already set flags, which includes the
  SMEP bit. On bare metal this does not matter, as the CPU is in
  non-paging mode at that time. But Xen seems to use the emulated
  non-paging mode regardless of HAP (I verified that on the guests where
  I was seeing the issue, HAP was not used). Therefore it seems right to
  unset the SMEP bit for a VCPU that is not in paging mode, regardless
  of its HAP usage.
  Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: Dongxiao Xu <dongxiao.xu@intel.com>
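  The resulting check, as a sketch:

      /* vmx_update_guest_cr(), CR4 handling: */
      if ( !hvm_paging_enabled(v) )
          /* SMEP is meaningless without paging; hide it from hardware
           * while (emulated) non-paging mode is in effect. */
          v->arch.hvm_vcpu.hw_cr[4] &= ~X86_CR4_SMEP;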
* vpmu intel: Add cpuid handling when vpmu disabled (Dietmar Hahn, 2013-03-26; 1 file, -0/+60)
  Even though vPMU is disabled in the hypervisor, inside an HVM guest a
  call to cpuid(0xa) returns information about usable performance
  counters. This may confuse guest software into trying to use the
  counters, whereupon nothing happens. This patch clears most bits in
  registers EAX and EDX of the cpuid(0xa) instruction for the guest when
  vPMU is disabled:
  - version ID of architectural performance counting
  - number of general pmu registers
  - width of general pmu registers
  - number of fixed pmu registers
  - width of fixed pmu registers
  Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Acked-by: Keir Fraser <keir@xen.org>
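  A sketch of the masking, following the SDM layout of leaf 0xa
  (EAX[7:0] version, EAX[15:8] count and EAX[23:16] width of general
  counters; EDX[4:0] count and EDX[12:5] width of fixed counters; the
  predicate name is hypothetical):

      case 0xa:
          if ( vpmu_disabled )
          {
              *eax &= ~(0xff | (0xff << 8) | (0xff << 16));
              *edx &= ~(0x1f | (0xff << 5));
          }
          break;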
* vpmu intel: pass through cpuid bits when BTS is enabled (Dietmar Hahn, 2013-03-12; 1 file, -0/+4)
  This patch passes the original cpuid bits for X86_FEATURE_DTES64
  (64-bit DS area) and X86_FEATURE_DSCPL (CPL-qualified debug store)
  through to the guest when the BTS feature is switched on. I forgot
  this when I did the BTS emulation.
  Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
* x86/vPMU: change Intel model numbers from decimal to hex (Konrad Rzeszutek Wilk, 2013-03-08; 1 file, -14/+14)
  Suggested-by: "Nakajima, Jun" <jun.nakajima@intel.com>
  Suggested-by: Jan Beulich <JBeulich@suse.com>
  Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* x86/vPMU: add missing Merom, Westmere, and Nehalem models (Konrad Rzeszutek Wilk, 2013-03-08; 1 file, -2/+13)
  Mainly 22 (Merom-L); 30 (Nehalem); and 37 and 44 (Westmere). A
  comprehensive list is available at:
  http://software.intel.com/en-us/articles/intel-architecture-and-processor-identification-with-cpuid-model-and-family-numbers
  Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* x86/vPMU: provide comments for which Intel model is what (Konrad Rzeszutek Wilk, 2013-03-08; 1 file, -10/+10)
  Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Acked-by: Tim Deegan <tim@xen.org>
* x86: don't rely on __softirq_pending to be the first field in irq_cpustat_t (Jan Beulich, 2013-03-04; 1 file, -2/+2)
  This is even more so as the field doesn't have a comment to that
  effect in the structure definition.
  Once modifying the respective assembly code, also convert the
  IRQSTAT_shift users to do a 32-bit shift only (as we won't support 48M
  CPUs any time soon) and use "cmpl" instead of "testl" when checking
  the field (both reducing code size).
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* vmx: fix handling of NMI VMEXIT (Tim Deegan, 2013-02-28; 1 file, -1/+8)
  Call do_nmi() directly and explicitly re-enable NMIs rather than
  raising an NMI through the APIC. Since NMIs are disabled after the
  VMEXIT, the raised NMI would be blocked until the next IRET
  instruction (i.e. the next real interrupt, or after scheduling a PV
  guest) and in the meantime the guest will spin taking NMI VMEXITs.
  Also, handle NMIs before re-enabling interrupts, since if we handle an
  interrupt (and therefore IRET) before calling do_nmi(), we may end up
  running the NMI handler with NMIs enabled.
  Signed-off-by: Tim Deegan <tim@xen.org>
  Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Jan Beulich <jbeulich@suse.com>
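  The fixed ordering, as a sketch (the NMI-detection condition is
  elided):

      if ( exit_reason_is_nmi )       /* placeholder condition */
      {
          do_nmi(regs);    /* NMIs are still blocked after the VMEXIT */
          enable_nmis();   /* self-contained IRET stub re-opens the
                            * NMI window */
      }
      local_irq_enable(); /* only now re-enable ordinary interrupts */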
* x86/nhvm: properly clean up after failure to set up all vCPU-s (Jan Beulich, 2013-02-22; 1 file, -14/+16)
  Otherwise we may leak memory when setting up nHVM fails half way.
  This implies that the individual destroy functions will have to remain
  capable (in the VMX case they first need to be made so, following
  26486:7648ef657fe7 and 26489:83a3fa9c8434) of being called for a vCPU
  that the corresponding init function was never run on.
  Once at it, also remove a redundant check from the corresponding
  parameter validation code.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Tim Deegan <tim@xen.org>
  Tested-by: Olaf Hering <olaf@aepfle.de>
* Fix emacs local variable block to use correct C style variable (David Vrabel, 2013-02-21; 3 files, -3/+3)
  The emacs variable to set the C style from a local variable block is
  c-file-style, not c-set-style.
  Signed-off-by: David Vrabel <david.vrabel@citrix.com>
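  For reference, the corrected block as it appears at the bottom of
  Xen's C files:

      /*
       * Local variables:
       * mode: C
       * c-file-style: "BSD"
       * c-basic-offset: 4
       * tab-width: 4
       * indent-tabs-mode: nil
       * End:
       */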
* x86/VMX: fix VMCS setting for x2APIC mode guest while enabling APICV (Jiongxi Li, 2013-02-18; 2 files, -25/+104)
  The "APIC-register virtualization" and "virtual-interrupt delivery"
  VM-execution controls have no effect on the behavior of RDMSR/WRMSR if
  the "virtualize x2APIC mode" VM-execution control is 0. When the guest
  uses x2APIC mode, we should therefore enable "virtualize x2APIC mode"
  for APICV first.
  Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
  Acked-by: Eddie Dong <eddie.dong@intel.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>