path: root/xen/include/asm-x86/hvm
* x86/HVM: cache emulated instruction for retry processing (Jan Beulich, 2013-10-14; 1 file, -0/+3)

  Rather than re-reading the instruction bytes upon retry processing, stash away and re-use what we already read. That way we can be certain that the retry won't do something different from what requested the retry, getting once again closer to real hardware behavior (where what we use retries for is simply a bus operation, not involving redundant decoding of instructions).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>
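  A minimal sketch of the caching idea (struct and field names are illustrative, not Xen's actual per-vCPU layout): the first pass stashes the fetched bytes, and a retry replays them instead of re-reading guest memory.

      #include <stdint.h>
      #include <string.h>
      #include <stdbool.h>

      /* Hypothetical per-vCPU cache of the instruction under emulation. */
      struct insn_cache {
          unsigned int len;      /* 0 => nothing cached */
          uint8_t bytes[16];     /* maximum x86 instruction length */
      };

      /* Fetch instruction bytes, replaying the cached copy on retry so
       * the retried emulation cannot decode something different. */
      static int insn_fetch(struct insn_cache *c, bool retry,
                            const uint8_t *guest_mem, uint8_t *buf,
                            unsigned int len)
      {
          if ( retry && len <= c->len )
          {
              memcpy(buf, c->bytes, len);
              return 0;
          }
          memcpy(buf, guest_mem, len);  /* stands in for a guest read */
          memcpy(c->bytes, buf, len);
          c->len = len;
          return 0;
      }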
* x86/HVM: fix direct PCI port I/O emulation retry and error handling (Jan Beulich, 2013-10-14; 1 file, -0/+5)

  dpci_ioport_{read,write}() guest memory access failure handling should be modelled after process_portio_intercept()'s (and others): upon encountering an error on other than the first iteration, the count successfully handled needs to be stored and X86EMUL_OKAY returned, in order for the generic instruction emulator to update register state correctly before reporting failure or retrying (both of which would only happen after re-invoking emulation).

  Further we leverage (and slightly extend, due to the above mentioned need to return X86EMUL_OKAY) the "large MMIO" retry model.

  Note that there is still a special case not explicitly taken care of here: while the first retry on the last iteration of a "rep ins" correctly recovers the already read data, an eventual subsequent retry is being handled by the pre-existing mmio-large logic (through hvmemul_do_io() storing the [recovered] data [again], also taking into consideration that the emulator converts a single iteration "ins" to ->read_io() plus ->write()).

  Also fix an off-by-one in the mmio-large-read logic, and slightly simplify the copying of the data.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>
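  The error-handling convention described above, as a hedged sketch (identifiers are illustrative): a fault after the first iteration is reported as success with the completed count, so the emulator can commit register state before the failure or retry surfaces.

      #define X86EMUL_OKAY  0   /* illustrative values */
      #define X86EMUL_RETRY 2

      /* one_iteration() performs a single rep; on a fault after the
       * first iteration, return X86EMUL_OKAY with *reps = done. */
      static int emulate_rep(unsigned long *reps,
                             int (*one_iteration)(unsigned long i))
      {
          unsigned long i, want = *reps;
          int rc = X86EMUL_OKAY;

          for ( i = 0; i < want; i++ )
          {
              rc = one_iteration(i);
              if ( rc != X86EMUL_OKAY )
                  break;
          }
          if ( rc != X86EMUL_OKAY && i != 0 )
          {
              *reps = i;          /* report the part that succeeded */
              rc = X86EMUL_OKAY;  /* failure surfaces on re-invocation */
          }
          return rc;
      }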
* VMX: clean up capability checks (Jan Beulich, 2013-10-04; 1 file, -2/+8)

  VMCS size validation on APs should check against the BP's size.

  No need for a separate cpu_has_vmx_ins_outs_instr_info variable anymore. Use proper symbolics.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* Nested VMX: check VMX capability before reading VMX related MSRs (Yang Zhang, 2013-10-04; 1 file, -0/+2)

  VMX MSRs are only available when the CPU supports the VMX feature. In addition, the VMX_TRUE* MSRs are only available when bit 55 of the VMX_BASIC MSR is set.

  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>

  Cleanup.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* x86/AMD-Vi: Fix IVRS HPET special->handle override (Suravee Suthikulpanit, 2013-09-30; 1 file, -2/+5)

  The current logic does not handle the case when the HPET special->handle is invalid in the IVRS. On such a system the following message is shown:

      (XEN) AMD-Vi: Failed to setup HPET MSI remapping: Wrong HPET

  This patch allows ivrs_hpet[<handle>]=<sbdf> to override the IVRS. Also, it removes struct hpet_sbdf.iommu since it is not used anywhere in the code.

  Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
* VMX: drop memory clobbers from vmread/vmwrite wrappers (Jan Beulich, 2013-09-23; 1 file, -9/+6)

  All effects are properly described by the asm() constraints.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
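  A hedged sketch of such a wrapper (not Xen's exact code): VMWRITE only consumes its two operands and sets flags on failure, so input constraints plus a "cc" clobber describe all of its effects and no "memory" clobber is needed.

      /* AT&T operand order: vmwrite <value>, <field-in-register>. */
      static inline void my_vmwrite(unsigned long field, unsigned long value)
      {
          asm volatile ( "vmwrite %1, %0"
                         : /* no outputs */
                         : "r" (field), "rm" (value)
                         : "cc" );
      }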
* VMX: also use proper instruction mnemonic for VMREAD (Jan Beulich, 2013-09-23; 1 file, -8/+13)

  ... when the assembler supports it, following commit cfd54835 ("VMX: use proper instruction mnemonics if assembler supports them"). This merely got split off from the earlier change because of the significant number of call sites needing to be changed.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
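  The pattern, sketched under assumptions (HAVE_AS_VMX stands in for whatever symbol the build system defines when the assembler knows the VMX mnemonics, and the opcode bytes encode vmread with fixed %rax/%rcx operands):

      #ifdef HAVE_AS_VMX
      # define VMREAD(field, valp)                                \
          asm volatile ( "vmread %1, %0"                          \
                         : "=rm" (*(valp)) : "r" (field) : "cc" )
      #else
      # define VMREAD(field, valp)                                \
          asm volatile ( ".byte 0x0f, 0x78, 0xc1" /* vmread %rax,%rcx */ \
                         : "=c" (*(valp)) : "a" (field) : "cc" )
      #endif

  The hex fallback pins the operands to fixed registers, which is exactly the flexibility the mnemonic form hands back to the compiler.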
* hvm/vpmu: Prevent dump handlers from incorrectly mutating state (Andrew Cooper, 2013-09-16; 1 file, -1/+1)

  Discovered by Coverity, CID 1055181.

  core2_vpmu_dump() was incorrectly setting VPMU_CONTEXT_LOADED when it was intending to check for it.

  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

  This would have been avoided if the dump function declared all its pointers "const" - doing this now (also in SVM). Also fixing some indentation issues at once.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
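  The bug class in miniature (names are hypothetical): setting a flag where a test was meant. A const-qualified pointer makes the compiler reject the accidental mutation outright.

      #define VPMU_CONTEXT_LOADED (1u << 0)  /* illustrative flag bit */

      static void dump_if_loaded(const unsigned int *flags)
      {
          /* "if ( *flags = VPMU_CONTEXT_LOADED )" would both mutate
           * state and always be true; through a const pointer it no
           * longer compiles. The intended test: */
          if ( *flags & VPMU_CONTEXT_LOADED )
          {
              /* ... dump the loaded context ... */
          }
      }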
* console: buffer and show origin of guest PV writes (Daniel De Graaf, 2013-09-10; 1 file, -6/+0)

  Guests other than domain 0 using the console output have previously been controlled by the VERBOSE #define, but with no designation of which guest's output was on the console. This patch converts the HVM output buffering to be used by all domains except the hardware domain (dom0): stripping non-printable characters, line buffering the output, and prefixing it with the domain ID. This is especially useful for debugging stub domains during early boot.

  Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/xsave: fix migration from xsave-capable to xsave-incapable host (Jan Beulich, 2013-09-09; 1 file, -1/+1)

  With CPUID features suitably masked this is supposed to work, but was completely broken (i.e. the case wasn't even considered when the original xsave save/restore code was written).

  First of all, xsave_enabled() wrongly returned the value of cpu_has_xsave, i.e. not even taking into consideration attributes of the vCPU in question. Instead this function ought to check whether the guest ever enabled xsave support (by writing a [non-zero] value to XCR0). As a result of this, a vCPU's xcr0 and xcr0_accum must no longer be initialized to XSTATE_FP_SSE (since that's a valid value a guest could write to XCR0), and the xsave/xrstor as well as the context switch code need to suitably account for this (by always enforcing at least this part of the state to be saved/loaded).

  This involves undoing large parts of c/s 22945:13a7d1f7f62c ("x86: add strictly sanity check for XSAVE/XRSTOR") - we need to cleanly distinguish between hardware capabilities and vCPU used features.

  Next both HVM and PV save code needed tweaking to not always save the full state supported by the underlying hardware, but just the parts that the guest actually used.

  Similarly the restore code should bail not just on state being restored that the hardware cannot handle, but also on inconsistent save state (inconsistent XCR0 settings or size of saved state not in line with XCR0).

  And finally the PV extended context get/set code needs to use slightly different logic than the HVM one, as here we can't just key off of xsave_enabled() (i.e. avoid doing anything if a guest doesn't use xsave) because the tools use this function to determine host capabilities as well as read/write vCPU state. The set operation in particular needs to be capable of cleanly dealing with input that consists of only the xcr0 and xcr0_accum values (if they're both zero then no further data is required).

  While for things to work correctly both sides (saving _and_ restoring host) need to run with the fixed code, afaict no breakage should occur if either side isn't up to date (other than the breakage that this patch attempts to fix).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
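  A hedged sketch of the corrected predicate (field names are only illustrative of the vCPU state involved): the question is whether this vCPU ever enabled xsave, not whether the host supports it.

      #include <stdint.h>
      #include <stdbool.h>

      struct vcpu_xstate {
          uint64_t xcr0;        /* last value the guest wrote to XCR0 */
          uint64_t xcr0_accum;  /* every feature bit the guest ever set */
      };

      /* Wrong: returning the host's cpu_has_xsave ignores the vCPU.
       * Right: a guest "uses xsave" iff it ever wrote a non-zero XCR0. */
      static bool xsave_enabled(const struct vcpu_xstate *v)
      {
          return v->xcr0_accum != 0;
      }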
* VMX: use proper instruction mnemonics if assembler supports them (Jan Beulich, 2013-09-09; 1 file, -20/+70)

  With the hex byte emission we were taking away a good part of flexibility from the compiler, as for simplicity reasons these were built using fixed operands. All halfway modern build environments would allow using the mnemonics (but we can't disable the hex variants yet, since the binutils around at the time gcc 4.1 got released didn't support these yet).

  I didn't convert __vmread() yet because that would, just like for __vmread_safe(), imply converting to a macro so that the output operand can be the caller supplied variable rather than an intermediate one. As that would require touching all invocation points of __vmread() (of which there are quite a few), I'd first like to be certain the approach is acceptable; the main question being whether the now conditional code might be considered to cause future maintenance issues, and the second being that of parameter/argument ordering (here I made __vmread_safe() match __vmwrite(), but one could also take the position that read and write should use the inverse order of one another, in line with the actual instruction operands).

  Additionally I was quite puzzled to find that all the asm()-s involved here have memory clobbers - what are they needed for? Or can they be dropped at least in some cases?

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Reviewed-by: Tim Deegan <tim@xen.org>
* VMX: move various uses of UD2 out of fast paths (Jan Beulich, 2013-09-09; 1 file, -6/+19)

  ... at once making conditional forward jumps, which are statically predicted to be not taken, only used for the unlikely (error) cases.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Reviewed-by: Tim Deegan <tim@xen.org>
* VMX: streamline entry.S code (Jan Beulich, 2013-09-09; 1 file, -0/+15)

  - move stuff easily/better done in C into C code
  - re-arrange code paths so that no redundant GET_CURRENT() would remain on the fast paths
  - move long latency operations earlier
  - slightly defer disabling interrupts on the VM entry path
  - use ENTRY() instead of open coding it

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Reviewed-by: Tim Deegan <tim@xen.org>
* AMD IOMMU: allow command line overrides for broken IVRS tables (Jan Beulich, 2013-08-29; 1 file, -0/+1)

  With there being so many systems with broken ACPI tables, and with it generally being known what's wrong with those tables, give people a handle to overcome the resulting disabling of their IOMMUs.

  Inspired by Linux side patches providing similar functionality.

  Suggested-by: Sander Eikelenboom <linux@eikelenboom.it>
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: Suravee Suthikulpanit <suravee.suthikulapanit@amd.com>
* VMX: convert EOI exit bitmap to a proper bitmap (Jan Beulich, 2013-08-27; 1 file, -9/+3)

  ... allowing bitmap operations to be used on it, making things consistent with struct pi_desc's pir field, and shrinking overall source code size.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
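  What "a proper bitmap" buys, sketched with generic idioms (Xen has its own DECLARE_BITMAP/set_bit machinery; this equivalent is illustrative):

      #include <limits.h>

      #define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
      #define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

      /* One bit per vector, instead of open-coded uint64_t quarters. */
      static unsigned long eoi_exit_bitmap[BITS_TO_LONGS(256)];

      static void eoi_exitmap_set(unsigned int vector)
      {
          eoi_exit_bitmap[vector / BITS_PER_LONG] |=
              1UL << (vector % BITS_PER_LONG);
      }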
* Nested VMX: Update APIC-v (RVI/SVI) when vmexit to L1 (Yang Zhang, 2013-08-22; 3 files, -2/+4)

  If APIC-v is enabled, all interrupts to L1 are delivered through APIC-v. But when L2 is running, an external interrupt will cause an L1 vmexit with reason "external interrupt", and L1 will pick up the interrupt through vmcs12. When L1 acks the interrupt, the APIC-v hardware will still do vEOI updating, since APIC-v is enabled while L1 is running. The problem is that the interrupt was not delivered through the APIC-v hardware, which means SVI/RVI/vPPR are not set, yet the hardware requires them when doing the vEOI update. The solution is that, when L1 picks up the interrupt from vmcs12, the hypervisor helps update SVI/RVI/vPPR to make sure the subsequent vEOI and vPPR updates work correctly. Also, since the interrupt is delivered through vmcs12, the APIC-v hardware will not clear vIRR, and the hypervisor needs to clear it before L1 runs.

  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
* x86/AMD: Inject #GP instead of #UD when unable to map vmcb (Suravee Suthikulpanit, 2013-08-13; 1 file, -1/+1)

  According to AMD Programmer's Manual Vol. 2, vmrun, vmsave and vmload should inject #GP instead of #UD when unable to access the memory location for the vmcb. Also, the code should make sure that the L1 guest's EFER.SVME is not zero. Otherwise, #UD should be injected.

  Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
  Reviewed-by: Tim Deegan <tim@xen.org>
* x86/AMD: Fix nested svm crash due to assertion in __virt_to_maddr (Suravee Suthikulpanit, 2013-08-13; 1 file, -4/+7)

  Fix the assertion in __virt_to_maddr when starting a nested SVM guest in debug mode. Investigation has shown that svm_vmsave/svm_vmload make use of __pa() with an invalid address.

  Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
  Reviewed-by: Tim Deegan <tim@xen.org>
* VMX: fix interaction of APIC-V and Viridian emulation (Jan Beulich, 2013-07-17; 1 file, -0/+1)

  Viridian using a synthetic MSR for issuing EOI notifications bypasses the normal in-processor handling, which would clear GUEST_INTR_STATUS.SVI. Hence we need to do this in software in order for future interrupts to get delivered.

  Based on analysis by Yang Z Zhang <yang.z.zhang@intel.com>.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
* AMD IOMMU: allocate IRTE entries instead of using a static mapping (Jan Beulich, 2013-07-16; 2 files, -7/+7)

  For multi-vector MSI, where we surely don't want to allocate contiguous vectors and want to be able to set the affinities of the individual vectors separately, we need to drop the use of the tuple of vector and delivery mode to determine the IRTE to use, and instead allocate IRTEs (which imo should have been done from the beginning).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
* AMD IOMMU: use ioremap() (Jan Beulich, 2013-07-15; 1 file, -3/+0)

  There's no point in using the fixmap here, and it gets map_iommu_mmio_region() in line with unmap_iommu_mmio_region(), which was already using iounmap() (thus crashing if actually used).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
* x86: drop MAX_VECTOR definition (Jan Beulich, 2013-07-04; 1 file, -2/+0)

  ... in favor of NR_VECTORS, as being redundant and as the latter is correct in terms of its naming, while the former is off by one.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
* iommu/amd: Fix logic for clearing the IOMMU interrupt bits (Suravee Suthikulpanit, 2013-07-02; 1 file, -5/+17)

  The IOMMU interrupt bits in the IOMMU status registers are "read-only, write-1-to-clear" (RW1C). Therefore, the existing logic, which reads the register, sets the bit, and then writes back the value, could accidentally clear certain bits if they happened to be set. The correct logic is to write a value with only the interrupt bits set, leaving the rest zero.

  This patch also cleans up the #define masks as Jan has suggested.

  Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

  With iommu_interrupt_handler() properly having got switched its readl() from status to control register, the subsequent writel() needed to be switched too (and the RW1C comment there was bogus). Some of the cleanup went too far - undone.

  Further, with iommu_interrupt_handler() now actually disabling the interrupt sources, they also need to get re-enabled by the tasklet once it finished processing the respective log. This also implies re-running the tasklet so that log entries added between reading the log and re-enabling the interrupt will get handled in a timely manner.

  Finally, guest write emulation to the status register needs to be done with the RW1C (and RO for all other bits) semantics in mind too.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Tim Deegan <tim@xen.org>
  Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
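  The RW1C pitfall in miniature (register layout and bit position are hypothetical): a read-modify-write acknowledges every interrupt bit that happens to be set, not just the intended one.

      #include <stdint.h>

      #define EVT_LOG_INT (1u << 1)   /* hypothetical interrupt bit */

      static void ack_event_log_irq(volatile uint32_t *status_reg)
      {
          /* Wrong for an RW1C register: writing back the read value
           * clears every bit that was set, losing other interrupts:
           *
           *     *status_reg = *status_reg | EVT_LOG_INT;
           *
           * Right: write only the bit(s) being acknowledged, leaving
           * all other positions zero. */
          *status_reg = EVT_LOG_INT;
      }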
* x86: fix XCR0 handling (Jan Beulich, 2013-06-04; 1 file, -1/+1)

  - both VMX and SVM ignored the ECX input to XSETBV
  - both SVM and VMX used the full 64-bit RAX when calculating the input mask to XSETBV
  - faults on XSETBV did not get recovered from

  Also consolidate the handling for PV and HVM into a single function, and make the per-CPU variable "xcr0" static to xstate.c.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
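  The corrected input handling from the list above, in sketch form (the register-file struct is illustrative): XSETBV takes the XCR index from ECX and the value from EDX:EAX, so only the low 32 bits of RAX and RDX may be used.

      #include <stdint.h>

      struct cpu_regs { uint64_t rax, rcx, rdx; };  /* illustrative */

      static int handle_xsetbv(const struct cpu_regs *regs,
                               int (*set_xcr)(uint32_t idx, uint64_t val))
      {
          uint32_t index = (uint32_t)regs->rcx;          /* was ignored */
          uint64_t value = ((uint64_t)(uint32_t)regs->rdx << 32) |
                           (uint32_t)regs->rax;          /* not full RAX */

          /* set_xcr() is assumed to catch a faulting XSETBV and return
           * an error rather than crashing. */
          return set_xcr(index, value);
      }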
* x86/HVM: properly handle RTC periodic timer even when !RTC_PIE (Jan Beulich, 2013-05-21; 1 file, -1/+1)

  Since in that case the processing in pt_intr_post() won't occur, we need to do some additional work in pt_update_irq(). Additionally we must not pay attention to the respective IRQ being masked.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Tested-by: Roger Pau Monné <roger.pau@citrix.com> (FreeBSD guest)
* x86/HVM: move per-vendor function tables into .init.data (Jan Beulich, 2013-04-29; 1 file, -2/+2)

  hvm_enable() copies the table contents rather than storing the pointer, so there's no need to keep these tables post-boot.

  Also constify the return values of the per-vendor initialization functions, making clear that once the per-vendor initialization is complete, the vendor specific tables won't get modified anymore.

  Finally, in hvm_enable(), use the returned pointer for all read accesses as being more efficient than global variable accesses. Writes of course still need to go to the global variable.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
* x86/HVM: Call vlapic_set_irq() to deliver virtual interrupts (Yang Zhang, 2013-04-18; 1 file, -1/+1)

  Move kick_vcpu into vlapic_set_irq(), and call it to deliver virtual interrupts instead of setting vIRR directly.

  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
* VMX: Add posted interrupt support (Yang Zhang, 2013-04-18; 3 files, -0/+42)

  Add support for using posted interrupts to deliver interrupts.

  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
* VMX: Turn on posted interrupt bit in vmcs (Yang Zhang, 2013-04-18; 2 files, -0/+9)

  Turn on posted interrupts for a vcpu if the posted interrupt capability is available.

  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
* VMX: Detect posted interrupt capability (Yang Zhang, 2013-04-18; 1 file, -0/+6)

  Check whether the hardware supports the posted interrupt capability.

  Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
  Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
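  A sketch of that check (the bit position follows the Intel SDM's pin-based VM-execution controls; the MSR-splitting convention is the usual one for VMX capability MSRs): a control can be enabled iff its bit is set in the high, "allowed-1" half of IA32_VMX_PINBASED_CTLS.

      #include <stdint.h>

      #define PIN_BASED_POSTED_INTERRUPT (1u << 7)

      /* msr holds the raw 64-bit IA32_VMX_PINBASED_CTLS value. */
      static int cpu_has_posted_intr(uint64_t msr)
      {
          return !!((uint32_t)(msr >> 32) & PIN_BASED_POSTED_INTERRUPT);
      }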
* x86/VPMU: Save/restore VPMU only when necessary (Boris Ostrovsky, 2013-04-15; 1 file, -4/+10)

  The VPMU doesn't always need to be saved during a context switch. If we are coming back to the same processor and no other vCPU has run here, we can simply continue running. This is especially useful on Intel processors, where the Global Control MSR is stored in the VMCS, thus not requiring us to stop the counters during the save operation. On AMD we need to explicitly stop the counters, but we don't need to save them.

  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
  Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
* x86/VPMU: Factor out VPMU common code (Boris Ostrovsky, 2013-04-15; 2 files, -1/+1)

  Factor out common code from SVM and VMX into VPMU.

  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
  Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* x86/AMD: Allow more fine-grained control of VMCB MSR Permission Map (Boris Ostrovsky, 2013-04-15; 1 file, -2/+6)

  Currently the VMCB's MSRPM can be updated to either intercept both reads and writes to an MSR, or to intercept neither. In some cases we may want to be more selective and intercept one but not the other.

  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
  Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
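  A hedged sketch of the finer-grained interface (the two-bits-per-MSR idea mirrors AMD's MSRPM scheme, but the slot calculation is simplified; real MSRPMs map MSR ranges to offsets first):

      #include <stdint.h>

      enum msr_intercept {
          MSR_INTERCEPT_NONE  = 0,
          MSR_INTERCEPT_READ  = 1,
          MSR_INTERCEPT_WRITE = 2,
          MSR_INTERCEPT_RW    = MSR_INTERCEPT_READ | MSR_INTERCEPT_WRITE,
      };

      /* Each MSR slot owns two adjacent bits: bit 2n intercepts reads,
       * bit 2n+1 intercepts writes. */
      static void set_msr_intercept(uint8_t *msrpm, unsigned int slot,
                                    enum msr_intercept flags)
      {
          unsigned int rd = slot * 2, wr = rd + 1;

          if ( flags & MSR_INTERCEPT_READ )
              msrpm[rd / 8] |= 1u << (rd % 8);
          else
              msrpm[rd / 8] &= ~(1u << (rd % 8));

          if ( flags & MSR_INTERCEPT_WRITE )
              msrpm[wr / 8] |= 1u << (wr % 8);
          else
              msrpm[wr / 8] &= ~(1u << (wr % 8));
      }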
* IOMMU: allow MSI message to IRTE propagation to fail (Jan Beulich, 2013-04-15; 1 file, -1/+1)

  With the need to allocate multiple contiguous IRTEs for multi-vector MSI, the chance of failure here increases. While on the AMD side there's no allocation of IRTEs at present at all (and hence no way for this allocation to fail, which is going to change with a later patch in this series), VT-d already ignores an eventual error here, which this patch fixes.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: "Zhang, Xiantao" <xiantao.zhang@intel.com>
* vpmu intel: Dump vpmu infos in 'q' keyhandler (Dietmar Hahn, 2013-04-08; 1 file, -0/+2)

  This patch extends the printout of the vCPU info in the 'q' keyhandler. If the vPMU is enabled and active on the vCPU, lines like these are printed (when running HVM openSUSE 12.3 with 'perf top'):

      (XEN) vPMU running
      (XEN) general_0: 0x000000ffffff3ae1 ctrl: 0x000000000053003c
      (XEN) fixed_1:   0x000000ff90799188 ctrl: 0xb

  This means general counter 0 and fixed counter 1 are running, showing their contents and the contents of their configuration MSRs.

  Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* vpmu intel: Better names and replacing numerals with defines (Dietmar Hahn, 2013-04-08; 1 file, -3/+8)

  This patch renames core2_counters to core2_fix_counters to make the code easier to understand, and substitutes two numerals with defines in the fixed counter handling.

  Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
  Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
* x86/hvm: Let the guest miss a few ticks before resetting the timer. (Tim Deegan, 2013-03-29; 1 file, -0/+1)

  Signed-off-by: Tim Deegan <tim@xen.org>
  Acked-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/hvm: Avoid needlessly resetting the periodic timer. (Tim Deegan, 2013-03-29; 1 file, -5/+5)

  Signed-off-by: Tim Deegan <tim@xen.org>
  Acked-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/hvm: Run the RTC periodic timer on a consistent time series. (Tim Deegan, 2013-03-29; 1 file, -1/+2)

  When the RTC periodic timer gets restarted, align it to the VM's boot time, not to whatever time it is now. Otherwise every read of REG_C will restart the current period.

  Signed-off-by: Tim Deegan <tim@xen.org>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: Jan Beulich <jbeulich@suse.com>
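  The alignment arithmetic, sketched: compute the next expiry on a fixed series anchored at VM start, so a restart (e.g. on a REG_C read) lands on the existing grid instead of beginning a fresh period at "now".

      #include <stdint.h>

      /* All values in the same unit, e.g. nanoseconds of guest time. */
      static uint64_t next_tick(uint64_t vm_start, uint64_t now,
                                uint64_t period)
      {
          /* First point strictly after 'now' on the series
           * vm_start + k * period. */
          return vm_start + ((now - vm_start) / period + 1) * period;
      }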
* hvm: Improve APIC INIT/SIPI emulation, fixing it for call paths other than x86_emulate() (Keir Fraser, 2013-03-28; 1 file, -3/+2)

  In particular, on broadcast/multicast INIT/SIPI, we handle all target APICs at once in a single invocation of the init/sipi tasklet. This avoids needing to return an X86EMUL_RETRY error code to the caller, which was being ignored by all except x86_emulate().

  The original bug, and the general approach in this fix, pointed out by Intel (yang.z.zhang@intel.com).

  Signed-off-by: Keir Fraser <keir@xen.org>
* Fix emacs local variable block to use correct C style variable. (David Vrabel, 2013-02-21; 7 files, -7/+7)

  The emacs variable to set the C style from a local variable block is c-file-style, not c-set-style.

  Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* x86/VMX: fix VMCS setting for x2APIC mode guest while enabling APICV (Jiongxi Li, 2013-02-18; 1 file, -0/+4)

  The "APIC-register virtualization" and "virtual-interrupt delivery" VM-execution controls have no effect on the behavior of RDMSR/WRMSR if the "virtualize x2APIC mode" VM-execution control is 0. When the guest uses x2APIC mode, we should enable "virtualize x2APIC mode" for APICV first.

  Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
  Acked-by: Eddie Dong <eddie.dong@intel.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
* x86/VMX: fix live migration while enabling APICV (Jiongxi Li, 2013-02-18; 2 files, -0/+5)

  SVI should be restored in case the guest is processing a virtual interrupt while a domain state is being saved. Otherwise SVI would be missed when virtual interrupt delivery is enabled.

  Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
  Acked-by: Eddie Dong <eddie.dong@intel.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
* AMD,IOMMU: Clean up old entries in remapping tables when creating new one (Jan Beulich, 2013-02-05; 1 file, -0/+1)

  When changing the affinity of an IRQ associated with a passed-through PCI device, clear the previous mapping.

  This is XSA-36 / CVE-2013-0153.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>

  In addition, because some BIOSes may incorrectly program IVRS entries for the IOAPIC, try to check entries for consistency. Specifically, if conflicting entries are found, disable the IOMMU if a per-device remapping table is used. If entries refer to bogus IOAPIC IDs, disable the IOMMU unconditionally.

  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
* x86/HVM: assorted RTC emulation adjustments (Jan Beulich, 2013-02-05; 1 file, -0/+1)

  - only call check_update_timer() on REG_B writes when SET changes
  - only call alarm_timer_update() on REG_B writes when relevant bits change
  - instead properly handle AF and PF when the guest is not also setting AIE/PIE respectively (for UF this was already the case, only a comment was slightly inaccurate), including calling the respective update functions upon REG_C reads

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Tested-by: Olaf Hering <olaf@aepfle.de>
* nested vmx: enable VMCS shadowing feature (Dongxiao Xu, 2013-01-25; 1 file, -1/+17)

  The current logic for handling non-root VMREAD/VMWRITE is to VM-exit and emulate, which brings a certain overhead. A new Intel platform introduces a feature called VMCS shadowing, where non-root VMREAD/VMWRITE will not trigger a VM-exit, and the hardware will read/write the virtual VMCS instead. This is proven to improve performance.

  Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
  Acked-by: Jun Nakajima <jun.nakajima@intel.com>
  Acked-by: Eddie Dong <eddie.dong@intel.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
* nested vmx: optimize for bulk access of virtual VMCS (Dongxiao Xu, 2013-01-25; 1 file, -0/+2)

  After we use VMREAD/VMWRITE to build up the virtual VMCS, each access to the virtual VMCS needs two VMPTRLDs and one VMCLEAR to switch the environment, which might be an overhead to performance. This commit handles multiple virtual VMCS accesses together to improve performance.

  Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
  Acked-by: Eddie Dong <eddie.dong@intel.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
* nested vmx: use VMREAD/VMWRITE to construct vVMCS if VMCS shadowing is enabled (Dongxiao Xu, 2013-01-25; 2 files, -4/+17)

  Before the VMCS shadowing feature, we used memory operations to build up the virtual VMCS. This worked since that virtual VMCS would never be loaded into real hardware. However, after we introduce the VMCS shadowing feature, this VMCS will be loaded into hardware, which requires all fields in the VMCS to be accessed by VMREAD/VMWRITE. Besides, the virtual VMCS revision identifier should also meet the hardware's requirement, instead of using a faked one.

  Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
  Acked-by: Eddie Dong <eddie.dong@intel.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
* nested vmx: Use a list to store the launched vvmcs for L1 VMM (Dongxiao Xu, 2013-01-25; 2 files, -2/+6)

  Originally we used a virtual VMCS field to store the launch state of a given vmcs. However, with the VMCS shadowing feature this virtual VMCS will also be loaded into real hardware, and VMREAD/VMWRITE would operate on invalid fields. The new approach is to store the launch state in a list for the L1 VMM.

  Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
  Acked-by: Eddie Dong <eddie.dong@intel.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
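  A sketch of the replacement bookkeeping (structure and function names are illustrative): the launch state moves out of the VMCS, which hardware will now load, into a per-L1 list keyed by the virtual VMCS's address.

      #include <stdbool.h>
      #include <stdlib.h>

      /* Per-L1 record of which virtual VMCSes have been launched. */
      struct vvmcs_launched {
          struct vvmcs_launched *next;
          unsigned long vvmcs_addr;
      };

      static bool vvmcs_is_launched(const struct vvmcs_launched *head,
                                    unsigned long addr)
      {
          for ( ; head; head = head->next )
              if ( head->vvmcs_addr == addr )
                  return true;
          return false;
      }

      static int vvmcs_mark_launched(struct vvmcs_launched **head,
                                     unsigned long addr)
      {
          struct vvmcs_launched *e = malloc(sizeof(*e));

          if ( !e )
              return -1;
          e->vvmcs_addr = addr;
          e->next = *head;
          *head = e;
          return 0;
      }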
* x86: properly use map_domain_page() in nested HVM code (Jan Beulich, 2013-01-23; 1 file, -3/+3)

  This eliminates a couple of incorrect/inconsistent uses of map_domain_page() from VT-x code.

  Note that this does _not_ add error handling where none was present before, even though I think NULL returns from any of the mapping operations touched here need to properly be handled. I just don't know this code well enough to figure out what the right action in each case would be.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>