| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than re-reading the instruction bytes upon retry processing,
stash away and re-use what we already read. That way we can be certain
that the retry won't do something different from what requested the
retry, getting once again closer to real hardware behavior (where what
we use retries for is simply a bus operation, not involving redundant
decoding of instructions).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dpci_ioport_{read,write}() guest memory access failure handling should
be modelled after process_portio_intercept()'s (and others): Upon
encountering an error on other than the first iteration, the count
successfully handled needs to be stored and X86EMUL_OKAY returned, in
order for the generic instruction emulator to update register state
correctly before reporting failure or retrying (both of which would
only happen after re-invoking emulation).
Further we leverage (and slightly extend, due to the above mentioned
need to return X86EMUL_OKAY) the "large MMIO" retry model.
Note that there is still a special case not explicitly taken care of
here: While the first retry on the last iteration of a "rep ins"
correctly recovers the already read data, an eventual subsequent retry
is being handled by the pre-existing mmio-large logic (through
hvmemul_do_io() storing the [recovered] data [again], also taking into
consideration that the emulator converts a single iteration "ins" to
->read_io() plus ->write()).
Also fix an off-by-one in the mmio-large-read logic, and slightly
simplify the copying of the data.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
VMCS size validation on APs should check against BP's size.
No need for a separate cpu_has_vmx_ins_outs_instr_info variable
anymore.
Use proper symbolics.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
VMX MSRs only available when the CPU support the VMX feature. In addition,
VMX_TRUE* MSRs only available when bit 55 of VMX_BASIC MSR is set.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Cleanup.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current logic does not handle the case when HPET special->handle
is invalid in IVRS. On such system, the following message is shown:
(XEN) AMD-Vi: Failed to setup HPET MSI remapping: Wrong HPET
This patch will allow the ivrs_hpet[<handle>]=<sbdf> to override the
IVRS. Also, it removes struct hpet_sbdf.iommu since it is not
used anywhere in the code.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
|
|
|
|
|
|
|
|
| |
All effects are properly being described by the asm() constraints.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
... when assembler supports it, following commit cfd54835 ("VMX: use
proper instruction mnemonics if assembler supports them"). This merely
got split off from the earlier change becase of the significant number
of call sites needing to be changed.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Discovered by Coverity, CID 1055181
core2_vpmu_dump() was incorrectly setting VPMU_CONTEXT_LOADED when it
was intending to check for it.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
This would have been avoided if the dump function declared all its
pointers "const" - doing this now (also in SVM).
Also fixing some indentation issues at once.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Guests other than domain 0 using the console output have previously been
controlled by the VERBOSE #define, but with no designation of which
guest's output was on the console. This patch converts the HVM output
buffering to be used by all domains except the hardware domain (dom0):
stripping non-printable characters, line buffering the output, and
prefixing it with the domain ID. This is especially useful for debugging
stub domains during early boot.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With CPUID features suitably masked this is supposed to work, but was
completely broken (i.e. the case wasn't even considered when the
original xsave save/restore code was written).
First of all, xsave_enabled() wrongly returned the value of
cpu_has_xsave, i.e. not even taking into consideration attributes of
the vCPU in question. Instead this function ought to check whether the
guest ever enabled xsave support (by writing a [non-zero] value to
XCR0). As a result of this, a vCPU's xcr0 and xcr0_accum must no longer
be initialized to XSTATE_FP_SSE (since that's a valid value a guest
could write to XCR0), and the xsave/xrstor as well as the context
switch code need to suitably account for this (by always enforcing at
least this part of the state to be saved/loaded).
This involves undoing large parts of c/s 22945:13a7d1f7f62c ("x86: add
strictly sanity check for XSAVE/XRSTOR") - we need to cleanly
distinguish between hardware capabilities and vCPU used features.
Next both HVM and PV save code needed tweaking to not always save the
full state supported by the underlying hardware, but just the parts
that the guest actually used. Similarly the restore code should bail
not just on state being restored that the hardware cannot handle, but
also on inconsistent save state (inconsistent XCR0 settings or size of
saved state not in line with XCR0).
And finally the PV extended context get/set code needs to use slightly
different logic than the HVM one, as here we can't just key off of
xsave_enabled() (i.e. avoid doing anything if a guest doesn't use
xsave) because the tools use this function to determine host
capabilities as well as read/write vCPU state. The set operation in
particular needs to be capable of cleanly dealing with input that
consists of only the xcr0 and xcr0_accum values (if they're both zero
then no further data is required).
While for things to work correctly both sides (saving _and_ restoring
host) need to run with the fixed code, afaict no breakage should occur
if either side isn't up to date (other than the breakage that this
patch attempts to fix).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the hex byte emission we were taking away a good part of
flexibility from the compiler, as for simplicity reasons these were
built using fixed operands. All half way modern build environments
would allow using the mnemonics (but we can't disable the hex variants
yet, since the binutils around at the time gcc 4.1 got released didn't
support these yet).
I didn't convert __vmread() yet because that would, just like for
__vmread_safe(), imply converting to a macro so that the output operand
can be the caller supplied variable rather than an intermediate one. As
that would require touching all invocation points of __vmread() (of
which there are quite a few), I'd first like to be certain the approach
is acceptable; the main question being whether the now conditional code
might be considered to cause future maintenance issues, and the second
being that of parameter/argument ordering (here I made __vmread_safe()
match __vmwrite(), but one could also take the position that read and
write should use the inverse order of one another, in line with the
actual instruction operands).
Additionally I was quite puzzled to find that all the asm()-s involved
here have memory clobbers - what are they needed for? Or can they be
dropped at least in some cases?
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
| |
... at once making conditional forward jumps, which are statically
predicted to be not taken, only used for the unlikely (error) cases.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- move stuff easily/better done in C into C code
- re-arrange code paths so that no redundant GET_CURRENT() would remain
on the fast paths
- move long latency operations earlier
- slightly defer disabling interrupts on the VM entry path
- use ENTRY() instead of open coding it
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With there being so many systems with broken ACPI tables, and with it
generally being known what's wrong with those tables, give people a
handle to overcome the resulting disabling of their IOMMUs.
Inspired by Linux side patches providing similar functionality.
Suggested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-By: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Suravee Suthikulpanit <suravee.suthikulapanit@amd.com>
|
|
|
|
|
|
|
|
|
| |
... allowing bitmap operations to be used on it, making things
consistent with struct pi_desc's pir field, and shrinking overall
source code size.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If enabling APIC-v, all interrupts to L1 are delivered through APIC-v.
But when L2 is running, external interrupt will casue L1 vmexit with
reason external interrupt. Then L1 will pick up the interrupt through
vmcs12. when L1 ack the interrupt, since the APIC-v is enabled when
L1 is running, so APIC-v hardware still will do vEOI updating. The problem
is that the interrupt is delivered not through APIC-v hardware, this means
SVI/RVI/vPPR are not setting, but hardware required them when doing vEOI
updating. The solution is that, when L1 tried to pick up the interrupt
from vmcs12, then hypervisor will help to update the SVI/RVI/vPPR to make
sure the following vEOI updating and vPPR updating corrently.
Also, since interrupt is delivered through vmcs12, so APIC-v hardware will
not cleare vIRR and hypervisor need to clear it before L1 running.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
According to AMD Programmer's Manual vol2, vmrun, vmsave and vmload
should inject #GP instead of #UD when unable to access memory
location for vmcb. Also, the code should make sure that L1 guest
EFER.SVME is not zero. Otherwise, #UD should be injected.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
| |
Fix assertion in __virt_to_maddr when starting nested SVM guest
in debug mode. Investigation has shown that svm_vmsave/svm_vmload
make use of __pa() with invalid address.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Viridian using a synthetic MSR for issuing EOI notifications bypasses
the normal in-processor handling, which would clear
GUEST_INTR_STATUS.SVI. Hence we need to do this in software in order
for future interrupts to get delivered.
Based on analysis by Yang Z Zhang <yang.z.zhang@intel.com>.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
For multi-vector MSI, where we surely don't want to allocate
contiguous vectors and be able to set affinities of the individual
vectors separately, we need to drop the use of the tuple of vector and
delivery mode to determine the IRTE to use, and instead allocate IRTEs
(which imo should have been done from the beginning).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
|
|
|
|
|
|
|
|
|
|
| |
There's no point in using the fixmap here, and it gets
map_iommu_mmio_region() in line with unmap_iommu_mmio_region(), which
was already using iounmap() (thus crashing if actually used).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
|
|
|
|
|
|
|
|
|
| |
.. in favor of NR_VECTORS, as being redundant and as the latter is
correct in terms of its naming, while the former is off by one.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The IOMMU interrupt bits in the IOMMU status registers are
"read-only, and write-1-to-clear (RW1C). Therefore, the existing
logic which reads the register, set the bit, and then writing back
the values could accidentally clear certain bits if it has been set.
The correct logic would just be writing only the value which only
set the interrupt bits, and leave the rest to zeros.
This patch also, clean up #define masks as Jan has suggested.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
With iommu_interrupt_handler() properly having got switched its readl()
from status to control register, the subsequent writel() needed to be
switched too (and the RW1C comment there was bogus).
Some of the cleanup went too far - undone.
Further, with iommu_interrupt_handler() now actually disabling the
interrupt sources, they also need to get re-enabled by the tasklet once
it finished processing the respective log. This also implies re-running
the tasklet so that log entries added between reading the log and re-
enabling the interrupt will get handled in a timely manner.
Finally, guest write emulation to the status register needs to be done
with the RW1C (and RO for all other bits) semantics in mind too.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- both VMX and SVM ignored the ECX input to XSETBV
- both SVM and VMX used the full 64-bit RAX when calculating the input
mask to XSETBV
- faults on XSETBV did not get recovered from
Also consolidate the handling for PV and HVM into a single function,
and make the per-CPU variable "xcr0" static to xstate.c.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
|
|
|
|
|
|
|
|
|
| |
Since in that case the processing it pt_intr_post() won't occur, we
need to do some additional work in pt_update_irq(). Additionally we
must not pay attention to the respective IRQ being masked.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Roger Pau Monné <roger.pau@citrix.com> (FreeBSD guest)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hvm_enable() copies the table contents rather than storing the pointer,
so there's no need to keep these tables post-boot.
Also constify the return values of the per-vendor initialization
functions, making clear that once the per-vendor initialization is
complete, the vendor specific tables won't get modified anymore.
Finally, in hvm_enable(), use the returned pointer for all read
accesses as being more efficient than global variable accesses. Writes
of course still need to go to the global variable.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
| |
Move kick_vcpu into vlapic_set_irq. And call it to deliver virtual interrupt
instead set vIRR directly.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
|
|
|
|
|
|
|
|
|
| |
Add the supporting of using posted interrupt to deliver interrupt.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
|
|
|
|
|
|
|
|
|
| |
Turn on posted interrupt for vcpu if posted interrupt is avaliable.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
|
|
|
|
|
|
|
|
|
| |
Check whether the Hardware supports posted interrupt capability.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
VPMU doesn't need to always be saved during context switch. If we are
comming back to the same processor and no other VPCU has run here we can
simply continue running. This is especailly useful on Intel processors
where Global Control MSR is stored in VMCS, thus not requiring us to stop
the counters during save operation. On AMD we need to explicitly stop the
counters but we don't need to save them.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
|
|
|
|
|
|
|
|
|
| |
Factor out common code from SVM amd VMX into VPMU.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
| |
Currently VMCB's MSRPM can be updated to either intercept both reads and
writes to an MSR or not intercept neither. In some cases we may want to
be more selective and intercept one but not the other.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the need to allocate multiple contiguous IRTEs for multi-vector
MSI, the chance of failure here increases. While on the AMD side
there's no allocation of IRTEs at present at all (and hence no way for
this allocation to fail, which is going to change with a later patch in
this series), VT-d already ignores an eventual error here, which this
patch fixes.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: "Zhang, Xiantao" <xiantao.zhang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch extends the printout of the VPCU infos of the keyhandler 'q'.
If vPMU is enabled is on the VCPU and active lines are printed like
(when running HVM openSuSE-12.3 with 'perf top');
(XEN) vPMU running
(XEN) general_0: 0x000000ffffff3ae1 ctrl: 0x000000000053003c
(XEN) fixed_1: 0x000000ff90799188 ctrl: 0xb
This means general counter 0 and fixed counter 1 are running with showing
their contents and the contents of their configuration msr.
Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
This patch renames core2_counters to core2_fix_counters for better
understanding the code and subtitutes 2 numerals with defines in fixed counter
handling.
Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
| |
Signed-off-by: Tim Deegan <tim@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
| |
Signed-off-by: Tim Deegan <tim@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
| |
When the RTC periodic timer gets restarted, align it to the VM's boot
time, not to whatever time it is now. Otherwise every read of REG_C
will restart the current period
Signed-off-by: Tim Deegan <tim@xen.org>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
x86_emulate().
In particular, on broadcast/multicast INIT/SIPI, we handle all target
APICs at once in a single invocation of the init/sipi tasklet. This
avoids needing to return an X86EMUL_RETRY error code to the caller,
which was being ignored by all except x86_emulate().
The original bug, and the general approach in this fix, pointed out by
Intel (yang.z.zhang@intel.com).
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
| |
The emacs variable to set the C style from a local variable block is
c-file-style, not c-set-style.
Signed-off-by: David Vrabel <david.vrabel@citrix.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "APIC-register virtualization" and "virtual-interrupt deliver"
VM-execution control has no effect on the behavior of RDMSR/WRMSR if
the "virtualize x2APIC mode" VM-execution control is 0.
When guest uses x2APIC mode, we should enable "virtualize x2APIC mode"
for APICV first.
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
SVI should be restored in case guest is processing virtual interrupt
while saveing a domain state. Otherwise SVI would be missed when
virtual interrupt delivery is enabled.
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When changing the affinity of an IRQ associated with a passed
through PCI device, clear previous mapping.
This is XSA-36 / CVE-2013-0153.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
In addition, because some BIOSes may incorrectly program IVRS
entries for IOAPIC try to check for entry's consistency. Specifically,
if conflicting entries are found disable IOMMU if per-device
remapping table is used. If entries refer to bogus IOAPIC IDs
disable IOMMU unconditionally
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- only call check_update_timer() on REG_B writes when SET changes
- only call alarm_timer_update() on REG_B writes when relevant bits
change
- instead properly handle AF and PF when the guest is not also setting
AIE/PIE respectively (for UF this was already the case, only a
comment was slightly inaccurate), including calling the respective
update functions upon REG_C reads
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current logic for handling the non-root VMREAD/VMWRITE is by
VM-Exit and emulate, which may bring certain overhead.
On new Intel platform, it introduces a new feature called VMCS
shadowing, where non-root VMREAD/VMWRITE will not trigger VM-Exit,
and the hardware will read/write the virtual VMCS instead.
This is proved to have performance improvement with the feature.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
After we use the VMREAD/VMWRITE to build up the virtual VMCS, each
access to the virtual VMCS needs two VMPTRLD and one VMCLEAR to
switch the environment, which might be an overhead to performance.
This commit tries to handle multiple virtual VMCS access together
to improve the performance.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before the VMCS shadowing feature, we use memory operation to build up
the virtual VMCS. This does work since this virtual VMCS will never be
loaded into real hardware. However after we introduce the VMCS
shadowing feature, this VMCS will be loaded into hardware, which
requires all fields in the VMCS accessed by VMREAD/VMWRITE.
Besides, the virtual VMCS revision identifer should also meet the
hardware's requirement, instead of using a faked one.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally we use a virtual VMCS field to store the launch state of
a certain vmcs. However if we introduce VMCS shadowing feature, this
virtual VMCS should also be able to load into real hardware,
and VMREAD/VMWRITE operate invalid fields.
The new approach is to store the launch state into a list for L1 VMM.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This eliminates a couple of incorrect/inconsistent uses of
map_domain_page() from VT-x code.
Note that this does _not_ add error handling where none was present
before, even though I think NULL returns from any of the mapping
operations touched here need to properly be handled. I just don't know
this code well enough to figure out what the right action in each case
would be.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|