| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Commit 6859874b61d5ddaf5289e72ed2b2157739b72ca5 ("x86/HVM: fix x2APIC
APIC_ID read emulation") introduced an error in the HVM emulation of
x2APIC. Any attempt to write to the APIC_ICR MSR results in a #GP fault.
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
| |
If APIC-v is enabled, all interrupts to L1 are delivered through APIC-v.
But when L2 is running, an external interrupt will cause an L1 vmexit with
reason "external interrupt". L1 will then pick up the interrupt through
vmcs12. When L1 acks the interrupt, APIC-v hardware will still perform the
vEOI update, since APIC-v is enabled while L1 is running. The problem is
that the interrupt was not delivered through APIC-v hardware, which means
SVI/RVI/vPPR were not set, yet the hardware requires them when performing
the vEOI update. The solution is that, when L1 picks up the interrupt from
vmcs12, the hypervisor updates SVI/RVI/vPPR so that the subsequent vEOI
and vPPR updates work correctly.
Also, since the interrupt is delivered through vmcs12, APIC-v hardware will
not clear vIRR, so the hypervisor needs to clear it before L1 runs.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
| |
An external interrupt is allowed to notify the CPU only when it has higher
priority than the interrupt currently in service. With APIC-v, the priority
comparison is done by hardware, which injects the interrupt into the VCPU
when it recognizes one. Currently there is no virtual APIC-v feature
available for L1 to use, so when L2 is running we still need to compare the
interrupt priority against ISR in the hypervisor instead of in hardware.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
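As a rough illustration of the software priority check described above, the following sketch compares the priority class of a pending vector with that of the highest in-service vector. The helper names and the bitmap layout are assumptions made for this example, not the hypervisor's actual code, and the task priority register is deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS 256

/* Return the highest vector set in a 256-bit bitmap, or -1 if none is set. */
static int highest_vector(const uint32_t bitmap[NR_VECTORS / 32])
{
    for (int word = NR_VECTORS / 32 - 1; word >= 0; word--) {
        if (bitmap[word] == 0)
            continue;
        for (int bit = 31; bit >= 0; bit--)
            if (bitmap[word] & (UINT32_C(1) << bit))
                return word * 32 + bit;
    }
    return -1;
}

/*
 * Hypothetical software check: a pending vector may be delivered only if
 * its priority class (upper four bits of the vector) is higher than the
 * class of the highest vector currently in service.
 */
static bool can_deliver(int pending_vector,
                        const uint32_t isr[NR_VECTORS / 32])
{
    int in_service = highest_vector(isr);

    if (in_service < 0)
        return true;    /* nothing in service, nothing to compare against */
    return (pending_vector >> 4) > (in_service >> 4);
}
```

With vector 0x50 in service, a pending 0x51 is held back because it shares priority class 5, while a pending 0x61 is deliverable.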
| |
Viridian using a synthetic MSR for issuing EOI notifications bypasses
the normal in-processor handling, which would clear
GUEST_INTR_STATUS.SVI. Hence we need to do this in software in order
for future interrupts to get delivered.
Based on analysis by Yang Z Zhang <yang.z.zhang@intel.com>.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
| |
.. in favor of NR_VECTORS, as being redundant and as the latter is
correct in terms of its naming, while the former is off by one.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
| |
APIC and x2APIC use different formats for the APIC_ID register, so a
translation is needed.
Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
Signed-off-by: Xiaowei Yang <xiaowei.yang@huawei.com>
Convert code to use switch(), fixing a coding style issue at the same
time, and use GET_xAPIC_ID() on the value read instead of VLAPIC_ID()
(which reads the field again).
In the course of this also properly reject both reads and writes on the
non-existing MSR corresponding to APIC_ICR2.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
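The translation can be pictured with the sketch below. It assumes the register page stores the ID in xAPIC format (bits 31:24) and that the x2APIC MSR reports it in the low bits. The function name, the return convention and the local macro definition are hypothetical and only loosely modelled on the description above.

```c
#include <stdint.h>

/* Register offsets within the local APIC register page. */
#define APIC_ID   0x20
#define APIC_ICR2 0x310

/* Local definition for this sketch: the xAPIC ID lives in bits 31:24. */
#define GET_xAPIC_ID(x) (((x) >> 24) & 0xFFu)

/*
 * Hypothetical x2APIC MSR read. 'reg_val' is the raw value held in the
 * xAPIC-format register page. Returns 0 on success, -1 to signal #GP.
 */
static int x2apic_read(uint32_t offset, uint32_t reg_val, uint64_t *msr_val)
{
    switch (offset) {
    case APIC_ID:
        /* Translate the xAPIC layout into the x2APIC layout. */
        *msr_val = GET_xAPIC_ID(reg_val);
        return 0;

    case APIC_ICR2:
        /* No MSR corresponds to ICR2 in x2APIC mode: reject the access. */
        return -1;

    default:
        *msr_val = reg_val;
        return 0;
    }
}
```

Using a switch() keeps the special cases (ID translation, the rejected ICR2 slot) next to each other and leaves the plain pass-through as the default.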
| |
... as dropping the old page tables may take significant amounts of
time.
This is part of CVE-2013-1918 / XSA-45.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
| |
Deliver virtual interrupts as posted interrupts if posted-interrupt
support is enabled.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
| |
Move kick_vcpu into vlapic_set_irq, and call it to deliver virtual
interrupts instead of setting vIRR directly.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
| |
Add support for using posted interrupts to deliver interrupts.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
| |
In particular, correctly propagate errors through vlapic_apicv_write()
and hvm_x2apic_msr_write().
Signed-off-by: Keir Fraser <keir@xen.org>
| |
x86_emulate().
In particular, on broadcast/multicast INIT/SIPI, we handle all target
APICs at once in a single invocation of the init/sipi tasklet. This
avoids needing to return an X86EMUL_RETRY error code to the caller,
which was being ignored by all except x86_emulate().
The original bug, and the general approach in this fix, were pointed out
by Intel (yang.z.zhang@intel.com).
Signed-off-by: Keir Fraser <keir@xen.org>
| |
SVI should be restored in case the guest is processing a virtual
interrupt while the domain's state is being saved. Otherwise SVI would be
lost when virtual interrupt delivery is enabled.
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
| |
Performance is not an issue with printk(), so let the function do
minimally more work and instead save a byte per affected format
specifier.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
Virtual interrupt delivery relieves Xen of injecting vAPIC interrupts
manually, as this is fully taken care of by the hardware. This requires
some special awareness in the existing interrupt injection path:
For a pending interrupt from the vLAPIC, instead of injecting it directly,
we may need to update architecture-specific indicators before resuming
the guest.
Before returning to the guest, RVI should be updated if any IRR bits are
pending.
The EOI exit bitmap controls whether an EOI write should cause a VM-Exit.
If set, a trap-like EOI-induced VM-Exit is triggered. The approach here
is to manipulate the EOI exit bitmap based on the value of TMR. A
level-triggered irq requires a hook in the vLAPIC EOI write, so that the
vIOAPIC EOI is triggered and emulated.
Signed-off-by: Gang Wei <gang.wei@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
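The relationship between TMR and the EOI exit bitmap can be sketched as follows. This is only an illustration under the assumption that both are kept as 256-bit bitmaps with one bit per vector. The function names are made up for the example, and the real code would write the result into the VMCS rather than a plain array.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS 256
#define WORDS (NR_VECTORS / 64)

/*
 * Mirror TMR into the EOI exit bitmap: every level-triggered vector
 * (TMR bit set) gets its EOI exit bit set, so that the EOI write traps
 * and the virtual IO-APIC can be notified.
 */
static void sync_eoi_exit_bitmap(uint64_t eoi_exit[WORDS],
                                 const uint64_t tmr[WORDS])
{
    for (int i = 0; i < WORDS; i++)
        eoi_exit[i] = tmr[i];
}

/* Would an EOI for this vector cause a VM-Exit under the given bitmap? */
static bool eoi_causes_exit(const uint64_t eoi_exit[WORDS], uint8_t vector)
{
    return (eoi_exit[vector / 64] >> (vector % 64)) & 1u;
}
```

Edge-triggered vectors keep a clear bit, so their EOIs are completed by hardware without leaving the guest.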
| |
Add APIC register virtualization support
- APIC read doesn't cause VM-Exit
- APIC write becomes trap-like
Signed-off-by: Gang Wei <gang.wei@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
| |
In a few cases this also extends to making them static in the first
place.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
Signed-off-by: Keir Fraser <keir@xen.org>
| |
In the virtual LAPIC, correct the delta calculation when emulating the
TSC deadline timer.
Without this fix, XenServer (which is based on Xen 4.1) does not work
when running as an HVM guest. dom0 fails to boot because its timer
interrupts are very delayed (by several minutes in some cases).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
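The commit message does not spell out the faulty expression, so the following is only a sketch of how a TSC-deadline delta is commonly derived: the absolute deadline is compared with the guest's current TSC and the difference is converted to nanoseconds. The function name and the kHz-based frequency parameter are assumptions for illustration.

```c
#include <stdint.h>

/*
 * Convert an absolute TSC deadline into a relative timeout in
 * nanoseconds. 'tsc_khz' is the guest TSC frequency in kHz and is
 * assumed to be non-zero.
 */
static uint64_t tsc_deadline_to_ns(uint64_t deadline, uint64_t guest_tsc,
                                   uint64_t tsc_khz)
{
    uint64_t ticks;

    if (deadline <= guest_tsc)
        return 0;    /* deadline already passed: fire immediately */

    ticks = deadline - guest_tsc;

    /*
     * One tick lasts 1000000 / tsc_khz nanoseconds. Splitting into
     * quotient and remainder avoids overflowing the multiplication for
     * large tick counts.
     */
    return (ticks / tsc_khz) * UINT64_C(1000000) +
           (ticks % tsc_khz) * UINT64_C(1000000) / tsc_khz;
}
```

Mixing up absolute and relative values, or host and guest TSC, in this calculation would produce the kind of badly delayed timer interrupts the message describes.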
| |
When the subject domain is not the current one (e.g. during domctl or
HVM save/restore handling), use of gdprintk() is questionable at best,
as it won't give the intended information on what domain is affected.
Use plain printk() or dprintk() instead, but keep things (mostly) as
guest messages by using XENLOG_G_*.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
On a real machine, a CPU halted via hlt with interrupts disabled can
be reactivated via an NMI IPI. Enable the hypervisor to do this for
HVM guests, too.
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Signed-off-by: Keir Fraser <keir@xen.org>
Committed-by: Keir Fraser <keir@xen.org>
| |
In vlapic_set_irq, we set the IRR register before the TMR. The IRR
might then be serviced before TMR is set, and even worse, EOI might occur
before TMR is set, in which case vioapic_update_EOI won't be called,
which prevents all subsequent interrupt injection.
Reordering the setting of TMR and IRR solves the problem.
Besides, KVM has fixed a similar bug in:
http://markmail.org/search/?q=APIC_TMR#query:APIC_TMR+page:1+mid:rphs4f7lkxjlldne+state:results
Signed-off-by: Yongan Liu <Liuyongan@huawei.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
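The ordering argument can be sketched like this. The structure and helper names are hypothetical stand-ins, and the sketch is single-threaded: real code would need atomic bit operations (and suitable ordering guarantees) for the reordering to have the intended effect across CPUs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the emulated LAPIC register page. */
struct vlapic_regs {
    uint32_t irr[8];   /* interrupt request register, one bit per vector */
    uint32_t tmr[8];   /* trigger mode register: 1 = level, 0 = edge */
};

static bool test_vec(const uint32_t *map, uint8_t vec)
{
    return (map[vec / 32] >> (vec % 32)) & 1u;
}

static void assign_vec(uint32_t *map, uint8_t vec, bool value)
{
    uint32_t mask = UINT32_C(1) << (vec % 32);

    if (value)
        map[vec / 32] |= mask;
    else
        map[vec / 32] &= ~mask;
}

/*
 * Record the trigger mode first, then make the vector pending, so that
 * anything acting on the IRR bit already sees the matching TMR bit.
 * Returns true if the vector was not pending before.
 */
static bool post_irq(struct vlapic_regs *regs, uint8_t vec, bool level)
{
    bool was_pending;

    assign_vec(regs->tmr, vec, level);     /* 1. trigger mode */
    was_pending = test_vec(regs->irr, vec);
    assign_vec(regs->irr, vec, true);      /* 2. request bit */
    return !was_pending;
}
```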
| |
This patch adds back the proper guest MSI EOI hook for correctly handling
unmaskable MSI interrupts, which was wrongly removed by changeset 23703.
Signed-off-by: Shan Haitao <haitao.shan@intel.com>
| |
The reasons are:
1. The logic has a negative impact on 10G NIC performance (when assigned
to a guest) by lowering the interrupt frequency that Xen can handle.
2. Xen already has IRQ rate limit logic, which can also help to
prevent IRQ storms.
Signed-off-by: Shan Haitao <haitao.shan@intel.com>
| |
Signed-off-by: Keir Fraser <keir@xen.org>
| |
Accesses to MSR_IA32_TSC_DEADLINE are trapped, with value stored in a
new field vlapic->hw.tdt_msr. vlapic->pt is reused in one shot mode
for vtdt to trigger expire events.
For details, please refer to the Intel Architectures Software
Developer's Manual 3A, 10.5.4.1 TSC-Deadline Mode.
Signed-off-by: Wei Gang <gang.wei@intel.com>
| |
Signed-off-by: Wei Gang <gang.wei@intel.com>
| |
Signed-off-by: Keir Fraser <keir@xen.org>
| |
A guest vcpu may lose all ticks if vlapic->pt.irq is not restored
during the save/restore process. Fix it.
Signed-off-by: Wei Gang <gang.wei@intel.com>
| |
This patch enables Xen to handle x2APIC MSR accesses by HVM guests,
which is faster (it avoids decoding MMIO accesses). Credit goes to
Gleb Natapov, who completed the work for KVM.
Tested with a 4-vcpu guest, with and without x2apic support.
From: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Keir Fraser <keir@xen.org>
| |
Signed-off-by: Keir Fraser <keir@xen.org>
| |
asynchronous tasklet completes its work.
This is a little bit cleaner than busy-spinning in a retry loop.
Signed-off-by: Keir Fraser <keir@xen.org>
| |
This fixes a livelock in hvmloader with credit2 scheduler, whereby an
AP can be brought online, do its work, and shut itself down, before
the BSP re-emulates the VLAPIC write that sent the SIPI. BSP then ends
up in an endless re-emulation work where it sees the target vcpu is
down, therefore schedules a tasklet, which does no work because the
vcpu is already initialised. The fix is to check v->is_initialised
rather than VPF_down, before scheduling the tasklet.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
| |
...and fix up the ensuing fall-out of implicit dependencies
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
| |
This is possible now that tasklets run in idle-vcpu context
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
| |
1. Writing TMICT=0 disables the timer. Use this fact to simplify and
improve reprogram_timer(). In particular, we always write TMICT, and
write zero when we do not need a timer interrupt.
2. In HPET broadcast timer handler, set TMICT=0 when we mask the APIC
local timer. May as well do this early, before entering deep sleep.
3. In HVM-guest APIC emulation, disable the emulated local timer when
the guest sets TMICT=0. Previously we would issue an immediate
one-shot interrupt.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
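Point 3 above can be illustrated with a small model of an emulated one-shot timer. The structure, the function name and the pre-scaled ns_per_tick parameter (which folds in the divide configuration) are assumptions for this sketch only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the emulated local APIC timer. */
struct lapic_timer {
    bool     armed;        /* is a timer interrupt scheduled? */
    uint64_t expires_ns;   /* absolute expiry, valid only when armed */
};

/*
 * Handle a guest write to TMICT. A value of zero disables the timer
 * instead of scheduling an immediate one-shot interrupt.
 */
static void write_tmict(struct lapic_timer *t, uint32_t tmict,
                        uint64_t now_ns, uint64_t ns_per_tick)
{
    if (tmict == 0) {
        t->armed = false;
        return;
    }

    t->armed = true;
    t->expires_ns = now_ns + (uint64_t)tmict * ns_per_tick;
}
```

The same convention lets a caller always write TMICT, passing zero whenever no timer interrupt is wanted, which is the simplification point 1 describes for reprogram_timer().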
| |
Make various data items const or __read_mostly where
possible/reasonable.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
| |
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
| |
Introduce a variant of map_domain_page() directly getting passed a
struct page_info * argument, based on the observation that in many
places the argument to this function so far simply was the result of
page_to_mfn(). This is meaningful for the x86-64 case where
map_domain_page() really just is an invocation of mfn_to_virt(), and
hence the combined mfn_to_virt(page_to_mfn()) now represents a
needless round trip conversion compressed -> uncompressed ->
compressed of the MFN representation.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
| |
This patch removes a timer whose IRQ is masked from the vcpu's timer
list. This reduces the overhead of VM exits and VM context switches.
It also fixes a potential bug:
(1) VCPU#0: mask the IRQ of a timer. (ex. vioapic.redir[2].mask=1)
(2) VCPU#1: pt_timer_fn() is invoked by expiration of the timer.
(3) VCPU#1: pt_update_irq() is called but does nothing by
pt_irq_masked()==1.
(4) VCPU#1: sleep by halt.
(5) VCPU#0: unmask the IRQ of the timer.
After that, no one wakes up the VCPU#1.
IRQ of ISA is masked by:
- PIC's IMR
- IOAPIC's redir[0]
- IOAPIC's redir[N].mask
- LAPIC's LVT0
- LAPIC enabled/disabled
IRQ of LAPIC timer is masked by:
- LAPIC's LVTT
- LAPIC disabled
When any of the above is changed, the corresponding vcpu is kicked and
suspended timer emulation is resumed.
In addition, there is a small bug fix in pt_adjust_global_vcpu_target().
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
| |
A boolean flag was overflowing a uint8_t.
Thanks to Dongxiao Xu at Intel for tracking down the bug.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
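The message does not say which flag was affected, so the following is a generic sketch of this class of bug: a masked value with a bit above bit 7 is stored in a uint8_t and silently truncates to zero. The flag name and both helpers are hypothetical.

```c
#include <stdint.h>

/* Hypothetical flag that lives above the low eight bits. */
#define FLAG_HIGH 0x100u

/* Buggy pattern: the masked value 0x100 does not fit and becomes 0. */
static uint8_t flag_truncated(uint32_t value)
{
    return (uint8_t)(value & FLAG_HIGH);
}

/* Fixed pattern: collapse the masked value to 0 or 1 before storing. */
static uint8_t flag_normalised(uint32_t value)
{
    return !!(value & FLAG_HIGH);
}
```

Normalising with `!!` (or using a real boolean type) keeps the stored value independent of where the flag bit happens to sit.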
| |
In particular, avoid intermediate delivery bitmaps which restrict
number of vcpus supported.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
| |
{vcpu,domain} and {vlapic,vpic,vrtc,hpet}. Completely avoids
accidental aliasing.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
| |
This patch is needed for kexec/kdump since VCPU#0 is halted.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
| |
Since the shared info layout is fixed, guests are required to use
VCPUOP_register_vcpu_info prior to booting any vCPU beyond the
traditional limit of 32.
MAX_VIRT_CPUS, being an implementation detail of the hypervisor, is no
longer being exposed in the public headers.
The tools changes are clearly incomplete (and done only so things would
build again), and the current state of the tools (using scalar
variables all over the place to represent vCPU bitmaps) very likely
doesn't permit booting DomU-s with more than the traditional number of
vCPU-s. Testing of the extended functionality was done with Dom0 (96
vCPU-s, as well as 128 vCPU-s out of which the kernel elected - by way
of a simple kernel side patch - to use only some, resulting in a sparse
bitmap).
ia64 changes only to make things build, and build-tested only (and the
tools part only as far as the build would go without encountering
unrelated problems in the blktap code).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
| |
Signed-off-by: Jan Beulich <jbeulich@novell.com>
| |
Instead of plain round robin, the vcpu with the lowest processor
priority is selected for the interrupt. If multiple vcpus share the
same lowest priority, interrupts are distributed among them round
robin.
Signed-off-by: Juergen Gross <juergen.gross@fujitsu-siemens.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
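One way to combine lowest-priority selection with round-robin tie-breaking is sketched below. The function name, the ppr array and the 'last' cursor are assumptions for illustration and are not taken from the actual patch.

```c
#include <stdint.h>

/*
 * Pick the vcpu with the lowest processor priority. The scan starts just
 * after the previously chosen vcpu and only a strictly lower priority
 * replaces the current candidate, so vcpus that tie for the lowest
 * priority are served in round-robin order.
 *
 * ppr[i] is the processor priority of vcpu i, 'last' holds the index
 * chosen by the previous call. Returns -1 if there are no vcpus.
 */
static int pick_lowest_priority(const uint8_t *ppr, unsigned int nr,
                                unsigned int *last)
{
    int best = -1;

    for (unsigned int i = 1; i <= nr; i++) {
        unsigned int idx = (*last + i) % nr;

        if (best < 0 || ppr[idx] < ppr[best])
            best = (int)idx;
    }

    if (best >= 0)
        *last = (unsigned int)best;
    return best;
}
```

With priorities {0x20, 0x10, 0x10, 0x30}, successive calls alternate between vcpu 1 and vcpu 2, while vcpus 0 and 3 are never chosen as long as their priority stays higher.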
| |
The current hpet implementation runs a one-shot xen timer for each
hpet timer whenever the main counter is enabled regardless of whether
or not the individual hpet timers are enabled. When the timer fires,
if it is enabled the interrupt is routed to the guest. If the hpet
timer is periodic, a new one-shot timer is set, for NOW()+period.
There are a number of problems with this, the most significant being
guest time drift. Windows does not read the hardware clock to verify
time; it depends on timer interrupts firing at the expected interval. The
existing implementation queues a new one-shot timer each time it fires
and does not allow for a difference between NOW() and the time the
timer was expected to fire, causing drift. Also, there is no allowance
for lost ticks. This modification changes HPET to use the Virtual
Platform Timer (VPT) and, for periodic timers, to use periodic timers.
The VPT ensures an interrupt is delivered to the guest for each period
that elapses, and its use of Xen periodic timers ensures no drift.
Signed-off-by: Peter Johnston <peter.johnston@citrix.com>
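The drift described above comes from re-arming relative to the time the handler ran rather than the time the timer was due. The sketch below contrasts the two approaches; it is a simplified model with made-up function names, not the VPT implementation, and it assumes a non-zero period.

```c
#include <stdint.h>

/*
 * Drifting scheme: the next deadline is taken from the time the handler
 * actually ran, so every bit of handler latency is added permanently.
 */
static uint64_t next_drifting(uint64_t now, uint64_t period)
{
    return now + period;
}

/*
 * Drift-free scheme: advance from the previously scheduled deadline.
 * If whole periods were missed, skip ahead and report how many ticks
 * were lost so the caller can account for them.
 */
static uint64_t next_drift_free(uint64_t scheduled, uint64_t now,
                                uint64_t period, uint64_t *missed)
{
    uint64_t next = scheduled + period;

    *missed = 0;
    if (next <= now) {
        *missed = (now - next) / period + 1;
        next += *missed * period;
    }
    return next;
}
```

For a timer due at 1000 with period 100 whose handler runs at 1003, the drifting scheme schedules 1103 while the drift-free scheme schedules 1100. If the handler only runs at 1350, the drift-free scheme reports three missed ticks and schedules 1400.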