| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
VMCS size validation on APs should check against BP's size.
No need for a separate cpu_has_vmx_ins_outs_instr_info variable
anymore.
Use proper symbolics.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
VMX MSRs only available when the CPU support the VMX feature. In addition,
VMX_TRUE* MSRs only available when bit 55 of VMX_BASIC MSR is set.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Cleanup.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
| |
All effects are properly being described by the asm() constraints.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
... when assembler supports it, following commit cfd54835 ("VMX: use
proper instruction mnemonics if assembler supports them"). This merely
got split off from the earlier change becase of the significant number
of call sites needing to be changed.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the hex byte emission we were taking away a good part of
flexibility from the compiler, as for simplicity reasons these were
built using fixed operands. All half way modern build environments
would allow using the mnemonics (but we can't disable the hex variants
yet, since the binutils around at the time gcc 4.1 got released didn't
support these yet).
I didn't convert __vmread() yet because that would, just like for
__vmread_safe(), imply converting to a macro so that the output operand
can be the caller supplied variable rather than an intermediate one. As
that would require touching all invocation points of __vmread() (of
which there are quite a few), I'd first like to be certain the approach
is acceptable; the main question being whether the now conditional code
might be considered to cause future maintenance issues, and the second
being that of parameter/argument ordering (here I made __vmread_safe()
match __vmwrite(), but one could also take the position that read and
write should use the inverse order of one another, in line with the
actual instruction operands).
Additionally I was quite puzzled to find that all the asm()-s involved
here have memory clobbers - what are they needed for? Or can they be
dropped at least in some cases?
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
| |
... at once making conditional forward jumps, which are statically
predicted to be not taken, only used for the unlikely (error) cases.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
| |
... allowing bitmap operations to be used on it, making things
consistent with struct pi_desc's pir field, and shrinking overall
source code size.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If enabling APIC-v, all interrupts to L1 are delivered through APIC-v.
But when L2 is running, external interrupt will casue L1 vmexit with
reason external interrupt. Then L1 will pick up the interrupt through
vmcs12. when L1 ack the interrupt, since the APIC-v is enabled when
L1 is running, so APIC-v hardware still will do vEOI updating. The problem
is that the interrupt is delivered not through APIC-v hardware, this means
SVI/RVI/vPPR are not setting, but hardware required them when doing vEOI
updating. The solution is that, when L1 tried to pick up the interrupt
from vmcs12, then hypervisor will help to update the SVI/RVI/vPPR to make
sure the following vEOI updating and vPPR updating corrently.
Also, since interrupt is delivered through vmcs12, so APIC-v hardware will
not cleare vIRR and hypervisor need to clear it before L1 running.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
|
|
|
|
|
|
|
|
|
| |
Add the supporting of using posted interrupt to deliver interrupt.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
|
|
|
|
|
|
|
|
|
| |
Turn on posted interrupt for vcpu if posted interrupt is avaliable.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
|
|
|
|
|
|
|
|
|
| |
Check whether the Hardware supports posted interrupt capability.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (from a release perspective)
|
|
|
|
|
|
|
|
|
| |
Factor out common code from SVM amd VMX into VPMU.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
This patch renames core2_counters to core2_fix_counters for better
understanding the code and subtitutes 2 numerals with defines in fixed counter
handling.
Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
|
|
|
|
|
|
|
| |
The emacs variable to set the C style from a local variable block is
c-file-style, not c-set-style.
Signed-off-by: David Vrabel <david.vrabel@citrix.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "APIC-register virtualization" and "virtual-interrupt deliver"
VM-execution control has no effect on the behavior of RDMSR/WRMSR if
the "virtualize x2APIC mode" VM-execution control is 0.
When guest uses x2APIC mode, we should enable "virtualize x2APIC mode"
for APICV first.
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
SVI should be restored in case guest is processing virtual interrupt
while saveing a domain state. Otherwise SVI would be missed when
virtual interrupt delivery is enabled.
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current logic for handling the non-root VMREAD/VMWRITE is by
VM-Exit and emulate, which may bring certain overhead.
On new Intel platform, it introduces a new feature called VMCS
shadowing, where non-root VMREAD/VMWRITE will not trigger VM-Exit,
and the hardware will read/write the virtual VMCS instead.
This is proved to have performance improvement with the feature.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
After we use the VMREAD/VMWRITE to build up the virtual VMCS, each
access to the virtual VMCS needs two VMPTRLD and one VMCLEAR to
switch the environment, which might be an overhead to performance.
This commit tries to handle multiple virtual VMCS access together
to improve the performance.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before the VMCS shadowing feature, we use memory operation to build up
the virtual VMCS. This does work since this virtual VMCS will never be
loaded into real hardware. However after we introduce the VMCS
shadowing feature, this VMCS will be loaded into hardware, which
requires all fields in the VMCS accessed by VMREAD/VMWRITE.
Besides, the virtual VMCS revision identifer should also meet the
hardware's requirement, instead of using a faked one.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally we use a virtual VMCS field to store the launch state of
a certain vmcs. However if we introduce VMCS shadowing feature, this
virtual VMCS should also be able to load into real hardware,
and VMREAD/VMWRITE operate invalid fields.
The new approach is to store the launch state into a list for L1 VMM.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Expose EPT's and VPID 's basic features to L1 VMM.
For EPT, no EPT A/D bit feature supported.
For VPID, exposes all features to L1 VMM
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Virtualize VPID for the nested vmm, use host's VPID
to emualte guest's VPID. For each virtual vmentry, if
guest'v vpid is changed, allocate a new host VPID for
L2 guest.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Add the INVEPT instruction emulation logic.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Emulate permission check for the nested p2m. Current solution is to
use minimal permission, and once meet permission violation in L0, then
determin whether it is caused by guest EPT or host EPT
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Once found EPT is enabled by L1 VMM, enabled nested EPT support
for L2 guest.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Share the current EPT logic with nested EPT case, so
make the related data structure or operations netural
to comment EPT and nested EPT.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implment guest EPT PT walker, some logic is based on shadow's
ia32e PT walker. During the PT walking, if the target pages are
not in memory, use RETRY mechanism and get a chance to let the
target page back.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
EPT and NPT adopts differnt formats for each-level entry,
so change the walker functions to vendor-specific.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
VMX doesn't have the concept about host cr3 for nested p2m,
and only SVM has, so change it to netural words.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
| |
Use the host value to emulate IA32_VMX_MISC MSR for L1 VMM.
For CR3-target value, we don't support this feature currently and
set the number to zero.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
| |
If L0 is to handle the TSC access, then we need to update guest EIP by
calling update_guest_eip().
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
| |
Besides, use literal name instead of hard numbers for this bit 55 in
IA32_VMX_BASIC_MSR.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
| |
For those default 1 settings in VMX MSR, use some literal name
instead of hard numbers in the code.
Besides, fix the default 1 setting for pin based control MSR.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In nested vmx virtualization for MSR bitmaps, L0 hypervisor will trap
all the VM exit from L2 guest by disable the MSR_BITMAP feature. When
handling this VM exit, L0 hypervisor judges whether L1 hypervisor uses
MSR_BITMAP feature and the corresponding bit is set to 1. If so, L0
will inject such VM exit into L1 hypervisor; otherwise, L0 will be
responsible for handling this VM exit.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
| |
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
| |
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
| |
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
| |
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable secondary processor-based control in VMCS
Besides that, add a helper function to get the certain control bit
in secondary processor-based control MSR.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
| |
basically to benefit from apicv, we need clear MSR bitmap for
corresponding x2apic MSRs:
0x800 - 0x8ff: no read intercept for apicv register virtualization
TPR,EOI,SELF-IPI: no write intercept for virtual interrupt
delivery
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Virtual interrupt delivery avoids Xen to inject vAPIC interrupts
manually, which is fully taken care of by the hardware. This needs
some special awareness into existing interrupr injection path:
For pending interrupt from vLAPIC, instead of direct injection, we may
need update architecture specific indicators before resuming to guest.
Before returning to guest, RVI should be updated if any pending IRRs
EOI exit bitmap controls whether an EOI write should cause VM-Exit. If
set, a trap-like induced EOI VM-Exit is triggered. The approach here
is to manipulate EOI exit bitmap based on value of TMR. Level
triggered irq requires a hook in vLAPIC EOI write, so that vIOAPIC EOI
is triggered and emulated
Signed-off-by: Gang Wei <gang.wei@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
| |
Add APIC register virtualization support
- APIC read doesn't cause VM-Exit
- APIC write becomes trap-like
Signed-off-by: Gang Wei <gang.wei@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Jiongxi Li <jiongxi.li@intel.com>
|
|
|
|
|
|
|
|
| |
Direct calls perform better, so we should prefer them and use indirect
ones only when there indeed is a need for indirection.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous order of relinquish resource is:
relinquish_domain_resources() -> vcpu_destroy() ->
nvmx_vcpu_destroy(). However some L1 resources like nv_vvmcx and
io_bitmaps are free in nvmx_vcpu_destroy(), therefore the
relinquish_domain_resources() will not reduce the refcnt of the domain
to 0, therefore the latter vcpu release functions will not be called.
To fix this issue, we need to release the nv_vvmcx and io_bitmaps in
relinquish_domain_resources().
Besides, after destroy the nested vcpu, we need to switch the
vmx->vmcs back to the L1 and let the vcpu_destroy() logic to free the
L1 VMCS page.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
| |
Define new struct hvm_trap to represent information of trap, and
renames hvm_inject_exception to hvm_inject_trap, then define a couple
of wrappers around that function for existing callers.
Signed-off-by: Keir Fraser <keir@xen.org>
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
| |
Add the BTS functionality to the existing vpmu implementation for Intel
CPUs.
Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
| |
This was always bogus (xen/config.h should have been used instead) and
is superfluous now that xen/config.h gets included through the compiler
command line.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch handle PCID/INVPCID for hvm:
For hap hvm, we enable PCID/INVPCID, since no need to intercept
INVPCID, and we just set INVPCID non-root behavior as running natively;
For shadow hvm, we disable PCID/INVPCID, otherwise we need to emulate
INVPCID at vmm by setting INVPCID non-root behavior as vmexit.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
|