path: root/xen/arch/x86/traps.c
Commit message (Author; Age; Files; Lines)
* x86: print relevant (tail) part of filename for warnings and crashes (Jan Beulich; 2013-10-17; 1 file; -8/+14)
| | | | | | | | | | In particular when the origin construct is in a header file (and hence the file name is an absolute path instead of just the file name portion) the information can otherwise become rather useless when the build tree isn't sitting relatively close to the file system root. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
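A minimal sketch of the idea described above, assuming a simple "keep the last N characters" policy. The helper name `filename_tail` and the length limit are illustrative only and are not taken from the actual patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the last max_len characters
 * of path, or the whole string if it is already short enough.
 */
static const char *filename_tail(const char *path, size_t max_len)
{
    size_t len = strlen(path);

    return len > max_len ? path + (len - max_len) : path;
}
```

With a limit of 11, an absolute header path such as `/home/build/xen/include/xen/sched.h` would be reported as `xen/sched.h`, which is the part that actually identifies the file.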
* x86: use {rd,wr}{fs,gs}base when available (Jan Beulich; 2013-10-11; 1 file; -20/+10)
| | | | | | | | | | | | ... as being intended to be faster than MSR reads/writes. In the case of emulate_privileged_op() also use these in favor of the cached (but possibly stale) addresses from arch.pv_vcpu. This allows entirely removing the code that was the subject of XSA-67. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: correct LDT checks (Jan Beulich; 2013-10-11; 1 file; -5/+22)
| | | | | | | | | | | | | | | | | | | | | | - MMUEXT_SET_LDT should behave as similarly to the LLDT instruction as possible: fail only if the base address is non-canonical - instead LDT descriptor accesses should fault if the descriptor address ends up being non-canonical (by ensuring this we at once avoid reading an entry from the mach-to-phys table and consider it a page table entry) - fault propagation on using LDT selectors must distinguish #PF and #GP (the latter must be raised for a non-canonical descriptor address, which also applies to several other uses of propagate_page_fault(), and hence the problem is being fixed there) - map_ldt_shadow_page() should properly wrap addresses for 32-bit VMs At once remove the odd invocation of map_ldt_shadow_page() from the MMUEXT_SET_LDT handler: There's nothing really telling us that the first LDT page is going to be preferred over others. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
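The #PF versus #GP distinction above hinges on whether an address is canonical. The following is only a sketch, assuming 48-bit virtual addresses; the names `is_canonical_48` and `fault_for_descriptor_addr` are hypothetical and do not come from the Xen source.

```c
#include <stdbool.h>
#include <stdint.h>

enum fault_kind { FAULT_PF, FAULT_GP };

/*
 * Assumption: 48-bit virtual addresses. An address is canonical when
 * bits 63..47 are all zero or all one.
 */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the 17 uppermost bits */

    return top == 0 || top == 0x1ffff;
}

/* Hypothetical: pick the fault to raise for a failed descriptor access. */
static enum fault_kind fault_for_descriptor_addr(uint64_t addr)
{
    return is_canonical_48(addr) ? FAULT_PF : FAULT_GP;
}
```

A non-canonical descriptor address maps to #GP here, while a canonical address that merely fails translation maps to #PF, matching the distinction the commit message calls for.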
* x86: check segment descriptor read result in 64-bit OUTS emulation (Matthew Daley; 2013-10-10; 1 file; -4/+4)
| | | | | | | | | | | | | | | | | | | | When emulating such an operation from a 64-bit context (CS has long mode set), and the data segment is overridden to FS/GS, the result of reading the overridden segment's descriptor (read_descriptor) is not checked. If it fails, data_base is left uninitialized. This can lead to 8 bytes of Xen's stack being leaked to the guest (implicitly, i.e. via the address given in a #PF). Coverity-ID: 1055116 This is CVE-2013-4368 / XSA-67. Signed-off-by: Matthew Daley <mattjd@gmail.com> Fix formatting. Signed-off-by: Jan Beulich <jbeulich@suse.com>
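The class of bug described above (using an output variable after a failed read) can be illustrated with a small sketch. `read_base_stub` is a hypothetical stand-in, not Xen's read_descriptor(), and its success rule is invented purely for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a descriptor lookup: selectors 0..7 succeed,
 * anything else fails and leaves *base untouched.
 */
static bool read_base_stub(unsigned int sel, uint64_t *base)
{
    if ( sel >= 8 )
        return false;
    *base = (uint64_t)sel << 12;
    return true;
}

/*
 * Only hand the base back to the caller when the read succeeded, so an
 * uninitialized stack value can never leak into later address arithmetic.
 */
static bool get_data_base(unsigned int sel, uint64_t *data_base)
{
    uint64_t base;

    if ( !read_base_stub(sel, &base) )
        return false;               /* caller is expected to raise a fault */

    *data_base = base;
    return true;
}
```

The point of the pattern is that the failure path returns before `base` is ever read.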
* x86/traps: improvements to {rd,wr}msr_hypervisor_regs() (Andrew Cooper; 2013-10-09; 1 file; -26/+15)
| | | | | | | | | | | | | | | | | | | Coverity ID: 1055249 1055250 Coverity was complaining that the switch statements contained dead code in their default statements. While this is quite minor, the code flow in wrmsr_hypervisor_regs() was sufficiently opaque that I felt it appropriate to fix. Other improvements include: * not shadowing the function parameter 'idx'. * use of PAGE_{SHIFT,SIZE} instead of open-coded numbers. * a more descriptive error message for attempting to write invalid indices for hypercall pages. There is no behavioural change as a result. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
* x86: Improve information from domain_crash_synchronous (Andrew Cooper; 2013-10-04; 1 file; -0/+11)
| | | | | | | | | | | | | | | | | | | | | As it currently stands, the string "domain_crash_sync called from entry.S" is not helpful at identifying why the domain was crashed, and a debug build of Xen doesn't help the matter. This patch improves the information printed, by pointing to where the crash decision was made. Specific improvements include: * Moving the ascii string "domain_crash_sync called from entry.S\n" away from some semi-hot code cache lines. * Moving the printk into C code (especially as this_cpu() is miserable to use in assembly code) * Undo the previous confusing situation of having the domain_crash_synchronous() as a macro in C code, yet a global symbol in assembly code. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* x86/traps: Record last extable faulting address (Andrew Cooper; 2013-10-04; 1 file; -0/+5)
| | | | | | | | ... so the following patch can identify the location of faults leading to a decision to crash a domain. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* x86/xsave: fix migration from xsave-capable to xsave-incapable host (Jan Beulich; 2013-09-09; 1 file; -2/+2)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With CPUID features suitably masked this is supposed to work, but was completely broken (i.e. the case wasn't even considered when the original xsave save/restore code was written). First of all, xsave_enabled() wrongly returned the value of cpu_has_xsave, i.e. not even taking into consideration attributes of the vCPU in question. Instead this function ought to check whether the guest ever enabled xsave support (by writing a [non-zero] value to XCR0). As a result of this, a vCPU's xcr0 and xcr0_accum must no longer be initialized to XSTATE_FP_SSE (since that's a valid value a guest could write to XCR0), and the xsave/xrstor as well as the context switch code need to suitably account for this (by always enforcing at least this part of the state to be saved/loaded). This involves undoing large parts of c/s 22945:13a7d1f7f62c ("x86: add strictly sanity check for XSAVE/XRSTOR") - we need to cleanly distinguish between hardware capabilities and vCPU used features. Next both HVM and PV save code needed tweaking to not always save the full state supported by the underlying hardware, but just the parts that the guest actually used. Similarly the restore code should bail not just on state being restored that the hardware cannot handle, but also on inconsistent save state (inconsistent XCR0 settings or size of saved state not in line with XCR0). And finally the PV extended context get/set code needs to use slightly different logic than the HVM one, as here we can't just key off of xsave_enabled() (i.e. avoid doing anything if a guest doesn't use xsave) because the tools use this function to determine host capabilities as well as read/write vCPU state. The set operation in particular needs to be capable of cleanly dealing with input that consists of only the xcr0 and xcr0_accum values (if they're both zero then no further data is required). 
While for things to work correctly both sides (saving _and_ restoring host) need to run with the fixed code, afaict no breakage should occur if either side isn't up to date (other than the breakage that this patch attempts to fix). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Yang Zhang <yang.z.zhang@intel.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: move struct bug_frame instances out of line (Jan Beulich; 2013-08-23; 1 file; -43/+45)
| | | | | | | | | | | | | | Just like Linux did many years ago, move them into a separate (data) section, such that they no longer pollute instruction caches and TLBs. Assertion frames, requiring two pointers to be stored, occupy two slots in the array, with the second slot mimicking a frame the location pointer of which doesn't match any address within .text or .init.text (it effectively points back to the slot itself, which - being in a data section - can't be reached by non-buggy execution). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* watchdog: Move watchdog from being x86 specific to common code (Andrew Cooper; 2013-08-13; 1 file; -0/+1)
| | | | | | | | | | | | | | | Augment watchdog_setup() to be able to possibly return an error, and introduce watchdog_enabled() as a better alternative to knowing the architecture's internal details. This patch does not change the x86 implementation, beyond making it compile. For header files, some includes of xen/nmi.h were only for the watchdog functions, so are replaced rather than adding an extra include of xen/watchdog.h. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: fix XCR0 handling (Jan Beulich; 2013-06-04; 1 file; -19/+3)
| | | | | | | | | | | | | | - both VMX and SVM ignored the ECX input to XSETBV - both SVM and VMX used the full 64-bit RAX when calculating the input mask to XSETBV - faults on XSETBV did not get recovered from Also consolidate the handling for PV and HVM into a single function, and make the per-CPU variable "xcr0" static to xstate.c. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
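For illustration, the second point above concerns how the 64-bit XSETBV operand is formed: the instruction takes its value from EDX:EAX, so only the low 32 bits of each register should contribute. The helper name `xsetbv_value` is hypothetical; this is a sketch, not the consolidated Xen function.

```c
#include <stdint.h>

/*
 * Build the XSETBV operand from guest register contents. The casts to
 * uint32_t deliberately discard the upper halves of RAX and RDX.
 */
static uint64_t xsetbv_value(uint64_t rax, uint64_t rdx)
{
    return ((uint64_t)(uint32_t)rdx << 32) | (uint32_t)rax;
}
```

With this, stale data in the upper half of RAX cannot influence the value written to XCR0.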
* x86/xsave: properly check guest input to XSETBV (Jan Beulich; 2013-06-04; 1 file; -0/+5)
| | | | | | | | | | Other than the HVM emulation path, the PV case so far failed to check that YMM state requires SSE state to be enabled, allowing for a #GP to occur upon passing the inputs to XSETBV inside the hypervisor. This is CVE-2013-2078 / XSA-54. Signed-off-by: Jan Beulich <jbeulich@suse.com>
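A sketch of the kind of check the commit describes, under stated assumptions: the bit positions follow the architectural XCR0 layout (x87 = bit 0, SSE = bit 1, YMM = bit 2), while the function name `xcr0_valid` and the choice to also require the x87 bit are illustrative. A real check would additionally compare against the features the hardware supports.

```c
#include <stdbool.h>
#include <stdint.h>

#define XSTATE_FP   (1ULL << 0)   /* x87 state */
#define XSTATE_SSE  (1ULL << 1)   /* SSE state */
#define XSTATE_YMM  (1ULL << 2)   /* AVX (YMM) state */

/*
 * Reject XCR0 values that the processor would refuse with #GP:
 * the x87 bit must be set, and YMM state requires SSE state.
 */
static bool xcr0_valid(uint64_t xcr0)
{
    if ( !(xcr0 & XSTATE_FP) )
        return false;
    if ( (xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE) )
        return false;
    return true;
}
```

Validating before executing XSETBV means a bad guest value turns into a fault delivered to the guest rather than a #GP inside the hypervisor.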
* xen: move for_each_set_bit to xen/bitops.h (Stefano Stabellini; 2013-05-08; 1 file; -1/+1)
| | | | | | | | Move for_each_set_bit from asm-x86/bitops.h to xen/bitops.h. Replace #include <asm/bitops.h> with #include <xen/bitops.h> everywhere. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* rename IS_PRIV to is_hardware_domain (Daniel De Graaf; 2013-05-07; 1 file; -6/+6)
| | | | | | | | | | | Since the remaining uses of IS_PRIV are actually concerned with the domain having control of the hardware (i.e. being the initial domain), clarify this by renaming IS_PRIV to is_hardware_domain. This also removes IS_PRIV_FOR since the only remaining user was xsm/dummy.h. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (for 4.3 release) Acked-by: Keir Fraser <keir@xen.org>
* x86: handle paged gfn in wrmsr_hypervisor_regs (Olaf Hering; 2013-05-07; 1 file; -3/+11)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If xenpaging is started very early for a guest the gfn for the hypercall page may be paged-out already. This leads to a guest crash: ... (XEN) HVM10: Allocated Xen hypercall page at 169ff000 (XEN) traps.c:654:d10 Bad GMFN 169ff (MFN 3e900000000) to MSR 40000000 (XEN) HVM10: Detected Xen v4.3 (XEN) io.c:201:d10 MMIO emulation failed @ 0008:c2c2c2c2: 18 7c 55 6d 03 83 ff ff 10 7c (XEN) hvm.c:1253:d10 Triple fault on VCPU0 - invoking HVM shutdown action 1. (XEN) HVM11: HVM Loader ... Update return codes of wrmsr_hypervisor_regs, update callers to deal with the new return codes: 0: not handled 1: handled -EAGAIN: retry Currently wrmsr_hypervisor_regs will not return the following error, it will be added in a separate patch: -EINVAL: error during handling Also update the gdprintk to handle a page value of NULL to avoid printing a bogus MFN value. Update also computing of MSR value in gdprintk, the idx was always zero. Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Keir Fraser <keir@xen.org>
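The return-code convention listed above (0, 1, -EAGAIN) could be consumed by a caller roughly as follows. This is only a sketch: the enum and the function `classify_wrmsr_result` are hypothetical, and treating every other value as an error is an assumption, since the commit says -EINVAL is introduced separately.

```c
#include <errno.h>

enum msr_action {
    MSR_PASS_ON,   /* 0: not handled, try other MSR handlers */
    MSR_DONE,      /* 1: handled */
    MSR_RETRY,     /* -EAGAIN: e.g. page was paged out, retry later */
    MSR_ERROR      /* assumption: anything else is treated as an error */
};

/* Map a wrmsr_hypervisor_regs()-style return code onto a caller action. */
static enum msr_action classify_wrmsr_result(int rc)
{
    switch ( rc )
    {
    case 0:
        return MSR_PASS_ON;
    case 1:
        return MSR_DONE;
    case -EAGAIN:
        return MSR_RETRY;
    default:
        return MSR_ERROR;
    }
}
```

Separating "retry" from "not handled" is what lets a caller re-execute the write once the paged-out hypercall page is available again, instead of crashing the guest.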
* x86: miscellaneous mm.c cleanup (Jan Beulich; 2013-05-02; 1 file; -2/+2)
| | | | | | | | | | | | This simply streamlines code in a few places, where room for improvement was noticed during the earlier here and the patches in the XSA-45 series. This also drops the bogus use of the domain lock in the CR3 write emulation (which protected against nothing). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
* x86: make new_guest_cr3() preemptible (Jan Beulich; 2013-05-02; 1 file; -2/+13)
| | | | | | | | | ... as it may take significant amounts of time. This is part of CVE-2013-1918 / XSA-45. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
* Fix emacs local variable block to use correct C style variable. (David Vrabel; 2013-02-21; 1 file; -1/+1)
| | | | | | | The emacs variable to set the C style from a local variable block is c-file-style, not c-set-style. Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* x86: properly use map_domain_page() in miscellaneous places (Jan Beulich; 2013-01-23; 1 file; -1/+5)
| | | | | Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: properly use map_domain_page() during page table manipulation (Jan Beulich; 2013-01-23; 1 file; -0/+5)
| | | | | Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: restore (optional) forwarding of PCI SERR induced NMI to Dom0 (Jan Beulich; 2013-01-22; 1 file; -3/+15)
| | | | | | | | | | | | | | | | c/s 22949:54fe1011f86b removed the forwarding of NMIs to Dom0 when they were caused by PCI SERR. NMI buttons as well as BMCs (like HP's iLO) may however want such events to be seen in Dom0 (e.g. to trigger a dump). Therefore restore most of the functionality which named c/s removed (adjusted for subsequent changes, and adjusting the public interface to use the modern term, retaining the old one for backwards compatibility). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: handle both NMI kinds if they occur simultaneously (Jan Beulich; 2013-01-17; 1 file; -3/+3)
| | | | | | | | | | | We shouldn't assume PCI SERR excludes IOCHK. Once at it, also remove the doubly redundant range restriction on "reason" - the variable already is "unsigned char". Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
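A sketch of the control-flow change described above: test each reason bit independently instead of using an else-if chain. The bit values assume the conventional PC layout of system control port 0x61 (bit 7 = PCI SERR, bit 6 = IOCHK); the macro and function names are hypothetical.

```c
/* Assumed layout of the NMI reason byte read from port 0x61. */
#define NMI_REASON_SERR   0x80u
#define NMI_REASON_IOCHK  0x40u

/*
 * Count how many NMI sources would be serviced for a given reason byte.
 * Two independent 'if' statements ensure that a simultaneous SERR and
 * IOCHK are both handled rather than only the first one found.
 */
static unsigned int nmi_sources_handled(unsigned char reason)
{
    unsigned int handled = 0;

    if ( reason & NMI_REASON_SERR )
        handled++;                  /* a SERR handler would run here */
    if ( reason & NMI_REASON_IOCHK )
        handled++;                  /* an IOCHK handler would run here */

    return handled;
}
```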
* xen/xsm: Add xsm_default parameter to XSM hooks (Daniel De Graaf; 2013-01-11; 1 file; -1/+1)
| | | | | | | | | | | | | | Include the default XSM hook action as the first argument of the hook to facilitate quick understanding of how the call site is expected to be used (dom0-only, arbitrary guest, or device model). This argument does not solely define how a given hook is interpreted, since any changes to the hook's default action need to be made identically to all callers of a hook (if there are multiple callers; most hooks only have one), and may also require changing the arguments of the hook. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Keir Fraser <keir@xen.org>
* x86: save/restore only partial register state where possible (Jan Beulich; 2012-10-30; 1 file; -0/+2)
| | | | | | | | | | | | | | | | | | | | | | | | ... and make restore conditional not only upon having saved the state, but also upon whether saved state was actually modified (and register values are known to have been preserved). Note that RBP is unconditionally considered a volatile register (i.e. irrespective of CONFIG_FRAME_POINTER), since the RBP handling would become overly complicated due to the need to save/restore it on the compat mode hypercall path [6th argument]. Note further that for compat mode code paths, saving/restoring R8...R15 is entirely unnecessary - we don't allow those guests to enter 64-bit mode, and hence they have no way of seeing these registers' contents (and there consequently also is no information leak, except if the context saving domctl would be considered such). Finally, note that this may not properly deal with gdbstub's needs, yet (but if so, I can't really suggest adjustments, as I don't know that code). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* xen: replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when appropriate (Stefano Stabellini; 2012-10-17; 1 file; -1/+1)
| | | | | | | | | | | | Note: these changes don't make any difference on x86. Replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when it is used as an hypercall argument. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Committed-by: Ian Campbell <ian.campbell@citrix.com>
* x86: consolidate frame state manipulation functions (Jan Beulich; 2012-10-04; 1 file; -28/+0)
| | | | | | | | Rather than doing this in multiple places, have a single central function (decode_register()) to be used by all other code. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: replace literal numbers (Jan Beulich; 2012-09-28; 1 file; -2/+2)
| | | | | | | | | In various cases, 256 was being used instead of NR_VECTORS or a derived ARRAY_SIZE() expression. In one case (guest_has_trap_callback()), a wrong (unrelated) constant was used instead of NR_VECTORS. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
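The pattern being recommended can be sketched as follows. NR_VECTORS and ARRAY_SIZE() are named in the commit message; the `handlers` table and `count_installed()` are made up for the example.

```c
#include <stddef.h>

#define NR_VECTORS 256
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical per-vector table; non-zero means a handler is installed. */
static int handlers[NR_VECTORS];

/*
 * Bound the loop by the array's own size instead of a literal 256, so
 * the loop stays correct if the table's dimension ever changes.
 */
static size_t count_installed(void)
{
    size_t i, n = 0;

    for ( i = 0; i < ARRAY_SIZE(handlers); i++ )
        if ( handlers[i] )
            n++;

    return n;
}
```

Deriving the bound from the array also avoids the failure mode mentioned above, where an unrelated constant happens to be used in place of the intended one.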
* x86: vMCE emulation (Liu, Jinsong; 2012-09-26; 1 file; -44/+0)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides virtual MCE support to guest. It emulates a simple and clean MCE MSRs interface to guest by faking caps to guest if needed and masking caps if unnecessary: 1. Providing a well-defined MCG_CAP to guest, filter out un-necessary caps and provide only guest needed caps; 2. Disabling MCG_CTL to avoid model specific; 3. Sticking all 1's to MCi_CTL to guest to avoid model specific; 4. Enabling CMCI cap but never really inject to guest to prevent polling periodically; 5. Masking MSCOD field of MCi_STATUS to avoid model specific; 6. Keeping natural semantics by per-vcpu instead of per-domain variables; 7. Using bank1 and reserving bank0 to work around 'bank0 quirk' of some very old processors; 8. Cleaning some vMCE# injection logic which is shared by Intel and AMD but useless under the new vMCE implementation; 9. Keeping compatible w/ old xen version which has been backported to SLES11 SP2, so that old vMCE would not be blocked when migrating to new vMCE; Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> - make printing consistent (and non-exploitable) - fix return values of intel_mce_{rd,wr}msr() for out of range banks - miscellaneous cleanup Signed-off-by: Jan Beulich <jbeulich@suse.com> Committed-by: Jan Beulich <jbeulich@suse.com>
* x86: slightly improve stack trace on debug builds (Jan Beulich; 2012-09-26; 1 file; -4/+12)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As was rather obvious from crashes recently happening in stage testing, the debug hypervisor, in that special case, has a drawback compared to the non-debug one: When a call through a bad pointer happens, there's no frame, and the top level (and frequently most important for analysis) stack entry would get skipped: (XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]---- (XEN) CPU: 1 (XEN) RIP: e008:[<0000000000000000>] ??? (XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor (XEN) rax: 0000000000000008 rbx: 0000000000000001 rcx: 0000000000000003 (XEN) rdx: 0000003db54eb700 rsi: 7fffffffffffffff rdi: 0000000000000001 (XEN) rbp: ffff8302357e7ee0 rsp: ffff8302357e7e58 r8: 0000000000000000 (XEN) r9: 000000000000003e r10: ffff8302357e7f18 r11: ffff8302357e7f18 (XEN) r12: ffff8302357ee340 r13: ffff82c480263980 r14: ffff8302357ee3d0 (XEN) r15: 0000000000000001 cr0: 000000008005003b cr4: 00000000000026f0 (XEN) cr3: 00000000bf473000 cr2: 0000000000000000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) Xen stack trace from rsp=ffff8302357e7e58: (XEN) ffff82c4801a3d05 ffff8302357eca70 0000000800000020 ffff82c4802ead60 (XEN) 0000000000000001 ffff8302357e7ea0 ffff82c48016bf07 0000000000000000 (XEN) 0000000000000000 ffff8302357e7ee0 fffff830fffff830 0000000000000046 (XEN) ffff8302357e7f18 ffff82c480263980 ffff8302357e7f18 0000000000000000 (XEN) 0000000000000000 ffff8302357e7f10 ffff82c48015c2be 8302357dc0000fff ... (XEN) Xen call trace: (XEN) [<0000000000000000>] ??? (XEN) [<ffff82c48015c2be>] idle_loop+0x6c/0x7a (XEN) (XEN) Pagetable walk from 0000000000000000: Since the bad pointer is being printed anyway (as part of the register state), replace it with the top of stack value in such a case. 
With the introduction of is_active_kernel_text(), use it also at the (few) other suitable places (I intentionally didn't replace the use in xen/arch/arm/mm.c - while it would be functionally correct, the dependency on system_state wouldn't be from an abstract perspective). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: HYPERVISOR_VIRT_END is always defined. Remove ifdef'ery. (Keir Fraser; 2012-09-12; 1 file; -5/+0)
| | | | Signed-off-by: Keir Fraser <keir@xen.org>
* x86: Remove CONFIG_COMPAT ifdef'ery from arch/x86 -- it is always defined. (Keir Fraser; 2012-09-12; 1 file; -16/+2)
| | | | Signed-off-by: Keir Fraser <keir@xen.org>
* x86: We can assume CONFIG_PAGING_LEVELS==4. (Keir Fraser; 2012-09-12; 1 file; -16/+0)
| | | | Signed-off-by: Keir Fraser <keir@xen.org>
* xen: Remove x86_32 build target. (Keir Fraser; 2012-09-12; 1 file; -72/+1)
| | | | Signed-off-by: Keir Fraser <keir@xen.org>
* x86/PCI: pass correct register value to XSM (Daniel De Graaf; 2012-06-22; 1 file; -0/+12)
| | | | | | | | | | | | | | | | When attempting to use AMD's extension to access the extended PCI config space, only the low byte of the register number was being passed to XSM. Include the correct value of the register if this feature is enabled; otherwise, bits 24-30 of port cf8 are reserved, so disallow the invalid access. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Don't fail the permission check except when the MSR can't be read. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Committed-by: Jan Beulich <jbeulich@suse.com>
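A rough sketch of the register decoding this commit is concerned with. It assumes the usual 0xcf8 layout (bits 7:2 select the register) and that AMD's extension places register bits 11:8 in bits 27:24 of the value written to 0xcf8; the function name and the `ext_cfg_enabled` parameter are hypothetical, and the way the real code detects the feature (via an MSR) is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode the config-space register number from a 0xcf8 value.
 * Returns 0 and stores the register in *reg, or -1 when reserved bits
 * 24-30 are set although the extension is not enabled.
 */
static int cf8_register(uint32_t cf8, bool ext_cfg_enabled, unsigned int *reg)
{
    unsigned int r = cf8 & 0xfc;            /* low register bits 7:2 */

    if ( ext_cfg_enabled )
        r |= (cf8 >> 16) & 0xf00;           /* assumed: bits 27:24 -> 11:8 */
    else if ( cf8 & 0x7f000000 )
        return -1;                          /* reserved bits set: disallow */

    *reg = r;
    return 0;
}
```

Passing the full register number, rather than only its low byte, is what allows a policy check to tell an access to register 0x110 apart from one to register 0x10.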
* AMD IOMMU: add mechanism to protect their PCI devices' config spaces (Jan Beulich; 2012-06-22; 1 file; -12/+22)
| | | | | | | | | | | | | | | | | | | Recent Dom0 kernels want to disable PCI MSI on all devices, yet doing so on AMD IOMMUs (which get represented by a PCI device) disables part of the functionality set up by the hypervisor. Add a mechanism to mark certain PCI devices as having write protected config spaces (both through port based [method 1] accesses and, for x86-64, mmconfig), and use that for AMD's IOMMUs. Note that due to ptwr_do_page_fault() being run first, there'll be a MEM_LOG() issued for each such mmconfig based write attempt. If that's undesirable, the order of the calls in fixup_page_fault() would need to be swapped. Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Wei Wang <wei.wang2@amd.com> Acked-by: Keir Fraser <keir@xen.org>
* x86/PCI: fix guest_io_read() when pci_cfg_ok() denies access (Jan Beulich; 2012-06-22; 1 file; -1/+1)
| | | | | | | | | | | | | | For a multi-byte aligned read, this so far resulted in 0x00ff to be put in the guest's register rather than 0xffff or 0xffffffff, which in turn could confuse bus scanning functions (which, when reading vendor and/or device IDs, expect to get back all zeroes or all ones). As the value gets masked to the read width when merging back into the full result, setting the initial value to all ones should not harm any or the other cases. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
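The fix described above amounts to starting from an all-ones value and masking it to the access width. A minimal sketch, with the hypothetical helper name `denied_read_value`:

```c
#include <stdint.h>

/*
 * Value returned to the guest for a denied port read of 1, 2 or 4 bytes:
 * start from all ones and mask down to the access width, so a 2-byte
 * read yields 0xffff rather than 0x00ff.
 */
static uint32_t denied_read_value(unsigned int bytes)
{
    uint32_t data = ~0u;

    /* Avoid shifting a 32-bit value by 32, which is undefined in C. */
    return bytes >= 4 ? data : data & ((1u << (bytes * 8)) - 1);
}
```

Bus-scanning code that reads a 16-bit vendor ID then sees 0xffff, the conventional "no device present" value.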
* x86-64: don't allow non-canonical addresses to be set for any callback (Jan Beulich; 2012-06-18; 1 file; -0/+6)
| | | | | | | | | | Rather than deferring the detection of these to the point where they get actually used (the fix for XSA-7, 25480:76eaf5966c05, causing a #GP to be raised by IRET, which invokes the guest's [fragile] fail-safe callback), don't even allow such to be set. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* x86/nmi: Fix deadlock in unknown_nmi_error() (Andrew Cooper; 2012-06-11; 1 file; -4/+4)
| | | | | | | | | Additionally, correct the text description to reflect what is being done, and make use of fatal_trap() in preference to kexec_crash() in case an unknown NMI occurs before a kdump kernel has been loaded. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Committed-by: Keir Fraser <keir@xen.org>
* x86: Use get_page_from_gfn() instead of get_gfn()/put_gfn. (Tim Deegan; 2012-05-17; 1 file; -13/+14)
| | | | Signed-off-by: Tim Deegan <tim@xen.org>
* X86: expose HLE/RTM features to dom0 (Liu, Jinsong; 2012-02-28; 1 file; -0/+2)
| | | | | | | | | Intel recently released 2 new features, HLE and RTM. Refer to http://software.intel.com/file/41417. This patch exposes them to dom0. Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
* x86: avoid deadlock after a PCI SERR NMI (David Vrabel; 2012-02-02; 1 file; -3/+10)
| | | | | | | | | | | | | | If a PCI System Error (SERR) is asserted it causes an NMI. If this NMI occurs while the CPU is in printk() then Xen may deadlock as pci_serr_error() calls console_force_unlock() which screws up the console lock. printk() isn't safe to call from NMI context so defer the diagnostic message to a softirq. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Tested-by: George Dunlap <george.dunlap@eu.citrix.com> Committed-by: Keir Fraser <keir@xen.org>
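The deferral pattern described above can be sketched in portable C11 as a flag that the NMI path sets and a later, safer context consumes. Everything here is illustrative: the function names are hypothetical, `puts()` stands in for printk(), and Xen's real softirq machinery is not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Set from NMI context, consumed later from a context where printing is safe. */
static atomic_bool serr_pending;

/* NMI path: do nothing that could take the console lock, only record the event. */
static void nmi_handler_sketch(void)
{
    atomic_store(&serr_pending, true);
}

/*
 * Deferred path (stand-in for a softirq handler): print the diagnostic
 * once per recorded event. Returns true if a message was printed.
 */
static bool softirq_handler_sketch(void)
{
    if ( !atomic_exchange(&serr_pending, false) )
        return false;

    puts("PCI system error (SERR) reported via NMI");
    return true;
}
```

Because the NMI path never touches the console, it cannot deadlock against a printk() that was interrupted on the same CPU.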
* x86: Use defines for bits of MSR_IA32_DEBUGCTLMSR instead of numbers (Dietmar Hahn; 2012-02-01; 1 file; -2/+2)
| | | | | Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> Committed-by: Jan Beulich <jbeulich@suse.com>
* x86: Make asmlinkage explicitly a no-op, and avoid usage in arch/x86 (Keir Fraser; 2012-01-15; 1 file; -15/+15)
| | | | Signed-off-by: Keir Fraser <keir@xen.org>
* xsm: add checks on PCI configuration access (Daniel De Graaf; 2011-12-18; 1 file; -4/+22)
| | | | | | | | PCI configuration access is allowed to any privileged domain regardless of I/O port access restrictions; add XSM hooks for these accesses. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
* X86: Disable PCID/INVPCID for dom0 (Liu, Jinsong; 2011-12-01; 1 file; -0/+1)
| | | | | | | | | | | | | | | | PCID (Process-context identifier) is a facility by which a logical processor may cache information for multiple linear-address spaces. INVPCID is a new instruction to invalidate TLB entries. Refer to the latest Intel SDM http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html We disable PCID/INVPCID for dom0 and pv. Exposing them to dom0 and pv may result in performance regression, and it would trigger GP or UD depending on whether the platform supports INVPCID or not. This patch disables PCID/INVPCID for dom0. Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
* X86: expose Intel new features to dom0 (Liu, Jinsong; 2011-12-01; 1 file; -2/+5)
| | | | | | | | This patch exposes new Intel features to dom0, including FMA/AVX2/BMI1/BMI2/LZCNT/MOVBE. Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Committed-by: Jan Beulich <jbeulich@suse.com>
* Modify naming of queries into the p2m (Andres Lagar-Cavilla; 2011-11-11; 1 file; -5/+14)
| | | | | | | | | | | | | | | | | | | | | | Callers of lookups into the p2m code are now variants of get_gfn. All callers need to call put_gfn. The code behind it is a no-op at the moment, but will change to proper locking in a later patch. This patch does not change functionality. Only naming, and adds put_gfn's. set_p2m_entry retains its name because it is always called with p2m_lock held. This patch is humongous, unfortunately, given the dozens of call sites involved. After this patch, anyone using old style gfn_to_mfn will not succeed in compiling their code. This is on purpose: adapt to the new API. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Keir Fraser <keir@xen.org>
* introduce and use nr_cpu_ids and nr_cpumask_bits (Jan Beulich; 2011-10-21; 1 file; -1/+1)
| | | | | | | | | | | | | | | The former is the runtime equivalent of NR_CPUS (and users of NR_CPUS, where necessary, get adjusted accordingly), while the latter is for the sole use of determining the allocation size when dynamically allocating CPU masks (done later in this series). Adjust accessors to use either of the two to bound their bitmap operations - which one gets used depends on whether accessing the bits in the gap between nr_cpu_ids and nr_cpumask_bits is benign but more efficient. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: Further fixes for xsave leaf in pv_cpuid(). (Shan Haitao; 2011-10-13; 1 file; -5/+7)
| | | | | Signed-off-by: Shan Haitao <haitao.shan@intel.com> Committed-by: Keir Fraser <keir@xen.org>
* constify vcpu_set_affinity()'s second parameter (Jan Beulich; 2011-10-13; 1 file; -8/+2)
| | | | | | | | | | | None of the callers actually make use of the function's returning of the old affinity through its second parameter, and eliminating this capability allows some callers to no longer use a local variable here, reducing their stack footprint significantly when building with large NR_CPUS. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>