path: root/xen/arch/x86/x86_64/traps.c
Commit message | Author | Age | Files | Lines
* x86: check for canonical address before doing page walks | Jan Beulich | 2013-10-11 | 1 | -0/+2
  ... as there doesn't really exist any valid mapping for them. Particularly in the case of do_page_walk() this also avoids returning non-NULL for such invalid input.
  Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>
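The check described in this commit can be illustrated with a minimal sketch. It assumes 48 implemented virtual address bits (4-level paging); `is_canonical()` and `walk_or_null()` are hypothetical names for illustration, not the functions touched by the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits (4-level paging). */
#define VADDR_BITS 48

/*
 * An address is canonical when bits 63..47 are all equal, i.e. the
 * upper 17 bits are either all zero or all one.
 */
static bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> (VADDR_BITS - 1);

    return top == 0 || top == 0x1ffff;
}

/*
 * Hypothetical stand-in for a page walk: reject non-canonical input
 * up front, so that no mapping is ever returned for it.
 */
static const void *walk_or_null(uint64_t addr, const void *mapping)
{
    if (!is_canonical(addr))
        return NULL;

    return mapping;
}
```

Doing the check before the walk means the walker never indexes page tables with an address that the hardware itself would refuse to translate.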
* watchdog/crash: Always disable watchdog in console_force_unlock() | Andrew Cooper | 2013-08-13 | 1 | -2/+0
  Depending on the state of the conring and serial_tx_buffer, console_force_unlock() can be a long running operation, usually because of serial_start_sync().
  XenServer testing has found a reliable case where console_force_unlock() on one PCPU takes long enough for another PCPU to time out due to the watchdog (such as waiting for a tlb flush callin). The watchdog timeout causes the second PCPU to repeat the console_force_unlock(), at which point the first PCPU typically fails an assertion in spin_unlock_irqrestore(&port->tx_lock) (because the tx_lock has been unlocked behind itself).
  console_force_unlock() is only on emergency paths, so one way or another the host is going down. Disable the watchdog before forcing the console lock to help prevent having pcpus competing with each other to bring the host down.
  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>
* watchdog: Move watchdog from being x86 specific to common code | Andrew Cooper | 2013-08-13 | 1 | -0/+1
  Augment watchdog_setup() to be able to possibly return an error, and introduce watchdog_enabled() as a better alternative to knowing the architecture's internal details.
  This patch does not change the x86 implementation, beyond making it compile.
  For header files, some includes of xen/nmi.h were only for the watchdog functions, so are replaced rather than adding an extra include of xen/watchdog.h.
  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86: Special case __HYPERVISOR_iret rather more when writing hypercall pages | Andrew Cooper | 2013-07-16 | 1 | -2/+5
  In all cases when a hypercall page is written, __HYPERVISOR_iret is first written as a regular hypercall, then subsequently rewritten in its special case.
  For VMX and SVM, this means that following the ud2a instruction is 3 bytes of an imm32 parameter. For a ring3 kernel, this means that following the syscall instruction is the second half of 'pop %r11'. For a ring1 kernel, the iret case ends up as the same number of bytes as the rest of the hypercalls, but it is pointless writing it twice, and is changed for consistency.
  Therefore, skip the loop iteration which would write the incorrect __HYPERVISOR_iret hypercall. This removes junk machine code from the tail and makes disassemblers rather more happy when looking at the hypercall page.
  Also, a miscellaneous whitespace fix in the comment for ring3 kernel.
  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
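The loop change described above might look roughly like the following simplified model. The 32-byte stub size, the int3 (0xcc) filler and both stub writers are assumptions made for illustration; the stubs here are placeholders, not the machine code the hypervisor actually emits.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE        4096
#define STUB_SIZE        32   /* assumption: one 32-byte slot per hypercall */
#define HYPERVISOR_IRET  23   /* __HYPERVISOR_iret in Xen's public headers */

/* Simplified generic stub: "mov $nr, %eax" (opcode 0xb8 followed by imm32). */
static void write_generic_stub(uint8_t *p, uint32_t nr)
{
    p[0] = 0xb8;
    p[1] = (uint8_t)nr;
    p[2] = (uint8_t)(nr >> 8);
    p[3] = (uint8_t)(nr >> 16);
    p[4] = (uint8_t)(nr >> 24);
}

/* Hypothetical placeholder for the shorter, special-case iret stub. */
static void write_iret_stub(uint8_t *p)
{
    p[0] = 0x50; /* push %rax; the real stub differs per guest type */
}

static void fill_hypercall_page(uint8_t *page)
{
    unsigned int i;

    memset(page, 0xcc, PAGE_SIZE); /* int3 filler between and after stubs */

    for (i = 0; i < PAGE_SIZE / STUB_SIZE; i++) {
        /* Skip the slot that is written in its special form below. */
        if (i == HYPERVISOR_IRET)
            continue;
        write_generic_stub(page + i * STUB_SIZE, i);
    }

    write_iret_stub(page + HYPERVISOR_IRET * STUB_SIZE);
}
```

Because the iret slot is never filled with the longer generic stub, no stale bytes of that stub remain after the shorter special-case sequence.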
* Fix emacs local variable block to use correct C style variable. | David Vrabel | 2013-02-21 | 1 | -1/+1
  The emacs variable to set the C style from a local variable block is c-file-style, not c-set-style.
  Signed-off-by: David Vrabel <david.vrabel@citrix.com>
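For reference, a local variable block of the kind this commit fixes might look like the sketch below. The style name and the other settings are assumptions chosen for illustration; only the variable name `c-file-style` is taken from the commit message.

```c
/*
 * Local variables:
 * mode: C
 * c-file-style: "BSD"
 * c-basic-offset: 4
 * indent-tabs-mode: nil
 * End:
 */
```

Emacs reads such a block from the end of the file when the file is opened and applies each `variable: value` pair to the buffer.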
* xen: Define debug_build() based on NDEBUG. Use it in a few printk's. | Keir Fraser | 2013-01-30 | 1 | -6/+1
  Signed-off-by: Keir Fraser <keir@xen.org>
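A minimal sketch of how a `debug_build()` predicate can be derived from NDEBUG follows. The macro bodies and the `build_flavour()` helper are assumptions for illustration and are not copied from the Xen tree.

```c
/* Evaluates to 0 in release builds (NDEBUG defined), 1 otherwise. */
#ifdef NDEBUG
#define debug_build() 0
#else
#define debug_build() 1
#endif

/* Hypothetical helper: text a banner or register dump could print. */
static const char *build_flavour(void)
{
    return debug_build() ? "debug=y" : "debug=n";
}
```

Using an expression instead of `#ifdef` blocks at each call site keeps both branches visible to the compiler, so neither variant can silently stop compiling.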
* x86: properly use map_domain_page() during page table manipulation | Jan Beulich | 2013-01-23 | 1 | -4/+8
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86: also print CRn register values upon double fault | Jan Beulich | 2012-12-20 | 1 | -16/+13
  Do so by simply re-using _show_registers().
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/IST: Create set_ist() helper function | Andrew Cooper | 2012-12-11 | 1 | -3/+3
  ... to save using open-coded bitwise operations, and update all IST manipulation sites to use the function.
  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
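The kind of helper this commit introduces can be sketched as follows. The sketch relies on the architectural layout of a 64-bit gate descriptor, in which the IST index occupies bits 32 to 34 of the low quadword; `struct gate64` and `get_ist()` are hypothetical names, and the real helper in the tree may differ in signature and in how it writes the entry.

```c
#include <stdint.h>

/* Hypothetical view of a 16-byte gate descriptor as two quadwords. */
struct gate64 {
    uint64_t a; /* low quadword: offset[15:0], selector, IST, type, ... */
    uint64_t b; /* high quadword: offset[63:32] */
};

#define IST_SHIFT 32
#define IST_MASK  (UINT64_C(7) << IST_SHIFT)

/* Replace the 3-bit IST field, leaving every other bit untouched. */
static void set_ist(struct gate64 *g, unsigned int ist)
{
    g->a = (g->a & ~IST_MASK) | ((uint64_t)(ist & 7) << IST_SHIFT);
}

/* Read the IST field back; 0 means "no stack switch". */
static unsigned int get_ist(const struct gate64 *g)
{
    return (unsigned int)((g->a & IST_MASK) >> IST_SHIFT);
}
```

Keeping the mask and shift in one place avoids repeating the bit arithmetic at each site that changes which stack an exception uses.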
* x86: mark certain items static | Jan Beulich | 2012-12-07 | 1 | -1/+1
  ..., and at once constify the data items among them.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86: save/restore only partial register state where possible | Jan Beulich | 2012-10-30 | 1 | -5/+10
  ... and make restore conditional not only upon having saved the state, but also upon whether saved state was actually modified (and register values are known to have been preserved).
  Note that RBP is unconditionally considered a volatile register (i.e. irrespective of CONFIG_FRAME_POINTER), since the RBP handling would become overly complicated due to the need to save/restore it on the compat mode hypercall path [6th argument].
  Note further that for compat mode code paths, saving/restoring R8...R15 is entirely unnecessary - we don't allow those guests to enter 64-bit mode, and hence they have no way of seeing these registers' contents (and there consequently also is no information leak, except if the context saving domctl would be considered such).
  Finally, note that this may not properly deal with gdbstub's needs, yet (but if so, I can't really suggest adjustments, as I don't know that code).
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* xen: replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when appropriate | Stefano Stabellini | 2012-10-17 | 1 | -1/+1
  Note: these changes don't make any difference on x86.
  Replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when it is used as a hypercall argument.
  Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Committed-by: Ian Campbell <ian.campbell@citrix.com>
* x86: enable VIA CPU support | Jan Beulich | 2012-09-21 | 1 | -1/+2
  Newer VIA CPUs have both 64-bit and VMX support. Enable them to be recognized for these purposes, at once stripping off any 32-bit CPU only bits from the respective CPU support file, and adding 64-bit ones found in recent Linux.
  This particularly implies untying the VMX == Intel assumption in a few places.
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86_64: Fix off-by-one error setting up the Interrupt Stack Tables | Andrew Cooper | 2012-05-10 | 1 | -3/+3
  The Interrupt Stack Table entries in a 64bit TSS are a 1 based data structure as far as hardware is concerned. As a result, the code setting up stacks in subarch_percpu_traps_init() fills in the wrong IST entries.
  The result is that the MCE handler executes on the stack set up for NMIs; the NMI handler executes on a stack set up for Double Faults, and Double Faults are executed with a stack pointer set to 0.
  Once the #DF handler starts to execute, it will usually take a page fault looking up the address at 0xfffffffffffffff8, which will cause a triple fault. If a guest has mapped a page in that location, then it will have some state overwritten, but as the #DF handler always calls panic(), this is not a problem the guest will have time to care about.
  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Committed-by: Keir Fraser <keir@xen.org>
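The indexing problem can be shown with a small sketch. The IST slot numbers (DF = 1, NMI = 2, MCE = 3) and the `tss64_ist` structure are assumptions for illustration; what matters is that the hardware numbers IST slots from 1 while a C array starts at 0.

```c
#include <stdint.h>

/* Assumed hardware-visible IST slot numbers; 0 means "no IST stack". */
enum { IST_NONE = 0, IST_DF = 1, IST_NMI = 2, IST_MCE = 3, IST_MAX = 7 };

/* Hypothetical view of the IST part of a 64-bit TSS: ist[0] holds IST1. */
struct tss64_ist {
    uint64_t ist[IST_MAX];
};

/* Store a stack pointer for a 1-based IST slot in the 0-based array. */
static void tss_set_ist(struct tss64_ist *tss, unsigned int slot,
                        uint64_t stack_top)
{
    tss->ist[slot - 1] = stack_top;
}

/* What the CPU would load when an IDT entry names the given slot. */
static uint64_t tss_get_ist(const struct tss64_ist *tss, unsigned int slot)
{
    return tss->ist[slot - 1];
}
```

Indexing the array with the slot number directly, without the `- 1`, shifts every stack up by one slot and leaves IST1 at zero, which matches the symptoms the commit message describes.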
* x86: Make asmlinkage explicitly a no-op, and avoid usage in arch/x86 | Keir Fraser | 2012-01-15 | 1 | -2/+2
  Signed-off-by: Keir Fraser <keir@xen.org>
* x86: show page walk also for early page faults | Jan Beulich | 2011-06-23 | 1 | -4/+8
  At once, move the common (between 32- and 64-bit) definition of machine_to_phys_mapping_valid to a common location.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* xen: remove more declarations from C files. | Tim Deegan | 2011-05-27 | 1 | -4/+0
  This patch moves some more, mostly data, extern declarations into header files. I haven't been as strict as I was with functions; in particular there are a number of declarations of assembler labels that are only used in one place. I've also left a few compat-mode tricks, and all the magic in symbols.c.
  Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
* xen: Include headers that are actually needed, drop everything else. | Christoph Egger | 2011-05-20 | 1 | -0/+1
  Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
* x86: move pv-only members of struct vcpu to struct pv_vcpu | Jan Beulich | 2011-04-05 | 1 | -5/+5
  ... thus further shrinking overall size of struct arch_vcpu.
  This has a minor effect on XEN_DOMCTL_{get,set}_ext_vcpucontext - for HVM guests, some meaningless fields will no longer get stored or retrieved: reads will now return zero, and writes are required to be (mostly) zero (the same as was already done on x86-32).
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: split struct vcpu | Jan Beulich | 2011-04-05 | 1 | -11/+11
  This is accomplished by splitting the guest_context member, which by itself is larger than a page on x86-64. Quite a number of fields of this structure are completely meaningless for HVM guests, and thus a new struct pv_vcpu gets introduced, which is being overlaid with struct hvm_vcpu in struct arch_vcpu. The one member that is mostly responsible for the large size is trap_ctxt, which now gets allocated separately (unless fitting on the same page as struct arch_vcpu, as is currently the case for x86-32), and only for non-hvm, non-idle domains.
  This change pointed out a latent problem in arch_set_info_guest(), which is permitted to be called on already initialized vCPU-s, but so far copied the new state into struct arch_vcpu without (in this case) actually going through all the necessary accounting/validation steps. The logic gets changed so that the pieces that bypass accounting will at least be verified to be no different from the currently active bits, and the whole change will fail in case they are. The logic does *not* get adjusted here to do full error recovery, that is, partially modified state continues to not get unrolled in case of failure.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: fix MCE/NMI injection | Keir Fraser | 2009-12-01 | 1 | -54/+2
  This attempts to address all the concerns raised in http://lists.xensource.com/archives/html/xen-devel/2009-11/msg01195.html, but I'm nevertheless still not convinced that all aspects of the injection handling really work reliably. In particular, while the patch here on top of the fixes for the problems mentioned in the referenced mail also adds code to keep send_guest_trap() from injecting multiple events at a time, I don't think this is the right mechanism - it should be possible to handle NMI/MCE nested within each other.
  Another fix on top of the ones for the earlier described problems is that the vCPU affinity restore logic didn't account for software injected NMIs - these never set cpu_affinity_tmp, but due to it most likely being different from cpu_affinity it would have got restored (to a potentially random value) nevertheless.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: Remove EF_* duplicate defs for X86_EFLAGS_*. | Keir Fraser | 2009-08-14 | 1 | -3/+7
  Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
* x86: make show_page_walk() more robust | Keir Fraser | 2009-07-20 | 1 | -3/+6
  Also add in a missing line in x86-64's do_page_walk().
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: move init_tss into per-CPU space | Keir Fraser | 2009-07-13 | 1 | -3/+3
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86 hvm mce: Support HVM Guest virtual MCA handling. | Keir Fraser | 2009-06-30 | 1 | -4/+5
  When MCE# happens, if the error has been contained/recovered by Xen and it impacts one guest domain (DOM0/HVM guest/PV guest), we will inject the corresponding vMCE# into the impacted domain. The guest OS will then carry out its own recovery if it has an MCA handler.
  Signed-off-by: Liping Ke <liping.ke@intel.com>
  Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
* x86: Core support for Intel MCA support | Keir Fraser | 2009-03-20 | 1 | -0/+47
  These patches are based on AMD's and SUN's MCA-related work, rebased after SUN's latest improvements. Follow-up patches for recovery actions will come later. This is a basic framework for Intel.
  Some implementation notes:
  1) When an error happens, if it is fatal (pcc = 1) or can't be recovered (pcc = 0, yet no good recovery method), we reset the machine immediately to avoid losing logs in DOM0. Most of the MCA MSRs are sticky. After reboot, the MCA polling mechanism will send a vIRQ to DOM0 for logging.
  2) When MCE# happens, all CPUs enter MCA context. The first CPU that reads and clears the error MSR bank becomes the owner of this MCE#. The necessary locks/synchronization help to determine the owner and select the most severe error.
  3) For convenience, we select the most offending CPU to do most of the processing and recovery work.
  4) When MCE# happens, we do three jobs:
     a. Send vIRQ to DOM0 for logging
     b. Send vMCE# to the impacted guest (currently only injected into an impacted DOM0)
     c. Guest vMCE MSR virtualization
  5) Some further improvements/additions for newer CPUs might be done later:
     a) Connection with recovery actions (cpu/memory online/offline)
     b) More software-recovery identification in severity_scan
     c) More refinement and testing for HVM might be done when needed.
  This patch enables basic MCA support for Intel.
  Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
  Signed-off-by: Ke, Liping <Liping.ke@intel.com>
* x86: make GDT per-CPU | Keir Fraser | 2008-09-22 | 1 | -4/+3
  The major issue with supporting a significantly larger number of physical CPUs appears to be the use of per-CPU GDT entries - at present, x86-64 could support only up to 126 CPUs (with code changes to also use the top-most GDT page, that would be 254). Instead of trying to go with incremental steps here, by converting the GDT itself to be per-CPU, limitations in that respect go away entirely.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: also show event upcall mask when dumping guest state | Keir Fraser | 2008-08-08 | 1 | -11/+23
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
  Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
* x86: MCA support. | Keir Fraser | 2008-07-04 | 1 | -2/+7
  Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
* Revert incorrectly checked-in changes. | Keir Fraser | 2008-07-04 | 1 | -4/+0
  Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
* Add facility to get notification of domain suspend by event channel. | Keir Fraser | 2008-07-04 | 1 | -0/+4
  This event channel will be notified when the domain transitions to the suspended state, which can be much faster than raising VIRQ_DOM_EXC and waiting for the notification to be propagated via xenstore.
  No attempt is made here to prevent multiple subscribers (last one wins), or to detect that the subscriber has gone away. Userspace tools should take care.
  Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
* dom0 state dump | Keir Fraser | 2008-06-12 | 1 | -22/+48
  Since xenctx cannot (for obvious reasons) display the context of dom0's vCPU-s, here are the beginnings of a console based mechanism to achieve the same (useful if dom0 hangs with one or more de-scheduled vCPU-s).
  The stack handling obviously needs improvement, but the register context should come out fine in all cases.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86-64: use 1GB pages in 1:1 mapping if available | Keir Fraser | 2008-01-28 | 1 | -3/+5
  At once adjust the 2/4Mb page handling slightly in a few places (to match the newly added code):
  - when re-creating a large page mapping after finding that all small page mappings in the respective area are using identical flags and suitable MFNs, the virtual address was already incremented past the area to be dealt with, which needs to be accounted for in the invocation of flush_area() in that path
  - don't or-in/and-out _PAGE_PSE on non-present pages
  - when comparing flags, try to minimise the number of l1f_to_lNf()/lNf_to_l1f() instances used
  - instead of skipping a single page when encountering a big page mapping equalling to what a small page mapping would establish, skip to the next larger page boundary
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: make show_page_walk() more robust. | Keir Fraser | 2008-01-24 | 1 | -4/+4
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* do_set_trap_table()'s argument can be const. | Keir Fraser | 2008-01-18 | 1 | -1/+1
  Also, automatically generate const version of every guest handle definition.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
  Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
* do_callback_op()'s second argument can be const allowing the guest to | Keir Fraser | 2008-01-18 | 1 | -1/+1
  declare these (mostly static) argument structures 'const'.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: Remove CALLBACKTYPE_sysexit. | Keir Fraser | 2007-10-24 | 1 | -6/+0
  Looking at the Linux patch as an example, it adds more code and complexity than it removes, for no obvious performance benefit.
  Signed-off-by: Keir Fraser <keir@xensource.com>
* x86-64: syscall/sysenter support for 32-bit apps for both 32-bit apps | Keir Fraser | 2007-10-24 | 1 | -3/+36
  in 64-bit pv guests and 32on64.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
  Signed-off-by: Keir Fraser <keir@xensource.com>
* x86: add option to display last exception records during register dumps | Keir Fraser | 2007-10-17 | 1 | -1/+9
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
  Signed-off-by: Keir Fraser <keir@xensource.com>
* hvm: Always keep canonical copy of RIP/RSP/RFLAGS in | kfraser@localhost.localdomain | 2007-09-19 | 1 | -1/+0
  guest_cpu_user_regs(). Reduces complexity at little or no performance cost (except on really old Intel P4 hardware where VMREAD/VMWRITE are silly expensive).
  Signed-off-by: Keir Fraser <keir@xensource.com>
* hvm: hvm_{load,store}_cpu_guest_regs() does not touch segment | kfraser@localhost.localdomain | 2007-09-19 | 1 | -3/+21
  selectors. We have separate accessors for that now. It is now an invariant that guest_cpu_user_regs()->{cs,ds,es,fs,gs,ss} are invalid for an HVM guest.
  Signed-off-by: Keir Fraser <keir@xensource.com>
* x86: Clean up asm keyword usage (asm volatile rather than __asm__ | kfraser@localhost.localdomain | 2007-09-11 | 1 | -3/+3
  __volatile__ in most places) and ensure we use volatile keyword wherever we have an asm stmt that produces outputs but has other unspecified side effects or dependencies other than the explicitly-stated inputs.
  Also added volatile in a few places where it's not strictly necessary but where it's unlikely to produce worse code and it makes our intentions perfectly clear.
  The original problem this patch fixes was tracked down by Joseph Cihula <joseph.cihula@intel.com>.
  Signed-off-by: Keir Fraser <keir@xensource.com>
* Clean up usage of 'current' in do_iret() hypercall. | kfraser@localhost.localdomain | 2007-08-09 | 1 | -2/+2
  Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
  Signed-off-by: Keir Fraser <keir@xensource.com>
* Convert __init into __devinit in wakeup path. | kfraser@localhost.localdomain | 2007-07-11 | 1 | -1/+1
  We need to ensure that all code in the wakeup path still exists after boot. For this purpose, we have to use __devinit instead of __init, since the former is null for CONFIG_HOTPLUG while the latter always causes related code to be freed after first boot.
  Later, when adding __init to some function, be sure to consider the wakeup case and cpu hotplug!
  Signed-off-by: <Kevin.Tian@intel.com>
* x86-64: bump STACK_SIZE to 32 so that trampoline and IST stacks fit | kfraser@localhost.localdomain | 2007-07-03 | 1 | -11/+14
  without undue squeezing.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: machine check exception handling | kfraser@localhost.localdomain | 2007-06-21 | 1 | -3/+7
  Properly handle MCE (connecting the existing, but so far unused vendor specific handlers). HVM guests don't own CR4.MCE (and hence can't suppress the exception) anymore, preventing silent machine shutdown.
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
  Signed-off-by: Keir Fraser <keir@xensource.com>
* Use clear_page() wherever possible/reasonable instead of open coded | kfraser@localhost.localdomain | 2007-06-20 | 1 | -0/+1
  memset() calls (likewise a few replacements memcpy -> copy_page).
  Signed-off-by: Jan Beulich <jbeulich@novell.com>
* xen: Big changes to x86 start-of-day: | kfraser@localhost.localdomain | 2007-05-10 | 1 | -68/+46
  1. x86/64 Xen now relocates itself to physical high memory. This is useful if we have devices that need very low memory, or if in future we want to grant a 1:1 mapping of low physical memory to a special 'native client domain'.
  2. We now only map low 16MB RAM statically. All other RAM is mapped dynamically within the constraints of the e820 map. It is recommended never to map MMIO regions, and this change means that Xen now obeys this constraint.
  3. The CPU bootup trampoline is now permanently installed at 0x90000. This is necessary prereq for CPU hotplug.
  4. Start-of-day asm is generally cleaned up and diff between x86/32 and x86/64 is reduced.
  Signed-off-by: Keir Fraser <keir@xensource.com>
* xen: More 'IS_COMPAT' cleanups. | kfraser@localhost.localdomain | 2007-04-27 | 1 | -2/+2
  Signed-off-by: Keir Fraser <keir@xensource.com>
* xen: Fix up use of trap_bounce structure. | kfraser@localhost.localdomain | 2007-04-25 | 1 | -3/+0
  Fixes suggested by Jan Beulich.
  Signed-off-by: Keir Fraser <keir@xensource.com>