aboutsummaryrefslogtreecommitdiffstats
path: root/xen/include/xen
Commit message (Collapse)AuthorAgeFilesLines
* evtchn: refactor low-level event channel port opsDavid Vrabel2013-10-142-0/+49
| | | | | | | | | | | | | Use functions for the low-level event channel port operations (set/clear pending, unmask, is_pending and is_masked). Group these functions into a struct evtchn_port_op so they can be replaced by alternate implementations (for different ABIs) on a per-domain basis. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* sched: Correct function prototypesAndrew Cooper2013-10-141-3/+3
| | | | | | | struct vcpu pointers are traditionally v rather than d. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* scheduler: adjust internal locking interfaceJan Beulich2013-10-141-82/+56
| | | | | | | | | | | | | | | Make the locking functions return the lock pointers, so they can be passed to the unlocking functions (which in turn can check that the lock is still actually providing the intended protection, i.e. the parameters determining which lock is the right one didn't change). Further use proper spin lock primitives rather than open coded local_irq_...() constructs, so that interrupts can be re-enabled as appropriate while spinning. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* xen: Add macro ROUNDUPJulien Grall2013-10-081-0/+2
| | | | | | Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Keir Fraser <keir@xen.org> CC: Jan Beulich <jbeulich@suse.com>
* xen: Add macros MB and GBJulien Grall2013-10-081-0/+3
| | | | | | | Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Keir Fraser <keir@xen.org> CC: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
* xen: add LZ4 decompression supportKyungsik Lee2013-10-072-1/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for Xen are based on LZ4 implementation by Yann Collet. Benchmark Results(PATCH v3) Compiler: Linaro ARM gcc 4.6.2 1. ARMv7, 1.5GHz based board Kernel: linux 3.4 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.7MB 20.1MB/s, 25.2MB/s(UA) LZ4 7.3MB 29.1MB/s, 45.6MB/s(UA) 2. ARMv7, 1.7GHz based board Kernel: linux 3.7 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.0MB 34.1MB/s, 52.2MB/s(UA) LZ4 6.5MB 86.7MB/s - UA: Unaligned memory Access support - Latest patch set for LZO applied This patch set is for adding support for LZ4-compressed Kernel. LZ4 is a very fast lossless compression algorithm and it also features an extremely fast decoder [1]. But we have five of decompressors already and one question which does arise, however, is that of where do we stop adding new ones? This issue had been discussed and came to the conclusion [2]. Russell King said that we should have: - one decompressor which is the fastest - one decompressor for the highest compression ratio - one popular decompressor (eg conventional gzip) If we have a replacement one for one of these, then it should do exactly that: replace it. The benchmark shows that an 8% increase in image size vs a 66% increase in decompression speed compared to LZO(which has been known as the fastest decompressor in the Kernel). Therefore the "fast but may not be small" compression title has clearly been taken by LZ4 [3]. [1] http://code.google.com/p/lz4/ [2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157 [3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347 LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html LZ4 source repository: http://code.google.com/p/lz4/ Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Yann Collet <yann.collet.73@gmail.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: Improve information from domain_crash_synchronousAndrew Cooper2013-10-041-0/+7
| | | | | | | | | | | | | | | | | | | | | As it currently stands, the string "domain_crash_sync called from entry.S" is not helpful at identifying why the domain was crashed, and a debug build of Xen doesn't help the matter This patch improves the information printed, by pointing to where the crash decision was made. Specific improvements include: * Moving the ascii string "domain_crash_sync called from entry.S\n" away from some semi-hot code cache lines. * Moving the printk into C code (especially as this_cpu() is miserable to use in assembly code) * Undo the previous confusing situation of having the domain_crash_synchronous() as a macro in C code, yet a global symbol in assembly code. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* xen: arm: add two new device tree helpersIan Campbell2013-09-271-0/+17
| | | | | | | | - dt_property_read_u64 - dt_find_node_by_type Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
* x86/microcode: Scan the initramfs payload for microcode blobKonrad Rzeszutek Wilk2013-09-271-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Linux kernel is able to update the microcode during early bootup via inspection of the initramfs blob to see if there is an cpio image with certain microcode files. Linux is able to function with two (or more) cpio archives in the initrd b/c it unpacks all of the cpio archives. The format of the early initramfs is nicely documented in Linux's Documentation/x86/early-microcode.txt: Early load microcode ==================== By Fenghua Yu <fenghua.yu@intel.com> Kernel can update microcode in early phase of boot time. Loading microcode early can fix CPU issues before they are observed during kernel boot time. Microcode is stored in an initrd file. The microcode is read from the initrd file and loaded to CPUs during boot time. The format of the combined initrd image is microcode in cpio format followed by the initrd image (maybe compressed). Kernel parses the combined initrd image during boot time. The microcode file in cpio name space is: kernel/x86/microcode/GenuineIntel.bin During BSP boot (before SMP starts), if the kernel finds the microcode file in the initrd file, it parses the microcode and saves matching microcode in memory. If matching microcode is found, it will be uploaded in BSP and later on in all APs. The cached microcode patch is applied when CPUs resume from a sleep state. There are two legacy user space interfaces to load microcode, either through /dev/cpu/microcode or through /sys/devices/system/cpu/microcode/reload file in sysfs. In addition to these two legacy methods, the early loading method described here is the third method with which microcode can be uploaded to a system's CPUs. The following example script shows how to generate a new combined initrd file in /boot/initrd-3.5.0.ucode.img with original microcode microcode.bin and original initrd image /boot/initrd-3.5.0.img. mkdir initrd cd initrd mkdir kernel mkdir kernel/x86 mkdir kernel/x86/microcode cp ../microcode.bin kernel/x86/microcode/GenuineIntel.bin find .|cpio -oc >../ucode.cpio cd .. cat ucode.cpio /boot/initrd-3.5.0.img >/boot/initrd-3.5.0.ucode.img As such this code inspects the initrd to see if the microcode signatures are present and if so updates the hypervisor. The option to turn this scan on/off is gated by the 'ucode' parameter. The options are now: 'scan' Scan for the microcode in any multiboot payload. <index> Attempt to load microcode blob (not the cpio archive format) from the multiboot payload number. This option alters slightly the 'ucode' parameter by only allowing either parameter: ucode=[<index>|scan] Implementation wise the ucode_blob is defined as __initdata. That is OK from the viewpoint of suspend/resume as the the underlaying architecture microcode (microcode_intel or microcode_amd) end up saving the blob in 'struct ucode_cpu_info' which is a per-cpu data structure (see ucode_cpu_info). They end up saving it when doing the pre-SMP (for CPU0) and SMP (for the rest) microcode loading. Naturally if one does a hypercall to update the microcode and it is newer, then the old per-cpu data is replaced. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Keir Fraser <keir@xen.org>
* AMD IOMMU: fix Dom0 device setup failure for host bridgesSuravee Suthikulpanit2013-09-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The host bridge device (i.e. 0x18 for AMD) does not require IOMMU, and therefore is not included in the IVRS. The current logic tries to map all PCI devices to an IOMMU. In this case, "xl dmesg" shows the following message on AMD sytem. (XEN) setup 0000:00:18.0 for d0 failed (-19) (XEN) setup 0000:00:18.1 for d0 failed (-19) (XEN) setup 0000:00:18.2 for d0 failed (-19) (XEN) setup 0000:00:18.3 for d0 failed (-19) (XEN) setup 0000:00:18.4 for d0 failed (-19) (XEN) setup 0000:00:18.5 for d0 failed (-19) This patch adds a new device type (i.e. DEV_TYPE_PCI_HOST_BRIDGE) which corresponds to PCI class code 0x06 and sub-class 0x00. Then, it uses this new type to filter when trying to map device to IOMMU. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reported-by: Stefan Bader <stefan.bader@canonical.com> On VT-d refuse (un)mapping host bridges for other than the hardware domain. Coding style cleanup. Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
* xen/arm: Reserve FDT via early module mechanismIan Campbell2013-09-261-6/+7
| | | | | | | | | | This will stop us putting any heaps or relocating Xen itself over the FDT. The devicetree will be copied to allocated memory in setup_mm and the original copy will be freed by discard_initial_modules. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
* ns16550: support DesignWare 8250Ian Campbell2013-09-211-0/+2
| | | | | | | | | | | | This hardware has an additional feature which signals an error if you try to write LCR while the UART is busy. We need to clear this error during setup, otherwise LCR.DLAB doesn't get set and we cannot read/write the divisor. This has been tested on the cubieboard2 Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Cc: jbeulich@suse.com
* x86: mark BUG()s and assertion failures as terminal.Tim Deegan2013-09-191-0/+6
| | | | | | | | | This helps avoid static analysis false-positives, and might lead to better code density as the compiler knows it doesn't have to restore spilled state &c. Signed-off-by: Tim Deegan <tim@xen.org> Acked-by: Keir Fraser <keir@xen.org>
* xen/dts: Clean up the exported API for device treeJulien Grall2013-09-171-15/+0
| | | | | | | | | | | All Xen code has been converted to the new device tree API that uses a tree structure to describe the DTS. The Flat Device tree is still used by Xen during early boot stage, but only in internal. Remove entirely unneeded functions or move to a static function. Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xen/dts: dt_find_interrupt_controller: accept multiple compatible stringsJulien Grall2013-09-171-1/+2
| | | | | Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xen: Use the right string comparison function in device treeJulien Grall2013-09-171-2/+2
| | | | | | | | | When of_node_cmp and of_compat_cmp was introduced in commit fb97eb6 "xen/arm: Create a hierarchical device tree", they were copied from the wrong Linux header. Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xen/dts: Add new helpers to use the device treeJulien Grall2013-09-171-1/+150
| | | | | | | | | | | | | | | | | | | | | | | List of new helpers taken from linux (commit 74b9272): - dt_property_read_string - dt_match_node - dt_find_maching_node - dt_device_is_available - dt_prop_cmp Other new helpers: - dt_set_cell - dt_for_each_child - dt_set_range - dt_cells_to_size - dt_next_cell - dt_get_range - dt_node_name_is_equal - dt_node_path_is_equal - dt_property_name_is_equal Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xen/dts: Prefix device tree macro by dt_Julien Grall2013-09-171-2/+2
| | | | | | | | | | There is 2 macros: for_each_device_node and for_each_property_of_node with a too generic name. Also replace all call-site with the new function names. Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xen/dts: Constify device_tree_flattenedJulien Grall2013-09-171-1/+1
| | | | | | | The Flat Device Tree is given by the bootloader. Xen doesn't need to modify it. Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* console: buffer and show origin of guest PV writesDaniel De Graaf2013-09-102-0/+8
| | | | | | | | | | | | | Guests other than domain 0 using the console output have previously been controlled by the VERBOSE #define, but with no designation of which guest's output was on the console. This patch converts the HVM output buffering to be used by all domains except the hardware domain (dom0): stripping non-printable characters, line buffering the output, and prefixing it with the domain ID. This is especially useful for debugging stub domains during early boot. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Keir Fraser <keir@xen.org>
* xen: Add new string functionJulien Grall2013-09-101-0/+3
| | | | | | | Add strcasecmp. The code is copied from Linux. Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xen: Introduce __initconst to store initial const dataJulien Grall2013-09-101-0/+1
| | | | | | | | | | | | | | | It's possible to have 2 type (const and non-const) of data in the same compilation unit. Using only __initdata will result to a compilation error: error: $variablename causes as section tupe conflict with $variablename2 because a section containing const variables is marked read only and so cannot contain non-const variables. Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Cambell <ian.campbell@citrix.com> CC: Jan Beulich <JBeulich@suse.com> CC: Keir Fraser <keir@xen.org>
* xen/dts: fix DT_ROOT_NODE_ADDR_CELLS_DEFAULTJulien Grall2013-09-091-1/+1
| | | | | | | | | | | | The commit dbd1243 "xen/arm: Add helpers to use the device tree" introduced DT_ROOT_NODE_ADDR_CELLS_DEFAULT with is used for default value when bad copy from Linux code. The ePAR (section 2.3.5) says: "If missing, a client program should assume a default value of 2 for #address-cells, and a value of 1 for #size-cells." Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* PCI UART: better cope with UART being temporarily unavailableTomasz Wroblewski2013-08-281-2/+3
| | | | | | | | | | | | | This happens for example when dom0 disables ioport responses during PCI subsystem initialisation. If a __ns16550_poll() happens to be scheduled during that time, Xen hangs. Detect and exit that condition. Amended ns16550_ioport_invalid function to only check IER register, which contins 3 reserved (always 0) bits, therefore it's sufficient for that test. Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* PCI: centralize parsing of device coordinates in command line optionsJan Beulich2013-08-281-0/+2
| | | | | | | | | With yet another case to come in a subsequent patch, it seems time to do this in a single place rather than hand crafting it in various scattered around locations. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* xen/arm: use defines for boot module indexes instead of open coded numbersIan Campbell2013-08-271-2/+8
| | | | | Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Julien Grall <julien.grall@linaro.org>
* un-alias cpumask_any() from cpumask_first()Jan Beulich2013-08-232-1/+24
| | | | | | | | | | | | | | | | | | In order to achieve more symmetric distribution of certain things, cpumask_any() shouldn't always pick the first CPU (which frequently will end up being CPU0). To facilitate that, introduce a library-like function to obtain random numbers. The per-architecture function is supposed to return zero if no valid random number can be obtained (implying that if occasionally zero got produced as random number, it wouldn't be considered such). As fallback this uses the trivial algorithm from the C standard, extended to produce "unsigned int" results. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
* PCI: break MSI-X data out of struct pci_dev_infoJan Beulich2013-08-231-12/+2
| | | | | | | | | | | Considering that a significant share of PCI devices out there (not the least the myriad of CPU-exposed ones) don't support MSI-X at all, and that the amount of data is well beyond a handful of bytes, break this out of the common structure, at once allowing the actual data to be tracked to become architecture specific. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* xen/arm: Add the new OMAP UART driver.Chen Baozi2013-08-221-0/+51
| | | | | | | | | TI OMAP UART introduces some features such as register access modes, which makes its configuration and interrupt handling differs from 8250 compatible UART. Thus, we seperate this driver from ns16550's implementation. Signed-off-by: Chen Baozi <baozich@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xen: Introduce a helper to read a u32 property in device tree.Chen Baozi2013-08-221-0/+11
| | | | | Signed-off-by: Chen Baozi <baozich@gmail.com> Acked-by: Julien Grall <julien.grall@linaro.org>
* xen: rename ns16550-uart.h to 8250-uart.h and fix some typosChen Baozi2013-08-221-9/+9
| | | | | | | | | | | Since UARTs on OMAP5 & Allwinner's SoC are not ns16550 but only 8250 compatible, rename ns16550-uart.h to 8250-uart.h, which is a more pervasive name. At the same time, fix some typos, which have redundance UART_ prefixes in some macros. Signed-off-by: Chen Baozi <baozich@gmail.com> Acked-by: Julien Grall <julien.grall@linaro.org> Acked-by: Keir Fraser <keir@xen.org>
* ACPI: fix acpi_os_map_memory()Jan Beulich2013-08-211-1/+1
| | | | | | | | | | | | | | It using map_domain_page() was entirely wrong. Use __acpi_map_table() instead for the time being, with locking added as the mappings it produces get replaced with subsequent invocations. Using locking in this way is acceptable here since the only two runtime callers are acpi_os_{read,write}_memory(), which don't leave mappings pending upon returning to their callers. Also fix __acpi_map_table()'s first parameter's type - while benign for unstable, backports to pre-4.3 trees will need this. Signed-off-by: Jan Beulich <jbeulich@suse.com>
* xen/x86: hypervisor build fixes for FreeBSD.Tim Deegan2013-08-162-2/+2
| | | | | | | | | These allow an x86_64 hypervisor to build on FreeBSD 9.1/amd64. - like OpenBSD, needs a different arch passed to ld. - like OpenBSD, stdarg.h and stdbool.h are in /usr/include. Signed-off-by: Tim Deegan <tim@xen.org> Acked-by: Keir Fraser <keir@xen.org>
* xen: Add stdbool.h workaround for BSD.Tim Deegan2013-08-152-2/+15
| | | | | | | | | | | | | | | | On *BSD, stdbool.h lives in /usr/include, but we don't want to have that on the search path in case we pick up any headers from the build host's C libraries. Copy the equivalent hack already in place for stdarg.h: on all supported compilers the contents of stdbool.h are trivial, so just supply the things we need in a xen/stdbool.h header. Signed-off-by: Tim Deegan <tim@xen.org> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Tested-by: Patrick Welche <prlw1@cam.ac.uk>
* watchdog: Move watchdog from being x86 specific to common codeAndrew Cooper2013-08-131-0/+35
| | | | | | | | | | | | | | | Augment watchdog_setup() to be able to possibly return an error, and introduce watchdog_enabled() as a better alternative to knowing the architectures internal details. This patch does not change the x86 implementaion, beyond making it compile. For header files, some includes of xen/nmi.h were only for the watchdog functions, so are replaced rather than adding an extra include of xen/watchdog.h Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* cleanup unused request{_dt,}_irq() parameterAndrew Cooper2013-08-081-1/+1
| | | | | | | | | The irqflags parameter appears to be an unused vestigial parameter right from the integration of the IOMMU code in 2007. The parameter is 0 at all callsites and never used. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
* xen/arm: New callback in uart_driver to retrieve serial informationJulien Grall2013-08-081-0/+13
| | | | | | | | | | There is no way to retrieve basic informations (base address, size, ....) for an UART. This callback will be used later to partially emulate the real UART for DOM0 on ARM. Signed-off-by: Julien Grall <julien.grall@linaro.org> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Keir Fraser <keir@xen.org>
* stubdom: Fix stubdom undeclared function build warningsSamuel Thibault2013-08-021-0/+2
| | | | | | | | | | | This includes a few headers to fix some missing function declarations. ../grub-upstream/stage2/builtins.c:1728:3: warning: implicit declaration of function ‘do_exit’ [-Wimplicit-function-declaration] stubdom/include/xen/libelf/libelf.h:453:5: warning: implicit declaration of function ‘memcpy’ [-Wimplicit-function-declaration] Reported-by: IAN DELANEY <della5@iinet.com.au> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* extract register definitions from ns16550 into a separated headerChen Baozi2013-07-161-0/+104
| | | | | Signed-off-by: Chen Baozi <baozich@gmail.com> Acked-by: Keir Fraser <keir@xen.org>
* use SMP barrier in common code dealing with shared memory protocolsIan Campbell2013-07-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Xen currently makes no strong distinction between the SMP barriers (smp_mb etc) and the regular barrier (mb etc). In Linux, where we inherited these names from having imported Linux code which uses them, the SMP barriers are intended to be sufficient for implementing shared-memory protocols between processors in an SMP system while the standard barriers are useful for MMIO etc. On x86 with the stronger ordering model there is not much practical difference here but ARM has weaker barriers available which are suitable for use as SMP barriers. Therefore ensure that common code uses the SMP barriers when that is all which is required. On both ARM and x86 both types of barrier are currently identical so there is no actual change. A future patch will change smp_mb to a weaker barrier on ARM. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
* bitmap_*() should cope with zero size bitmapsJan Beulich2013-07-041-64/+64
| | | | | | | | | | | | | | | | ... to match expectations set by memset()/memcpy(). Similarly for find_{first,next}_{,zero_}_bit() on x86. __bitmap_shift_{left,right}() would also need fixing (they more generally can't cope with the shift count being larger than the bitmap size, and they perform undefined operations by possibly shifting an unsigned long value by BITS_PER_LONG bits), but since these functions aren't really used anywhere I wonder if we wouldn't better simply get rid of them. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* gnttab: remove unused MAPTRACK_MAX_ENTRIES defineMatthew Daley2013-07-021-3/+0
| | | | | | | The maximum number of grant mappings is given by max_nr_maptrack_frames() as of commit 5ce8faf, not this unused define. Signed-off-by: Matthew Daley <mattjd@gmail.com>
* libelf: fix printing of pointersJan Beulich2013-06-261-4/+4
| | | | | | | Printing them as decimal number, the more with 0x prefix, is confusing and presumably relatively useless to most of us. Signed-off-by: Jan Beulich <jbeulich@suse.com>
* libelf: abolish obsolete macrosIan Jackson2013-06-141-36/+12
| | | | | | | | | | | | | | | | Abolish ELF_PTRVAL_[CONST_]{CHAR,VOID}; change uses to elf_ptrval. Abolish ELF_HANDLE_DECL_NONCONST; change uses to ELF_HANDLE_DECL. Abolish ELF_OBSOLETE_VOIDP_CAST; simply remove all uses. No functional change. (Verified by diffing assembler output.) This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v2: New patch.
* libelf: check loops for running awayIan Jackson2013-06-141-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that libelf does not have any loops which can run away indefinitely even if the input is bogus. (Grepped for \bfor, \bwhile and \bgoto in libelf and xc_dom_*loader*.c.) Changes needed: * elf_note_next uses the note's unchecked alleged length, which might wrap round. If it does, return ELF_MAX_PTRVAL (0xfff..fff) instead, which will be beyond the end of the section and so terminate the caller's loop. Also check that the returned psuedopointer is sane. * In various loops over section and program headers, check that the calculated header pointer is still within the image, and quit the loop if it isn't. * Some fixed limits to avoid potentially O(image_size^2) loops: - maximum length of strings: 4K (longer ones ignored totally) - maximum total number of ELF notes: 65536 (any more are ignored) * Check that the total program contents (text, data) we copy or initialise doesn't exceed twice the output image area size. * Remove an entirely useless loop from elf_xen_parse (!) * Replace a nested search loop in in xc_dom_load_elf_symtab in xc_dom_elfloader.c by a precomputation of a bitmap of referenced symtabs. We have not changed loops which might, in principle, iterate over the whole image - even if they might do so one byte at a time with a nontrivial access check function in the middle. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v8: Fix the two loops in libelf-dominfo.c; the comment about PT_NOTE and SHT_NOTE wasn't true because the checks did "continue", not "break". Add a comment about elf_note_next's expectations of the caller's loop conditions (which most plausible callers will follow anyway). v5: Fix regression due to wrong image size loop limit calculation. Check return value from xc_dom_malloc. v4: Fix regression due to misplacement of test in elf_shdr_by_name (uninitialised variable). Introduce fixed limits. Avoid O(size^2) loops. Check returned psuedopointer from elf_note_next is correct. A few style fixes. v3: Fix a whitespace error. v2: BUGFIX: elf_shdr_by_name, elf_note_next: Reject new <= old, not just <. elf_shdr_by_name: Change order of checks to be a bit clearer. elf_load_bsdsyms: shdr loop check, improve chance of brokenness detection. Style fixes.
* libelf: use only unsigned integersIan Jackson2013-06-141-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed integers have undesirable undefined behaviours on overflow. Malicious compilers can turn apparently-correct code into code with security vulnerabilities etc. So use only unsigned integers. Exceptions are booleans (which we have already changed) and error codes. We _do_ change all the chars which aren't fixed constants from our own text segment, but not the char*s. This is because it is safe to access an arbitrary byte through a char*, but not necessarily safe to convert an arbitrary value to a char. As a consequence we need to compile libelf with -Wno-pointer-sign. It is OK to change all the signed integers to unsigned because all the inequalities in libelf are in contexts where we don't "expect" negative numbers. In libelf-dominfo.c:elf_xen_parse we rename a variable "rc" to "more_notes" as it actually contains a note count derived from the input image. The "error" return value from elf_xen_parse_notes is changed from -1 to ~0U. grepping shows only one occurrence of "PRId" or "%d" or "%ld" in libelf and xc_dom_elfloader.c (a "%d" which becomes "%u"). This is part of the fix to a security issue, XSA-55. For those concerned about unintentional functional changes, the following rune produces a version of the patch which is much smaller and eliminates only non-functional changes: GIT_EXTERNAL_DIFF=.../unsigned-differ git-diff <before>..<after> where <before> and <after> are git refs for the code before and after this patch, and unsigned-differ is this shell script: #!/bin/bash set -e seddery () { perl -pe 's/\b(?:elf_errorstatus|elf_negerrnoval)\b/int/g' } path="$1" in="$2" out="$5" set +e diff -pu --label "$path~" <(seddery <"$in") --label "$path" <(seddery <"$out") rc=$? set -e if [ $rc = 1 ]; then rc=0; fi exit $rc Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v8: Use "?!?!" to express consternation instead of a ruder phrase. v5: Introduce ELF_NOTE_INVALID, instead of using a literal ~0U. v4: Fix regression in elf_round_up; use uint64_t here. v3: Changes to booleans split off into separate patch. v2: BUGFIX: Eliminate conversion to int of return from elf_xen_parse_notes. BUGFIX: Fix the one printf format thing which needs changing. Remove irrelevant change to constify note_desc.name in libelf-dominfo.c. In xc_dom_load_elf_symtab change one sizeof(int) to sizeof(unsigned). Do not change type of 2nd argument to memset. Provide seddery for easier review. Style fix.
* libelf: use C99 bool for booleansIan Jackson2013-06-141-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to remove uses of "int" because signed integers have undesirable undefined behaviours on overflow. Malicious compilers can turn apparently-correct code into code with security vulnerabilities etc. In this patch we change all the booleans in libelf to C99 bool, from <stdbool.h>. For the one visible libelf boolean in libxc's public interface we retain the use of int to avoid changing the ABI; libxc converts it to a bool for consumption by libelf. It is OK to change all values only ever used as booleans to _Bool (bool) because conversion from any scalar type to a _Bool works the same as the boolean test in if() or ?: and is always defined (C99 6.3.1.2). But we do need to check that all these variables really are only ever used that way. (It is theoretically possible that the old code truncated some 64-bit values to 32-bit ints which might become zero depending on the value, which would mean a behavioural change in this patch, but it seems implausible that treating 0x????????00000000 as false could have been intended.) This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v3: Use <stdbool.h>'s bool (or _Bool) instead of defining elf_bool. Split this into a separate patch.
* libelf: Check pointer references in elf_is_elfbinaryIan Jackson2013-06-141-1/+3
| | | | | | | | | | | | | | | | | | | | | | elf_is_elfbinary didn't take a length parameter and could potentially access out of range when provided with a very short image. We only need to check the size is enough for the actual dereference in elf_is_elfbinary; callers are just using it to check the magic number and do their own checks (usually via the new elf_ptrval system) before dereferencing other parts of the header. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v7: Add a comment about the limited function of elf_is_elfbinary. v2: Style fix. Fix commit message subject.
* libelf: check all pointer accessesIan Jackson2013-06-141-59/+139
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We change the ELF_PTRVAL and ELF_HANDLE types and associated macros: * PTRVAL becomes a uintptr_t, for which we provide a typedef elf_ptrval. This means no arithmetic done on it can overflow so the compiler cannot do any malicious invalid pointer arithmetic "optimisations". It also means that any places where we dereference one of these pointers without using the appropriate macros or functions become a compilation error. So we can be sure that we won't miss any memory accesses. All the PTRVAL variables were previously void* or char*, so the actual address calculations are unchanged. * ELF_HANDLE becomes a union, one half of which keeps the pointer value and the other half of which is just there to record the type. The new type is not a pointer type so there can be no address calculations on it whose meaning would change. Every assignment or access has to go through one of our macros. * The distinction between const and non-const pointers and char*s and void*s in libelf goes away. This was not important (and anyway libelf tended to cast away const in various places). * The fields elf->image and elf->dest are renamed. That proves that we haven't missed any unchecked uses of these actual pointer values. * The caller may fill in elf->caller_xdest_base and _size to specify another range of memory which is safe for libelf to access, besides the input and output images. * When accesses fail due to being out of range, we mark the elf "broken". This will be checked and used for diagnostics in a following patch. We do not check for write accesses to the input image. This is because libelf actually does this in a number of places. So we simply permit that. * Each caller of libelf which used to set dest now sets dest_base and dest_size. * In xc_dom_load_elf_symtab we provide a new actual-pointer value hdr_ptr which we get from mapping the guest's kernel area and use (checking carefully) as the caller_xdest area. * The STAR(h) macro in libelf-dominfo.c now uses elf_access_unsigned. * elf-init uses the new elf_uval_3264 accessor to access the 32-bit fields, rather than an unchecked field access (ie, unchecked pointer access). * elf_uval has been reworked to use elf_uval_3264. Both of these macros are essentially new in this patch (although they are derived from the old elf_uval) and need careful review. * ELF_ADVANCE_DEST is now safe in the sense that you can use it to chop parts off the front of the dest area but if you chop more than is available, the dest area is simply set to be empty, preventing future accesses. * We introduce some #defines for memcpy, memset, memmove and strcpy: - We provide elf_memcpy_safe and elf_memset_safe which take PTRVALs and do checking on the supplied pointers. - Users inside libelf must all be changed to either elf_mem*_unchecked (which are just like mem*), or elf_mem*_safe (which take PTRVALs) and are checked. Any unchanged call sites become compilation errors. * We do _not_ at this time fix elf_access_unsigned so that it doesn't make unaligned accesses. We hope that unaligned accesses are OK on every supported architecture. But it does check the supplied pointer for validity. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v7: Remove a spurious whitespace change. v5: Use allow_size value from xc_dom_vaddr_to_ptr to set xdest_size correctly. If ELF_ADVANCE_DEST advances past the end, mark the elf broken. Always regard NULL allowable region pointers (e.g. dest_base) as invalid (since NULL pointers don't point anywhere). v4: Fix ELF_UNSAFE_PTR to work on 32-bit even when provided 64-bit values. Fix xc_dom_load_elf_symtab not to call XC_DOM_PAGE_SIZE unnecessarily if load is false. This was a regression. v3.1: Introduce a change to elf_store_field to undo the effects of the v3.1 change to the previous patch (the definition there is not compatible with the new types). v3: Fix a whitespace error. v2 was Acked-by: Ian Campbell <ian.campbell@citrix.com> v2: BUGFIX: elf_strval: Fix loop termination condition to actually work. BUGFIX: elf_strval: Fix return value to not always be totally wild. BUGFIX: xc_dom_load_elf_symtab: do proper check for small header size. xc_dom_load_elf_symtab: narrow scope of `hdr_ptr'. xc_dom_load_elf_symtab: split out uninit'd symtab.class ref fix. More comments on the lifetime/validity of elf-> dest ptrs etc. libelf.h: write "obsolete" out in full libelf.h: rename "dontuse" to "typeonly" and add doc comment elf_ptrval_in_range: Document trustedness of arguments. Style and commit message fixes.
* libelf: check nul-terminated strings properlyIan Jackson2013-06-141-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is not safe to simply take pointers into the ELF and use them as C pointers. They might not be properly nul-terminated (and the pointers might be wild). So we are going to introduce a new function elf_strval for safely getting strings. This will check that the addresses are in range and that there is a proper nul-terminated string. Of course it might discover that there isn't. In that case, it will be made to fail. This means that elf_note_name might fail, too. For the benefit of call sites which are just going to pass the value to a printf-like function, we provide elf_strfmt which returns "(invalid)" on failure rather than NULL. In this patch we introduce dummy definitions of these functions. We introduce calls to elf_strval and elf_strfmt everywhere, and update all the call sites with appropriate error checking. There is not yet any semantic change, since before this patch all the places where we introduce elf_strval dereferenced the value anyway, so it mustn't have been NULL. In future patches, when elf_strval is made able return NULL, when it does so it will mark the elf "broken" so that an appropriate diagnostic can be printed. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v7: Change readnotes.c check to use two if statements rather than ||. v2: Fix coding style, in one "if" statement.