xen/xen - xen

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	domctl: replace cpumask_weight() uses	Jan Beulich	2013-08-23	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \|	In one case it could easily be replaced by range checking the result of a subsequent operation, and in general cpumask_next(), not always needing to scan the whole bitmap, is more efficient than the specific uses of cpumask_weight() here. (When running on big systems, operations on CPU masks aren't cheap enough to use them carelessly.) Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	credit1: replace cpumask_empty() uses	Jan Beulich	2013-08-23	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \|	In one case it was redundant with the operation it got combined with, and in the other it could easily be replaced by range checking the result of a subsequent operation. (When running on big systems, operations on CPU masks aren't cheap enough to use them carelessly.) Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
*	credit2: replace cpumask_first() uses	Jan Beulich	2013-08-23	1	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	... with cpumask_any() or cpumask_cycle(). In one case this also allows elimination of a cpumask_empty() call, and while doing this I also spotted a redundant use of cpumask_weight(). (When running on big systems, operations on CPU masks aren't cheap enough to use them carelessly.) Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
*	un-alias cpumask_any() from cpumask_first()	Jan Beulich	2013-08-23	2	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to achieve more symmetric distribution of certain things, cpumask_any() shouldn't always pick the first CPU (which frequently will end up being CPU0). To facilitate that, introduce a library-like function to obtain random numbers. The per-architecture function is supposed to return zero if no valid random number can be obtained (implying that if occasionally zero got produced as random number, it wouldn't be considered such). As fallback this uses the trivial algorithm from the C standard, extended to produce "unsigned int" results. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
*	xen: Introduce a helper to read a u32 property in device tree.	Chen Baozi	2013-08-22	1	-0/+15
\| \| \| \| \|	Signed-off-by: Chen Baozi <baozich@gmail.com> Acked-by: Julien Grall <julien.grall@linaro.org>
*	xen: gate split heap code on its own config option rather than !X86	Ian Campbell	2013-08-20	1	-1/+1
\| \| \| \| \| \| \| \| \|	I'm going to want to disable this for 64 bit ARM. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	xen: remove evtchn_upcall_mask from interface on ARM	Ian Campbell	2013-08-20	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On ARM event-channel upcalls are masked using the hardware's interrupt mask bit and not by a software bit. Leaving this field present in the interface has caused some confusion already and is liable to mean it gets inadvertently used in the future. So arrange for this field to be turned into a padding field on ARM by introducing a XEN_HAVE_PV_UPCALL_MASK define. This bit is also unused for x86 PV-on-HVM guests, but we can't realistically distinguish those from x86 PV guests in the headers. Add a per-arch vcpu_event_delivery_is_enabled function to replace an open coded use of evtchn_upcall_mask in common code (in a debug keyhandler). The existing local_event_delivery_is_enabled, which operates only on current, was unimplemented on ARM and unused on x86, so remove it. ifdef the use of evtchn_upcall_mask when setting up a new vcpu info page. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
*	watchdog: Move watchdog from being x86 specific to common code	Andrew Cooper	2013-08-13	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Augment watchdog_setup() to be able to possibly return an error, and introduce watchdog_enabled() as a better alternative to knowing the architectures internal details. This patch does not change the x86 implementaion, beyond making it compile. For header files, some includes of xen/nmi.h were only for the watchdog functions, so are replaced rather than adding an extra include of xen/watchdog.h Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	libelf: Fix typo in header guard macro	Patrick Welche	2013-08-08	1	-2/+2
\| \| \| \| \| \| \| \|	s/__LIBELF_PRIVATE_H_/__LIBELF_PRIVATE_H__/ Signed-off-by: Patrick Welche <prlw1@cam.ac.uk> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	fix off-by-one mistakes in vm_alloc()	Jan Beulich	2013-08-05	1	-10/+18
\| \| \| \| \| \| \| \| \| \| \|	Also add another pair of assertions to catch eventual further cases of incorrect accounting, and remove the temporary debuggin messages again which commit 68caac7f ("x86: don't use destroy_xen_mappings() for vunmap()") added. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	Don't take the domain lock for p2m operations.	Tim Deegan	2013-07-29	1	-4/+0
\| \| \| \| \| \| \| \| \| \|	P2M ops are covered by their own locks, and these uses of the domain lock are relics of shadow-v1 code. Signed-off-by: Tim Deegan <tim@xen.org> Reviewed-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: don't use destroy_xen_mappings() for vunmap()	Jan Beulich	2013-07-17	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Its attempt to tear down intermediate page table levels may race with map_pages_to_xen() establishing them, and now that map_domain_page_global() is backed by vmap() this teardown is also wasteful (as it's very likely to need the same address space populated again within foreseeable time). As the race between vmap() and vunmap(), according to the latest stage tester logs, doesn't appear to be the only one still left, the patch also adds logging for vmap() and vunmap() uses (there shouldn't be too many of them, so logs shouldn't get flooded). These are supposed to get removed (and are made stand out clearly) as soon as we're certain that there's no issue left. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	use SMP barrier in common code dealing with shared memory protocols	Ian Campbell	2013-07-04	8	-19/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Xen currently makes no strong distinction between the SMP barriers (smp_mb etc) and the regular barrier (mb etc). In Linux, where we inherited these names from having imported Linux code which uses them, the SMP barriers are intended to be sufficient for implementing shared-memory protocols between processors in an SMP system while the standard barriers are useful for MMIO etc. On x86 with the stronger ordering model there is not much practical difference here but ARM has weaker barriers available which are suitable for use as SMP barriers. Therefore ensure that common code uses the SMP barriers when that is all which is required. On both ARM and x86 both types of barrier are currently identical so there is no actual change. A future patch will change smp_mb to a weaker barrier on ARM. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: make map_domain_page_global() a simple wrapper around vmap()	Jan Beulich	2013-07-04	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is in order to reduce the number of fundamental mapping mechanisms as well as to reduce the amount of code to be maintained. In the course of this the virtual space available to vmap() is being grown from 16Gb to 64Gb. Note that this requires callers of unmap_domain_page_global() to no longer pass misaligned pointers - map_domain_page_global() returns page size aligned pointers, so unmappinmg should be done accordingly. unmap_vcpu_info() violated this and is being adjusted here. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	libelf: abolish obsolete macros	Ian Jackson	2013-06-14	3	-27/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Abolish ELF_PTRVAL_[CONST_]{CHAR,VOID}; change uses to elf_ptrval. Abolish ELF_HANDLE_DECL_NONCONST; change uses to ELF_HANDLE_DECL. Abolish ELF_OBSOLETE_VOIDP_CAST; simply remove all uses. No functional change. (Verified by diffing assembler output.) This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v2: New patch.
*	libelf: check loops for running away	Ian Jackson	2013-06-14	3	-20/+98
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ensure that libelf does not have any loops which can run away indefinitely even if the input is bogus. (Grepped for \bfor, \bwhile and \bgoto in libelf and xc_dom_loader.c.) Changes needed: * elf_note_next uses the note's unchecked alleged length, which might wrap round. If it does, return ELF_MAX_PTRVAL (0xfff..fff) instead, which will be beyond the end of the section and so terminate the caller's loop. Also check that the returned psuedopointer is sane. * In various loops over section and program headers, check that the calculated header pointer is still within the image, and quit the loop if it isn't. * Some fixed limits to avoid potentially O(image_size^2) loops: - maximum length of strings: 4K (longer ones ignored totally) - maximum total number of ELF notes: 65536 (any more are ignored) * Check that the total program contents (text, data) we copy or initialise doesn't exceed twice the output image area size. * Remove an entirely useless loop from elf_xen_parse (!) * Replace a nested search loop in in xc_dom_load_elf_symtab in xc_dom_elfloader.c by a precomputation of a bitmap of referenced symtabs. We have not changed loops which might, in principle, iterate over the whole image - even if they might do so one byte at a time with a nontrivial access check function in the middle. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v8: Fix the two loops in libelf-dominfo.c; the comment about PT_NOTE and SHT_NOTE wasn't true because the checks did "continue", not "break". Add a comment about elf_note_next's expectations of the caller's loop conditions (which most plausible callers will follow anyway). v5: Fix regression due to wrong image size loop limit calculation. Check return value from xc_dom_malloc. v4: Fix regression due to misplacement of test in elf_shdr_by_name (uninitialised variable). Introduce fixed limits. Avoid O(size^2) loops. Check returned psuedopointer from elf_note_next is correct. A few style fixes. v3: Fix a whitespace error. v2: BUGFIX: elf_shdr_by_name, elf_note_next: Reject new <= old, not just <. elf_shdr_by_name: Change order of checks to be a bit clearer. elf_load_bsdsyms: shdr loop check, improve chance of brokenness detection. Style fixes.
*	libelf: use only unsigned integers	Ian Jackson	2013-06-14	4	-47/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed integers have undesirable undefined behaviours on overflow. Malicious compilers can turn apparently-correct code into code with security vulnerabilities etc. So use only unsigned integers. Exceptions are booleans (which we have already changed) and error codes. We _do_ change all the chars which aren't fixed constants from our own text segment, but not the chars. This is because it is safe to access an arbitrary byte through a char, but not necessarily safe to convert an arbitrary value to a char. As a consequence we need to compile libelf with -Wno-pointer-sign. It is OK to change all the signed integers to unsigned because all the inequalities in libelf are in contexts where we don't "expect" negative numbers. In libelf-dominfo.c:elf_xen_parse we rename a variable "rc" to "more_notes" as it actually contains a note count derived from the input image. The "error" return value from elf_xen_parse_notes is changed from -1 to ~0U. grepping shows only one occurrence of "PRId" or "%d" or "%ld" in libelf and xc_dom_elfloader.c (a "%d" which becomes "%u"). This is part of the fix to a security issue, XSA-55. For those concerned about unintentional functional changes, the following rune produces a version of the patch which is much smaller and eliminates only non-functional changes: GIT_EXTERNAL_DIFF=.../unsigned-differ git-diff <before>..<after> where <before> and <after> are git refs for the code before and after this patch, and unsigned-differ is this shell script: #!/bin/bash set -e seddery () { perl -pe 's/\b(?:elf_errorstatus\|elf_negerrnoval)\b/int/g' } path="$1" in="$2" out="$5" set +e diff -pu --label "$path~" <(seddery <"$in") --label "$path" <(seddery <"$out") rc=$? set -e if [ $rc = 1 ]; then rc=0; fi exit $rc Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v8: Use "?!?!" to express consternation instead of a ruder phrase. v5: Introduce ELF_NOTE_INVALID, instead of using a literal ~0U. v4: Fix regression in elf_round_up; use uint64_t here. v3: Changes to booleans split off into separate patch. v2: BUGFIX: Eliminate conversion to int of return from elf_xen_parse_notes. BUGFIX: Fix the one printf format thing which needs changing. Remove irrelevant change to constify note_desc.name in libelf-dominfo.c. In xc_dom_load_elf_symtab change one sizeof(int) to sizeof(unsigned). Do not change type of 2nd argument to memset. Provide seddery for easier review. Style fix.
*	libelf: use C99 bool for booleans	Ian Jackson	2013-06-14	4	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We want to remove uses of "int" because signed integers have undesirable undefined behaviours on overflow. Malicious compilers can turn apparently-correct code into code with security vulnerabilities etc. In this patch we change all the booleans in libelf to C99 bool, from <stdbool.h>. For the one visible libelf boolean in libxc's public interface we retain the use of int to avoid changing the ABI; libxc converts it to a bool for consumption by libelf. It is OK to change all values only ever used as booleans to _Bool (bool) because conversion from any scalar type to a _Bool works the same as the boolean test in if() or ?: and is always defined (C99 6.3.1.2). But we do need to check that all these variables really are only ever used that way. (It is theoretically possible that the old code truncated some 64-bit values to 32-bit ints which might become zero depending on the value, which would mean a behavioural change in this patch, but it seems implausible that treating 0x????????00000000 as false could have been intended.) This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v3: Use <stdbool.h>'s bool (or _Bool) instead of defining elf_bool. Split this into a separate patch.
*	libelf: Check pointer references in elf_is_elfbinary	Ian Jackson	2013-06-14	2	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	elf_is_elfbinary didn't take a length parameter and could potentially access out of range when provided with a very short image. We only need to check the size is enough for the actual dereference in elf_is_elfbinary; callers are just using it to check the magic number and do their own checks (usually via the new elf_ptrval system) before dereferencing other parts of the header. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v7: Add a comment about the limited function of elf_is_elfbinary. v2: Style fix. Fix commit message subject.
*	libelf: check all pointer accesses	Ian Jackson	2013-06-14	4	-14/+123
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We change the ELF_PTRVAL and ELF_HANDLE types and associated macros: * PTRVAL becomes a uintptr_t, for which we provide a typedef elf_ptrval. This means no arithmetic done on it can overflow so the compiler cannot do any malicious invalid pointer arithmetic "optimisations". It also means that any places where we dereference one of these pointers without using the appropriate macros or functions become a compilation error. So we can be sure that we won't miss any memory accesses. All the PTRVAL variables were previously void* or char, so the actual address calculations are unchanged. ELF_HANDLE becomes a union, one half of which keeps the pointer value and the other half of which is just there to record the type. The new type is not a pointer type so there can be no address calculations on it whose meaning would change. Every assignment or access has to go through one of our macros. * The distinction between const and non-const pointers and chars and voids in libelf goes away. This was not important (and anyway libelf tended to cast away const in various places). * The fields elf->image and elf->dest are renamed. That proves that we haven't missed any unchecked uses of these actual pointer values. * The caller may fill in elf->caller_xdest_base and _size to specify another range of memory which is safe for libelf to access, besides the input and output images. * When accesses fail due to being out of range, we mark the elf "broken". This will be checked and used for diagnostics in a following patch. We do not check for write accesses to the input image. This is because libelf actually does this in a number of places. So we simply permit that. * Each caller of libelf which used to set dest now sets dest_base and dest_size. * In xc_dom_load_elf_symtab we provide a new actual-pointer value hdr_ptr which we get from mapping the guest's kernel area and use (checking carefully) as the caller_xdest area. * The STAR(h) macro in libelf-dominfo.c now uses elf_access_unsigned. * elf-init uses the new elf_uval_3264 accessor to access the 32-bit fields, rather than an unchecked field access (ie, unchecked pointer access). * elf_uval has been reworked to use elf_uval_3264. Both of these macros are essentially new in this patch (although they are derived from the old elf_uval) and need careful review. * ELF_ADVANCE_DEST is now safe in the sense that you can use it to chop parts off the front of the dest area but if you chop more than is available, the dest area is simply set to be empty, preventing future accesses. * We introduce some #defines for memcpy, memset, memmove and strcpy: - We provide elf_memcpy_safe and elf_memset_safe which take PTRVALs and do checking on the supplied pointers. - Users inside libelf must all be changed to either elf_mem_unchecked (which are just like mem), or elf_mem_safe (which take PTRVALs) and are checked. Any unchanged call sites become compilation errors. We do _not_ at this time fix elf_access_unsigned so that it doesn't make unaligned accesses. We hope that unaligned accesses are OK on every supported architecture. But it does check the supplied pointer for validity. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v7: Remove a spurious whitespace change. v5: Use allow_size value from xc_dom_vaddr_to_ptr to set xdest_size correctly. If ELF_ADVANCE_DEST advances past the end, mark the elf broken. Always regard NULL allowable region pointers (e.g. dest_base) as invalid (since NULL pointers don't point anywhere). v4: Fix ELF_UNSAFE_PTR to work on 32-bit even when provided 64-bit values. Fix xc_dom_load_elf_symtab not to call XC_DOM_PAGE_SIZE unnecessarily if load is false. This was a regression. v3.1: Introduce a change to elf_store_field to undo the effects of the v3.1 change to the previous patch (the definition there is not compatible with the new types). v3: Fix a whitespace error. v2 was Acked-by: Ian Campbell <ian.campbell@citrix.com> v2: BUGFIX: elf_strval: Fix loop termination condition to actually work. BUGFIX: elf_strval: Fix return value to not always be totally wild. BUGFIX: xc_dom_load_elf_symtab: do proper check for small header size. xc_dom_load_elf_symtab: narrow scope of `hdr_ptr'. xc_dom_load_elf_symtab: split out uninit'd symtab.class ref fix. More comments on the lifetime/validity of elf-> dest ptrs etc. libelf.h: write "obsolete" out in full libelf.h: rename "dontuse" to "typeonly" and add doc comment elf_ptrval_in_range: Document trustedness of arguments. Style and commit message fixes.
*	libelf: check nul-terminated strings properly	Ian Jackson	2013-06-14	2	-6/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is not safe to simply take pointers into the ELF and use them as C pointers. They might not be properly nul-terminated (and the pointers might be wild). So we are going to introduce a new function elf_strval for safely getting strings. This will check that the addresses are in range and that there is a proper nul-terminated string. Of course it might discover that there isn't. In that case, it will be made to fail. This means that elf_note_name might fail, too. For the benefit of call sites which are just going to pass the value to a printf-like function, we provide elf_strfmt which returns "(invalid)" on failure rather than NULL. In this patch we introduce dummy definitions of these functions. We introduce calls to elf_strval and elf_strfmt everywhere, and update all the call sites with appropriate error checking. There is not yet any semantic change, since before this patch all the places where we introduce elf_strval dereferenced the value anyway, so it mustn't have been NULL. In future patches, when elf_strval is made able return NULL, when it does so it will mark the elf "broken" so that an appropriate diagnostic can be printed. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v7: Change readnotes.c check to use two if statements rather than \|\|. v2: Fix coding style, in one "if" statement.
*	libelf: introduce macros for memory access and pointer handling	Ian Jackson	2013-06-14	3	-108/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We introduce a collection of macros which abstract away all the pointer arithmetic and dereferences used for accessing the input ELF and the output area(s). We use the new macros everywhere. For now, these macros are semantically identical to the code they replace, so this patch has no functional change. elf_is_elfbinary is an exception: since it doesn't take an elf, we need to handle it differently. In a future patch we will change it to take, and check, a length parameter. For now we just mark it with a fixme. That this patch has no functional change can be verified as follows: 0. Copy the scripts "comparison-generate" and "function-filter" out of this commit message. 1. Check out the tree before this patch. 2. Run the script ../comparison-generate .... ../before 3. Check out the tree after this patch. 4. Run the script ../comparison-generate .... ../after 5. diff --exclude=\.[soi] -ruN before/ after/ \|less Expect these differences: * stubdom/zlib-x86_64/ztest.s2 The filename of this test file apparently contains the pid. xen/common/version.s2 The xen build timestamp appears in two diff hunks. Verification that this is all that's needed: In a completely built xen.git, find * -name ..d -type f \| xargs grep -l libelf\.h Expect results in: xen/arch/x86: Checked above. tools/libxc: Checked above. tools/xcutils/readnotes: Checked above. tools/xenstore: Checked above. xen/common/libelf: This is the build for the hypervisor; checked in B above. stubdom: We have one stubdom which reads ELFs using our libelf, pvgrub, which is checked above. I have not done this verification for ARM. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> v7: Add uintptr_t cast to ELF_UNSAFE_PTR. Still verifies. Use git foo not git-foo in commit message verification script. v4: Fix elf_load_binary's phdr message to be correct on 32-bit. Fix ELF_OBSOLETE_VOIDP_CAST to work on 32-bit. Indent scripts in commit message. v3.1: Change elf_store_field to verify correctly on 32-bit. comparison-generate copes with Xen 4.1's lack of ./configure. v2: Use Xen style for multi-line comments. Postpone changes to readnotes.c:print_l1_mfn_valid_note. Much improved verification instructions with new script. Fixed commit message subject. -8<- comparison-generate -8<- #!/bin/bash # usage: # cd xen.git # .../comparison-generate OUR-CONFIG BUILD-RUNE-PREFIX ../before\|../after # eg: # .../comparison-generate ~/work/.config 'schroot -pc64 --' ../before set -ex test $# = 3 \|\| need-exactly-three-arguments our_config=$1 build_rune_prefix=$2 result_dir=$3 git clean -x -d -f cp "$our_config" . cat <<END >>.config debug_symbols=n CFLAGS += -save-temps END perl -i~ -pe 's/ -g / -g0 / if m/^CFLAGS/' xen/Rules.mk if [ -f ./configure ]; then $build_rune_prefix ./configure fi $build_rune_prefix make -C xen $build_rune_prefix make -C tools/include $build_rune_prefix make -C stubdom grub $build_rune_prefix make -C tools/libxc $build_rune_prefix make -C tools/xenstore $build_rune_prefix make -C tools/xcutils rm -rf "$result_dir" mkdir "$result_dir" set +x for f in `find xen tools stubdom -name \.[soi]`; do mkdir -p "$result_dir"/`dirname $f` cp $f "$result_dir"/${f} case $f in *.s) ../function-filter <$f >"$result_dir"/${f}2 ;; esac done echo ok. -8<- -8<- function-filter -8<- #!/usr/bin/perl -w # function-filter # script for massaging gcc-generated labels to be consistent use strict; our @lines; my $sedderybody = "sub seddery () {\n"; while (<>) { push @lines, $_; if (m/^(__FUNCTION__\|__func__)\.(\d+)\:/) { $sedderybody .= " s/\\b$1\\.$2\\b/__XSA55MANGLED__$1.$./g;\n"; } } $sedderybody .= "}\n1;\n"; eval $sedderybody or die $@; foreach (@lines) { seddery(); print or die $!; } -8<-
*	libelf: move include of <asm/guest_access.h> to top of file	Ian Jackson	2013-06-14	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	libelf-loader.c #includes <asm/guest_access.h>, when being compiled for Xen. Currently it does this in the middle of the file. Move this #include to the top of the file, before libelf-private.h. This is necessary because in forthcoming patches we will introduce private #defines of memcpy etc. which would interfere with definitions in headers #included from guest_access.h. No semantic or functional change in this patch. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
*	libelf: abolish elf_sval and elf_access_signed	Ian Jackson	2013-06-14	1	-28/+0
\| \| \| \| \| \| \| \| \| \| \|	These are not used anywhere. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
*	libelf: add `struct elf_binary*' parameter to elf_load_image	Ian Jackson	2013-06-14	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The meat of this function is going to need a copy of the elf pointer, in forthcoming patches. No functional change in this patch. This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
*	libelf: abolish libelf-relocate.c	Ian Jackson	2013-06-14	1	-372/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This file is not actually used. It's not built in Xen's instance of libelf; in libxc's it's built but nothing in it is called. Do not compile it in libxc, and delete it. This reduces the amount of work we need to do in forthcoming patches to libelf (particularly since as libelf-relocate.c is not used it is probably full of bugs). This is part of the fix to a security issue, XSA-55. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Chuck Anderson <chuck.anderson@oracle.com>
*	tmem: Don't use map_domain_page for long-life-time pages	Konrad Rzeszutek Wilk	2013-06-13	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using tmem with Xen 4.3 (and debug build) we end up with: (XEN) Xen BUG at domain_page.c:143 (XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]---- (XEN) CPU: 3 (XEN) RIP: e008:[<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1 .. (XEN) Xen call trace: (XEN) [<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1 (XEN) [<ffff82c4c01373de>] cli_get_page+0x15e/0x17b (XEN) [<ffff82c4c01377c4>] tmh_copy_from_client+0x150/0x284 (XEN) [<ffff82c4c0135929>] do_tmem_put+0x323/0x5c4 (XEN) [<ffff82c4c0136510>] do_tmem_op+0x5a0/0xbd0 (XEN) [<ffff82c4c022391b>] syscall_enter+0xeb/0x145 (XEN) A bit of debugging revealed that the map_domain_page and unmap_domain_page are meant for short life-time mappings. And that those mappings are finite. In the 2 VCPU guest we only have 32 entries and once we have exhausted those we trigger the BUG_ON condition. The two functions - tmh_persistent_pool_page_[get,put] are used by the xmem_pool when xmem_pool_[alloc,free] are called. These xmem_pool_* function are wrapped in macro and functions - the entry points are via: tmem_malloc and tmem_page_alloc. In both cases the users are in the hypervisor and they do not seem to suffer from using the hypervisor virtual addresses. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
*	x86: re-enable VCPUOP_register_vcpu_time_memory_area	Jan Beulich	2013-05-27	1	-2/+0
\| \| \| \| \| \| \| \| \| \|	By moving the call to update_vcpu_system_time() out of schedule() into arch-specific context switch code, the original problem of the function accessing the wrong domain's address space goes away (obvious even from patch context, as update_runstate_area() does similar copying). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	hypervisor/xen/tools: Remove the XENMEM_get_oustanding_pages and provide the ↵	Konrad Rzeszutek Wilk	2013-05-14	3	-11/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	data via xc_phys_info During the review of the patches it was noticed that there exists a race wherein the 'free_memory' value consists of information from two hypercalls. That is the XEN_SYSCTL_physinfo and XENMEM_get_outstanding_pages. The free memory the host has available for guest is the difference between the 'free_pages' (from XEN_SYSCTL_physinfo) and 'outstanding_pages'. As they are two hypercalls many things can happen in between the execution of them. This patch resolves this by eliminating the XENMEM_get_outstanding_pages hypercall and providing the free_pages and outstanding_pages information via the xc_phys_info structure. It also removes the XSM hooks and adds locking as needed. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Keir Fraser <keir.xen@gmail.com>
*	also move compat mode VCPUOP_register_vcpu_info into common code	Jan Beulich	2013-05-13	1	-0/+9
\| \| \| \| \| \| \| \| \|	Otherwise, with arch_compat_vcpu_op() calling arch_do_vcpu_op() to handle it, it results in -ENOSYS after 6ff9e4f7 ("xen: move VCPUOP_register_vcpu_info to common code") for 32-bit x86 domains. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	xen/arm: Use hierarchical device tree to retrieve GIC information	Julien Grall	2013-05-13	1	-42/+0
\| \| \| \| \| \| \| \|	- Remove early parsing for GIC addresses - Remove hard coded maintenance IRQ number Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen/arm: Add helpers to retrieve an interrupt description from the device tree	Julien Grall	2013-05-13	1	-0/+362
\| \| \| \| \|	Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen/arm: Add helpers to retrieve an address from the device tree	Julien Grall	2013-05-13	1	-0/+343
\| \| \| \| \|	Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen/arm: Add helpers to use the device tree	Julien Grall	2013-05-13	1	-18/+144
\| \| \| \| \|	Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen/arm: Create a hierarchical device tree	Julien Grall	2013-05-13	1	-5/+450
\| \| \| \| \| \| \| \| \|	Add function to parse the device tree and create a hierarchical tree. This code is based on drivers/of/base.c in linux source. Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen: move VCPUOP_register_runstate_memory_area to common code	Stefano Stabellini	2013-05-08	1	-0/+28
\| \| \| \| \|	Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	xen: move VCPUOP_register_vcpu_info to common code	Stefano Stabellini	2013-05-08	1	-0/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the implementation of VCPUOP_register_vcpu_info from x86 specific to commmon code. Move vcpu_info_mfn from an arch specific vcpu sub-field to the common vcpu struct. Move the initialization of vcpu_info_mfn to common code. Move unmap_vcpu_info and the call to unmap_vcpu_info at domain destruction time to common code. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	xen/arm: implement smp_call_function	Julien Grall	2013-05-08	2	-0/+113
\| \| \| \| \| \| \| \|	Move smp_call_function and on_selected_cpus to common code. Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	common: remove rcu_lock_target_domain_by_id	Daniel De Graaf	2013-05-07	1	-34/+0
\| \| \| \| \| \| \| \| \| \|	This function (and rcu_lock_remote_target_domain_by_id) has no remaining users, having been replaced with XSM hooks and the other rcu_lock_* functions. Remove it. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (for 4.3 release) Acked-by: Keir Fraser <keir@xen.org>
*	xsm: add hooks for claim	Daniel De Graaf	2013-05-07	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \|	Adds XSM hooks for the recently introduced XENMEM_claim_pages and XENMEM_get_outstanding_pages operations, and adds FLASK access vectors for them. This makes the access control decisions for these operations match those in the rest of the hypervisor. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> (for 4.3 release) Acked-by: Keir Fraser <keir@xen.org>
*	x86: Fix __prepare_to_wait() asm test for stack size	Keir Fraser	2013-05-02	1	-1/+1
\| \| \| \|	Signed-off-by: Keir Fraser <keir@xen.org>
*	x86: make arch_set_info_guest() preemptible	Jan Beulich	2013-05-02	3	-0/+13
\| \| \| \| \| \| \| \| \| \|	.. as the root page table validation (and the dropping of an eventual old one) can require meaningful amounts of time. This is part of CVE-2013-1918 / XSA-45. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
*	x86: make vcpu_reset() preemptible	Jan Beulich	2013-05-02	2	-8/+17
\| \| \| \| \| \| \| \| \| \|	... as dropping the old page tables may take significant amounts of time. This is part of CVE-2013-1918 / XSA-45. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
*	More emacs local variable block fixes.	Ian Campbell	2013-04-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	The emacs variable to set the C style from a local variable block is c-file-style, not c-set-style. These were either missed by 82639998a5f2 or have crept back in since. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	xen: introduce vcpu_block	Stefano Stabellini	2013-04-30	1	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	Rename do_block to vcpu_block. Move the call to local_event_delivery_enable out of vcpu_block, to a new static function called vcpu_block_enable_events. Use vcpu_block_enable_events instead of do_block throughout in schedule.c Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	cpupool: prevent a domain from moving itself	Daniel De Graaf	2013-04-23	1	-6/+2
\| \| \| \| \| \| \| \| \|	In the XEN_SYSCTL_CPUPOOL_OP_MOVEDOMAIN operation, the existing check for domid == 0 should be checking that a domain does not attempt to modify its own cpupool; fix this by using rcu_lock_remote_domain_by_id. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
*	libxc: Add unsafe decompressors	Bastian Blank	2013-04-22	5	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add decompressors based on hypervisor code. This are used in mini-os by pv-grub. This enables pv-grub to boot kernels compressed with e.g. xz, which are becoming more common. Signed-off-by: Bastian Blank <waldi@debian.org> Adjusted to use terminology "unsafe" rather than "trusted" to indicate that the user had better sanitise the data (or not care, as in stub domains) as suggested by Tim Deegan. This was effectively a sed script. Minimise the changes to hypervisor code by moving the "compat layer" into the relevant libxc source files (which include the Xen ones). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
*	x86/S3: Fix cpu pool scheduling after suspend/resume	Ben Guthro	2013-04-19	1	-11/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This review is another S3 scheduler problem with the system_state variable introduced with the following changeset: http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=269f543ea750ed567d18f2e819e5d5ce58eda5c5 Specifically, the cpu_callback function that takes the CPU down during suspend, and back up during resume. We were seeing situations where, after S3, only CPU0 was in cpupool0. Guest performance suffered greatly, since all vcpus were only on a single pcpu. Guests under high CPU load showed the problem much more quickly than an idle guest. Removing this if condition forces the CPUs to go through the expected online/offline state, and be properly scheduled after S3. This also includes a necessary partial change proposed earlier by Tomasz Wroblewski here: http://lists.xen.org/archives/html/xen-devel/2013-01/msg02206.html It should also resolve the issues discussed in this thread: http://lists.xen.org/archives/html/xen-devel/2012-11/msg01801.html Signed-off-by: Ben Guthro <benjamin.guthro@citrix.com> Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
*	x86: fix various issues with handling guest IRQs	Jan Beulich	2013-04-18	2	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	- properly revoke IRQ access in map_domain_pirq() error path - don't permit replacing an in use IRQ - don't accept inputs in the GSI range for MAP_PIRQ_TYPE_MSI - track IRQ access permission in host IRQ terms, not guest IRQ ones (and with that, also disallow Dom0 access to IRQ0) This is CVE-2013-1919 / XSA-46. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
*	xen: allow for explicitly specifying node-affinity	Dario Faggioli	2013-04-17	5	-8/+133
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make it possible to pass the node-affinity of a domain to the hypervisor from the upper layers, instead of always being computed automatically. Note that this also required generalizing the Flask hooks for setting and getting the affinity, so that they now deal with both vcpu and node affinity. Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com> Acked-by: Keir Fraser <keir@xen.org>