xen/xen - xen

	Commit message (Collapse)	Author	Age	Files	Lines
*	Add DOMCTL to limit the number of event channels a domain may use	David Vrabel	2013-10-14	6	-1/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add XEN_DOMCTL_set_max_evtchn which may be used during domain creation to set the maximum event channel port a domain may use. This may be used to limit the amount of Xen resources (global mapping space and xenheap) that a domain may use for event channels. A domain that does not have a limit set may use all the event channels supported by the event channel ABI in use. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Keir Fraser <keir@xen.org>
*	evtchn: add FIFO-based event channel hypercalls and port ops	David Vrabel	2013-10-14	5	-1/+519
\| \| \| \| \| \| \| \| \| \|	Add the implementation for the FIFO-based event channel ABI. The new hypercall sub-ops (EVTCHNOP_init_control, EVTCHNOP_expand_array) and the required evtchn_ops (set_pending, unmask, etc.). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	evtchn: implement EVTCHNOP_set_priority and add the set_priority hook	David Vrabel	2013-10-14	2	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \|	Implement EVTCHNOP_set_priority. A new set_priority hook added to struct evtchn_port_ops will do the ABI specific validation and setup. If an ABI does not provide a set_priority hook (as is the case of the 2-level ABI), the sub-op will return -ENOSYS. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	evtchn: add FIFO-based event channel ABI	David Vrabel	2013-10-14	3	-3/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the event channel hypercall sub-ops and the definitions for the shared data structures for the FIFO-based event channel ABI. The design document for this new ABI is available here: http://xenbits.xen.org/people/dvrabel/event-channels-F.pdf In summary, events are reported using a per-domain shared event array of event words. Each event word has PENDING, LINKED and MASKED bits and a LINK field for pointing to the next event in the event queue. There are 16 event queues (with different priorities) per-VCPU. Key advantages of this new ABI include: - Support for over 100,000 events (2^17). - 16 different event priorities. - Improved fairness in event latency through the use of FIFOs. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	evtchn: allow many more evtchn objects to be allocated per domain	David Vrabel	2013-10-14	3	-29/+112
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Expand the number of event channels that can be supported internally by altering now struct evtchn's are allocated. The objects are indexed using a two level scheme of groups and buckets (instead of only buckets). Each group is a page of bucket pointers. Each bucket is a page-sized array of struct evtchn's. The optimal number of evtchns per bucket is calculated at compile time. If XSM is not enabled, struct evtchn is 16 bytes and each bucket contains 256, requiring only 1 group of 512 pointers for 2^17 (131,072) event channels. With XSM enabled, struct evtchn is 24 bytes, each bucket contains 128 and 2 groups are required. For the common case of a domain with only a few event channels, instead of requiring an additional allocation for the group page, the first bucket is indexed directly. As a consequence of this, struct domain shrinks by at least 232 bytes as 32 bucket pointers are replaced with 1 bucket pointer and (at most) 2 group pointers. [ Based on a patch from Wei Liu with improvements from Malcolm Crossley. ] Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	evtchn: use a per-domain variable for the max number of event channels	David Vrabel	2013-10-14	5	-5/+6
\| \| \| \| \| \| \| \| \|	Instead of the MAX_EVTCHNS(d) macro, use d->max_evtchns instead. This avoids having to repeatedly check the ABI type. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	evtchn: print ABI specific state with the 'e' debug key	David Vrabel	2013-10-14	3	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the output of the 'e' debug key, print some ABI specific state in addition to the (p)ending and (m)asked bits. For the 2-level ABI, print the state of that event's selector bit. e.g., (XEN) port [p/m/s] (XEN) 1 [0/0/1]: s=3 n=0 x=0 d=0 p=74 (XEN) 2 [0/0/1]: s=3 n=0 x=0 d=0 p=75 Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	evtchn: refactor low-level event channel port ops	David Vrabel	2013-10-14	7	-61/+189
\| \| \| \| \| \| \| \| \| \| \| \| \|	Use functions for the low-level event channel port operations (set/clear pending, unmask, is_pending and is_masked). Group these functions into a struct evtchn_port_op so they can be replaced by alternate implementations (for different ABIs) on a per-domain basis. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	debug: remove some event channel info from the 'i' and 'q' debug keys	David Vrabel	2013-10-14	2	-13/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	The 'i' key would always use VCPU0's selector word when printing the event channel state. Remove the incorrect output as a subsequent change will add the (correct) information to the 'e' key instead. When dumping domain information, printing the state of the VIRQ_DEBUG port is redundant -- this information is available via the 'e' key. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86/HVM: cache emulated instruction for retry processing	Jan Beulich	2013-10-14	2	-14/+46
\| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than re-reading the instruction bytes upon retry processing, stash away and re-use what we already read. That way we can be certain that the retry won't do something different from what requested the retry, getting once again closer to real hardware behavior (where what we use retries for is simply a bus operation, not involving redundant decoding of instructions). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86/HVM: properly deal with hvm_copy_*_guest_phys() errors	Jan Beulich	2013-10-14	2	-16/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In memory read/write handling the default case should tell the caller that the operation cannot be handled rather than the operation having succeeded, so that when new HVMCOPY_* states get added not handling them explicitly will not result in errors being ignored. In task switch emulation code stop handling some errors, but not others. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86/HVM: don't ignore hvm_copy_to_guest_phys() errors during I/O intercept	Jan Beulich	2013-10-14	1	-13/+107
\| \| \| \| \| \| \| \| \|	Building upon the extended retry logic we can now also make sure to not ignore errors resulting from writing data back to guest memory. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86/HVM: fix direct PCI port I/O emulation retry and error handling	Jan Beulich	2013-10-14	3	-18/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dpci_ioport_{read,write}() guest memory access failure handling should be modelled after process_portio_intercept()'s (and others): Upon encountering an error on other than the first iteration, the count successfully handled needs to be stored and X86EMUL_OKAY returned, in order for the generic instruction emulator to update register state correctly before reporting failure or retrying (both of which would only happen after re-invoking emulation). Further we leverage (and slightly extend, due to the above mentioned need to return X86EMUL_OKAY) the "large MMIO" retry model. Note that there is still a special case not explicitly taken care of here: While the first retry on the last iteration of a "rep ins" correctly recovers the already read data, an eventual subsequent retry is being handled by the pre-existing mmio-large logic (through hvmemul_do_io() storing the [recovered] data [again], also taking into consideration that the emulator converts a single iteration "ins" to ->read_io() plus ->write()). Also fix an off-by-one in the mmio-large-read logic, and slightly simplify the copying of the data. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86/HVM: properly handle backward string instruction emulation	Jan Beulich	2013-10-14	3	-44/+23
\| \| \| \| \| \| \| \| \| \| \|	Multiplying a signed 32-bit quantity with an unsigned 32-bit quantity produces an unsigned 32-bit result, yet for emulation of backward string instructions we need the result sign extended before getting added to the base address. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	sched: Correct function prototypes	Andrew Cooper	2013-10-14	1	-3/+3
\| \| \| \| \| \| \|	struct vcpu pointers are traditionally v rather than d. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86/MSI: fix locking in pci_restore_msi_state()	Jan Beulich	2013-10-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Right after the loop the lock is being dropped, so all loop exits should happen with the lock still held. Reported-by: Kristoffer Egefelt <kristoffer@itoc.dk> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Kristoffer Egefelt <kristoffer@itoc.dk> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
*	sched: fix race between sched_move_domain() and vcpu_wake()	David Vrabel	2013-10-14	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From: David Vrabel <david.vrabel@citrix.com> sched_move_domain() changes v->processor for all the domain's VCPUs. If another domain, softirq etc. triggers a simultaneous call to vcpu_wake() (e.g., by setting an event channel as pending), then vcpu_wake() may lock one schedule lock and try to unlock another. vcpu_schedule_lock() attempts to handle this but only does so for the window between reading the schedule_lock from the per-CPU data and the spin_lock() call. This does not help with sched_move_domain() changing v->processor between the calls to vcpu_schedule_lock() and vcpu_schedule_unlock(). Fix the race by taking the schedule_lock for v->processor in sched_move_domain(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com> Use vcpu_schedule_lock_irq() (which now returns the lock) to properly retry the locking should the to be used lock have changed in the course of acquiring it (issue pointed out by George Dunlap). Add a comment explaining the state after the v->processor adjustment. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	scheduler: adjust internal locking interface	Jan Beulich	2013-10-14	5	-136/+125
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the locking functions return the lock pointers, so they can be passed to the unlocking functions (which in turn can check that the lock is still actually providing the intended protection, i.e. the parameters determining which lock is the right one didn't change). Further use proper spin lock primitives rather than open coded local_irq_...() constructs, so that interrupts can be re-enabled as appropriate while spinning. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: fix bug_line()	Jan Beulich	2013-10-14	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \|	Due to the packing into a bit field together with a relocated field, the computation can overflow when the relocated field ends up getting a negative value stored. Hence it isn't sufficient to correct the value by 1 in this case, but we also need to mask the result to the width of the original bit field. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: check for canonical address before doing page walks	Jan Beulich	2013-10-11	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|	... as there doesn't really exists any valid mapping for them. Particularly in the case of do_page_walk() this also avoids returning non-NULL for such invalid input. Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: use {rd,wr}{fs,gs}base when available	Jan Beulich	2013-10-11	7	-29/+79
\| \| \| \| \| \| \| \| \| \| \| \|	... as being intended to be faster than MSR reads/writes. In the case of emulate_privileged_op() also use these in favor of the cached (but possibly stale) addresses from arch.pv_vcpu. This allows entirely removing the code that was the subject of XSA-67. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: add address validity check to guest_map_l1e()	Jan Beulich	2013-10-11	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	Just like for guest_get_eff_l1e() this prevents accessing as page tables (and with the wrong memory attribute) internal data inside Xen happening to be mapped with 1Gb pages. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: correct LDT checks	Jan Beulich	2013-10-11	5	-26/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- MMUEXT_SET_LDT should behave as similarly to the LLDT instruction as possible: fail only if the base address is non-canonical - instead LDT descriptor accesses should fault if the descriptor address ends up being non-canonical (by ensuring this we at once avoid reading an entry from the mach-to-phys table and consider it a page table entry) - fault propagation on using LDT selectors must distinguish #PF and #GP (the latter must be raised for a non-canonical descriptor address, which also applies to several other uses of propagate_page_fault(), and hence the problem is being fixed there) - map_ldt_shadow_page() should properly wrap addresses for 32-bit VMs At once remove the odd invokation of map_ldt_shadow_page() from the MMUEXT_SET_LDT handler: There's nothing really telling us that the first LDT page is going to be preferred over others. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: check segment descriptor read result in 64-bit OUTS emulation	Matthew Daley	2013-10-10	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When emulating such an operation from a 64-bit context (CS has long mode set), and the data segment is overridden to FS/GS, the result of reading the overridden segment's descriptor (read_descriptor) is not checked. If it fails, data_base is left uninitialized. This can lead to 8 bytes of Xen's stack being leaked to the guest (implicitly, i.e. via the address given in a #PF). Coverity-ID: 1055116 This is CVE-2013-4368 / XSA-67. Signed-off-by: Matthew Daley <mattjd@gmail.com> Fix formatting. Signed-off-by: Jan Beulich <jbeulich@suse.com>
*	xen/arm: Fixing clear_guest_offset macro	Jaeyong Yoo	2013-10-10	1	-2/+3
\| \| \| \| \| \| \| \|	Fix the the broken macro 'clear_guest_offset' in arm. Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com> Reviewed-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen/arm32: Call start_xen only on the boot CPU	Julien Grall	2013-10-10	1	-1/+2
\| \| \| \| \| \| \| \|	The boot CPU can have a CPU ID non-equal to zero. Xen needs to check the logical CPU ID (in r12) to know if the CPU is the boot one. Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	hvm/vidirian: Avoid printing page_to_mfn(NULL) on error paths	Andrew Cooper	2013-10-09	2	-18/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While working in the viridian code, I noticed that 4cb6c4f4941 "x86/hvm: Use get_page_from_gfn() instead of get_gfn()/put_gfn." introduced two error paths where page_to_mfn(NULL) would be formatted and presented as a bad MFN. This provides junk in the warning rather than something useful. These two codepaths are fixed up to match their counterpart in wrmsr_hypervisor_regs() While auditing the other changes from 4cb6c4f4941, I noticed a small optimisation which could be made by changing the order of the validity checks to remove 6 NULL pointer checks. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86/traps: improvements to {rd,wr}msr_hypervisor_regs()	Andrew Cooper	2013-10-09	1	-26/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Coverity ID: 1055249 1055250 Coverity was complaining that the switch statments contained dead code in their default statements. While this is quite minor, the code flow in wrmsr_hypervisor_regs() was sufficiently opaque that I felt it approprate to fix. Other improvements include: * not shadowing the function parameter 'idx'. * use of PAGE_{SHIFT,SIZE} instead of opencoded numbers. * a more descriptive error message for attempting to write invalid indicies for hypercall pages. There is no behavioural change as a result. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
*	xen/x86: Remove GB macro in asm-x86/config.h	Julien Grall	2013-10-08	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \|	Commit 983843e "xen: Add macros MB and GB" introduce a generic GB macro. By mistake, the macro in asm-x86/config.h was not removed. This is result to a compilation error when Xen is build for x86. Signed-off-by: Julien Grall <julien.grall@linaro.org> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
*	xen/dts: Support Linux initrd DT bindings	Julien Grall	2013-10-08	1	-0/+25
\| \| \| \| \| \| \| \|	Linux uses the property linux,initrd-start and linux,initrd-end to know where the initrd lives in memory. Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen/arm: Add support to load initrd in dom0	Julien Grall	2013-10-08	3	-21/+102
\| \| \| \| \|	Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen/dts: Use ROUNDUP macro instead of the internal ALIGN	Julien Grall	2013-10-08	1	-6/+4
\| \| \| \| \|	Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen: Add macro ROUNDUP	Julien Grall	2013-10-08	1	-0/+2
\| \| \| \| \| \|	Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Keir Fraser <keir@xen.org> CC: Jan Beulich <jbeulich@suse.com>
*	xen: Add macros MB and GB	Julien Grall	2013-10-08	2	-1/+3
\| \| \| \| \| \| \|	Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Keir Fraser <keir@xen.org> CC: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
*	x86/HPET: basic cleanup	Andrew Cooper	2013-10-08	3	-16/+14
\| \| \| \| \| \| \| \| \|	* Strip trailing whitespace * Remove redundant definitions * Update stale documentation links * Move hpet_address into __initdata Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
*	VT-d: fix suspected data race condition in iommu_set_root_entry()	Andrew Cooper	2013-10-08	1	-16/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Coverity ID: 1054967 Coverity spotted that iommu->root_maddr was optionally allocated within the protection of the iommu->lock, but was referenced with the protection of the iommu->register_lock, and freed without any lock. Luckily, the code as-is is not vulnerable to the potential risks identified. However, the alloc_pgtable_maddr() is far more appropriately done in iommu_alloc(), removing a set of spinlock calls, and a possibility for the iommu setup to fail later than iommu_alloc() with an -ENOMEM. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
*	xen: add LZ4 decompression support	Kyungsik Lee	2013-10-07	8	-2/+767
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for Xen are based on LZ4 implementation by Yann Collet. Benchmark Results(PATCH v3) Compiler: Linaro ARM gcc 4.6.2 1. ARMv7, 1.5GHz based board Kernel: linux 3.4 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.7MB 20.1MB/s, 25.2MB/s(UA) LZ4 7.3MB 29.1MB/s, 45.6MB/s(UA) 2. ARMv7, 1.7GHz based board Kernel: linux 3.7 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.0MB 34.1MB/s, 52.2MB/s(UA) LZ4 6.5MB 86.7MB/s - UA: Unaligned memory Access support - Latest patch set for LZO applied This patch set is for adding support for LZ4-compressed Kernel. LZ4 is a very fast lossless compression algorithm and it also features an extremely fast decoder [1]. But we have five of decompressors already and one question which does arise, however, is that of where do we stop adding new ones? This issue had been discussed and came to the conclusion [2]. Russell King said that we should have: - one decompressor which is the fastest - one decompressor for the highest compression ratio - one popular decompressor (eg conventional gzip) If we have a replacement one for one of these, then it should do exactly that: replace it. The benchmark shows that an 8% increase in image size vs a 66% increase in decompression speed compared to LZO(which has been known as the fastest decompressor in the Kernel). Therefore the "fast but may not be small" compression title has clearly been taken by LZ4 [3]. [1] http://code.google.com/p/lz4/ [2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157 [3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347 LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html LZ4 source repository: http://code.google.com/p/lz4/ Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Yann Collet <yann.collet.73@gmail.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: Improve information from domain_crash_synchronous	Andrew Cooper	2013-10-04	5	-28/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As it currently stands, the string "domain_crash_sync called from entry.S" is not helpful at identifying why the domain was crashed, and a debug build of Xen doesn't help the matter This patch improves the information printed, by pointing to where the crash decision was made. Specific improvements include: * Moving the ascii string "domain_crash_sync called from entry.S\n" away from some semi-hot code cache lines. * Moving the printk into C code (especially as this_cpu() is miserable to use in assembly code) * Undo the previous confusing situation of having the domain_crash_synchronous() as a macro in C code, yet a global symbol in assembly code. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86/traps: Record last extable faulting address	Andrew Cooper	2013-10-04	1	-0/+5
\| \| \| \| \| \| \| \|	... so the following patch can identify the location of faults leading to a decision to crash a domain. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86: allow HVM guests to make console_io hypercall	Konrad Rzeszutek Wilk	2013-10-04	1	-0/+2
\| \| \| \| \| \| \| \| \|	The console_io hypercall is provided for PV guests and for HVM guests it is done via the 0xe9 port. However the PV hypercall is more efficient as it takes a string rather than one character per write. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
*	xsm: clean up unneeded current references	Daniel De Graaf	2013-10-04	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Some XSM hooks in dummy.h used current->domain when this was also passed as a parameter; use the parameter in these cases. There are two hooks where this does not apply and which are not immediately obvious: xsm_set_target's parameters are the device model and HVM domains, and xsm_mem_sharing_op's first parameter is the source of the shared page, not the domain making the hypercall. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
*	xsm: forbid PV guest console reads	Daniel De Graaf	2013-10-04	1	-3/+3
\| \| \| \| \| \| \| \|	The CONSOLEIO_read operation was incorrectly allowed to PV guests if the hypervisor was compiled in debug mode (with VERBOSE defined). Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
*	x86: make hvm_cpuid() tolerate NULL pointers	Jan Beulich	2013-10-04	3	-19/+29
\| \| \| \| \| \| \| \| \| \| \|	Now that other HVM code started making more extensive use of hvm_cpuid(), let's not force every caller to declare dummy variables for output not cared about. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
*	Nested VMX: fix IA32_VMX_CR4_FIXED1 msr emulation	Yang Zhang	2013-10-04	3	-4/+53
\| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, it use hardcode value for IA32_VMX_CR4_FIXED1. This is wrong. We should check guest's cpuid to know which bits are writeable in CR4 by guest and allow the guest to set the corresponding bit only when guest has the feature. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Cleanup. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
*	VMX: clean up capability checks	Jan Beulich	2013-10-04	2	-19/+35
\| \| \| \| \| \| \| \| \| \| \| \|	VMCS size validation on APs should check against BP's size. No need for a separate cpu_has_vmx_ins_outs_instr_info variable anymore. Use proper symbolics. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
*	Nested VMX: check VMX capability before read VMX related MSRs	Yang Zhang	2013-10-04	3	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \|	VMX MSRs only available when the CPU support the VMX feature. In addition, VMX_TRUE* MSRs only available when bit 55 of VMX_BASIC MSR is set. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Cleanup. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
*	x86/percpu: Force INVALID_PERCPU_AREA into the non-canonical address region	Andrew Cooper	2013-10-04	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	This causes accidental uses of per_cpu() on a pcpu with an INVALID_PERCPU_AREA to result in a #GF for attempting to access the middle of the non-canonical virtual address region. This is preferable to the current behaviour, where incorrect use of per_cpu() will result in an effective NULL structure dereference which has security implication in the context of PV guests. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	x86/idle: Fix get_cpu_idle_time()'s interaction with offline pcpus	Andrew Cooper	2013-10-04	2	-7/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Checking for "idle_vcpu[cpu] != NULL" is insufficient protection against offline pcpus. From a hypercall, vcpu_runstate_get() will determine "v != current", and try to take the vcpu_schedule_lock(). This will try to look up per_cpu(schedule_data, v->processor) and promptly suffer a NULL structure deference as v->processors' __per_cpu_offset is INVALID_PERCPU_AREA. One example might look like this: ... Xen call trace: [<ffff82c4c0126ddb>] vcpu_runstate_get+0x50/0x113 [<ffff82c4c0126ec6>] get_cpu_idle_time+0x28/0x2e [<ffff82c4c012b5cb>] do_sysctl+0x3db/0xeb8 [<ffff82c4c023280d>] compat_hypercall+0xbd/0x116 Pagetable walk from 0000000000000040: L4[0x000] = 0000000186df8027 0000000000028207 L3[0x000] = 0000000188e36027 00000000000261c9 L2[0x000] = 0000000000000000 ffffffffffffffff **************************************** Panic on CPU 11: ... get_cpu_idle_time() has been updated to correctly deal with offline pcpus itself by returning 0, in the same way as it would if it was missing the idle_vcpu[] pointer. In doing so, XENPF_getidletime needed updating to correctly retain its described behaviour of clearing bits in the cpumap for offline pcpus. As this crash can only be triggered with toolstack hypercalls, it is not a security issue and just a simple bug. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
*	xen/arm: map_domain_page: reuse slots with avail == 0	Stefano Stabellini	2013-10-03	1	-7/+10
\| \| \| \| \| \| \| \|	If a slot has avail == 0 but still points to the right mfn, reuse it. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
*	xen/arm32: don't export v7_init	Julien Grall	2013-10-03	1	-1/+1
\| \| \| \| \|	Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>