| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Checking for "idle_vcpu[cpu] != NULL" is insufficient protection against
offline pcpus. From a hypercall, vcpu_runstate_get() will determine "v !=
current", and try to take the vcpu_schedule_lock(). This will try to look up
per_cpu(schedule_data, v->processor) and promptly suffer a NULL structure
deference as v->processors' __per_cpu_offset is INVALID_PERCPU_AREA.
One example might look like this:
...
Xen call trace:
[<ffff82c4c0126ddb>] vcpu_runstate_get+0x50/0x113
[<ffff82c4c0126ec6>] get_cpu_idle_time+0x28/0x2e
[<ffff82c4c012b5cb>] do_sysctl+0x3db/0xeb8
[<ffff82c4c023280d>] compat_hypercall+0xbd/0x116
Pagetable walk from 0000000000000040:
L4[0x000] = 0000000186df8027 0000000000028207
L3[0x000] = 0000000188e36027 00000000000261c9
L2[0x000] = 0000000000000000 ffffffffffffffff
****************************************
Panic on CPU 11:
...
get_cpu_idle_time() has been updated to correctly deal with offline pcpus
itself by returning 0, in the same way as it would if it was missing the
idle_vcpu[] pointer.
In doing so, XENPF_getidletime needed updating to correctly retain its
described behaviour of clearing bits in the cpumap for offline pcpus.
As this crash can only be triggered with toolstack hypercalls, it is not a
security issue and just a simple bug.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
More specifically:
1. replaces xenctl_cpumap with xenctl_bitmap
2. provides bitmap_to_xenctl_bitmap and the reverse;
3. re-implement cpumask_to_xenctl_bitmap with
bitmap_to_xenctl_bitmap and the reverse;
Other than #3, no functional changes. Interface only slightly
afected.
This is in preparation of introducing NUMA node-affinity maps.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
| |
The emacs variable to set the C style from a local variable block is
c-file-style, not c-set-style.
Signed-off-by: David Vrabel <david.vrabel@citrix.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Include the default XSM hook action as the first argument of the hook
to facilitate quick understanding of how the call site is expected to
be used (dom0-only, arbitrary guest, or device model). This argument
does not solely define how a given hook is interpreted, since any
changes to the hook's default action need to be made identically to
all callers of a hook (if there are multiple callers; most hooks only
have one), and may also require changing the arguments of the hook.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
| |
A number of the platform_hypercall XSM hooks have no parameters or
only pass the operation ID, making them redundant with the
xsm_platform_op hook. Remove these redundant hooks.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
| |
The newly introduced xsm_platform_op hook addresses new sub-ops, while
most ops already have their own XSM hooks.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
- use the variants not validating the VA range when writing back
structures/fields to the same space that they were previously read
from
- when only a single field of a structure actually changed, copy back
just that field where possible
- consolidate copying back results in a few places
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
| |
More substitutions in this patch, not as obvious as the ones in the
previous patch.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Recent Linux tries to make use of this, and has no way of getting at
these bits without Xen assisting it.
There doesn't appear to be a way to obtain the same information from
UEFI.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
| |
This patch implement hypercall through which dom0 send core parking
request, and get core parking result.
Due to the characteristic of continue_hypercall_on_cpu, dom0
seperately send/get core parking request/result.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases guests should not provide workarounds for errata even when the
physical processor is affected. For example, because of erratum 400 on family
10h processors a Linux guest will read an MSR (resulting in VMEXIT) before
going to idle in order to avoid getting stuck in a non-C0 state. This is not
necessary: HLT and IO instructions are intercepted and therefore there is no
reason for erratum 400 workaround in the guest.
This patch allows us to present a guest with certain errata as fixed,
regardless of the state of actual hardware.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Acked-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
| |
Actions requiring IS_PRIV should also require some XSM access control
in order for XSM to be useful in confining multiple privileged
domains. Add XSM hooks for new hypercalls and sub-commands that are
under IS_PRIV but not currently under any access checks.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the recent hotplug changes to the Xen part of the microcode
loading, this allows the kernel driver to avoid unnecessary calls into
the hypervisor during pCPU hot-enabling: Knowing that the hypervisor
retains the data for already booted CPUs, only data for CPUs with a
different signature needs to be passed down. Since the microcode
loading code can be pretty verbose, avoiding to invoke it can make the
log much easier to look at in case of problems.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
XENPF_get_cpuinfo should init the flags output field rather than only
modify it.
XENPF_cpu_online must check for the input CPU number to be in range.
XENPF_cpu_offline must also do that, and should also reject attempts to
offline CPU 0 (this fails in cpu_down() too, but preventing this here
appears more correct given that the code here calls
continue_hypercall_on_cpu(0, ...), which would be flawed if cpu_down()
would ever allow bringing down CPU 0 (and a distinct error code is
easier to deal with when debugging issues).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
| |
This includes the conversion from for_each_cpu_mask() to for_each-cpu().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generally there was a NR_CPUS-bits wide array in these functions and
another (through a cpumask_t) on their callers' stacks, which may get
a little large for big NR_CPUS. As the functions can fail anyway, do
the allocation in there.
For the x86/MCA case this require a little code restructuring: By using
different CPU mask accessors it was possible to avoid allocating a mask
in the broadcast case. Also, this was the only user that failed to
check the return value of the conversion function (which could have led
to undefined behvior).
Also constify the input parameters of the two functions.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The former is the runtime equivalent of NR_CPUS (and users of NR_CPUS,
where necessary, get adjusted accordingly), while the latter is for the
sole use of determining the allocation size when dynamically allocating
CPU masks (done later in this series).
Adjust accessors to use either of the two to bound their bitmap
operations - which one gets used depends on whether accessing the bits
in the gap between nr_cpu_ids and nr_cpumask_bits is benign but more
efficient.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The problem is that efi_runtime_call is the name of both a function in
xen/arch/x86/efi/runtime.c and a member of the xsm_operations struct
in xen/include/xsm/xsm.h. This causes the macro "#define
efi_runtime_call(x) efi_compat_runtime_call(x)" on line 15 of
xen/arch/x86/x86_64/platform_hypercall.c to cause the above compile
error.
Renaming the XSM struct member fixes the problem.
Signed-off-by: James Carter <jwcart2@tycho.nsa.gov>
Acked-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to have Dom0 call _PDC with input fully representing Xen's
capabilities, and in order to avoid building knowledge of Xen
implementation details into Dom0, this provides a mechanism by which
the Dom0 kernel can, once it filled the _PDC input buffer according to
its own knowledge, present the buffer to Xen to apply overrides for
the parts of the C-, P-, and T-state management that it controls. This
is particularly to address the dependency of Xen using MWAIT to enter
certain C-states on the availability of the break-on-interrupt
extension (which the Dom0 kernel should have no need to know about).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows Dom0 access to all suitable EFI runtime services. The
actual calls into EFI are done in "physical" mode, as entering virtual
mode has been determined to be incompatible with kexec (EFI's
SetVirtualAddressMap() can be called only once, and hence the
secondary kernel can't establish its mappings). ("Physical" mode here
being quoted because this is a mode with paging enabled [otherwise
64-bit mode wouldn't work] but all mappings being 1:1.)
Open issue (not preventing this from being committed imo):
Page (and perhaps other) faults occuring while calling runtime
functions in the context of a hypercall don't get handled correctly
(they don't even seem to reach do_page_fault()). I'm intending to
investigate this further.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Besides introducing the relevant code paralleling parts of what is
under xen/arch/x86/boot/, this adjusts the build logic so that with a
single compilation two images (gzip-compressed ELF and EFI
application)
can get created. The EFI part of this depends on a new enough compiler
(supposedly gcc 4.4.x and above, but so far only tested to work with
4.5.x) and a properly configured linker (must support the i386pep
emulation). If either functionality is found to not be available, the
EFI part of the build will simply be skipped.
The patch adds all code to allow Xen and the (accordingly enabled)
Dom0 kernel to boot, but doesn't allow Dom0 to make use of EFI
runtime calls (this will be the subject of the next patch).
Parts of the code were lifted from an earlier never published OS
project of ours - whether respective license information needs to be
added to the respective source file is unclear to me (I was told
internally that adding a GPLv2 license header can be done if needed by
the community).
Open issues (not preventing this from being committed imo):
The trampoline allocation and initialization isn't really nice. This
is due to the trampoline needing to be placed at a fixed address, and
hence making the trampoline relocatable would seem desirable here (as
well as for BIOS-based booting, where the trampoline location needed
to be adjusted a number of time already in the past, due to it
colliding with firmware data).
By excluding mem.S, edd.S, and video.S from copied trampoline (i.e.
moving up wakeup.S? and making sure none of the symbols are used from
EFI code), the effective trampoline size could at least be reduced.
Should the mappings of [__XEN_VIRT_START, mbi.mem_upper) and
[_end, __XEN_VIRT_START+BOOTSTRAP_MAP_BASE) be destroyed, despite
non-EFI code also keeping them?
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this it is questionable whether retaining struct domain's
nr_pirqs is actually necessary - the value now only serves for bounds
checking, and this boundary could easily be nr_irqs.
Note that ia64, the build of which is broken currently anyway, is only
being partially fixed up.
v2: adjustments for split setup/teardown of translation data
v3: re-sync with radix tree implementation changes
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
| |
This patch moves some more, mostly data, extern declarations into
header files. I haven't been as strict as I was with functions;
in particular there are a number of declarations of assembler labels
that are only used in one place. I've also left a few compat-mode
tricks, and all the magic in symbols.c
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
|
|
| |
Move all extern declarations into appropriate header files.
This also fixes up a few places where the caller and the definition
had different signatures.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
|
|
|
| |
Although the caller should react appropriately to EBUSY, if the error
is due to pending RCU work then we can help things along by executing
rcu_barrier() and then retrying. To this end, this changeset is an
optimisation only.
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
| |
... decreasing cache footprint. As a prerequisite this requires making
cmdline_parse() a little more flexible.
Also remove a few variables altogether, and adjust sections
annotations for several others.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since (non-pvops, 32-bit only up to 2.6.27) Linux would report "BAD"
unconditionally on all SiS chipset versions (it only looks for a PCI
device at 0000:00:00.0 with SiS as the vendor), we must not crash if
the report on a 64-bit hypervisor doesn't match the #define (which is
zero).
While we could honor the quirk indication even on 64-bit, it doesn't
seem worthwhile, as there's no evidence that newer SiS chipsets
(supporting 64-bit CPUs) are actually affected.
This should also address bug 1687 (mis-reported, however, afaict).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
| |
Some of these could be benignly discarded by the compiler but some are
actual bugs.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
| |
...and fix up the ensuing fall-out of implicit dependencies
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
| |
Also simplify the locking (reverting to use if spin_trylock, as
returning EBUSY/EAGAIN seems unavoidable after all). In particular
this should continue to ensure that stop_machine_run() does not have
cpu_online_map change under its feet.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
Instead we protect against concurrent online/offline requests for a
single CPU asynchronously, using a cpumask for bookkeeping.
This means that cpu_add_remove_lock no longer needs to be acquired via
spin_trylock.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
Since the execution of stop_machine_run() via cpu_down() is now always
deferred to a hypercall continuation context, the above locks are not
held at that time. Hence the trylock is not specifically to avoid
deadlock with stop_machine_run(), but rather a more general paranoia
about deadlocks in general.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently stop_machine_run() will try to bring all CPUs to softirq
context, with some locks held, like xenpf_lock or cpu_add_remove_lock
etc. However, if another CPU is trying to get these locks, it may
cause deadlock.
This patch replace all such spin_lock with spin_trylock. For
xenpf_lock and sysctl_lock, we try to use hypercall_continuation, so
that we will not cause trouble to user space tools. For
cpu_hot_remove_lock, we simply return EBUSY if failure, since it will
only impact small number of user space tools.
In the end, we should try to make the stop_machine_run as spinlock
free.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The basic work flow to handle the memory hotadd is:
Update node information
Map new pages to xen 1:1 mapping
Setup frametable for new memory range
Setup m2p table for new memory range
Put the new pages to domheap
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch add CPU hot-add in system.
a) It mark all CPU as possible when booting, if CONFIG_HOTPLUG_CPU is
set. BTW, this will increase per_cpu area.
b) When a CPU is added through hypercall, the CPU will be marked as
present and offline, and the numa information is setup if numa is
supported. The CPU will be brought to online by dom0 online explicitly.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
|
|
|
|
|
|
|
|
| |
This patch change the XENPF_get_cpuinfo interface to pass only one
pcpu information each hypercall. Also, it replace
xenpf_resource_hotplug with XENPF_cpu_online/offline.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It also make some changes to current cpu online/offline logic:
1) Firstly, cpu online/offline will trigger a vIRQ to dom0 for status
changes notification.
2) It also add an interface to platform operation to online/offline
physical CPU. Currently the cpu online/offline interface is in sysctl,
which can't be triggered in kernel. With this change, it is possible
to trigger cpu online/offline in dom0 through sysfs interface.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
While pretty simplistic, it appears to serve the purpose at the moment
(i.e. it spotted two places where a GNU extension was used withou
proper preprocessor conditionals). The "simplistic" here includes that
the checking gets only done for native builds, and ia64 gets excluded
due to its arch-specific header intentionally (for whatever reason)
checking that anonymous struct/unions can be used.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
| |
Also consolidate all places to get cpu idle time.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Changeset 18552 (19b0a4f91712) move px transfer logic from
platform_hypercall.c to a common file to support both x86 and
ia64. However, it involves 32b guest os to 64b hypervisor (x86)
compatible issue. This patch fix the compatible issue, and make
set_px_pminfo() re-used by ia64 and x86 (32b guest os to 64b
hypervisor, and 64b guest os to 64b hypervisor).
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
|
|
|
|
|
|
|
|
|
| |
This patch move the cpufreq notify code from x86 specfic place to
common place, since it can be used by both x86 and ia64 cpufreq
driver.
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Yu Ke <ke.yu@intel.com>
|
|
|
|
|
|
|
| |
... as this is rather wasteful when Xen is configured to support many
CPUs but is running on systems having only a few.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
| |
Clean up cpufreq core logic, which now can cope with cpu
online/offline event, and also dynamic platform limitation event
(_PPC).
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- The patch refactors the IO resource checks into the rangeset add/del
code. This produces a much more architecture friendly implementation and
ensures that HVM and paravirtualized guests are checked consistently.
- The patch removes the following hooks in support of the refactoring
of the IO resource checks:
- xsm_irq_permission
- xsm_iomem_permission
- xsm_ioport_permission
- The patch adds the following hooks in support of the refactoring of
the IO resource checks:
- xsm_add_range
- xsm_remove_range
- These IO refactoring changes are transparent to any pre-existing
Flask policies.
- The patch adds also adds hooks for sysctl functionality that was
added since the last major XSM patch. The following hooks were added:
- xsm_set_target
- xsm_debug_keys
- xsm_getcpuinfo
- xsm_availheap
- xsm_firmware_info
- xsm_acpi_sleep
- xsm_change_freq
- xsm_getidletime
- xsm_sendtrigger
- xsm_test_assign_device
- xsm_assign_device
- xsm_deassign_device
- xsm_bind_pt_irq
- xsm_pin_mem_cacheattr
- xsm_ext_vcpucontext
Signed-off-by: George Coker <gscoker@alpha.ncsc.mil>
|
|
|
|
|
|
|
|
| |
Linux 2.6.27 marks the data pointer in its firmware struct 'const',
and hence, to avoid a compiler warning, Xen's microcode update
interface should be properly properly constified too.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
| |
xen changeset 18125(ab1d7db3facb) changed some px init logic, which
has some effect to Px statistic and S3 suspend/resume logic.
This patch change Px control protection corresponding to changeset
18125.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
|