| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Discovered by Coverity, CID 1055743
Depending on the contents of the mp_irqs/mp_ioapics from the MP table,
find_isa_irq_apic() might return -1, at which point calling
ioapic_read_entry() with it is bad.
In addition to bailing if pin is -1, bail if apic is -1.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes two regressions from c/s 20143:a7de5bd776ca ("x86: Make the
hypercall PHYSDEVOP_alloc_irq_vector hypercall dummy"):
For one, IRQs that had their vector set up by Xen internally without a
handler ever having got set (e.g. via "com<n>=..." without a matching
consumer option like "console=com<n>") would wrongly call
add_pin_to_irq() here, triggering the BUG_ON() in that function.
Second, when assign_irq_vector() fails this addition to irq_2_pin[]
needs to be undone.
In the context of this I'm also surprised that the irq_2_pin[]
manipulations here occur without any lock, i.e. rely on Dom0 to do
some sort of serialization.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are BIOSes that want to map the IO-APIC MMIO region from some
ACPI method(s), and there is at least one BIOS flavor that wants to
use this mapping to clear an RTE's mask bit. While we can't allow the
latter, we can permit reads and simply drop write attempts, leveraging
the already existing infrastructure introduced for dealing with AMD
IOMMUs' representation as PCI devices.
This fixes an interrupt setup problem on a system where _CRS evaluation
involved the above described BIOS/ACPI behavior, and is expected to
also deal with a boot time crash of pv-ops Linux upon encountering the
same kind of system.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It has always been puzzling me why the first IO-APIC gets special cased
in two places, and finally Xen got run on a system where this breaks:
(XEN) ACPI: IOAPIC (id[0x10] address[0xfecff000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 16, version 17, address 0xfecff000, GSI 0-2
(XEN) ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[3])
(XEN) IOAPIC[1]: apic_id 15, version 17, address 0xfec00000, GSI 3-38
(XEN) ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[39])
(XEN) IOAPIC[2]: apic_id 14, version 17, address 0xfec01000, GSI 39-74
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 1 global_irq 4 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 5 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 6 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 4 global_irq 7 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 6 global_irq 9 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 7 global_irq 10 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 11 low edge)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 12 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 12 global_irq 15 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 13 global_irq 16 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 17 low edge)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 18 dfl dfl)
i.e. all legacy IRQs (apart from the timer one, but the firmware passed
data doesn't look right for that case anyway, as both Xen and native
Linux are falling back to use the virtual wire setup for IRQ0,
apparently rather using pin 2 of the first IO-APIC) are being handled
by the second IO-APIC.
This at once eliminates the possibility of an unmasked RTE getting
written without having got a vector put in place (in
setup_IO_APIC_irqs()).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
| |
Performance is not an issue with printk(), so let the function do
minimally more work and instead save a byte per affected format
specifier.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than evaluating "ioapic_ack_new" on each invocation, and
considering that the two methods really have almost no code in common,
split the handlers.
While at it, also move ioapic_ack_{new,forced} into .init.data
(eliminating the single non-__init reference to the former).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
| |
This improves readability, not the least through doing away with a
couple of ugly casts.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
| |
highest_gsi() returns the last valid GSI, not a count.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Joe Jin <joe.jin@oracle.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the recent changes to clear_IO_APIC_pin() which tries to
clear remoteIRR bit explicitly, some of the users started to see
"Unable to reset IRR for apic .." messages.
Close look shows that these are related to bogus IO-APIC entries
which returns all 1s for their io-apic registers. And the
above mentioned error messages are benign. But kernel should
have ignored such io-apic's in the first place.
Check if register 0, 1, 2 of the listed io-apic are all 1s and
ignore such io-apic.
[original Linux patch:]
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ioapic_ack=new on the command line.
Currently, if EOI broadcast suppression is advertised on the BSP
LAPIC, Xen will discard any user specified option regarding IO-APIC
ack mode.
This patch introduces a check which prevents EOI Broadcast suppression
from forcing the IO-APIC ack mode to old if the user has explicitly
asked for the new ack mode on the command line.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
| |
CONFIG_SMP is always enabled and !CONFIG_SMP is not supported. So
simplify the code a little by removing all #ifdefs.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having the columns aligned makes for much easier reading. Also remove
the commas which only add to visual clutter in combination with
spaces.
Furthermore, printing fewer characters makes it less likely that the
serial buffer will overflow resulting in loss of critical debugging
information.
Changes since v1:
* Format vector as hex rather than dec
* Contract some names
* destination mode uses 'L' or 'P' instead of full words
* trigger mode uses 'L' or 'E' instead of full words
* delivery mode uses short string instead of a number
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extends create_irq() to take a node parameter, allowing the
resulting IRQ to have its destination set to a CPU on that node right
away, which is more natural than having to post-adjust this (and
get e.g. a new IRQ vector assigned despite a fresh one was just
obtained).
All other callers of create_irq() pass NUMA_NO_NODE for the time being.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following the prevention of vector sharing for MSIs, this change
enforces the same within IO-APICs: Pin based interrupts use the IO-APIC
as their identifying device under the AMD IOMMU (and just like for
MSIs, only the identifying device is used to remap interrupts here,
with no regard to an interrupt's destination).
Additionally, LAPIC initiated EOIs (for level triggered interrupts) too
use only the vector for identifying which interrupts to end. While this
generally causes no significant problem (at worst an interrupt would be
re-raised without a new interrupt event actually having occurred), it
still seems better to avoid the situation.
For this second aspect, a distinction is being made between the
traditional and the directed-EOI cases: In the former, vectors should
not be shared throughout all IO-APICs in the system, while in the
latter case only individual IO-APICs need to be contrained (or, if the
firmware indicates so, sub- groups of them having the same GSI appear
at multiple pins).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than going through all IO-APICs and calling io_apic_eoi_vector()
for the vector in question, just use eoi_IO_APIC_irq().
This in turn allows to eliminate quite a bit of other code.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vector is already being tracked in struct irq_desc's arch.vector
member, so there's no real need for a second place where this to get
stored. The only caveat is that legacy vectors (used for interrupts
handled through the 8259) must be special cased to not prevent non-
legacy vectors from being assigned.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
This includes delaying the initialization of dynamically created IRQs
until their actual first use and some further elimination of uses of
struct irq_cfg.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
| |
... in favor of using the new, nr_cpumask_bits-based ones.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
| |
While doing so, eliminate the use of struct irq_cfg and convert the
CPU mask accessors to the new style ones as far as possible.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
struct irq_cfg really has become an architecture extension to struct
irq_desc, and hence it should be treated as such (rather than as IRQ
chip specific data, which it was meant to be originally).
For a first step, only convert a subset of the uses; subsequent
patches (partly to be sent later) will aim at fully eliminating the
use of the old structure type.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This eliminates passing and returning by value of this type (making it
unnecessary for the compiler to create temporary variables for doing so
on the stack), thus dramatically reducing the stack frame sizes of a
couple of functions (was in one case over 12k for a 4095-CPU build).
In one case a local variable gets converted to a static one, possible
because the function gets called at most once (during early boot).
Some accessors get eliminated altogether as being unused or as being
equally well open coded at the place of use.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
| |
This includes the removal of a redundant memset() from microcode_amd.c.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
| |
The name "nr_ioapic_registers" is wrong and actively misleading. The
variable holds the number of redirection entries for each apic, which
is two registers fewer than the total number of registers.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The IRQ handling code requires pcidevs_lock to be held only for MSI
interrupts.
As the handling of which was now fully moved into msi.c (i.e. while
applying fine without, the patch needs to be applied after the one
titled "x86: split MSI IRQ chip"), io_apic.c now also doesn't need to
include PCI headers anymore.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the .end() accessor having become optional and noting that
several of the accessors' behavior really depends on the result of
msi_maskable_irq(), the splits the MSI IRQ chip type into two - one
for the maskable ones, and the other for the (MSI only) non-maskable
ones.
At once the implementation of those methods gets moved from io_apic.c
to msi.c.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is again because the descriptor is generally more useful (with
the IRQ number being accessible in it if necessary) and going forward
will hopefully allow to remove all direct accesses to the IRQ
descriptor array, in turn making it possible to make this some other,
more efficient data structure.
This additionally makes the .end() accessor optional, noting that in a
number of cases the functions were empty.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
This is because the descriptor is generally more useful (with the IRQ
number being accessible in it if necessary) and going forward will
hopefully allow to remove all direct accesses to the IRQ descriptor
array, in turn making it possible to make this some other, more
efficient data structure.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
| |
This is particularly relevant as the number of CPUs to be supported
increases (as recently happened for the default thereof).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old io_apic_eoi() function using the EOI register only works for
IO-APICs with a version of 0x20. Older IO-APICs do not have an EOI
register so line level interrupts have to be EOI'd by flipping the
mode to edge and back, which clears the IRR and Delivery Status bits.
This patch replaces the current io_apic_eoi() function with one which
takes into account the version of the IO-APIC and EOI's
appropriately.
v2: make recursive call to __io_apic_eoi() to reduce code size.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
| |
Introduce old_vector to irq_cfg with the same principle as
old_cpu_mask. This removes a brute force loop from
__clear_irq_vector(), and paves the way to correct bitrotten logic
elsewhere in the irq code.
Signed-off-by Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
| |
irq_desc.depth is a write only variable.
LEGACY_IRQ_FROM_VECTOR(vec) is never referenced.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When migrating IO-APIC line level interrupts between PCPUs, the
migration code rewrites the IO-APIC entry to point to the new
CPU/Vector before EOI'ing it.
The EOI process says that EOI'ing the Local APIC will cause a
broadcast with the vector number, which the IO-APIC must listen to to
clear the IRR and Status bits.
In the case of migrating, the IO-APIC has already been
reprogrammed so the EOI broadcast with the old vector fails to match
the new vector, leaving the IO-APIC with an outstanding vector,
preventing any more use of that line interrupt. This causes a lockup
especially when your root device is using PCI INTA (megaraid_sas
driver *ehem*)
However, the problem is mostly hidden because send_cleanup_vector()
causes a cleanup of all moving vectors on the current PCPU in such a
way which does not cause the problem, and if the problem has occured,
the writes it makes to the IO-APIC clears the IRR and Status bits
which unlocks the problem.
This fix is distinctly a temporary hack, waiting on a cleanup of the
irq code. It checks for the edge case where we have moved the irq,
and manually EOI's the old vector with the IO-APIC which correctly
clears the IRR and Status bits. Also, it protects the code which
updates irq_cfg by disabling interrupts.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
|
|
| |
such count is useful to assist decision make in cpuidle governor,
while w/o this patch only device interrupts through do_IRQ is
currently counted.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to make sure that cfg->used_vector is only cleared once;
otherwise there may be a race condition that allows the same vector to
be assigned twice, defeating the whole purpose of the map.
This makes two changes:
* __clear_irq_vector() only clears the vector if the irq is not being
moved
* smp_iqr_move_cleanup_interrupt() only clears used_vector if this
is the last place it's being used (move_cleanup_count==0 after
decrement).
Also make use of asserts more consistent, to catch this kind of logic
bug in the future.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was found that in a crash scenario, the remoteIRR bit in an IO-APIC
RTE could be left set, causing problems when bringing up a kdump
kernel. While this generally is most important to be taken care of in
the new kernel (which usually would be a native one), it still seems
desirable to also address this problem in Xen so that (a) the problem
doesn't bite Xen when used as a secondary emergency kernel and (b) an
attempt is being made to save un-fixed secondary kernels from running
into said problem.
Based on a Linux patch from suresh.b.siddha@intel.com.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are used during bootup and (emergency) shutdown only, and their
only purpose is to get the actual IO-APIC's RTE(s) cleared.
Consequently, only the "raw" accessors should be used (and the ones
going through interrupt remapping code can be skipped), with the
exception of determining the delivery mode: This one must always go
through the interrupt remapping path, as in the VT-d case the actual
IO-APIC's RTE will have the delivery mode always set to zero (which
before possibly could have resulted in such an entry getting cleared
in the "raw" pass, though I haven't observed this case in practice).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the respective functions quite a bit more legible.
Since this requires fiddling with __ioapic_{read,write}_entry()
anyway,
make them and their wrappers have their argument types match those of
__io_apic_{read,write}() (int -> unsigned int).
No functional change intended.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
| |
Laying the groundwork for per-device vector maps. This generic
code allows any irq to point to a vector map; all irqs sharing the
same vector map will avoid sharing vectors.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
|
|
|
|
|
|
|
|
| |
Add tracing for various IRQ-related events. Also, move
the exiting TRC_TRACE_IRQ from the "generic" class into the
new TRC_HW_IRQ sub-class.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
This patch moves some more, mostly data, extern declarations into
header files. I haven't been as strict as I was with functions;
in particular there are a number of declarations of assembler labels
that are only used in one place. I've also left a few compat-mode
tricks, and all the magic in symbols.c
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gcc 4.6 complains:
io_apic.c: In function 'restore_IO_APIC_setup':
/build/user-xen_4.1.0-3-amd64-zSon7K/xen-4.1.0/debian/build/build-hypervisor_amd64_amd64/xen/include/asm/io_apic.h:150:26:
error: '*((void *)&entry+4)' may be used uninitialized in this
function [-Werror=uninitialized]
io_apic.c:221:32: note: '*((void *)&entry+4)' was declared
here
/build/user-xen_4.1.0-3-amd64-zSon7K/xen-4.1.0/debian/build/build-hypervisor_amd64_amd64/xen/include/asm/io_apic.h:150:26:
error: 'entry' may be used uninitialized in this function
[-Werror=uninitialized]
io_apic.c:221:32: note: 'entry' was declared here
cc1: all warnings being treated as errors
Add functions to read/write an entire IO APIC entry using an explicit
union to allow gcc to spot the initialisation.
Reported as Debian bug #625438, thanks to Matthias Klose.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
| |
Remove unused and pointless bits from IO-APIC handling code. Move
whatever possible into .init.*, and some data items into
.data.read_mostly. Adjust some types.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
| |
Remove unused and pointless bits from mpparse.c (and other files where
they are related to it). Of what remains, move whatever possible into
.init.*, and some data items into .data.read_mostly.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
| |
This is needed to compile xen with clang.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
| |
... decreasing cache footprint. As a prerequisite this requires making
cmdline_parse() a little more flexible.
Also remove a few variables altogether, and adjust sections
annotations for several others.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir@xen.org>
|