path: root/xen/arch/x86/irq.c
Commit message | Author | Age | Files | Lines
...
* eliminate cpumask accessors referencing NR_CPUSJan Beulich2011-10-211-29/+28
| | | | | | | ... in favor of using the new, nr_cpumask_bits-based ones. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* x86: move generic IRQ code out of io_apic.cJan Beulich2011-10-191-0/+122
| | | | | | | | While doing so, eliminate the use of struct irq_cfg and convert the CPU mask accessors to the new style ones as far as possible. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
* fold struct irq_cfg into struct irq_descJan Beulich2011-10-191-21/+13
| | | | | | | | | | | | | struct irq_cfg really has become an architecture extension to struct irq_desc, and hence it should be treated as such (rather than as IRQ chip specific data, which it was meant to be originally). For a first step, only convert a subset of the uses; subsequent patches (partly to be sent later) will aim at fully eliminating the use of the old structure type. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
* use xzalloc in x86 codeJan Beulich2011-10-041-9/+4
| | | | | | | This includes the removal of a redundant memset() from microcode_amd.c. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
* x86,irq: Clean up __clear_irq_vectorKeir Fraser2011-09-301-11/+25
| | | | | | | | | | | | | | | | | | | Fix and clean up the logic to __clear_irq_vector(). We always need to clear the things related to cfg->vector. If the IRQ is currently in motion, then we need to also clear out things related to cfg->old_vector. This patch reorganizes the function to make the parallels between the two clean-ups more obvious. The main functional change here is with cfg->used_vectors; make sure to clear cfg->vector always (even if !cfg->move_in_progress); if cfg->move_in_progress, clear cfg->old_vector as well. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
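The two-stage clean-up described in this commit message can be sketched as follows. This is a minimal illustration, not the code from irq.c: `struct irq_cfg_sketch`, its helpers, and `VECTOR_UNASSIGNED` are hypothetical stand-ins, and the per-CPU `vector_irq` tables are left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS_SKETCH 256
#define VECTOR_UNASSIGNED (-1)

/* Hypothetical, simplified stand-in for the per-IRQ vector state. */
struct irq_cfg_sketch {
    int vector;            /* vector currently assigned */
    int old_vector;        /* vector being moved away from */
    bool move_in_progress; /* true while a migration is pending */
    uint8_t used_vectors[NR_VECTORS_SKETCH / 8]; /* bitmap of vectors in use */
};

static void set_used(struct irq_cfg_sketch *cfg, int v)
{
    cfg->used_vectors[v / 8] |= (uint8_t)(1u << (v % 8));
}

static void clear_used(struct irq_cfg_sketch *cfg, int v)
{
    cfg->used_vectors[v / 8] &= (uint8_t)~(1u << (v % 8));
}

static bool is_used(const struct irq_cfg_sketch *cfg, int v)
{
    return (cfg->used_vectors[v / 8] >> (v % 8)) & 1u;
}

/* Precondition (assumed): cfg->vector holds a valid vector on entry. */
static void clear_irq_vector_sketch(struct irq_cfg_sketch *cfg)
{
    /* Step 1: always release everything tied to the current vector. */
    clear_used(cfg, cfg->vector);
    cfg->vector = VECTOR_UNASSIGNED;

    /* Step 2: only if a move is pending, release the old vector as well. */
    if (!cfg->move_in_progress)
        return;

    clear_used(cfg, cfg->old_vector);
    cfg->old_vector = VECTOR_UNASSIGNED;
    cfg->move_in_progress = false;
}
```

Writing the two steps with the same shape is what makes the parallel between the current-vector and old-vector clean-ups easy to see.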
* x86: IO-APIC code has no dependency on PCIJan Beulich2011-09-221-3/+4
| | | | | | | | | | The IRQ handling code requires pcidevs_lock to be held only for MSI interrupts. As the handling of those was now fully moved into msi.c (i.e. while applying fine without, the patch needs to be applied after the one titled "x86: split MSI IRQ chip"), io_apic.c now also doesn't need to include PCI headers anymore. Signed-off-by: Jan Beulich <jbeulich@suse.com>
* x86: split MSI IRQ chipJan Beulich2011-09-181-6/+9
| | | | | | | | | | | | | With the .end() accessor having become optional and noting that several of the accessors' behavior really depends on the result of msi_maskable_irq(), this splits the MSI IRQ chip type into two - one for the maskable ones, and the other for the (MSI only) non-maskable ones. At the same time the implementation of those methods gets moved from io_apic.c to msi.c. Signed-off-by: Jan Beulich <jbeulich@suse.com>
* pass struct irq_desc * to all other IRQ accessorsJan Beulich2011-09-181-36/+35
| | | | | | | | | | | | | This is again because the descriptor is generally more useful (with the IRQ number being accessible in it if necessary) and going forward will hopefully allow to remove all direct accesses to the IRQ descriptor array, in turn making it possible to make this some other, more efficient data structure. This additionally makes the .end() accessor optional, noting that in a number of cases the functions were empty. Signed-off-by: Jan Beulich <jbeulich@suse.com>
* pass struct irq_desc * to set_affinity() IRQ accessorsJan Beulich2011-09-181-7/+5
| | | | | | | | | | This is because the descriptor is generally more useful (with the IRQ number being accessible in it if necessary) and going forward will hopefully allow to remove all direct accesses to the IRQ descriptor array, in turn making it possible to make this some other, more efficient data structure. Signed-off-by: Jan Beulich <jbeulich@suse.com>
* convert more literal uses of cpumask_t to pointersJan Beulich2011-09-181-6/+6
| | | | | | | This is particularly relevant as the number of CPUs to be supported increases (as recently happened for the default thereof). Signed-off-by: Jan Beulich <jbeulich@suse.com>
* PCI multi-seg: add new physdevop-sJan Beulich2011-09-181-1/+1
| | | | | | | | | | | | | The new PHYSDEVOP_pci_device_add is intended to be extensible, with a first extension (to pass the proximity domain of a device) added right away. A couple of directly related functions at once get adjusted to account for the segment number. Should we deprecate the PHYSDEVOP_manage_pci_* sub-hypercalls? Signed-off-by: Jan Beulich <jbeulich@suse.com>
* Clear IRQ_GUEST in irq_desc->status when setting action to NULL.Igor Mammedov2011-09-181-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Looking more closely at the usage of the action field in relation to the IRQ_GUEST flag, it appears that a set IRQ_GUEST implies that action is not NULL. As a result it is not safe to set action to NULL and leave IRQ_GUEST set. Hence IRQ_GUEST should be cleared in dynamic_irq_cleanup where action is set to NULL. In addition, remove the BUG_ON at __pirq_guest_unbind that appears to be bogus and not needed anymore. Thanks to Paolo Bonzini for NACKing the previous patch, and pointing at the correct solution. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reinstate the BUG_ON, but after the action==NULL check. Since we then go and start interpreting action as an irq_guest_action_t, the BUG_ON is relevant here. More generally, the brute-force nature of dynamic_irq_cleanup() looks a bit worrying. Possibly there should be more integration with pirq_guest_unbind() logic, for cleaning up un-acked EOIs and the like. Signed-off-by: Keir Fraser <keir@xen.org>
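The invariant stated above (IRQ_GUEST set implies a non-NULL action) can be written down as a small check. This is a sketch under assumptions: `struct irq_desc_sketch`, the flag value `IRQ_GUEST_SKETCH`, and both function names are hypothetical and only mirror the fields the commit message mentions.

```c
#include <stdbool.h>
#include <stddef.h>

#define IRQ_GUEST_SKETCH 0x10u /* hypothetical flag value */

/* Hypothetical, reduced descriptor with only the two fields discussed. */
struct irq_desc_sketch {
    unsigned int status;
    void *action;
};

/* The invariant: IRQ_GUEST set implies action != NULL. */
static bool guest_invariant_holds(const struct irq_desc_sketch *desc)
{
    return !(desc->status & IRQ_GUEST_SKETCH) || desc->action != NULL;
}

/* Clearing action and the flag together keeps the invariant intact. */
static void dynamic_irq_cleanup_sketch(struct irq_desc_sketch *desc)
{
    desc->action = NULL;
    desc->status &= ~IRQ_GUEST_SKETCH;
}
```

Clearing only `action` would leave a descriptor that claims to be guest-bound while having nothing to interpret as an irq_guest_action_t, which is the state the reinstated BUG_ON guards against.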
* xen: if mapping GSIs we run out of pirq < nr_irqs_gsi, use the othersStefano Stabellini2011-09-131-9/+6
| | | | | | | | | PV on HVM guests can have more GSIs than the host, in that case we could run out of pirq < nr_irqs_gsi. When that happens use pirq >= nr_irqs_gsi rather than returning an error. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Benjamin Schweikert <b.schweikert@googlemail.com>
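The fallback described here amounts to a two-range search. The sketch below assumes a plain boolean in-use array and hypothetical names (`find_free_in_range`, `alloc_gsi_pirq_sketch`, `NO_FREE_PIRQ`); the search order inside each range is an illustration choice, not something the commit message specifies.

```c
#include <stdbool.h>

#define NO_FREE_PIRQ (-1)

/* Hypothetical helper: lowest free pirq in [lo, hi), or NO_FREE_PIRQ. */
static int find_free_in_range(const bool *in_use, int lo, int hi)
{
    for (int pirq = lo; pirq < hi; pirq++)
        if (!in_use[pirq])
            return pirq;
    return NO_FREE_PIRQ;
}

/*
 * Prefer a pirq below nr_irqs_gsi; if that range is exhausted, fall back
 * to the range at or above nr_irqs_gsi instead of reporting an error.
 */
static int alloc_gsi_pirq_sketch(const bool *in_use, int nr_irqs_gsi,
                                 int nr_pirqs)
{
    int pirq = find_free_in_range(in_use, 0, nr_irqs_gsi);

    if (pirq != NO_FREE_PIRQ)
        return pirq;

    return find_free_in_range(in_use, nr_irqs_gsi, nr_pirqs);
}
```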
* IRQ: Introduce old_vector to irq_cfgAndrew Cooper2011-09-051-11/+7
| | | | | | | Introduce old_vector to irq_cfg with the same principle as old_cpu_mask. This removes a brute force loop from __clear_irq_vector(), and paves the way to correct bitrotten logic elsewhere in the irq code. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
* IRQ: Fold irq_status into irq_cfgAndrew Cooper2011-09-051-21/+7
| | | | | | irq_status is an int for each of nr_irqs which represents a single boolean variable. Fold it into the bitfield in irq_cfg, which saves 768 bytes per CPU with per-cpu IDTs in use. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
* IRQ: Remove bit-rotten codeAndrew Cooper2011-09-051-6/+0
| | | | | | | | irq_desc.depth is a write only variable. LEGACY_IRQ_FROM_VECTOR(vec) is never referenced. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
* xen: Add global irq_vector_map option, set if using AMD global intremap tablesGeorge Dunlap2011-09-051-11/+103
| | | | | | | | | | | | | | | | | | | | | | | | | As mentioned in previous changesets, AMD IOMMU interrupt remapping tables only look at the vector, not the destination id of an interrupt. This means that all IRQs going through the same interrupt remapping table need to *not* share vectors. The irq "vector map" functionality was originally introduced after a patch which disabled global AMD IOMMUs entirely. That patch has since been reverted, meaning that AMD intremap tables can either be per-device or global. This patch therefore introduces a global irq vector map option, and enables it if we're using an AMD IOMMU with a global interrupt remapping table. This patch removes the "irq-perdev-vector-map" boolean command-line option and replaces it with "irq_vector_map", which can have one of three values: none, global, or per-device. Setting the irq_vector_map to any value will override the default that the AMD code sets. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
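A three-valued option with an overridable default might be handled along these lines. This is a sketch, not Xen's command-line code: the enum, both function names, and the rule that the AMD default picks per-device when the table is not global are assumptions made for illustration. Only the three accepted strings come from the commit message.

```c
#include <stdbool.h>
#include <string.h>

enum vector_map_mode {
    VECTOR_MAP_DEFAULT, /* nothing given on the command line */
    VECTOR_MAP_NONE,
    VECTOR_MAP_GLOBAL,
    VECTOR_MAP_PERDEV
};

/* Parse the option value; returns 0 on success, -1 if unrecognised. */
static int parse_irq_vector_map_sketch(const char *s,
                                       enum vector_map_mode *mode)
{
    if (strcmp(s, "none") == 0)
        *mode = VECTOR_MAP_NONE;
    else if (strcmp(s, "global") == 0)
        *mode = VECTOR_MAP_GLOBAL;
    else if (strcmp(s, "per-device") == 0)
        *mode = VECTOR_MAP_PERDEV;
    else
        return -1;
    return 0;
}

/*
 * Assumed behaviour of the IOMMU side: only choose a mode if the user
 * did not, so that any explicit setting overrides the default.
 */
static void amd_apply_default_sketch(enum vector_map_mode *mode,
                                     bool global_intremap)
{
    if (*mode != VECTOR_MAP_DEFAULT)
        return;
    *mode = global_intremap ? VECTOR_MAP_GLOBAL : VECTOR_MAP_PERDEV;
}
```

Keeping a distinct "default" state is what lets an explicit `none` survive the IOMMU set-up instead of being silently replaced.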
* xen: get_free_pirq: make sure that the returned pirq is allocatedStefano Stabellini2011-08-311-0/+6
| | | | Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* xen: fix hvm_domain_use_pirq's behaviorStefano Stabellini2011-08-311-2/+1
| | | | | | | | | | | hvm_domain_use_pirq should return true when the guest is using a certain pirq, no matter if the corresponding event channel is currently enabled or disabled. As an additional complication, qemu is going to request pirqs for passthrough devices even for Xen unaware HVM guests, so we need to wait for an event channel to be connected before considering the pirq of a passthrough device as "in use". Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* IRQ: manually EOI migrating line interruptsAndrew Cooper2011-08-311-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When migrating IO-APIC line level interrupts between PCPUs, the migration code rewrites the IO-APIC entry to point to the new CPU/Vector before EOI'ing it. The EOI process says that EOI'ing the Local APIC will cause a broadcast with the vector number, which the IO-APIC must listen to to clear the IRR and Status bits. In the case of migrating, the IO-APIC has already been reprogrammed so the EOI broadcast with the old vector fails to match the new vector, leaving the IO-APIC with an outstanding vector, preventing any more use of that line interrupt. This causes a lockup especially when your root device is using PCI INTA (megaraid_sas driver *ahem*). However, the problem is mostly hidden because send_cleanup_vector() causes a cleanup of all moving vectors on the current PCPU in a way that does not cause the problem, and if the problem has occurred, the writes it makes to the IO-APIC clear the IRR and Status bits, which unlocks the problem. This fix is distinctly a temporary hack, waiting on a cleanup of the irq code. It checks for the edge case where we have moved the irq, and manually EOI's the old vector with the IO-APIC which correctly clears the IRR and Status bits. Also, it protects the code which updates irq_cfg by disabling interrupts. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
* x86: drop unused parameter from msi_compose_msg() and setup_msi_irq()Jan Beulich2011-08-271-1/+1
| | | | | | This particularly eliminates the bogus passing of NULL by hpet.c. Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: Fix up irq vector map logicGeorge Dunlap2011-08-221-3/+10
| | | | | | | | | | | | | | | | | | We need to make sure that cfg->used_vector is only cleared once; otherwise there may be a race condition that allows the same vector to be assigned twice, defeating the whole purpose of the map. This makes two changes: * __clear_irq_vector() only clears the vector if the irq is not being moved * smp_irq_move_cleanup_interrupt() only clears used_vector if this is the last place it's being used (move_cleanup_count==0 after decrement). Also make use of asserts more consistent, to catch this kind of logic bug in the future. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
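The "last user clears it" rule from the second change can be sketched as a counted release. The struct and function below are hypothetical and single-threaded; in the hypervisor the equivalent update would need to happen under the appropriate lock.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of the state tracked during a move. */
struct move_state_sketch {
    unsigned int move_cleanup_count; /* CPUs that still have to clean up */
    bool old_vector_in_map;          /* old vector still marked as used */
};

/*
 * Called once per CPU that finishes its clean-up. Only the call that
 * drops the count to zero releases the old vector, so the map entry is
 * cleared exactly once. Returns true if this call did the release.
 */
static bool move_cleanup_one_cpu_sketch(struct move_state_sketch *st)
{
    if (st->move_cleanup_count == 0)
        return false; /* nothing pending; never clear a second time */

    if (--st->move_cleanup_count != 0)
        return false; /* other CPUs still reference the old vector */

    st->old_vector_in_map = false;
    return true;
}
```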
* x86: IRQ fix incorrect logic in __clear_irq_vectorAndrew Cooper2011-08-131-0/+1
| | | | | | | | | | | | | | In the old code, tmp_mask is the cpu_and of cfg->cpu_mask and cpu_online_map. However, in the usual case of moving an IRQ from one PCPU to another because the scheduler decides it's a good idea, cfg->cpu_mask and cfg->old_cpu_mask do not intersect. This causes the old cpu vector_irq table to keep the irq reference when it shouldn't. This leads to a resource leak if a domain is shut down while an irq has a move pending, which results in Xen's create_irq() eventually failing with -ENOSPC when all vector_irq tables are full of stale references. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
* xen: AMD IOMMU: Automatically enable per-device vector mapsGeorge Dunlap2011-07-261-0/+1
| | | | | | | Automatically enable per-device vector maps when using IOMMU, unless disabled specifically by an IOMMU parameter. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
* xen: Option to allow per-device vector maps for MSI IRQsGeorge Dunlap2011-07-261-0/+6
| | | | | | | | | | | | Add a vector-map to pci_dev, and add an option to point MSI-related IRQs to the vector-map of the device. This prevents irqs from the same device from being assigned the same vector on different pcpus. This is required for systems using an AMD IOMMU, since the intremap tables on AMD only look at vector, and not destination ID. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
* xen: Infrastructure to allow irqs to share vector mapsGeorge Dunlap2011-07-261-0/+16
| | | | | | | | Laying the groundwork for per-device vector maps. This generic code allows any irq to point to a vector map; all irqs sharing the same vector map will avoid sharing vectors. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
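The idea that irqs pointing at the same map never receive the same vector can be illustrated with a shared bitmap. Everything here is a hypothetical sketch (`struct vector_map_sketch`, `assign_vector_sketch`, the lowest-free search); it ignores per-CPU vector availability, which the real allocator also has to consider.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VECTORS_SKETCH 256

/* Hypothetical vector map: one bit per vector already handed out. */
struct vector_map_sketch {
    uint8_t bits[NR_VECTORS_SKETCH / 8];
};

static bool map_test(const struct vector_map_sketch *map, int v)
{
    return (map->bits[v / 8] >> (v % 8)) & 1u;
}

static void map_set(struct vector_map_sketch *map, int v)
{
    map->bits[v / 8] |= (uint8_t)(1u << (v % 8));
}

/*
 * Pick the lowest vector in [first, last] that is not yet marked in the
 * map the irq points to, and mark it. A NULL map means the irq does not
 * take part in any sharing restriction. Returns -1 if none is free.
 */
static int assign_vector_sketch(struct vector_map_sketch *map,
                                int first, int last)
{
    for (int v = first; v <= last; v++) {
        if (map != NULL && map_test(map, v))
            continue;
        if (map != NULL)
            map_set(map, v);
        return v;
    }
    return -1;
}
```

Two irqs given the same map end up on different vectors, while irqs with different maps (or none) are free to reuse a vector.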
* xentrace: reduce size of extradata in trace_irq_mask()Olaf Hering2011-07-161-4/+8
| | | | | | | | | | | | | | | | | | | Reduce size of extra_data to avoid possible crash in trace_var. (XEN) Assertion 'extra_word <= TRACE_EXTRA_MAX' failed at trace.c:687 (XEN) Xen call trace: (XEN) [<ffff82c480128783>] __trace_var+0x4d/0x3b8 (XEN) [<ffff82c480162172>] trace_irq_mask+0x49/0x4b (XEN) [<ffff82c4801631ae>] __assign_irq_vector+0x241/0x374 (XEN) [<ffff82c48015d813>] set_desc_affinity+0x5d/0xd4 (XEN) [<ffff82c480160708>] set_msi_affinity+0x44/0x1c1 (XEN) [<ffff82c480162938>] move_masked_irq+0x9c/0xcd (XEN) [<ffff82c4801629a7>] move_native_irq+0x3e/0x53 (XEN) [<ffff82c48015d969>] ack_msi_irq+0x2c/0x6e (XEN) [<ffff82c4801622e3>] do_IRQ+0x16f/0x66d Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
* Remove MSI IRQ storms prevention logicShan Haitao2011-07-161-67/+8
| | | | | | | | | | The reason is: 1. The logic has negative impact on 10G NIC performance (assigned to guest) by lowering the interrupt frequency that Xen can handle. 2. Xen already has IRQ rate limit logic, which can also help to prevent IRQ storms. Signed-off-by: Shan Haitao <haitao.shan@intel.com>
* x86: remove the domain parameter from the guest EOI functions.Jan Beulich2011-07-011-4/+4
| | | | Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: make domain_spin_lock_irq_desc() a wrapper of pirq_spin_lock_irq_desc()Jan Beulich2011-07-011-23/+7
| | | | | | ...and drop the now unused struct domain * parameter of the latter. Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: adjust pirq_spin_lock_irq_desc()Jan Beulich2011-07-011-11/+12
| | | | | | | Remove unnecessary/bogus assertions and add retry loop matching domain_spin_lock_irq_desc(). Signed-off-by: Jan Beulich <jbeulich@novell.com>
* xentrace: Add tracing for IRQ-related eventsGeorge Dunlap2011-07-011-2/+18
| | | | | | Add tracing for various IRQ-related events. Also, move the existing TRC_TRACE_IRQ from the "generic" class into the new TRC_HW_IRQ sub-class. Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
* replace d->nr_pirqs sized arrays with radix treeJan Beulich2011-06-231-105/+199
| | | | | | | | | | | | | | | With this it is questionable whether retaining struct domain's nr_pirqs is actually necessary - the value now only serves for bounds checking, and this boundary could easily be nr_irqs. Note that ia64, the build of which is broken currently anyway, is only being partially fixed up. v2: adjustments for split setup/teardown of translation data v3: re-sync with radix tree implementation changes Signed-off-by: Jan Beulich <jbeulich@novell.com>
* pv-on-hvm: hvm_domain_use_pirq return positive no matter if the evtchn is boundStefano Stabellini2011-06-161-7/+1
| | | | | | | | This patch fixes PV on HVM interrupt remapping with recent Linux kernels and upstream qemu. hvm_domain_use_pirq should return positive even if the evtchn is not currently bound. If it doesn't, assert_irq ends up injecting legacy interrupts even after the guest disabled the irq. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* xen: remove extern function declarations from C files.Tim Deegan2011-05-261-2/+0
| | | | | | | | Move all extern declarations into appropriate header files. This also fixes up a few places where the caller and the definition had different signatures. Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
* x86: Check for valid pirq values in hvm_domain_use_pirqStefano Stabellini2011-05-121-1/+1
| | | | Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* x86: replace nr_irqs sized per-domain arrays with radix treesJan Beulich2011-05-091-15/+100
| | | | | | | | | | | | | | | | | | | | | | It would seem possible to fold the two trees into one (making e.g. the emuirq bits stored in the upper half of the pointer), but I'm not certain that's worth it as it would make deletion of entries more cumbersome. Unless pirq-s and emuirq-s were mutually exclusive... v2: Split setup/teardown into two stages - (de-)allocation (tree node (de-)population) is done with just d->event_lock held (and hence interrupts enabled), while actual insertion/removal of translation data gets done with irq_desc's lock held (and interrupts disabled). Signed-off-by: Jan Beulich <jbeulich@novell.com> Fix up for new radix-tree implementation. In particular, we should never insert NULL into a radix tree, as that means empty slot (which can be reclaimed during a deletion). Make use of radix_tree_int_to_ptr() (and its inverse) to hide some of these details. Signed-off-by: Keir Fraser <keir@xen.org>
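Since NULL marks an empty slot, an integer such as 0 cannot be stored by casting it straight to a pointer. One way to avoid that is to tag the encoded value so it is never all-zero. The helpers below are a hypothetical sketch of that idea, with assumed names and an assumed tag of 0x2 in the low two bits; the source does not show how radix_tree_int_to_ptr() itself is implemented. The sketch also assumes non-negative values small enough to survive the two-bit shift.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode a non-negative integer as a slot pointer. The value is shifted
 * left by two and tagged with 0x2, so even 0 becomes a non-NULL pointer.
 */
static void *int_to_slot_sketch(int val)
{
    return (void *)(((uintptr_t)(unsigned int)val << 2) | 0x2u);
}

/* Inverse of int_to_slot_sketch(): drop the tag bits again. */
static int slot_to_int_sketch(void *slot)
{
    return (int)((uintptr_t)slot >> 2);
}
```

With an encoding like this, a stored 0 stays distinguishable from a slot that was never populated.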
* Revert 23295:4891f1f41ba5 and 23296:24346f749826Keir Fraser2011-05-021-256/+102
| | | | | | Fails current lock checking mechanism in spinlock.c in debug=y builds. Signed-off-by: Keir Fraser <keir@xen.org>
* replace d->nr_pirqs sized arrays with radix treeJan Beulich2011-05-011-93/+175
| | | | | | | | | | | | | | | With this it is questionable whether retaining struct domain's nr_pirqs is actually necessary - the value now only serves for bounds checking, and this boundary could easily be nr_irqs. Another thing to consider is whether it's worth storing the pirq number in struct pirq, to avoid passing the number and a pointer to quite a number of functions. Note that ia64, the build of which is broken currently anyway, is only partially fixed up. Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: replace nr_irqs sized per-domain arrays with radix treesJan Beulich2011-05-011-16/+88
| | | | | | | | | It would seem possible to fold the two trees into one (making e.g. the emuirq bits stored in the upper half of the pointer), but I'm not certain that's worth it as it would make deletion of entries more cumbersome. Unless pirq-s and emuirq-s were mutually exclusive... Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: split struct domainJan Beulich2011-04-051-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is accomplished by converting a couple of embedded arrays (in one case a structure containing an array) into separately allocated pointers, and (just as for struct arch_vcpu in a prior patch) overlaying some PV-only fields with HVM-only ones. One particularly noteworthy change in the opposite direction is that of PITState - this field so far lived in the HVM-only portion, but is being used by PV guests too, and hence needed to be moved out of struct hvm_domain. The change to XENMEM_set_memory_map (and hence libxl__build_pre() and the movement of the E820 related pieces to struct pv_domain) are subject to a positive response to a query sent to xen-devel regarding the need for this to happen for HVM guests (see http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01848.html). The protection of arch.hvm_domain.irq.dpci accesses by is_hvm_domain() is subject to confirmation that the field is used for HVM guests only (see http://lists.xensource.com/archives/html/xen-devel/2011-03/msg02004.html). In the absence of any reply to these queries, and given the early state of 4.2 development, I think it should be acceptable to take the risk of having to later undo/redo some of this. Signed-off-by: Jan Beulich <jbeulich@novell.com>
* move setup_irq() into .init.textJan Beulich2011-04-021-2/+2
| | | | | | | With no modular drivers, all interrupt setup is supposed to happen during boot. Signed-off-by: Jan Beulich <jbeulich@novell.com>
* move request_irq() into .init.textJan Beulich2011-04-021-1/+1
| | | | | | | With no modular drivers, all interrupt setup is supposed to happen during boot. Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: __pirq_guest_eoi() must check it is called for a fullyKeir Fraser2011-03-261-0/+6
| | | | | | guest-bound irq before accessing desc->action. Signed-off-by: Keir Fraser <keir@xen.org>
* move various bits into .init.* sectionsJan Beulich2011-03-091-3/+3
| | | | | | | | | | This also includes the removal of some entirely unused functions. The patch builds upon the makefile adjustments done in the earlier sent patch titled "move more kernel decompression bits to .init.* sections". Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: IO-APIC cleanupJan Beulich2011-03-091-1/+0
| | | | | | | | Remove unused and pointless bits from IO-APIC handling code. Move whatever possible into .init.*, and some data items into .data.read_mostly. Adjust some types. Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86: Fix pirq teardown on domain destruction.Wei Gang2011-01-261-3/+0
| | | | | | | | | The privilege check in unmap_domain_pirq() fails since the teardown completes in RCU (idle domain) context. We can remove the check since it is covered in physdev_op() already, which is the only potentially unprivileged caller. Signed-off-by: Wei Gang <gang.wei@intel.com>
* Use bool_t for various boolean variablesKeir Fraser2010-12-241-1/+1
| | | | | | | | | | | ... decreasing cache footprint. As a prerequisite this requires making cmdline_parse() a little more flexible. Also remove a few variables altogether, and adjust sections annotations for several others. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Keir Fraser <keir@xen.org>
* x86: adjust other interrupt related section placementKeir Fraser2010-12-151-4/+2
| | | | | | | ... and remove some variables the value of which is never used altogether. Signed-off-by: Jan Beulich <jbeulich@novell.com>
* x86/IRQ: pass CPU masks by reference rather than by value in more placesKeir Fraser2010-12-011-12/+11
| | | | | | Additionally simplify operations on them in a few cases. Signed-off-by: Jan Beulich <jbeulich@novell.com>