| Commit message | Author | Age | Files | Lines |
| |
* Strip trailing whitespace
* Remove redundant definitions
* Update stale documentation links
* Move hpet_address into __initdata
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
| |
The irqflags parameter appears to be an unused vestigial parameter right from
the integration of the IOMMU code in 2007. The parameter is 0 at all
callsites and never used.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
| |
With the need to allocate multiple contiguous IRTEs for multi-vector
MSI, the chance of failure here increases. While on the AMD side
there's no allocation of IRTEs at present at all (and hence no way for
this allocation to fail, which is going to change with a later patch in
this series), VT-d currently ignores a possible error here, which this
patch fixes.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: "Zhang, Xiantao" <xiantao.zhang@intel.com>
| |
Commit 2d8a282 ("x86/HPET: fix FSB interrupt masking") may cause the
HPET event to occur while its interrupt is masked. In that case we need
to "manually" deliver the event.
Unfortunately this requires the locking to be changed: For one, it was
always bogus for handle_hpet_broadcast() to use spin_unlock_irq() - the
function is being called from an interrupt handler, and hence shouldn't
blindly re-enable interrupts (this should be left up to the generic
interrupt handling code). And with the event handler wanting to acquire
the lock for two of its code regions, we must not enter it with the
lock already held. Hence move the locking into
hpet_{attach,detach}_channel(), permitting the lock to be dropped by
set_channel_irq_affinity() (which is a tail call of those functions).
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: Keir Fraser <keir@xen.org>
| |
While being unable to reproduce the "No irq handler for vector ..."
messages observed on other systems, the change done by 5dc3fd2 ('x86:
extend diagnostics for "No irq handler for vector" messages') appears
to point at the lack of masking - at least I can't see what else might
be wrong with the HPET MSI code that could trigger these warnings.
While at it, also adjust the message printed by aforementioned commit
to not pointlessly insert spaces - we don't need aligned tabular output
here.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
- force use of physical APIC mode if indicated so (as we don't support
xAPIC cluster mode, the respective flag is taken to force physical
mode too)
- don't use MSI if indicated so (implies no IOMMU)
Both can be overridden on the command line; for the MSI case this also
adds a new command line option for turning off PCI MSI (IOMMU and HPET
are unaffected by this).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
HPET_TN_FSB is not really suitable for masking interrupts - it merely
switches between the two delivery methods. The right way of masking is
through the HPET_TN_ENABLE bit (which really is an interrupt enable,
not a counter enable or some such). This is even more so with certain
chip sets not even allowing HPET_TN_FSB to be cleared on some of the
channels.
Further, all the setup of the channel should happen before actually
enabling the interrupt, which requires splitting legacy and FSB logic.
Finally this also fixes an S3 resume problem (HPET_TN_FSB did not get
set in hpet_broadcast_resume(), and hpet_msi_unmask() doesn't get
called from the general resume code either afaict).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
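The distinction drawn above between the delivery-method bit and the interrupt-enable bit can be sketched as follows. This is a minimal illustration, not the code from the patch: the bit values follow the HPET specification's per-timer configuration register, while `struct demo_channel` and the `demo_*` helpers are hypothetical stand-ins for the real memory-mapped register accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-timer configuration register bits (HPET specification). */
#define HPET_TN_ENABLE 0x0004u /* Tn_INT_ENB_CNF: interrupt enable */
#define HPET_TN_FSB    0x4000u /* Tn_FSB_EN_CNF: FSB (MSI-style) delivery */

/* Hypothetical stand-in for one channel's config register. */
struct demo_channel {
    uint32_t cfg;
};

/* Select FSB delivery while the interrupt is still disabled. */
static void demo_setup_fsb(struct demo_channel *ch)
{
    assert(ch != NULL);
    ch->cfg &= ~HPET_TN_ENABLE;
    ch->cfg |= HPET_TN_FSB;
}

/* Masking touches only the enable bit; the delivery method stays put. */
static void demo_mask(struct demo_channel *ch)
{
    assert(ch != NULL);
    ch->cfg &= ~HPET_TN_ENABLE;
}

static void demo_unmask(struct demo_channel *ch)
{
    assert(ch != NULL);
    ch->cfg |= HPET_TN_ENABLE;
}
```

Calling demo_setup_fsb() before demo_unmask() mirrors the ordering the message asks for: all channel setup happens before the interrupt is enabled.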
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
Rather than spending measurable amounts of time reading back the most
recently written message, cache it in space previously unused, and thus
accelerate the CPU's entering of the intended C-state.
hpet_msi_read() ends up being unused after this change, but rather than
removing the function, it is marked "unused" instead - that way
it can easily get used again should a new need for it arise.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
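The caching idea described above can be sketched roughly as below. All names here (`struct demo_msi_msg`, `struct demo_channel`, the `demo_*` functions) are hypothetical; in particular the `hw` member only simulates device registers, and the read counter exists purely to make the avoided read-back visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSI message layout, for illustration only. */
struct demo_msi_msg {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

struct demo_channel {
    struct demo_msi_msg hw;     /* simulates the (slow) device registers */
    struct demo_msi_msg cached; /* software copy of the last message written */
    unsigned int hw_reads;      /* counts simulated register read-backs */
};

/* Every write updates both the device and the cached copy. */
static void demo_msi_write(struct demo_channel *ch,
                           const struct demo_msi_msg *msg)
{
    assert(ch != NULL && msg != NULL);
    ch->hw = *msg;
    ch->cached = *msg;
}

/* Fast path: return the cached message without touching the device. */
static struct demo_msi_msg demo_msi_get(const struct demo_channel *ch)
{
    assert(ch != NULL);
    return ch->cached;
}

/* Slow path kept around in case a real read-back is ever needed again. */
static struct demo_msi_msg demo_msi_read_hw(struct demo_channel *ch)
{
    assert(ch != NULL);
    ch->hw_reads++;
    return ch->hw;
}
```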
| |
This requires some additions to the VT-d side; AMD IOMMUs use the
"normal" MSI message format even when interrupt remapping is enabled,
thus making adjustments here unnecessary.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
| |
The IRQ descriptor lock should be held while adjusting the affinity of
any IRQ; the HPET channel lock alone isn't sufficient, namely to protect
against races with moving the IRQ to a different CPU.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
When there are more FSB delivery capable HPET channels than CPU cores
(or threads), we can simply use a dedicated channel per CPU. This
avoids wasting the resources to handle the excess channels (including
the pointless triggering of the respective interrupt on each
wraparound) as well as the ping-pong of the interrupts' affinities
(when getting assigned to different CPUs).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
We shouldn't clear HPET_TN_FSB right after we (indirectly, via
request_irq()) enabled it for the channels we intend to use for
broadcasts.
This fixes a regression introduced by c/s 25103:0b0e42dc4f0a.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
lapic_timer_{on,off} need to get initialized in this case. This in turn
requires getting HPET broadcast setup to be carried out earlier (and
hence preventing double initialization there).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
Leaving certain bits set when being started from an environment where
the HPET was already in use can affect functionality. Clear those bits
to be on the safe side.
We should also consider ignoring the HPET altogether if any reserved
bits are found to be set.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
... by the call to hpet_disable() added in the immediately preceding
patch.
In order to retain the behavior intended by c/s 23776:0ddb4481f883,
implement one of the alternative options pointed out there: remove CPUs
from the online map in __stop_this_cpu() (and hence doing so in
stop_this_cpu() is no longer needed).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
Linux up to now is not smart enough to properly clear the HPET when it
boots, which is particularly a problem when a kdump attempt from
running under Xen is being made. Linux itself added code to work around
this to its shutdown paths quite some time ago, so let's do something
similar in Xen: Save the configuration register settings during boot,
and restore them during shutdown. This should cover the majority of
cases where the secondary kernel might not come up because timer
interrupts don't work.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
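The save-at-boot, restore-at-shutdown approach described above might look roughly like this. It is a sketch under stated assumptions: `demo_hw` simulates the configuration registers with a plain array, and the register count and all `demo_*` names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NUM_REGS 4 /* hypothetical number of config registers */

/* Simulated configuration registers (real code would use MMIO). */
static uint32_t demo_hw[DEMO_NUM_REGS];

/* Snapshot of what firmware (or a previous kernel) left behind. */
static uint32_t demo_boot_cfg[DEMO_NUM_REGS];
static int demo_saved;

/* Called once early during boot, before the registers get modified. */
static void demo_save_at_boot(void)
{
    memcpy(demo_boot_cfg, demo_hw, sizeof(demo_boot_cfg));
    demo_saved = 1;
}

/* Called on the shutdown/kexec path so the next kernel sees sane state. */
static void demo_restore_at_shutdown(void)
{
    assert(demo_saved);
    memcpy(demo_hw, demo_boot_cfg, sizeof(demo_hw));
}
```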
| |
There's no need for a lock here, elimination of which makes the
function a leaf one, thus allowing for better (and smaller) code.
Further, use the variable next_channel according to its name - so far
it represented the most recently used channel rather than the next one
to use.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
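One way to pick channels round-robin without a lock, with the variable holding the next channel to use as described above, is a compare-and-swap loop. The message does not say which primitive the patch uses, so the C11 atomics below are an assumption, and `demo_next_channel` and `demo_pick_channel()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Index of the channel to hand out next (not the one last used). */
static atomic_uint demo_next_channel;

/*
 * Return the channel to use now and advance the shared index, wrapping
 * at num_channels. No lock is taken; a failed compare-and-swap reloads
 * the current value into 'cur' and the loop simply retries.
 */
static unsigned int demo_pick_channel(unsigned int num_channels)
{
    unsigned int cur, next;

    assert(num_channels > 0);
    cur = atomic_load(&demo_next_channel);
    do {
        next = (cur + 1 == num_channels) ? 0 : cur + 1;
    } while (!atomic_compare_exchange_weak(&demo_next_channel, &cur, next));

    return cur;
}
```

Because the function calls nothing that needs a stack frame for locking, a compiler is free to treat it as a leaf function, which is the code-size benefit the message mentions.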
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
This extends create_irq() to take a node parameter, allowing the
resulting IRQ to have its destination set to a CPU on that node right
away, which is more natural than having to post-adjust this (and
e.g. getting a new IRQ vector assigned even though a fresh one was just
obtained).
All other callers of create_irq() pass NUMA_NO_NODE for the time being.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
This includes the conversion from for_each_cpu_mask() to for_each_cpu().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
| |
hpet_fsb_cap_lookup(), if it doesn't find any FSB capable timer,
leaves hpet_events allocated, while hpet_events->cpumask may not have
been. As we're pretty generous with these one-time allocations already
(in that hpet_events doesn't get freed when no usable counters were
found, even if in that case only the first array entry [or none at
all] may get used), simply make the cpumask allocation in the legacy
case independent of whether hpet_events was NULL before.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Christoph Egger <Christoph.Egger@amd.com>
Acked-by: Christoph Egger <Christoph.Egger@amd.com>
Committed-by: Keir Fraser <keir@xen.org>
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
... in favor of using the new, nr_cpumask_bits-based ones.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
struct irq_cfg really has become an architecture extension to struct
irq_desc, and hence it should be treated as such (rather than as IRQ
chip specific data, which it was meant to be originally).
For a first step, only convert a subset of the uses; subsequent
patches (partly to be sent later) will aim at fully eliminating the
use of the old structure type.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
| |
This includes the removal of a redundant memset() from microcode_amd.c.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
| |
With the .end() accessor having become optional and noting that
several of the accessors' behavior really depends on the result of
msi_maskable_irq(), this splits the MSI IRQ chip type into two - one
for the maskable ones, and the other for the (MSI only) non-maskable
ones.
At the same time the implementation of those methods gets moved from
io_apic.c to msi.c.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
| |
This is again because the descriptor is generally more useful (with
the IRQ number being accessible in it if necessary) and going forward
will hopefully allow removing all direct accesses to the IRQ
descriptor array, in turn making it possible to make this some other,
more efficient data structure.
This additionally makes the .end() accessor optional, noting that in a
number of cases the functions were empty.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
| |
This is because the descriptor is generally more useful (with the IRQ
number being accessible in it if necessary) and going forward will
hopefully allow removing all direct accesses to the IRQ descriptor
array, in turn making it possible to make this some other, more
efficient data structure.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
| |
This is particularly relevant as the number of CPUs to be supported
increases (as recently happened for the default thereof).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
| |
When migrating IO-APIC line level interrupts between PCPUs, the
migration code rewrites the IO-APIC entry to point to the new
CPU/Vector before EOI'ing it.
The EOI process says that EOI'ing the Local APIC will cause a
broadcast with the vector number, which the IO-APIC must listen to to
clear the IRR and Status bits.
In the case of migrating, the IO-APIC has already been
reprogrammed so the EOI broadcast with the old vector fails to match
the new vector, leaving the IO-APIC with an outstanding vector,
preventing any more use of that line interrupt. This causes a lockup
especially when your root device is using PCI INTA (megaraid_sas
driver *ehem*)
However, the problem is mostly hidden because send_cleanup_vector()
causes a cleanup of all moving vectors on the current PCPU in a way
that does not cause the problem, and if the problem has occurred,
the writes it makes to the IO-APIC clear the IRR and Status bits,
which unlocks the problem.
This fix is distinctly a temporary hack, waiting on a cleanup of the
irq code. It checks for the edge case where we have moved the irq,
and manually EOI's the old vector with the IO-APIC which correctly
clears the IRR and Status bits. Also, it protects the code which
updates irq_cfg by disabling interrupts.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
| |
This particularly eliminates the bogus passing of NULL by hpet.c.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
| |
Since RTC/CMOS accesses aren't atomic, there are possible races
between code paths setting the index register and subsequently
reading/writing the data register. This is supposed to be dealt with
by acquiring rtc_lock, but two places up to now lacked respective
synchronization: Accesses to the EFI time functions and
smpboot_{setup,restore}_warm_reset_vector().
This in turn requires no longer directly passing through guest writes
to the index register, but instead using a mechanism similar to that
for PCI config space method 1 accesses.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
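The idea of latching guest index writes and only touching the hardware index register under the lock, in the style of PCI config space method 1, can be sketched as follows. Everything here is simulated and hypothetical: `demo_cmos`, `demo_hw_index`, the spinlock built on `atomic_flag`, and `struct demo_guest` are illustrative stand-ins, not the actual Xen data structures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CMOS_SIZE 128

static uint8_t demo_cmos[DEMO_CMOS_SIZE]; /* simulated CMOS contents */
static uint8_t demo_hw_index;             /* simulated index register */
static atomic_flag demo_rtc_lock = ATOMIC_FLAG_INIT;

/* Per-guest latch: index writes land here, not in the hardware. */
struct demo_guest {
    uint8_t latched_index;
};

static void demo_lock(void)
{
    while (atomic_flag_test_and_set(&demo_rtc_lock))
        ; /* spin until the lock is free */
}

static void demo_unlock(void)
{
    atomic_flag_clear(&demo_rtc_lock);
}

/* Guest write to the index port: only remember the value. */
static void demo_guest_write_index(struct demo_guest *g, uint8_t val)
{
    assert(g != NULL);
    g->latched_index = val % DEMO_CMOS_SIZE; /* keep the demo in range */
}

/* Guest read of the data port: index and data access form one unit. */
static uint8_t demo_guest_read_data(const struct demo_guest *g)
{
    uint8_t val;

    assert(g != NULL);
    demo_lock();
    demo_hw_index = g->latched_index;
    val = demo_cmos[demo_hw_index];
    demo_unlock();
    return val;
}
```

With the index kept per guest, two guests interleaving their index writes can no longer redirect each other's data accesses, and no hypervisor-side index/data pair can be split by a guest write.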
| |
According to the (now getting removed) comment in struct
hpet_event_channel, this was to prevent accessing a CPU's
timer_deadline after it got cleared from cpumask. This can be done
without a lock altogether - hpet_broadcast_exit() can simply clear
the bit, and handle_hpet_broadcast() can read timer_deadline before
looking at the mask a second time (the cpumask bit was already
found set by the surrounding loop).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Gang Wei <gang.wei@intel.com>
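The lock-free ordering described above (read the deadline first, then look at the mask a second time) can be sketched like this. It uses C11 sequentially consistent atomics as an assumed substitute for whatever barriers the real code relies on; the 32-bit mask, the CPU count and all `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 8

static _Atomic uint32_t demo_cpumask;
static _Atomic int64_t demo_timer_deadline[DEMO_NR_CPUS];

/* A CPU going idle publishes its deadline, then sets its mask bit. */
static void demo_broadcast_enter(unsigned int cpu, int64_t deadline)
{
    assert(cpu < DEMO_NR_CPUS);
    atomic_store(&demo_timer_deadline[cpu], deadline);
    atomic_fetch_or(&demo_cpumask, UINT32_C(1) << cpu);
}

/* Leaving only needs to clear the bit; no lock is involved. */
static void demo_broadcast_exit(unsigned int cpu)
{
    assert(cpu < DEMO_NR_CPUS);
    atomic_fetch_and(&demo_cpumask, ~(UINT32_C(1) << cpu));
}

/*
 * Handler side: read the deadline first, then re-check the mask. If the
 * bit is gone by then, the value just read may be stale and is dropped.
 */
static bool demo_sample_deadline(unsigned int cpu, int64_t *deadline)
{
    int64_t d;

    assert(cpu < DEMO_NR_CPUS && deadline != NULL);
    d = atomic_load(&demo_timer_deadline[cpu]);
    if (!(atomic_load(&demo_cpumask) & (UINT32_C(1) << cpu)))
        return false;
    *deadline = d;
    return true;
}
```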
| |
Clearly for the adjusted BUG_ON()s to not yield false positives
num_hpets_used (rather than num_chs_used, as done mistakenly in
23042:599ceb5b0a9b) must be incremented before setting up an IRQ (and
decremented back when the setup failed). To avoid further confusion,
just eliminate the local variable altogether.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
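The ordering requirement above (count the channel before setting up its IRQ, and undo the count if setup fails) can be sketched as below. The counter and callbacks are hypothetical; the `assert()` inside the callbacks plays the role of the BUG_ON() checks mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter of channels in use. */
static unsigned int demo_num_used;

/* Stand-ins for IRQ setup; both check the invariant a BUG_ON() would. */
static int demo_setup_ok(unsigned int idx)
{
    assert(idx < demo_num_used);
    return 0;
}

static int demo_setup_fail(unsigned int idx)
{
    assert(idx < demo_num_used);
    return -1;
}

/*
 * Increment first so that code reached from setup_irq() already sees
 * the new channel as in use; roll back if the setup did not succeed.
 */
static int demo_claim_channel(int (*setup_irq)(unsigned int idx))
{
    unsigned int idx;

    assert(setup_irq != NULL);
    idx = demo_num_used++;
    if (setup_irq(idx) < 0) {
        demo_num_used--;
        return -1;
    }
    return (int)idx;
}
```

Had the increment come after the setup call, the invariant check inside the callbacks would have fired on the very first channel.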
| |
Clearly for the adjusted BUG_ON()s to not yield false positives
num_chs_used must be incremented before setting up an IRQ (and
decremented back when the setup failed).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
| |
'unsigned int' is better suited as an array index on x86-64.
'u32' produces better code than 'unsigned long' on x86-64, so use the
former for storing 32-bit values read from the hardware.
this_cpu() uses an implicit smp_processor_id(), and hence using
per_cpu() when the result of smp_processor_id() is already available
is more efficient.
Fold one case of cpu_isset()+cpu_clear() into cpu_test_and_clear().
Drop the unused return value of evt_do_broadcast().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Wei Gang <gang.wei@intel.com>
| |
Typically there are far fewer than 32 counters available, and hence
there's no use in wasting the memory on (almost) every system.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Wei Gang <gang.wei@intel.com>
| |
- separate init and resume code paths (so that the [larger] init parts
can go into .init.* sections)
- drop the separate legacy_hpet_event object, as we can easily re-use
the first slot of hpet_events[] for that purpose (the whole array is
otherwise unused when the legacy code is being used)
- use section placement attributes where reasonable
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Wei Gang <gang.wei@intel.com>
| |
At least the legacy path can enter its interrupt handler callout while
initialization is still in progress - that handler checks whether
->event_handler is non-NULL, and hence all other initialization must
happen before setting this field.
Do the same to the MSI initialization just in case (and to keep the
code in sync).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
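The initialization order described above (publish ->event_handler only after everything else is set up) can be sketched with C11 release/acquire atomics, which are an assumption here rather than what the patch necessarily uses. The structure fields other than `event_handler`, and all `demo_*` names, are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*demo_handler_t)(void *ch);

struct demo_channel {
    unsigned int shift; /* hypothetical fields the handler relies on */
    unsigned int mult;
    _Atomic(demo_handler_t) event_handler; /* set last, checked first */
};

static unsigned int demo_events_seen;

static void demo_handle_event(void *arg)
{
    struct demo_channel *ch = arg;

    /* The handler may assume the channel is fully initialized. */
    assert(ch->mult != 0);
    demo_events_seen++;
}

static void demo_init_channel(struct demo_channel *ch)
{
    assert(ch != NULL);
    ch->shift = 32;
    ch->mult = 1000;
    /* Publish the handler only once all other fields are in place. */
    atomic_store_explicit(&ch->event_handler, demo_handle_event,
                          memory_order_release);
}

/* Interrupt callout: does nothing until the handler has been published. */
static void demo_interrupt(struct demo_channel *ch)
{
    demo_handler_t h;

    assert(ch != NULL);
    h = atomic_load_explicit(&ch->event_handler, memory_order_acquire);
    if (h != NULL)
        h(ch);
}
```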
| |
Jan Beulich found that for S3 resume on platforms without the ARAT
feature but with an MSI capable HPET, request_irq() gets called in
hpet_setup_msi_irq() for an IRQ that is already set up (no
release_irq() is called during S3 suspend), so the code always falls
back to using legacy_hpet_event.
Fix it by calling request_irq() conditionally for 4.1. The plan is to
split the S3 resume path from the boot path post 4.1, as Jan suggested.
Signed-off-by: Wei Gang <gang.wei@intel.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
| |
The check it adds is already present in the function's sole caller.
Signed-off-by: Keir Fraser <keir@xen.org>
| |
This follows Linux commit 39fe05e58c5e448601ce46e6b03900d5bf31c4b0,
noticing that all this setup is pointless when ARAT support is there,
and knowing that on SLED11's native kernel it has actually caused S3
resume issues.
A question would be whether HPET legacy interrupts should be forced
off in this case (rather than leaving whatever came from firmware).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
| |
... decreasing cache footprint. As a prerequisite this requires making
cmdline_parse() a little more flexible.
Also remove a few variables altogether, and adjust section
annotations for several others.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir@xen.org>
| |
Additionally simplify operations on them in a few cases.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
| |
This includes replacing the bogus definition of cpumask_test_cpu()
(introduced by c/s 20073) with a Linux compatible one and replacing
the bad uses with cpu_isset().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
| |
Signed-off-by: Wei Gang <gang.wei@intel.com>