| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are BIOSes that want to map the IO-APIC MMIO region from some
ACPI method(s), and there is at least one BIOS flavor that wants to
use this mapping to clear an RTE's mask bit. While we can't allow the
latter, we can permit reads and simply drop write attempts, leveraging
the already existing infrastructure introduced for dealing with AMD
IOMMUs' representation as PCI devices.
This fixes an interrupt setup problem on a system where _CRS evaluation
involved the above described BIOS/ACPI behavior, and is expected to
also deal with a boot time crash of pv-ops Linux upon encountering the
same kind of system.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the need to allocate multiple contiguous IRTEs for multi-vector
MSI, the chance of failure here increases. While on the AMD side
there's no allocation of IRTEs at present at all (and hence no way for
this allocation to fail, which is going to change with a later patch in
this series), VT-d already ignores an eventual error here, which this
patch fixes.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: "Zhang, Xiantao" <xiantao.zhang@intel.com>
|
|
|
|
|
|
|
| |
... to use a (struct pci_dev *, devfn) pair.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: "Zhang, Xiantao" <xiantao.zhang@intel.com>
|
|
|
|
|
|
|
| |
... to use a (struct pci_dev *, devfn) pair.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: "Zhang, Xiantao" <xiantao.zhang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the arch-independent do_domctl function now RCU locks the domain
specified by op->domain, pass the struct domain to the arch-specific
domctl function and remove the duplicate per-subfunction locking.
This also removes two get_domain/put_domain call pairs (in
XEN_DOMCTL_assign_device and XEN_DOMCTL_deassign_device), replacing
them with RCU locking.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since these interrupts get setup before APs get brought online, their
affinities naturally could only ever point to CPU 0 alone so far.
Adjust this to include potentially multiple CPUs in the target mask
(when running in one of the cluster modes), and take into account NUMA
information (to handle the interrupts on a CPU on the node where the
respective IOMMU is).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The iommu is enabled by default when xen is booting and later disabled
in iommu_setup() when no iommu is present.
But under some circumstances iommu code can be called before
iommu_setup() is processed. If there is no iommu available xen crashes.
This can happen for example when panic(...) is called as introduced
with the patch "x86-64: detect processors subject to AMD erratum #121
and refuse to boot" since xen 4.1.3, resulting in
find_iommu_for_device() to be called in the context of
disable_IO_APIC() / __stop_this_cpu().
This patch fixes this by keeping the iommu disabled until iommu_setup()
is entered.
Originally-by: Ronny Hegewald <ronny.hegewald@online.de>
In order for iommu_enable to be off initially, iommu_supports_eim()
must not depend on it anymore, nor must acpi_parse_dmar(). The former
in turn requires that iommu_intremap gets uncoupled from iommu_enabled
(in particular, failure during IOMMU setup should no longer result in
iommu_intremap getting cleared by generic code; IOMMU specific code
can still do so provided in can live with the consequences).
This could have the nice side effect of allowing to use "iommu=off"
even when x2APIC was pre-enabled by the BIOS (in which case interrupt
remapping is a requirement, but DMA translation [obviously] isn't), but
that doesn't currently work (and hence x2apic_bsp_setup() forces the
IOMMU on rather than just interrupt remapping).
For consistency with VT-d, make the AMD IOMMU code also skip all ACPI
table parsing when neither iommu_enable nor iommu_intremap are set.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Acked-by: "Huang2, Wei" <Wei.Huang2@amd.com>
|
|
|
|
|
|
|
|
|
|
| |
This requires some additions to the VT-d side; AMD IOMMUs use the
"normal" MSI message format even when interrupt remapping is enabled,
thus making adjustments here unnecessary.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Xiantao Zhang<xiantao.zhang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: these changes don't make any difference on x86.
Replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when it is used as
an hypercall argument.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
- names of functions used independent of vendor should not have vendor
specific names
- vendor specific declarations should not live inn generic headers
- other vendor specific items should not be used in generic code
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were a number of cases where an "iommu" retrieved got passed to
another function before being NULL-checked. While this by itself was
not a problem as the called function did the checks, it is confusing to
the reader and redundant in several cases (particularly with NULL-
checking the return value of iommu_ir_ctrl()). Drop the redundant
checks (also ones where the sole caller of a function did the checking
already), and at once make the three similar functions proper inline
instead of extern ones (they were prototyped in the wrong header file
anyway, so would have needed touching sooner or later).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New key handler 'o' to dump the IOMMU p2m table for each domain.
Skips dumping table for domain 0.
Intel and AMD specific iommu_ops handler for dumping p2m table.
Incorporated feedback from Jan Beulich and Wei Wang.
Fixed indent printing with %*s.
Removed superflous superpage and other attribute prints.
Make next_level use consistent for AMD IOMMU dumps. Warn if found
inconsistent.
AMD IOMMU does not skip levels. Handle 2mb and 1gb IOMMU page size for
AMD.
Signed-off-by: Santosh Jodh <santosh.jodh@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
| |
It retains IA64-specific bits in code imported from elsewhere (e.g.
ACPI, EFI) as well as in the public headers.
It also doesn't touch the tools, mini-os, and unmodified_drivers
sub-trees.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
| |
iotlb flush
Add cpu flag that will be checked by the iommu low level code
to skip iotlb flushes. iommu_iotlb_flush shall be called explicitly.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
| |
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
| |
This addresses all remaining build problems introduced over the last
several months.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Since extended config space accesses may not be possible when
scan_pci_devices() runs (due to MMCFG resources not being reserved in
the E820 table, which the specification allows to be the case),
functionality enabling of which requires such must be re-attempted
when it is known whether MMCFG is safe to use.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: "Kay, Allen M" <allen.m.kay@intel.com>
|
|
|
|
|
|
|
| |
Again, a couple of directly related functions at once get adjusted to
account for the segment number.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
| |
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
|
|
|
|
|
| |
In line with 22924:86000076dcee, paging_mode_hap(d) shouldn't be
used in HAP internals that are called during HAP setup.
Signed-off-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
| |
This makes the check more precise, and brings VTd in line with AMD code.
Signed-off-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
23733:fbf3768e5934 causes xen-unstable not to boot on several of the
xen.org AMD test systems. We get an endless series of these:
(XEN) AMD-Vi: IO_PAGE_FAULT: domain = 0, device id = 0x00a0, fault
address = 0xfdf8f10144
I have constructed the attached patch which reverts c/s 23733
(adjusted for conflicts due to subsequent patches). With this
reversion Xen once more boots on these machines.
23733 has been in the tree for some time now, causing this breakage,
and has already been fingered by the automatic bisector and discussed
on xen-devel as the cause of boot failures. I think it is now time to
revert it pending a correct fix to the original problem.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
...use per-device table instead.
This should work with per-cpu IDTs. We are safe to remove global
table since SATA device id issue doee not appear in recent
production BIOS.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this it is questionable whether retaining struct domain's
nr_pirqs is actually necessary - the value now only serves for bounds
checking, and this boundary could easily be nr_irqs.
Note that ia64, the build of which is broken currently anyway, is only
being partially fixed up.
v2: adjustments for split setup/teardown of translation data
v3: re-sync with radix tree implementation changes
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
| |
Print out debug messages in vtd_page_fault() handler only when
iommu=debug is set xen boot parameter.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The kdump kernel has problems booting with interrupt/dma
remapping enabled, so we need a new iommu_ops called
crash_shutdown which is basically suspend but doesn't
need to bother saving state.
Make sure that crash_shutdown is called on the kexec
path.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
|
|
|
|
| |
Fails current lock checking mechanism in spinlock.c in debug=y builds.
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this it is questionable whether retaining struct domain's
nr_pirqs is actually necessary - the value now only serves for bounds
checking, and this boundary could easily be nr_irqs.
Another thing to consider is whether it's worth storing the pirq
number in struct pirq, to avoid passing the number and a pointer to
quite a number of functions.
Note that ia64, the build of which is broken currently anyway, is only
partially fixed up.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
| |
Also introduce a new parameter (iommu=sharept) to enable this feature.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Acked-by: Tim Deegan <Tim.Deegan@citrix.com>
Committed-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is accomplished by converting a couple of embedded arrays (in one
case a structure containing an array) into separately allocated
pointers, and (just as for struct arch_vcpu in a prior patch)
overlaying some PV-only fields with HVM-only ones.
One particularly noteworthy change in the opposite direction is that
of PITState - this field so far lived in the HVM-only portion, but is
being used by PV guests too, and hence needed to be moved out of
struct hvm_domain.
The change to XENMEM_set_memory_map (and hence libxl__build_pre() and
the movement of the E820 related pieces to struct pv_domain) are
subject to a positive response to a query sent to xen-devel regarding
the need for this to happen for HVM guests (see
http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01848.html).
The protection of arch.hvm_domain.irq.dpci accesses by is_hvm_domain()
is subject to confirmation that the field is used for HVM guests only
(see
http://lists.xensource.com/archives/html/xen-devel/2011-03/msg02004.html).
In the absence of any reply to these queries, and given the early
state of 4.2 development, I think it should be acceptable to take the
risk of having to later undo/redo some of this.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
| |
Check flags field in ACPI DMAR structure before enabling interrupt
remapping or x2apic. This allows platform vendors to disable
interrupt remapping or x2apic features if on board BIOS does not
support them.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes following changes: 1) Moves EPT/VT-d sharing
initialization back to when it is actually needed to make sure
vmx_ept_vpid_cap has been initialized. 2) added page order parameter
to iommu_pte_flush() to tell VT-d what size of page to flush. 3)
added hap_2mb flag to ease performance studies between base 4KB EPT
size and when 2MB and 1GB page size support are enabled.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
|
|
|
|
|
|
|
|
|
| |
Basic idea is to leverage 2MB and 1GB page size support in EPT by having
VT-d using the same page tables as EPT. When EPT page table changes, flush
VT-d IOTLB cache.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Make it aligned with Linux kernel.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Removed unnecessary bits from the original patch, and removed
intremap_enabled() with its only caller gone.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
| |
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Tested-by: Conny Seidel <conny.seidel@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These structures are used by Xen, and hence guests must not be able
to fiddle with them.
qemu-dm currently plays with the MSI-X table, requiring Dom0 to
still have write access. This is broken (explicitly allowing the guest
write access to the mask bit) and should be fixed in qemu-dm, at which
time Dom0 won't need any special casing anymore.
The changes are made under the assumption that p2m_mmio_direct will
only ever be used for order 0 pages.
An open question is whether dealing with pv guests (including the
IOMMU-less case) is necessary, as handling mappings a domain may
already have in place at the time the first interrupt gets set up
would require scanning all of the guest's L1 page table pages.
Currently a hole still remains allowing PV guests to map these ranges
before actually setting up any MSI-X vector for a device.
An alternative would be to determine and insert the address ranges
earlier into mmio_ro_ranges, but that would require a hook in the
PCI config space writes, which is particularly problematic in case
MMCONFIG accesses are being used.
A second alternative would be to require Dom0 to report all devices
(or at least all MSI-X capable ones) regardless of whether they would
be used by that domain, and do so after resources got determined/
assigned for them (i.e. a second notification later than the one
currently happening from the PCI bus scan would be needed).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Jiang, Yunhong <yunhong.jiang@intel.com>
|
|
|
|
|
|
|
|
| |
x2apic depends on interrupt remapping, so it should disable interrupt
remapping behind x2apic disabling. And also this patch wraps
__enable_x2apic to get rid of duplicated code.
Signed-off-by: Weidong Han <weidong.han@intel.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
| |
Allow devices to always be passed through to PV domains, just as they
can be to HVM domains.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch masks PIC and IOAPIC RTE's before x2APIC enabling, unmask
and restore them after x2APIC enabling. It also really enables
interrupt remapping before x2APIC enabling instead of just checking
interrupt remapping setting. This patch also handles all x2APIC
configuration including BIOS settings and command line
settings. Especially, it handles that BIOS hands over in x2APIC mode
(when there is apic id > 255). It checks if x2APIC is already enabled
by BIOS. If already enabled, it will disable interrupt remapping and
queued invalidation first, then enable them again.
Signed-off-by: Weidong Han <weidong.han@intel.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new sub-option "verbose" to "iommu=", and hide most
(debugging) messages when that option is not specified. Particularly
messages printed after time management was initialized can, on
sufficiently large systems and with a graphical console, lead to
time management issues (therefore a call to process_pending_softirqs()
also gets added in case the new sub-option is being used).
While touching that code, also convert all improper uses of gdprintk()
to dprintk(), and convert all boolean iommu config variables to bool_t
residing in the .data.read_mostly section.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add this option to workaround BIOS bugs. Currently it ignores DRHD
if "all" devices under its scope are not pci discoverable. This
workarounds a BIOS bug in some platforms to make VT-d work. But note
that this option doesn't guarantee security, because it might ignore
DRHD.
So there are 3 options which handle BIOS bugs differently:
iommu=1 (default): If detect non-existent device under a DRHD's
scope, or find incorrect RMRR setting (base_address > end_address),
disable VT-d completely in Xen with warning messages. This guarantees
security when VT-d enabled, or just disable VT-d to let Xen work
without VT-d.
iommu=force: it enforces to enable VT-d in Xen. If VT-d cannot be
enabled, it will crashes Xen. This is mainly for users who must need
VT-d.
iommu=workaround_bogus_bios: it workarounds some BIOS bugs to make
VT-d still work. This might be insecure because there might be a
device not protected by any DRHD if the device is re-enabled by
malicious s/w. This is for users who want to use VT-d regardless of
security.
Signed-off-by: Weidong Han <weidong.han@intel.com>
|
|
|
|
| |
Signed-off-by: Weidong Han <weidong.han@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, xen uses shared iommu domain-id across all the VT-d units
in the platform. The number of iommu domain-ids (NR_DID, e.g. 256)
supported by each VT-d unit is reported in Capability register. The
limitation of current implementation is it only can support at most
NR_DID domains with VT-d in the entire platform, even though the
platform can support N * NR_DID (where N is the number of VT-d
units). Imagine a platform with several SR_IOV NICs, and each NIC
supports 128 VFs. It possibly beyond the NR_DID.
This patch implements iommu domain-id management per iommu (VT-d
unit), hence solves above limitation. It removes the global domain-id
bitmap, instead use domain-id bitmap in struct iommu, and also involve
an array to map guest domain-id and iommu domain-id, which is used to
iommu domain-id when flush context cache or IOTLB. When a device is
assigned to a guest, choose an available iommu domain-id from the
device's iommu, and map guest domain id to the domain-id mapping
array. When a device is deassigned from a guest, clear the domain-id
bit in domain-id bitmap and clear the corresponding entry in domain-id
map array if there is no other devices under the same iommu owned by
the guest.
Signed-off-by: Weidong Han <weidong.han@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Clear Interrupt Remapping(IR) unit's CFI (Compatibility Format
Interrupt) to enhance security;
2) Move the iommu_setup() ahead and put it before we begin to use
IOAPIC so we can make sure after we enable Interrupt Remapping, the
later IOAPIC (and MSI) initialization would setup IOAPIC RTEs (and
MSI) with remappable format;
3) Enable x2APIC only when all VT-d engines support IR with EIM
(Extended Interrupt Mode). EIM enables external devices to deliver
interrupts to logical processor with >8-bit APIC ID.
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
from vector-based to IRQ-based.
In per-cpu vector environment, vector space changes to
multi-demension resource, so vector number is not appropriate
to index irq_desc which stands for unique interrupt source. As
Linux does, irq number is chosen to index irq_desc. This patch
changes vector-based interrupt infrastructure to irq-based one.
Mostly, it follows upstream linux's changes, and some parts are
adapted for Xen.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
|
|
|
|
| |
Signed-off-by: Wei Wang <wei.wang2@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add 2 generic functions into the vendor neutral iommu interface, The
reason is that from changeset 19732, there is only one global flag
"iommu_enabled" that controls iommu enablement for both vtd and amd
systems, so we need different code paths for vtd and amd iommu systems
if this flag has been turned on. Also, the early checking of
"iommu_enabled" in iommu_setup() is removed to prevent iommu
functionalities from been disabled on amd systems.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
|