| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than re-reading the instruction bytes upon retry processing,
stash away and re-use what we already read. That way we can be certain
that the retry won't do something different from what requested the
retry, getting once again closer to real hardware behavior (where what
we use retries for is simply a bus operation, not involving redundant
decoding of instructions).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
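The stash-and-reuse idea in the message above could be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `struct insn_cache`, `fetch_from_guest()`, `get_insn_bytes()` and `insn_done()` are hypothetical names, and the 16-byte buffer size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_INSN_BYTES 16  /* assumed buffer size for this sketch */

/* Hypothetical per-vCPU cache; field names are illustrative only. */
struct insn_cache {
    unsigned char buf[MAX_INSN_BYTES];
    size_t bytes;               /* 0 means nothing is stashed */
};

/* Stand-in for fetching from guest memory; counts how often it runs. */
static unsigned int guest_fetches;

static size_t fetch_from_guest(unsigned char *dst, size_t max)
{
    static const unsigned char insn[] = { 0xf3, 0xa4 };  /* rep movsb */
    size_t n = sizeof(insn) < max ? sizeof(insn) : max;

    memcpy(dst, insn, n);
    guest_fetches++;
    return n;
}

/*
 * Return the instruction bytes to decode. On a retry the stashed copy is
 * reused, so the decoder sees exactly what it saw the first time.
 */
static size_t get_insn_bytes(struct insn_cache *c, unsigned char *dst,
                             size_t max, int is_retry)
{
    size_t n;

    if (!is_retry || c->bytes == 0)
        c->bytes = fetch_from_guest(c->buf, sizeof(c->buf));

    n = c->bytes < max ? c->bytes : max;
    memcpy(dst, c->buf, n);
    return n;
}

/* Once the instruction completes, drop the stash. */
static void insn_done(struct insn_cache *c)
{
    c->bytes = 0;
}
```

Calling `get_insn_bytes()` a second time with `is_retry` set returns the same bytes without touching guest memory again, which is the property the message is after.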
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In memory read/write handling the default case should tell the caller
that the operation cannot be handled rather than the operation having
succeeded, so that when new HVMCOPY_* states get added not handling
them explicitly will not result in errors being ignored.
In task switch emulation code, stop the practice of handling some
errors but not others.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
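The "default case must not report success" rule from the message above can be illustrated with a small sketch. The enum members below are stand-ins invented for the example; the real HVMCOPY_* and X86EMUL_* sets are larger and are not reproduced here.

```c
#include <assert.h>

/* Illustrative stand-ins; the real HVMCOPY_* and X86EMUL_* sets differ. */
enum copy_result { COPY_okay, COPY_bad_gva, COPY_bad_gfn, COPY_future_state };
enum emul_result { EMUL_OKAY, EMUL_EXCEPTION, EMUL_UNHANDLEABLE };

/* Map a copy result onto an emulator return code. */
static enum emul_result copy_to_emul(enum copy_result rc)
{
    switch (rc) {
    case COPY_okay:
        return EMUL_OKAY;
    case COPY_bad_gva:
        return EMUL_EXCEPTION;
    case COPY_bad_gfn:
        return EMUL_UNHANDLEABLE;
    default:
        /* Unknown or newly added state: refuse rather than claim success. */
        return EMUL_UNHANDLEABLE;
    }
}
```

With this shape, adding a new copy state without a matching `case` makes the operation fail visibly instead of being treated as done.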
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dpci_ioport_{read,write}() guest memory access failure handling should
be modelled after process_portio_intercept()'s (and others): Upon
encountering an error on other than the first iteration, the count
successfully handled needs to be stored and X86EMUL_OKAY returned, in
order for the generic instruction emulator to update register state
correctly before reporting failure or retrying (both of which would
only happen after re-invoking emulation).
Further we leverage (and slightly extend, due to the above mentioned
need to return X86EMUL_OKAY) the "large MMIO" retry model.
Note that there is still a special case not explicitly taken care of
here: While the first retry on the last iteration of a "rep ins"
correctly recovers the already read data, an eventual subsequent retry
is being handled by the pre-existing mmio-large logic (through
hvmemul_do_io() storing the [recovered] data [again], also taking into
consideration that the emulator converts a single iteration "ins" to
->read_io() plus ->write()).
Also fix an off-by-one in the mmio-large-read logic, and slightly
simplify the copying of the data.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
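The partial-completion rule described above (store the count handled so far and return X86EMUL_OKAY when a later iteration fails) might look like this in outline. It is a sketch under assumptions: `do_rep()`, `iter_fn` and `fail_at()` are hypothetical, and the two return codes are local stand-ins rather than the real X86EMUL_* constants.

```c
#include <assert.h>

enum { EMUL_OKAY, EMUL_UNHANDLEABLE };

/* Hypothetical per-iteration callback: returns 0 on success. */
typedef int (*iter_fn)(unsigned long idx, void *ctx);

/*
 * Run up to *reps iterations. If a later iteration fails, report the
 * completed count via *reps and return EMUL_OKAY so the caller can
 * advance register state; the failure resurfaces when emulation is
 * re-invoked, at which point the failing iteration is the first one.
 */
static int do_rep(unsigned long *reps, iter_fn fn, void *ctx)
{
    unsigned long i;

    for (i = 0; i < *reps; i++) {
        if (fn(i, ctx) != 0) {
            if (i == 0)
                return EMUL_UNHANDLEABLE;
            break;
        }
    }
    *reps = i;
    return EMUL_OKAY;
}

/* Example callback that fails at the iteration index stored in ctx. */
static int fail_at(unsigned long idx, void *ctx)
{
    return idx == *(unsigned long *)ctx ? -1 : 0;
}
```

Only a failure on the very first iteration is reported directly; anything later is deferred to the next invocation.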
|
|
|
|
|
|
|
|
|
|
| |
We shouldn't do any acceleration for
- "rep movs" when either side is passed through MMIO or when both sides
are handled by qemu
- "rep ins" and "rep outs" when the memory operand is any kind of MMIO
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
| |
Just like real hardware we ought to split such accesses transparently
to the caller. With little extra effort we can at the same time handle
page-crossing accesses correctly.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
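The page-crossing part of the message above can be illustrated by splitting an access at page boundaries. This is only a sketch of that one aspect, assuming 4 KiB pages; `split_access()` and `chunk_fn` are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL   /* assumed 4 KiB pages */

/* Hypothetical callback receiving one chunk that stays within a page. */
typedef void (*chunk_fn)(unsigned long addr, size_t len, void *ctx);

/* Split [addr, addr + len) at page boundaries; returns the chunk count. */
static unsigned int split_access(unsigned long addr, size_t len,
                                 chunk_fn fn, void *ctx)
{
    unsigned int chunks = 0;

    while (len > 0) {
        /* Bytes left before the next page boundary (1..PAGE_SIZE). */
        size_t room = PAGE_SIZE - (addr & (PAGE_SIZE - 1));
        size_t n = len < room ? len : room;

        if (fn)
            fn(addr, n, ctx);
        addr += n;
        len -= n;
        chunks++;
    }
    return chunks;
}
```

An 8-byte access starting 4 bytes before a page boundary is delivered as two 4-byte chunks, while the caller still sees a single operation.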
|
|
|
|
|
|
|
|
|
|
| |
Define a new struct hvm_trap to represent trap information, rename
hvm_inject_exception to hvm_inject_trap, and define a couple of
wrappers around that function for existing callers.
Signed-off-by: Keir Fraser <keir@xen.org>
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
| |
The eventual hvm_copy or IO emulations will re-check the p2m and DTRT.
Signed-off-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
| |
Signed-off-by: Tim Deegan <tim@xen.org>
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The tracepoint for emulated MMIO and I/O port reads was always before
the emulated read or write was done. This means that for reads the
register value in the trace record was always 0.
So for reads, move the tracepoint to after the access, once the
register value is available.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having an enum for this won't work if we want to add any orthogonal
options to it -- the existing code is only correct (after the removal of
p2m_guest in the previous patch) because there are no tests anywhere for
'== p2m_alloc', only for '!= p2m_query' and '== p2m_unshare'.
Replace it with a set of flags.
Signed-off-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
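Why flags compose where an enum does not can be seen in a few lines. The names and bit values below are illustrative assumptions for the sketch, not necessarily those used in the patch.

```c
#include <assert.h>

/* Illustrative flag names; the values are assumptions for this sketch. */
#define Q_QUERY    0u          /* plain lookup, no side effects */
#define Q_ALLOC    (1u << 0)   /* lookup may populate the entry */
#define Q_UNSHARE  (1u << 1)   /* lookup may break sharing */

/* Each property is tested independently of the others. */
static int may_allocate(unsigned int q) { return (q & Q_ALLOC) != 0; }
static int may_unshare(unsigned int q)  { return (q & Q_UNSHARE) != 0; }
```

A caller can pass `Q_ALLOC | Q_UNSHARE` and both tests hold, whereas an equality test against a single enum value would silently miss the combination.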
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was neither consistently used by callers nor correctly handled by the
lookup code. Instead, treat any lookup that might allocate or unshare
memory as a 'guest' lookup for the purposes of:
- detecting the highest pod gfn populated; and
- crashing the guest on access to a broken page
which were the only things this was used for.
Signed-off-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hvm io emulation code holds the p2m lock for the duration of the
emulation, which may include sending an event to qemu. On a separate path,
map_domain_pirq grabs the event channel and p2m locks in opposite order.
Fix this by ensuring liveness of the ram_gfn used by io emulation, with a
page ref.
Reported-by: "Hao, Xudong" <xudong.hao@intel.com>
Signed-off-by: "Hao, Xudong" <xudong.hao@intel.com>
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When calling get_gfn multiple times on different gfn's in the same function, we
can easily deadlock if p2m lookups are locked. Thus, refactor these calls to
enforce simple deadlock-avoidance rules:
- Lowest-numbered domain first
- Lowest-numbered gfn first
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
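The two ordering rules above (lowest-numbered domain first, then lowest-numbered gfn) amount to a lexicographic comparison. A minimal sketch, with `struct gfn_ref`, `locks_first()` and `order_pair()` as hypothetical helpers:

```c
#include <assert.h>

/* Hypothetical lookup key: a domain id plus a guest frame number. */
struct gfn_ref {
    unsigned int domid;
    unsigned long gfn;
};

/* Returns nonzero if a must be locked before b. */
static int locks_first(const struct gfn_ref *a, const struct gfn_ref *b)
{
    if (a->domid != b->domid)
        return a->domid < b->domid;
    return a->gfn <= b->gfn;
}

/* Order two references so that *first is the one to take first. */
static void order_pair(const struct gfn_ref *a, const struct gfn_ref *b,
                       const struct gfn_ref **first,
                       const struct gfn_ref **second)
{
    if (locks_first(a, b)) {
        *first = a;
        *second = b;
    } else {
        *first = b;
        *second = a;
    }
}
```

Because every path acquires in the same global order, two paths working on the same pair of gfns cannot each hold one lock while waiting for the other.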
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we use wait queues to handle potential ring congestion cases,
code paths that try to generate a mem event while holding a gfn lock
would go to sleep in non-preemptible mode.
Most such code paths can be fixed by simply postponing event generation until
locks are released.
Signed-off-by: Adin Scannell <adin@scannell.ca>
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
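Postponing event generation until the lock is dropped follows a simple pattern: record the need while locked, act on it afterwards. The sketch below uses plain counters as stand-ins for the gfn lock and the event ring; all names are hypothetical.

```c
#include <assert.h>

static int lock_held;        /* stand-in for a gfn lock */
static int events_sent;
static int sent_under_lock;  /* would indicate the bug being fixed */

static void gfn_lock(void)   { lock_held = 1; }
static void gfn_unlock(void) { lock_held = 0; }

/* Stand-in for event delivery that may sleep on a wait queue. */
static void send_mem_event(void)
{
    if (lock_held)
        sent_under_lock++;
    events_sent++;
}

static void handle_access(int needs_event)
{
    int pending = 0;

    gfn_lock();
    /* ... inspect or update the entry ... */
    if (needs_event)
        pending = 1;         /* remember, do not send yet */
    gfn_unlock();

    if (pending)
        send_mem_event();    /* may sleep: no lock is held here */
}
```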
|
|
|
|
|
|
|
|
|
|
| |
Limit such queries only to p2m_query types. This is more compatible
with the name and intended semantics: perform only a lookup, and explicitly
in an unlocked way.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the existing movq emulation to also support its SSE2 and AVX
variants, the latter implying the addition of VEX decoding. Fold the
read and write cases (as most of the logic is identical), and add
movntq and variants (as they're very similar).
Extend the testing code to also exercise these instructions.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Callers of lookups into the p2m code are now variants of get_gfn. All
callers need to call put_gfn. The code behind it is a no-op at the
moment, but will change to proper locking in a later patch.
This patch does not change functionality; it only changes naming and
adds put_gfn calls.
set_p2m_entry retains its name because it is always called with
p2m_lock held.
This patch is humongous, unfortunately, given the dozens of call sites
involved.
After this patch, anyone using old style gfn_to_mfn will not succeed
in compiling their code. This is on purpose: adapt to the new API.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
| |
Move HVM io fields into a structure.
On MMIO instruction failure print out some more bytes.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Committed-by: Keir Fraser <keir@xen.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As part of the nested HVM patch series, many p2m functions were changed
to take pointers to p2m tables rather than to domains. This patch
reverses that for almost all of them, which:
- gets rid of a lot of "p2m_get_hostp2m(d)" in code which really
shouldn't have to know anything about how gfns become mfns.
- ties sharing and paging interfaces to a domain, which is
what they actually act on, rather than a particular p2m table.
In developing this patch it became clear that memory-sharing and nested
HVM are unlikely to work well together. I haven't tried to fix that
here beyond adding some assertions around suspect paths (as this patch
is big enough with just the interface changes)
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gfn_to_mfn_unshare() had its own function despite all other lookup types
being handled in one place. Merge it into _gfn_to_mfn_type(), so that it
gets the benefit of broken-page protection, for example, and tidy its
interfaces up to fit.
The unsharing code still has a lot of bugs, e.g.
- failure to alloc for unshare on a foreign lookup still BUG()s,
- at least one race condition in unshare-and-retry
- p2m_* lookup types should probably be flags, not enum
but it's cleaner and will make later p2m cleanups easier.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Newer SVM implementations (Bulldozer) copy up to 15 bytes from the
instruction stream into the VMCB when a #PF or #NPF exception is
intercepted. This patch makes use of this information if available.
This saves us from a) traversing the guest's page tables, b) mapping
the guest's memory and c) copying the instructions from there into the
hypervisor's address space.
This speeds up #NPF intercepts quite a lot and avoids cache and TLB
thrashing.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir@xen.org>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
| |
Change p2m infrastructure to operate on per-p2m instead of per-domain.
This allows us to use multiple p2m tables per-domain.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Acked-by: Tim Deegan <Tim.Deegan@citrix.com>
|
|
|
|
| |
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we go through the emulator every time an HVM guest does an
I/O port access (in/out). This is unnecessary most of the time, as
both VMX and SVM provide all the necessary information already in the
VMCS/VMCB. String instructions are not covered by this shortcut, but
they are quite rare and we would need to access the guest memory
anyway. This patch decodes the information from VMCB/VMCS and calls a
simple handle_mmio wrapper. In handle_mmio() itself the emulation part
will simply be skipped; this approach avoids code duplication. Since
the vendor specific part is quite trivial, I implemented both the VMX
and SVM part; please check the VMX part for sanity.
I boot-tested both versions and ran some simple benchmarks. A micro
benchmark (hammering an I/O port in a tight loop) shows a significant
performance improvement (down to 66% of the time needed to handle the
intercept on an AMD K8, measured in the guest with TSC). Even with
reading a 1GB file from an emulated IDE harddisk (Dom0 cached) I could
get a 4-5% improvement. Some guest code (e.g. the TCP stack in some
Windows versions) exercises the PM-Timer I/O port (0x1F48) very often
(tens of thousands of times per second); these workloads also benefit
with up to 5% improvement from this patch.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
|
|
|
|
|
|
|
|
|
| |
If pages cannot be unshared immediately (due to lack of free memory required to
create private copies) the VCPU under emulation is paused, and the emulator
returns X86EMUL_RETRY, which will get resolved after some memory is freed back
to Xen (possibly through host paging).
Signed-off-by: Grzegorz Milos <Grzegorz.Milos@citrix.com>
|
|
|
|
|
|
|
|
|
|
| |
A new HVMCOPY return value, HVMCOPY_gfn_paged_out is defined to indicate that
a gfn was paged out. This value and PFEC_page_paged, as appropriate, are
caught and passed up as X86EMUL_RETRY to the emulator. This will cause the
emulator to keep retrying the operation until it succeeds (once the page has
been paged in).
Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>
|
|
|
|
| |
Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce size of Xen-qemu shared ioreq structure to 32 bytes. This
has two advantages:
1. We can support up to 128 VCPUs with a single shared page
2. If/when we want to go beyond 128 VCPUs, a whole number of ioreq_t
structures will pack into a single shared page, so a multi-page
array will have no ioreq_t straddling a page boundary
Also, while modifying qemu, replace a 32-entry vcpu-indexed array
with a dynamically-allocated array.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
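The arithmetic behind both advantages can be checked directly: with 4 KiB pages, 4096 / 32 = 128 slots per page, and because 32 divides 4096 no slot in a multi-page array crosses a page boundary. The struct below is only a 32-byte placeholder, assumed for the sketch; it does not reproduce the real ioreq_t layout.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed 4 KiB shared page */

/* Placeholder with the stated size; the real ioreq_t fields differ. */
struct ioreq_sketch {
    uint8_t bytes[32];
};

/* Number of whole slots that fit in one shared page. */
static unsigned int slots_per_page(void)
{
    return PAGE_SIZE / sizeof(struct ioreq_sketch);
}

/* Nonzero if slot idx of a multi-page array would cross a page boundary. */
static int straddles_page(unsigned int idx)
{
    unsigned long start = (unsigned long)idx * sizeof(struct ioreq_sketch);
    unsigned long end = start + sizeof(struct ioreq_sketch) - 1;

    return start / PAGE_SIZE != end / PAGE_SIZE;
}
```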
|
|
|
|
|
|
|
| |
Make various data items const or __read_mostly where
possible/reasonable.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
|
|
|
|
|
|
| |
reset to HVMIO_none, as no IO is in flight.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
| |
read/write cycle to qemu is dropped due to guest suspend.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After some discussion, here's a second version of the patch I posted a
couple of weeks back to map grant references into HVM guests. As
before, this is done by modifying the P2M map, but this time there's
no new hypercall to do it. Instead, the existing GNTTABOP_map is
overloaded to perform a P2M mapping if called from a shadow mode
translate guest. This matches the IA64 API.
Signed-off-by: Steven Smith <steven.smith@citrix.com>
Acked-by: Tim Deegan <tim.deegan@citrix.com>
CC: Bhaskar Jayaraman <Bhaskar.Jayaraman@lsi.com>
|
|
|
|
|
| |
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
|
|
|
|
|
|
| |
Original patch by Christoph Egger <christoph.egger@amd.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
| |
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
| |
Check for self-corrupting copies, and report hvm_copy errors to the
console log.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
| |
PAGE_SIZE - (x & ~PAGE_MASK) is not equivalent to -x & ~PAGE_MASK
Also the early goto could be removed.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
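The difference between the two expressions shows up exactly when x is page aligned. The sketch below assumes 4 KiB pages and the conventional definition of PAGE_MASK as `~(PAGE_SIZE - 1)`; the function names are made up for the illustration.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))   /* assumed definition */

/* Bytes from x to the end of its page: always in 1..PAGE_SIZE. */
static unsigned long bytes_to_page_end(unsigned long x)
{
    return PAGE_SIZE - (x & ~PAGE_MASK);
}

/* The shorter form: always in 0..PAGE_SIZE-1. */
static unsigned long neg_form(unsigned long x)
{
    return -x & ~PAGE_MASK;
}
```

For an unaligned address such as 0x1234 both give 0xdcc, but for an aligned address such as 0x2000 the first gives 4096 and the second gives 0.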
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
| |
instructions to 4096 iterations.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
| |
address translations on multi-iteration string instructions.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
| |
pointer to emulator data buffer, and an arbitrary byte count (up to
the size of a page of memory).
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
|
|
| |
Also clean up cmpxchg() callback handling so we can get rid of the
specific cmpxchg8b handler.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
| |
which may require multiple round trips to the device model.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
| |
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|
|
|
|
|
|
| |
exceptions, which will allow emulation stubs to be built dynamically
in a future patch.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
|