path: root/tools/libxl
Commit message (Collapse)AuthorAgeFilesLines
* Add vendor_device parameter for HVM guestsPaul Durrant2013-08-054-0/+36
    The parameter determines which, if any, xen-pvdevice is specified on the
    QEMU command line. The default value is 'none' which means no argument
    will be passed. A value of 'xenserver' specifies a xen-pvdevice with
    device-id 0xc000 (the initial value in the xenserver namespace - see
    docs/misc/pci-device-reservations.txt).

    Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    [ ijc -- s/BUILD_INFO/BUILDINFO for consistency in LIBXL_HAVE define ]
* libxl: Fix function libxl__domain_resume_device_modelrwxybh2013-08-021-0/+1
    Add a missing break statement in function
    libxl__domain_resume_device_model.

    Signed-off-by: Bingheng Yan <rwxybh@126.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: Add vif.default.backend to xl.confGeorge Dunlap2013-07-223-0/+11
    This will allow a user to default to a network driver domain system-wide.

    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Roger Pau Monné <roger.pau@citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xl: Enable by default claim mode.Konrad Rzeszutek Wilk2013-07-221-1/+1
    During the Xen 4.3 release we discussed that this feature could be turned
    on by default, as it benefits all guests, not just tmem-related ones.

    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: Allow network driver domains when run_hotplug_scripts is setGeorge Dunlap2013-07-171-7/+0
    As of commit 05bfd984dfe7014f1f5ea1133608b9bab589c120, hotplug scripts are
    not run if backend_domid != LIBXL_TOOLSTACK_DOMID; so there is no reason
    to restrict this for network driver domains any more.

    This is a candidate for backporting to 4.3.

    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Roger Pau Monné <roger.pau@citrix.com>
    CC: Ian Campbell <ian.campbell@citrix.com>
    CC: Ian Jackson <ian.jackson@citrix.com>
    CC: Jan Beulich <jbeulich@suse.com>
* xl: support for leaving domain paused after saveIan Murray2013-07-172-7/+16
    New feature to allow xl save to leave a domain paused after its memory has
    been saved. This is to allow disk snapshots of domU to be taken that
    exactly correspond to the memory state at save time. Once the snapshot(s)
    have been taken, the domain can be unpaused in the usual manner.

    Usage: xl save -p <domid> <filespec>

    Signed-off-by: Ian Murray <murrayie@yahoo.co.uk>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xl: Add 'xen_version' to `xl info`Andrew Cooper2013-07-121-0/+2
    Getting the full Xen version in an easily scriptable way is awkward,
    especially if trying to piece together from xen_{major,minor,extra}.

    This reflects $(XEN_FULLVERSION) in the build system (but under a more
    sensible name, as $(XEN_VERSION) is just the major number).

    Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Acked-by: Matt Wilson <msw@amazon.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* libxl: do not call exit() in libxl_device_vtpm_listMarek Marczykowski2013-07-041-5/+6
    Signal error with NULL return value, do not terminate the whole process.

    Signed-off-by: Marek Marczykowski <marmarek@invisiblethingslab.com>
    Reviewed-by: Jim Fehlig <jfehlig@suse.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: suppress device assignment to HVM guest when there is no IOMMUIan Jackson2013-07-011-0/+12
    This in effect copies similar logic from xend: While there's no way to
    check whether a device is assigned to a particular guest,
    XEN_DOMCTL_test_assign_device at least allows checking whether an IOMMU is
    there and whether a device has been assigned to _some_ guest. For the time
    being, this should be enough to cover for the missing error
    checking/recovery in other parts of libxl's device assignment paths.

    There remains a (functionality-, but not security-related) race in that
    the iommu should be set up earlier, but this is too risky a change for
    this stage of the 4.3 release.

    This is a security issue, XSA-61.

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Tested-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* libxl: Use QMP cpu-add to hotplug CPU with qemu-xen.Anthony PERARD2013-06-261-6/+46
    Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
    Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: Add "cpu-add" QMP command.Anthony PERARD2013-06-262-0/+23
    Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
    [ ijc -- rename index parameter to avoid Wshadow due to index(3) in
      strings.h ]
* libxl: Fix assignment of devid value returned from libxl__device_nextidJim Fehlig2013-06-261-4/+4
    Commit 5420f265 has some misplaced parentheses that caused devid to be
    assigned 1 or 0 based on checking the return value of
    libxl__device_nextid < 0, e.g.

        devid = libxl__device_nextid(...) < 0

    This works when only one instance of a given device type exists, but
    subsequent devices of the same type will also have a devid = 1 if
    libxl__device_nextid succeeds. Fix by checking the value assigned to
    devid, e.g.

        (devid = libxl__device_nextid(...)) < 0

    Signed-off-by: Jim Fehlig <jfehlig@suse.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
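The precedence problem described in this message can be reproduced in isolation. The following is only a sketch: next_id() is a hypothetical stand-in for libxl__device_nextid(), and the two assign_* helpers are illustrative names, not libxl functions.

```c
#include <assert.h>

/* Hypothetical stand-in for libxl__device_nextid(): returns the next
 * free device id (>= 0), or a negative value on failure. Here the ids
 * 0 .. already_used-1 are assumed to be taken. */
static int next_id(int already_used)
{
    assert(already_used >= 0);
    return already_used;
}

/* Buggy form: '<' binds tighter than '=', so devid receives the result
 * of the comparison (0 or 1) rather than the id itself. */
static int assign_buggy(int already_used)
{
    int devid;

    devid = next_id(already_used) < 0;
    return devid;
}

/* Fixed form: the inner parentheses make the assignment happen first;
 * the comparison then tests the value stored in devid. */
static int assign_fixed(int already_used)
{
    int devid;

    if ((devid = next_id(already_used)) < 0)
        return -1;
    return devid;
}
```

In this sketch the buggy helper returns the same value no matter how many ids are in use, which is why a second device of the same type collides with the first; the fixed helper returns a distinct id each time.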
* libxl: Restrict permissions on PV console device xenstore nodesIan Jackson2013-06-255-36/+78
    Matthew Daley has observed that the PV console protocol places sensitive
    host state into guest-writeable xenstore locations. This includes:

     - The pty used to communicate between the console backend daemon and its
       client, allowing the guest administrator to read and write arbitrary
       host files.
     - The output file, allowing the guest administrator to write arbitrary
       host files or to target arbitrary qemu chardevs which include sockets,
       udp, ptr, pipes etc (see -chardev in qemu(1) for a more complete list).
     - The maximum buffer size, allowing the guest administrator to consume
       more resources than the host administrator has configured.
     - The backend to use (qemu vs xenconsoled), potentially allowing the
       guest administrator to confuse host software.

    So we arrange to make the sensitive keys in the xenstore frontend
    directory read only for the guest. This is safe since the xenstore
    permissions model, unlike POSIX directory permissions, does not allow the
    guest to remove and recreate a node if it has write access to the
    containing directory.

    There are a few associated wrinkles:

     - The primary PV console is "special". Its xenstore node is not under the
       usual /devices/ subtree and it does not use the customary xenstore
       state machine protocol. Unfortunately its directory is used for other
       things, including the vnc-port node, which we do not want the guest to
       be able to write to. Rather than trying to track down all the possible
       secondary uses of this directory, just make it r/o to the guest. All
       newly created subdirectories inherit these permissions and so are now
       safe by default.
     - The other serial consoles do use the customary xenstore state machine
       and therefore need write access to at least the "protocol" and "state"
       nodes. However, they may also want to use arbitrary "feature-foo" nodes
       (although I'm not aware of any) and therefore we cannot simply lock
       down the entire frontend directory. Instead we add support to
       libxl__device_generic_add for frontend keys which are explicitly read
       only and use that to lock down the sensitive keys.
     - Minios' console frontend wants to write the "type" node, which it has
       no business doing since this is a host/toolstack level decision. This
       fails now that the node has become read only to the PV guest. Since the
       toolstack already writes this node, just remove the attempt to set it.

    This is a security issue, XSA-57.

    Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
* libxl,hvmloader: Don't relocate memory for MMIO holeGeorge Dunlap2013-06-241-0/+8
    At the moment, qemu-xen can't handle memory being relocated by hvmloader.
    This may happen if a device with a large enough memory region is passed
    through to the guest. At the moment, if this happens, then at some point
    in the future qemu will crash and the domain will hang. (qemu-traditional
    is fine.)

    It's too late in the release to do a proper fix, so we try to do damage
    control.

    hvmloader already has mechanisms to relocate memory to 64-bit space if it
    can't make a big enough MMIO hole. By default this is 2GiB; if we just
    refuse to make the hole bigger if it will overlap with guest memory, then
    the relocation will happen by default.

    v5:
     - Update comment to not refer to "this series".
    v4:
     - Wrap long line in libxl_dm.c
     - Fix comment
    v3:
     - Fix polarity of comparison
     - Move diagnostic messages to another patch
     - Tested with xen platform pci device hacked to have different BAR sizes
       {256MiB, 1GiB} x {qemu-xen, qemu-traditional} x various memory
       configurations
     - Add comment explaining why we default to "allow"
     - Remove cast to bool
    v2:
     - style fixes
     - fix and expand comment on the MMIO hole loop
     - use "%d" rather than "%s" -> (...)?"1":"0"
     - use bool instead of uint8_t
     - Move 64-bit bar relocate detection to another patch
     - Add more diagnostic messages

    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
    CC: Ian Campbell <ian.campbell@citrix.com>
    CC: Stefano Stabellini <stefano.stabellini@citrix.com>
    CC: Hanweidong <hanweidong@huawei.com>
    CC: Keir Fraser <keir@xen.org>
* libxl: add LIBXL_HAVE_<foo> for outstanding_pages and outstanding_memkbDario Faggioli2013-06-121-0/+18
    Commits d0782481 ("xl: export 'outstanding_pages' value from xcinfo") and
    bec8f17e ("xen: Remove the XENMEM_get_oustanding_pages and provide the
    data via xc_phys_info") added these two fields in libxl_physinfo and in
    libxl_dominfo, respectively, but did not include the needed
    LIBXL_HAVE_<foo> runes. Adding them.

    Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
    Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
* tools/libxl: fix array subscript has type 'char'Christoph Egger2013-05-311-1/+1
    Signed-off-by: Christoph Egger <chegger@amazon.de>
* libxl: Remove qxl support for the 4.3 releaseGeorge Dunlap2013-05-304-32/+0
    The qxl drivers for Windows and Linux end up calling instructions that
    cannot be used for MMIO at the moment. Just for the 4.3 release, remove
    qxl support.

    This patch should be reverted as soon as the 4.4 development window opens.

    The issue in question:

    (XEN) emulate.c:88:d18 bad mmio size 16
    (XEN) io.c:201:d18 MMIO emulation failed @ 0033:7fd2de390430: f3 0f 6f 19 41 83 e8 403

    The instruction in question is "movdqu (%rcx),%xmm3". Xen knows how to
    emulate it, but unfortunately %xmm3 is 16 bytes long, and the interface
    between Xen and qemu at the moment would appear to only allow MMIO
    accesses of 8 bytes.

    It's too late in the release cycle to find a fix or a workaround.

    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: Fix qemu-xen command line for vcpus numbers.Anthony PERARD2013-05-301-2/+2
    On the qemu-xen command line, the number of vcpus initially online and the
    number of maximum available vcpus are inverted.

    Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: use Linux-compatible names for sse4 cpuid featuresGeorge Dunlap2013-05-301-0/+3
    Linux uses sse4_1 and sse4_2, but at the moment libxl uses '.' instead of
    '_'. This makes it confusing for people looking in Linux's /proc/cpuinfo
    to disable features.

    Add the Linux feature names, keeping the old ones for compatibility.

    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Campbell <ian.camppbell@citrix.com>
* xl: Return an error if an empty file is passed to cd-insertGeorge Dunlap2013-05-301-5/+24
    Two changes:
    * Stat the file before calling libxl_cdrom_insert()
    * Return an error if anything fails (including libxl_cdrom_insert)

    This is in part to work around the fact that the RAW disk type is used for
    things that aren't actually files; so we can't call stat in
    libxl_device.c:libxl__device_disk_set_backend() because it may be going
    over a remote protocol.

    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xl, e820_host, PV passthrough: Fix guests crashing when memory == maxmemKonrad Rzeszutek Wilk2013-05-301-1/+1
    The code had an obvious bug where it would assume that the balloon amount
    would always be _something_ and add an E820_RAM entry at the end of the
    E820 array. The added E820_RAM would contain the balloon amount plus the
    delta of memory that had to be subtracted because of the various E820
    entries.

    That assumption is certainly true when maxmem != mem, but if the guest
    config has maxmem = memory it is incorrect (as the balloon value is zero).
    The end result is that the E820 that is constructed is missing a swath of
    "delta" memory and in most cases ends up with only one E820_RAM entry that
    is of 512MB size on many Intel systems.

    Reported-by: Christian Holpert <christian@holpert.de>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: Make 'xl vcpu-set' work properly on overcommitted hosts with an override.Konrad Rzeszutek Wilk2013-05-142-7/+31
    The libxl_cpu_bitmap_alloc(..) function, if provided with a zero value for
    max CPUs, will call xc_get_max_cpus() which will retrieve the number of
    physical CPUs the host has. This is usually OK if the guest's maxvcpus <=
    host pcpus. But if the value is different, then the bitmap for VCPUs is
    limited by the number of CPUs the host has.

    This is incorrect, as what we want is to hotplug in the guest the number
    of CPUs that the user specified on the command line and not be limited by
    the number of physical CPUs.

    This means that a guest config like this:

        vcpus=8
        maxvcpus=32

    and on a 4 PCPU machine doing

        xl vcpu-set <guest name> 16

    won't work. This is because the size of the bitmap is one byte so it can
    only hold up to 8 VCPUs. Hence anything above that is going to be ignored.

    Note that this patch also fixes the bitmap setting - as it would set all
    of the bits allowed. Meaning if the user had a 4 PCPU host we would still
    allow the user to set 8 VCPUs. This second iteration of the patch fixes
    this.

    Note that all of the libxl_cpu_bitmap_[test|set] functions silently ignore
    any test or set above the bitmap's size:

        if (bit >= bitmap->size * 8) return 0;

    so we were never notified of this bug.

    This patch warns the user if they are trying to do this. If the user
    really wants to do this they have to provide the --ignore-host parameter
    to bypass this check.

    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
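The silent truncation described in this message can be illustrated with a small stand-alone bitmap. This is a sketch under assumptions: struct bitmap and the bitmap_* helpers are simplified, hypothetical stand-ins, not the libxl types or functions; only the bounds check mirrors the line quoted in the message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified bitmap: 'size' is the number of bytes in 'map'. */
struct bitmap {
    uint32_t size;
    uint8_t *map;
};

/* Allocate a zeroed bitmap able to hold n_bits bits (n_bits > 0).
 * Returns 0 on success, -1 on allocation failure. */
static int bitmap_alloc(struct bitmap *b, unsigned n_bits)
{
    assert(n_bits > 0);
    b->size = (n_bits + 7) / 8;
    b->map = calloc(b->size, 1);
    return b->map ? 0 : -1;
}

static int bitmap_test(const struct bitmap *b, unsigned bit)
{
    if (bit >= b->size * 8)
        return 0;               /* out of range: silently "not set" */
    return (b->map[bit / 8] >> (bit % 8)) & 1;
}

static void bitmap_set(struct bitmap *b, unsigned bit)
{
    if (bit >= b->size * 8)
        return;                 /* out of range: silently dropped */
    b->map[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Try to set bits 0 .. n-1 and count how many were actually stored. */
static unsigned set_first_n(struct bitmap *b, unsigned n)
{
    unsigned i, stored = 0;

    for (i = 0; i < n; i++) {
        bitmap_set(b, i);
        stored += (unsigned)bitmap_test(b, i);
    }
    return stored;
}
```

A bitmap sized for 4 physical CPUs occupies one byte, so a request for 16 VCPUs stores only 8 bits and reports no error, which matches the behaviour the message describes. Sizing the bitmap from the requested VCPU count instead keeps all 16.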
* libxl: claim: Print the values in 'xl info' unconditionallyKonrad Rzeszutek Wilk2013-05-141-6/+1
    During the review of "libxl: Change claim_mode from bool to int." Ian
    Campbell suggested that xl info should print the claim information
    regardless of the global claim_mode value.

    Suggested-by: Ian Campbell <Ian.Campbell@citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
* libxl: Change claim_mode from bool to int.Konrad Rzeszutek Wilk2013-05-143-7/+7
    During the review it was noticed that it would be better if internally the
    claim_mode was held as an 'int' instead of a 'bool'. The reason is that
    during the startup of xl, one has to call libxl_defbool_setdefault;
    otherwise any usage of claim_mode would result in an assertion failure.

    The assert is due to the fact that using a defbool without any set value
    (either true or false) will cause it to hit an assertion. If we use an
    'int' we don't have to worry about it, and by default the value of zero
    will suffice for checks whether the claim is enabled or disabled.

    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
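The difference between the two representations can be sketched as follows. The tristate type and its helpers are hypothetical names chosen for illustration; they model the "assert when no value has been set" behaviour the message attributes to defbool, and are not the libxl definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state boolean: 0 means "no value chosen yet",
 * positive means true, negative means false. */
typedef struct { int val; } tristate;

static void tristate_set(tristate *t, bool b)
{
    t->val = b ? 1 : -1;
}

/* Apply a default only if no value has been chosen yet. */
static void tristate_setdefault(tristate *t, bool b)
{
    if (t->val == 0)
        tristate_set(t, b);
}

/* Reading an unset tri-state is a programming error and trips the
 * assertion; this is the failure mode the message wants to avoid. */
static bool tristate_val(tristate t)
{
    assert(t.val != 0);
    return t.val > 0;
}

/* The plain-int alternative: a zero-initialised value already means
 * "disabled", so no setdefault call is needed before reading it. */
static bool claim_enabled(int claim_mode)
{
    return claim_mode != 0;
}
```

With the tri-state, every code path has to call the setdefault helper before the first read; with the int, a zero-initialised global can be read immediately.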
* hypervisor/xen/tools: Remove the XENMEM_get_oustanding_pages and provide the data via xc_phys_infoKonrad Rzeszutek Wilk2013-05-144-28/+5
    During the review of the patches it was noticed that there exists a race
    wherein the 'free_memory' value consists of information from two
    hypercalls. That is the XEN_SYSCTL_physinfo and
    XENMEM_get_outstanding_pages.

    The free memory the host has available for guest is the difference between
    the 'free_pages' (from XEN_SYSCTL_physinfo) and 'outstanding_pages'. As
    they are two hypercalls many things can happen in between the execution of
    them.

    This patch resolves this by eliminating the XENMEM_get_outstanding_pages
    hypercall and providing the free_pages and outstanding_pages information
    via the xc_phys_info structure.

    It also removes the XSM hooks and adds locking as needed.

    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
    Reviewed-by: Tim Deegan <tim@xen.org>
    Acked-by: Keir Fraser <keir.xen@gmail.com>
* docs: Change cd-insert docs to match behaviorGeorge Dunlap2013-05-101-1/+1
    xl cd-insert takes a plain file.

    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: don't write physical-device node for driver domain disksRoger Pau Monne2013-05-081-1/+2
    This will be handled by the driver domain itself, since the toolstack does
    not have access to the physical device because it is in a different
    domain.

    Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* libxl: don't execute hotplug scripts if device is on a driver domainRoger Pau Monne2013-05-081-0/+7
    Prevent hotplug script execution from libxl if device is on a different
    domain.

    Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* libxl: correctly parse storage devices on driver domainsRoger Pau Monne2013-05-081-0/+6
    Don't try to check physical devices if they belong to a domain different
    than the one where the toolstack is running. This prevents the following
    error when trying to use storage driver domains:

    libxl: debug: libxl_create.c:1246:do_domain_create: ao 0x1819240: create: how=(nil) callback=(nil) poller=0x1818fa0
    libxl: debug: libxl_device.c:235:libxl__device_disk_set_backend: Disk vdev=xvda spec.backend=phy
    libxl: debug: libxl_device.c:175:disk_try_backend: Disk vdev=xvda, backend phy unsuitable as phys path not a block device
    libxl: error: libxl_device.c:278:libxl__device_disk_set_backend: no suitable backend for disk xvda

    Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* tools: Bump some library sonamesIan Jackson2013-05-071-3/+3
    libxc (libxenctrl, libxenguest):
      New claim_enabled field in struct xc_dom_image;
      New nr_outstanding_pages field in struct xc_dominfo;
      New fields in struct xc_hvm_build_args (xenguest.h).

    libxl:
      new fields in dominfo domain_build_info device_vfb device_vkb
      device_disk etc. etc. etc.

    libxlu:
      #includes libxl headers so needs to inherit its new soname

    Use Xen version for new sonames since we don't in fact guarantee ABI (as
    opposed to API) stability across releases.

    xenstore (libxenstore):
      New flag XS_UNWATCH_FILTER, so bump minor version only.

    This was the result of reviewing the output from:
      git-checkout staging
      cd tools
      git-diff RELEASE-4.2.2 `find -name \*.h`

    Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: Don't use tapdisk for cd-romsGeorge Dunlap2013-05-021-0/+6
    blktap does not support the insert / eject commands, and so is not
    suitable for cd-roms.

    This fixes the bug where libxl uses tapdisk as a cdrom back-end, causing
    subsequent eject / insert commands to fail.

    Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
    CC: Fabio Fantoni <fabio.fantoni@heliman.it>
    CC: Stefano Stabellini <stefano.stabellini@citrix.com>
    CC: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* libxl: unconst the event argument to the event_occurs hook.Ian Campbell2013-05-012-3/+17
    The event is supposed to become owned, and therefore freed, by the
    application and the const prevents this.

    Unfortunately there is no way to remove the const without breaking
    existing callers. The best we can do is use the LIBXL_API_VERSION
    provisions to remove the const for callers who wish only to support the
    4.3 API and newer.

    Callers who wish to support 4.2 will need to live with casting away the
    const.

    Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
    Reviewed-by: Jim Fehlig <jfehlig@suse.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* libxl: adjust point of backend name resolutionEric Shelton2013-05-011-4/+7
    Resolution of a backend name to a domid needs to happen a little earlier
    in some cases.

    For example, if a domU is specified as a backend for a disk and, as
    previously written, libxl__device_disk_setdefault() calls
    libxl__resolve_domid() last, then disk->backend_domid still equals
    LIBXL_TOOLSTACK_DOMID when libxl__device_disk_set_backend() is called.
    This results in libxl__device_disk_set_backend() making an incorrect
    attempt to validate the target by calling stat() on a file on dom0,
    resulting in ERROR_INVAL (see libxl_device.c lines 239-248), which
    prevents creation of the frontend domain.

    Likewise, libxl__device_nic_setdefault() previously made use of
    nic->backend_domid before it was set.

    Signed-off-by: Eric Shelton <eshelton@pobox.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Reviewed-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
* libxl: fix spelling of "backend-id" for vtpmMarek Marczykowski2013-04-301-1/+1
    Signed-off-by: Marek Marczykowski <marmarek@invisiblethingslab.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: stat the path for all non-qdisk backends (including unknown)Ian Campbell2013-04-261-1/+2
    The commit a8a1f236a296 "libxl: Only call stat() when adding a disk if we
    expect a device to exist." changed things to only stat the file when the
    phy backend was explicitly requested. This broke the case where we are
    probing and would normally be able to decide on the phy option.

    Since the intention of that commit was to allow for backends with no
    explicit file in dom0 (i.e. network remote backend such as ceph) the
    lowest impact fix appears to be to make that explicit.

    It turns out that tap disk can also potentially handle such paths. The
    only backend which requires a local file/device is PHY but we need to
    handle UNKNOWN too in order for subsequent probing to work. Note that it
    is not possible to autoprobe the backend if the path is not a local
    object, so we don't need to worry about autoprobing ceph etc.

    This should probably be revisited to rationalize the probing.

    Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: Roger Pau Monné <roger.pau@citrix.com>
* libxl: write IO ABI for disk frontendsWei Liu2013-04-261-0/+23
    This is a patch to forward-port a Xend behaviour. Xend writes IO ABI used
    for all frontends. Blkfront before 2.6.26 relies on this behaviour
    otherwise guest cannot boot when running in 32-on-64 mode. Blkfront after
    2.6.26 writes that node itself, in which case it's just an overwrite to an
    existing node which should be OK.

    In fact Xend writes the ABI for all frontends including console and vif.
    But nowadays only old disk frontends rely on that behaviour so that we
    only write the ABI for disk frontends in libxl, minimizing the impact.

    Signed-off-by: Wei Liu <wei.liu2@citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: Only call stat() when adding a disk if we expect a device to exist.David Scott2013-04-241-1/+3
    We consider calling stat() a helpful error check in the following
    circumstances only:

    1. the disk backend type must be PHYsical
    2. the disk backend domain must be the same as the running libxl code
       (ie LIBXL_TOOLSTACK_DOMID)
    3. there must not be a hotplug script because this would imply that the
       device won't be created until after the hotplug script has run.

    With this fix, it is possible to use qemu's built-in block drivers such as
    ceph/rbd, with a xl config disk spec like this:

    disk=[ 'backendtype=qdisk,format=raw,vdev=hda,access=rw,target=rbd:rbd/ubuntu1204.img' ]

    Signed-off-by: David Scott <dave.scott@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: Roger Pau Monné <roger.pau@citrix.com>
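The three conditions listed in this message amount to a single predicate. The sketch below is illustrative only: struct disk, enum backend, TOOLSTACK_DOMID and should_stat() are hypothetical names standing in for the corresponding libxl structures and constants, and the value 0 for the toolstack domain is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a disk description. */
enum backend { BACKEND_UNKNOWN, BACKEND_PHY, BACKEND_TAP, BACKEND_QDISK };

struct disk {
    enum backend backend;
    unsigned backend_domid;
    const char *script;        /* hotplug script path, or NULL if none */
};

/* Assumed stand-in for LIBXL_TOOLSTACK_DOMID. */
#define TOOLSTACK_DOMID 0u

/* stat() is only a useful check when all three conditions hold:
 * 1. the backend is PHY,
 * 2. the backend lives in the toolstack's own domain,
 * 3. no hotplug script will create the device later. */
static bool should_stat(const struct disk *d)
{
    assert(d != NULL);
    return d->backend == BACKEND_PHY
        && d->backend_domid == TOOLSTACK_DOMID
        && d->script == NULL;
}
```

Under this predicate a qdisk target such as the rbd example above is never passed to stat(), so a path that does not exist in dom0 no longer causes an error.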
* x86: fix various issues with handling guest IRQsJan Beulich2013-04-181-5/+7
    - properly revoke IRQ access in map_domain_pirq() error path
    - don't permit replacing an in use IRQ
    - don't accept inputs in the GSI range for MAP_PIRQ_TYPE_MSI
    - track IRQ access permission in host IRQ terms, not guest IRQ ones
      (and with that, also disallow Dom0 access to IRQ0)

    This is CVE-2013-1919 / XSA-46.

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* libxl: properly initialize device structuresDaniel De Graaf2013-04-171-3/+5
    This avoids returning unallocated memory in the libxl_device_vtpm
    structure in libxl_device_vtpm_list, and uses libxl_device_nic_init
    instead of memset when initializing libxl_device_nics.

    Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: postpone backend name resolutionDaniel De Graaf2013-04-179-330/+347
    This adds a backend_domname field in libxl devices that contain a
    backend_domid field, allowing either a domid or a domain name to be
    specified in the configuration structures. The domain name is resolved
    into a domain ID in the _setdefault function when adding the device. This
    change allows the backend of the block devices to be specified (which
    previously required passing the libxl_ctx down into the block device
    parser), and will simplify specification of backend domains in other users
    of libxl.

    The check on run_hotplug_scripts in parse_config_data is removed because
    it is a duplicate of the one in libxl__device_nic_setdefault, and is
    removed here because it no longer has the resolved domain ID to check.

    Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    [ ijc -- reran flex ]
* xl: add node-affinity to the output of `xl list`Dario Faggioli2013-04-172-63/+105
    Node-affinity is now something that is under (some) control of the user,
    so show it upon request as part of the output of `xl list` by the `-n`
    option.

    Re the patch, the print_bitmap() related hunk is _mostly_ code motion,
    although there is a very minor change in the code, basically to allow
    using the function for printing both cpu and node bitmaps (as, in case all
    bits are set, it used to print "any cpu", which doesn't fit the nodemap
    case).

    Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
    Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: automatic placement deals with node-affinityDario Faggioli2013-04-172-17/+33
    Which basically means the following two things:
     1) during domain creation, it is the node-affinity of the domain --rather
        than the vcpu-affinities of its VCPUs-- that is affected by automatic
        placement;
     2) during automatic placement, when counting how many VCPUs are already
        "bound" to a placement candidate (as part of the process of choosing
        the best candidate), both vcpu-affinity and node-affinity are
        considered.

    Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
* libxl: optimize the calculation of how many VCPUs can run on a candidateDario Faggioli2013-04-171-22/+48
    For choosing the best NUMA placement candidate, we need to figure out how
    many VCPUs are runnable on each of them. That requires going through all
    the VCPUs of all the domains and checking their affinities.

    With this change, instead of doing the above for each candidate, we do it
    once for all, populating an array while counting. This way, when we later
    are evaluating candidates, all we need is summing up the right elements of
    the array itself.

    This reduces the complexity of the overall algorithm, as it moves a
    potentially expensive operation (for_each_vcpu_of_each_domain {}) outside
    of the core placement loop, so that it is performed only once instead of
    (potentially) tens or hundreds of times. More specifically, we go from a
    worst case computation time complexity of:

     O(2^n_nodes) * O(n_domains*n_domain_vcpus)

    To, with this change:

     O(n_domains*n_domain_vcpus) + O(2^n_nodes) = O(2^n_nodes)

    (with n_nodes<=16, otherwise the algorithm suggests partitioning with
    cpupools and does not even start.)

    Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
    Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
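The "count once, then sum per candidate" idea in this message can be sketched as below. The representation is an assumption made for illustration: each VCPU is described by a bitmask of the nodes it may run on, and count_vcpus_per_node() and candidate_vcpus() are hypothetical helper names, not the libxl functions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16   /* the message caps n_nodes at 16 */

/* One pass over all VCPUs: counts[i] becomes the number of VCPUs that
 * may run on node i. vcpu_nodes[v] is a bitmask with bit i set when
 * VCPU v may run on node i (hypothetical input format). */
static void count_vcpus_per_node(const unsigned *vcpu_nodes, size_t n_vcpus,
                                 unsigned counts[MAX_NODES])
{
    size_t v;
    int node;

    assert(vcpu_nodes != NULL || n_vcpus == 0);
    for (node = 0; node < MAX_NODES; node++)
        counts[node] = 0;
    for (v = 0; v < n_vcpus; v++)
        for (node = 0; node < MAX_NODES; node++)
            if ((vcpu_nodes[v] >> node) & 1u)
                counts[node]++;
}

/* Per candidate: sum the precomputed elements for the nodes in the
 * candidate's bitmask. No walk over domains or VCPUs is needed here. */
static unsigned candidate_vcpus(unsigned candidate,
                                const unsigned counts[MAX_NODES])
{
    unsigned total = 0;
    int node;

    for (node = 0; node < MAX_NODES; node++)
        if ((candidate >> node) & 1u)
            total += counts[node];
    return total;
}
```

In this sketch a VCPU that may run on several nodes of a candidate contributes once per node, since the candidate total is simply the sum of array elements. The expensive walk over VCPUs happens once, and each of the up to 2^n_nodes candidates then costs at most MAX_NODES additions.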
* libxl: allow for explicitly specifying node-affinityDario Faggioli2013-04-175-0/+39
    By introducing a nodemap in libxl_domain_build_info and providing the
    get/set methods to deal with it.

    Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
    Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
    Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
* xl: Fix 'free_memory' to include outstanding_claims value.Konrad Rzeszutek Wilk2013-04-162-30/+19
    Updating to make it clear that free_memory reported by 'xl info' is
    influenced by the outstanding claim value. That is the free memory that
    will be available to the host once all outstanding claims have been
    completed.

    This modifies the behavior that the patch titled "xl: 'xl info' print
    outstanding claims if enabled (claim_mode=1 in xl.conf)" had - which
    reported the outstanding claims and nothing else.

    The free_pages as reported by the hypervisor is the currently available
    count of pages on the heap. The outstanding pages is the total amount of
    pages reserved for guests (so not taken from the heap yet). As guests are
    being populated the memory from the heap shrinks and the outstanding count
    of pages decreases. The total memory used for guests increases.

    As the available count of pages on the heap and outstanding claims are
    intertwined, report the amount of free memory available to be a
    combination of that. That is free heap memory minus the outstanding pages.

    We also make some odd choices in reporting. By default we will only
    display 'outstanding_claims' if the claim_mode is enabled in the global
    configuration file. However, if there are outstanding claims, we will
    ignore the claim_mode and report these values.

    Suggested-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
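The reporting rules in this message reduce to one subtraction and one condition. The following is a sketch under assumptions: struct physinfo and both helper names are hypothetical, and the clamp to zero is a defensive choice of this example rather than something the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the physical-info fields discussed above. */
struct physinfo {
    uint64_t free_pages;         /* pages currently on the heap */
    uint64_t outstanding_pages;  /* pages claimed but not yet allocated */
};

/* Free memory as reported: heap pages minus outstanding claims.
 * Clamping at zero is an assumption of this sketch, added only to
 * avoid unsigned wrap-around on inconsistent input. */
static uint64_t reported_free_pages(const struct physinfo *p)
{
    assert(p != NULL);
    if (p->outstanding_pages > p->free_pages)
        return 0;
    return p->free_pages - p->outstanding_pages;
}

/* Show the outstanding claims when claim_mode is enabled, or whenever
 * there are claims outstanding regardless of claim_mode. */
static bool show_outstanding_claims(int claim_mode, const struct physinfo *p)
{
    assert(p != NULL);
    return claim_mode != 0 || p->outstanding_pages != 0;
}
```

For example, with 1000 free heap pages and 250 pages claimed, the reported free figure is 750 pages, and the claim line is shown even when claim_mode is 0.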
* xl: 'xl claims' print outstanding per domain claims  Konrad Rzeszutek Wilk  2013-04-16  4  -6/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is similar to "xl: 'xl info' print outstanding claims if enabled (claim_mode=1 in xl.conf)" which exposes the global claim value. This patch provides the value of the currently outstanding pages claimed for each domain. This is a per-domain value which is added to the global claim value which influences the hypervisor's MM system. When a claim call is done, a reservation for a specific amount of pages is set (and this patch lists said number) and also a global value is incremented. This global value is then reduced as the domain's memory is populated and eventually reaches zero. The toolstack (libxc) also sets the domain's claim to zero when the population of memory has completed as an extra step. Any call to destroy the domain will also set the domain's claim to zero. If the reservation cannot be met the guest creation fails immediately instead of taking seconds or minutes (depending on the size of the guest) while the toolstack populates memory. See patch: "xl: Implement XENMEM_claim_pages support via 'claim_mode' global config" for details on how it is implemented. The value fluctuates quite often so the value is stale once it is provided to the user-space. However it is useful for diagnostic purposes. It is printed regardless of the global "claim_mode" option in xl.conf(5). That is because the user might have enabled the option, launched a guest, and then disabled the option - and we should still report the correct outstanding claim value. 'man xl' shows the details of this argument. The output is close to what 'xl list' looks like:
Name         ID   Mem VCPUs  State   Time(s)  Claimed
Domain-0      0  2047     4  r-----     19.7        0
OL5           2  2048     1  --p---      0.0      847
OL6           3  1024     4  r-----      5.9        0
Windows_XP    4  2047     1  --p---      0.0     1989
[In which it can be seen that the OL5 guest still has 847MB of claimed memory (out of the total 2048MB where 1191MB has been allocated to the guest).]
Please note that the 'Mem' column shows the sum of the outstanding claims and the memory that has already been allocated to the guest. [v1: claims, not claim-list] [v2: Add outstanding and current memkb in the output list] [v3: Clarify docs and relax some checks] [v4: Removed comments about guest config memory being the same as 'Mem'] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xl: export 'outstanding_pages' value from xcinfo  Konrad Rzeszutek Wilk  2013-04-16  2  -0/+2
| | | | | | | | | | | | | | | | | | This patch provides the value of the currently outstanding pages claimed for a specific domain. This is a value that influences the global outstanding claims value (See patch: "xl: 'xl info' print outstanding claims if enabled") returned via xc_domain_get_outstanding_pages hypercall. This domain value decrements as the memory is populated for the guest and eventually reaches zero. With this patch it is possible to utilize this field. Acked-by: Ian Campbell <ian.campbell@citrix.com> [v2: s/unclaimed/outstanding/ per Tim's suggestion] [v3: Don't use SXP printout file per Ian's suggestion] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* xl: 'xl info' print outstanding claims if enabled (claim_mode=1 in xl.conf)  Konrad Rzeszutek Wilk  2013-04-16  3  -0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides the value of the currently outstanding pages claimed for all domains. This is a total global value that influences the hypervisor's MM system. When a claim call is done, a reservation for a specific amount of pages is set and also a global value is incremented. This global value is then reduced as the domain's memory is populated and eventually reaches zero. The toolstack (libxc) also sets the domain's claim to zero when the population of memory has completed as an extra step. Any call to destroy the domain will also set the domain's claim to zero. If the reservation cannot be met the guest creation fails immediately instead of taking seconds or minutes (depending on the size of the guest) while the toolstack populates memory. See patch: "xl: Implement XENMEM_claim_pages support via 'claim_mode' global config" for details on how it is implemented. The value fluctuates quite often so the value is stale once it is provided to the user-space. However it is useful for diagnostic purposes. It is only printed when the global "claim_mode" option in xl.conf(5) is set to enabled (1). 'man xl' shows the details of this item. [v1: s/unclaimed/outstanding/] [v2: Made libxl_get_claiminfo return just MemKB suggested by Ian Campbell] [v3: Made libxl_get_claiminfo return MemMB to conform to the other values printed] [v4: Improvements suggested by Ian Jackson, also added docs to xl.pod.1] [v5: Clarify how claims are cancelled, split >72 characters - Ian Jackson] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
* xl: Implement XENMEM_claim_pages support via 'claim_mode' global config  Konrad Rzeszutek Wilk  2013-04-16  7  -3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The XENMEM_claim_pages hypercall operates per domain and it should be used system wide. As such this patch introduces a global configuration option 'claim_mode' that by default is disabled. If this option is enabled then when a guest is created there will be a guarantee that there is memory available for the guest. This is a particularly acute problem on hosts with memory over-provisioned guests that use tmem and have self-balloon enabled (which is the default option for them). The self-balloon mechanism can deflate/inflate the balloon quickly and the amount of free memory (which 'xl info' can show) is stale the moment it is printed. When claim is enabled a reservation for the amount of memory ('memory' in guest config) is set, which is then reduced as the domain's memory is populated and eventually reaches zero. If the reservation cannot be met the guest creation fails immediately instead of taking seconds/minutes (depending on the size of the guest) while the guest is populated. Note that to enable tmem type guests, one needs to provide 'tmem' on the Xen hypervisor command line as well as on the Linux kernel command line. There are two boolean options: (0) No claim is made. Memory population during guest creation will be attempted as normal and may fail due to memory exhaustion. (1) Normal memory and freeable pool of ephemeral pages (tmem) is used when calculating whether there is enough memory free to launch a guest. This gives immediate feedback on whether the guest can be launched, instead of a late failure due to memory exhaustion (which can take a long time to find out when launching massively huge guests, or guests in parallel).
[v1: Removed own claim_mode type, using just bool, improved docs, all per Ian's suggestion] [v2: Updated the comments] [v3: Rebase on top 733b9c524dbc2bec318bfc3588ed1652455d30ec (xl: add vif.default.script)] [v4: Fixed up comments] [v5: s/global_claim_mode/claim_mode/] [v6: Ian Jackson's feedback: use libxl_defbool, better comments, etc] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
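The claim lifecycle described in this commit (reserve up front, draw the reservation down as memory is populated, fail immediately if the reservation cannot be met) can be modelled with a toy accounting class. This is only a sketch under stated assumptions: the `Host` class and its methods are hypothetical, memory is counted in abstract pages, and the tmem freeable pool mentioned for option (1) is left out.

```python
class Host:
    """Toy model of heap pages plus outstanding claims (hypothetical)."""

    def __init__(self, free_pages):
        self.free_pages = free_pages   # pages currently on the heap
        self.outstanding = 0           # global sum of all claims
        self.claims = {}               # per-domain outstanding claim

    def claim(self, domid, pages):
        """Reserve pages up front; fail at once if they are not there."""
        if pages > self.free_pages - self.outstanding:
            raise MemoryError("claim cannot be met")
        self.claims[domid] = pages
        self.outstanding += pages

    def populate(self, domid, pages):
        """Allocate pages to a domain, consuming its own claim first."""
        own = self.claims.get(domid, 0)
        # Pages reserved by other domains are not available to this one.
        available = self.free_pages - (self.outstanding - own)
        if pages > available:
            raise MemoryError("out of memory during population")
        self.free_pages -= pages
        used = min(pages, own)
        self.claims[domid] = own - used
        self.outstanding -= used

    def release(self, domid):
        """Drop whatever claim remains (population done or domain destroyed)."""
        self.outstanding -= self.claims.pop(domid, 0)


host = Host(free_pages=100)
host.claim(1, 60)            # the global outstanding value rises to 60
host.populate(1, 40)         # claim and global value shrink as memory lands
host.populate(1, 20)         # claim reaches zero
host.release(1)
print(host.free_pages, host.outstanding)
```

With the claim taken first, a second guest asking for more than the unreserved remainder is refused in `claim` right away, rather than part-way through `populate`.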
* libxl: beautify json with YAJL2  M A Young  2013-04-12  1  -1/+5
| | | | | | | | xl list -l should produce readable output when built with yajl2 so it is compatible with the xendomains script. Signed-off-by: Michael Young <m.a.young@durham.ac.uk> Acked-by: Ian Campbell <ian.campbell@citrix.com>