aboutsummaryrefslogtreecommitdiffstats
path: root/tools/libxl/libxl_save_callout.c
Commit message (Collapse)AuthorAgeFilesLines
* tools/migrate: Fix regression when migrating from older version of XenAndrew Cooper2013-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010 "libxc: provide notification of final checkpoint to restore end" broke migration from any version of Xen using tools from prior to that commit Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer tools xc_domain_restore() to start reading the qemu save record, as ctx->last_checkpoint is 0. The failure looks like: xc: error: Max batch size exceeded (1970103633). Giving up. where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu" With this fix in place, the behaviour for normal migrations is reverted to how it was before the regression; the migration is considered non-checkpointed right from the start. A XC_SAVE_ID_LAST_CHECKPOINT chunk seen in the migration stream is a nop. For checkpointed migrations the behaviour is unchanged. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> (Remus bits)
* fix wrong path while calling pygrub and libxl-save-helperBamvor Jian Zhang2013-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | in current xen x86_64, the default libexec directory is /usr/lib/xen/bin, while the private binder is /usr/lib64/xen/bin. but some commands(pygrub, libxl-save-helper) located in private binder directory is called from libexec directory which lead to the following error: 1, for pygrub bootloader: libxl: debug: libxl_bootloader.c:429:bootloader_disk_attached_cb: /usr/lib/xen/bin/pygrub doesn't exist, falling back to config path 2, for libxl-save-helper: libxl: cannot execute /usr/lib/xen/bin/libxl-save-helper: No such file or directory libxl: error: libxl_utils.c:363:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 3 save/restore helper stdout pipe libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: domain 3 save/restore helper [10222] exited with error status 255 there are two ways to fix above error. the first one is make such command store in the /usr/lib/xen/bin and /usr/lib64/xen/bin(symbol link to previous), e.g. qemu-dm. The second way is using private binder dir instead of libexec dir. e.g. xenconsole. For these cases, the latter one is suitable. Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: wait for qemu to acknowledge logdirty commandIan Jackson2012-06-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | The current migration code in libxl instructs qemu to start or stop logdirty, but it does not wait for an acknowledgement from qemu before continuing. This might lead to memory corruption (!) Fix this by waiting for qemu to acknowledge the command. Unfortunately the necessary ao arrangements for waiting for this command are unique because qemu has a special protocol for this particular operation. Also, this change means that the switch_qemu_logdirty callback implementation in libxl can no longer synchronously produce its return value, as it now needs to wait for xenstore. So we tell the marshalling code generator that it is a message which does not need a reply. This turns the callback function called by the marshaller into one which returns void; the callback function arranges to later explicitly sends the reply to the helper, when the xs watch triggers and the appropriate value is read from xenstore. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: domain save/restore: run in a separate processIan Jackson2012-06-281-11/+350
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | libxenctrl expects to be able to simply run the save or restore operation synchronously. This won't work well in a process which is trying to handle multiple domains. The options are: - Block such a whole process (eg, the whole of libvirt) while migration completes (or until it fails). - Create a thread to run xc_domain_save and xc_domain_restore on. This is quite unpalatable. Multithreaded programming is error prone enough without generating threads in libraries, particularly if the thread does some very complex operation. - Fork and run the operation in the child without execing. This is no good because we would need to negotiate with the caller about fds we would inherit (and we might be a very large process). - Fork and exec a helper. Of these options the latter is the most palatable. Consequently: * A new helper program libxl-save-helper (which does both save and restore). It will be installed in /usr/lib/xen/bin. It does not link against libxl, only libxc, and its error handling does not need to be very advanced. It does contain a plumbing through of the logging interface into the callback stream. * A small ad-hoc protocol between the helper and libxl which allows log messages and the libxc callbacks to be passed up and down. Protocol doc comment is in libxl_save_helper.c. * To avoid a lot of tedium the marshalling boilerplate (stubs for the helper and the callback decoder for libxl) is generated with a small perl script. * Implement new functionality to spawn the helper, monitor its output, provide responses, and check on its exit status. * The functions libxl__xc_domain_restore_done and libxl__xc_domain_save_done now turn out to want be called in the same place. So make their state argument a void* so that the two functions are type compatible. The domain save path still writes the qemu savefile synchronously. This will need to be fixed in a subsequent patch. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: domain save: API changes for asynchronyIan Jackson2012-06-281-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the internal and external APIs for domain save (suspend) to be capable of asynchronous operation. The implementation remains synchronous. The interfaces surrounding device model saving are still synchronous. Public API changes: * libxl_domain_save takes an ao_how. * libxl_domain_remus_start takes an ao_how. If the libxl_domain_remus_info is NULL, we abort rather than returning an error. * The `suspend_callback' function passed to libxl_domain_save is never called by the existing implementation in libxl. Abolish it. * libxl_domain_save takes its flags parameter as an argument. Thus libxl_domain_suspend_info is abolished. * XL_SUSPEND_* flags renamed to LIBXL_SAVE_*. * Callers in xl updated. Internal code restructuring: * libxl__domain_suspend_state member types and names rationalised. * libxl__domain_suspend renamed from libxl__domain_suspend_common. (_common here actually meant "internal function"). * libxl__domain_suspend takes a libxl__domain_suspend_state, which where the parameters to the operation are filled in by the caller. * xc_domain_save is now called via libxl__xc_domain_save which can itself become asynchronous. * Consequently, libxl__domain_suspend is split into two functions at the callback boundary; the second half is libxl__xc_domain_save_done. * libxl__domain_save_device_model is now called by the actual implementation rather than by the public wrapper. It is already in its proper place in the domain save execution sequence. So officially make it part of that execution sequence, renaming it to domain_save_device_model. * Effectively, rewrite the public wrapper functions libxl_domain_suspend and libxl_domain_remus_start. * Remove a needless #include <xenctrl.h> * libxl__domain_suspend aborts on unexpected domain types rather than mysteriously returning EINVAL. * struct save_callbacks moved from the stack to the dss. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
* libxl: domain restore: reshuffle, preparing for aoIan Jackson2012-06-281-0/+37
We are going to arrange that libxl, instead of calling xc_domain_restore, calls a stub function which forks and execs a helper program, so that restore can be asynchronous rather than blocking the whole toolstack. This stub function will be called libxl__xc_domain_restore. However, its prospective call site is unsuitable for a function which needs to make a callback, and is buried in two nested single-call-site functions which are logically part of the domain creation procedure. So we first abolish those single-call-site functions, integrate their contents into domain creation in their proper temporal order, and break out libxl__xc_domain_restore ready for its reimplementation. No functional change - just the following reorganisation: * Abolish libxl__domain_restore_common, as it had only one caller. Move its contents into (what was) domain_restore. * There is a new stage function domcreate_rebuild_done containing what used to be the bulk of domcreate_bootloader_done, since domcreate_bootloader_done now simply starts the restore (or does the rebuild) and arranges to call the next stage. * Move the contents of domain_restore into its correct place in the domain creation sequence. We put it inside domcreate_bootloader_done, which now either calls libxl__xc_domain_restore which will call the new function domcreate_rebuild_done, or calls domcreate_rebuild_done directly. * Various general-purpose local variables (`i' etc.) and convenience alias variables need to be shuffled about accordingly. * Consequently libxl__toolstack_restore needs to gain external linkage as it is now in a different file to its user. * Move the xc_domain_save callbacks struct from the stack into libxl__domain_create_state. In general the moved code remains almost identical. Two returns in what used to be libxl__domain_restore_common have been changed to set the return value and "goto out", and the call sites for the abolished and new functions have been adjusted. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>