|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
"libxc: provide notification of final checkpoint to restore end"
broke migration from any version of Xen using tools from prior to that commit
Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
tools xc_domain_restore() to start reading the qemu save record, as
ctx->last_checkpoint is 0.
The failure looks like:
xc: error: Max batch size exceeded (1970103633). Giving up.
where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
With this fix in place, the behaviour for normal migrations is reverted to how
it was before the regression; the migration is considered non-checkpointed
right from the start. A XC_SAVE_ID_LAST_CHECKPOINT chunk seen in the
migration stream is a nop. For checkpointed migrations the behaviour is
unchanged.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> (Remus bits)
|
|
libxenctrl expects to be able to simply run the save or restore
operation synchronously. This won't work well in a process which is
trying to handle multiple domains.
The options are:
- Block such a whole process (eg, the whole of libvirt) while
migration completes (or until it fails).
- Create a thread to run xc_domain_save and xc_domain_restore on.
This is quite unpalatable. Multithreaded programming is error
prone enough without generating threads in libraries, particularly
if the thread does some very complex operation.
- Fork and run the operation in the child without execing. This is
no good because we would need to negotiate with the caller about
fds we would inherit (and we might be a very large process).
- Fork and exec a helper.
Of these options the latter is the most palatable.
Consequently:
* A new helper program libxl-save-helper (which does both save and
restore). It will be installed in /usr/lib/xen/bin. It does not
link against libxl, only libxc, and its error handling does not
need to be very advanced. It does contain a plumbing through of
the logging interface into the callback stream.
* A small ad-hoc protocol between the helper and libxl which allows
log messages and the libxc callbacks to be passed up and down.
Protocol doc comment is in libxl_save_helper.c.
* To avoid a lot of tedium the marshalling boilerplate (stubs for the
helper and the callback decoder for libxl) is generated with a
small perl script.
* Implement new functionality to spawn the helper, monitor its
output, provide responses, and check on its exit status.
* The functions libxl__xc_domain_restore_done and
libxl__xc_domain_save_done now turn out to want be called in the
same place. So make their state argument a void* so that the two
functions are type compatible.
The domain save path still writes the qemu savefile synchronously.
This will need to be fixed in a subsequent patch.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
|