CP-308455 VM.sysprep wait for shutdown #6587
The timeout (in seconds) we wait for the current domain to shut down before we return. This is supplied by the user; for the XE CLI we can provide a default. Signed-off-by: Christian Lindig <[email protected]>
An easy way to wait for a domain to shut down is to watch for its xenstore tree to disappear. Wait for this after triggering sysprep via the guest agent. Signed-off-by: Christian Lindig <[email protected]>
Explain the timeout parameter. Signed-off-by: Christian Lindig <[email protected]>
ocaml/xapi/vm_sysprep.ml
Outdated
) ;
"running"
with Watch.Timeout _ -> xs.Xs.read (control // "action")
debug "%s: sysprep is runnung; waiting for shutdown" __FUNCTION__ ;
Perhaps we could also wait for the key "/control/sysprep/action" to disappear first: I think then we know sysprep stopped running, so we can eject the CD.
The VM will then reboot soon after that. That key will also disappear on a reboot.
We could have different timeouts for the two: sysprep itself can take a long time to run, so its timeout could be a few hours, depending on how much work unattend.xml has given it.
OTOH once sysprep finished we expect the reboot to happen quickly, so we can have a shorter timeout on the order of minutes.
This would be useful for debugging how much progress the guest makes (we could set the progress field of the task at various points, though for now debug messages would probably be sufficient).
For simplicity one timeout is probably sufficient: we let sysprep run for as long as it wishes (we don't have a way to cancel it anyway), i.e. the first Watch.wait_for..key_to_disappear wouldn't have a timeout.
The 2nd timeout would be as it is here; a timeout for the reboot.
Then the 'wait for sysprep to finish running' can be nicely added as a separate commit on top of this one, something like:
Watch.(wait_for ~xs (key_to_disappear (control // "action")))
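The two-stage wait suggested above can be sketched with a toy model. This is only an illustration under stated assumptions: an in-memory table stands in for xenstore, a step counter stands in for wall-clock timeouts, and the names `wait_for_key_to_disappear` and `simulate` as well as the domain ID in the path are hypothetical. It does not use or reproduce xapi's `Watch` module.

```ocaml
(* Toy model only: a table from path to value stands in for xenstore.
   All names here are hypothetical, not xapi's Watch API. *)
exception Timeout

type store = (string, string) Hashtbl.t

(* Check [key] repeatedly until it is gone. [step] advances the simulated
   guest by one tick. Returns the number of ticks waited, or raises
   [Timeout] once [max_steps] ticks have passed with the key still present. *)
let wait_for_key_to_disappear ~(store : store) ~max_steps ~step key =
  let rec loop n =
    if not (Hashtbl.mem store key) then n
    else if n >= max_steps then raise Timeout
    else ( step () ; loop (n + 1) )
  in
  loop 0

(* Simulated guest: sysprep finishes at tick 3, the domain goes away at
   tick 5. The first wait gets a generous budget, the second a short one. *)
let simulate () =
  let store : store = Hashtbl.create 8 in
  let domain = "/local/domain/7" in
  let action = domain ^ "/control/sysprep/action" in
  Hashtbl.replace store domain "" ;
  Hashtbl.replace store action "running" ;
  let tick = ref 0 in
  let step () =
    incr tick ;
    if !tick = 3 then Hashtbl.remove store action ;
    if !tick = 5 then Hashtbl.remove store domain
  in
  let sysprep_ticks =
    wait_for_key_to_disappear ~store ~max_steps:100 ~step action
  in
  let shutdown_ticks =
    wait_for_key_to_disappear ~store ~max_steps:10 ~step domain
  in
  (sysprep_ticks, shutdown_ticks)
```

In this model the first wait reports how long sysprep ran and the second how long the shutdown took afterwards, which is the kind of progress information the earlier comment asks for.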
I don't think we want to run anything without a timeout: we want this API call to fail to signal that something is wrong, and it is more difficult to cancel an API call from the outside while making sure we are not leaking any resources.
Would be nice if we could wait for either task cancelation or the xenstore key to disappear, but I don't think we currently have a mechanism in XAPI that could easily support that.
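To make the idea concrete, here is a minimal sketch of a wait that ends on whichever happens first: task cancellation, the key disappearing, or the step budget running out. It is a toy polling loop with hypothetical names (`outcome`, `wait_either`, `demo`) and plain callbacks; it is not an existing XAPI mechanism and says nothing about how one would be implemented there.

```ocaml
(* Hypothetical sketch: not an existing XAPI mechanism. *)
type outcome = Key_gone | Cancelled | Timed_out

(* [key_present] and [cancelled] are callbacks supplied by the caller;
   [step] advances simulated time by one tick. Cancellation is checked
   first so that it wins if both conditions hold on the same tick. *)
let wait_either ~key_present ~cancelled ~max_steps ~step =
  let rec loop n =
    if cancelled () then Cancelled
    else if not (key_present ()) then Key_gone
    else if n >= max_steps then Timed_out
    else ( step () ; loop (n + 1) )
  in
  loop 0

(* Example: the task is cancelled at tick 2 while the key is still there. *)
let demo () =
  let present = ref true and cancel = ref false and tick = ref 0 in
  let step () =
    incr tick ;
    if !tick = 2 then cancel := true
  in
  wait_either
    ~key_present:(fun () -> !present)
    ~cancelled:(fun () -> !cancel)
    ~max_steps:10 ~step
```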
We do have support to wait for all/any xenstore events. I don't see the value of waiting for "running" to disappear if we ultimately want to wait for the domain to disappear which implies that "running" disappears as well.
Monitor the VM's reaction more carefully: wait for sysprep to terminate. Signed-off-by: Christian Lindig <[email protected]>
__FUNCTION__ ;
Watch.(wait_for ~xs ~timeout (key_to_disappear (control // "action"))) ;
debug "%s sysprep is finished" __FUNCTION__ ;
Watch.(wait_for ~xs ~timeout (key_to_disappear domain)) ;
Is it possible that xapi sees only the final change (the domain disappearing) without seeing the earlier changes above? In other words, does xenstore guarantee that individual consecutive changes are never combined?
I would rule out sysprep terminating and the domain disappearing together. But that is an argument based on how the Windows domain works, not how xapi and xenstore work.
If the domain disappears then all keys underneath it disappear too, so the 'action' will always disappear on shutdown/reboot.
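The point about subtree removal can be illustrated with a small model. This is a sketch under assumptions: a flat table of paths stands in for the xenstore tree, and `is_under`, `remove_subtree`, and `demo` are hypothetical helpers, not xenstore or xapi functions. It only shows that a check for the absence of the `action` key succeeds once the domain's subtree is gone, even if the key was never removed on its own.

```ocaml
(* Toy model: xenstore as a flat table of absolute paths.
   All helper names are hypothetical. *)

(* [is_under ~root path] is true when [path] is [root] itself or lies
   below it; the trailing "/" keeps "/local/domain/70" out of
   "/local/domain/7". *)
let is_under ~root path =
  let r = root ^ "/" in
  path = root
  || (String.length path >= String.length r
     && String.sub path 0 (String.length r) = r)

(* Remove [root] and everything beneath it. Keys are collected first so
   the table is not modified while it is being traversed. *)
let remove_subtree store root =
  let doomed =
    Hashtbl.fold
      (fun k _ acc -> if is_under ~root k then k :: acc else acc)
      store []
  in
  List.iter (Hashtbl.remove store) doomed

(* Returns (action key still present, unrelated domain's key still present)
   after the domain's subtree has been removed. *)
let demo () =
  let store : (string, string) Hashtbl.t = Hashtbl.create 8 in
  let domain = "/local/domain/7" in
  let action = domain ^ "/control/sysprep/action" in
  let other = "/local/domain/70/name" in
  Hashtbl.replace store domain "" ;
  Hashtbl.replace store action "running" ;
  Hashtbl.replace store other "unrelated" ;
  remove_subtree store domain ;
  (Hashtbl.mem store action, Hashtbl.mem store other)
```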
Change in semantics: the API call now waits for the VM to shut down (by observing the xenstore tree being removed). This is guarded by a user-provided timeout.