-
Notifications
You must be signed in to change notification settings - Fork 292
Sync master to feature/configure-ssh-phase3 #6598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
changlei-li
merged 148 commits into
xapi-project:feature/configure-ssh-phase3
from
LunfanZhang:private/luzhan/sync-master-to-feature
Jul 21, 2025
Merged
Sync master to feature/configure-ssh-phase3 #6598
changlei-li
merged 148 commits into
xapi-project:feature/configure-ssh-phase3
from
LunfanZhang:private/luzhan/sync-master-to-feature
Jul 21, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Simplifies the logic of `exec_with_context` by letting the caller decide when the task is destroyed from the database. Adds helper function in `context.ml` to destroy and trace the destroy op correctly. Signed-off-by: Gabriel Buica <[email protected]>
Signed-off-by: Gabriel Buica <[email protected]>
Dynamic.t is not used currently, just drop it. Signed-off-by: Andrii Sultanov <[email protected]>
There's already a 'localhost' in scope. Signed-off-by: Andrii Sultanov <[email protected]>
No functional change, this just removes several indentation levels from the 500+ lines of the function, making it easier to refactor in the future. This also clears up the logic of the function, now that two arms of if-else and try-with clauses are not 500+ lines apart, and avoids splitting some expressions and strings over several lines given that they reach the line character limit more often inside of several levels of indentation. Signed-off-by: Andrii Sultanov <[email protected]>
Best reviewed by commit, and also best viewed locally with a git client that distinguishes between moved and new lines, with a config like: ``` [diff] colormoved = "default" colormovedws = "allow-indentation-change" ``` No functional changes intended. ---- This is split out from a larger series, but I figured these cleanups are easier to review on their own and it would be better to get them in early to avoid annoying rebases of a thousand lines of code ....
events_from_xenopsd thread is critical as it sync up VM status in case of bootstorm, this thread is flood as lots of events comes from xenopsd waiting for process. During processing of the events, VM/VDI/VBD update_allowed_operations will be called to refresh the allowed operations. However, for each ops (start/suspend,etc) for the same object(VM), the object info is always the same no matter what the ops is. Thus, it is not necessary to query the object information over and over again. Disclosure is used to resovle the issue. Query once and the disclosure will just remember the query result. The performance test for starting 500 VM on 4 hosts improve around 10% performance for both XS8 and XS9 This commit just introduce disclosure and caller call the disclosure instead of the original function Signed-off-by: Lin Liu <[email protected]>
Define db ops into variables and keep them inside returned function for code review Signed-off-by: Lin Liu <[email protected]>
Move ops unrelated db operation outside of returned function Signed-off-by: Lin Liu <[email protected]>
This allows changing the type definition in Updates without modifying the types here as well in the future. Signed-off-by: Andrii Sultanov <[email protected]>
The only usage of it ignores values, so drop them altogether. This allows changing the type of the values in the future without modifying inject_barrier and its users in any way. Signed-off-by: Andrii Sultanov <[email protected]>
Drop the first member of the (Vm.t * Vm.state) tuple as it's never used. This removes the need for several 'snd info', 'Option.iter (fun (_, state) ->) info' constructs. Replace all the following constructs: ``` if different (fun x -> x.field) && predicate then Option.iter (fun state -> ...) info ``` With this: ``` different (fun x -> x.field) ((&&) predicate) (fun field -> ...); ``` It 1) removes the additional level of indentation inside Option.iter (fun state), hides the duplication of this construct 2) makes it obvious where the whole 'state' is accessed and not only the field that was checked to be different Signed-off-by: Andrii Sultanov <[email protected]>
…oject#6542) Simplifies the logic of `exec_with_context` by letting the caller decide when the task is destroyed from the database. Adds helper function in `context.ml` to destroy and trace the destroy op correctly.
Instruments `process`/`Consumers` spans of message-switch service in xenopsd. Signed-off-by: Gabriel Buica <[email protected]>
Intruments functions in `xapi_xenops.ml` that carry the traceparent/tracecontext through dbg. `Debug_info.with_dbg` now accepts setting up attributes for spans. Also, instruments `Events_from_xenopsd` to capture the event spans: `subscribe`/`settle`. It's nto straight forward to link them on the same trace. For now the only way they are connnected is having the same `message.id` attribute. Signed-off-by: Gabriel Buica <[email protected]>
Intruments the functions in `xapi_xenops.ml` that carry the traceparent/tracecontext through `context`. Signed-off-by: Gabriel Buica <[email protected]>
This function is called recursively and the context span is only closed on failure. This makes it hard to read the trace. Therefore, I am resetting the context tracing each recursion step and now each recursion step will have its own trace. Signed-off-by: Gabriel Buica <[email protected]>
Signed-off-by: Gabriel Buica <[email protected]>
…#6541) events_from_xenopsd thread is critical as it sync up VM status in case of bootstorm, this thread is flood as lots of events comes from xenopsd waiting for process. During processing of the events, VM/VDI/VBD update_allowed_operations will be called to refresh the allowed operations. However, for each ops (start/suspend,etc) for the same object(VM), the object info is always the same no matter what the ops is. Thus, it is not necessary to query the object information over and over again. Disclosure is used to resovle the issue. Query once and the disclosure will just remember the query result. The performance test for starting 500 VM on 4 hosts improve around 10% performance for both XS8 and XS9
1. WLB request will raise `wlb_authentication_failed` when catching `Http_client.Http_error`. But actually only error code 401 and 403 should raise this type exception. For other error code, raise `wlb_connection_reset`. Also print the detail error code and message. 2. `message_forwarding` raises same error for `Http_request_rejected` and `Connection_reset` so we don't know which exception actually be raised. Print detailed logs for these 2 exceptions. Signed-off-by: Bengang Yuan <[email protected]>
…xenops.ml` (xapi-project#6527) As promised, more tracing instrumentation around message switch in xenopsd. There is also some instrumentation of `Events_from_xenops` around waiting/waking up for events. Currently these are not straight forward linkable as they have different traces, e.g. one is on the vm op and the other in the xapi event watch thread. It is now possible to search for for "`subscribe`/`settle` <queue_name>" spans that have the same attribute value for "messaging.message.id". This should help debugging.  
"data/updated" is not read or used anywhere in xenopsd or xapi: * xapi_guest_agent's last_updated field is just Unix.gettimeofday (). * xapi_xenops removes "data/updated" from the guest agent state altogether before checking if it's changed: ``` let ignored_keys = ["data/meminfo_free"; "data/updated"; "data/update_cnt"] ``` So there is no need to watch this path at all. This greatly reduces unnecessary traffic between xapi and xenopsd, since any VM with a guest agent would write to data/updated once every 60 seconds, which would generate a Dynamic.Vm event, making xapi call xenopsd's VM.stat to rescan the domain's xenstore tree and perform several hypercalls. Almost always, this would be completely unnecessary as nothing else about the VM would change, but a lot of work would be done anyhow. Signed-off-by: Andrii Sultanov <[email protected]>
1. WLB request will raise `wlb_authentication_failed` when catching `Http_client.Http_error`. But actually only error code 401 and 403 should raise this type exception. For other error code, raise `wlb_connection_reset`. Also print the detail error code and message. 2. `message_forwarding` raises same error for `Http_request_rejected` and `Connection_reset` so we don't know which exception actually be raised. Print detailed logs for these 2 exceptions.
Drop the first member of the (X.t * X.state) tuple coming from X.stat immediately as it's never used. This removes the need for several 'snd info', 'Option.iter (fun (_, state) ->) info' constructs. Signed-off-by: Andrii Sultanov <[email protected]>
Rename the "u" function used across storage and observer_helpers to make its meaning more obvious and use __FUNCTION__ instead of hardcoding the function name. Signed-off-by: Steven Woods <[email protected]>
Some refactoring, but the most important commit is this one-line change that greatly improves things on its own: --- xenopsd: Remove `data/updated` from the list of watched paths "data/updated" is not read or used anywhere in xenopsd or xapi: * xapi_guest_agent's last_updated field is just Unix.gettimeofday (). * xapi_xenops removes "data/updated" from the guest agent state altogether before checking if it's changed: ``` let ignored_keys = ["data/meminfo_free"; "data/updated"; "data/update_cnt"] ``` So there is no need to watch this path at all. This greatly reduces unnecessary traffic between xapi and xenopsd, since any VM with a guest agent would write to data/updated once every 60 seconds, which would generate a Dynamic.Vm event, making xapi call xenopsd's VM.stat to rescan the domain's xenstore tree and perform several hypercalls. Almost always, this would be completely unnecessary as nothing else about the VM would change, but a lot of work would be done anyhow.
Rename the "u" function used across storage and observer_helpers to make its meaning more obvious and use __FUNCTION__ instead of hardcoding the function name.
The VM id part of Vbd.id is unnecessary in the attached_vdis key as the DB is already indexed by the VM id. This also prevents problems when the VM is renamed. Signed-off-by: Steven Woods <[email protected]>
The `ref` parameter within the `location` attribute of the console can refer to either the VM's ref or the console's own ref. Currently, the console's location uses the VM's ref by default. This causes an issue: when executing xs console, the requested location contains the VM's ref. If the ref points to the VM, xapi will attempt to use the RFB console (which is graphical) by default, rather than the VT100 console (which is text-based). As a result, the xs console command fails to open the console and hangs. **Solution:** Update the default ref in the console's `location` to the console's own ref. With this change, whether accessing the RFB or the VT100 console, the ref in the `location` will always point to the respective console itself. Signed-off-by: Bengang Yuan <[email protected]>
Add details on specifying image format for VDI and VM migration. In particular This revision explains how to choose the destination image format during VDI creation and migration, including VM migration scenarios. Also fixes minor typos in the document. Signed-off-by: Guillaume <[email protected]>
The default for the Xen PCI MMIO BAR is UnCachable. Setting this to WriteBack in the MTRR shows a massive performance improvement on AMD, at least for Linux guests. On Intel this is already set to WriteBack by Xen via another mechanism. To be effective this also requires the corresponding [Xen commit](https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=22650d6054625be10172fe0c78b9cadd1a39bd63), old versions will just ignore this xenstore key. The optimization is not enabled by default in Xen due to the wide range of guests it supports, but XAPI supports a much narrower set of guest OSes. Setting the cache attribute to WB is done by setting UC=0.
The first argument of this API error is used for error codes as defined by v6d. These product-specific error codes can be matched on by clients who know about them (e.g. XenCenter for XenServer). The new, second argument allows v6d to return an error message in English that xe can print directly, while xe remains product agnostic and does not need to know v6d's error definitions. This replaces some awkward API-message handling code in cli_operations. Signed-off-by: Rob Hoes <[email protected]>
The first argument of this API error is used for error codes as defined by v6d. These product-specific error codes can be matched on by clients who know about them (e.g. XenCenter for XenServer). The new, second argument allows v6d to return an error message in English that xe can print directly, while xe remains product agnostic and does not need to know v6d's error definitions. This replaces some awkward API-message handling code in cli_operations.
other_operation_in_progress has separate fields for the main class of the object ("Cluster" or "Cluster_host") and the operation that's blocking progress, do not concatenate these into one string. Also report the task ref that's blocking progress. Signed-off-by: Andrii Sultanov <[email protected]>
other_operation_in_progress has separate fields for the main class of the object ("VM") and the operation that's blocking progress, do not concatenate these into one string. Also report the task ref that's blocking progress. Signed-off-by: Andrii Sultanov <[email protected]>
qcow-stream uses Lwt, which is not thread-safe, so we want to avoid using it in the xapi process. Create a CLI wrapper for calls to qcow-stream. Signed-off-by: Andrii Sultanov <[email protected]>
This patch allows to pass "qcow2" as a supported format when calling VDI export and import. Qcow_tool_wrapper is added as a helper that calls the Python script for export (conversion from raw to qcow2 stream) and the qcow-tool CLI tool (ocaml-qcow library) for import (conversion from qcow2 stream to raw). Signed-off-by: Guillaume <[email protected]> Signed-off-by: Andrii Sultanov <[email protected]>
Also add an mli file for qcow_tool_wrapper Signed-off-by: Andrii Sultanov <[email protected]>
Final touches for qcow2 VDI import/export - add it as one of the supported formats and plumb through the calls to the python script for export and `qcow-stream` library for import via `qcow_tool_wrapper` and `qcow-stream-tool`. `qcow-stream-tool` is a thin wrapper executable over `qcow-stream`, so that xapi doesn't have to use Lwt code in its process (it's not thread-safe, see xapi-project#6573 for the discussion on this) --- This was tested manually, confirmed to have the same behaviour as the previous version (xapi-project#6573) --- This PR requires the following additions to the specfile: ```diff @@ -166,6 +172,7 @@ Requires: hwdata Requires: /usr/sbin/ssmtp Requires: stunnel >= 5.55 Requires: vhd-tool +Requires: qcow-stream-tool Requires: libffi Requires: busybox Requires: iproute @@ -397,9 +404,15 @@ This packages contains plugins registering to the RRD daemon and exposing variou %package -n vhd-tool Summary: Command-line tools for manipulating and streaming .vhd format files +%package -n qcow-stream-tool +Summary: Minimal CLI wrapper for qcow-stream + %description -n vhd-tool Simple command-line tools for manipulating and streaming .vhd format file. +%description -n qcow-stream-tool +Minimal CLI wrapper for qcow-stream + %package -n xcp-networkd Summary: Simple host network management service for the xapi toolstack Requires: ethtool @@ -1339,6 +1353,9 @@ done /opt/xensource/libexec/get_nbd_extents.py /opt/xensource/libexec/python_nbd_client.py +%files -n qcow-stream-tool +%{_bindir}/qcow-stream-tool + %files -n xcp-networkd %{_sbindir}/xcp-networkd %{_bindir}/networkd_db ```
The API supports a driver deselect call for which Xapi calls the underlying driver-tool. The mock driver is missing the corresponding -d option, which this patch adds. This fixes an internal error that Xapi raises otherwise. An alternative would be to ignore the error from the driver-tool. Signed-off-by: Christian Lindig <[email protected]>
…n_in_progress Fix incorrect operations being reported - e.g. when other_operation_in_progress is reported only when there are non-parallelisable operations currently happening, we can't just report the first current op, we need to report the first *non-parallelisable* op. This required some refactoring, moving to options instead of bools, etc. Additionally report the blocking operation's reference when raising the error. Signed-off-by: Andrii Sultanov <[email protected]>
…_in_progress Signed-off-by: Andrii Sultanov <[email protected]>
…other_operation_in_progress Instead of logging all the blocking operations with 'debug', just report them with the error. Signed-off-by: Andrii Sultanov <[email protected]>
…progress Signed-off-by: Andrii Sultanov <[email protected]>
…rogress error Signed-off-by: Andrii Sultanov <[email protected]>
…rror Signed-off-by: Andrii Sultanov <[email protected]>
…ation_in_progress error Signed-off-by: Andrii Sultanov <[email protected]>
…gress error Signed-off-by: Andrii Sultanov <[email protected]>
…on_in_progress error Signed-off-by: Andrii Sultanov <[email protected]>
Not all of other_operation_in_progress error contain two objects in the list ([cls; objref]), some contain three or four, but the match would miss them and they would be put into a non-interruptible Thread.sleep instead of the interruptible Delay.sleep that provides a mechanism for the other task to wake us up on its way out. Signed-off-by: Andrii Sultanov <[email protected]>
…progress Most of the users report these when appropriate, so document them here. Signed-off-by: Andrii Sultanov <[email protected]>
quality-gate: Reduce expected List.hd count test_clustering: test for the blocking operation to be reported Signed-off-by: Andrii Sultanov <[email protected]>
…s delays (xapi-project#6588) `other_operation_in_progress` is reported inconsistently, with some users only specifying `[class; class_ref]`, and others also `op_type`. Make it consistent, changing all users to report `[class; class_ref; op_type; op_ref]`. This makes it much easier to navigate the logs, since the operation blocking progress is clearly referenced. Compare before, where it's unclear which operation is precluding progress: ``` Caught exception while marking SR for VDI.clone in message forwarder: OTHER_OPERATION_IN_PROGRESS: [ SR; OpaqueRef:d019f26a-b2e8-529b-bb66-fd4f008d4f82; VDI.clone ] ``` And after, with the ID of the operation present: ``` Caught exception while marking SR for VDI.clone in message forwarder: OTHER_OPERATION_IN_PROGRESS: [ SR; OpaqueRef:b0f54a40-485f-f48f-fdc7-6db28a64d3fd; scan; OpaqueRef:832fb22d-2ce5-0d26-8a00-a4165e78b34d ] ``` In addition, fix incorrect operation reporting, so that when progress is blocked only when particular types of operations are present in `current_operations`, these specific operations are reported, and not just any current operation. Also fix handling of `other_operation_in_progress` delays - some of the users would miss the `match` arm and use the non-interruptible `Thread.sleep` instead of the interruptible `Delay.sleep` that provides a mechanism for the blocking task to wake us up on its way out.
The API supports a driver deselect call for which Xapi calls the underlying driver-tool. The mock driver is missing the corresponding -d option, which this patch adds. This fixes an internal error that Xapi raises otherwise. An alternative would be to ignore the error from the driver-tool.
The previous `xe` help is as below: ``` Usage: xe <cmd> [-s server] [-p port] ([-u username] [-pw password] or [-pwf <password file>]) [--traceparent traceparent] <other arguments> A full list of commands can be obtained by running xe help -s <server> -p <port> ``` The previous `xe` help output lacked debug-related options and did not provide detailed parameter description. The new `xe` help output is as follows: ``` Usage: xe <command> [ -s <server> ] XenServer host [ -p <port> ] XenServer port number [ -u <username> -pw <password> | -pwf <password file> ] User authentication (password or file) [ --nossl ] Disable SSL/TLS [ --debug ] Enable debug output [ --debug-on-fail ] Enable debug output only on failure [ --traceparent <value> ] Distributed tracing context [ <other arguments> ... ] Command-specific options A full list of commands can be obtained by running xe help -s <server> -p <port> ```
The original confliction: <<<<<<< HEAD
; ( "vm-sysprep-enabled"
, Arg.Set vm_sysprep_enabled
, (fun () -> string_of_bool !vm_sysprep_enabled)
, "Enable VM.sysprep API"
)
; ( "vm-sysprep-wait"
, Arg.Set_float vm_sysprep_wait
, (fun () -> string_of_float !vm_sysprep_wait)
, "Time in seconds to wait for VM to recognise inserted CD"
=======
; ( "ssh-auto-mode"
, Arg.Bool (fun b -> ssh_auto_mode_default := b)
, (fun () -> string_of_bool !ssh_auto_mode_default)
, "Defaults to true; overridden to false via \
/etc/xapi.conf.d/ssh-auto-mode.conf(e.g., in XenServer 8)"
>>>>>>> feature/configure-ssh-phase3 |
gangj
approved these changes
Jul 21, 2025
changlei-li
approved these changes
Jul 21, 2025
c368857
into
xapi-project:feature/configure-ssh-phase3
16 checks passed
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Merge master to feature branch and resolve the following conflicts: