Skip to content

Conversation

gangj
Copy link
Contributor

@gangj gangj commented Aug 4, 2025

Xenctrl.Runstateinfo.V2.domain_get is added in xen patch for querying runnable time of vCPUs of a domain.

caml_release_runtime_system();
retval = xc_get_runstate_info(xch, c_domid, &info);
caml_acquire_runtime_system();
retval = xc_get_runstate_info(xch, Int_val(domid), &info);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the release no longer required? And why are we using not xc_get_runstate_info2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the xen patch, Xenctrl.domain_get_runstate_info is kept to make the old clients working.
Just revert the commit as xenopsd is not using the new fields: runnable, running, nonaffine from Xenctrl.domain_get_runstate_info2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But caml_release_runtime_system still needs to stay

Copy link
Contributor

@edwintorok edwintorok Aug 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see what is going on now, some commits from here got reverted, which makes this PR awkard to review.
In effect this branch won't touch the xenctrlext_stubs.c at all: master...gangj:xen-api:private/gangj/CP-308857, that is why it looks like we're removing the acquire_runtime_system, that got added in this branch: a8782c0

We could reintroduce that, but that can be done independently of this feature branch in another PR (I have a very old draft PR to fix up runtime lock usage).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. instead of reviewing this PR as is, we should review the feature branch's diff against master (see link above).

@gangj gangj marked this pull request as draft August 4, 2025 08:47
@psafont
Copy link
Member

psafont commented Aug 4, 2025

This needs to get https://github.com/xapi-project/xenctrl patched as well

@gangj
Copy link
Contributor Author

gangj commented Aug 4, 2025

Will update the xenctrl mock the same way as the new xen patch first.

@gangj
Copy link
Contributor Author

gangj commented Aug 4, 2025

xapi-project/xenctrl#14

CAMLparam2(xch_val, domid);
#if defined(XENCTRL_HAS_GET_RUNSTATE_INFO)
CAMLlocal1(result);
xc_runstate_info_t info = { 0 };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with keeping this initialization of info to { 0 }? otherwise it's going to be indeterminate and might not be initialized in the xenctrl call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a completely new API got introduced instead, so we leave the old API as it was (which will look like we are changing it if we are looking at the diff vs current branch, but if we look at the diff against master we see that the file is not changed at all, see #6612 (comment)).

This has confused me as well.

@gangj gangj force-pushed the private/gangj/CP-308857 branch from 37b6723 to 6f2a73c Compare August 5, 2025 07:32
@gangj gangj force-pushed the private/gangj/CP-308857 branch from 6f2a73c to 1c8afb9 Compare August 5, 2025 10:25
@gangj gangj marked this pull request as ready for review August 5, 2025 11:13
@mg12
Copy link
Member

mg12 commented Aug 5, 2025

the comparison against master master...gangj:xen-api:private/gangj/CP-308857 looks good to me. Though it's probably good to squash the resulting branch feature/vcpu-runnable to remove the unnecessary commits (like the one that introduces and reverts the same changes).

@gangj gangj force-pushed the private/gangj/CP-308857 branch from 1c8afb9 to 32c1fd5 Compare August 6, 2025 06:54
@gangj
Copy link
Contributor Author

gangj commented Aug 6, 2025

Squashed feature branch: master...gangj:xen-api:private/gangj/CP-308857, as we are to change the history of the feature branch, here is the PR for it: #6615, this PR can be closed if #6615 is OK.

gangj and others added 4 commits August 6, 2025 18:19
"vcpu", "VCPU" unified to "vCPU".

Signed-off-by: Gang Ji <[email protected]>
Adding a new metric 'runnable_any' as % of time that at least one vCPU
of the domain is in the runnable state. It is the sum of the following 3
metrics:
- runstate_full_contention
- runstate_concurrency_hazard
- runstate_partial_contention

Naming it 'runnable_any' instead of 'runnable' is to resolve one problem
with rrd2csv: if we name it 'runnable', rrd2csv will select both
"runnable" and "runnable_vcpus" when the 'runnable' is used:

> rrd2csv AVERAGE:vm:<vm-uuid>:runnable
> timestamp, AVERAGE:vm:<vm-uuid>:runnable, AVERAGE:vm:<vm-uuid>:runnable_vcpus

This is because "runnable" is a prefix of "runnable_vcpus".

Naming it 'runnable_any', with rrd2csv:
* can select only 'runnable_any' if we use:
  rrd2csv AVERAGE:vm:<vm-uuid>:runnable_any
* can select only 'runnable_vcpus' if we use:
  rrd2csv AVERAGE:vm:<vm-uuid>:runnable_vcpus
* can select both 'runnable_any' and 'runnable_vcpus' if we use:
  rrd2csv AVERAGE:vm:<vm-uuid>:runnable

Naming it 'runnable_any' also makes it clearer as the metric is % of
time that at least one vCPU of the domain is in the runnable state.

Add max to "runnable_any" metric to follow the fix here:

Signed-off-by: Bengang Yuan <[email protected]>
[Rebase with metric renaming and some fixes]
Signed-off-by: Gang Ji <[email protected]>
Adding a new CPU ready RRD metric: "runnable_vcpus" per domain as time %
of vCPUs of the domain that are in the runnable state.

Signed-off-by: Gang Ji <[email protected]>
Query runstate with the new API: Xenctrl.Runstateinfo.V2.domain_get

Signed-off-by: Marcus Granado <[email protected]>
@gangj gangj force-pushed the private/gangj/CP-308857 branch from ec0b510 to d355725 Compare August 6, 2025 10:30
github-merge-queue bot pushed a commit that referenced this pull request Aug 7, 2025
Squashed the history in PR
#6612 together with the
history in feature/vcpu-runnable.
@gangj gangj closed this Aug 7, 2025
@gangj gangj reopened this Aug 7, 2025
@gangj
Copy link
Contributor Author

gangj commented Aug 7, 2025

Drop it, the work is done by #6615

@gangj gangj closed this Aug 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants