Skip to content

Conversation

@alexander-demicev
Copy link
Contributor

What this PR does / why we need it:

This PR adds notes about KCP implementation and managed field refactoring. @sbueringer @fabriziopandini could you please verify that I didn't miss anything and got it all right, also I wasn't sure if we should move these notes into the proposal or keep them separated.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 20, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/needs-area PR is missing an area label label Nov 20, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 20, 2025
@alexander-demicev
Copy link
Contributor Author

/area documentation

@k8s-ci-robot k8s-ci-robot added area/documentation Issues or PRs related to documentation and removed do-not-merge/needs-area PR is missing an area label labels Nov 20, 2025
Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the PR.
i've added some comments on grammar. note that this is not a implementation correctness pass.

Q: i know that the CAPI book has a lot of details, but this seems like it's going trough the controller source code and covering everything. is the average reader that interested in documentation verbosity ALA "i what to know exactly how this works"? for comparison, the k8s documentation doesn't do that.


- In-place updates respect the existing control plane update strategy:
- KCP controller uses `rollingUpdate` strategy with `maxSurge` (0 or 1)
- When `maxSurge` is 0, no new machines are created during rollout - only in-place updates or scale down
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

best to clarify the part - only in-place updates or scale down a little, as is the sentence is a bit hard to put into context. e.g. it is only performed on...

- KCP controller uses `rollingUpdate` strategy with `maxSurge` (0 or 1)
- When `maxSurge` is 0, no new machines are created during rollout - only in-place updates or scale down
- When `maxSurge` is 1:
- Controller first scales up by creating one new machine to maximize fault tolerance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Controller first scales up by creating one new machine to maximize fault tolerance
- The controller first scales up by creating one new machine to maximize fault tolerance

definite article for the action taker

- When `maxSurge` is 0, no new machines are created during rollout - only in-place updates or scale down
- When `maxSurge` is 1:
- Controller first scales up by creating one new machine to maximize fault tolerance
- Once at `maxReplicas` (desiredReplicas + 1), evaluates whether to in-place update or scale down old machines
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Once at `maxReplicas` (desiredReplicas + 1), evaluates whether to in-place update or scale down old machines
- Once `maxReplicas` (desiredReplicas + 1) is reached, it evaluates whether to in-place update or scale down old machines

- When `maxSurge` is 1:
- Controller first scales up by creating one new machine to maximize fault tolerance
- Once at `maxReplicas` (desiredReplicas + 1), evaluates whether to in-place update or scale down old machines
- For each old machine needing rollout: if eligible for in-place update, performs in-place; otherwise scales down
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best to break this down into multiple sentences

e.g. for each machine it evaluates if it is eligible for in-place update. If so, it.... , otherwise ....

- Controller first scales up by creating one new machine to maximize fault tolerance
- Once at `maxReplicas` (desiredReplicas + 1), evaluates whether to in-place update or scale down old machines
- For each old machine needing rollout: if eligible for in-place update, performs in-place; otherwise scales down
- This pattern repeats until all machines are up-to-date, then scales back to desired replica count
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- This pattern repeats until all machines are up-to-date, then scales back to desired replica count
- This pattern repeats until all machines are up-to-date, it then scales back to the desired replica count

- Result: Final desired managedField structure is established

3. **Ready for operations**:
- Continuous `syncMachines` calls update labels/annotations without affecting spec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Continuous `syncMachines` calls update labels/annotations without affecting spec
- Continuous `syncMachines` calls update labels/annotations without affecting the spec of a Machine

(i'm assuming it's a machine spec)

When triggering in-place updates:

1. Apply BootstrapConfig/InfraMachine with:
- Updated spec (owned by spec manager)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Updated spec (owned by spec manager)
- Updated spec (owned by the spec manager)

1. Apply BootstrapConfig/InfraMachine with:
- Updated spec (owned by spec manager)
- `update-in-progress` annotation (owned by spec manager)
- For InfraMachine: `cloned-from` annotations (owned by spec manager)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- For InfraMachine: `cloned-from` annotations (owned by spec manager)
- For InfraMachine: `cloned-from` annotations (owned by the spec manager)

- `update-in-progress` annotation (owned by spec manager)
- For InfraMachine: `cloned-from` annotations (owned by spec manager)

2. Result after in-place update trigger:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
2. Result after in-place update trigger:
2. Result after the in-place update trigger:

Comment on lines +256 to +258
- labels + annotations => metadata manager
- spec => spec manager
- in-progress / cloned-from annotations => spec manager
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

best to replace the + and => with actual words (verbs).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/documentation Issues or PRs related to documentation cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants