Commit e99d697

add upgrade downgrade version skep
1 parent 28dc1f8 commit e99d697

File tree

1 file changed

+33
-28
lines changed
  • keps/sig-network/1880-services-ip-ranges

keps/sig-network/1880-services-ip-ranges/README.md

Lines changed: 33 additions & 28 deletions
@@ -282,7 +282,7 @@ type IPRangeStatus struct {
282282

283283
##### Controller
284284

285-
The IPRange object will be reconciled by a controller running in the kube-controller-manager that will be update the Status accordenly.
285+
The IPRange object will be reconciled by a controller running in the kube-controller-manager that will update the Status accordingly.
286286
The controller will perform the following operations:
287287
- cross validation, for example, avoiding overlapping IP Ranges
288288
- resizing subnets
@@ -453,18 +453,19 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
453453

454454
- Feature implemented behind a feature flag
455455
- Initial unit, integration and e2e tests completed and enabled
456+
- Only basic functionality implemented
456457

457458
#### Beta
458459

459-
- Gather feedback from developers and surveys
460-
- Complete features for IPRange resizing and changing primary IP family
460+
- API stability, no disruptive changes on types and behaviors
461+
- Gather feedback from developers and users
462+
- Complete all the advanced features: resizing, changing primary IP family, ...
461463
- Additional tests are in Testgrid and linked in KEP
462464
- Allowing time for feedback
463465

464466
#### GA
465467

466-
- 3 examples of real-world usage
467-
- 3 installs
468+
- 2 examples of real-world usage
468469
- More rigorous forms of testing—e.g., downgrade tests and scalability tests
469470
- Allowing time for feedback
470471

@@ -479,34 +480,38 @@ in back-to-back releases.
479480

480481
#### Deprecation
481482

482-
### Upgrade / Downgrade Strategy
483+
### Upgrade / Downgrade / Version Skew Strategy
483484

484-
<!--
485-
If applicable, how will the component be upgraded and downgraded? Make sure
486-
this is in the test plan.
485+
Currently, the Service CIDRs are configured independently in each kube-apiserver using flags.
486+
During the bootstrap process, the apiserver uses the first IP of each range to create the special "kubernetes.default" service.
487+
It also starts a reconcile loop that synchronizes the state of the bitmap with the IPs assigned to the Services.
487488

488-
Consider the following in developing an upgrade/downgrade strategy for this
489-
enhancement:
490-
- What changes (in invocations, configurations, API use, etc.) is an existing
491-
cluster required to make on upgrade, in order to maintain previous behavior?
492-
- What changes (in invocations, configurations, API use, etc.) is an existing
493-
cluster required to make on upgrade, in order to make use of the enhancement?
494-
-->
489+
It is a known limitation that each kube-apiserver can boot with different ranges and create conflicts.
495490

496-
### Version Skew Strategy
491+
To remain completely backwards compatible, the bootstrap process will stay the same; the difference is
492+
that instead of creating a bitmap based on the flags, it will create a well-known IPRange object with name "default"
493+
and configured with the CIDRs passed as flags.
497494

498-
<!--
499-
If applicable, how will the component handle version skew with other
500-
components? What are the guarantees? Make sure this is in the test plan.
495+
```
496+
<<[UNRESOLVED conflict ]>>
497+
Currently, there is no conflict resolution for incompatible configurations on kube-apiservers. Should we
498+
maintain current behavior and let the apiservers "keep fighting" to set their own parameters, or just
499+
fail to start?
500+
<<[/UNRESOLVED]>>
501+
```
501502

502-
Consider the following in developing a version skew strategy for this
503-
enhancement:
504-
- Does this enhancement involve coordinating behavior in the control plane and
505-
in the kubelet? How does an n-2 kubelet without this feature available behave
506-
when this feature is used?
507-
- Will any other components on the node change? For example, changes to CSI,
508-
CRI or CNI may require updating that component before the kubelet.
509-
-->
503+
The source of truth is the set of IPs assigned to the Services: both the old and new methods have reconcile loops that
504+
rebuild the state of the allocators from the IPs assigned to the Services, which makes it possible to support
505+
upgrades and skewed clusters.
506+
507+
508+
Since the new allocation model will remove some of the limitations of the current model, skewed versions and downgrades
509+
can only work if the configurations are fully compatible. For example, current CIDRs are limited to a /112 maximum for IPv6;
510+
if a user configures a /64 for their IPv6 subnets in the new model, and IPs are assigned outside the first /112 block,
511+
the old bitmap-based allocator will not be able to use those IPs, creating an inconsistency in the cluster.
512+
513+
It is recommended that those Services are recreated to get IP addresses inside the configured ranges, for consistency,
514+
but there should not be any functional problem in the cluster.
510515

511516
## Production Readiness Review Questionnaire
512517
