Merged
3 changes: 1 addition & 2 deletions content/en/docs/concepts/scheduling-eviction/api-eviction.md
@@ -92,8 +92,7 @@ If the API server allows the eviction, the Pod is deleted as follows:
resource is marked for termination and starts to gracefully shut down the
local Pod.
1. While the kubelet is shutting the Pod down, the control plane removes the Pod
from {{<glossary_tooltip term_id="endpoint" text="Endpoint">}} and
{{<glossary_tooltip term_id="endpoint-slice" text="EndpointSlice">}}
from {{<glossary_tooltip term_id="endpoint-slice" text="EndpointSlice">}}
objects. As a result, controllers no longer consider the Pod as a valid object.
1. After the grace period for the Pod expires, the kubelet forcefully terminates
the local Pod.
151 changes: 56 additions & 95 deletions content/en/docs/concepts/services-networking/endpoint-slices.md
@@ -19,8 +19,7 @@ description: >-
{{< feature-state for_k8s_version="v1.21" state="stable" >}}

Kubernetes' _EndpointSlice_ API provides a way to track network endpoints
within a Kubernetes cluster. EndpointSlices offer a more scalable and extensible
alternative to [Endpoints](/docs/concepts/services-networking/service/#endpoints).
within a Kubernetes cluster.

<!-- body -->

@@ -31,8 +30,8 @@ endpoints. The control plane automatically creates EndpointSlices
for any Kubernetes Service that has a {{< glossary_tooltip text="selector"
term_id="selector" >}} specified. These EndpointSlices include
references to all the Pods that match the Service selector. EndpointSlices group
network endpoints together by unique combinations of protocol, port number, and
Service name.
network endpoints together by unique combinations of IP family, protocol,
port number, and Service name.
The name of an EndpointSlice object must be a valid
[DNS subdomain name](/docs/concepts/overview/working-with-objects/names#dns-subdomain-names).

@@ -67,17 +66,16 @@ more than 100 endpoints each. You can configure this with the
{{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager" >}}
flag, up to a maximum of 1000.

EndpointSlices can act as the source of truth for
EndpointSlices act as the source of truth for
{{< glossary_tooltip term_id="kube-proxy" text="kube-proxy" >}} when it comes to
how to route internal traffic.

### Address types

EndpointSlices support three address types:
EndpointSlices support two address types:

* IPv4
* IPv6
* FQDN (Fully Qualified Domain Name)

Each `EndpointSlice` object represents a specific IP address type. If you have
a Service that is available via IPv4 and IPv6, there will be at least two
@@ -86,42 +84,37 @@ a Service that is available via IPv4 and IPv6, there will be at least two
### Conditions

The EndpointSlice API stores conditions about endpoints that may be useful for consumers.
The three conditions are `ready`, `serving`, and `terminating`.

#### Ready

`ready` is a condition that maps to a Pod's `Ready` condition. A running Pod with the `Ready`
condition set to `True` should have this EndpointSlice condition also set to `true`. For
compatibility reasons, `ready` is NEVER `true` when a Pod is terminating. Consumers should refer
to the `serving` condition to inspect the readiness of terminating Pods. The only exception to
this rule is for Services with `spec.publishNotReadyAddresses` set to `true`. Endpoints for these
Services will always have the `ready` condition set to `true`.
The three conditions are `serving`, `terminating`, and `ready`.

#### Serving

{{< feature-state for_k8s_version="v1.26" state="stable" >}}

The `serving` condition is almost identical to the `ready` condition. The difference is that
consumers of the EndpointSlice API should check the `serving` condition if they care about pod readiness while
the pod is also terminating.
The `serving` condition indicates that the endpoint is currently serving responses, and
so it should be used as a target for Service traffic. For endpoints backed by a Pod, this
maps to the Pod's `Ready` condition.

{{< note >}}
#### Terminating

Although `serving` is almost identical to `ready`, it was added to prevent breaking the existing meaning
of `ready`. It may be unexpected for existing clients if `ready` could be `true` for terminating
endpoints, since historically terminating endpoints were never included in the Endpoints or
EndpointSlice API to begin with. For this reason, `ready` is _always_ `false` for terminating
endpoints, and a new condition `serving` was added in v1.20 so that clients can track readiness
for terminating pods independent of the existing semantics for `ready`.
{{< feature-state for_k8s_version="v1.26" state="stable" >}}

{{< /note >}}
The `terminating` condition indicates that the endpoint is
terminating. For endpoints backed by a Pod, this condition is set when
the Pod is first deleted (that is, when it receives a deletion
timestamp, but most likely before the Pod's containers exit).

#### Terminating
Service proxies will normally ignore endpoints that are `terminating`,
but they may route traffic to endpoints that are both `serving` and
`terminating` if all available endpoints are `terminating`. (This
helps to ensure that no Service traffic is lost during rolling updates
of the underlying Pods.)

{{< feature-state for_k8s_version="v1.22" state="beta" >}}
#### Ready

`Terminating` is a condition that indicates whether an endpoint is terminating.
For pods, this is any pod that has a deletion timestamp set.
The `ready` condition is essentially a shortcut for checking
"`serving` and not `terminating`" (though it will also always be
`true` for Services with `spec.publishNotReadyAddresses` set to
`true`).
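
The relationship between the three conditions described above can be sketched in a few lines of Python. This is an illustrative sketch only, not kube-proxy's implementation: the `Endpoint` class and the `is_ready` and `select_targets` helpers are hypothetical names, and the fallback rule is a simplified reading of "all available endpoints are `terminating`".

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical stand-in for one endpoint's conditions in an EndpointSlice."""
    address: str
    serving: bool
    terminating: bool


def is_ready(ep: Endpoint, publish_not_ready_addresses: bool = False) -> bool:
    # `ready` is a shortcut for "serving and not terminating"; it is always
    # true when the Service sets spec.publishNotReadyAddresses.
    return publish_not_ready_addresses or (ep.serving and not ep.terminating)


def select_targets(endpoints: list[Endpoint]) -> list[Endpoint]:
    # Prefer endpoints that are serving and not terminating.
    preferred = [ep for ep in endpoints if ep.serving and not ep.terminating]
    if preferred:
        return preferred
    # Assumed fallback: with no such endpoint left, use those that are
    # still serving while terminating (for example, during a rolling update).
    return [ep for ep in endpoints if ep.serving and ep.terminating]
```

A consumer that only needs the common case can read `ready` directly; one that wants to keep traffic flowing during rollouts would look at `serving` and `terminating` separately, as `select_targets` does.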

### Topology information {#topology}

@@ -133,18 +126,6 @@ per endpoint fields on EndpointSlices:
* `nodeName` - The name of the Node this endpoint is on.
* `zone` - The zone this endpoint is in.

{{< note >}}
In the v1 API, the per endpoint `topology` was effectively removed in favor of
the dedicated fields `nodeName` and `zone`.

Setting arbitrary topology fields on the `endpoint` field of an `EndpointSlice`
resource has been deprecated and is not supported in the v1 API.
Instead, the v1 API supports setting individual `nodeName` and `zone` fields.
These fields are automatically translated between API versions. For example, the
value of the `"topology.kubernetes.io/zone"` key in the `topology` field in
the v1beta1 API is accessible as the `zone` field in the v1 API.
{{< /note >}}

### Management

Most often, the control plane (specifically, the endpoint slice
@@ -169,34 +150,12 @@ slice object tracks endpoints for. This ownership is indicated by an owner
reference on each EndpointSlice as well as a `kubernetes.io/service-name`
label that enables simple lookups of all EndpointSlices belonging to a Service.

### EndpointSlice mirroring

In some cases, applications create custom Endpoints resources. To ensure that
these applications do not need to concurrently write to both Endpoints and
EndpointSlice resources, the cluster's control plane mirrors most Endpoints
resources to corresponding EndpointSlices.

The control plane mirrors Endpoints resources unless:

* the Endpoints resource has a `endpointslice.kubernetes.io/skip-mirror` label
set to `true`.
* the Endpoints resource has a `control-plane.alpha.kubernetes.io/leader`
annotation.
* the corresponding Service resource does not exist.
* the corresponding Service resource has a non-nil selector.

Individual Endpoints resources may translate into multiple EndpointSlices. This
will occur if an Endpoints resource has multiple subsets or includes endpoints
with multiple IP families (IPv4 and IPv6). A maximum of 1000 addresses per
subset will be mirrored to EndpointSlices.

### Distribution of EndpointSlices

Each EndpointSlice has a set of ports that applies to all endpoints within the
resource. When named ports are used for a Service, Pods may end up with
different target port numbers for the same named port, requiring different
EndpointSlices. This is similar to the logic behind how subsets are grouped
with Endpoints.
EndpointSlices.

The control plane tries to fill EndpointSlices as full as possible, but does not
actively rebalance them. The logic is fairly straightforward:
@@ -244,34 +203,36 @@ You can find a reference implementation for how to perform this endpoint aggrega
and deduplication as part of the `EndpointSliceCache` code within `kube-proxy`.
{{< /note >}}

## Comparison with Endpoints {#motivation}

The original Endpoints API provided a simple and straightforward way of
tracking network endpoints in Kubernetes. As Kubernetes clusters
and {{< glossary_tooltip text="Services" term_id="service" >}} grew to handle
more traffic and to send more traffic to more backend Pods, the
limitations of that original API became more visible.
Most notably, those included challenges with scaling to larger numbers of
network endpoints.

Since all network endpoints for a Service were stored in a single Endpoints
object, those Endpoints objects could get quite large. For Services that stayed
stable (the same set of endpoints over a long period of time) the impact was
less noticeable; even then, some use cases of Kubernetes weren't well served.

When a Service had a lot of backend endpoints and the workload was either
scaling frequently, or rolling out new changes frequently, each update to
the single Endpoints object for that Service meant a lot of traffic between
Kubernetes cluster components (within the control plane, and also between
nodes and the API server). This extra traffic also had a cost in terms of
CPU use.

With EndpointSlices, adding or removing a single Pod triggers the same _number_
of updates to clients that are watching for changes, but the size of those
update message is much smaller at large scale.

EndpointSlices also enabled innovation around new features such dual-stack
networking and topology-aware routing.
### EndpointSlice mirroring

{{< feature-state for_k8s_version="v1.33" state="deprecated" >}}

The EndpointSlice API is a replacement for the older Endpoints API. To
preserve compatibility with older controllers and user workloads that
expect {{<glossary_tooltip term_id="kube-proxy" text="kube-proxy">}}
to route traffic based on Endpoints resources, the cluster's control
plane mirrors most user-created Endpoints resources to corresponding
EndpointSlices.

(However, this feature, like the rest of the Endpoints API, is
deprecated. Users who manually specify endpoints for selectorless
Services should do so by creating EndpointSlice resources directly,
rather than by creating Endpoints resources and allowing them to be
mirrored.)

The control plane mirrors Endpoints resources unless:

* the Endpoints resource has a `endpointslice.kubernetes.io/skip-mirror` label
set to `true`.
* the Endpoints resource has a `control-plane.alpha.kubernetes.io/leader`
annotation.
* the corresponding Service resource does not exist.
* the corresponding Service resource has a non-nil selector.

Individual Endpoints resources may translate into multiple EndpointSlices. This
will occur if an Endpoints resource has multiple subsets or includes endpoints
with multiple IP families (IPv4 and IPv6). A maximum of 1000 addresses per
subset will be mirrored to EndpointSlices.
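
The four exclusions listed above amount to a simple predicate. The following is a minimal sketch under stated assumptions, not the control plane's actual mirroring code: `should_mirror` is a hypothetical helper, and both arguments are plain dictionaries shaped like API objects rather than real client types.

```python
from typing import Optional

SKIP_MIRROR_LABEL = "endpointslice.kubernetes.io/skip-mirror"
LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader"


def should_mirror(endpoints: dict, service: Optional[dict]) -> bool:
    """Return True if none of the four listed exclusions applies.

    Both arguments are plain dicts shaped like API objects (an assumption
    made for this sketch); `service` is None when no Service exists.
    """
    metadata = endpoints.get("metadata", {})
    # Exclusion 1: the skip-mirror label is set to "true".
    if metadata.get("labels", {}).get(SKIP_MIRROR_LABEL) == "true":
        return False
    # Exclusion 2: the leader-election annotation is present.
    if LEADER_ANNOTATION in metadata.get("annotations", {}):
        return False
    # Exclusion 3: there is no corresponding Service.
    if service is None:
        return False
    # Exclusion 4: the Service has a selector (non-nil).
    if service.get("spec", {}).get("selector") is not None:
        return False
    return True
```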

## {{% heading "whatsnext" %}}

2 changes: 1 addition & 1 deletion content/en/docs/concepts/services-networking/gateway.md
@@ -118,7 +118,7 @@ reference for a full definition of this API kind.

The HTTPRoute kind specifies routing behavior of HTTP requests from a Gateway listener to backend network
endpoints. For a Service backend, an implementation may represent the backend network endpoint as a Service
IP or the backing Endpoints of the Service. An HTTPRoute represents configuration that is applied to the
IP or the backing EndpointSlices of the Service. An HTTPRoute represents configuration that is applied to the
underlying Gateway implementation. For example, defining a new HTTPRoute may result in configuring additional
traffic routes in a cloud load balancer or in-cluster proxy server.

22 changes: 15 additions & 7 deletions content/en/docs/concepts/services-networking/service.md
@@ -213,8 +213,8 @@ spec:
targetPort: 9376
```

Because this Service has no selector, the corresponding EndpointSlice (and
legacy Endpoints) objects are not created automatically. You can map the Service
Because this Service has no selector, the corresponding EndpointSlice
objects are not created automatically. You can map the Service
to the network address and port where it's running, by adding an EndpointSlice
object manually. For example:

@@ -307,14 +307,22 @@ until an extra endpoint needs to be added.
See [EndpointSlices](/docs/concepts/services-networking/endpoint-slices/) for more
information about this API.

### Endpoints
### Endpoints (deprecated) {#endpoints}

In the Kubernetes API, an
{{< feature-state for_k8s_version="v1.33" state="deprecated" >}}

The EndpointSlice API is the evolution of the older
[Endpoints](/docs/reference/kubernetes-api/service-resources/endpoints-v1/)
(the resource kind is plural) defines a list of network endpoints, typically
referenced by a Service to define which Pods the traffic can be sent to.
API. The deprecated Endpoints API has several problems relative to
EndpointSlice:

- It does not support dual-stack clusters.
- It does not contain information needed to support newer features, such as
[trafficDistribution](/docs/concepts/services-networking/service/#traffic-distribution).
- It will truncate the list of endpoints if it is too long to fit in a single object.

The EndpointSlice API is the recommended replacement for Endpoints.
Because of this, it is recommended that all clients use the
EndpointSlice API rather than Endpoints.

#### Over-capacity endpoints

8 changes: 4 additions & 4 deletions content/en/docs/concepts/workloads/pods/pod-lifecycle.md
@@ -483,8 +483,8 @@ containers:

`readinessProbe`
: Indicates whether the container is ready to respond to requests.
If the readiness probe fails, the endpoints controller removes the Pod's IP
address from the endpoints of all Services that match the Pod. The default
If the readiness probe fails, the EndpointSlice controller removes the Pod's IP
address from the EndpointSlices of all Services that match the Pod. The default
state of readiness before the initial delay is `Failure`. If a container does
not provide a readiness probe, the default state is `Success`.

@@ -608,7 +608,7 @@ Pod termination flow, illustrated with an example:
to synchronize (or switch to using sidecar containers).

1. At the same time as the kubelet is starting graceful shutdown of the Pod, the control plane
evaluates whether to remove that shutting-down Pod from EndpointSlice (and Endpoints) objects,
evaluates whether to remove that shutting-down Pod from EndpointSlice objects,
where those objects represent a {{< glossary_tooltip term_id="service" text="Service" >}}
with a configured {{< glossary_tooltip text="selector" term_id="selector" >}}.
{{< glossary_tooltip text="ReplicaSets" term_id="replica-set" >}} and other workload resources
@@ -621,7 +621,7 @@ Pod termination flow, illustrated with an example:

Any endpoints that represent the terminating Pods are not immediately removed from
EndpointSlices, and a status indicating [terminating state](/docs/concepts/services-networking/endpoint-slices/#conditions)
is exposed from the EndpointSlice API (and the legacy Endpoints API).
is exposed from the EndpointSlice API.
Terminating endpoints always have their `ready` status as `false` (for backward compatibility
with versions before 1.26), so load balancers will not use it for regular traffic.

12 changes: 6 additions & 6 deletions content/en/docs/reference/access-authn-authz/rbac.md
@@ -706,9 +706,9 @@ When used in a <b>RoleBinding</b>, it gives full control over every resource in
If used in a <b>RoleBinding</b>, allows read/write access to most resources in a namespace,
including the ability to create roles and role bindings within the namespace.
This role does not allow write access to resource quota or to the namespace itself.
This role also does not allow write access to EndpointSlices (or Endpoints) in clusters created
This role also does not allow write access to EndpointSlices in clusters created
using Kubernetes v1.22+. More information is available in the
["Write Access for EndpointSlices and Endpoints" section](#write-access-for-endpoints).</td>
["Write Access for EndpointSlices" section](#write-access-for-endpoints).</td>
</tr>
<tr>
<td><b>edit</b></td>
@@ -718,9 +718,9 @@ using Kubernetes v1.22+. More information is available in the
This role does not allow viewing or modifying roles or role bindings.
However, this role allows accessing Secrets and running Pods as any ServiceAccount in
the namespace, so it can be used to gain the API access levels of any ServiceAccount in
the namespace. This role also does not allow write access to EndpointSlices (or Endpoints) in
the namespace. This role also does not allow write access to EndpointSlices in
clusters created using Kubernetes v1.22+. More information is available in the
["Write Access for EndpointSlices and Endpoints" section](#write-access-for-endpoints).</td>
["Write Access for EndpointSlices" section](#write-access-for-endpoints).</td>
</tr>
<tr>
<td><b>view</b></td>
@@ -1213,10 +1213,10 @@ In order from most secure to least secure, the approaches are:
--group=system:serviceaccounts
```

## Write access for EndpointSlices and Endpoints {#write-access-for-endpoints}
## Write access for EndpointSlices {#write-access-for-endpoints}

Kubernetes clusters created before Kubernetes v1.22 include write access to
EndpointSlices (and Endpoints) in the aggregated "edit" and "admin" roles.
EndpointSlices (and the now-deprecated Endpoints API) in the aggregated "edit" and "admin" roles.
As a mitigation for [CVE-2021-25740](https://github.com/kubernetes/kubernetes/issues/103675),
this access is not part of the aggregated roles in clusters that you create using
Kubernetes v1.22 or later.
9 changes: 3 additions & 6 deletions content/en/docs/reference/glossary/endpoint-slice.md
@@ -4,16 +4,13 @@ id: endpoint-slice
date: 2018-04-12
full_link: /docs/concepts/services-networking/endpoint-slices/
short_description: >
A way to group network endpoints together with Kubernetes resources.
EndpointSlices track the IP addresses of Pods with matching Service selectors.

aka:
tags:
- networking
---
A way to group network endpoints together with Kubernetes resources.
EndpointSlices track the IP addresses of Pods with matching {{< glossary_tooltip text="selectors" term_id="selector" >}}.

<!--more-->

A scalable and extensible way to group network endpoints together. These can be
used by {{< glossary_tooltip text="kube-proxy" term_id="kube-proxy" >}} to
establish network routes on each {{< glossary_tooltip text="node" term_id="node" >}}.
EndpointSlices can be configured manually for {{< glossary_tooltip text="Services" term_id="service" >}} without selectors specified.
13 changes: 9 additions & 4 deletions content/en/docs/reference/glossary/endpoint.md
@@ -4,14 +4,19 @@ id: endpoints
date: 2020-04-23
full_link:
short_description: >
Endpoints track the IP addresses of Pods with matching Service selectors.
An endpoint of a Service is one of the Pods (or external servers) that implements the Service.
Member

Even though it is deprecated, the API still exists, so I think the old definition still applies.

Member

I see, it was discussed here: #49831 (comment). It seems the preference is to follow this path.


aka:
tags:
- networking
---
Endpoints track the IP addresses of Pods with matching {{< glossary_tooltip text="selectors" term_id="selector" >}}.
An endpoint of a {{< glossary_tooltip text="Service" term_id="service" >}} is one of the {{< glossary_tooltip text="Pods" term_id="pod" >}} (or external servers) that implements the Service.

<!--more-->
Endpoints can be configured manually for {{< glossary_tooltip text="Services" term_id="service" >}} without selectors specified.
The {{< glossary_tooltip text="EndpointSlice" term_id="endpoint-slice" >}} resource provides a scalable and extensible alternative to Endpoints.
For Services with {{< glossary_tooltip text="selectors" term_id="selector" >}},
the EndpointSlice controller will automatically create one or more {{<
glossary_tooltip text="EndpointSlices" term_id="endpoint-slice" >}} giving the
IP addresses of the selected endpoint Pods.

EndpointSlices can also be created manually to indicate the endpoints of
Services that have no selector specified.