
Conversation

@oliviassss (Contributor)
Issue #, if available:

Description of changes:
Onboard networking components scale test

  • coredns
    • leverage the dnsperfgo tester in upstream perf-tests/clusterloader2, and collect DNS request latency metrics.
    • currently uses the default settings: the tester creates 5 DNS client pods, plus 1 extra DNS client pod for every 100 nodes in the cluster; we can fine-tune the config later.
  • kubeproxy
    • leverage the upstream clusterloader measurement to collect kube-proxy perf metrics for network programming latency.
    • currently tested with 5k endpoints; this can be made configurable if we want to test at larger scale later.
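
For context, clusterloader2 wires measurements like this in through its test config. A hedged sketch of what the kube-proxy latency step might look like (the measurement identifier is taken from upstream perf-tests; verify the exact name and params against your clusterloader2 version):

```yaml
# Sketch of a clusterloader2 step starting the upstream kube-proxy
# network-programming-latency measurement. Names assumed from upstream
# perf-tests, not taken from this PR's diff.
steps:
  - name: Start network programming latency measurement
    measurements:
      - Identifier: NetworkProgrammingLatency
        Method: NetworkProgrammingLatency
        Params:
          action: start
```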

Test:

Tested via internal pipeline for 1k nodes: link; test results
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

action: start
{{if $ENABLE_NETWORK_POLICY_ENFORCEMENT_LATENCY_TEST}}
- module:
path: modules/network-policy/net-policy-enforcement-latency.yaml
Contributor:

Where are these modules?

Contributor:

I see that you are copying this config to the cl2 load test dir in the task here - https://github.com/awslabs/kubernetes-iteration-toolkit/pull/536/files#diff-fc65d141840f1a98569538f9a751871bc06280084dec75e8e1777f47f429814dR175

Could you add a comment here explicitly saying that you are copying this file to a relative path under the cl2 load test dir? By default this won't work if your config file sits in some other directory.

Contributor Author:

Added a comment. The test needs to access modules under the cl2 folder, so copying the config into cl2 is cleaner.
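
To illustrate why the copy matters: CL2 resolves a module's `path:` relative to the load test dir, so the module file must sit under it. The paths below are stand-ins for demonstration, not the real repo layout:

```shell
# Hypothetical illustration: copy the module config under the cl2 load
# test dir so a relative "path: modules/..." reference resolves.
set -eu
SRC=$(mktemp -d)
CL2_LOAD_DIR=$(mktemp -d)/clusterloader2/testing/load
echo "steps: []" > "$SRC/net-policy-enforcement-latency.yaml"
mkdir -p "$CL2_LOAD_DIR/modules/network-policy"
cp "$SRC/net-policy-enforcement-latency.yaml" \
   "$CL2_LOAD_DIR/modules/network-policy/"
ls "$CL2_LOAD_DIR/modules/network-policy"
```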

name: test-svc-deployment
namespace: test-svc
spec:
replicas: 5000
Contributor:

Do you want to make this configurable?

@hakuna-matatah (Contributor), Jul 25, 2025:

We don't have to block on this for this PR if you want, but you may want to keep this configurable to test at different scales.

Contributor Author:

Yeah, I would prefer to take this as a TODO. I'm thinking of 2 options:

  1. Migrate from a deployment to a daemonset, enabling us to test endpoints across all nodes (where number of endpoints = node count)
  2. Implement a configurable setup with m services, each containing k endpoints, where both parameters are configurable

I'd like to evaluate these 2 options further with the actual cluster that we will use for all the testing, and decide on the better option (less time-costly, but still stress-testing kube-proxy).
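
A rough sketch of what option 2 could look like using CL2-style templating (parameter names here are made up for illustration, not from this PR):

```yaml
# Hypothetical CL2-templated spec for option 2: configurable endpoint
# count per service. Variable names are illustrative only.
{{$ENDPOINTS_PER_SERVICE := DefaultParam .ENDPOINTS_PER_SERVICE 5000}}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-svc-deployment
  namespace: test-svc
spec:
  replicas: {{$ENDPOINTS_PER_SERVICE}}
```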

EOF
cat $(workspaces.source.path)/perf-tests/clusterloader2/pkg/prometheus/manifests/exporters/kube-state-metrics/deployment.yaml

# # TODO: Remove this once we fix https://github.com/kubernetes/kubernetes/issues/126578 or find a better way to work around it.
Contributor:

Not sure I get this?

Contributor Author:

I'm not sure if we still need this; it looks like the issue with the endpoint controller has been fixed upstream. Removing the coredns service monitor would cause Prometheus to stop scraping coredns metrics, so I commented it out.
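
For readers unfamiliar with the mechanism: Prometheus Operator discovers scrape targets via ServiceMonitor objects, which is why deleting the coredns one stops metric collection. A hedged sketch of what such an object looks like (field values are illustrative, not copied from the perf-tests manifests):

```yaml
# Illustrative ServiceMonitor: Prometheus scrapes coredns only while an
# object like this selects the kube-dns service. Values are examples.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: coredns
  namespace: monitoring
spec:
  selector:
    matchLabels:
      k8s-app: kube-dns
  namespaceSelector:
    matchNames:
      - kube-system
  endpoints:
    - port: metrics
      interval: 30s
```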


# create the service backed by 5k pods to test kubeproxy network programming performance
# we can tune the scale of pods later
kubectl apply -f $(workspaces.source.path)/perf-tests/clusterloader2/testing/load/test-svc.yaml
Contributor:

Why are you creating the workload even before the test is kicked off?

@oliviassss (Contributor Author), Jul 28, 2025:

It's better to create the service with endpoints before the clusterloader binary runs, since clusterloader only collects the kube-proxy metrics.

The service creation itself triggers kube-proxy to sync the network programming rules and generate the latency metrics (the metric measures the time gap between the endpoint creation timestamp and the time when kube-proxy finishes its programming, so the service should exist before clusterloader collects the metrics).
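
Only the Deployment fragment of test-svc.yaml is visible in this diff; presumably it is paired with a Service along these lines so that 5k endpoints are generated for kube-proxy to program (selector and ports are assumptions for illustration):

```yaml
# Hypothetical Service side of test-svc.yaml: its endpoints (one per
# backing pod) are what kube-proxy must program into node rules.
apiVersion: v1
kind: Service
metadata:
  name: test-svc
  namespace: test-svc
spec:
  selector:
    app: test-svc
  ports:
    - port: 80
      targetPort: 8080
```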

@mengqiy mengqiy merged commit 5cf3f10 into awslabs:main Jul 28, 2025