Gregory-Pereira
Contributor

linux-foundation-easycla bot commented Aug 14, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot
Contributor

Welcome @Gregory-Pereira!

It looks like this is your first PR to kubernetes-sigs/gateway-api-inference-extension 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gateway-api-inference-extension has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Aug 14, 2025
@k8s-ci-robot
Contributor

Hi @Gregory-Pereira. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 14, 2025

netlify bot commented Aug 14, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 602f45f
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68c2df646247e800081e3757
😎 Deploy Preview https://deploy-preview-1381--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Aug 14, 2025
@Gregory-Pereira
Contributor Author

Gregory-Pereira commented Aug 14, 2025

cc @SergeyKanzhelev @ahg-g

@kfswain
Collaborator

kfswain commented Aug 18, 2025

Meta-question here; does a destination rule have to be provided at the time of inferencePool creation? I don't fully understand the use case here, so to me it looks like we are just extracting all the DestinationRule fields into the Helm chart. That may set a pattern we don't want, where the Helm chart is essentially just a wrapper over the direct K8s manifest.

Said another way; what are we simplifying by exposing the entire CRD in the helm chart?

@Gregory-Pereira
Contributor Author

Gregory-Pereira commented Aug 18, 2025

Meta-question here; does a destination rule have to be provided at the time of inferencePool creation?

The key reason I wanted this change is that routing traffic properly with Istio requires a DestinationRule. The .host value for that DestinationRule needs to be the EPP service address, so the DestinationRule depends on the name of the EPP service and thus on the release name used when installing the GIE charts.
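
To make that dependency concrete, here is a minimal sketch of such a DestinationRule. It is not taken from this PR's diff: the release name `my-release`, the resulting service name `my-release-epp`, and the `trafficPolicy` settings are all assumptions for illustration.

```yaml
# Illustrative sketch only; names and TLS settings are assumptions.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-release-epp          # hypothetical; follows the release name
spec:
  # Must match the EPP Service. That name is derived from the Helm release,
  # so it changes whenever the release name changes.
  host: my-release-epp          # hypothetical EPP service name
  trafficPolicy:
    tls:
      mode: SIMPLE              # assumed: TLS origination toward the EPP
      insecureSkipVerify: true  # assumed: EPP serves a self-signed certificate
```

Because `host` is tied to the release name, a manifest like this cannot be written correctly ahead of time without guessing that name.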

What we do currently (which is definitely not ideal) is create the DestinationRule in the infra charts, mapping gateway traffic to a service address that does not exist yet. Then we create the releases of the model-service and GIE charts, which create the other pieces of the stack, and we apply a bogus label to the HTTPRoute and/or Gateway to force reconciliation.

Said another way; what are we simplifying by exposing the entire CRD in the helm chart?

In my eyes we are simplifying the user experience. AFAICT, this chart does not support extraDeploy or any method to apply or track miscellaneous but related manifests. The alternatives I see here are creating an individual DestinationRule out of band after chart application, which relies on scripting rather than coordinating helmfile releases, or guessing the service address ahead of time in infra and applying it there.

Definitely open to alternatives if you have anything better in mind.

Comment on lines +4 to +3
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
Contributor

Do we have to use Istio-specific APIs (DestinationRule), or can standard Gateway API resources be used instead?
(For example, BackendTrafficPolicy and BackendTLSPolicy:
https://gateway-api.sigs.k8s.io/api-types/backendtrafficpolicy/
https://gateway-api.sigs.k8s.io/api-types/backendtlspolicy/?h=backendtlspolicy)

Contributor Author

Definitely open to it if you want to contribute an alternative; it is just beyond the scope of this initial implementation for integration with llm-d.

@liu-cong
Contributor

liu-cong commented Aug 18, 2025

Great discussions between @Gregory-Pereira and @kfswain.

Creating a DestinationRule that points to the EPP service makes sense to me (unless there are better ways to do it), similar to how we create the GCPBackendPolicy and HealthCheckPolicy for GKE. To me these are resources needed to make things work, but they are not "mandatory" information that users need to know unless they want to customize.

But if that's our goal (hiding unnecessary complexity for most initial use cases), then we should probably provide an optional default config that works out of the box, instead of asking the user to provide the entire DestinationRule config. The current GKE example provides a default config that works out of the box. The question then becomes: what is the minimal DestinationRule config required to get things to work? Is it just the host part?
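
One possible shape for such a default, sketched as a Helm template. Everything here is hypothetical rather than the chart's actual contents: the values keys (`provider.name`, `istio.destinationRule.trafficPolicy`), the `<release>-epp` naming, and the template itself are assumptions. It also assumes `values.yaml` defines the `istio.destinationRule` map, even if empty, so the lookup does not fail at render time.

```yaml
{{- if eq .Values.provider.name "istio" }}
# Hypothetical template: rendered only when the Istio provider is selected.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: {{ .Release.Name }}-epp
  namespace: {{ .Release.Namespace }}
spec:
  # Host is derived from the release, so users never have to supply it.
  host: {{ .Release.Name }}-epp
  {{- with .Values.istio.destinationRule.trafficPolicy }}
  # Optional override; omitted entirely when no trafficPolicy is configured.
  trafficPolicy:
    {{- toYaml . | nindent 4 }}
  {{- end }}
{{- end }}
```

With this layout a user would only select the provider, and would touch `trafficPolicy` only to customize it. Whether `host` alone is sufficient for routing to work is the open question above, not something this sketch settles.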

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 21, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Gregory-Pereira
Once this PR has been reviewed and has the lgtm label, please assign ahg-g for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 28, 2025
@Gregory-Pereira Gregory-Pereira force-pushed the istio-provider-support branch 2 times, most recently from ca7ea49 to f6f6e5c Compare August 28, 2025 00:14
Contributor

@liu-cong liu-cong left a comment

/lgtm

From what I can tell from our Istio doc, this is needed for Istio routing to work. I think this is a nice-to-have, and we can simplify the Istio installation doc once this is in.

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Aug 28, 2025
@k8s-ci-robot
Contributor

New changes are detected. LGTM label has been removed.

@danehans
Contributor

danehans commented Sep 2, 2025

This PR needs a link to an issue or a more detailed description of what the problem is and how this PR fixes it.

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 2, 2025
@danehans
Contributor

danehans commented Sep 2, 2025

xref: #1475, which refactors the setup guide and may affect this PR.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 6, 2025
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 11, 2025
@k8s-ci-robot
Contributor

PR needs rebase.


Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
6 participants