add helm template #416

Open
wants to merge 2 commits into base: main

Kuromesi
Contributor

Resolves #381 by adding a Helm chart for deployment.

An example of the generated manifests is in config/manifests/gateway-api-inference-extension/generated.yaml.

To avoid conflicts with other releases, I include the Helm release name in the resource names; the naming helpers are in config/manifests/gateway-api-inference-extension/templates/_helpers.tpl.
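
For readers unfamiliar with the pattern, a release-scoped naming helper in `_helpers.tpl` commonly looks something like the sketch below. It follows the convention used by the default `helm create` scaffold and is not a copy of the helper in this PR; the template name `example.fullname` is hypothetical.

```yaml
{{/*
Hypothetical helper: build a resource name scoped to the Helm release.
Many Kubernetes resource names are limited to 63 characters, so the
result is truncated and any trailing "-" left by truncation is removed.
*/}}
{{- define "example.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
```

A template would then reference it as `name: {{ include "example.fullname" . }}`, so two releases of the same chart render distinct resource names.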

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 27, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Kuromesi
Once this PR has been reviewed and has the lgtm label, please assign jeffwan for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Hi @Kuromesi. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 27, 2025
netlify bot commented Feb 27, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit c91ec3c
🔍 Latest deploy log https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67bfd39b64b78d0008de3d29
😎 Deploy Preview https://deploy-preview-416--gateway-api-inference-extension.netlify.app

netlify bot commented Feb 27, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 2366460
🔍 Latest deploy log https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67c172f89a1ea4000862e593
😎 Deploy Preview https://deploy-preview-416--gateway-api-inference-extension.netlify.app

@robscott
Member

Thanks @Kuromesi! I'll try to take a look at this today

/assign

@ahg-g
Contributor

ahg-g commented Feb 27, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 27, 2025
@Kuromesi
Contributor Author

Thanks! I have a couple of questions I'm not sure about:

  1. I put some effort into extending the resource names to avoid conflicts and to make it possible to deploy under a single namespace. I'm not sure whether that is needed or whether the naming is appropriate.
  2. Do we need to support configuring all of the ext_proc setup parameters through Helm values?

@robscott
Member

robscott commented Feb 28, 2025

Hey @Kuromesi, thanks for the work on this!

I think it would be helpful to think about how we expect users to use this project.

Initial Setup

Day to Day

  • Deploy InferencePool(s), each of which will be bundled with an Endpoint Picker
  • Configure InferenceModel(s) that will be served by an InferencePool
  • Configure HTTPRoute(s) to point to InferencePool(s)

While your PR seems to do a great job at capturing the config required for our current quickstart guide, it's not where we want to be long term. In the next ~month, I'm hopeful that we'll have built-in support for this pattern from the kgateway, Istio, and GKE Gateway implementations. That will mean that instead of manually patching Envoy Gateway, as our current quickstart guide (and this Helm chart) does, users will be able to just use these APIs directly.

With that background, I think the original issue was specifically asking for a chart that "simplifies creating an InferencePool with an associated EPP deployment".

I think the ideal for this would be a chart that took parameters for InferencePool name, and then had defaults for all the rest, including the EPP configuration (Deployment, Service, HPA, RBAC). It looks like you have a lot of this in the chart already, but ideally the chart could be restructured to be focused exclusively on InferencePool and deploying a corresponding extension.
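
As a rough illustration of the shape described above, a values file for such a chart might require only the pool name and default everything else. Every key below is hypothetical; none of them is taken from this PR or from any published chart.

```yaml
# values.yaml (hypothetical keys, for illustration only)
inferencePool:
  name: ""            # the one value a user would be expected to set

inferenceExtension:   # defaults for the Endpoint Picker (EPP)
  replicas: 1
  image:
    repository: ""    # placeholder; set to the EPP image you deploy
    tag: ""
  autoscaling:
    enabled: false    # would toggle rendering of an HPA
  rbac:
    create: true      # would toggle rendering of the RBAC resources
```

With a layout like this, installing a pool could reduce to something like `helm install my-pool ./chart --set inferencePool.name=my-pool`, where `./chart` stands in for wherever the chart lives.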

In the future we could expand this chart to include InferenceModels pointing at the InferencePool.

I'd recommend leaving all CRD, Gateway, and HTTPRoute configuration out of this chart. Hopefully that approach makes sense. I'm also happy to chat about this in the #gateway-api-inference-extension channel on Kubernetes Slack if that would be easier.

@Kuromesi
Contributor Author

Got it, thanks!

Signed-off-by: Kuromesi <[email protected]>
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 28, 2025
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Add helm chart to simplify creating an InferencePool + EPP deployment
4 participants