add helm template #416

Kuromesi · 2025-02-27T02:53:12Z

Resolves #381 by adding a Helm chart for deployment.

A generated example of the rendered manifests is in config/manifests/gateway-api-inference-extension/generated.yaml.

To avoid conflicts with other releases, I prefix the resource names with the Helm release name, as shown in config/manifests/gateway-api-inference-extension/templates/_helpers.tpl.
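
For readers unfamiliar with the pattern, release-scoped naming in a Helm template commonly looks something like the sketch below. This is an illustration only, not the helper from this PR: the file name, the `epp` suffix, and the port are assumptions.

```yaml
# templates/service.yaml (hypothetical file; the "epp" suffix is an assumption)
apiVersion: v1
kind: Service
metadata:
  # Prefix the name with the release name so that two releases installed
  # into the same namespace do not collide. Truncate to 63 characters,
  # the length limit for Service names, and drop any trailing dash.
  name: {{ printf "%s-epp" .Release.Name | trunc 63 | trimSuffix "-" }}
  namespace: {{ .Release.Namespace }}
spec:
  selector:
    app: {{ printf "%s-epp" .Release.Name | trunc 63 | trimSuffix "-" }}
  ports:
    - port: 9002   # placeholder port, not taken from this PR
```

In a real chart the repeated `printf` expression would usually live in a named template in `_helpers.tpl` and be pulled in with `include`.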

k8s-ci-robot · 2025-02-27T02:53:19Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Kuromesi
Once this PR has been reviewed and has the lgtm label, please assign jeffwan for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2025-02-27T02:53:22Z

Hi @Kuromesi. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


netlify · 2025-02-27T02:54:01Z

✅ Deploy Preview for gateway-api-inference-extension ready!

Name	Link
🔨 Latest commit	`c91ec3c`
🔍 Latest deploy log	https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67bfd39b64b78d0008de3d29
😎 Deploy Preview	https://deploy-preview-416--gateway-api-inference-extension.netlify.app

netlify · 2025-02-27T02:55:51Z

✅ Deploy Preview for gateway-api-inference-extension ready!

Name	Link
🔨 Latest commit	`2366460`
🔍 Latest deploy log	https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67c172f89a1ea4000862e593
😎 Deploy Preview	https://deploy-preview-416--gateway-api-inference-extension.netlify.app

robscott · 2025-02-27T20:55:16Z

Thanks @Kuromesi! I'll try to take a look at this today

/assign

ahg-g · 2025-02-27T21:22:04Z

/ok-to-test

Kuromesi · 2025-02-28T00:46:18Z

Thanks. I have a couple of questions I'm not certain about:

I extended the resource names to avoid conflicts, which makes it possible to deploy multiple releases in a single namespace. I'm not sure whether that is needed or whether the naming is appropriate.
Do we need to support configuring all of the ext_proc setup parameters through Helm values?

robscott · 2025-02-28T01:15:41Z

Hey @Kuromesi, thanks for the work on this!

I think it would be helpful to think about how we expect users to use this project.

Initial Setup

Install APIs (CRDs)
Install or enable a Gateway controller that supports this API
Set up an initial Gateway
Maybe set up body-to-header translator to enable routing based on model param in body (Add code for Envoy extension that support body-to-header translation #355)

Day to Day

Deploy InferencePool(s), each of which will be bundled with an Endpoint Picker
Configure InferenceModel(s) that will be served by an InferencePool
Configure HTTPRoute(s) to point to InferencePool(s)
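
As a rough sketch of that last step, an HTTPRoute that targets an InferencePool might look like the following. The Gateway name, route name, and pool name are placeholders, and the API group shown for InferencePool is an assumption that should be checked against the CRDs actually installed.

```yaml
# Hypothetical HTTPRoute pointing at an InferencePool instead of a Service.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route                 # placeholder name
spec:
  parentRefs:
    - name: inference-gateway     # placeholder Gateway from the initial setup
  rules:
    - backendRefs:
        # The backend is the pool itself; the Endpoint Picker bundled with
        # the pool chooses which model server endpoint receives the request.
        - group: inference.networking.x-k8s.io   # assumed API group
          kind: InferencePool
          name: my-pool           # placeholder pool name
```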

While your PR seems to do a great job at capturing the config required for our current quickstart guide, it's not where we want to be long term. In the next ~month, I'm hopeful that we'll have built-in support for this pattern from the kgateway, Istio, and GKE Gateway implementations. That will mean that instead of manually patching Envoy Gateway, as our current quickstart guide (and this Helm chart) do, users will be able to just use these APIs directly.

With that background, I think the original issue was specifically asking for a chart that "simplifies creating an InferencePool with an associated EPP deployment".

I think the ideal for this would be a chart that takes a parameter for the InferencePool name and has defaults for all the rest, including the EPP configuration (Deployment, Service, HPA, RBAC). It looks like you have a lot of this in the chart already, but ideally the chart could be restructured to focus exclusively on the InferencePool and deploying a corresponding extension.
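
To make that shape concrete, the values file for such a chart might look roughly like the sketch below. Every key name here is hypothetical and only illustrates the idea of one required value with defaults for everything else; it is not the layout this PR or the project settled on.

```yaml
# values.yaml (hypothetical; every key name below is an assumption)
inferencePool:
  name: my-pool              # the one value users would be expected to set
  targetPortNumber: 8000     # port the model servers listen on (placeholder)
  selector:
    app: my-model-server     # labels selecting the model server Pods (placeholder)

# Defaults for the Endpoint Picker (EPP) bundled with the pool
endpointPicker:
  replicas: 1
  image: example.com/epp:latest   # placeholder image reference
  autoscaling:
    enabled: false                # would control whether an HPA is rendered
  rbac:
    create: true                  # would control whether RBAC objects are rendered
```

With a layout like this, installing a pool would come down to something like `helm install my-pool ./path/to/chart --set inferencePool.name=my-pool`, where the chart path is again a placeholder.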

In the future we could expand this chart to include InferenceModels pointing at the InferencePool.

I'd recommend leaving all CRD, Gateway, and HTTPRoute configuration out of this chart. Hopefully that approach makes sense. I'm also happy to chat about this in the #gateway-api-inference-extension channel on Kubernetes Slack if that would be easier.

Kuromesi · 2025-02-28T01:58:34Z

Got it, thanks!


k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) on Feb 27, 2025

k8s-ci-robot requested review from danehans and kfswain February 27, 2025 02:53

k8s-ci-robot added the needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) and size/XXL (denotes a PR that changes 1000+ lines, ignoring generated files) labels on Feb 27, 2025

initialize helm template

4931640

Signed-off-by: Kuromesi <[email protected]>

Kuromesi force-pushed the helm branch from c91ec3c to 4931640 on February 27, 2025 02:54

k8s-ci-robot assigned robscott Feb 27, 2025

k8s-ci-robot added the ok-to-test label (indicates a non-member PR verified by an org member that is safe to test) and removed the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) on Feb 27, 2025

tidy template

2366460

Signed-off-by: Kuromesi <[email protected]>

k8s-ci-robot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) and removed the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) on Feb 28, 2025
