InferenceModel status conditions update #380

Open
ahg-g opened this issue Feb 20, 2025 · 1 comment

Comments

ahg-g (Contributor) commented Feb 20, 2025

InferenceModel defines an Accepted condition type with three possible reasons:

  • Pending, which is the default when the object is created
  • ModelNameInUse, which is set if the ModelName is used by another InferenceModel
  • Accepted, which is set when the model conforms to the state of the InferencePool it references.
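As a rough illustration of how the three reasons relate, here is a minimal Python sketch. It is not the project's implementation: the `InferenceModel` and `Condition` classes, the `accepted_condition` helper, the `conforms_to_pool` flag, and the mapping of reasons to `True`/`False`/`Unknown` statuses are all assumptions made for the example. In particular, what "conforms to the state of the InferencePool" means is left to the caller, and the sketch marks every model that shares a name with another one, without deciding which of them should win.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class InferenceModel:
    """Hypothetical, trimmed-down stand-in for the real resource."""
    name: str        # metadata.name
    model_name: str  # spec.modelName


@dataclass(frozen=True)
class Condition:
    type: str
    status: str  # "True", "False" or "Unknown" (assumed mapping)
    reason: str


def accepted_condition(
    model: InferenceModel,
    all_models: Iterable[InferenceModel],
    conforms_to_pool: bool,
) -> Condition:
    """Derive the Accepted condition for `model`."""
    # ModelNameInUse: another InferenceModel already uses the same ModelName.
    for other in all_models:
        if other.name != model.name and other.model_name == model.model_name:
            return Condition("Accepted", "False", "ModelNameInUse")

    # Accepted: the model conforms to the InferencePool it references.
    if conforms_to_pool:
        return Condition("Accepted", "True", "Accepted")

    # Pending: the default until one of the other reasons applies.
    return Condition("Accepted", "Unknown", "Pending")
```

Whichever component ends up owning the condition would need the full list of InferenceModel objects to evaluate the ModelNameInUse case, which is part of why the choice of owner matters.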

The question here is whether we want the epp to populate this condition, or some other component such as the gateway controller.

As of right now, there is no reason for the gateway controller to be aware of the InferenceModel API; it only cares about the InferencePool, which it uses to establish the connection between the proxy and the epp.

One problem with having the epp update the condition is that it may run in HA active-active mode, which complicates synchronizing status updates across replicas. It would also require every other epp implementation to do the same.
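To make the active-active concern concrete, the sketch below simulates several replicas writing the same status through optimistic concurrency, which Kubernetes exposes via `resourceVersion`. Everything here is an in-memory stand-in invented for the example (`FakeAPIServer`, `StoredStatus`, `reconcile_status`), not a real client API. It shows one possible mitigation: skip writes that would not change anything, and re-read and retry on a version conflict.

```python
from dataclasses import dataclass


class Conflict(Exception):
    """Raised when a write is based on a stale resource version."""


@dataclass
class StoredStatus:
    resource_version: int = 0
    reason: str = "Pending"


class FakeAPIServer:
    """In-memory stand-in for the API server holding one object's status."""

    def __init__(self) -> None:
        self._obj = StoredStatus()
        self.writes = 0

    def get(self) -> StoredStatus:
        # Return a copy so callers cannot mutate server state directly.
        return StoredStatus(self._obj.resource_version, self._obj.reason)

    def update_status(self, resource_version: int, reason: str) -> None:
        # Optimistic concurrency: reject writes based on an old version.
        if resource_version != self._obj.resource_version:
            raise Conflict(f"stale resource version {resource_version}")
        self._obj = StoredStatus(resource_version + 1, reason)
        self.writes += 1


def reconcile_status(api: FakeAPIServer, desired_reason: str,
                     max_attempts: int = 3) -> bool:
    """Write `desired_reason` if needed; return True only if a write happened."""
    for _ in range(max_attempts):
        current = api.get()
        if current.reason == desired_reason:
            return False  # another replica already wrote it; nothing to do
        try:
            api.update_status(current.resource_version, desired_reason)
            return True
        except Conflict:
            continue  # lost the race: re-read and try again
    raise Conflict("gave up after repeated conflicts")
```

If every replica computes the same desired reason, only the first call writes and the rest become no-ops. The sketch does not help when replicas disagree about the desired reason, since they would then overwrite each other in turn; that case is the harder part of the synchronization problem described above.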

This issue is meant to open and track the discussion on this topic, and to raise awareness that the conditions are currently not being updated by any component.

ahg-g (Contributor, Author) commented Feb 20, 2025

One interesting direction here is to have a third component (inference-gateway-controller) that handles status updates for all InferencePool and InferenceModel objects in the cluster, including other potential status tracking such as reporting the number of ready endpoints on the InferencePool. However, some status updates on the InferencePool should be owned by the gateway controller, so this is not a clear path either: ideally, the status of an object is owned by a single controller.
