InferenceService Status restructuring
Currently, ModelMesh has its own set of status fields that it populates in the InferenceService status, since we allow unknown fields there. Similarly, KServe has its own status fields, which are predominantly based on Knative fields.
The goal here is to converge on a common set of status fields that works across the different deployment modes.
As an overview, here are the current status fields:
InferenceServiceStatus
InferenceServiceStatus
├──── duckv1.Status `json:",inline"`
│  │  // The 'Generation' of the Service that was last processed by the controller.
│  ├─── ObservedGeneration int64 `json:"observedGeneration,omitempty"`
│  │
│  │  // Conditions the latest available observations of a resource's current state.
│  │  // Condition defines a readiness condition for a Knative resource.
│  ├─── Conditions Conditions `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type"`
│  │
│  │  // Additional Status fields for the Resource to save some additional State as well as convey more information to the user.
│  └─── Annotations map[string]string `json:"annotations,omitempty"`
│
├──── Address *duckv1.Addressable `json:"address,omitempty"`
├──── URL *apis.URL `json:"url,omitempty"`
└──── Components map[ComponentType]ComponentStatusSpec `json:"components,omitempty"`
The ComponentTypes are predictor, explainer, and transformer, each of which maps to a ComponentStatusSpec.
An example of an actual status can be found here: https://pastebin.com/wkCrZyxk
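To make the tree above easier to read as code, here is a minimal Go sketch of the same shape. It is not the KServe source: `Condition`, `Status`, and `Addressable` are simplified stand-ins for the Knative types, URLs are plain strings rather than `*apis.URL`, and the single `URL` field on `ComponentStatusSpec` is a placeholder because the issue does not list that type's fields. The helpers `sampleStatus` and `roundTrip` are hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Condition and Status are simplified stand-ins for the Knative duck types.
// They are NOT the real knative.dev/pkg definitions.
type Condition struct {
	Type   string `json:"type"`
	Status string `json:"status"`
}

type Status struct {
	ObservedGeneration int64             `json:"observedGeneration,omitempty"`
	Conditions         []Condition       `json:"conditions,omitempty"`
	Annotations        map[string]string `json:"annotations,omitempty"`
}

// Addressable stands in for duckv1.Addressable; the URL is a plain string here.
type Addressable struct {
	URL string `json:"url,omitempty"`
}

type ComponentType string

const (
	Predictor   ComponentType = "predictor"
	Explainer   ComponentType = "explainer"
	Transformer ComponentType = "transformer"
)

// ComponentStatusSpec's fields are not listed in the issue; URL is a placeholder.
type ComponentStatusSpec struct {
	URL string `json:"url,omitempty"`
}

type InferenceServiceStatus struct {
	// Embedded with an empty JSON name, so encoding/json flattens its fields
	// into the parent object.
	Status     `json:",inline"`
	Address    *Addressable                          `json:"address,omitempty"`
	URL        string                                `json:"url,omitempty"`
	Components map[ComponentType]ComponentStatusSpec `json:"components,omitempty"`
}

// sampleStatus builds an illustrative status; all values are made up.
func sampleStatus() InferenceServiceStatus {
	return InferenceServiceStatus{
		Status: Status{
			ObservedGeneration: 1,
			Conditions:         []Condition{{Type: "Ready", Status: "True"}},
		},
		URL: "http://example.com/model",
		Components: map[ComponentType]ComponentStatusSpec{
			Predictor: {URL: "http://example.com/predictor"},
		},
	}
}

// roundTrip encodes the status to JSON and decodes it back again.
func roundTrip(in InferenceServiceStatus) (InferenceServiceStatus, []byte, error) {
	var out InferenceServiceStatus
	raw, err := json.Marshal(in)
	if err != nil {
		return out, nil, err
	}
	err = json.Unmarshal(raw, &out)
	return out, raw, err
}

func main() {
	out, raw, err := roundTrip(sampleStatus())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
	fmt.Println(out.Components[Predictor].URL)
}
```

The embedded `Status` is what ties this layout to Knative: `observedGeneration`, `conditions`, and `annotations` appear at the top level of the serialized status next to `address`, `url`, and `components`.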
ModelMesh Predictor Status
PredictorStatus
├──── Available bool `json:"available"`
│
│  // One of 'UpToDate', 'InProgress', 'BlockedByFailedLoad', or 'InvalidSpec'
├──── TransitionStatus TransitionStatus `json:"transitionStatus"`
│
│  // High level state string: Pending, Standby, Loading, Loaded, FailedToLoad
├──── ActiveModelState ModelState `json:"activeModelState"`
├──── TargetModelState ModelState `json:"targetModelState"`
│
│  // Details of last failure, when load of target model is failed or blocked
├──── LastFailureInfo *FailureInfo `json:"lastFailureInfo,omitempty"`
│
│  // Addressable endpoint for the deployed trained model. This will be "static" and will not change when the model is mutated
├──── HTTPEndpoint string `json:"httpEndpoint"`
├──── GrpcEndpoint string `json:"grpcEndpoint"`
│
│  // How many copies of this predictor's models failed to load recently
└──── FailedCopies int `json:"failedCopies"`
An example of an actual predictor status can be found here: https://pastebin.com/wBM3WgFW
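The predictor status can be sketched in Go the same way. The field names, JSON tags, and enumerated values below are taken from the tree above; the constant names, the single `Message` field on `FailureInfo`, and the `encode` helper are assumptions made for illustration and are not taken from the ModelMesh source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type TransitionStatus string

// Values as listed in the issue; the Go constant names are assumptions.
const (
	UpToDate            TransitionStatus = "UpToDate"
	InProgress          TransitionStatus = "InProgress"
	BlockedByFailedLoad TransitionStatus = "BlockedByFailedLoad"
	InvalidSpec         TransitionStatus = "InvalidSpec"
)

type ModelState string

const (
	Pending      ModelState = "Pending"
	Standby      ModelState = "Standby"
	Loading      ModelState = "Loading"
	Loaded       ModelState = "Loaded"
	FailedToLoad ModelState = "FailedToLoad"
)

// FailureInfo's fields are not listed in the issue; Message is a placeholder.
type FailureInfo struct {
	Message string `json:"message,omitempty"`
}

type PredictorStatus struct {
	Available        bool             `json:"available"`
	TransitionStatus TransitionStatus `json:"transitionStatus"`
	ActiveModelState ModelState       `json:"activeModelState"`
	TargetModelState ModelState       `json:"targetModelState"`
	LastFailureInfo  *FailureInfo     `json:"lastFailureInfo,omitempty"`
	HTTPEndpoint     string           `json:"httpEndpoint"`
	GrpcEndpoint     string           `json:"grpcEndpoint"`
	FailedCopies     int              `json:"failedCopies"`
}

// encode is a hypothetical helper that returns the JSON form of a status.
func encode(p PredictorStatus) string {
	raw, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	return string(raw)
}

func main() {
	// A predictor serving one model version while a new one is loading.
	p := PredictorStatus{
		Available:        true,
		TransitionStatus: InProgress,
		ActiveModelState: Loaded,
		TargetModelState: Loading,
	}
	fmt.Println(encode(p))
}
```

Unlike the Knative-based status, most of these fields carry no `omitempty`, so they are always serialized, even when empty or zero. Only `lastFailureInfo` is dropped when unset.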
Action item: Decide how we can restructure the fields in a way that makes sense for all the deployment modes. Do we still rely on the Knative types?
Top GitHub Comments
For both KServe and ModelMesh to update and understand the status/state of an inference service, I would suggest continuing to use the Knative types. That probably means that InferenceServiceStatus needs to cover the ModelMesh status requirements, and ModelMesh needs to include the Knative packages and change its code accordingly.
FYI @yuzisun, @njhill, @chinhuang007
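As a rough illustration of what covering the ModelMesh status with Knative-style conditions could look like, the sketch below derives a single `Ready` condition from a trimmed-down predictor status. This is purely hypothetical: the mapping rules, the `readyCondition` function, the `FailureMessage` field (a stand-in for `LastFailureInfo`), and the local `Condition` type are assumptions for discussion, not KServe or ModelMesh behavior. The condition statuses use the usual Kubernetes values `True`, `False`, and `Unknown`.

```go
package main

import "fmt"

type ModelState string

const (
	Loading      ModelState = "Loading"
	Loaded       ModelState = "Loaded"
	FailedToLoad ModelState = "FailedToLoad"
)

// PredictorStatus is a trimmed-down stand-in for the ModelMesh status.
// FailureMessage is a placeholder for the details held in LastFailureInfo.
type PredictorStatus struct {
	Available        bool
	ActiveModelState ModelState
	TargetModelState ModelState
	FailureMessage   string
}

// Condition is a simplified stand-in for a Knative condition.
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// readyCondition is a hypothetical mapping from ModelMesh state to a
// Knative-style Ready condition. The rules are assumptions:
//   - an available predictor is Ready
//   - an unavailable predictor whose load failed is not Ready
//   - anything else (for example, still loading) is Unknown
func readyCondition(p PredictorStatus) Condition {
	switch {
	case p.Available:
		return Condition{Type: "Ready", Status: "True", Reason: string(p.ActiveModelState)}
	case p.ActiveModelState == FailedToLoad || p.TargetModelState == FailedToLoad:
		return Condition{Type: "Ready", Status: "False", Reason: string(FailedToLoad), Message: p.FailureMessage}
	default:
		return Condition{Type: "Ready", Status: "Unknown", Reason: string(p.TargetModelState)}
	}
}

func main() {
	c := readyCondition(PredictorStatus{
		ActiveModelState: Loading,
		TargetModelState: Loading,
	})
	fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
}
```

Whether an available predictor with a failed target model should still count as Ready is exactly the kind of question the restructuring would need to settle; the sketch simply picks one answer.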