Question
What is (or will be) the preferred approach for applications and admission controllers/operators to consume models within Kubernetes? Some overarching questions to consider are:
- How does the model get to the Kubernetes cluster?
- Where is the model stored within the Kubernetes cluster?
- How is the model consumed within Kubernetes?
For example, some follow-up questions related to each of the above are:
- Will the models all be downloaded and stored in some shared persistent volume?
- Downloaded on demand to a location specific to each application on the node where it runs?
- Streamed?
- Accessed over the network via API calls?
- Consumed as OCI container images?
- Combination of different ways?
- Some other cloud native format/solution?
Options
Some potential options come to mind right now (others may follow):
Option 1
Introduce a Storage Access Service as a new component in the model-transparency ecosystem. This service would:
- Run as a standalone Deployment in the cluster.
- Dynamically introspect and access CSI-managed volumes containing models (via Kubernetes APIs and CSI driver capabilities).
- Expose a simple API (e.g., REST or gRPC) for admission controllers to fetch model data synchronously.
This enhancement would enable admission controllers to validate or process models in real time without hardcoding storage backend logic or assuming a single volume, improving scalability and maintainability.
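A minimal sketch of what the read path of such a service could look like, assuming models are exposed under a hypothetical mount point (`/var/lib/models`) and a hypothetical `GET /v1/models/{name}` endpoint; a real implementation would discover CSI-managed volumes via the Kubernetes API rather than a fixed path:

```go
package main

import (
	"log"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

// modelRoot is a hypothetical mount point where CSI-managed model
// volumes are exposed inside the service's Pod. A real implementation
// would introspect these volumes via the Kubernetes API instead.
const modelRoot = "/var/lib/models"

// serveModel handles GET /v1/models/{name} and streams the model bytes
// back to the caller (e.g., an admission controller).
func serveModel(w http.ResponseWriter, r *http.Request) {
	name := strings.TrimPrefix(r.URL.Path, "/v1/models/")
	// filepath.Clean on a rooted path prevents "../" traversal out of modelRoot.
	path := filepath.Join(modelRoot, filepath.Clean("/"+name))

	f, err := os.Open(path)
	if err != nil {
		http.Error(w, "model not found", http.StatusNotFound)
		return
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil || info.IsDir() {
		http.Error(w, "model not found", http.StatusNotFound)
		return
	}
	// ServeContent handles range requests, so large models can be
	// fetched incrementally rather than in a single response body.
	http.ServeContent(w, r, info.Name(), info.ModTime(), f)
}

func main() {
	http.HandleFunc("/v1/models/", serveModel)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```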
Option 2
Use the OCI standard for packaging and deploying ML models within cloud-native environments like Kubernetes. See #434 for details.
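As a rough illustration of this option, the sketch below packages a model file as an OCI artifact and pushes it to a registry using the oras-go v2 library. The registry URL, repository name, file name, and media types are assumptions for illustration, not an agreed-upon scheme:

```go
package main

import (
	"context"
	"log"

	v1 "github.com/opencontainers/image-spec/specs-go/v1"
	"oras.land/oras-go/v2"
	"oras.land/oras-go/v2/content/file"
	"oras.land/oras-go/v2/registry/remote"
)

func main() {
	ctx := context.Background()

	// Stage the model file from the local working directory.
	store, err := file.New(".")
	if err != nil {
		log.Fatal(err)
	}
	defer store.Close()

	// Hypothetical file name and media type; a real scheme would use an
	// agreed-upon media type for ML model layers.
	layerDesc, err := store.Add(ctx, "model.safetensors",
		"application/vnd.example.model.layer.v1", "")
	if err != nil {
		log.Fatal(err)
	}

	// Pack the layer into an OCI image manifest and tag it locally.
	manifestDesc, err := oras.PackManifest(ctx, store,
		oras.PackManifestVersion1_1,
		"application/vnd.example.model.manifest.v1",
		oras.PackManifestOptions{Layers: []v1.Descriptor{layerDesc}})
	if err != nil {
		log.Fatal(err)
	}
	if err := store.Tag(ctx, manifestDesc, "v1"); err != nil {
		log.Fatal(err)
	}

	// Hypothetical registry/repository; configure auth via repo.Client as needed.
	repo, err := remote.NewRepository("registry.example.com/models/my-model")
	if err != nil {
		log.Fatal(err)
	}

	// Push the tagged manifest and its layers to the registry.
	if _, err := oras.Copy(ctx, store, "v1", repo, "v1",
		oras.DefaultCopyOptions); err != nil {
		log.Fatal(err)
	}
}
```

With this approach, Kubernetes nodes could pull models through the same image-distribution machinery (and signing/verification tooling) already used for container images.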