This repository contains a reference architecture and test cases for Foundation Model inference with KServe on Amazon EKS, integrating Karpenter as cluster autoscaler.
KServe offers a standard, Kubernetes-based model inference platform for scalable use cases. Complementing it, Karpenter provides fast, simplified compute provisioning that makes optimal use of cloud resources. Together, they make it practical to run inference on Spot Instances, improving cost efficiency. This reference architecture illustrates the mechanics of these technologies and demonstrates their combined power in enabling efficient serverless ML deployments.
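As a rough sketch of how the two pieces fit together, the manifests below show a Karpenter NodePool that allows Spot capacity alongside a KServe InferenceService that can scale to zero. All names (`gpu-spot`, `llm-demo`, the instance types, and the `storageUri`) are illustrative placeholders, and the exact API versions may differ depending on the Karpenter and KServe releases you install:

```yaml
# Hypothetical Karpenter NodePool: lets Karpenter provision Spot GPU nodes,
# falling back to on-demand when Spot capacity is unavailable.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-spot
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["g5.xlarge", "g5.2xlarge"]   # example GPU instance types
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
---
# Hypothetical KServe InferenceService: minReplicas of 0 enables
# scale-to-zero (serverless) behavior for the model endpoint.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-demo
spec:
  predictor:
    minReplicas: 0
    model:
      modelFormat:
        name: huggingface
      storageUri: s3://example-bucket/models/llm   # placeholder model location
      resources:
        limits:
          nvidia.com/gpu: "1"
```

When a request arrives and the predictor scales up from zero, the pending pod's resource requirements drive Karpenter to launch a matching node, preferring Spot capacity under this NodePool.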
This section guides you through deploying an EKS cluster and the required Kubernetes custom resources.
This repository is built on top of Karpenter Blueprints. Please refer to that repository for the infrastructure setup, or run the provided Make targets. Once you have deployed the cluster, you can apply the Kubernetes custom resources.
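Applying the custom resources and checking that the inference endpoint becomes ready might look like the following (the `manifests/` directory and resource name are illustrative placeholders, not paths defined by this repository):

```sh
# Apply the KServe/Karpenter custom resources (hypothetical directory name)
kubectl apply -f manifests/

# Watch the InferenceService until its READY column shows True
kubectl get inferenceservice llm-demo -w

# Inspect the nodes Karpenter provisioned, including their capacity type
kubectl get nodes -L karpenter.sh/capacity-type
```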
This library is licensed under the MIT-0 License. See the LICENSE file.