This repository contains a reference architecture and test cases for Foundation Model inference with KServe on Amazon EKS, integrating Karpenter as cluster autoscaler.
KServe offers a standard, Kubernetes-based model inference platform for scalable use cases. Complementing it, Karpenter provides fast, simplified compute provisioning that makes optimal use of cloud resources. Together, they make it practical to run inference on Spot Instances, improving cost efficiency. This reference architecture illustrates the mechanics of these technologies and demonstrates their combined power in enabling efficient serverless ML deployments.
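As a rough sketch of how the two pieces fit together, the manifests below show a Karpenter NodePool that allows Spot capacity alongside a KServe InferenceService that can scale to zero. All names (`gpu-spot`, `llm-demo`, the instance types, and the `storageUri`) are illustrative placeholders, and the exact API versions may differ depending on the Karpenter and KServe releases you install:

```yaml
# Hypothetical Karpenter NodePool: lets Karpenter provision Spot GPU nodes,
# falling back to on-demand when Spot capacity is unavailable.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-spot
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["g5.xlarge", "g5.2xlarge"]   # example GPU instance types
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
---
# Hypothetical KServe InferenceService: minReplicas of 0 enables
# scale-to-zero (serverless) behavior for the model endpoint.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-demo
spec:
  predictor:
    minReplicas: 0
    model:
      modelFormat:
        name: huggingface
      storageUri: s3://example-bucket/models/llm   # placeholder model location
      resources:
        limits:
          nvidia.com/gpu: "1"
```

When a request arrives and the predictor scales up from zero, the pending pod's resource requirements drive Karpenter to launch a matching node, preferring Spot capacity under this NodePool.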
This section guides you through deploying an EKS cluster and the required Kubernetes custom resources.
This repository is built on top of Karpenter Blueprints. Please refer to that repository for the infrastructure setup, or run the provided Make targets. Once you have deployed the cluster, you can apply the Kubernetes custom resources.
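Applying the custom resources and checking that the inference endpoint becomes ready might look like the following (the `manifests/` directory and resource name are illustrative placeholders, not paths defined by this repository):

```sh
# Apply the KServe/Karpenter custom resources (hypothetical directory name)
kubectl apply -f manifests/

# Watch the InferenceService until its READY column shows True
kubectl get inferenceservice llm-demo -w

# Inspect the nodes Karpenter provisioned, including their capacity type
kubectl get nodes -L karpenter.sh/capacity-type
```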
This library is licensed under the MIT-0 License. See the LICENSE file.