Workshop Studio link: https://catalog.workshops.aws/optimize-foundation-model-deployment-on-amazon-sagemaker
Hosting foundation models (FMs) can be challenging. Larger models are often more accurate because they include billions of parameters, but their size can also result in higher inference latency or decreased throughput. Hosting an FM can require more accelerator memory and optimized kernels to achieve the best performance.
In this workshop, we demonstrate how to use SageMaker Deep Learning Containers (DLCs) and various strategies to optimize FM inference for cost and performance.
This workshop provides hands-on experience deploying foundation models using Amazon SageMaker. It covers the following topics:
- Lab 1: Hosting large models on Amazon SageMaker with the Large Model Inference (LMI) Deep Learning Container (DLC) and TensorRT-LLM.
- Lab 2: Deploying a Llama 2 13B SmoothQuant model with high performance on SageMaker using the SageMaker LMI container.
- Lab 3: Multi-LoRA adapter inference on Amazon SageMaker.
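The LMI DLC used in these labs is configured through a `serving.properties` file packaged with the model artifacts. The following is a minimal sketch for a TensorRT-LLM deployment in the spirit of Lab 1; the specific model ID and parallelism values are illustrative assumptions, not the workshop's exact configuration:

```
# serving.properties (illustrative example)
# Use the MPI engine, which LMI requires for the TensorRT-LLM backend
engine=MPI
# Hugging Face model ID or S3 URI of the model artifacts (assumed value)
option.model_id=meta-llama/Llama-2-13b-hf
# Shard the model across accelerators; value depends on instance type (assumed: 4 GPUs)
option.tensor_parallel_degree=4
# Enable TensorRT-LLM continuous (rolling) batching for higher throughput
option.rolling_batch=trtllm
# Cap concurrent requests per batch (assumed value)
option.max_rolling_batch_size=32
```

At deployment time, this file is uploaded alongside the model to Amazon S3, and the SageMaker endpoint is created from the LMI container image, which reads the properties at startup.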
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.