AIM351 - Optimize foundation model deployment on Amazon SageMaker

Workshop Studio link: https://catalog.workshops.aws/optimize-foundation-model-deployment-on-amazon-sagemaker

Deploy large foundation models on Amazon SageMaker

Hosting foundation models (FMs) can be challenging. Larger models are often more accurate because they include billions of parameters, but their size can also result in higher inference latency and lower throughput. Hosting an FM can require more accelerator memory and optimized kernels to achieve the best performance.

In this workshop, we demonstrate how to use SageMaker Deep Learning Containers (DLCs) and various strategies to optimize FM inference for cost and performance.
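
The snippet below is a minimal sketch (not the workshop's exact notebook code) of hosting an FM on a SageMaker real-time endpoint with an LMI (DJL Serving) DLC via the SageMaker Python SDK. The image URI, model ID, instance type, and OPTION_* environment values are illustrative assumptions; the labs walk through the actual values.

```python
import sagemaker
from sagemaker.model import Model

session = sagemaker.Session()
role = sagemaker.get_execution_role()  # assumes you run inside a SageMaker notebook/Studio

# Placeholder: pick an LMI container image from the published AWS DLC list for your region.
lmi_image_uri = "<account>.dkr.ecr.<region>.amazonaws.com/djl-inference:<lmi-tag>"

model = Model(
    image_uri=lmi_image_uri,
    role=role,
    env={
        # LMI containers can be configured through environment variables
        # instead of a serving.properties file (values here are examples).
        "HF_MODEL_ID": "meta-llama/Llama-2-13b-hf",
        "OPTION_TENSOR_PARALLEL_DEGREE": "4",
        "OPTION_ROLLING_BATCH": "auto",
    },
    sagemaker_session=session,
)

# Create the real-time endpoint (instance type is an assumption).
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
    endpoint_name="fm-lmi-demo",
)
```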

The goal of this workshop is to give you hands-on experience deploying foundation models using Amazon SageMaker.

What's included in the workshop

The labs cover the following topics (a sample endpoint invocation sketch follows the list):

  • Lab 1: Hosting large models on Amazon SageMaker with the Large Model Inference (LMI) Deep Learning Container (DLC) and TensorRT-LLM.
  • Lab 2: Deploying a SmoothQuant-quantized Llama 2 13B model with high performance on SageMaker using the SageMaker LMI DLC.
  • Lab 3: Multi-LoRA adapter inference on Amazon SageMaker.
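
As a hedged example of calling an endpoint like the one above, the sketch below invokes it with boto3. The payload shape, generation parameters, and the "adapters" field for selecting a LoRA adapter (Lab 3) are illustrative assumptions; the exact request schema depends on the LMI container version and is covered in the labs.

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

payload = {
    "inputs": "Summarize the benefits of model quantization in one sentence.",
    "parameters": {"max_new_tokens": 128, "temperature": 0.7},
    # Hypothetical adapter selection for multi-LoRA serving (Lab 3).
    "adapters": ["my-finetuned-adapter"],
}

response = runtime.invoke_endpoint(
    EndpointName="fm-lmi-demo",          # endpoint name from the deployment sketch above
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(response["Body"].read().decode("utf-8"))
```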

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
