This is a suite of hands-on training materials that shows how to scale CV, NLP, and time-series forecasting workloads with Ray.
Updated Feb 13, 2024 - Jupyter Notebook
Boosting DL Service Throughput 1.5-4x via Ensemble Pipeline Serving with Concurrent CUDA Streams, Supporting PyTorch/LibTorch Frontends and TensorRT/CVCUDA (and Other) Backends
Building Real-Time Inference Pipelines with Ray Serve
A drop-in replacement for FastAPI that enables scalable, fault-tolerant deployments with Ray Serve
A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App
This MLOps repository contains Python modules for distributed model training, tuning, and serving using PyTorch and Ray, a distributed computing framework.
Contains the basic structure that a model-serving application should have; the implementation is based on the Ray Serve framework.