Carbon Limiting Auto Tuning for Kubernetes (Python; updated May 18, 2024)
A Framework For Intelligence Farming
🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data, including streaming inference, scalable model training, and vector search.
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Modal deployment of a llama.cpp-based LLM, part of a series on Model as a Service (MaaS)
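As an illustration of the pattern such a deployment follows (not this project's exact code), a minimal Modal function running llama.cpp inference might look like the sketch below; the app name, model path, and GPU choice are assumptions.

    import modal

    app = modal.App("llama-cpp-maas")  # hypothetical app name
    image = modal.Image.debian_slim().pip_install("llama-cpp-python")

    @app.function(image=image, gpu="any")
    def generate(prompt: str) -> str:
        from llama_cpp import Llama
        # assumes a GGUF model is baked into the image or mounted at this path
        llm = Llama(model_path="/models/model.gguf")
        return llm(prompt, max_tokens=64)["choices"][0]["text"]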
On-device LLM inference powered by x-bit quantization
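For readers unfamiliar with the idea, a generic sketch of symmetric n-bit weight quantization follows; this is an illustration of the general technique, not this project's implementation.

    import numpy as np

    def quantize(w, bits=4):
        # map float weights onto signed `bits`-bit integers with one shared scale
        qmax = 2 ** (bits - 1) - 1
        scale = np.abs(w).max() / qmax
        q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # recover an approximation of the original weights at inference time
        return q.astype(np.float32) * scale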
A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap
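A minimal two-agent example in the style of AutoGen's quickstart; the model choice and config values are placeholders.

    from autogen import AssistantAgent, UserProxyAgent

    llm_config = {"model": "gpt-4", "api_key": "YOUR_KEY"}  # placeholder config
    assistant = AssistantAgent("assistant", llm_config=llm_config)
    user = UserProxyAgent("user", code_execution_config=False)
    user.initiate_chat(assistant, message="Summarize LLM inference in one sentence.")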
LLMs as Copilots for Theorem Proving in Lean
A high-performance inference system for large language models, designed for production environments.
Minimalist web-searching app with an AI assistant that runs directly from your browser. Uses Web-LLM, Ratchet-ML, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
A Tk-based graphical user interface for gpt4all. It uses the Python bindings. Run LLMs in a very slim environment and leave maximum resources for inference
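Using the gpt4all Python bindings directly looks roughly like this; the model filename below is an example and is downloaded on first use.

    from gpt4all import GPT4All

    model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")  # example model name
    with model.chat_session():
        print(model.generate("Why run LLMs locally?", max_tokens=128))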
Pretrain, finetune, and deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
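Assuming the project's Python quickstart pattern (the checkpoint ID below is an example), loading and prompting a model is roughly:

    from litgpt import LLM

    llm = LLM.load("microsoft/phi-2")  # example checkpoint; weights download on first use
    print(llm.generate("What is flash attention?"))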
The open-source serverless GPU container runtime.
Local webui for Large Language Models. Supports the GGUF format. Run LLM inference with support for STT/TTS and function calling.
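GGUF inference of this kind is typically driven by llama.cpp bindings; a generic sketch with llama-cpp-python (not this project's own code, and the model path is assumed) is:

    from llama_cpp import Llama

    llm = Llama(model_path="models/mistral-7b-instruct.Q4_K_M.gguf", n_ctx=4096)
    reply = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Hello!"}], max_tokens=64
    )
    print(reply["choices"][0]["message"]["content"])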
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
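The toolkit's core Python flow compiles a model for a target device and runs it; the IR filename, device, and input shape below are examples.

    import numpy as np
    import openvino as ov

    core = ov.Core()
    model = core.read_model("model.xml")         # example IR file
    compiled = core.compile_model(model, "CPU")  # pick a target device
    result = compiled(np.zeros((1, 3, 224, 224), dtype=np.float32))[compiled.output(0)]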
∇ Valyu LLManager simplifies and scales LLM application deployment, reducing infrastructure complexity and costs.
The Arcee client for executing domain-adapted language model routines
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).