Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs): it lets users chat with LLM models, execute structured function calls, and get structured output. It also works with models not fine-tuned for JSON output and function calls.
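Getting structured output from models that were never fine-tuned for it is typically done at the sampling layer rather than by the model itself. As a minimal sketch of that idea, llama-cpp-python (the bindings this kind of framework builds on) exposes a grammar-backed JSON mode; the model path below is a placeholder, and this is not llama-cpp-agent's own API:

```python
from llama_cpp import Llama

# Load a local GGUF model; the path is a placeholder, not a real file.
llm = Llama(model_path="./models/model.gguf", n_ctx=2048)

# JSON mode constrains sampling with a grammar, so even a model that was
# never fine-tuned for JSON can only emit syntactically valid JSON.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Answer with a JSON object."},
        {"role": "user", "content": "Extract the city from: 'I live in Oslo.'"},
    ],
    response_format={"type": "json_object"},
)
print(response["choices"][0]["message"]["content"])
```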
The AI-native open-source embedding database
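This is Chroma's tagline. A minimal usage sketch (in-memory client and made-up documents; `chromadb.PersistentClient(path=...)` persists to disk instead):

```python
import chromadb

# In-memory client for experimentation.
client = chromadb.Client()
collection = client.create_collection(name="docs")

# Documents are embedded automatically with the default embedding function.
collection.add(
    documents=["Llamas are camelids.", "Paris is the capital of France."],
    ids=["doc1", "doc2"],
)

# Query by text; nearest documents come back ranked by similarity.
results = collection.query(query_texts=["Where do llamas come from?"], n_results=1)
print(results["documents"])
```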
Chat with your legal assistant.
User-friendly WebUI for LLMs (Formerly Ollama WebUI)
Self-hosted AI coding assistant
Work with LLMs on a local environment using containers
🤖 A collection of practical AI repos, tools, websites, papers, and tutorials: a practical AI treasure chest 💎
Structured Text Generation
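Libraries in this space generally work by masking the sampler's choices so that only tokens consistent with a target pattern can be picked. A toy, self-contained illustration of that mechanism (not any particular library's API):

```python
import random

# The constraint: output must be exactly one of these strings.
CHOICES = ["positive", "negative", "neutral"]

def constrained_generate() -> str:
    """Generate character by character, masking to characters that keep
    the output a prefix of at least one allowed choice."""
    out = ""
    while out not in CHOICES:
        # Characters a real sampler would be *allowed* to emit next.
        allowed = {c[len(out)] for c in CHOICES
                   if c.startswith(out) and len(c) > len(out)}
        # A real system renormalizes the model's logits over `allowed`;
        # uniform sampling keeps the sketch self-contained.
        out += random.choice(sorted(allowed))
    return out

print(constrained_generate())  # always one of the three labels, never malformed
```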
TypeScript SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory
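The memory saving comes from combining 4-bit quantized base weights with LoRA adapters, so only small adapter matrices are trained. A hedged sketch of the load-and-patch step, following the pattern in Unsloth's published examples (the model name is one of their example checkpoints; exact arguments may differ by version):

```python
from unsloth import FastLanguageModel

# Load a 4-bit quantized base model (example checkpoint from Unsloth's docs).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters: only these small matrices receive gradients,
# which is where most of the memory saving comes from.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
# Training then proceeds with a standard Hugging Face / TRL trainer loop.
```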
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
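The "2 lines" claim maps onto the Python API roughly as below (file paths are placeholders; details may vary by release):

```python
import lance
import pyarrow.parquet as pq

# The advertised two-line conversion: read Parquet, write Lance.
table = pq.read_table("data.parquet")      # placeholder path
lance.write_dataset(table, "data.lance")

# Random access by row index is where the format claims its 100x speedup.
ds = lance.dataset("data.lance")
rows = ds.take([0, 42, 10_000])            # fetch arbitrary rows directly
print(rows.to_pydict())
```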
Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
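This appears to be LitGPT's description. Its quick-start Python API looks roughly like the following, mirroring the project's README example; names may differ across versions, and pretraining/finetuning run through its CLI rather than this interface:

```python
from litgpt import LLM

# Download and load one of the supported checkpoints (README example).
llm = LLM.load("microsoft/phi-2")

# Generate text from the loaded model.
text = llm.generate("Fix the spelling: Every fall, the familly goes to the mountains.")
print(text)
```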