SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Updated May 29, 2024 · Python
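The low-bit quantization these toolkits implement maps floating-point tensors to narrow integer ranges via a scale factor. A minimal sketch of symmetric per-tensor INT8 quantization in plain Python (an illustration of the idea only, not any specific library's scheme; `quantize_int8` and `dequantize` are hypothetical helper names):

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: map floats into [-127, 127]
    using a single scale derived from the largest absolute value."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [qi * scale for qi in q]

q, s = quantize_int8([0.5, -1.27, 0.0, 1.0])
# q is [50, -127, 0, 100]; dequantize(q, s) recovers the inputs
# up to rounding error in the 8-bit grid.
```

Real toolkits add per-channel scales, zero points for asymmetric ranges, and calibration over activation statistics, but the scale-and-round core is the same.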
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Neural Network Compression Framework for enhanced OpenVINO™ inference
Code for the paper "FOCIL: Finetune-and-Freeze for Online Class-Incremental Learning by Training Randomly Pruned Sparse Experts"
Characterization study repository for model compression method: pruning
Architecture for pruning methods analysis using pytorch prune module
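The unstructured magnitude pruning that PyTorch's prune module (and several repos above) build on can be sketched without any framework: rank weights by absolute value and zero out the smallest fraction. `magnitude_prune` below is a hypothetical illustrative helper, not the module's actual API:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights until `sparsity` fraction
    of them are zero (the idea behind L1 unstructured pruning)."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Threshold = k-th smallest absolute value.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, removed = [], 0
    for w in weights:
        if abs(w) <= threshold and removed < k:
            pruned.append(0.0)  # masked weight
            removed += 1
        else:
            pruned.append(w)    # surviving weight
    return pruned

print(magnitude_prune([0.9, -0.05, 0.4, -0.8, 0.01, 0.3], 0.5))
# → [0.9, 0.0, 0.4, -0.8, 0.0, 0.0]
```

Structural pruning (as in the CVPR 2023 work listed below) removes whole channels or heads instead of individual weights, which is what actually speeds up dense hardware.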
Hung-yi Lee's Deep Learning Tutorial (recommended by Professor Hung-yi Lee 👍), PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
Chess engine
This is the official implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models", and it is also an efficient LLM compression tool with various advanced compression methods, supporting multiple inference backends.
Code for CPAL-2024 paper "Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates"
Sparsity-aware deep learning inference runtime for CPUs
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
Config-driven, easy-to-use backup CLI for restic.
AIMET GitHub pages documentation