Aug 16, 2024 · Post-training quantization (PTQ) for large language models (LLMs) significantly accelerates model inference and relieves memory constraints, without incurring ...
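To make the term concrete, here is a minimal sketch of what post-training weight quantization means in practice: a pretrained weight matrix is mapped to low-bit integers plus a scale, with no retraining. This is a generic symmetric round-to-nearest scheme for illustration only; the function names and the per-output-channel INT8 choices are assumptions, not the procedure of any specific paper listed here.

```python
import torch

def quantize_weight_rtn(w: torch.Tensor, n_bits: int = 8):
    """Symmetric per-output-channel round-to-nearest weight quantization.

    w: weight matrix of shape (out_features, in_features).
    Returns integer codes and the per-channel scales needed to dequantize.
    """
    qmax = 2 ** (n_bits - 1) - 1                      # e.g. 127 for INT8
    scale = w.abs().amax(dim=1, keepdim=True) / qmax  # one scale per output channel
    scale = scale.clamp(min=1e-8)                     # guard against all-zero rows
    q = torch.round(w / scale).clamp(-qmax - 1, qmax).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Reconstruct an approximate floating-point weight from codes and scales."""
    return q.float() * scale

# Usage: quantize a random stand-in for a pretrained weight and check the error.
w = torch.randn(512, 512)
q, s = quantize_weight_rtn(w, n_bits=8)
print("mean abs error:", (dequantize(q, s) - w).abs().mean().item())
```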
Jul 16, 2024 · A simple yet effective post-training weight quantization method for LLMs that reconstructs the outputs of an intermediate Transformer block by leveraging low-...
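The snippet above only hints at the method, so the following is a loose, hypothetical sketch of block-wise output reconstruction with a low-rank correction: a small low-rank adjustment is fit on calibration inputs so the quantized block's output matches the full-precision block. All names here (`fp_block`, `q_block`, the additive `x @ A @ B` form, the optimizer settings) are assumptions for illustration, not the paper's actual formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fit_lowrank_correction(fp_block: nn.Module, q_block: nn.Module,
                           calib: torch.Tensor, rank: int = 8,
                           steps: int = 200, lr: float = 1e-3):
    """Learn factors A, B so that q_block(x) + x @ A @ B matches fp_block(x).

    calib: calibration activations of shape (..., hidden). Both blocks are
    assumed to map a tensor to a tensor of the same shape.
    """
    hidden = calib.shape[-1]
    A = (0.01 * torch.randn(hidden, rank)).requires_grad_(True)
    B = torch.zeros(rank, hidden, requires_grad=True)  # zero init: no correction at step 0
    opt = torch.optim.Adam([A, B], lr=lr)

    with torch.no_grad():
        target = fp_block(calib)   # full-precision reference outputs
        q_out = q_block(calib)     # fixed quantized-block outputs

    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(q_out + calib @ A @ B, target)
        loss.backward()
        opt.step()
    return A.detach(), B.detach()
```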
Nov 18, 2022 · We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation quantization for LLMs.
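A rough sketch of the smoothing idea behind SmoothQuant: per-input-channel factors computed from calibration activation statistics are divided out of the activations and folded into the weights, leaving the layer's output mathematically unchanged while moving quantization difficulty from activations to weights. The helper below uses the commonly cited scale rule s_j = max|X_j|^α / max|W_j|^(1−α) with α = 0.5; the function name and calling convention are illustrative, not the official SmoothQuant API.

```python
import torch

@torch.no_grad()
def smoothquant_scales(act_absmax: torch.Tensor,
                       weight: torch.Tensor,
                       alpha: float = 0.5) -> torch.Tensor:
    """Per-input-channel smoothing factors in the spirit of SmoothQuant.

    act_absmax: per-channel max |activation| from calibration data, shape (in_features,).
    weight:     linear weight of shape (out_features, in_features).
    """
    w_absmax = weight.abs().amax(dim=0)  # per-input-channel max |W|, shape (in_features,)
    s = act_absmax.clamp(min=1e-5).pow(alpha) / w_absmax.clamp(min=1e-5).pow(1 - alpha)
    return s.clamp(min=1e-5)

# Usage sketch (shapes and statistics are made up for illustration):
in_f, out_f = 4096, 4096
W = torch.randn(out_f, in_f)
act_absmax = torch.rand(in_f) * 50 + 1e-3   # pretend calibration stats with outlier channels
s = smoothquant_scales(act_absmax, W, alpha=0.5)
W_smoothed = W * s                           # fold s into the weight (scales each input column)
# At inference, the activations feeding this layer are divided by s,
# so X/s @ (W*s).T == X @ W.T and the layer output is unchanged.
```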
LRQuant: Learnable and Robust Post-Training Quantization for Large Language Models. Jiaqi Zhao, Miao Zhang, Chao Zeng, Ming Wang, Xuebo Liu, Liqiang Nie.
The RPTQ approach involves rearranging the channels in the activations and then quantizing them in clusters, thereby reducing the impact of range differences between channels.
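A simplified sketch of that reorder-and-cluster idea: channels are grouped by their observed value ranges (here with k-means over per-channel min/max), a permutation places same-cluster channels together, and each cluster shares one set of quantization parameters. The clustering features, cluster count, and asymmetric scheme below are assumptions for illustration, not the paper's exact recipe.

```python
import torch
from sklearn.cluster import KMeans

@torch.no_grad()
def cluster_channels_by_range(act: torch.Tensor, n_clusters: int = 4):
    """Group activation channels by value range and build a reordering permutation.

    act: calibration activations of shape (num_tokens, num_channels).
    Returns a channel permutation that makes same-cluster channels contiguous
    and a cluster label per channel.
    """
    ch_min = act.amin(dim=0)
    ch_max = act.amax(dim=0)
    feats = torch.stack([ch_min, ch_max], dim=1).cpu().numpy()       # (C, 2) range features
    labels = torch.from_numpy(KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats))
    perm = torch.argsort(labels)   # channels of the same cluster become contiguous
    return perm, labels

@torch.no_grad()
def quantize_per_cluster(act: torch.Tensor, labels: torch.Tensor, n_bits: int = 8):
    """Asymmetric fake-quantization with one (scale, zero_point) per channel cluster."""
    qmax = 2 ** n_bits - 1
    out = torch.empty_like(act)
    for c in labels.unique():
        cols = (labels == c).nonzero(as_tuple=True)[0]
        x = act[:, cols]
        scale = (x.max() - x.min()).clamp(min=1e-8) / qmax
        zp = torch.round(-x.min() / scale)
        # store dequantized values so the quantization error can be inspected directly
        out[:, cols] = (torch.round(x / scale) + zp).clamp(0, qmax).sub(zp) * scale
    return out
```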
This repo collects papers, docs, and code about model quantization for anyone who wants to do research on it. We are continuously improving the project.