Publications

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

  • Uses low-bit quantization of both weights and activations to improve LLM serving throughput.
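
As a rough illustration of what quantizing both weights and activations to low-bit integers means, here is a minimal sketch of generic symmetric per-tensor quantization. It is not Atom's actual scheme; the function names (`quantize_symmetric`, `dequantize`, `quantized_dot`) and the 4-bit default are assumptions chosen for the example.

```python
def quantize_symmetric(values, bits=4):
    """Map floats to signed integers in [-2^(bits-1), 2^(bits-1) - 1].

    Generic symmetric per-tensor quantization (illustrative only).
    """
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from integer levels."""
    return [x * scale for x in q]


def quantized_dot(weights, activations, bits=4):
    """Dot product computed on integer levels, rescaled once at the end."""
    qw, sw = quantize_symmetric(weights, bits)
    qa, sa = quantize_symmetric(activations, bits)
    acc = sum(x * y for x, y in zip(qw, qa))  # integer accumulation
    return acc * sw * sa


if __name__ == "__main__":
    w = [0.9, -0.4, 0.1]
    a = [1.5, 0.2, -0.7]
    exact = sum(x * y for x, y in zip(w, a))
    print("exact:", exact, "quantized:", quantized_dot(w, a))
```

The point of the sketch is that the multiply-accumulate runs entirely on small integers, with a single floating-point rescale at the end; that is the general reason low-bit quantization of both operands can raise throughput on hardware with fast integer arithmetic.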