Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
  • Quantize both activations and weights to low bit-widths to improve serving throughput.
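
To make the idea of quantizing both weights and activations concrete, here is a minimal sketch of generic symmetric quantization in plain Python. It is not Atom's actual scheme, which the slide does not spell out; the function names (`quantize_symmetric`, `dequantize`, `quantized_dot`), the 4-bit default, and the one-scale-per-vector choice are all assumptions made for illustration.

```python
# Illustrative sketch only: generic symmetric quantization, not Atom's method.
# All names and the 4-bit default are hypothetical choices for this example.

def quantize_symmetric(values, bits=4):
    """Map floats to signed integers in [-qmax, qmax] with one shared scale."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # avoid division by zero
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from quantized integers."""
    return [x * scale for x in q]


def quantized_dot(weights, activations, bits=4):
    """Dot product where both operands are quantized before multiplying."""
    qw, sw = quantize_symmetric(weights, bits)
    qa, sa = quantize_symmetric(activations, bits)
    acc = sum(w * a for w, a in zip(qw, qa))        # integer accumulation
    return acc * sw * sa                            # rescale once at the end
```

The point of the sketch is that the inner multiply-accumulate runs entirely on small integers, with a single floating-point rescale at the end. That is the general property that lets low-bit arithmetic units raise throughput, at the cost of a rounding error bounded by half a quantization step per value.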