Kan Zhu 朱侃

PhD Student (Computer Science)

University of Washington

Biography

I am Kan Zhu, a second-year PhD student at the University of Washington’s Paul G. Allen School of Computer Science and Engineering, co-advised by Baris Kasikci and Arvind Krishnamurthy.

I develop systems and methodologies for optimizing Large Language Model (LLM) inference. The widespread adoption of LLMs presents unique challenges for on-device inference and cost-effective large-scale serving due to their substantial computational demands. To address these issues, I am interested in designing innovative hardware, algorithms, and frameworks tailored for both edge devices and data center environments.

Interests
  • Machine Learning Systems
  • Computer Architecture
Education
  • Ph.D. in Computer Science and Engineering, 2023 - Present

    University of Washington

  • B.S. Computer Engineering, 2021 - 2023

    University of Michigan

  • B.S. Electrical and Computer Engineering, 2019 - 2021 (transferred to UM)

    Shanghai Jiao Tong University

Publications

(2024). NanoFlow: Towards Optimal Large Language Model Serving Throughput. arXiv 2024.

(2024). Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference. ICML 2024.

(2024). Atom: Low-bit Quantization for Efficient and Accurate LLM Serving. MLSys 2024.

(2024). From Optimal to Practical: Efficient Micro-op Cache Replacement Policies for Data Center Applications. HPCA 2025.

(2024). Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models. ICLR Workshop 2024.

Awards

OSDI 2024 Travel Grant
Allen School Computer Science & Engineering Research Fellowship
ACM Student Research Competition 1st Place Award
Dean’s Honor List
SJTU Undergraduate Excellence Scholarship