Xiaozhe Yao

Xiaozhe Yao is a second-year doctoral student in the Systems Group, Department of Computer Science, ETH Zurich, advised by Prof. Dr. Ana Klimović. His research spans a wide spectrum of machine learning and systems; his goal is to build systems that support large-scale machine learning and democratize access to it.

Prior to ETH Zurich, Xiaozhe Yao received his Master's degree in Data Science from the University of Zurich, advised by Prof. Dr. Michael Böhlen and Qing Chen. Before that, he completed his Bachelor's degree in Computer Science at Shenzhen University, advised by Prof. Dr. Shiqi Yu. He interned at the Shenzhen Institute of Advanced Technology in 2016 as a data scientist. Between 2021 and 2022, he worked on the AID project as an Innovator Fellow at the Library Lab, ETH Zurich. AID aims to support the application of machine learning algorithms by imitating a real library: it provides a unified programming interface for accessing and managing machine learning models, and a digital library for searching, filtering and inspecting them.

Selected Publications

In reverse chronological order:

(* == equal contribution)

  1. Yao, Xiaozhe, and Ana Klimovic. “DeltaZip: Multi-Tenant Language Model Serving via Delta Compression.” arXiv preprint arXiv:2312.05215 (2023). Code.
  2. Luis Oala, Manil Maskey, Lilith Bat-Leah, Alicia Parrish, Nezihe Merve Gürel, Tzu-Sheng Kuo, Yang Liu, Rotem Dror, Danilo Brajovic, Xiaozhe Yao, et al. “DMLR: Data-centric Machine Learning Research — Past, Present and Future”. The Journal of Data-centric Machine Learning Research (DMLR), 2023.
  3. Jiang, Youhe*, Ran Yan*, Xiaozhe Yao*, Beidi Chen, and Binhang Yuan. “HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment”. ICML 2024.
  4. Berivan Isik*, Hermann Kumbong*, Wanyi Ning*, Xiaozhe Yao*, Sanmi Koyejo, and Ce Zhang. “GPT-Zip: Deep Compression of Finetuned Large Language Models”. ICML Workshop on Efficient Systems for Foundation Models.
  5. Mazumder, Mark, Colby Banbury, Xiaozhe Yao, Bojan Karlaš, William Gaviria Rojas, Sudnya Diamos, Greg Diamos, et al. “DataPerf: Benchmarks for Data-Centric AI Development”. NeurIPS 2023 Datasets and Benchmarks Track; earlier version at the ICML Workshop on Benchmarking Data for Data-Centric AI. Code.
  6. Cedric Renggli, Xiaozhe Yao, Luka Kolar, Luka Rimanic, Ana Klimovic, and Ce Zhang. “SHiFT: An Efficient, Flexible Search Engine for Transfer Learning”. VLDB 2023. Code.
  7. Yao, Xiaozhe. “MLPM: Machine Learning Package Manager”. Workshop on MLOps, MLSys 2020. Code.


Theses

  1. Master Thesis: Implementation of Learned Cardinality Estimation in Database Contexts, supervised by Prof. Dr. Michael H. Böhlen, Prof. Dr. Anton Dignös and Qing Chen.
  2. Bachelor Thesis: Face Detection with Multi-Block Local Binary Pattern in OpenCV, supervised by Prof. Dr. Shiqi Yu.

Technical Reports

  1. Yao, Xiaozhe, Neeraj Kumar and Nivedita Nivedita. Implementing Learned Indexes on 1- and 2-Dimensional Data (Master Project, 2021). Slides.
  2. Yao, Xiaozhe. Implementing Deconvolution to Visualize and Understand Convolutional Neural Networks, supervised by Prof. Dr. Michael Böhlen and Qing Chen (2020).

Work Experiences

  • Research Consultant at Together Computing. Dec 2022 - Jun 2023.
  • Innovator Fellow at the Library Lab, ETH Zürich. June 2021 - Feb 2022.
  • Visiting Student at the Computer Vision Lab, Shenzhen University.
  • Engineer and Associate Founder. May 2018.
  • Data Scientist (Intern) at the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences. March 2016 - March 2017.


Education

  • Master of Data Science (with a minor in Informatics), summa cum laude, at the Institut für Informatik, Universität Zürich. Sept 2019 to April 2022.
  • Bachelor of Computer Science at the College of Computer Science and Software Engineering, Shenzhen University, with an honour in high-performance computing. Sept 2013 to June 2017.


Teaching

I served as a teaching assistant at both Shenzhen University and Universität Zürich, for the following courses:

  • Cloud Computing Architecture. ETH Zürich. Spring 2024.
  • Information Systems for Engineers. ETH Zürich. Fall 2023.
  • Informatics II: Data Structures and Algorithms. Universität Zürich. Spring 2022. Cheatsheet.
  • Foundations of Data Science. Universität Zürich. Fall 2021.
  • Informatics II: Data Structures and Algorithms. Universität Zürich. Spring 2021. Cheatsheet.
  • Informatics I: Introduction to Programming. Universität Zürich. Fall 2020.
  • Professional English for Computer Science. Shenzhen University. Spring 2019.
  • Web Programming (Java Web). Shenzhen University. 2016.
  • Web Programming (Android). Shenzhen University. 2015.
  • Data Structures and Algorithms. Shenzhen University. 2014.

I am fortunate to work with students at all levels on their projects and theses:

  • Sparse Mixture of Experts Serving via Delta Compression. Boyko Borisov at ETH Zurich.
  • Analytical Model for LLM Inference. Zhiyuan Huang at ETH Zurich.
  • LongerLora: Scaling Up Low-Rank Adaptors For Longer Context. Xiaoyuan Jin at ETH Zurich.


Talks

If you are interested in the slides of the following talks, please contact me.

  • MLSys Singapore: DeltaZip - Multi-tenant LLMs Serving. YouTube. April 2024.
  • AIDS Seminar: DeltaZip - Multi-tenant LLMs Serving at Aalto University, hosted by Prof. Dr. Bo Zhao. Feb 26, 2024.
  • Industry Retreat: DeltaZip - Multi-tenant LLMs Serving at ETH Zurich, hosted by the Systems Group. Jan 16, 2024.
  • Techno Hour: Learned Cardinality Estimation in Database Systems at SAP, hosted by Thomas Zurek. April 5, 2022.
  • Tech Talk: AID - Towards Findable, Accessible and Usable AI at the ETH Library, hosted by Sven Koesling. Feb 15, 2022.

Community Service

Open Source Software


In my spare time, I like to travel and take photos. Find my photos on Unsplash.