BIO
I am a third-year Ph.D. student in the School of Computer Science at Carnegie Mellon University (CMU), specializing in machine learning and software engineering. My research focuses on Large Language Model post-training: making LLMs better (more closely aligned with domain-specific tasks), faster (more efficient in training and inference), and cheaper (requiring fewer GPU hours and less GPU memory).
At CMU, I am advised by Prof. Heather Miller. Previously, I earned my master's degree in Computer Engineering from New York University, advised by Prof. Anna Choromanska and Prof. Parijat Dube. I received my B.S. in Computer Science and Engineering from The Chinese University of Hong Kong (CUHK), where I worked with Prof. David (Dapeng) Zhang and Prof. Rui Huang. Before starting my Ph.D., my research focused mainly on distributed machine learning systems.
- (This personal website was last updated in May 2025.)
News
- 5-2025: I will intern at AWS AI Labs (Amazon) this summer, working on speculative decoding enabled by LLM post-training via latent-space reasoning.
- 2-2025: Open-sourced SMT. We implemented SMT in two frameworks: DeepSpeed and the Hugging Face Trainer.
- 2-2025: Our paper “SMT: Fine-Tuning Large Language Models with Sparse Matrices” has been accepted to ICLR 2025.
- 5-2024: Our paper “Adjacent Leader Decentralized Stochastic Gradient Descent” has been accepted to ECAI 2024.
- 4-2024: Our paper “Multi-View Radar Autoencoder for Self-Supervised Automotive Radar Representation Learning” has been accepted to the IEEE Intelligent Vehicles Symposium (IV) 2024.
- 2023: I started my Ph.D. journey at CMU.
- 2-2023: Open-sourced a general codebase for implementing any (de)centralized, (a)synchronous distributed SGD algorithm when the model fits on a single machine, accompanying a paper that proposes a novel distributed SGD algorithm.
Selected Publications
Haoze He, Juncheng Billy Li, Xuan Jiang, Heather Miller, “SMT: Fine-Tuning Large Language Models with Sparse Matrices”, International Conference on Learning Representations (ICLR), Accepted, Jan. 2025. [code]
Haoze He, Jing Wang, Anna Choromanska, “Adjacent Leader Decentralized Stochastic Gradient Descent”, European Conference on Artificial Intelligence (ECAI), Accepted, June 2024. [code]
My full publication list can be found on my Google Scholar profile.
Academic Blog
- Peter Zhong, Haoze He, Omar Khattab, Christopher Potts, Matei Zaharia, Heather Miller, “A Guide to Large Language Model Abstractions”, Jan. 2024.
Education
- Ph.D. in Machine Learning and Software Engineering at Carnegie Mellon University, 2023–present
- GPA: 4.16/4.0, Rank: top 1%
- M.S. in Computer Engineering at New York University, 2021–2023
- GPA: 3.93/4.0, Rank: top 1%
- B.S. in Computer Science and Engineering at The Chinese University of Hong Kong, 2016–2020
Work Experience
- Applied Research Scientist Intern, Amazon AWS AI Labs, Summer 2025
- Teaching Assistant, Carnegie Mellon University, LTI at SCS, Large Language Model Systems (11-868), Spring 2025
- Research Assistant, Carnegie Mellon University, S3D at SCS, 2023–present
- Research Assistant, New York University, Tandon School of Engineering, 2022–2023
Awards
- Presidential Fellowship, Carnegie Mellon University, Nov. 2024
Service
- Reviewer, International Conference on Learning Representations (ICLR) — 2025, 2026
- Reviewer, International Joint Conference on Neural Networks (IJCNN) — 2025
- Reviewer, AAAI Conference on Artificial Intelligence (AAAI) — 2025
- Reviewer, International Conference on Acoustics, Speech, and Signal Processing (ICASSP) — 2022–2025
- Reviewer, International Conference on Computer Vision (ICCV) Workshops — 2023
Open Source for the Community
- Built and maintain an open-source website for the NYU EECS/DS community that helps 150+ NYU students each semester. The website summarizes the open-source courses in NYU EECS/DS, provides links and repositories for each course, lists the workload, and shares course experiences for reference. Anyone from the NYU community is welcome to fork and contribute!