Posts by Collection

portfolio

publications

Paper Title Number 2

Published in arXiv, submitted to Conference on Machine Learning and Systems (MLSys), 2010

This paper is about the number 2. The number 3 is left for future work.

Recommended citation:

```
@misc{https://doi.org/10.48550/arxiv.2211.00839,
  doi = {10.48550/ARXIV.2211.00839},
  url = {https://arxiv.org/abs/2211.00839},
  author = {He, Haoze and Dube, Parijat},
  keywords = {Machine Learning (cs.LG), Distributed, Parallel, and Cluster Computing (cs.DC), FOS: Computer and information sciences},
  title = {RCD-SGD: Resource-Constrained Distributed SGD in Heterogeneous Environment via Submodular Partitioning},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Zero v1.0 Universal}
}
```

http://academicpages.github.io/files/RCD-SGD-2022ICASSP.pdf

Adjacent Leader Decentralized Stochastic Gradient Descent

Published in ICML, to be submitted, 2022

Elastic Averaging SGD (EASGD) and Leader Stochastic Gradient Descent (LSGD) accelerate the convergence of centralized distributed SGD, yielding faster training in terms of both wall-clock time and number of epochs. However, neither algorithm can be applied to state-of-the-art decentralized distributed SGD frameworks, which relieve communication congestion by abandoning the centralized parameter server. In this paper, we propose Adjacent Leader Decentralized Stochastic Gradient Descent (AL-DSGD), which accelerates the convergence of state-of-the-art decentralized frameworks. The main idea of AL-DSGD is twofold: when averaging, assign weights to neighboring learners according to their performance; when training, apply a corrective force dictated by the currently best-performing neighbor. A convergence analysis demonstrates the faster convergence. Experiments on a suite of datasets and deep neural networks validate the theoretical analysis and show that AL-DSGD speeds up training. Finally, we developed a general and concise PyTorch framework for distributed training that can easily implement any distributed SGD system (synchronous or asynchronous, centralized or decentralized).
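To make the two ingredients in the abstract concrete (performance-based weights when averaging, and a corrective pull toward the best-performing neighbor), here is a minimal NumPy sketch of one learner's update. It is an illustration under stated assumptions, not the paper's actual update rule: the softmax-over-negative-loss weighting, the 50/50 mix between the local update and the neighbor average, and the names `performance_weights`, `al_dsgd_like_step`, `lr`, `pull`, and `temperature` are all hypothetical choices made for this example.

```python
import numpy as np


def performance_weights(losses, temperature=1.0):
    """Map neighbor losses to averaging weights that sum to 1.

    Assumption for this sketch: a softmax over negative loss, so a
    lower loss yields a larger weight. The paper may weight differently.
    """
    scores = -np.asarray(losses, dtype=float) / temperature
    scores -= scores.max()  # shift for numerical stability
    w = np.exp(scores)
    return w / w.sum()


def al_dsgd_like_step(own_params, own_grad, neighbor_params,
                      neighbor_losses, lr=0.1, pull=0.05):
    """One illustrative update for a single learner.

    1. Take a local SGD step plus a corrective force that pulls the
       learner toward its currently best-performing neighbor.
    2. Mix the result with a performance-weighted neighbor average.
    The equal 0.5/0.5 mix in step 2 is a hypothetical choice.
    """
    own_params = np.asarray(own_params, dtype=float)
    own_grad = np.asarray(own_grad, dtype=float)
    neighbor_params = np.asarray(neighbor_params, dtype=float)

    # Best neighbor = the one with the lowest reported loss.
    best = neighbor_params[int(np.argmin(neighbor_losses))]

    # Local gradient step with a pull toward the best neighbor.
    updated = own_params - lr * own_grad - pull * (own_params - best)

    # Performance-weighted average of the neighbors' parameters.
    weights = performance_weights(neighbor_losses)
    neighbor_avg = weights @ neighbor_params

    return 0.5 * updated + 0.5 * neighbor_avg


if __name__ == "__main__":
    params = np.array([1.0, 1.0])
    grad = np.array([0.2, -0.1])
    neighbors = [[0.8, 1.1], [1.5, 0.4]]
    losses = [0.30, 0.90]  # first neighbor is currently the "leader"
    print(al_dsgd_like_step(params, grad, neighbors, losses))
```

In a real decentralized run each learner would execute a step like this using only the parameters and losses exchanged with its graph neighbors, so no central parameter server is involved.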

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.