IEEE International Symposium on Local and Metropolitan Area Networks
11–12 July 2022 // Virtual Conference

Invited Talks

Toward Practical Federated Learning

Mosharaf Chowdhury, Morris Wellman Assistant Professor of CSE at the University of Michigan

Abstract: Although theoretical federated learning research is growing exponentially, we are far from putting those theories into practice. In this talk, I will share our ventures into building practical systems for two extremes of federated learning. Sol is a cross-silo federated learning and analytics system that tackles network latency and bandwidth challenges faced by distributed computation between far-apart data sites. Oort, in contrast, is a cross-device federated learning system that enables training and testing on representative data distributions despite unpredictable device availability. Both deal with systems and network characteristics in the wild that are hard to account for in analytical models. I’ll then share the challenges in systematically evaluating federated learning systems that have led to a disconnect between theoretical conclusions and performance in the wild. I’ll conclude this talk by introducing FedScale, which is an extensible framework for evaluation and benchmarking in realistic settings to democratize practical federated learning for researchers and practitioners alike. All these systems are open-source and available at https://github.com/symbioticlab.

Biography: Mosharaf Chowdhury is a Morris Wellman assistant professor of CSE at the University of Michigan, Ann Arbor, where he leads the SymbioticLab. His work improves application performance and system efficiency of machine learning and big data workloads. He is also building software solutions to allow users to monitor and optimize the impact of machine learning systems on energy consumption and user privacy. His group developed Infiniswap, the first scalable software solution for memory disaggregation; Salus, the first software-only GPU sharing system for deep learning; Sol, the fastest multi-cloud data processing engine; and FedScale, the largest federated learning benchmark with accompanying runtime. In the past, Mosharaf did seminal work on coflows and virtual network embedding, and he was a co-creator of Apache Spark. He has received many individual awards and fellowships, thanks to his stellar students and collaborators. His work has received seven paper awards from top venues, including NSDI, OSDI, and ATC, and over 22,000 citations. He received his Ph.D. from UC Berkeley in 2015.

Untitled (TBC)

Junchen Jiang, Assistant Professor of Computer Science at The University of Chicago

Optimizing Contributions to Distributed, Networked Learning

Carlee Joe-Wong, Assistant Professor of Electrical and Computer Engineering at Carnegie Mellon University

Abstract: The rapid expansion of Internet-connected, compute-equipped “things” has greatly expanded the amount of data that can be collected about many types of systems, from smart cities to mobile applications to personal health. Making use of this data, however, requires effectively leveraging computing resources to run data analysis algorithms (e.g., machine learning inference or training). Unfortunately, the “things” at which all of this data is collected are often resource-constrained, e.g., with limited power budgets, unreliable network connectivity, and/or limited computing capabilities. Distributed learning algorithms such as federated learning aim to address these challenges, but they are generally not optimized to run on networks of devices with limited, heterogeneous, and unreliable computing and communication resources. In this talk, I will present new variants of federated learning algorithms that provide theoretical convergence guarantees and good empirical performance in the presence of such resource limitations. By carefully designing algorithms for each stage in the distributed machine learning pipeline (data collection, data analysis, and communication across devices), we can realize significant improvements in the accuracy of our trained models.

Biography: Carlee Joe-Wong is the Robert E. Doherty Associate Professor of Electrical and Computer Engineering at Carnegie Mellon University. She received her A.B. degree (magna cum laude) in Mathematics, and M.A. and Ph.D. degrees in Applied and Computational Mathematics, from Princeton University in 2011, 2013, and 2016, respectively. Dr. Joe-Wong’s research is in optimizing networked systems, particularly in applying machine learning and pricing to resource allocation in data and computing networks. From 2013 to 2014, she was the Director of Advanced Research at DataMi, a startup she co-founded from her Ph.D. research on mobile data pricing. Her research has received several awards, including the NSF CAREER Award in 2018.

The Hyper-Converged Programmable Gateway in Alibaba Edge Cloud

Hongqiang Liu, Director of R&D at Alibaba Group

Abstract: Edge cloud provides significant performance and cost advantages for emerging applications such as cloud gaming, video conferencing, and AR/VR. However, unlike central clouds, edge cloud also faces tremendous challenges due to limited resources, demands for high performance, and hardware heterogeneity. Alibaba solves these problems by introducing a hyper-converged gateway platform, “SNA”, that provides the cloud network stack and network functions within the network rather than on the hosts. SNA is a heterogeneous computing platform that merges network switching, network virtualization, and various network functions on top of programmable network ASICs, FPGAs, and CPUs. It has been deployed to support some multi-million-user products in Alibaba’s edge cloud. The key technical enabler of the rapid and safe deployment of the hyper-converged gateways running in SNA is our programmable network development platform, “TaiX”, which provides novel and practical programming abstractions, compilers, debuggers, testers, orchestrators, and operation tools.

Biography: Hongqiang “Harry” Liu is a Director of Network Research and Edge Network Infrastructure Engineering in Alibaba Cloud and Alibaba DAMO Academy. He received his Ph.D. degree from the Department of Computer Science at Yale University in 2014. His research focuses on data center networks, network transports, and programmable networks. He has published more than 20 papers in top-tier academic conferences, such as ACM SIGCOMM, ACM SOSP, and USENIX NSDI. He also serves on the technical program committees of SIGCOMM and NSDI. He is the recipient of the prestigious ACM SIGCOMM Doctoral Dissertation Award – Honorable Mention in 2015.

Untitled (TBC)

Andra Lutu, Senior Researcher at Telefónica Research

Untitled (TBC)

Tong Yang, Associate Professor at Peking University