Design of a Computer System for Scalable Deep Learning: How to Make it Usable

Presenter: Volodymyr Kindratenko, National Center for Supercomputing Applications, University of Illinois

Tuesday, April 19, 2022

Slides: https://uofi.app.box.com/s/yj0b5gvygfzg7s9o8pdhvvucud3emgf7

Video: https://uofi.app.box.com/s/zx1rdsvxutwl7mvuu96htgg70ngozcu3

Abstract

This presentation will focus on the design of a purpose-built computer system for running deep learning frameworks at scale. The system consists of 16 IBM P9 AC922 servers with NVIDIA V100 GPUs and DDN storage and is available to the UIUC research community. A key characteristic of this system is a custom management software stack that enables efficient use of the system by a diverse community of users.

Biography

Volodymyr Kindratenko is an Assistant Director at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign, where he serves as the Director of the Center for Artificial Intelligence Innovation (CAII). He holds an Adjunct Associate Professor appointment in the Department of Electrical and Computer Engineering (ECE) and a Research Associate Professor appointment in the Department of Computer Science (CS). Prior to becoming the Director of CAII, he led NCSA’s Innovative Systems Laboratory, a center-wide research effort to investigate and evaluate emerging compute technologies for high-performance computing applications. Dr. Kindratenko received a D.Sc. degree from the University of Antwerp, Belgium, in 1997. His research interests include high-performance computing, special-purpose computing architectures, cloud computing, and machine learning systems and applications. He serves as a department editor of IEEE Computing in Science and Engineering magazine and an associate editor of the International Journal of Reconfigurable Computing. Dr. Kindratenko’s work has been funded by NSF, NASA, ONR, DOE, and industry. He has published over 70 papers in refereed scientific journals and conference proceedings and holds five US patents. He is a Senior Member of IEEE and ACM.