IMPORTANT DATES
- Submission: August 20th, 2022
- Notification: September 6th, 2022
- Final Pre-Workshop papers: October 1st, 2022
ABOUT LCPC 2022
WEDNESDAY, OCTOBER 12, 2022 – Illini Center, Discovery Partners Institute, Chicago, Illinois (Tentative Program)
Time | Event |
2:00 – 3:00 pm | David Padua Title: The Evolution of Parallel Computing since LCPC’88 Abstract: An overview of the evolution of hardware, programming notations, and compilers for parallel computing during the last 35 years and the impact of the 1988 state of the art on parallel computing today. |
3:00 – 3:30 pm | Coffee Break |
3:30 – 4:00 pm | Saeed Maleki Title: GPU Collectives with MSCCL: Man vs. Dragons Abstract: Collective communication primitives on GPUs are the primary bottleneck for large neural network models. Although there have been decades of research on optimizing computation kernels, very little has been done for collective communication kernels on GPUs. There are many challenges in this area, including unique GPU interconnection topologies, high P2P transfer latency, a wide range of use cases for neural networks, and software complexities. In this talk, I will present program synthesis as a primary solution for communication algorithms for these topologies and show how a bespoke algorithm can significantly improve the overall performance of a model. Lastly, I will present a high-level DSL along with a compiler for mapping from an abstract synthesized algorithm to low-level CUDA code for collective communications. |
4:00 – 5:00 pm | Albert Cohen Title: Retire Linear Algebra Libraries Abstract: Despite decades of investment in software infrastructure, scientific computing, signal processing, and machine learning systems remain stuck in a rut. Some numerical computations are more equal than others: BLAS and the core operations for neural networks achieve near-peak performance, while marginally different variants do not get this chance. As a result, performance is only achieved at the expense of a dramatic loss of programmability. Compilers are obviously the cure. But what compilers? How should these be built, deployed, retargeted, autotuned? Sure, the BLAS API is not the ultimate interface to compose and reuse high-performance operations, but then, what would be a better one? And why have we not built and agreed on one yet? We’ll review these questions and some of the proposed solutions in this talk. In particular, we will advocate for a new tile-level programming interface sitting in between the top-level computational operations and generators of target- and problem-specific code. We will also advocate for a structured approach to the construction of domain-specific code generators for tensor compilers, with the stated goal of improving the productivity of both compiler engineers and end users. |
THURSDAY, OCTOBER 13, 2022 – Illini Center, Discovery Partners Institute, Chicago, Illinois (Tentative Program)
Time | Event |
9:00 – 10:00 am | Saman Amarasinghe Title: Compiler 2.0 Abstract: When I was a graduate student a long time ago, I used to have intense conversations with, and learned a lot from, my peers in other areas of computer science, as the program structure, systems, and algorithms used in my compiler were very similar to and inspired by much of the work done by my peers. For example, a natural language recognition system that was developed by my peers, a single sequential program with multiple passes connected through IRs that systematically transformed an audio stream into text, was structurally similar to the SUIF compiler I was developing. In the intervening 30 years, the information revolution brought us unprecedented advances in algorithms (e.g., machine learning and solvers), systems (e.g., multicores and cloud computing), and program structure (e.g., serverless and low-code frameworks). Thus, a modern NLP system such as Apple’s Siri or Amazon’s Alexa, a thin client on an edge device interfacing to a massively parallel, cloud-based, centrally trained deep neural network, has little resemblance to its predecessors. However, the SUIF compiler is still eerily similar to a state-of-the-art modern compiler such as LLVM or MLIR. What happened with compiler construction technology? At worst, as a community, we have been Luddites to the information revolution even though our technology has been critical to it. At best, we have been unable to transfer our research innovations (e.g., the polyhedral method or program synthesis) into production compilers. In this talk I hope to inspire the compiler community to radically rethink how to build next-generation compilers by giving a few possible examples of using 21st-century program structures, algorithms, and systems in constructing a compiler. |
10:00 – 10:30 am | Coffee Break |
10:30 – 11:00 am | Tohma Kawasumi, Yuta Tsumura, Hiroki Mikami, Tomoya Yoshikawa, Takero Hosomi, Shingo Oidate, Keiji Kimura, Hironori Kasahara Title: Parallelizing Factory Automation Ladder Programs by OSCAR Automatic Parallelizing Compiler |
11:00 – 11:30 am | Thomas Rolinger, Christopher Krieger, Alan Sussman Title: Compiler Optimization for Irregular Memory Access Patterns in PGAS Programs |
11:30 – 12:00 pm | Florian Mayer, Julian Brandner, Michael Philippsen Title: Employing Polyhedral Methods to Reduce Data Movement in FPGA Stencil Codes |
12:00 – 1:30 pm | Lunch |
1:30 – 2:30 pm | Fredrik Kjolstad Title: Portable Compilation of Sparse Computation Abstract: Hardware is becoming ever more complicated and the architects are developing a fleet of new types of accelerators. I will talk about compiling collection-oriented programs to heterogeneous hardware. I will discuss properties that make certain programming abstractions amenable to portable compilation, give some examples, and describe a programming system design. I will then describe how to compile one such programming model, operations on sparse and dense arrays/tensors, to the major types of hardware: CPUs, fixed-function accelerators, GPUs, distributed machines, and streaming dataflow accelerators. Finally, I will briefly discuss how verification may make it easier to program heterogeneous machines. |
2:30 – 3:00 pm | Coffee Break |
3:00 – 3:30 pm | John Jolly, Priya Goyal, Vishal Kumar, Hans Johansen, Mary Hall Title: Tensor Iterators for Flexible High-Performance Tensor Computation |
3:30 – 4:00 pm | Parinaz Barakhshan, Rudolf Eigenmann Title: Learning from Automatically versus Manually Parallelized NAS Benchmarks |
4:00 – 5:00 pm | Ponnuswamy Sadayappan Title: Towards compiler-driven algorithm-architecture co-design for energy-efficient ML accelerators Abstract: The improvement of the energy efficiency of ML accelerators is of fundamental importance. The energy expended in accessing data from DRAM/SRAM/registers is orders of magnitude higher than that expended in actually performing the arithmetic operations on the data. The total energy expended in executing an ML operator depends on both the choice of accelerator design parameters (such as the capacities of register banks and scratchpad buffers) and the “dataflow” – the schedule of data movement and operation execution. The design space of architectural parameters and dataflows is extremely large. This talk will discuss how analytical modeling can be used to co-design accelerator parameters and dataflow to optimize energy. |
FRIDAY, OCTOBER 14, 2022 – Illini Center, Discovery Partners Institute, Chicago, Illinois (Tentative Program)
Time | Event |
9:00 – 9:30 am | Xiaoming Li Title: How Can Compilers Help The Additive Manufacturing of Electronics? |
9:30 – 10:00 am | Henry Dietz Title: Wordless Integer and Floating-Point Computing |
10:00 – 10:30 am | Coffee Break |
10:30 – 11:00 am | Jose Moreira, Kit Barton, Peter Bergner, Nemanja Ivanovic, Puneeth Bhat, Satish Sadasivam Title: Exploiting the new Power ISA matrix math instructions through compiler built-ins |
11:00 – 11:30 am | Juan Benavides, John Baugh, Ganesh Gopalakrishnan Title: An HPC Practitioner’s Workbench for Formal Refinement Checking |
11:30 am – 12:00 pm | Wenwen Wang Title: MPIRace: A Static Data Race Detector for MPI Programs |