• Submission: August 20th, 2022
  • Notification: September 6th, 2022
  • Final Pre-Workshop papers: October 1st, 2022


WEDNESDAY, OCTOBER 12, 2022 – Illini Center, Discovery Partners Institute, Chicago, Illinois (Tentative Program)

2:00 – 3:00 pm  David Padua
Title: The Evolution of Parallel Computing since LCPC’88
Abstract: An overview of the evolution of hardware, programming
notations, and compilers for parallel computing during the last 35 years
and the impact of the 1988 state of the art on parallel computing today. 
3:00 – 3:30 pm  Coffee Break
3:30 – 4:00 pm  Saeed Maleki
Title: GPU Collectives with MSCCL: Man vs. Dragons
Abstract: Collective communication primitives on GPUs are the
primary bottleneck on large neural network models. Although there
have been decades of research on optimizing computation kernels,
there has been very little done for collective communication kernels
on GPUs. There are many challenges in this area, including unique GPU
interconnection topologies, high P2P transfer latency, a wide range of
use cases for neural networks, and software complexities. In this talk,
I will present program synthesis as a primary solution for communication
algorithms for these topologies and show how a bespoke algorithm
can significantly improve the overall performance of a model. Lastly,
I will present a high-level DSL along with a compiler for mapping
from an abstract synthesized algorithm to low-level CUDA code
for collective communications.
4:00 – 5:00 pm  Albert Cohen
Title: Retire Linear Algebra Libraries
Abstract: Despite decades of investment in software infrastructure,
scientific computing, signal processing, and machine learning
systems remain stuck in a rut. Some numerical computations are
more equal than others: BLAS and the core operations for neural
networks achieve near-peak performance, while marginally different
variants do not get this chance. As a result, performance is only
achieved at the expense of dramatic loss of programmability. Compilers
are obviously the cure. But what compilers? How should these be
built, deployed, retargeted, autotuned? Sure, the BLAS API is not
the ultimate interface to compose and reuse high-performance
operations, but then, what would be a better one? And why did
we not build and agree on one yet? We’ll review these questions
and some of the proposed solutions in this talk. In particular, we will
advocate for a new tile-level programming interface sitting
in-between the top-level computational operations and
generators of target- and problem-specific code. We will also
advocate for a structured approach to the construction of
domain-specific code generators for tensor compilers, with the
stated goal of improving the productivity of both compiler
engineers and end-users.

THURSDAY, OCTOBER 13, 2022 – Illini Center, Discovery Partners Institute, Chicago, Illinois (Tentative Program)

9:00 – 10:00 am  Saman Amarasinghe
Title: Compiler 2.0
Abstract: When I was a graduate student a long time ago, I used to
have intense conversations and learned a lot from my peers in other
areas of computer science as the program structure, systems,
and algorithms used in my compiler were very similar to and inspired
by much of the work done by my peers. For example, a Natural
Language Recognition System that was developed by my peers,
with a single sequential program with multiple passes connected
through IRs that systematically transformed an audio stream into text,
was structurally similar to the SUIF compiler I was developing. In the
intervening 30 years, the information revolution brought us
unprecedented advances in algorithms (e.g., machine learning and
solvers), systems (e.g., multicores and cloud computing), and program
structure (e.g., serverless and low-code frameworks). Thus, a
modern NLP system such as Apple’s Siri or Amazon’s Alexa, a thin
client on an edge device interfacing to a massively-parallel,
cloud-based, centrally-trained Deep Neural Network, has little
resemblance to its predecessors. However, the SUIF compiler is
still eerily similar to a state-of-the-art modern compiler such as
LLVM or MLIR. What happened with compiler construction technology?
At worst, as a community, we have been Luddites to the information
revolution even though our technology has been critical to it.
At best, we have been unable to transfer our research innovations
(e.g., polyhedral method or program synthesis) into production
compilers. In this talk I hope to inspire the compiler community
to radically rethink how to build next generation compilers by giving
a few possible examples of using 21st century program structures,
algorithms and systems in constructing a compiler.
10:00 – 10:30 am  Coffee Break
10:30 – 11:00 am  Tohma Kawasumi, Tsumura Yuta, Hiroki Mikami, Tomoya Yoshikawa,
Takero Hosomi, Shingo Oidate, Keiji Kimura, Hironori Kasahara
Title: Parallelizing Factory Automation Ladder Programs by OSCAR
Automatic Parallelizing Compiler
11:00 – 11:30 am  Thomas Rolinger, Christopher Krieger, Alan Sussman
Title: Compiler Optimization for Irregular Memory Access Patterns
in PGAS Programs
11:30 am – 12:00 pm  Florian Mayer, Julian Brandner, Michael Philippsen
Title: Employing Polyhedral Methods to Reduce Data Movement
in FPGA Stencil Codes
12:00 – 1:30 pm  Lunch
1:30 – 2:30 pm  Fredrik Kjolstad
Title: Portable Compilation of Sparse Computation
Abstract: Hardware is becoming ever more complicated and the
architects are developing a fleet of new types of accelerators. I will
talk about compiling collection-oriented programs to heterogeneous
hardware. I will discuss properties that make certain programming
abstractions amenable to portable compilation, give some examples,
and describe a programming system design. I will then describe how
to compile one such programming model, operations on sparse and
dense arrays/tensors, to the major types of hardware: CPUs,
fixed-function accelerators, GPUs, distributed machines, and streaming
dataflow accelerators. Finally, I will briefly discuss how verification
may make it easier to program heterogeneous machines.
2:30 – 3:00 pm  Coffee Break
3:00 – 3:30 pm  John Jolly, Priya Goyal, Vishal Kumar, Hans Johansen, Mary Hall
Title: Tensor Iterators for Flexible High-Performance
Tensor Computation
3:30 – 4:00 pm  Parinaz Barakhshan, Rudolf Eigenmann
Title: Learning from Automatically versus Manually Parallelized
NAS Benchmarks
4:00 – 5:00 pm  Ponnuswamy Sadayappan
Title: Towards compiler-driven algorithm-architecture co-design for
energy-efficient ML accelerators
Abstract: The improvement of energy-efficiency of ML accelerators is
of fundamental importance. The energy expended in accessing data
from DRAM/SRAM/Registers is orders of magnitude higher than that
expended in actually performing the arithmetic operations on data. The
total energy expended in executing an ML operator depends both on
the choice of accelerator design parameters (such as the capacities of
register banks and scratchpad buffers) and on the “dataflow” – the
schedule of data movement and operation execution. The design
space of architectural parameters and dataflow is extremely large.
This talk will discuss how analytical modeling can be used to co-design
accelerator parameters and dataflow to optimize energy.

FRIDAY, OCTOBER 14, 2022 – Illini Center, Discovery Partners Institute, Chicago, Illinois (Tentative Program)

9:00 – 9:30 am  Xiaoming Li
Title: How Can Compilers Help The
Additive Manufacturing of Electronics?
9:30 – 10:00 am  Henry Dietz
Title: Wordless Integer and Floating-Point Computing
10:00 – 10:30 am  Coffee Break
10:30 – 11:00 am  Jose Moreira, Kit Barton, Peter Bergner, Nemanja Ivanovic,
Puneeth Bhat, Satish Sadasivam
Title: Exploiting the new Power ISA matrix math instructions
through compiler built-ins
11:00 – 11:30 am  Juan Benavides, John Baugh, Ganesh Gopalakrishnan
Title: An HPC Practitioner’s Workbench for Formal
Refinement Checking
11:30 am – 12:00 pm  Wenwen Wang
Title: MPIRace: A Static Data Race Detector for
MPI Programs