Project Overview

An ‘ALLVM system’ is one in which all software components, except a small set needed for bootstrapping, are represented in a virtual instruction set instead of native machine code. The goal of the approach is to enable sophisticated compiler analyses and transformations to be applied across arbitrary software boundaries: not just the caller-callee boundaries analyzed using traditional interprocedural techniques, but also the boundaries between applications and third-party libraries, between applications and the underlying operating system, and between communicating processes in a distributed system.

Many software components already ship in a virtual instruction set (loosely defined as “not a native hardware instruction set”), including software in managed languages like Java, C# and Scala; scripting languages like Python and JavaScript; and GPGPU code in languages like CUDA and OpenCL. The major change ALLVM enables is for statically compiled languages like C, C++, Fortran, OCaml, Swift, etc. For software written in these languages, we represent and ship code using the LLVM Virtual Instruction Set (see http://llvm.org), previously developed in our research group and now widely used in production systems, including macOS, iOS, and FreeBSD. LLVM already provides some of the capabilities required for an ALLVM system, including the ability to ship software in LLVM bitcode form and the ability to perform install-time and just-in-time compilation.

The key difference between LLVM and ALLVM is that LLVM enables individual software components to be analyzed and optimized throughout their lifetime (“lifelong compilation”) whereas ALLVM enables all the software on a system to be analyzed and optimized together, throughout the lifetime of the software (“system-wide, lifelong compilation”). Several research projects within the ALLVM umbrella are exploring the performance, reliability, security and software engineering benefits of the ALLVM approach.

This project is funded by the Office of Naval Research, the National Science Foundation, the Semiconductor Research Corporation, DARPA and the Department of Defense.