Wednesday, February 26.
9:00 a.m. – 11:00 a.m.
CSL 301
Microsoft Workshop – Dr. William Lewis, Microsoft Translator
Automatic Speech Transcription and Translation in the Classroom and Lecture Setting: The Technologies, How They’re Being Used, and Where We’re Going
We have witnessed significant progress in Automatic Speech Recognition (ASR) and Machine Translation (MT) in recent years, so much so that Speech Translation, itself a combination of these underlying technologies, is becoming a viable technology in its own right. Although it is not perfect, many have called what they’ve seen of the current technology the “Universal Translator” or the “mini-UN on a phone”. But we’re not done, and there are many problems to solve. For example, for Speech Translation to work well, it is not sufficient to stitch together the two underlying technologies of ASR and MT and call it done. People are amazingly disfluent, which can have profound negative impacts on transcripts and translations. We need to make the output of ASR more “fluent”; this has the effect of improving the quality of downstream translations.
Further, since “fluent” output is much more readable and “caption-like” than disfluent output, it is also more easily consumable by same-language users. This opens doors to broader accessibility scenarios. Speech Translation is currently being used in a variety of scenarios, most of all in education. It sees its greatest uptake in settings where one or more speakers need to communicate with a multilingual population. The classroom is a perfect example, but we also see its use in parent-teacher conferences. The underlying technologies can be enhanced further by giving users some control over customizing the underlying models, e.g., for domain-specific vocabulary or speaker accents, significantly improving user experiences. We will demonstrate the technology in action as part of the talk.
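To make the idea of a more “fluent” ASR output concrete, here is a minimal sketch in Python. It is not the method used by Microsoft Translator, which the abstract does not describe; it assumes a toy rule-based cleanup in which a hypothetical filler list (FILLERS) and immediate word repetitions are removed before the text would be passed on to translation.

```python
import re

# Hypothetical filler list, chosen for illustration only.
# Production systems typically learn disfluency patterns from data.
FILLERS = {"um", "uh", "er", "ah", "hmm"}

def clean_disfluencies(transcript: str) -> str:
    """Drop filler words and immediate word repetitions from a raw transcript."""
    tokens = re.findall(r"[A-Za-z']+", transcript.lower())
    cleaned = []
    for tok in tokens:
        if tok in FILLERS:
            continue  # skip filled pauses such as "um" and "uh"
        if cleaned and cleaned[-1] == tok:
            continue  # skip a word that repeats the previously kept word
        cleaned.append(tok)
    return " ".join(cleaned)

raw = "um so so the the lecture uh starts at at nine"
print(clean_disfluencies(raw))  # prints: so the lecture starts at nine
```

A rule list like this misses restarts and self-corrections ("we meet Tuesday, I mean Wednesday"), which is part of why the abstract treats disfluency handling as an open problem rather than a solved one.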
Wednesday, February 26.
9:00 a.m. – 11:00 a.m.
CSL 141
Intel Workshop – Dr. Ellick Chan
Using Intel oneAPI to program heterogeneous CPU, GPU, and/or FPGA parallel systems
oneAPI is a single, unified programming model that aims to simplify development across multiple architectures – such as CPUs, GPUs, FPGAs, and other accelerators. This workshop helps you get started programming with the Data Parallel C++ (DPC++) language and the oneAPI cross-architecture libraries.
Wednesday, February 26.
11:00 a.m. – 1:00 p.m.
CSL 301
Vail Workshop – Joe Smetana, Vijay K. Gurbani, Jordan Hosier, Yu Zhou
Communication: The Human Connection
The human voice is capable of conveying nuances and meaning that cannot be expressed through clicks and chat messages alone. For this reason, voice interactions have always had a special power to shape our perceptions and experiences. At Vail, we believe in the unique power of voice interactions to create more expressive and efficient interpersonal interactions. From basic network services, to state-of-the-art IP communications, to cutting-edge real-time analytics, to innovative fraud detection models, Vail technology makes millions of voice interactions better every day.
In this workshop we provide an introduction to Vail Systems and highlight some of the work we are doing at Vail that occurs at the intersection of communications and affective computing:
- Conversational AI agent as an aid for large-scale IoT devices embedded in buildings.
- Identifying and resolving mis-transcriptions that arise from phonetic ambiguity and degraded acoustic signals.
- Sentiment Analysis of Acoustic Features with Neural Networks.
Wednesday, February 26.
11:00 a.m. – 1:00 p.m.
CSL 141
Nvidia Workshop Talk 1 – Harun Bayrakhtar, Senior Manager, CUDA Mathematical Software Libraries
What’s New in the CUDA Math Libraries
Today’s fastest compute platforms are designed from the ground up to leverage the immense compute power of NVIDIA GPUs. As these platforms increase in scale and add specialized hardware, the CUDA Math Libraries are keeping up by constantly expanding, providing industry-leading performance and coverage of common compute workflows across AI, ML, and HPC. The major initiatives to support common workflows are multi-GPU scalability, reduced and mixed precision computing, and libraries that allow kernel fusion and customization. In this talk, we review the latest developments in the CUDA Math Libraries, including Tensor Core acceleration of HPC solvers without loss of accuracy, support for multiple GPUs in FFT, BLAS, and LAPACK routines, and the addition of new libraries with device function support and tensor linear algebra functionality.
Nvidia Workshop Talk 2 – Kuhu Shukla, Senior Distributed System Engineer
Accelerating Apache Spark ETL Workflows with Nvidia GPUs
Apache Spark is a unified analytics engine for big data processing with built-in modules for streaming, SQL, machine learning, and graph processing. It’s used extensively in ETL and machine learning workloads across the big data community. GPUs are a quintessential choice for running machine learning and AI workloads. As part of our ongoing effort at NVIDIA, we present a glimpse into what goes into combining these two worlds to accelerate production-scale Apache Spark ETL workloads with NVIDIA GPUs. In this talk, we’ll explore some of the assumptions and challenges that need to be considered when developing software around GPUs and dive into how GPU-accelerated Spark SQL queries work.
Nvidia Workshop Talk 3 – Azzam Haidar, Senior CUDA Mathematical Libraries Engineer
Mixed Precision Numerical Techniques Accelerated with Tensor Cores and its Impact on Today’s Scientific Computing and Implications for Tomorrow’s Hardware Design
Double-precision floating point has been the de facto standard for scientific simulation for several decades. Problem complexity and the sheer magnitude of data coming from various instruments and sensors motivate researchers to mix and match various approaches to optimize compute resources, including different levels of floating-point precision. In recent years, the big bang of machine learning has focused significant attention on half precision.
We explored the possibility of using FP16/FP32 Tensor Cores on NVIDIA Volta GPUs to accelerate one of the most common linear algebra routines without loss of accuracy. We achieved a 4x performance increase and 5x better energy efficiency versus the standard FP64 implementation while providing a solution with FP64 accuracy.
We studied a plasma fusion application that simulates the instabilities that occur in the plasma inside the International Thermonuclear Experimental Reactor (ITER). We show that, with our mixed-precision solver, which harnesses the FP16/FP32 Tensor Cores in Volta GPUs, it is possible to simulate the instability between plasma beams 3.5x faster.
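The abstract does not name the algorithm behind the mixed-precision solver. One widely used way to obtain an FP64-accurate solution from low-precision arithmetic is iterative refinement, and the sketch below assumes that technique purely for illustration. It is a toy pure-Python model: FP32 rounding (emulated with the struct module) stands in for the Tensor Core arithmetic, and the helper names f32, solve_low, residual, and refine are hypothetical.

```python
import struct

def f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def solve_low(A, b):
    """Gaussian elimination with partial pivoting, rounding each result to FP32."""
    n = len(b)
    M = [[f32(v) for v in row] + [f32(rhs)] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = f32(M[i][k] / M[k][k])
            for j in range(k, n + 1):
                M[i][j] = f32(M[i][j] - f32(m * M[k][j]))
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = M[i][n]
        for j in range(i + 1, n):
            s = f32(s - f32(M[i][j] * x[j]))
        x[i] = f32(s / M[i][i])
    return x

def residual(A, b, x):
    """Compute r = b - A x in full FP64 precision."""
    return [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]

def refine(A, b, iters=5):
    """Low-precision solve followed by FP64 residual corrections."""
    x = solve_low(A, b)
    for _ in range(iters):
        r = residual(A, b, x)   # accurate residual in FP64
        d = solve_low(A, r)     # cheap low-precision correction
        x = [xi + di for xi, di in zip(x, d)]
    return x

# Small, well-conditioned test system chosen for illustration.
A = [[4.0, 1.0, 2.0], [1.0, 5.0, 1.0], [2.0, 1.0, 6.0]]
x_true = [0.1, 0.2, 0.3]
b = [sum(a * x for a, x in zip(row, x_true)) for row in A]

def max_err(x):
    return max(abs(xi - ti) for xi, ti in zip(x, x_true))

print("low precision only:", max_err(solve_low(A, b)))
print("after refinement:  ", max_err(refine(A, b)))
```

For brevity the sketch repeats the elimination for every correction; a practical solver would factor the matrix once in low precision and reuse the factors, so that only the inexpensive residual and update steps run in FP64.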