## Fall 2020 iDS^{2} Seminar

NOTE: all seminars for Fall 2020 are held remotely through Zoom. Recordings of seminars are available below. Seminars run on Friday at 2:00pm unless otherwise stated.

September 11, 2020

**Katherine Tsai** (PhD Student, UIUC)

**A Nonconvex Framework for Structured Dynamic Covariance Recovery**

## Abstract

Flexible yet interpretable models for second-order temporal structure are needed in scientific analyses of high-dimensional data. We develop a structured time-indexed covariance model for dynamic time-series data by factorizing covariances into sparse spatial and temporally smooth components. Traditionally, time-indexed covariance models without structure require a large sample size to be estimable. While the covariance factorization yields both domain interpretability and ease of estimation from the statistical perspective, the resulting optimization problem used to estimate the model components is nonconvex. We design a two-stage optimization scheme with a carefully tailored spectral initialization, combined with iteratively refined alternating projected gradient descent. We prove a linear convergence rate up to a nontrivial statistical error for the proposed descent scheme and establish sample complexity guarantees for the estimator. As a motivating example, we consider the neuroscience application of estimating dynamic brain connectivity. Empirical results on simulated and real brain imaging data illustrate that our approach outperforms existing baselines.
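The general shape of such a scheme can be illustrated on a simplified, hypothetical factorization Sigma_t ≈ U diag(w_t) U^T, with U sparse and the weights w_t smooth in t. This is a sketch under assumed details, not the paper's model: the hard-thresholding of U and the neighbor-averaging of W below are stand-ins for the authors' carefully tailored projections, and the eigenvector initialization loosely mimics a spectral initialization.

```python
import numpy as np

def alternating_pgd(S, r, k_sparse, steps=300, lr=5e-3):
    """Sketch: fit Sigma_t ~= U diag(w_t) U^T to sample covariances S[t].

    S: (T, d, d) array of per-time covariances; r: rank of the spatial
    factor; k_sparse: nonzeros kept per column of U (assumed parameters).
    """
    T, d, _ = S.shape
    # Spectral-style initialization: top-r eigenvectors of the averaged covariance.
    _, evecs = np.linalg.eigh(S.mean(axis=0))
    U = evecs[:, -r:].copy()
    W = np.ones((T, r))
    for _ in range(steps):
        # Residuals R_t = S_t - U diag(w_t) U^T and gradients of the
        # squared Frobenius loss with respect to U and W.
        R = S - np.einsum('ir,tr,jr->tij', U, W, U)
        gU = -4 * np.einsum('tij,jr,tr->ir', R, U, W)
        gW = -2 * np.einsum('ir,tij,jr->tr', U, R, U)
        U -= lr * gU
        # Sparsity projection: keep the k_sparse largest-magnitude
        # entries in each column of U, zero the rest.
        for c in range(r):
            U[np.argsort(np.abs(U[:, c]))[:-k_sparse], c] = 0.0
        W -= lr * gW
        # Heuristic temporal smoothing of the weights (a stand-in for a
        # smoothness projection), plus nonnegativity.
        W[1:-1] = 0.5 * W[1:-1] + 0.25 * (W[:-2] + W[2:])
        W = np.clip(W, 0.0, None)
    return U, W
```

On exactly factorized, noiseless input the alternating updates recover the planted sparse factor and smooth weight curve; the talk's contribution is proving linear convergence and sample complexity for the statistically realistic version of this problem.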

September 04, 2020

**Necmiye Ozay** (Associate Professor, University of Michigan)

**A fresh look at a classical system identification method**

## Abstract

System identification has a long history with several well-established methods, in particular for learning linear dynamical systems from input/output data. While the asymptotic properties of these methods are well understood as the number of data points goes to infinity or the noise level tends to zero, how their estimates behave in the finite-data regime is relatively less studied. This talk will mainly focus on our analysis of the robustness of the classical Ho-Kalman algorithm and how it translates to non-asymptotic estimation error bounds as a function of the number of data samples. If time permits, I will also mention another problem we study at the intersection of learning, control, and optimization: learning constraints from demonstrations as an alternative to inverse optimal control. Our experiments with several robotics problems show that (local) optimality can be a very strong prior in learning from demonstrations. I will conclude the talk with some open problems and directions for future research.
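For readers unfamiliar with the classical method in question: Ho-Kalman recovers a state-space realization (A, B, C) from the system's Markov parameters h_k = C A^{k-1} B by factorizing a block Hankel matrix via SVD. A minimal numpy sketch for the noiseless case, assuming the Markov parameters are given exactly (the talk's robustness analysis concerns what happens when they are instead estimated from noisy data):

```python
import numpy as np

def ho_kalman(markov, n, p=1, m=1):
    """Recover (A, B, C) up to similarity from noiseless Markov
    parameters markov[i] = C A^i B (each a p-by-m array), for a
    system of known order n."""
    T = len(markov)
    k = T // 2
    # Block Hankel matrix and its one-step shift.
    H  = np.block([[markov[i + j]     for j in range(k)] for i in range(k)])
    Hs = np.block([[markov[i + j + 1] for j in range(k)] for i in range(k)])
    # Rank-n truncated SVD splits H into observability x controllability.
    U, s, Vt = np.linalg.svd(H)
    U, s, Vt = U[:, :n], s[:n], Vt[:n]
    O = U * np.sqrt(s)               # observability factor (kp x n)
    Q = np.sqrt(s)[:, None] * Vt     # controllability factor (n x km)
    C = O[:p]                        # first block row of O
    B = Q[:, :m]                     # first block column of Q
    A = np.linalg.pinv(O) @ Hs @ np.linalg.pinv(Q)
    return A, B, C
```

The recovered realization agrees with the true system up to a similarity transform, so it reproduces the same Markov parameters; with noisy estimates of h_k, the sensitivity of the SVD truncation step is what drives the finite-sample error bounds.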

## Summer 2020 iDS^{2} Seminar

NOTE: all seminars for Summer 2020 are held remotely through Zoom. Recordings of seminars are available below. Seminars run on Friday at 2:00pm unless otherwise stated.

July 24, 2020

**Yuanzhi Li** (Assistant Professor, CMU)

**Backward feature correction: How can deep learning perform deep learning**

## Abstract

How does a 110-layer ResNet learn a high-complexity classifier using relatively few training examples and short training time? We present a theory towards explaining this deep learning process in terms of hierarchical learning. By hierarchical learning, we mean that the learner represents a complicated target function by decomposing it into a sequence of simpler functions, reducing sample and time complexity. This work formally analyzes how multi-layer neural networks can perform such hierarchical learning efficiently and automatically, simply by applying stochastic gradient descent (SGD) to the training objective.

Moreover, we present, to the best of our knowledge, the first theoretical result indicating how very deep neural networks can be sample- and time-efficient on certain hierarchical learning tasks, even when no known non-hierarchical algorithms (such as kernel methods, linear regression over feature mappings, tensor decomposition, sparse coding, and their simple combinations) are efficient. We establish a new principle called "backward feature correction" to show how the features in the lower-level layers of the network can also be improved by training higher-level layers, which we believe is the key to understanding the deep learning process in multi-layer neural networks.

May 01, 2020

**Cong Xie** (PhD student, UIUC)

**Byzantine Tolerance for Distributed SGD**

April 17, 2020

**Ziwei Ji** (PhD student, UIUC)

**Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks**

April 10, 2020

**Philip Amortila** (PhD student, UIUC)

**A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms**

April 03, 2020

**Nan Jiang** (Assistant Professor, UIUC)

**Minimax Methods for Off-policy Evaluation and Optimization**

March 27, 2020

**Shiyu Liang** (PhD student, UIUC)

**The Loss Landscape of Neural Networks**