### Plenary Speakers

**Michael I. Jordan**

“On Gradient-Based Optimization: Accelerated, Stochastic and Nonconvex”

Monday, June 4th, 3:45-4:45pm

VEC 201, Auditorium

**Abstract:**

Optimization methods play a key enabling role in statistical inference, both frequentist and Bayesian. Moreover, as statistics begins to more fully embrace computation, what is often meant by “computation” is in fact “optimization”. I will discuss some recent progress in high-dimensional, large-scale optimization, where new theory and algorithms have provided non-asymptotic rates, sharp dimension dependence, elegant ties to geometry, and practical relevance. In particular, I will discuss three recent results: (1) a new framework for understanding Nesterov acceleration, obtained by taking a continuous-time, Lagrangian/Hamiltonian/symplectic perspective; (2) how to escape saddle points efficiently in nonconvex optimization; and (3) the acceleration of Langevin diffusion.
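The acceleration phenomenon referenced in (1) can be seen in a minimal sketch (not taken from the talk; the problem, step size, and momentum parameter below are illustrative assumptions): plain gradient descent versus Nesterov's accelerated gradient on an ill-conditioned convex quadratic.

```python
import numpy as np

# Illustrative sketch, not the speaker's material: compare plain gradient
# descent with Nesterov's accelerated gradient on f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 100.0])            # eigenvalues mu = 1, L = 100 (condition number 100)
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
step = 1.0 / 100.0                   # step size 1/L
beta = (10.0 - 1.0) / (10.0 + 1.0)   # momentum (sqrt(kappa) - 1) / (sqrt(kappa) + 1)
x0 = np.array([1.0, 1.0])

def gd(x, iters=200):
    """Plain gradient descent: linear rate roughly (1 - 1/kappa)."""
    for _ in range(iters):
        x = x - step * grad(x)
    return x

def nesterov(x, iters=200):
    """Accelerated gradient: linear rate roughly (1 - 1/sqrt(kappa))."""
    y, x_prev = x.copy(), x.copy()
    for _ in range(iters):
        x_new = y - step * grad(y)           # gradient step at the lookahead point
        y = x_new + beta * (x_new - x_prev)  # momentum extrapolation
        x_prev = x_new
    return x_prev

print(f(gd(x0)), f(nesterov(x0)))  # the accelerated run gets far closer to the optimum 0
```

The continuous-time view in the talk interprets exactly this extrapolation step as the discretization of a second-order dynamical system.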

**David Madigan**

**“Honest learning for the healthcare system: large-scale evidence from real-world data”**

Tuesday, June 5th, 10:30am-11:30am

VEC 401, Multipurpose Room

**Abstract:**

In practice, our learning healthcare system relies primarily on observational studies generating one effect estimate at a time, using customized study designs with unknown operating characteristics and publishing – or not – one estimate at a time. When we investigate the distribution of estimates that this process has produced, we see clear evidence of its shortcomings, including an apparent over-abundance of estimates whose confidence intervals do not include one (i.e., statistically significant effects) and indicators of publication bias. In essence, published observational research represents unabashed data fishing. We propose a standardized process for performing observational research that can be evaluated, calibrated, and applied at scale to generate a more reliable and complete evidence base than previously possible, fostering a truly learning healthcare system. We demonstrate this new paradigm by generating evidence about all pairwise comparisons of treatments for depression for a relevant set of health outcomes using four large US insurance claims databases. In total, we estimate 17,718 hazard ratios, each using a comparative effectiveness study design and propensity score stratification on par with current state-of-the-art, albeit one-off, observational studies. Moreover, the process enables us to employ negative and positive controls to evaluate and calibrate estimates, ensuring, for example, that the 95% confidence interval includes the true effect size approximately 95% of the time. The result set consistently reflects current established knowledge where known, and its distribution shows no evidence of the faults of the current process. Doctors, regulators, and other medical decision makers can potentially improve patient care by making well-informed decisions based on this evidence, and every treatment a patient receives becomes the basis for further evidence. Joint work with Martijn J. Schuemie, Patrick B. Ryan, George Hripcsak, and Marc A. Suchard.
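The calibration idea, using negative controls whose true hazard ratio is 1 to estimate residual systematic error and then widening and shifting confidence intervals accordingly, can be sketched roughly as follows. This is a simplified method-of-moments illustration with simulated data, not the authors' actual implementation; all numbers and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated negative controls: true log(HR) = 0, but the estimates carry
# systematic error (bias mu, extra spread tau) on top of sampling error se.
mu_true, tau_true = 0.2, 0.1
n_nc = 100
se = np.full(n_nc, 0.15)                       # per-study standard errors
theta = rng.normal(mu_true, np.sqrt(tau_true**2 + se**2))

# Fit the systematic-error model by method of moments
# (the real approach uses maximum likelihood).
mu_hat = theta.mean()
tau2_hat = max(theta.var() - (se**2).mean(), 0.0)

# Calibrate a new log(HR) estimate: shift by the estimated bias and
# inflate the standard error by the estimated systematic spread.
est, est_se = 0.5, 0.2
cal_se = np.sqrt(est_se**2 + tau2_hat)
lo, hi = est - mu_hat - 1.96 * cal_se, est - mu_hat + 1.96 * cal_se
print(round(lo, 3), round(hi, 3))  # calibrated 95% interval, wider than the nominal one
```

The point of the exercise is the one the abstract makes: once systematic error is modeled from controls, nominal 95% intervals can be corrected so that they actually cover the truth about 95% of the time.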

**Liza Levina**

**“Matrix Completion in Network Analysis”**

Wednesday, June 6th, 10:30am-11:30am

VEC 401, Multipurpose Room

**Abstract:** Matrix completion is an active area of research in itself, and a natural tool to apply to network data, since many real networks are observed incompletely and/or with noise. However, developing matrix completion algorithms for networks requires taking into account the network structure. This talk will discuss three examples of matrix completion used for network tasks. First, we discuss the use of matrix completion for cross-validation or non-parametric bootstrap on network data, a long-standing problem in network analysis. Two other examples focus on reconstructing incompletely observed networks, with structured missingness resulting from network sampling mechanisms. One scenario we consider is egocentric sampling, where a set of nodes is selected first and then their connections to the entire network are observed. Another scenario focuses on data from surveys, where people are asked to name a given number of friends. We show that matrix completion can generally be very helpful in solving network problems, as long as the network structure is taken into account.
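The basic matrix-completion primitive underlying these applications can be illustrated with a minimal sketch (a generic hard-impute style scheme with simulated data, not the speaker's algorithm; the matrix sizes, rank, and sampling rate are assumptions): iteratively fill the unobserved entries of a low-rank matrix using truncated SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rank = 60, 2
U = rng.normal(size=(n, rank))
P = U @ U.T                       # low-rank "expected adjacency"-style matrix
mask = rng.random((n, n)) < 0.5   # entries we actually observe

def complete(P_obs, mask, rank, iters=100):
    """Hard-impute sketch: alternate rank-r SVD projection with
    re-imposing the observed entries."""
    X = np.where(mask, P_obs, 0.0)            # initialize missing entries at 0
    for _ in range(iters):
        u, s, vt = np.linalg.svd(X, full_matrices=False)
        X_low = (u[:, :rank] * s[:rank]) @ vt[:rank]  # best rank-r approximation
        X = np.where(mask, P_obs, X_low)      # keep observed entries, fill the rest
    return X_low

X_hat = complete(P, mask, rank)
err = np.linalg.norm(X_hat - P) / np.linalg.norm(P)
print(err)  # relative reconstruction error on the full matrix
```

The talk's point is that generic schemes like this must be adapted to the structured missingness of network sampling (egocentric sampling, fixed-choice surveys) rather than assuming entries are missing uniformly at random, as this sketch does.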

### Banquet Speaker

**Cathy O’Neil**

**Biography:** Cathy O’Neil is the author of the New York Times bestselling *Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy*, which was also a semifinalist for the National Book Award. She earned a Ph.D. in math from Harvard, was a postdoctoral fellow in the MIT math department, and was a professor at Barnard College, where she published a number of research papers in arithmetic algebraic geometry. She then switched over to the private sector, working as a quantitative analyst for the hedge fund D.E. Shaw in the middle of the credit crisis, and then for RiskMetrics, a risk software company that assesses risk for the holdings of hedge funds and banks. She left finance in 2011 and started working as a data scientist in the New York start-up scene, building models that predicted people’s purchases and clicks. Cathy wrote *Doing Data Science* in 2013 and launched the Lede Program in Data Journalism at Columbia in 2014. She is a columnist for Bloomberg View.