Beidi Chen is a Ph.D. candidate in the Department of Computer Science at Rice University, advised by Dr. Anshumali Shrivastava. Previously, she received a BS in EECS from UC Berkeley in 2011. Her research interests lie in large-scale machine learning and in applying scalable algorithms to social science problems. Specifically, she designs and applies randomized algorithms for large-scale estimation. Her work has won Best Paper awards at LISA and IISA, and she has received the Ken Kennedy Institute for Information Technology Fellowship. She has also held internships at Microsoft Research Redmond, NVIDIA Research, and Amazon AI.
The exponential growth of data poses many challenges for scaling learning algorithms whose computation grows quadratically with the size of the data. To address these challenges, I focus on randomized hashing algorithms and shed new light on Locality Sensitive Hashing (LSH). LSH is a hugely popular algorithm for sub-linear approximate nearest neighbor search (NNS). It turns out, however, that LSH is fundamentally a constant-time (amortized) adaptive sampler, of which efficient NNS is only one of many applications. LSH offers a unique capability: smart sampling and statistical estimation at extremely low cost. Our observation bridges data structures (probabilistic hash tables) with efficient unbiased statistical estimation. I also demonstrate how this sampling technique breaks computational barriers for problems in (1) social impact assessment, e.g., estimating the number of casualties in the Syrian conflict; (2) deep learning, e.g., GPU scalability of huge models; and (3) optimization, e.g., faster convergence of stochastic gradient descent (SGD). In addition, I provide fundamental insights into, and developments of, classical hashing algorithms in the literature.
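To make the "LSH as an adaptive sampler" view concrete, here is a minimal sketch (my own illustration, not code from this work) using signed random projections (SimHash). Points that land in the query's bucket collided with a known probability, so reweighting each sampled point by the inverse of that probability gives an unbiased estimate of a sum over the whole dataset without touching every point. All names and parameter choices (`K`, the SimHash family, the data sizes) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 4      # hash bits per table (illustrative choice)
d = 16     # data dimension
n = 200
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit vectors

# Signed random projections (SimHash): h(x) = sign(Wx), packed into a K-bit code.
W = rng.standard_normal((K, d))

def code(x):
    bits = (W @ x > 0).astype(int)
    return int("".join(map(str, bits)), 2)

# Build one hash table: bucket id -> list of point indices.
table = {}
for i, x in enumerate(X):
    table.setdefault(code(x), []).append(i)

def collision_prob(q, x):
    # For SimHash, P[h(q) = h(x)] = (1 - theta/pi)^K, theta = angle(q, x).
    theta = np.arccos(np.clip(q @ x, -1.0, 1.0))
    return (1.0 - theta / np.pi) ** K

def estimate_sum(q, f):
    # Each x_i appears in q's bucket with probability p_i = collision_prob(q, x_i),
    # so E[sum over bucket of f(x_i) / p_i] = sum over all i of f(x_i): unbiased.
    bucket = table.get(code(q), [])
    return sum(f(X[i]) / collision_prob(q, X[i]) for i in bucket)
```

The key point of the sketch is that the hash table, normally used only for NNS lookups, doubles as a sampler whose inclusion probabilities are known in closed form, which is what makes cheap unbiased estimation possible.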