“Novel Statistical Frameworks for Analysis of Structured Sequential Data”
Dr. Abhra Sarkar, Duke University
Date: Thursday, February 04, 2016
Time: 3:30 PM – 4:30 PM
Location: Engineering Hall Room 106B1
Sponsor: Department of Statistics
We are developing a broad array of novel statistical frameworks for analyzing complex sequential data sets. Our research is primarily motivated by a collaboration with neuroscientists trying to understand the neurological, genetic and evolutionary basis of human communication using bird and mouse models. The data sets comprise structured sequences of syllables or ‘songs’ produced by animals of different genotypes under different experimental conditions. The primary goal is to elucidate the effects of different genotypes and experimental conditions on animal vocalization behaviors and capabilities. We have developed novel statistical methods based on first-order Markovian dynamics that help answer these important scientific queries. First-order dynamics are, however, insufficiently flexible to learn complex serial dependency structures and systematic patterns in the vocalizations, which is an important secondary goal in these studies. To this end, we have developed a sophisticated nonparametric Bayesian approach to higher-order Markov chains, building on probabilistic tensor factorization techniques. Our proposed method is of very broad utility, with applications not limited to the analysis of animal vocalizations, and provides new insights into the serial dependency structures of many previously analyzed sequential data sets arising from diverse application areas. Our method has appealing theoretical properties and practical advantages, and achieves substantial gains in performance compared to existing methods. Our research also paves the way to advanced automated methods for more sophisticated dynamical systems, including higher-order hidden Markov models that can accommodate more general data types.