Session 15: Advances in Bayesian methods for high-dimensional data – Conference on Statistical Learning and Data Science / Nonparametric Statistics

Session title: Advances in Bayesian methods for high-dimensional data
Organizer: Howard Bondell (U. of Melbourne)
Chair: Xuan Bi (Yale)
Time: June 4th, 1:45pm – 3:15pm
Location: VEC 1202/1203

Speech 1: The Graphical Horseshoe Estimator for Inverse Covariance Matrices
Speaker: Anindya Bhadra (Purdue)
Abstract: We develop a new estimator of the inverse covariance matrix for high-dimensional multivariate normal data using the horseshoe prior. The proposed graphical horseshoe estimator has attractive properties compared to other popular estimators, such as the graphical lasso and the graphical smoothly clipped absolute deviation. The most prominent benefit is that when the true inverse covariance matrix is sparse, the graphical horseshoe provides estimates with small information divergence from the sampling model. The posterior mean under the graphical horseshoe prior can also be almost unbiased under certain conditions. In addition to these theoretical results, we provide a full Gibbs sampler for implementing our estimator. MATLAB code is available for download from GitHub at http://github.com/liyf1988/GHS. The graphical horseshoe estimator compares favorably to existing techniques in simulations and in a human gene network data analysis. This is joint work with Yunfan Li and Bruce Craig at Purdue.

Speech 2: Scalable MCMC for Bayes shrinkage priors
Speaker: Anirban Bhattacharya (Texas A&M)
Abstract: Gaussian scale mixture priors are common in high-dimensional Bayesian analysis. While optimization algorithms for the extremely popular Lasso and elastic net scale to dimension in the hundreds of thousands, Bayesian computation by Markov chain Monte Carlo (MCMC) is limited to problems an order of magnitude smaller. This is due to the high computational cost per step and the growth of the variance of time-averages as a function of dimension. We propose an MCMC algorithm for computation in these models that combines block updating and approximations of the Markov kernel to combat both of these factors. Our algorithm gives orders of magnitude speedup over the best existing alternatives in high-dimensional applications. We give theoretical guarantees for the accuracy of the approximation. Scalability of the algorithm is illustrated in an application to a genome-wide association study with $N=2,267$ observations and $p=98,385$ predictors. The empirical results show that the new algorithm yields estimates with lower mean squared error and intervals with better coverage, and that it elucidates features of the posterior often missed by previous algorithms, including bimodality of marginals indicating uncertainty about which covariates belong in the model. This latter feature is an important motivation for a Bayesian approach to testing and selection in high dimensions. (Joint work with James Johndrow and Paulo Orenstein.)
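For readers unfamiliar with Gaussian scale mixture priors, the horseshoe prior is one well-known example: each coefficient is normal with its own local scale, and the local scales are half-Cauchy. The sketch below only draws from such a prior; it is not the MCMC algorithm described in the talk. The function names, the fixed global scale `tau`, and the seed are illustrative assumptions.

```python
import math
import random


def half_cauchy(rng, scale=1.0):
    """Draw one half-Cauchy(0, scale) variate by inverting its CDF."""
    # CDF: F(x) = (2 / pi) * atan(x / scale), so x = scale * tan(pi * u / 2).
    return scale * math.tan(math.pi * rng.random() / 2.0)


def sample_horseshoe_prior(p, tau=1.0, seed=0):
    """Draw p coefficients from a horseshoe-type Gaussian scale mixture.

    Assumed hierarchy (illustrative, global scale tau held fixed):
        lambda_j ~ half-Cauchy(0, 1)
        beta_j | lambda_j ~ Normal(0, (tau * lambda_j)^2)
    """
    rng = random.Random(seed)
    local_scales = [half_cauchy(rng) for _ in range(p)]
    betas = [rng.gauss(0.0, tau * lam) for lam in local_scales]
    return betas, local_scales


if __name__ == "__main__":
    betas, scales = sample_horseshoe_prior(p=10, tau=0.1, seed=1)
    for b, lam in zip(betas, scales):
        print(f"lambda = {lam:8.3f}   beta = {b:8.3f}")
```

Most draws are shrunk close to zero by the small global scale, while the heavy half-Cauchy tail occasionally produces a large local scale and hence a large coefficient.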

Speech 3: Clustering on the Sphere: State-of-the-art and a Poisson Kernel-Based Model
Speaker: Marianthi Markatou (U. at Buffalo)
Abstract: Many applications of interest involve data that can be analyzed as unit vectors on a d-dimensional sphere. Specific examples include text mining, biology, astronomy and medicine. We present a clustering method based on mixtures of Poisson kernel-based densities on the high-dimensional sphere. We study connections of the Poisson kernel-based densities with other distributions appropriate for the analysis of directional data, prove identifiability of the mixture model and convergence of the associated EM-type algorithm, and study its operational characteristics. We further propose an empirical densities distance plot for estimating the number of clusters in a Poisson kernel-based densities model. Finally, we propose a method to simulate data from Poisson kernel-based densities and exemplify our methods through applications to real data sets and simulation experiments. Our experimental results show that the newly introduced model exhibits higher macro-precision and macro-recall than competing methods based on von Mises-Fisher and Watson distributions. This is joint work with Mojgan Golzy, Ph.D.
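The abstract does not state the density itself. As a rough illustration, the sketch below evaluates one commonly used form of a Poisson kernel-based density for a unit vector x in R^d, f(x | mu, rho) = (1 - rho^2) / (omega_d * ||x - rho * mu||^d), where omega_d is the surface area of the unit sphere and 0 <= rho < 1 controls concentration around the unit vector mu. This form, and the function names, are assumptions made for illustration and may differ from the parameterization used in the talk.

```python
import math


def sphere_area(d):
    """Surface area of the unit sphere embedded in R^d: 2 * pi^(d/2) / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)


def pkb_density(x, mu, rho):
    """Evaluate the assumed Poisson kernel-based density at unit vector x.

    x, mu : sequences of length d, both assumed to have unit norm.
    rho   : concentration parameter with 0 <= rho < 1 (rho = 0 is uniform).
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must satisfy 0 <= rho < 1")
    d = len(x)
    sq_dist = sum((xi - rho * mi) ** 2 for xi, mi in zip(x, mu))
    return (1.0 - rho ** 2) / (sphere_area(d) * sq_dist ** (d / 2.0))


if __name__ == "__main__":
    mu = (1.0, 0.0, 0.0)
    print(pkb_density((1.0, 0.0, 0.0), mu, rho=0.5))   # at the mode
    print(pkb_density((-1.0, 0.0, 0.0), mu, rho=0.5))  # opposite the mode
```

In a mixture-based clustering method, densities like this one would be evaluated for each component to compute cluster membership weights in the E-step of an EM-type algorithm.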