Session 43: Statistics in neuroscience and microbiome research at the Flatiron Institute – Conference on Statistical Learning and Data Science / Nonparametric Statistics

Session title: Statistics in neuroscience and microbiome research at the Flatiron Institute
Organizer: Christian L. Müller (Flatiron Institute, Simons Foundation)
Chair: Christian L. Müller (Flatiron Institute, Simons Foundation)
Time: June 6th, 1:15pm – 2:45pm
Location: VEC 1303

Speech 1: Neural representation learning as kernel alignment
Speaker: Cengiz Pehlevan (Simons Foundation)
Abstract: What are the brain’s learning cost functions? I show that some kernel alignment cost functions can be minimized by biologically plausible neural learning algorithms. Starting from such cost functions, I derive neural networks for various biologically motivated unsupervised learning tasks, such as soft-clustering and manifold disentangling. I discuss applications of these ideas to circuits of the brain.

Speech 2: Robust regression with compositional covariates
Speaker: Aditya Mishra (Flatiron Institute)
Abstract: Large-scale 16S ribosomal RNA sequencing efforts in microbiome studies of the human gut and marine ecosystems yield relative abundance (compositional) data for microbial taxa at different taxonomic levels. A problem of interest is to model a phenotype or response using these compositional covariates. Such data often contain outliers or high-leverage observations. Hence, we propose a robust regression model with compositional covariates. Subcompositional coherence of the model estimates is ensured by imposing a linear constraint on the linear log-contrast model. To make the model robust, we add a mean shift parameter for each of the n observations. Estimation is performed via a penalized regression approach, with regularization enforcing sparsity in the mean shift parameters. We have investigated the model with both convex (l1 and adaptive l1) and non-convex (l0) penalties on the mean shift. For initialization in the latter case and for weight construction in the adaptive l1 case, we propose an algorithm that extends the idea of S-estimation to the linear model with compositional covariates. Our approach has only one tuning parameter, which is selected via a modified BIC criterion. Our theoretical analysis focuses on non-asymptotic prediction error bounds that reveal interesting finite-sample behaviors of the estimators. We demonstrate the efficacy of the approach using various simulation studies and an application relating body mass index to human gut microbiome data.
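
Illustration (not from the talk): the model described in this abstract can be sketched, under simplifying assumptions, as a log-contrast regression y = log(X) beta + gamma + noise with the zero-sum constraint sum(beta) = 0 and an l1 penalty on the per-observation mean shift gamma. The alternating scheme below is only one simple way to minimize that convex objective and is not claimed to be the speaker's algorithm. The function names, the penalty level `lam`, the omission of an intercept, and the synthetic data are all assumptions made for illustration. The S-estimation initialization, the adaptive l1 and l0 penalties, and the modified BIC selection mentioned in the abstract are not implemented.

```python
import numpy as np

def soft_threshold(r, lam):
    """Elementwise soft-thresholding, the proximal map of lam * |.|."""
    return np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)

def robust_logcontrast(X, y, lam, n_iter=200):
    """Alternating minimization (illustrative) of
        0.5 * ||y - log(X) @ beta - gamma||^2 + lam * ||gamma||_1
    subject to sum(beta) == 0.
    """
    Z = np.log(X)
    # Row-centering the log covariates: for any beta with sum(beta) == 0,
    # Z @ beta equals Zc @ beta, and the minimum-norm least-squares
    # solution on Zc automatically satisfies the zero-sum constraint.
    Zc = Z - Z.mean(axis=1, keepdims=True)
    gamma = np.zeros(len(y))
    for _ in range(n_iter):
        # beta-step: least squares on the shift-corrected response
        beta, *_ = np.linalg.lstsq(Zc, y - gamma, rcond=None)
        # gamma-step: soft-threshold the residuals
        gamma = soft_threshold(y - Zc @ beta, lam)
    return beta, gamma

# Synthetic example: compositions from a Dirichlet, three shifted responses.
rng = np.random.default_rng(0)
n, p = 60, 5
X = rng.dirichlet(np.ones(p), size=n)
beta_true = np.array([1.0, -1.0, 0.5, -0.5, 0.0])  # sums to zero
y = np.log(X) @ beta_true + 0.1 * rng.standard_normal(n)
y[:3] += 8.0  # artificial outliers in the response

beta_hat, gamma_hat = robust_logcontrast(X, y, lam=1.0)
print("sum of beta_hat:", beta_hat.sum())
print("indices with nonzero mean shift:", np.flatnonzero(gamma_hat))
```

In this sketch, observations with a nonzero entry in `gamma_hat` are the ones flagged as outlying, and `lam` plays the role of the single tuning parameter.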

Speech 3: Online deconvolution and demixing of calcium imaging data in real time
Speaker: Eftychios Pnevmatikakis (Simons Foundation)
Abstract: Optical imaging methods using calcium indicators enable monitoring the activity of large neuronal populations in vivo. Imaging experiments typically generate a large amount of data that needs to be processed to extract the activity of the imaged neuronal sources. While deriving such processing algorithms is an active area of research, most existing methods require processing large amounts of data at once, which makes them vulnerable to the volume of the recorded data and prevents real-time experimental interrogation. Here we introduce CaImAn, an open-source suite of tools for the online analysis of calcium imaging data, including i) motion artifact correction, ii) neuronal source extraction, and iii) activity denoising and deconvolution. Our approach combines and extends previous work on online dictionary learning and calcium imaging data analysis to deliver an automated pipeline that can discover and track the activity of hundreds of cells in real time. We benchmark the performance of our algorithm on manually annotated data and show that it outperforms popular offline approaches.