Session 39: New directions in functional data analysis. – Conference on Statistical Learning and Data Science / Nonparametric Statistics

Session title: New directions in functional data analysis.
Organizer: Tailen Hsing (UMich)
Chair: Vincent Joseph Dorie (Columbia)
Time: June 6th, 8:30am – 10:00am
Location: VEC 1402

Speech 1: Nonparametric covariance estimation for mixed longitudinal studies
Speaker: Kehui Chen (U of Pitt)
Abstract: Motivated by applications of mixed longitudinal studies, where a group of subjects entering the study at different ages (cross-sectional) are followed for successive years (longitudinal), we consider nonparametric covariance estimation with samples of noisy and partially observed functional trajectories. In this talk, we will introduce a novel sequential aggregation scheme, which works for both dense regular and sparse irregular observations. We will present numerical experiment results and an application to a midlife women’s working memory study. We will also discuss the details of identifiability and estimation consistency.

Speech 2: Functional Data Analysis with Highly Irregular Designs with Applications to Head Circumference Growth
Speaker: Matthew Reimherr (Penn State)
Abstract: Functional Data Analysis often falls into one of two branches, either sparse or dense, depending on the sampling frequency of the underlying curves. However, methods for sparse FDA often still rely on having a growing number of observations per subject as the sample size grows. Practically, this means that for very large sample sizes with infrequently or irregularly sampled curves, common methods may still suffer a non-negligible bias. This becomes especially true for nonlinear models, which are often defined based on complete curves. In this talk I will discuss how this issue can be fixed to obtain valid statistical inference regardless of the sampling frequency of the curves. This work is motivated by a study by Dr. Carrie Daymont from Hershey medical school that examines pathologies related to head circumference growth in children. In her study, tens of thousands of children are sampled, but with widely varying frequency.

Speech 3: Supervised Learning on the Path Space and its Applications
Speaker: Hao Ni (UCL)
Abstract: Regression analysis aims to use observational data to develop a functional relationship between explanatory variables and response variables, a task that is important for much of modern statistics, econometrics, and machine learning. In this talk, we consider the special case where the explanatory variable is a data stream. We provide an approach based on identifying carefully chosen features of the stream, which allows linear regression to be used to characterise the functional relationship between explanatory variables and the conditional distribution of the response. The methods used to develop and justify this approach, such as the signature of a stream and the shuffle product of tensors, are standard tools in the theory of rough paths and provide a unified, non-parametric approach with potentially significant dimension reduction. To further improve the efficiency of the signature method, we can combine non-linear regression methods (e.g. neural networks) with the signature feature set. Numerical examples are provided to show the superior performance of the proposed method. Lastly, I will show that signature-based methods have achieved state-of-the-art results in online handwritten text recognition and action recognition.
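
Illustrative note: the idea of regressing a response linearly on signature features of a stream can be sketched in a few lines. The following is a minimal sketch and not the speaker's implementation. It assumes a piecewise-linear stream, truncates the signature at level 2, and uses a toy response (a signed-area term) chosen only because it is exactly linear in those features. The function names `signature_level2` and `signature_features` are hypothetical.

```python
import numpy as np


def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path: array-like of shape (n_points, dim), the sampled stream.
    Returns (s1, s2) with shapes (dim,) and (dim, dim).
    """
    path = np.asarray(path, dtype=float)
    dim = path.shape[1]
    s1 = np.zeros(dim)
    s2 = np.zeros((dim, dim))
    for d in np.diff(path, axis=0):
        # Chen's identity for appending a straight segment with increment d:
        # a single segment has level-2 term outer(d, d) / 2.
        s2 += np.outer(s1, d) + np.outer(d, d) / 2.0
        s1 += d
    return s1, s2


def signature_features(path):
    """Flatten the truncated signature (levels 0, 1, 2) into one vector."""
    s1, s2 = signature_level2(path)
    return np.concatenate(([1.0], s1, s2.ravel()))


# Toy data (an assumption for illustration): random 2-d streams whose
# response is the antisymmetric level-2 term, i.e. twice the Levy area.
rng = np.random.default_rng(0)
paths = [np.cumsum(rng.normal(size=(20, 2)), axis=0) for _ in range(50)]
X = np.stack([signature_features(p) for p in paths])
y = np.array([s2[0, 1] - s2[1, 0] for _, s2 in map(signature_level2, paths)])

# Ordinary least squares on the signature features.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
```

In this toy setting the fit is exact because the response lies in the span of the features. For a real response, higher truncation levels or a non-linear model on top of the features, as the abstract suggests, would be the natural next step.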