Algebraic and Differential Topology in Data Analysis (ADTDA)
The course will cover recent applications of topology and differential geometry in data analysis. Tools of differential and algebraic topology are starting to impact data science, where the mathematical apparatus has so far been dominated by ideas from statistical learning, computational linear algebra, and the theory of high-dimensional normed spaces. While the research community around ADT topics in data analysis is lively and fast-growing, the area is sparsely represented in campus syllabi.
1. Tools from algebraic topology
Homotopy equivalence, Simplicial homology, Nerve lemma, Dowker’s theorem.
Applications: Čech complexes and their topology in robotics and neurophysiology. Netflix problem complexes.
2. Topological Approximations
Vietoris-Rips and Čech complexes. Topology reconstruction from random samples: Niyogi-Smale-Weinberger. Topology reconstruction from dense samples: Hausmann-Latschev. Sketches: Merge trees; Reeb graphs.
3. Topological Inference
Persistent Homology: Algorithms; Stability. Biparametric persistence.
Applications: Image patch spaces. Textures and characterization of materials.
4. Euler calculus
Integration with respect to Euler characteristic. Topological Signal Processing. Valuations; Hadwiger’s Theorem. Average persistence for Gaussian fields.
Applications: Topological Sensor Networks. Shape reconstruction through Euler transform.
5. Aggregation
Spaces with averaging. Arrow’s theorem and Topological Social Choice. Aggregation in CAT(0) spaces.
Applications: Consensus in phylogenetic analysis. Political polarization.
6. Clustering
Basic clustering tools. Kleinberg’s Impossibility theorem. Carlsson-Mémoli functorial approach to clustering.
7. Tools from differential topology
Useful topological spaces: manifolds, subanalytic sets, simplicial complexes. Transversality, Sard’s theorem. Whitney’s embedding theorem.
Applications: Dimensionality reduction; Embeddings.
The course will rely mainly on recent papers and a few textbooks.
To receive credit, students will be expected
- to take a share of the class notes and produce a LaTeX source (sign up here), and
- either to present a paper (from a list) or to run a computational project (from a set of provided topics).
The class will be conducted remotely: the lectures will be held via Zoom (one-time registration required) at the planned times, 9:30 on Tuesdays and Thursdays. The first lecture is on March 24.
While most of the materials and links will be posted here, we will also be using Moodle for course-specific announcements. If you are auditing the course, let me know so I can add you to the list of users.