Make sure that you did all of your course setup, at the top of the HowTo page.
This week:
Questions? post a comment!
We will rely on the standard facts about generic smooth mappings into two-dimensional manifolds: for such mappings, the set of critical points is a smooth curve \(\Sigma\) in \(M\), which is immersed outside of a finite number of pleats. Near a generic point of \(\Sigma\), there are local coordinates on \(M\) in which the mapping is given by
\[
y_1=x_1, y_2=q(x_2,\ldots,x_m)
\]
(folds), and near isolated points of the curve of critical points, in some coordinates the mapping is given by
\[
y_1=x_1, y_2=x_2^3+x_1x_2+q(x_3,\ldots,x_m),
\]
(pleats). Here \(q\) is a quadratic form in the remaining variables (absent in the pleat case if \(m=2\)).
The image of the critical curve \(\Sigma\) is a curve in the \((f,g)\)-plane called the contour; generically, the mapping \((f,g)\) restricted to \(\Sigma\) is an immersion outside of the pleats; at the pleat points, the contour has cusps.
(The contour on the left is a – rotated and somewhat distorted – rendition of Chicago Millennium Park’s Bean sculpture, with the cusps made visible.)
As the coordinates on the \((f,g)\)-plane are fixed, more special points emerge: namely those where either of the functions has a critical point. We will be referring to these special points as V- (vertical) or H- (horizontal) points.
These H- and V-points sit on the critical curve \(\Sigma\). Generically, the functions \(f,g\) are Morse, and the contour has non-vanishing curvature at them. (This follows from the fact that at a critical point of, say, \(f\), the restriction of the Hessian matrix of \(f\) to the hyperplane \(\{dg=0\}\) is nondegenerate.)
The V- and H-points partition the critical curve into segments which map to the segments of the contour curve where its slope is \(\neq 0,\infty\). We will refer to the segments where the slope is negative as Pareto segments (for obvious reasons).
We augment the Pareto segments by attaching at their boundaries the rays (referred to as extension rays) pointing up (at V-points) or right (at H-points). We will call the V- or H-points where this attachment results in \(C^1\) curve, the pseudosmooth points; those points where the curve loses smoothness, the pseudocusps.
We will refer to the resulting union of Pareto segments and extension rays as the life contour for the filtration defined by the pair \(f,g\).
At the points of Pareto segments one can coorient the contour, declaring positive the side where both \(f\) and \(g\) can increase. This allows one to define the index of a point \(x\) on the smooth part of a Pareto segment: choose a function of \(f,g\) vanishing on the contour near the point and increasing in the positive direction; this function has a (degenerate) critical point at \(x\), and its index is declared the index of \(x\).
The index at extension rays is defined just as the index of the critical point of \(f\) or \(g\), where the ray was attached to a Pareto segment.
It is easy to see that the index is constant along the smooth parts of the Pareto segments, and jumps at cusps and pseudocusps in a predictable way: the point that is higher with respect to \(f,g\) has the higher index.
We will be referring to the Pareto pseudocusps and cusps (i.e. those where the branches of the contour involved have negative slopes) and to the double points of branches of the same index as obstacles.
Consider an increasing curve \(\gamma:\Real\to\Real^2\) in the \(f,g\)-plane (this means that both functions strictly increase along the curve). To fix the gauge, we will assume that the curve is parameterized by \(f+g\).
An increasing curve \(\gamma\) defines a usual \(I\)-indexed filtration of \(M\), and correspondingly, persistent homologies and persistent diagrams \(\phd_k(\gamma), k=0,\ldots,m\), which we interpret as collections of distinguishable points in the planes \(\{b\lt d\}\).
(Remark that the approach to multiparametric persistence through restriction to increasing straight lines has been considered by several authors. Here we allow arbitrary increasing curves.)
The main (somewhat tautological) result of this note is
Theorem: For a path-connected collection of increasing curves in the plane avoiding obstacles, there is a section of the space of persistent diagrams: that is, for any two such curves \(\gamma,\gamma'\), there is an identification
\[
I(\gamma',\gamma):\phd_*(\gamma)\to\phd_*(\gamma'),
\]
of the bars in the persistent diagrams corresponding to each of the curves, and these identifications are consistent:
\[
I(\gamma'',\gamma')I(\gamma',\gamma)=I(\gamma'',\gamma).
\]
The figure below shows six homotopy non-equivalent increasing paths avoiding the obstacles: five green, and the sixth blue.
The bars shown on the lower left (recall that the curves are parameterized by \(f+g\)) correspond to the filtration that the blue curve defines: the births and deaths happen where the curve intersects the life contour.
One can readily see what happens when one moves from one homotopy class to another, across a (pseudo)cusp with the indices of branches \((k,k+1)\): a bar in \(\phd_k\) appears or disappears.
As the curve crosses a self-intersection point of the life contour, where two branches of equal index cross, one has a less prominent event, where the charges in the persistence diagram \(\phd_k\) bounce off a common vertical or horizontal line (we can call the corresponding bars interacting).
This result prompts a question about the structure of the space of increasing paths in the plane avoiding a finite collection of obstacles.
Theorem: The connected components of the space of increasing curves in the plane avoiding a finite set of obstacles \(o_1,\ldots,o_k\in\Real^2\) are contractible, and their number is equal to the number of chains of obstacles, i.e. subsets \(o_{i_1}\prec\ldots\prec o_{i_l}\), where \(\prec\) is the vector ordering of the points.
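As a sanity check of this count, one can enumerate the chains of obstacles directly. A minimal sketch (the function name is mine; obstacles are assumed in general position, with no shared coordinates, and the empty chain – the unobstructed class – is counted):

```python
def count_chains(obstacles):
    # o ≺ o' iff both coordinates strictly increase (general position assumed).
    prec = lambda p, q: p[0] < q[0] and p[1] < q[1]
    pts = sorted(obstacles)  # predecessors always come earlier in this order
    chains_ending_at = {}    # obstacle -> number of chains with that maximum
    for q in pts:
        chains_ending_at[q] = 1 + sum(c for p, c in chains_ending_at.items()
                                      if prec(p, q))
    return 1 + sum(chains_ending_at.values())  # "+1" counts the empty chain

# Four obstacles, two of them incomparable:
print(count_chains([(0, 0), (1, 2), (2, 1), (3, 3)]))  # → 12
```

The dynamic program simply counts, for each obstacle, the chains that end there; summing and adding the empty chain gives the total.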
Adjacency of these cells is in itself an interesting invariant of the overall structure. Namely, for any collection of obstacles, consider the set of increasing paths passing through those obstacles. Each of these closed strata is manifestly contractible, as is each connected component of the set of increasing paths passing only through a given set of obstacles (which is easy to prove the same way the Theorem above is proven).
Then the resulting stratification of the space of increasing paths is dual to a \(\CAT(0)\) cube complex: the vertices of the complex correspond to the connected components of the set of increasing paths avoiding obstacles, and the sets \(h_k\) of increasing paths passing through the obstacle \(o_k\) form the hyperplanes (for the nomenclature, see Sageev).
We will be referring to this cube complex as the characteristic complex of biparametric persistence structure defined by \(f,g\).
I would stipulate that the characteristic complex is the key descriptor of biparametric persistence in the smooth category.
This note is but an introduction to the overall problem of understanding the nature of the characteristic complex for biparametric persistences. Here are a few research directions we hope to pursue.
This note is an outline of my talk at the conference on Geometric Data Analysis at U Chicago in May 2019; written up while at IMI at Kyushu University.
My understanding is that Peter Bubenik and Mike Catanzaro were also considering biparametric persistence for smooth maps.
The Euler characteristic’s additivity (apply with care! if used with vanilla singular homologies, work with compact sets only; if going beyond that, use some o-minimal structure) makes possible a version of integration, where the values are added weighted not by Lebesgue measure (as in the common integrals), but by Euler characteristic:
\[
\int h d\chi=\sum_{r} r\chi(\{h=r\}).
\]
For this definition to make sense, we need, of course, the function to have a finite range (in some commutative ring, like \(\Int\)), and all the preimages to reside in the o-minimal structure we work with. We will be referring to such functions as constructible.
The integral thus defined is additive, and the Fubini theorem holds.
One has an immediate application to data fusion: suppose one has a very dense field of sensors, each measuring the number of objects it can sense. Assuming that the footprint of each object has the same (nonzero) measure, one can just integrate the sensor readings and divide by that measure, – whether the measure is Lebesgue or with respect to Euler characteristic.
Of course, if the measure vanishes, the counts become indefinite…
One can use this approach even for moving objects, with appropriate modifications.
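The counting recipe above can be sketched in one dimension, where the cell decomposition of the line by the interval endpoints makes the compactly supported Euler characteristic elementary: \(\chi_c(\mathrm{point})=1\), \(\chi_c(\mathrm{open\ interval})=-1\). The function name and the closed-interval footprint model are illustrative assumptions:

```python
def euler_integral(intervals):
    # h = sum of indicator functions of closed intervals [a, b];
    # by additivity, ∫ h dχ = Σ_cells h(cell) · χ_c(cell), where the cells
    # are the endpoint vertices (χ_c = +1) and the bounded open intervals
    # between consecutive endpoints (χ_c = -1); h vanishes elsewhere.
    def h(x):
        return sum(1 for a, b in intervals if a <= x <= b)
    pts = sorted({e for ab in intervals for e in ab})
    total = sum(h(p) for p in pts)                              # point cells
    total -= sum(h((p + q) / 2) for p, q in zip(pts, pts[1:]))  # open cells
    return total

# Three overlapping "target footprints": the Euler integral counts the
# targets exactly, despite the overlaps.
print(euler_integral([(0, 5), (3, 9), (7, 12)]))  # → 3
```

The value of \(h\) on each open cell is sampled at the midpoint, which is legitimate since \(h\) is constant there.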
This view can be generalized to a version of integral calculus wrt \(\chi\). For a constructible kernel \(K:X\times Y\to\Int\), one can define the integral operator \(\CF_Y\to\CF_X\) by
\[
(Kh)(x)=\int K(x,y)h(y)d\chi(y).
\]
For example, when \(X=Y=\Real^d\), and \(K=g(x-y)\), the transform defines convolution \(h\star g\) (wrt \(\chi\)).
Classically, the convolution of indicator functions of convex compact sets gives the Minkowski sum:
\[
\one_A\star\one_B=\one_{A+B}.
\]
Integral calculus allows one to rigorously deconvolve the Minkowski sum:
\[
\one_A\star\one_{-A^o}=\delta_0;
\]
here \(A^o\) is the interior of \(A=\overline{A^o}\).
A general class of problems admitting inversion was introduced by Schapira: often one can find a pair of kernels \(R,S\) such that
\[
\chi(R^{-1}(x)\cap S^{-1}(x))=\mu, \quad \chi(R^{-1}(x)\cap S^{-1}(x'))=\lambda
\]
for any \(x\neq x'\in X\). In this case, if \(g(y)=\int S(y,x)h(x)d\chi(x)\), then
\[
\int R(x,y)g(y)d\chi(y)=(\mu-\lambda)h(x)+\lambda\int h d\chi.
\]
Examples include projective duality and integrals over circles.
An important aspect of data analysis is aggregation: collapsing a flock of points into a single point (finding an “average”, “median”, or other representative point).
It is a nontrivial task: in some situations it can be done in a principled way, say, in the case of \(\mathtt{CAT}(0)\) spaces.
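As a toy illustration in the simplest \(\mathtt{CAT}(0)\) spaces – metric trees – the median of three points (the unique common vertex of the three pairwise paths) is such a principled aggregation: it is symmetric in its arguments and respects unanimity. A sketch, with illustrative names and an assumed adjacency-list encoding of the tree:

```python
from collections import deque

def tree_path(adj, a, b):
    # Unique path from a to b in a tree, via BFS parent pointers.
    parent = {a: None}
    queue = deque([a])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    path = [b]
    while path[-1] != a:
        path.append(parent[path[-1]])
    return path

def tree_median(adj, x, y, z):
    # The median: the unique vertex lying on all three pairwise paths.
    # It is anonymous (symmetric in x, y, z) and unanimous (x=y=z -> x).
    common = (set(tree_path(adj, x, y)) & set(tree_path(adj, y, z))
              & set(tree_path(adj, x, z)))
    return common.pop()

# A star tree: center 0, leaves 1, 2, 3; the median of the leaves is the center.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(tree_median(adj, 1, 2, 3))  # → 0
```

Trees are contractible, so this example is consistent with the Eckmann obstruction discussed below: topologically complicated spaces do not admit such maps.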
But in many situations this task is seen as difficult. If one requests that the aggregation operation
\[
f:X\times\cdots\times X\to X
\]
respects a) unanimity (\(f\circ\Delta_X=\mathtt{id}_X\)) and b) anonymity (\(f(x_{1},\ldots,x_{k})=f(x_{\sigma_1},\ldots,x_{\sigma_k})\) for any permutation \(\sigma\)), then a theorem by B. Eckmann says that any reasonable topological space admitting aggregation over any \(k\geq 2\) points is contractible.
One important application of this impossibility to aggregate is in social choice theory: one cannot create a democratic procedure if the space of preferences is topologically complicated…
If one works with smooth manifolds and mappings, it makes sense to focus on the generic situation, – functions that form an open dense subset in the space of all functions. This often leads to a significant simplification of the description of the functions.
For example, for a smooth mapping \(F:M\to\Real^d\) of a compact manifold into Euclidean space, the singular set consists of critical points, where the rank of the Jacobi operator is less than maximal. Generically, this is a set of dimension \(d-1\).
In the case of \(\Real\)-valued functions, generic functions are Morse, i.e. have non-degenerate (thus isolated) critical points, with different critical values. For maps into the plane, Whitney theory implies that generically such mappings have a smooth curve of critical (“fold”) points, which contains isolated pleat points.
Reeb graphs: given a function (from a manifold, or more generally, a path connected topological space), identify path connected components of the level sets.
Mapper is a software data analysis tool that reconstructs the Reeb graph.
More generally, Reeb spaces can be formed using maps into the plane or other, higher dimensional spaces, and identifying path-connected components: the resulting CW complexes are increasingly harder to construct and interpret.
Merge trees: keep track not of level sets, but of sublevel sets. Essentially, this is 0-dimensional persistent homology.
For univariate functions, the merge tree completely identifies the underlying function, up to a reparametrization of the underlying space (the real line). Is there an analogue of this result in the higher dimensional setting? Indeed, there is.
For Morse functions, there is a natural way to decompose the underlying manifold, and a corresponding way to construct persistent homologies from the Morse-Smale complex. This was done by Barannikov and used to derive spectral asymptotics for Witten Laplacian.
One can also see the importance of (at least, 0-dimensional) persistent homologies in the classical result on gradient dynamics perturbed by white noise: the escape rates from wells are governed by the bar lengths corresponding to those wells.
The stability theorem establishes that the persistent diagrams vary little (in the so-called bottleneck metric) when the function is perturbed little in the \(L_\infty\) norm.
While originally a tool to describe topological spaces, persistent diagrams are now used also as a tool to characterize functions. This is especially pronounced in materials science. With this change of focus, short bars become a signal, a descriptor of the function, not a nuisance to get rid of.
A natural question is what to expect from a “generic” function: for example, how do the short bars accumulate towards the diagonal?
A definition of PH dimension is one measure of this accumulation.
Namely, let \(\mu_k(f)\) be the (counting) measure associated with the \(k\)-th persistence diagram of the function.
We take the persistence-dimension of the function \(f\) to be
\[
\ph_k\dim(f):=\inf\{p:\int (d-b)^p_+d\mu_k(f)<\infty\}.
\]
Example: for the univariate Lipschitz function \(x^{q+1}\cos(x^{-q})\), one has \(\ph_k\dim(f)=q/(q+1)\).
We prove that general estimates hold for all Lipschitz or Hölder functions on a compact finite-dimensional polyhedron \(X\): \(\ph_*\dim(f)<\dim X/\alpha\), where \(\alpha\) is the Hölder exponent (\(=1\) for Lipschitz functions).
Moreover, it turns out that for generic Lipschitz or Hölder functions (outside of a meager set), this estimate is precise: a generic function has the highest possible persistence dimension.
Persistence diagrams are a rather incomplete descriptor for a univariate function. If one restricts attention to descriptors that are reparametrization invariant, the merge tree associated to the function is a complete one: the Harris walk is the inverse transformation.
Also, persistence diagrams for univariate functions can be constructed fast, essentially using an online algorithm with a pair of stacks recording local minima and maxima.
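A sketch of such a computation; for simplicity it uses the elder rule with union–find on the sorted values, rather than the two-stack online algorithm mentioned above (names and the convention of dropping zero-length bars are mine):

```python
def sublevel_persistence0(values):
    # 0-dimensional persistence of the sublevel-set filtration of a
    # sequence (a PL function on a segment). Vertices enter in order of
    # value; when two components merge, the younger one dies (elder rule).
    n = len(values)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    order = sorted(range(n), key=lambda i: (values[i], i))
    rank = {v: k for k, v in enumerate(order)}   # entry order of each vertex
    birth = {i: i for i in range(n)}             # root -> its birth vertex
    bars = []
    for i in order:
        for j in (i - 1, i + 1):
            if 0 <= j < n and rank[j] < rank[i]:  # neighbor already entered
                ri, rj = find(i), find(j)
                if ri != rj:
                    young, old = sorted((ri, rj),
                                        key=lambda r: rank[birth[r]],
                                        reverse=True)
                    if values[i] > values[birth[young]]:  # drop zero bars
                        bars.append((values[birth[young]], values[i]))
                    parent[young] = old
    root = find(order[0])
    bars.append((values[birth[root]], float("inf")))      # essential bar
    return sorted(bars)

print(sublevel_persistence0([0, 2, 1, 3]))  # → [(0, inf), (1, 2)]
```

In the example, the global minimum 0 gives the essential bar, and the local minimum 1 is killed by the local maximum 2.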
The structure of jitter for one class of random functions is especially easy to investigate: Brownian trajectories.
Let \(f(t)\) be the Brownian motion with constant drift. Then \(\mu_0(f)\) is a random point process, with \(\ph_0\dim\) at most 2.
We give a quite detailed description of this process, based on this preprint.
While NSW provides a theoretical pipeline for recovering the topology of an embedded manifold from a large enough sample, its reliance on Čech complexes is (one of) the computational obstacles.
Vietoris-Rips complexes are much easier to compute, and they make sense in arbitrary metric spaces.
Caveats exist: unlike Čech complexes, they do not necessarily reflect faithfully the topology of the space covered by the metric balls: spurious homologies can emerge.
However, if the sample is dense enough, approximation guarantees with Vietoris-Rips complexes do exist (Hausmann, Latschev).
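The ease of the flag construction can be sketched as follows (a minimal illustration, not an optimized implementation; names are mine). Four points sampled on a square recover a circle at one scale and a contractible blob at a larger one:

```python
from itertools import combinations
from math import dist

def vietoris_rips(points, r, max_dim=3):
    # Flag construction: a simplex for every subset of points that are
    # pairwise within distance r of each other.
    simplices = []
    for d in range(max_dim + 1):
        for s in combinations(range(len(points)), d + 1):
            if all(dist(points[i], points[j]) <= r
                   for i, j in combinations(s, 2)):
                simplices.append(s)
    return simplices

def euler_char(simplices):
    return sum((-1) ** (len(s) - 1) for s in simplices)

# Four sample points on a unit square (diagonal ≈ 1.414):
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(euler_char(vietoris_rips(square, 1.1)))  # sides only: a circle, χ = 0
print(euler_char(vietoris_rips(square, 1.5)))  # full simplex: contractible, χ = 1
```

Only pairwise distances are used – no ambient balls – which is why the construction makes sense in an arbitrary metric space.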
The biggest problem with the “sample from the manifold” model is the noise: the data points that deviate from the underlying space and present spurious topological patterns at all scales.
An approach to address the spurious topological features without assuming a particular scale was proposed, under the name of “persistent homology”.
(Basic facts rely on Edelsbrunner-Harer)
Definitions: a filtration (an exhaustive family of subsets) of a topological space allows one to arrange the topological features at all scales, and to track those that are stable, – persistent.
As the filtration grows, so do the linear spaces of cycles and boundaries. So one can ask when a particular homology class emerged, and when it dies, – i.e. is bounded by a chain. This fits naturally into a neat linear algebra formalism, known as “quiver representation theory”, and leads to a notational device for the persistent classes, known as “persistence diagrams”.
In simplicial complexes, there are distinguished bases in the spaces of chains, and one can track them using an efficient algorithm (cubic in the number of simplices).
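A minimal sketch of the standard column-reduction algorithm behind this (over \(GF(2)\); names and the input encoding are mine, and no optimizations are attempted):

```python
def reduce_boundary(boundaries):
    # Column reduction of the boundary matrix over GF(2).
    # boundaries[j] = row indices of the boundary of simplex j, with
    # simplices listed in filtration order. Returns the birth-death
    # pairs (i, j) and the indices of essential (never-dying) classes.
    reduced, low_inv, pairs = [], {}, []
    for j, col in enumerate(boundaries):
        col = set(col)
        while col and max(col) in low_inv:
            col ^= reduced[low_inv[max(col)]]   # add an earlier column mod 2
        reduced.append(col)
        if col:
            low_inv[max(col)] = j
            pairs.append((max(col), j))
    paired = {i for p in pairs for i in p}
    essential = [i for i in range(len(boundaries)) if i not in paired]
    return pairs, essential

# Filtration of a filled triangle: vertices 0,1,2, then edges 3,4,5,
# then the 2-cell 6.
tri = [[], [], [], [0, 1], [1, 2], [0, 2], [3, 4, 5]]
print(reduce_boundary(tri))  # → ([(1, 3), (2, 4), (5, 6)], [0])
```

In the example, vertex 0 carries the essential connected component, vertices 1 and 2 are killed by the first two edges, and the loop closed by edge 5 dies when the 2-cell enters; the worst-case cost of the reduction is cubic in the number of simplices, as stated above.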
Stability theorem establishes that the persistent diagrams vary little (in the so-called bottleneck metric) when the function is perturbed little in the \(L_\infty\) norm.
We recall what a topological space is, the notions of homeomorphism, homotopy and homotopy equivalence.
Simplicial complexes: abstract; their geometric realizations. Poset of a simplicial complex; simplicial complex associated to a poset. Barycentric subdivisions.
Simplicial complexes can be embedded into Euclidean spaces of any dimension exceeding twice the dimension of the complex.
Connecting combinatorial and geometric worlds: if \(f:|K|\to|L|\) is a continuous mapping, we say that \(\phi:K\to L\) is a simplicial approximation of \(f\) if for any point \(x\in|K|\), its image \(\phi(x)\) is in the simplex whose interior contains \(f(x)\). Simplicial approximations are homotopic to the maps they approximate, and, luckily, they do exist – if one subdivides the domain finely enough (say, by barycentric subdivisions).
Manifolds: charts and atlases. Embeddings (to be discussed later).
We will be using the Nerve lemma several times; it is perhaps best understood in the context of diagrams of spaces and their homotopy limits.
Homologies of a relation: given a relation (say, between a pair of finite sets), one can construct a pair of simplicial complexes. Dowker’s theorem ensures they are homotopy equivalent. One can use this to reason about both of the complexes: one example deals with the representation of the terrain via neuronal activity traces. Other examples include describing the structure of social preferences for a collection of artifacts (the Netflix matrix).
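A hedged sketch of the two Dowker complexes for a toy relation (the names and the example, loosely in the spirit of the neuronal-traces story, are mine); since the complexes are homotopy equivalent, their Euler characteristics must agree:

```python
from itertools import combinations

def dowker_complex(relation):
    # Row complex of a relation (a set of (row, col) pairs): a simplex
    # for each finite set of rows all related to one common column.
    cols = {c for _, c in relation}
    simplices = set()
    for c in cols:
        fiber = sorted({r for r, cc in relation if cc == c})
        for d in range(1, len(fiber) + 1):
            simplices.update(combinations(fiber, d))
    return simplices

def euler_char(simplices):
    return sum((-1) ** (len(s) - 1) for s in simplices)

# Three "neurons" 0,1,2 and three "places" a,b,c, each place seen by
# exactly two neurons: both Dowker complexes are circles.
R = {(0, 'a'), (0, 'b'), (1, 'b'), (1, 'c'), (2, 'c'), (2, 'a')}
Rt = {(c, r) for r, c in R}  # the transposed relation
print(euler_char(dowker_complex(R)), euler_char(dowker_complex(Rt)))  # → 0 0
```

Here both complexes are triangles without their 2-cell (no column is related to all three rows, and vice versa), illustrating the duality.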
Given a sample from a manifold, how can we reconstruct it (or recognize its topology)?
Naive example: a sample comes either from the \((n+1)\)-dimensional unit sphere, or from a manifold close to its equator. Can one distinguish these two cases? One can easily see that one needs a sample size exponential in the dimension (essentially, due to measure concentration).
If one is willing to accept this, an algorithm to reconstruct the homotopy type of \(M\) was offered by Niyogi-Smale-Weinberger: sample a lot of points \(X\) from a Riemannian manifold \(M\subset\Real^N\) and take some offset, – i.e. the union of balls around the sample:
\[
X_\epsilon=\{x\in\Real^N:\mathtt{dist}(x,X)\leq \epsilon\}.
\]
Then one has:
The proof of the first statement is an elementary geometric computation; the second one is essentially a reformulation of the well-known bounds on the covering problem.
Refresher on topology: plenty of good books to brush it up:
Projects:
In 2018, the chances for a woman in the US state of Georgia to die from causes related to her pregnancy were 46 per 100,000.