# CSE 5559 Computational Topology: Theory, algorithms, and applications CSE 5559 Computational Topology: Theory, algorithms, and applications to data analysis Chapter 6: More on Persistent homology Instructor: Yusu Wang Section 1: Persistence modules, extensions Persistence Modules Let be a partially ordered set where

or A persistence module over E.g, in persistence induced by the sublevelset filtration of , and is the homomorphism induced by inclusion Persistence induced by simplicial maps

Recall simplicial map Given a tower of simplicial complexes connected by simplicial maps: each is a simplicial map, and Apply homology functor to the above sequence We obtain a persistence module over

Note: The algorithm we introduced in previous lectures do not apply (that algorithm is only for filtrations connected by inclusions). Extended Persistence Given a function together with an index set with Persistence induced by sublevel-set filtration

Persistence pairings of the form correspond to essential homology in Extended persistence sequence Example Zigzag Persistence Connection between vector spaces are no longer

monotone 1 2 persistence (a canonical decomposition into indecomposable interval modules) can still be defined Zigzag Persistence Connection between vector spaces are no longer

monotone 1 2 persistence (a canonical decomposition into indecomposable interval modules) can still be defined Example: levelset zigzag persistence

Given , and For simplicity, set , and A simple example Section 2: Interleaving distance Interleaving Distance More general way to measure distance between two arbitrary persistence modules

Interleaving distance First introduced in [Chazal, Cohen-Steiner, Gliss, Guibas, Oudot, 2009] [Lesnick PhD Thesis] [Chazal, de Silva, Gliss and Oudot, 2016] (available on arXiv)

Two persistence modules (indexed by ) Goal: define a distance between them Intuition : ( ) ( ) : ( ) ( )

Intuition Isomorphic persistence modules : ( ) ( ) : ( ) ( ) Intuition

Isomorphic persistence modules : ( ) ( ) : ( ) ( ) Vertical maps also have to commute with horizontal maps (in all possible combinations)

Intuition Isomorphic persistence modules ( ) ( ) : :

( ) ( ) Vertical maps also have to commute with horizontal maps (in all possible combinations) E.g, Intuition Isomorphic persistence modules , ( ) ( ) :

: , ( ) ( ) Vertical maps also have to commute with horizontal maps (in all possible combinations)

e.g, -Interleaving and are -interleaving if there exists maps : and for any s.t. these maps commute with horizontal maps and ( ) ( + ) ( +2 )

: + + ( ) ( + ) ( +2 ) -Interleaving and are -interleaving if there exists maps

: and for any s.t. these maps commute with horizontal maps and ( ) ( + ) ( +2 ) : +

+ ( ) ( + ) ( +2 ) To verify commutativity of maps, only need to check four configurations (see board) Remark In practice, often it is much easier to check for interleaving at space level Example: Cech and Vietoris-Rips complexes

: + + 2 G: + + 2 Interleaving Distance { and are -interleaved} General Stability Theorem [Chazal, Cohen-Steiner, Gliss, Guibas, Oudot 2009]

Given two (q-tame) persistence modules and , let and be their corresponding persistence. We then have: ) A More General Result { and are -interleaved} Isometry Theorem [Lesnick 2015], [Chazal, de Silva, Gliss and Oudot, 2016]

Given two (q-tame) persistence modules and , let and be their corresponding persistence diagrams. We then have: ) Example Interleaving between Cech and Rips filtrations Further remarks Multidimensional persistence

Generating cycles for homology and persistence homology Further remarks Multidimensional persistence Generating cycles for homology and persistence homology Further remarks

Multidimensional persistence Generating cycles for homology and persistence homology Tracking persistence through time Section 3: Applications of Persistence 1: Homoloyg inference for PCD Recall the example

Imagine we want to characterize the following curve Recall the example Imagine we want to characterize the following curve Death time Birth time Distance Function a compact subset of Distance function

is a 1-Lipschitz function -offset of Given any point More Formally Given a compact , define distance function

where Consider the sub-level set filtration induced by Use its persistence diagram as a summary to characterize More Formally Given a compact , define distance function

Consider the sub-level set filtration induced by Use its persistence diagram as a summary to characterize Input:

where A set of points sampled on/around Question: How to approximate the persistence module induced by or simply just ? Persistent Homology Inference from PCD Input:

Question: A set of points sampled on/around How to approximate the persistence module induced by or simply just ? Sampling condition for Different models (allowing various types of noise) Use the simple Hausdorff model:

is -sampling of if Union of Balls = P ( , ) Union of Balls approximates offset Union of Balls

approximates offset Union of Balls approximates offset As we increase i.e, a filtration

Union of Balls approximates offset As we increase i.e, a filtration Union of Balls

approximates offset As we increase i.e, a filtration Consider the persistence module induced by Its persistence diagram intuitively approximates Image courtesy of T. K. Dey Union of Balls

approximates offset As we increase i.e, a filtration Consider the persistence module induced by Its persistence diagram intuitively approximates

Union of Balls ech complex Intuitively, approximates offset The ech complex is the Nerve of By Nerve Lemma, is homotopy equivalent to Persistent Homology Inference from PCD

Input: A set of points sampled on/around Question: How to approximate the persistence module induced by or simply just ? Target filtration: Intermediate filtration:

ech filtration: By Nerve Lemma Approximate Guarantee Interleaving between s and s [Interleaving Lemma] If (i.e, is an -sample of ), then for any

+ +

+ + Approximate Guarantee Interleaving between s and s

[Interleaving Claim] If (i.e, is an -sample of ), then for any [Approximation Theorem] If (i.e, is an -sample of ), the persistence modules induced by and by are -interleaved. It follows then , where and are the persistence diagrams induced by and by the ech filtration , respectively. Remarks Data sparsification [Sheehy 2012], [Cavanna, Jahanseir, Sheehy 2015], [Dey, Shi, W, 2016]

Homology Inference Estimate the homology of a hidden space from its point samples when is a manifold or is a compact set in Sampling conditions

local feature size / reach weak feature size Distance Function a compact subset of Distance function is a 1-Lipschitz function -offset of

Given any point Medial Axis The medial axis of is the closure of the set of points such that means that there is a medial ball touching at more than point and whose interior is empty of points from . Courtesy of [Dey, 2006]

Local Feature Size The local feature size at a point is the distance of to the medial axis of This concept is adaptive Large in a place without features

Intuitively: That is, We should sample more densely if local feature size is small. The reach Courtesy of [Dey, 2006] Gradient of Distance Function

Distance function not differentiable on the medial axis Still can define a generalized concept of gradient For

[Lieutier, 2004] Let and be the center and radius of the smallest enclosing ball of point(s) in The generalized gradient of distance function Flow lines induced by the generalized gradient Examples: Critical Points A critical point of the distance function is a point whose generalized gradient vanishes

A critical point is either in or in its medial axis Weak Feature Size Given a compact , let denote the set of critical points of the distance function that are not in Given a compact the weak feature size is

Equivalently, is the infimum of the positive critical value of Typical Sampling Conditions Hausdorff distance between two sets and No noise version:

infimum value such that and A set of points is an -sample of if and With noise version: A set of points is an -sample of if Why Distance Field? Theorem [Offset Homotopy] [Grove93] If are such that there is no critical value of in the

closed interval then deformation retracts onto . In particular, . Remarks: For the case of compact set , note that it is possible that , for sufficiently small may not be homotopy equivalent to . Intuitively, by above theorem, we can approximate for any small positive from a thickened version

(offset) of . The sampling condition makes sure that the discrete sample is sufficient to recover the offset homology. Smooth Manifold Case Let be a smooth manifold embedded in Theorem [Niyogi, Smale, Weinberger 2005] Let be such that . If , there is a deformation retraction from to Corollary A Under the conditions above, we have

How about using Rips complex instead of ech complex? Recall that inducing Idea [Chazal and Oudot 2008]: Forming interleaving sequence of homomorphism to connect them with the homology of the input manifold and its offsets

Convert to Cech Complexes Lemma A [Chazal and Oudot, 2008]: The following diagram commutes: ( ) h ( )

( ) h ( ) Corollary B Let be s.t. . If , where the second isomorphism is induced by inclusion. From Rips Complex Lemma B: Given a sequence of homomorphisms between

finite dimensional vector spaces, if then Rips and ech complexes: Applying Lemma B The Case of Compact Set Given a compact set Local feature size could be 0

Also, conditions for previous result on homology inference strong even for manifold The Case of Compact Set Given a compact set Local feature size could be 0 Also, conditions for previous result on homology inference strong even for manifold The Case of Compact In contrast to Corollary A, now we have the

following (using Lemma B). Lemma C [Chazal and Oudot 2008]: Let be a finite set such that for some . Then for all such that and for all , we have , where is the homomorphism between homology groups induced by the canonical inclusion . Cech Complex Version In contrast to Corollary A, now we have the following (using Lemma B).

Lemma C [Chazal and Oudot 2008]: Let be a finite set such that for some . Then for all and for all , we have , where is the homomorphism between homology groups induced by the canonical inclusion . Rips Complex One more level of interleaving. Use the following extension of Lemma B: Given a sequence of homomorphisms between finite dimensional vector spaces, if then

Theorem [Homology Inference] [Chazal and Oudot 2008]: Let be a finite set such that for some . Then for all all , we have where is the homomorphism between homology groups induced by canonical inclusion . Summary of Homology Inference homotopy equivalent to Critical points of distance field

approximates (may be interleaving) E.g, [Niyogi, Smale, Weinberger, 2006], [Chazal and Oudot 2008] homotopy equivalent to interleaves at homology level

Nerve Lemma [Chazal and Oudot 2008] and interleave Derive homology inference from the interleaving sequence of homomorphisms Section 4: Applications of Persistence 2: Persistence summary + ML Persistence-based Framework Persistence-based feature vectorization +

machine learning Persistence images, kernels for persistence Input data: Further data Input data: diagrams, etc A collection of point sets, graphs, 2D/3D images, arbitrary compacts, etc A collection of

persistence-based feature representations (e.g, vectors) Space of persistence features analysis: Feed to data analysis pipelines, say SVM, CNN, etc Vectorization of persistence diagrams

Persistence landscapes Persistence images Persistence scale space kernel Persistence weighted Gaussian kernel Sliced Wasserstein distance Section 5: Applications of Persistence 3: Metric graph comparison Metric Graphs

A (finite) metric graph A length (metric) space , where is the underlying space of a -dimensional simplicial complex Given a topological graph Each edge is associated with a length Metric Graphs A (finite) metric graph

A length (metric) space , where is the underlying space of a -dimensional simplicial complex Given a topological graph Each edge is associated with a length induced shortest path metric Metric Graphs

A (finite) metric graph A length (metric) space , where is the underlying space of a -dimensional simplicial complex Given a topological graph Each edge is associated with a length induced shortest path metric metric graph

Metric graph or Introduction Metric graphs naturally occur in many applications Hidden space: graph-like structures Simple, non-linear structure behind data Introduction Metric graphs naturally occur in many

applications Hidden space: graph-like structures Simple, non-linear structure behind data http://www2.iap.fr/users/sousbie/web/html/indexd41d.html Introduction Metric graphs naturally occur in many applications

Hidden space: graph-like structures Simple, non-linear structure behind data An important problem in applications of metric graphs Comparing metric graphs Graph Matching Exact graph matching

Graph isomorphism testing Inexact Graph Matching Distance between two input (weighted) graphs Graph edit distance Integer quadratic programing (IQP) based Spectral graph theory based Lapacian family signatures to measure compatibility of

Optimization problems usually NPcorrespondences of graphformulated nodes/edges [Hu, Rustamov, hard, and various heuristic approaches developed in practice to improve efficiency Guibas, CVPR2013] Persistence-distortion based [Dey, Shi, W. 2015]

A new persistence-distortion distance Continuous distance measure Align the underlying space of input graphs Stable w.r.t. metric distortion in input graphs

Compare two input metric graphs and Draw upon a topological point of view Robust to noise in practice Polynomial-time algorithm to compute it Distance function to a basepoint For any consider defined as

, for any Consider the 0th persistence diagram Induced by super-level set filtration of Persistence-distortion Distance Given two input metric graphs and Persistence-distortion between and

Here, is the bottleneck distance between two persistence diagrams and Intuitively, each graph is mapped to a set of points in the space of persistence diagrams and , respectively This mapping is isometry-invariant The persistence distortion distance

is the Hausdorff distance between the two image sets Remark: One can replace 0th persistence diagrams with 0th zigzag-persistence diagrams. All subsequent results hold. Stability Theorem (Stability) By triangle inequality, this implies that given two

metric graphs and and their perturbations and , Proof sketch: where that achieves Show for and Use recent result on comparing Reeb graphs of [Bauer, Ge, W, SoCG 2014] Discrete Case

Given input metric graphs and , where where Discrete persistence-distortion between them: Theorem (Discrete-Computation)

Given and Let The discrete persistence-distortion distance can be computed in time. Proof sketch: : compute two sets of persistence diagrams and : compute each bottleneck distance

Modify the algorithm of [Efrat, Katz and Itai, 2001] compute Hausdorff distance between and Observation: where is the largest length of any edge in . For graphs with unit edge length, provides a 2-approximation of , that is: In general, how to use to provide an approximation of with multiplicative error?

Theorem (Continuous-Computation) Given and Let The persistence-distortion distance can be computed in time. If input graphs are trees, Then the persistence-distortion distance can be computed in time. Preliminary Exps (1)

Preliminary Exp (2) pairwise persistence-distortion distance between models