Too Many or Too Few? Sampling Bounds for Topological Descriptors

Published: November 15, 2025 | arXiv ID: 2511.12059v1

By: Brittany Terese Fasy, Maksym Makarchuk, Samuel Micka and more

Potential Business Impact:

Maps shapes perfectly using fewer directions.

Business Areas:
Big Data, Data and Analytics

Topological descriptors, such as the Euler characteristic function and the persistence diagram, have grown increasingly popular for representing complex data. Recent work showed that a carefully chosen set of these descriptors encodes all of the geometric and topological information about a shape in R^d. In practice, epsilon nets are often used to find samples in one of two extremes. On one hand, making strong geometric assumptions about the shape allows us to choose epsilon small enough (corresponding to a high enough density sample) in order to guarantee a faithful representation, resulting in oversampling. On the other hand, if we choose a larger epsilon in order to allow faster computations, this leads to an incomplete description of the shape and a discretized transform that lacks theoretical guarantees. In this work, we investigate how many directions are really needed to represent geometric simplicial complexes, exploring both synthetic and real-world datasets. We provide constructive proofs that help establish size bounds and an experimental investigation giving insights into the consequences of over- and undersampling.
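To make the notion of a directional descriptor concrete, the following is a minimal sketch of an Euler characteristic function computed for a small geometric simplicial complex in R^2. It is an illustration under stated assumptions, not the authors' method: the function names `euler_characteristic_curve` and `sample_directions` are hypothetical, the lower-star convention (a simplex appears at the largest height among its vertices) is assumed, and the evenly spaced directions are a naive stand-in for the epsilon-net sampling the abstract mentions.

```python
import math

def euler_characteristic_curve(vertices, simplices, direction):
    """Return sorted (height, chi) pairs for the sublevel sets in `direction`.

    Assumes a lower-star filtration: a simplex enters at the largest
    height (dot product with `direction`) among its vertices.
    """
    heights = [sum(c * d for c, d in zip(v, direction)) for v in vertices]
    events = {}
    for simplex in simplices:
        entry = max(heights[i] for i in simplex)
        # A k-simplex has k + 1 vertices and contributes (-1)^k.
        sign = 1 if len(simplex) % 2 == 1 else -1
        events[entry] = events.get(entry, 0) + sign
    curve, chi = [], 0
    for h in sorted(events):
        chi += events[h]
        curve.append((h, chi))
    return curve

def sample_directions(n):
    """n evenly spaced unit vectors on the circle (an assumed sampling scheme)."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]

# Hypothetical example shape: one filled triangle with all of its faces.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
simplices = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]

curves = [euler_characteristic_curve(vertices, simplices, s)
          for s in sample_directions(8)]
```

Each entry of `curves` is one descriptor. The question the paper studies is how many such directions are needed before the collection determines the complex, and this sketch makes no claim about that number.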

Page Count
19 pages

Category
Computer Science:
Computational Geometry