The General Expiration Streaming Model: Diameter, $k$-Center, Counting, Sampling, and Friends
By: Lotte Blank, Sergio Cabello, MohammadTaghi Hajiaghayi and more
Potential Business Impact:
Tracks fast-changing data without storing everything.
An important thread in the study of data-stream algorithms focuses on settings where stream items are active only for a limited time. We introduce a new expiration model, where each item arrives with its own expiration time. The special case where items expire in the order that they arrive, which we call consistent expirations, contains the classical sliding-window model of Datar, Gionis, Indyk, and Motwani [SICOMP 2002] and its timestamp-based variant of Braverman and Ostrovsky [FOCS 2007]. Our first set of results presents algorithms (in the expiration streaming model) for several fundamental problems, including approximate counting, uniform sampling, and weighted sampling, by efficiently tracking active items without explicitly storing them all. Naturally, these algorithms have many immediate applications to other problems. Our second and main set of results designs algorithms (in the expiration streaming model) for the diameter and $k$-center problems, where items are points in a metric space. Our results significantly extend those known for the special case of sliding-window streams by Cohen-Addad, Schwiegelshohn, and Sohler [ICALP 2016], and include a strictly better approximation factor for the diameter in the important special case of high-dimensional Euclidean space. We develop new decomposition and coordination techniques, along with a geometric dominance framework, to filter out redundant points based on both temporal and spatial proximity.
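To make the expiration model concrete, the following is a minimal sketch of a naive baseline that stores every active item and answers count and diameter queries exactly. It is not the paper's algorithm, which avoids storing all active items. The class name `NaiveExpirationStream`, its methods, the convention that an item is active at time `now` exactly when `now < expiration_time`, and the use of Euclidean distance are all assumptions made for illustration.

```python
import heapq
import math
from itertools import combinations


class NaiveExpirationStream:
    """Store-everything baseline for the expiration model (illustrative only)."""

    def __init__(self):
        self._heap = []  # min-heap of (expiration_time, seq, point)
        self._seq = 0    # tie-breaker so heap entries never compare points

    def insert(self, point, expiration_time):
        # Each item arrives with its own expiration time.
        heapq.heappush(self._heap, (expiration_time, self._seq, point))
        self._seq += 1

    def _expire(self, now):
        # Assumed convention: an item is active iff now < expiration_time.
        # Expired items are discarded, so query times must be non-decreasing.
        while self._heap and self._heap[0][0] <= now:
            heapq.heappop(self._heap)

    def active_count(self, now):
        self._expire(now)
        return len(self._heap)

    def diameter(self, now):
        # Exact diameter of the active points by brute force, O(n^2) distances.
        self._expire(now)
        points = [p for _, _, p in self._heap]
        return max(
            (math.dist(p, q) for p, q in combinations(points, 2)),
            default=0.0,
        )


stream = NaiveExpirationStream()
stream.insert((0.0, 0.0), expiration_time=5)
stream.insert((3.0, 4.0), expiration_time=2)   # expires before the first item
stream.insert((1.0, 0.0), expiration_time=10)

count_at_1 = stream.active_count(now=1)
diam_at_1 = stream.diameter(now=1)
count_at_2 = stream.active_count(now=2)
diam_at_2 = stream.diameter(now=2)
```

In this sketch the second item expires before the first, which a sliding window cannot express. Inserting every item with expiration time equal to its arrival time plus a fixed window length would give consistent expirations, the special case that contains the sliding-window model. The baseline uses space linear in the number of active items, which is the cost the algorithms described above are designed to avoid.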
Similar Papers
Dynamic Diameter in High-Dimensions against Adaptive Adversary and Beyond
Data Structures and Algorithms
Keeps data points organized even when they change.
General Coverage Models: Structure, Monotonicity, and Shotgun Sequencing
Information Theory
Finds how many tries to see all DNA pieces.
Unbiased Insights: Optimal Streaming Algorithms for $\ell_p$ Sampling, the Forget Model, and Beyond
Data Structures and Algorithms
Finds patterns in huge data streams using less space.