O-EENC-SD: Efficient Online End-to-End Neural Clustering for Speaker Diarization

Published: December 17, 2025 | arXiv ID: 2512.15229v1

By: Elio Gruttadauria, Mathieu Fontaine, Jonathan Le Roux and more

We introduce O-EENC-SD, an end-to-end online speaker diarization system based on EEND-EDA that features a novel RNN-based stitching mechanism for online prediction. In particular, we develop a centroid refinement decoder whose usefulness is assessed through a rigorous ablation study. Our system provides two key advantages over existing methods: it is hyperparameter-free, unlike unsupervised clustering approaches, and it is more efficient than current online end-to-end methods, which are computationally costly. We demonstrate that O-EENC-SD is competitive with the state of the art in the two-speaker conversational telephone speech domain, as tested on the CallHome dataset. Our results show that O-EENC-SD provides a favorable trade-off between diarization error rate (DER) and complexity, even when working on independent chunks with no overlap, which makes the system extremely efficient.
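
For readers unfamiliar with the metric, DER (diarization error rate) is conventionally the sum of missed speech, false-alarm speech, and speaker confusion, divided by the total reference speech. The sketch below is not taken from the paper: it is a minimal frame-level illustration that assumes binary per-speaker activity labels, the same number of speakers in reference and hypothesis, and no forgiveness collar. The function name `frame_der` is hypothetical.

```python
from itertools import permutations


def frame_der(ref, hyp):
    """Frame-level DER under the best speaker permutation.

    ref, hyp: lists of frames; each frame is a list of 0/1 activity
    flags, one per speaker (assumption: same speaker count in both).
    """
    n_spk = len(ref[0])
    total_ref = sum(sum(frame) for frame in ref)
    if total_ref == 0:
        raise ValueError("reference contains no speech")

    best_errors = None
    # Speaker labels are arbitrary, so try every hypothesis-to-reference mapping.
    for perm in permutations(range(n_spk)):
        errors = 0
        for r, h in zip(ref, hyp):
            h_perm = [h[p] for p in perm]
            n_ref, n_hyp = sum(r), sum(h_perm)
            n_correct = sum(a and b for a, b in zip(r, h_perm))
            miss = max(0, n_ref - n_hyp)          # reference speech not detected
            false_alarm = max(0, n_hyp - n_ref)   # speech detected but absent
            confusion = min(n_ref, n_hyp) - n_correct  # attributed to wrong speaker
            errors += miss + false_alarm + confusion
        if best_errors is None or errors < best_errors:
            best_errors = errors

    return best_errors / total_ref


# Two speakers, four frames; the hypothesis has swapped labels and
# misses one speaker in the overlapped last frame.
ref = [[1, 0], [1, 0], [0, 1], [1, 1]]
hyp = [[0, 1], [0, 1], [1, 0], [0, 1]]
print(frame_der(ref, hyp))  # one missed speaker-frame out of five
```

Because the score is taken under the best permutation, the swapped speaker labels in the example cost nothing; only the missed speaker in the overlapped frame counts as an error.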

Category
Computer Science:
Machine Learning (CS)