Beyond I-Con: Exploring New Dimension of Distance Measures in Representation Learning
By: Jasmine Shone, Shaden Alshammari, Mark Hamilton, and more
Potential Business Impact:
Finds better ways for computers to learn by choosing better ways to measure differences between things.
The Information Contrastive (I-Con) framework revealed that over 23 representation learning methods implicitly minimize the KL divergence between a data-derived distribution and a learned distribution, each encoding similarities between data points. However, a KL-based loss may be misaligned with the true objective, and properties of KL divergence such as asymmetry and unboundedness may create optimization challenges. We present Beyond I-Con, a framework that enables systematic discovery of novel loss functions by exploring alternative statistical divergences and similarity kernels. Key findings: (1) on unsupervised clustering of DINO-ViT embeddings, we achieve state-of-the-art results by modifying the PMI algorithm to use total variation (TV) distance; (2) on supervised contrastive learning, we outperform the standard approach by using TV and a distance-based similarity kernel instead of KL and an angular kernel; (3) on dimensionality reduction, we achieve superior qualitative results and better performance on downstream tasks than SNE by replacing KL with a bounded f-divergence. Our results highlight the importance of considering divergence and similarity kernel choices in representation learning optimization.
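To make the abstract's core idea concrete, here is a minimal sketch (not the authors' code) of the two design choices it discusses: the divergence used to compare a data-derived neighborhood distribution with a learned one (KL vs. total variation), and the similarity kernel used to build the learned distribution (angular vs. distance-based). All function and variable names below are illustrative assumptions, not the Beyond I-Con implementation.

```python
# Illustrative sketch only: swapping divergences and similarity kernels
# in an I-Con-style neighborhood-matching loss. Names are assumptions.

import torch
import torch.nn.functional as F


def neighborhood_distribution(sim: torch.Tensor) -> torch.Tensor:
    """Row-normalize a similarity matrix into a distribution over neighbors,
    masking out self-similarity on the diagonal."""
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    return F.softmax(sim.masked_fill(mask, float("-inf")), dim=1)


def angular_kernel(z: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Cosine-similarity (angular) kernel, as in standard contrastive losses."""
    z = F.normalize(z, dim=1)
    return (z @ z.T) / temperature


def distance_kernel(z: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Negative squared Euclidean distance kernel (a distance-based alternative)."""
    return -torch.cdist(z, z).pow(2) / temperature


def kl_loss(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Mean row-wise KL(p || q): the divergence the I-Con framework identifies."""
    p, q = p.clamp_min(1e-12), q.clamp_min(1e-12)
    return (p * (p.log() - q.log())).sum(dim=1).mean()


def tv_loss(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Mean row-wise total variation distance: bounded and symmetric, unlike KL."""
    return 0.5 * (p - q).abs().sum(dim=1).mean()


# Toy usage: p comes from the "data" side (here, label agreement in a batch),
# q from the learned embeddings. Swapping kl_loss -> tv_loss and
# angular_kernel -> distance_kernel moves between points in the design space.
labels = torch.randint(0, 4, (32,))
p = neighborhood_distribution((labels[:, None] == labels[None, :]).float() * 10.0)
z = torch.randn(32, 16, requires_grad=True)
q = neighborhood_distribution(distance_kernel(z))
loss = tv_loss(p, q)
loss.backward()
```

In this sketch the choice of divergence and kernel is independent of the rest of the pipeline, which is the property the paper exploits to search over alternatives such as bounded f-divergences.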
Similar Papers
I-Con: A Unifying Framework for Representation Learning
Machine Learning (CS)
Unifies many computer learning methods for better results.
Let's Measure Information Step-by-Step: LLM-Based Evaluation Beyond Vibes
Machine Learning (CS)
Makes AI judge itself fairly, even without answers.
A novel k-means clustering approach using two distance measures for Gaussian data
Machine Learning (CS)
Finds hidden patterns in messy data more accurately.