Stochastic Momentum Methods for Non-smooth Non-Convex Finite-Sum Coupled Compositional Optimization
By: Xingyu Chen, Bokun Wang, Ming Yang, and more
Potential Business Impact:
Speeds up the training of machine learning models by reducing the number of optimization steps needed to reach a good solution.
Finite-sum Coupled Compositional Optimization (FCCO), characterized by its coupled compositional objective structure, emerges as an important optimization paradigm for addressing a wide range of machine learning problems. In this paper, we focus on a challenging class of non-convex non-smooth FCCO, where the outer functions are non-smooth weakly convex or convex and the inner functions are smooth or weakly convex. Existing state-of-the-art results face two key limitations: (1) a high iteration complexity of $O(1/\epsilon^6)$ under the assumption that the stochastic inner functions are Lipschitz continuous in expectation; (2) reliance on vanilla SGD-type updates, which are not suitable for deep learning applications. Our main contributions are twofold: (i) we propose stochastic momentum methods tailored for non-smooth FCCO that come with provable convergence guarantees; (ii) we establish a new state-of-the-art iteration complexity of $O(1/\epsilon^5)$. Moreover, we apply our algorithms to non-convex optimization problems with multiple smooth or weakly convex functional inequality constraints. By optimizing a smoothed hinge-penalty-based formulation, we achieve a new state-of-the-art complexity of $O(1/\epsilon^5)$ for finding a (nearly) $\epsilon$-level KKT solution. Experiments on three tasks demonstrate the effectiveness of the proposed algorithms.
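For context, the coupled compositional structure referenced above typically takes the following form; the notation below follows the standard FCCO setup and is an assumption of this summary rather than a quotation from the paper:

$$\min_{x \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} f_i\big(g_i(x)\big), \qquad g_i(x) = \mathbb{E}_{\xi_i}\big[g_i(x; \xi_i)\big],$$

where each outer function $f_i$ is non-smooth weakly convex or convex and each inner function $g_i$ is smooth or weakly convex. The constrained setting mentioned in the abstract, minimizing $f_0(x)$ subject to $f_i(x) \le 0$ for $i = 1, \dots, n$, can be cast in this form through a hinge penalty such as

$$\min_{x} \; f_0(x) + \frac{\beta}{n} \sum_{i=1}^{n} \max\{0, f_i(x)\},$$

where the max term plays the role of the non-smooth outer function; the paper optimizes a smoothed variant, and $\beta$ and the exact smoothing are not specified here.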
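Below is a minimal sketch of one momentum-style update for such compositional objectives, included only to make the mechanics concrete. It is not the paper's algorithm: the moving-average tracker `u` for the inner values, the momentum buffer `v`, the scalar-valued inner functions, and all step-size constants are assumptions of this sketch.

```python
import numpy as np

# Illustrative momentum-style update for a compositional objective
#   (1/n) * sum_i f_i(g_i(x)),  with scalar-valued inner functions g_i.
# NOT the paper's exact method: u, v, and all constants are assumptions.

def fcco_momentum_step(x, u, v, batch, g, grad_g, subgrad_f,
                       gamma=0.1, beta=0.1, eta=0.01):
    """One update over a sampled block `batch` of outer indices.

    x               -- current iterate (np.ndarray, shape (d,))
    u               -- dict: moving-average estimates of inner values g_i(x)
    v               -- momentum (sub)gradient estimate (np.ndarray, shape (d,))
    g(i, x)         -- unbiased stochastic estimate of g_i(x), a scalar
    grad_g(i, x)    -- stochastic gradient of g_i at x (np.ndarray, shape (d,))
    subgrad_f(i, s) -- a subgradient of the outer function f_i at scalar s
    """
    grad_est = np.zeros_like(x)
    for i in batch:
        # Moving average of the inner value controls the bias that arises
        # from composing a noisy estimate of g_i with the nonlinear f_i.
        u[i] = (1.0 - gamma) * u.get(i, g(i, x)) + gamma * g(i, x)
        # Chain rule: a (sub)gradient of f_i(g_i(x)) is f_i'(u_i) * grad g_i(x).
        grad_est += subgrad_f(i, u[i]) * grad_g(i, x)
    grad_est /= len(batch)

    # Momentum (moving average) on the gradient estimate, then a descent step.
    v = (1.0 - beta) * v + beta * grad_est
    x = x - eta * v
    return x, u, v
```

The moving average on `u` is the standard device for taming the compositional bias, while the momentum buffer `v` replaces the vanilla SGD-type update that the abstract identifies as a limitation of prior work.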
Similar Papers
Single-loop Algorithms for Stochastic Non-convex Optimization with Weakly-Convex Constraints
Machine Learning (CS)
Makes AI learn better with fewer steps.
Stochastic Difference-of-Convex Optimization with Momentum
Machine Learning (CS)
Makes computer learning work with smaller batches of data.
Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization
Machine Learning (CS)
Makes many computers learn together faster while sending less data.