
Robust reduced rank regression under heavy-tailed noise and missing data via non-convex penalization

Published: December 30, 2025 | arXiv ID: 2512.24450v1

By: The Tien Mai

Reduced rank regression (RRR) is a fundamental tool for modeling multiple responses through low-dimensional latent structures, offering both interpretability and strong predictive performance in high-dimensional settings. Classical RRR methods, however, typically rely on squared loss and Gaussian noise assumptions, rendering them sensitive to heavy-tailed errors, outliers, and data contamination. Moreover, the presence of missing data, common in modern applications, further complicates reliable low-rank estimation. In this paper, we propose a robust reduced rank regression framework that simultaneously addresses heavy-tailed noise, outliers, and missing data. Our approach combines a robust Huber loss with nonconvex spectral regularization, specifically the minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD). Unlike convex nuclear-norm regularization, the proposed nonconvex penalties alleviate excessive shrinkage and enable more accurate recovery of the underlying low-rank structure. The method also accommodates missing data in the response matrix without requiring imputation. We develop an efficient proximal gradient algorithm based on alternating updates and tailored spectral thresholding. Extensive simulation studies demonstrate that the proposed methods substantially outperform nuclear-norm-based and non-robust alternatives under heavy-tailed noise and contamination. An application to a cancer cell line data set further illustrates the practical advantages of the proposed robust RRR framework. Our method is implemented in the R package rrpackrobust, available at https://github.com/tienmt/rrpackrobust.
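
To make the ingredients named in the abstract concrete, here is a minimal Python/NumPy sketch of a generic proximal gradient iteration with a Huber loss, an MCP penalty on the singular values, and a mask over missing responses. It is an illustration under stated assumptions, not the paper's algorithm and not the rrpackrobust API: the function names (`huber_psi`, `mcp_threshold`, `robust_rrr_sketch`), the default values of `gamma` and `delta`, the fixed step size, and the use of NaN to mark missing entries are all hypothetical choices made for this example. The paper's alternating updates and its SCAD variant are not reproduced here.

```python
import numpy as np


def huber_psi(r, delta):
    """Derivative of the Huber loss: identity on [-delta, delta], clipped outside."""
    return np.clip(r, -delta, delta)


def mcp_threshold(s, lam, gamma, step=1.0):
    """Proximal map of step * MCP for nonnegative values s (needs gamma > step).

    Values at or below step * lam are set to zero, values above gamma * lam
    are left unshrunk, and values in between are rescaled.
    """
    s = np.asarray(s, dtype=float)
    shrunk = np.maximum(s - step * lam, 0.0) / (1.0 - step / gamma)
    return np.where(s > gamma * lam, s, shrunk)


def robust_rrr_sketch(X, Y, lam, gamma=3.0, delta=1.345, n_iter=200):
    """Illustrative proximal gradient loop (assumed form, not the paper's method).

    X: (n, p) predictors; Y: (n, q) responses with NaN marking missing entries.
    The loss is a sum of Huber losses over observed entries, so lam is on
    that (unnormalized) scale.
    """
    observed = ~np.isnan(Y)
    # The Huber derivative is 1-Lipschitz, so ||X||_2^2 bounds the gradient's
    # Lipschitz constant and 1 / ||X||_2^2 is a safe fixed step size.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    if gamma <= step:
        raise ValueError("gamma must exceed the step size")
    B = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        # Missing entries contribute zero residual, so no imputation is needed.
        resid = np.where(observed, Y - X @ B, 0.0)
        grad = -X.T @ huber_psi(resid, delta)
        # Spectral thresholding: apply the MCP proximal map to singular values.
        U, s, Vt = np.linalg.svd(B - step * grad, full_matrices=False)
        B = (U * mcp_threshold(s, lam, gamma, step)) @ Vt
    return B
```

The contrast with nuclear-norm regularization shows up in `mcp_threshold`: soft-thresholding would subtract `step * lam` from every singular value, whereas here singular values above `gamma * lam` pass through unchanged, which is the reduced shrinkage the abstract refers to.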

Category: Statistics (Methodology)