Robust Estimation for Dependent Binary Network Data
By: Tianyu Liu, Somabha Mukherjee, Abhik Ghosh
Potential Business Impact:
Finds hidden connections in messy data.
We consider the problem of learning the interaction strength between the nodes of a network from dependent binary observations residing on those nodes, generated from a Markov Random Field (MRF). Since such observations may be corrupted or noisy in practice, particularly in larger networks, it is important to estimate the parameters of the underlying true MRF robustly, accounting for this inherent contamination in the observed data. However, it is well known that classical likelihood-based and pseudolikelihood-based approaches are highly sensitive to even a small amount of data contamination. In this paper, we therefore propose a density power divergence (DPD) based robust generalization of the computationally efficient maximum pseudolikelihood (MPL) estimator of the interaction strength parameter, and derive its rate of consistency under the pure model. Moreover, we show that the gross error sensitivities of the proposed DPD-based estimators are significantly smaller than that of the MPL estimator, thereby theoretically justifying the greater (local) robustness of the former under contaminated settings. We also demonstrate the superior (finite sample) performance of the DPD-based variants over the traditional MPL estimator on a number of synthetically generated contaminated network datasets. Finally, we apply our proposed DPD-based estimators to learn the network interaction strength in several real datasets from diverse domains: social science, neurobiology, and genomics.
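To make the contrast between the MPL and DPD-based estimators concrete, here is a minimal sketch under stated assumptions. None of it comes from the paper itself. It assumes an Ising-type MRF with spins in {-1, +1}, a known coupling matrix `J`, and a single interaction strength `beta`, so that each node's conditional law is P(X_i = s | rest) = 1 / (1 + exp(-2 * beta * s * m_i)), where m_i is the local field. The DPD objective below is the standard density power divergence applied to each conditional distribution with tuning parameter `alpha`; the paper's exact definition may differ. All function names are hypothetical, and the grid search is only a stand-in for a proper one-dimensional optimizer.

```python
import math

def local_fields(J, x):
    """m_i = sum_{j != i} J[i][j] * x[j] for every node i."""
    n = len(x)
    return [sum(J[i][j] * x[j] for j in range(n) if j != i) for i in range(n)]

def cond_prob(beta, s, m):
    """Assumed Ising conditional: P(X_i = s | rest), with s in {-1, +1}."""
    return 1.0 / (1.0 + math.exp(-2.0 * beta * s * m))

def neg_log_pseudolikelihood(beta, x, m):
    """Classical MPL objective: minus the sum of log conditional probabilities."""
    return -sum(math.log(cond_prob(beta, xi, mi)) for xi, mi in zip(x, m))

def dpd_pseudo_objective(beta, x, m, alpha):
    """Assumed DPD analogue (alpha > 0): for each node, sum the conditional
    probabilities raised to (1 + alpha), minus (1 + 1/alpha) times the
    probability of the observed spin raised to alpha."""
    total = 0.0
    for xi, mi in zip(x, m):
        p_plus = cond_prob(beta, 1, mi)
        p_minus = 1.0 - p_plus
        p_obs = p_plus if xi == 1 else p_minus
        total += (p_plus ** (1 + alpha) + p_minus ** (1 + alpha)
                  - (1 + 1 / alpha) * p_obs ** alpha)
    return total

def grid_minimize(f, lo, hi, steps):
    """Crude minimizer: evaluate f on an evenly spaced grid over [lo, hi]."""
    best_b, best_v = lo, f(lo)
    for k in range(1, steps + 1):
        b = lo + (hi - lo) * k / steps
        v = f(b)
        if v < best_v:
            best_b, best_v = b, v
    return best_b

# Toy example (made-up data): a ring of 10 nodes, each coupled to its two
# neighbours with weight 1/2, and one observed spin configuration.
n = 10
J = [[0.0] * n for _ in range(n)]
for i in range(n):
    J[i][(i + 1) % n] = 0.5
    J[i][(i - 1) % n] = 0.5
x = [1, 1, 1, 1, -1, 1, 1, -1, -1, -1]
m = local_fields(J, x)

beta_mpl = grid_minimize(lambda b: neg_log_pseudolikelihood(b, x, m), 0.0, 2.0, 200)
beta_dpd = grid_minimize(lambda b: dpd_pseudo_objective(b, x, m, 0.5), 0.0, 2.0, 200)
print("MPL estimate:", beta_mpl)
print("DPD estimate (alpha = 0.5):", beta_dpd)
```

In this form, adding 1/alpha per node to the DPD objective and letting `alpha` tend to 0 recovers the negative log-pseudolikelihood, which is the sense in which a DPD-based estimator generalizes MPL. Larger `alpha` downweights observations that are improbable under the model, typically trading some efficiency for robustness.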
Similar Papers
Asymptotic breakdown point analysis of the minimum density power divergence estimator under independent non-homogeneous setups
Statistics Theory
Finds bad data that messes up computer guesses.
Robust Analysis for Resilient AI System
Applications
Finds hidden problems in factory machines.