Quantum-Inspired Optimization Process for Data Imputation
By: Nishikanta Mohanty, Bikash K. Behera, Badshah Mukherjee, and more
Potential Business Impact:
Fills in missing health data for more reliable health predictions.
Data imputation is a critical step in data pre-processing, particularly for datasets with missing or unreliable values. This study introduces a novel quantum-inspired imputation framework evaluated on the UCI Diabetes dataset, which contains biologically implausible zero values standing in for missing entries across several clinical features. The method integrates Principal Component Analysis (PCA) with quantum-assisted rotations, optimized through gradient-free classical optimizers (COBYLA, Simulated Annealing, and Differential Evolution) to reconstruct missing values while preserving statistical fidelity. Reconstructed values are constrained within ±2 standard deviations of the original feature distributions, avoiding unrealistic clustering around central tendencies. The approach achieves a substantial and statistically significant improvement, including an average reduction of over 85% in Wasserstein distance and Kolmogorov-Smirnov test p-values between 0.18 and 0.22, compared to p-values > 0.99 for classical methods such as Mean, KNN, and MICE. The method also eliminates zero-value artifacts and enhances the realism and variability of imputed data. By combining quantum-inspired transformations with a scalable classical framework, this methodology provides a robust solution for imputation tasks in domains such as healthcare and AI pipelines, where data quality and integrity are crucial.
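The pipeline described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses a synthetic one-feature dataset, omits the PCA step for brevity, and substitutes a simple rotation-angle parametrization (each imputed value is mu + 2*sigma*sin(theta), which automatically enforces the ±2 standard-deviation constraint) as a stand-in for the paper's quantum-assisted rotations. COBYLA is one of the three gradient-free optimizers the abstract names, and the fit is scored with the same statistics the study reports (Wasserstein distance and the Kolmogorov-Smirnov test).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import wasserstein_distance, ks_2samp

rng = np.random.default_rng(42)

# Synthetic stand-in for one clinical feature (e.g. glucose-like values);
# the real study uses the UCI Diabetes dataset.
observed = rng.normal(loc=120.0, scale=15.0, size=200)
n_missing = 40
mu, sigma = observed.mean(), observed.std()

def impute(thetas):
    # Rotation-angle parametrization (illustrative, not the paper's circuit):
    # mu + 2*sigma*sin(theta) keeps every imputed value inside the
    # +/- 2 standard-deviation band required by the method.
    return mu + 2.0 * sigma * np.sin(thetas)

def objective(thetas):
    # Gradient-free target: make the imputed values distributionally
    # close to the observed ones (Wasserstein distance).
    return wasserstein_distance(observed, impute(thetas))

# COBYLA: one of the gradient-free optimizers named in the abstract.
x0 = rng.uniform(-1.0, 1.0, n_missing)
res = minimize(objective, x0=x0, method="COBYLA")
imputed = impute(res.x)

# Evaluate fidelity with the statistics the study reports.
wd = wasserstein_distance(observed, imputed)
ks_stat, ks_p = ks_2samp(observed, imputed)
print(f"Wasserstein distance: {wd:.3f}, KS p-value: {ks_p:.3f}")
```

Because the imputed values are parametrized through a bounded sine, no explicit clipping is needed; in the paper's multi-feature setting, the same idea would be applied per feature after projecting with PCA.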
Similar Papers
A PCA-based Data Prediction Method
Machine Learning (CS)
Fills in missing numbers in data sets.
Private Data Imputation
Cryptography and Security
Keeps private data safe while fixing missing parts.
Robust Simulation-Based Inference under Missing Data via Neural Processes
Machine Learning (CS)
Fixes broken data for smarter computer guesses.