AutoML-Med: A Framework for Automated Machine Learning in Medical Tabular Data
By: Riccardo Francia, Maurizio Leone, Giorgio Leonardi and more
Potential Business Impact:
Helps doctors find sick people from messy data.
Medical datasets are typically affected by issues such as missing values, class imbalance, heterogeneous feature types, and a high number of features relative to a small number of samples, all of which prevent machine learning models from performing well in classification and regression tasks. This paper introduces AutoML-Med, an Automated Machine Learning tool specifically designed to address these challenges by minimizing user intervention and identifying the optimal combination of preprocessing techniques and predictive models. AutoML-Med's architecture uses Latin Hypercube Sampling (LHS) to explore preprocessing methods, trains models using selected metrics, and uses the Partial Rank Correlation Coefficient (PRCC) to fine-tune the most influential preprocessing steps. Experimental results demonstrate AutoML-Med's effectiveness in two different clinical settings, where it achieves higher balanced accuracy and sensitivity, both crucial for identifying at-risk patients, than other state-of-the-art tools. AutoML-Med's ability to improve prediction results, especially on medical datasets with sparse data and class imbalance, highlights its potential to streamline Machine Learning applications in healthcare.
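The abstract does not give implementation details, so the following is only a minimal sketch of the two named techniques, not AutoML-Med's actual code. It assumes three hypothetical preprocessing parameters scaled to the unit interval (`imputation`, `oversampling`, `feature_fraction`) and replaces model training with a synthetic toy score, purely so that the LHS and PRCC steps can be run end to end.

```python
import numpy as np


def latin_hypercube(n_samples, n_dims, rng):
    """Return an (n_samples, n_dims) array in [0, 1) with exactly one
    point per stratum along every dimension."""
    # One random permutation of the stratum indices per dimension.
    strata = np.array([rng.permutation(n_samples) for _ in range(n_dims)]).T
    # Random position of each point inside its stratum.
    jitter = rng.random((n_samples, n_dims))
    return (strata + jitter) / n_samples


def rank(a):
    """Column-wise ranks 0..n-1 (no tie handling; inputs here are continuous)."""
    return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)


def prcc(X, y):
    """Partial rank correlation of each column of X with y,
    controlling for the remaining columns."""
    Xr = rank(X)
    yr = rank(y.reshape(-1, 1)).ravel()
    n, k = Xr.shape
    out = np.empty(k)
    for j in range(k):
        # Design matrix: intercept plus the ranks of all other parameters.
        others = np.column_stack([np.ones(n), np.delete(Xr, j, axis=1)])
        # Residuals after regressing out the other parameters.
        res_x = Xr[:, j] - others @ np.linalg.lstsq(others, Xr[:, j], rcond=None)[0]
        res_y = yr - others @ np.linalg.lstsq(others, yr, rcond=None)[0]
        out[j] = np.corrcoef(res_x, res_y)[0, 1]
    return out


# Hypothetical parameter names, chosen for illustration only.
param_names = ["imputation", "oversampling", "feature_fraction"]

rng = np.random.default_rng(0)
n = 200
samples = latin_hypercube(n, len(param_names), rng)

# Synthetic stand-in for "train a model and record the chosen metric":
# the score depends strongly on column 1 and only weakly on column 0.
score = 0.02 * samples[:, 0] + 0.3 * samples[:, 1] + rng.normal(0.0, 0.05, n)

coeffs = prcc(samples, score)
for name, c in zip(param_names, coeffs):
    print(f"{name}: PRCC = {c:+.2f}")
```

In a real pipeline each sampled row would be mapped to a concrete preprocessing configuration and scored by training a model. The parameters with the largest absolute PRCC would then be the natural candidates for the finer-grained optimization the abstract describes.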
Similar Papers
AutoMedic: An Automated Evaluation Framework for Clinical Conversational Agents with Medical Dataset Grounding
Computation and Language
Tests AI doctors in realistic patient talks.
AR-Med: Automated Relevance Enhancement in Medical Search via LLM-Driven Information Augmentation
Computation and Language
Finds the right health answers online, safely.
AutoMedEval: Harnessing Language Models for Automatic Medical Capability Evaluation
Computation and Language
Checks doctor AI answers for medical accuracy.