Enhancing Password Security Through a High-Accuracy Scoring Framework Using Random Forests
By: Muhammed El Mustaqeem Mazelan, Noor Hazlina Abdul, Nouar AlDahoul
Potential Business Impact:
Makes passwords much harder for hackers to guess.
Password security plays a crucial role in cybersecurity, yet traditional password strength meters, which rely on static rules like character-type requirements, often fail. Such rules are easily satisfied by common password patterns (e.g., 'P@ssw0rd1!'), giving users a false sense of security. To address this, we implement and evaluate a password strength scoring system by comparing four machine learning models, Random Forest (RF), Support Vector Machine (SVM), Convolutional Neural Network (CNN), and Logistic Regression, on a dataset of over 660,000 real-world passwords. Our primary contribution is a novel hybrid feature engineering approach that captures nuanced vulnerabilities missed by standard metrics. We introduce features such as leetspeak-normalized Shannon entropy to assess true randomness, pattern detection for keyboard walks and sequences, and character-level TF-IDF n-grams to identify frequently reused substrings from breached password datasets. Our RF model performed best, reaching 99.12% accuracy on a held-out test set. Crucially, the interpretability of the Random Forest model allows for feature importance analysis, providing a clear pathway to developing security tools that offer specific, actionable feedback to users. This study bridges the gap between predictive accuracy and practical usability, resulting in a high-performance scoring system that not only reduces password-based vulnerabilities but also empowers users to make more informed security decisions.
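To make two of the features named in the abstract concrete, here is a minimal Python sketch of leetspeak-normalized Shannon entropy and keyboard-walk detection. It is an illustration under stated assumptions, not the authors' implementation: the substitution table `LEET_MAP`, the row list `KEYBOARD_ROWS`, the `min_len` threshold, and all function names are hypothetical, since the abstract does not specify them. The TF-IDF n-gram features and the Random Forest classifier are omitted.

```python
import math
from collections import Counter

# Assumed leetspeak substitutions; the paper's actual mapping is not given.
LEET_MAP = str.maketrans({
    "@": "a", "4": "a", "3": "e", "1": "l", "!": "i",
    "0": "o", "$": "s", "5": "s", "7": "t",
})

# Assumed keyboard rows (QWERTY layout) used to spot horizontal walks.
KEYBOARD_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm", "1234567890")


def normalize_leet(password: str) -> str:
    """Lowercase the password and undo common leetspeak substitutions."""
    return password.lower().translate(LEET_MAP)


def shannon_entropy(text: str) -> float:
    """Per-character Shannon entropy (in bits) of the character distribution."""
    if not text:
        return 0.0
    n = len(text)
    counts = Counter(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def leet_normalized_entropy(password: str) -> float:
    """Entropy computed after leetspeak normalization."""
    return shannon_entropy(normalize_leet(password))


def has_keyboard_walk(password: str, min_len: int = 4) -> bool:
    """True if the password contains min_len adjacent keys from one row,
    read left-to-right or right-to-left."""
    lowered = password.lower()
    for row in KEYBOARD_ROWS:
        for seq in (row, row[::-1]):
            for i in range(len(seq) - min_len + 1):
                if seq[i:i + min_len] in lowered:
                    return True
    return False
```

Under this assumed mapping, `normalize_leet("P@ssw0rd1!")` yields `"passwordli"`, which exposes the dictionary word hidden behind the substitutions. Normalization can also collapse apparent variety: `"a@4A"` has four distinct characters before normalization but only one afterwards, so its normalized entropy drops to zero.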
Similar Papers
An explainable Recursive Feature Elimination to detect Advanced Persistent Threats using Random Forest classifier
Cryptography and Security
Finds hidden computer attacks with clear reasons.
Security Bug Report Prediction Within and Across Projects: A Comparative Study of BERT and Random Forest
Cryptography and Security
Finds security problems in computer code faster.
Machine Learning-Based AES Key Recovery via Side-Channel Analysis on the ASCAD Dataset
Cryptography and Security
Finds secret codes by listening to computer signals.