Score: 0

SAGE: A Context-Aware Approach for Mining Privacy Requirements Relevant Reviews from Mental Health Apps

Published: July 11, 2025 | arXiv ID: 2507.09051v2

By: Aakash Sorathiya, Gouri Ginde

Potential Business Impact:

Finds app reviews about privacy worries.

Business Areas:

Semantic Search Internet Services

Mental health (MH) apps often require sensitive user data to customize services for mental wellness needs. However, such data collection practices in some MH apps raise significant privacy concerns for users. These concerns are often mentioned in app reviews, but other feedback categories, such as reliability and usability, tend to take precedence. This poses a significant challenge in automatically identifying privacy requirements-relevant reviews (privacy reviews) that can be utilized to extract privacy requirements and address users' privacy concerns. Thus, this study introduces SAGE, a context-aware approach to automatically mining privacy reviews from MH apps using Natural Language Inference (NLI) with MH domain-specific privacy hypotheses (provides domain-specific context awareness) and a GPT model (eliminates the need for fine-tuning). The quantitative evaluation of SAGE on a dataset of 204K app reviews achieved an F1 score of 0.85 without any fine-tuning, outperforming the fine-tuned baseline classifiers BERT and T5. Furthermore, SAGE extracted 748 privacy reviews previously overlooked by keyword-based methods, demonstrating its effectiveness through qualitative evaluation. These reviews can later be refined into actionable privacy requirement artifacts.