SafeGPT: Preventing Data Leakage and Unethical Outputs in Enterprise LLM Use

Published: January 10, 2026 | arXiv ID: 2601.06366v1

By: Pratyush Desai, Luoxi Tang, Yuqiao Meng, and more

Potential Business Impact:

Keeps company secrets safe from smart computers.

Business Areas:

Cloud Security, Information Technology, Privacy and Security

Large Language Models (LLMs) are transforming enterprise workflows, but they introduce security and ethics challenges when employees inadvertently share confidential data or generate policy-violating content. This paper proposes SafeGPT, a two-sided guardrail system that prevents sensitive data leakage and unethical outputs. SafeGPT integrates input-side detection and redaction, output-side moderation and reframing, and human-in-the-loop feedback. Experiments demonstrate that SafeGPT effectively reduces data leakage risk and biased outputs while maintaining user satisfaction.
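
The two-sided design described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how detection or moderation is performed, so the regular expressions, the blocked-term list, the fixed withheld message (a stand-in for reframing), and every name here (SENSITIVE_PATTERNS, BLOCKED_TERMS, redact_input, moderate_output, GuardedLLM) are assumptions made for illustration only.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical detectors; the paper's actual detection method is not given here.
SENSITIVE_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical policy list standing in for a real moderation model.
BLOCKED_TERMS = {"guaranteed returns"}

WITHHELD = "This response was withheld because it may violate policy."


def redact_input(prompt: str) -> tuple[str, list[str]]:
    """Input side: replace sensitive spans with placeholders before the model sees them."""
    found = []
    for label, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(prompt):
            found.append(label)
            prompt = pattern.sub(f"[{label}]", prompt)
    return prompt, found


def moderate_output(text: str) -> tuple[str, list[str]]:
    """Output side: withhold text that matches a blocked term (simplified 'reframing')."""
    hits = [term for term in sorted(BLOCKED_TERMS) if term in text.lower()]
    return (WITHHELD, hits) if hits else (text, hits)


@dataclass
class GuardedLLM:
    model: Callable[[str], str]  # any function mapping a prompt to a reply
    review_queue: list = field(default_factory=list)  # items for human review

    def ask(self, prompt: str) -> str:
        safe_prompt, leaks = redact_input(prompt)
        answer, violations = moderate_output(self.model(safe_prompt))
        if leaks or violations:
            # Human-in-the-loop: flagged interactions are queued for a reviewer.
            self.review_queue.append({"leaks": leaks, "violations": violations})
        return answer
```

In this sketch the model only ever receives the redacted prompt, and anything flagged on either side is queued so that a human reviewer could later adjust the patterns or the policy list.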