CTIGuardian: A Few-Shot Framework for Mitigating Privacy Leakage in Fine-Tuned LLMs
By: Shashie Dilhara Batan Arachchige, Benjamin Zi Hao Zhao, Hassan Jameel Asghar, et al.
Potential Business Impact:
Prevents fine-tuned AI models from leaking sensitive proprietary data.
Large Language Models (LLMs) are often fine-tuned to adapt their general-purpose knowledge to specific tasks and domains such as cyber threat intelligence (CTI). Fine-tuning is typically performed on proprietary datasets that may contain sensitive information, and owners expect their fine-tuned models not to inadvertently leak this information to potentially adversarial end users. Using CTI as a use case, we demonstrate that data-extraction attacks can recover sensitive information from models fine-tuned on CTI reports, underscoring the need for mitigation. Retraining the full model to eliminate this leakage is computationally expensive and impractical. We propose an alternative approach, which we call privacy alignment, inspired by safety alignment in LLMs. Just as safety alignment teaches a model to abide by safety constraints through a few examples, we enforce privacy alignment through few-shot supervision, integrating a privacy classifier and a privacy redactor, both handled by the same underlying LLM. We evaluate our system, called CTIGuardian, using the GPT-4o mini and Mistral-7B Instruct models, benchmarking against Presidio, a named entity recognition (NER) baseline. Results show that CTIGuardian provides a better privacy-utility trade-off than NER-based models. While we demonstrate its effectiveness on a CTI use case, the framework is generic enough to be applicable to other sensitive domains.
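The abstract describes a two-stage gate: a few-shot privacy classifier that flags sensitive model outputs, and a privacy redactor that rewrites them, with both roles served by the same underlying LLM. The sketch below illustrates how such a pipeline might be wired up. It is a minimal sketch under stated assumptions: the prompts, few-shot examples, and function names are illustrative, not CTIGuardian's published implementation, and it assumes an OpenAI-style chat API with GPT-4o mini (one of the two backbones evaluated) as the shared model.

```python
# Hypothetical privacy-alignment gate: one LLM acts as a few-shot privacy
# classifier and, when a leak is flagged, as a privacy redactor.
# Prompts and examples below are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"

# Few-shot examples teaching what counts as sensitive in CTI output (assumed).
CLASSIFIER_SHOTS = [
    {"role": "user", "content": "Attackers used C2 server 203.0.113.7."},
    {"role": "assistant", "content": "SENSITIVE"},
    {"role": "user", "content": "Phishing remains a common initial access vector."},
    {"role": "assistant", "content": "SAFE"},
]

def is_sensitive(text: str) -> bool:
    """Ask the LLM whether a candidate output leaks private CTI details."""
    messages = (
        [{"role": "system", "content": "Label the message SENSITIVE or SAFE."}]
        + CLASSIFIER_SHOTS
        + [{"role": "user", "content": text}]
    )
    reply = client.chat.completions.create(model=MODEL, messages=messages)
    return reply.choices[0].message.content.strip().upper().startswith("SENSITIVE")

def redact(text: str) -> str:
    """Ask the same LLM to rewrite the output with sensitive spans masked."""
    messages = [
        {"role": "system",
         "content": "Rewrite the message, replacing sensitive CTI details "
                    "(IPs, domains, victim names, file hashes) with [REDACTED]."},
        {"role": "user", "content": text},
    ]
    reply = client.chat.completions.create(model=MODEL, messages=messages)
    return reply.choices[0].message.content

def guard(model_output: str) -> str:
    """Classify, then redact only if needed, preserving utility on safe outputs."""
    return redact(model_output) if is_sensitive(model_output) else model_output
```

Because the gate wraps the fine-tuned model's outputs at inference time, this design avoids the full retraining the abstract calls impractical; only the few-shot prompts need to change when the notion of "sensitive" does.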
Similar Papers
Enterprise AI Must Enforce Participant-Aware Access Control
Cryptography and Security
Stops AI from sharing confidential company data.
Guarding Your Conversations: Privacy Gatekeepers for Secure Interactions with Cloud-Based AI Models
Cryptography and Security
Keeps private chat data safe from cloud AI models.
Evaluating the Robustness of Large Language Model Safety Guardrails Against Adversarial Attacks
Cryptography and Security
Makes AI guardrails more robust against adversarial instructions.