ROBAD: Robust Adversary-aware Local-Global Attended Bad Actor Detection Sequential Model
By: Bing He, Mustaque Ahamad, Srijan Kumar
Potential Business Impact:
Finds fake users even when they try to hide.
Detecting bad actors is critical to ensuring the safety and integrity of internet platforms. Several deep learning-based models have been developed to identify such users, and these models should not only detect bad actors accurately but also be robust against adversarial attacks that aim to evade detection. However, past deep learning-based detection models do not meet this robustness requirement because they are sensitive to even minor changes in the input sequence. To address this issue, we focus on (1) improving the model's understanding capability and (2) enriching the model's knowledge so that it can recognize potential input modifications when making predictions. To achieve these goals, we create a novel transformer-based classification model, called ROBAD (RObust adversary-aware local-global attended Bad Actor Detection model), which uses the sequence of a user's posts to generate a user embedding for detecting bad actors. Specifically, ROBAD first leverages the transformer encoder block to encode each post bidirectionally, building a post embedding that captures local information at the post level. Next, it adopts the transformer decoder block to model the sequential pattern in the post embeddings via the attention mechanism, generating a sequence embedding that captures global information at the sequence level. Finally, to enrich the model's knowledge, embeddings of sequences modified by mimicked attackers are fed into a contrastive-learning-enhanced classification layer for sequence prediction. In essence, by capturing both local and global information (i.e., post- and sequence-level information) and leveraging the mimicked behaviors of bad actors during training, ROBAD is robust to adversarial attacks. Extensive experiments on Yelp and Wikipedia datasets show that ROBAD effectively detects bad actors even under state-of-the-art adversarial attacks.
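To make the pipeline concrete, here is a minimal PyTorch sketch of a ROBAD-style model, not the authors' released code: the module names, layer counts, embedding dimension, mean pooling, the `attack()` perturbation function, and the InfoNCE-style contrastive loss are all illustrative assumptions.

```python
# Illustrative ROBAD-style sketch (NOT the authors' implementation).
# Assumptions: posts arrive as padded token-ID tensors, d_model=128,
# mean pooling, and an InfoNCE-style contrastive objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RobadSketch(nn.Module):
    def __init__(self, vocab_size=30522, d_model=128, n_heads=4, n_classes=2):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        # (1) Local view: a transformer ENCODER block reads each post
        # bidirectionally and yields one post-level embedding.
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.post_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        # (2) Global view: a transformer DECODER block attends over the
        # sequence of post embeddings to produce a sequence embedding.
        dec_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.seq_decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        self.classifier = nn.Linear(d_model, n_classes)

    def embed(self, posts):
        # posts: (batch, n_posts, n_tokens) integer token IDs
        b, p, t = posts.shape
        tok = self.token_emb(posts.view(b * p, t))          # (b*p, t, d)
        post_emb = self.post_encoder(tok).mean(dim=1)       # local info per post
        post_emb = post_emb.view(b, p, -1)                  # (b, p, d)
        seq_emb = self.seq_decoder(post_emb, post_emb)      # attention over posts
        return seq_emb.mean(dim=1)                          # (b, d): global info

    def forward(self, posts):
        return self.classifier(self.embed(posts))

def contrastive_loss(z_clean, z_attacked, temperature=0.1):
    # InfoNCE-style loss (an assumption): pull each clean sequence embedding
    # toward its attacker-modified counterpart, push it away from other
    # sequences in the batch.
    z1 = F.normalize(z_clean, dim=-1)
    z2 = F.normalize(z_attacked, dim=-1)
    logits = z1 @ z2.t() / temperature                      # (b, b) similarities
    labels = torch.arange(z1.size(0), device=z1.device)     # positives on diagonal
    return F.cross_entropy(logits, labels)

# Training step (sketch). `attack(posts)` stands in for the mimicked
# attacker that perturbs the post sequence; it is hypothetical here.
# model = RobadSketch()
# z_clean = model.embed(posts)
# z_att = model.embed(attack(posts))
# loss = F.cross_entropy(model.classifier(z_clean), labels_cls) \
#        + contrastive_loss(z_clean, z_att)
```

The point of the sketch is the division of labor: the encoder supplies local post-level embeddings, the decoder's attention aggregates them into a global sequence embedding, and the contrastive term exposes the classifier to attacker-modified sequences during training.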
Similar Papers
Adversarially Robust Detection of Harmful Online Content: A Computational Design Science Approach
Machine Learning (CS)
Finds bad online words even when changed.
Explainable Transformer-Based Email Phishing Classification with Adversarial Robustness
Cryptography and Security
Catches tricky fake emails, even AI ones.
Leveraging large language models for SQL behavior-based database intrusion detection
Cryptography and Security
Finds sneaky people trying to steal data.