Catch me, Yes we can! – Pwning Social Engineers using Natural Language Processing Techniques in Real-Time

Conference: BlackHat USA 2018



The presentation argues that defenders need an automated tool to detect social engineering attacks and introduces one that analyzes message text in real time to flag suspicious requests.
  • Social engineering attacks are a major threat to cybersecurity
  • Current defense mechanisms rely on human detection and are not foolproof
  • The presented tool analyzes text to detect suspicious activity and can be connected to various communication platforms
  • The tool currently prints a text alert but can be modified to provide other forms of warning
  • The tool is a step towards raising the bar for attackers and making them work harder
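The pluggable design described in the bullets above (any communication platform in, any warning channel out) can be sketched as a detector with a swappable alert callback. This is a hypothetical illustration, not the released tool's actual API; the class and function names are invented, and the toy detector stands in for the real NLP pipeline.

```python
# Hypothetical sketch of the architecture described above: the detector's
# output goes to a callback, so the default text alert can be swapped for
# any other warning channel (chat message, email, SIEM event, ...).

def text_alert(message, reason):
    """Default warning channel: print a text alert, as the tool currently does."""
    print(f"WARNING: suspicious message ({reason}): {message!r}")

class SocialEngineeringMonitor:
    def __init__(self, is_suspicious, alert=text_alert):
        self.is_suspicious = is_suspicious  # detection function (NLP pipeline)
        self.alert = alert                  # pluggable warning channel

    def handle(self, message):
        """Run detection on one incoming message; fire an alert if it trips."""
        verdict = self.is_suspicious(message)
        if verdict:
            self.alert(message, verdict)
        return bool(verdict)

# Toy detector standing in for the real question/command analysis.
monitor = SocialEngineeringMonitor(
    is_suspicious=lambda m: "malicious command" if "password" in m.lower() else None,
)
monitor.handle("Send me your password now")  # triggers the default text alert
```

Because the alert is just a callable, connecting the monitor to a different platform only means passing a different function, which matches the claim that the text alert "can be modified to provide other forms of warning."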
The presenter shared an example of a phishing email that successfully tricked a high-profile individual by using basic obfuscation techniques and a fake URL. The presenter also discussed the emerging threat of deep fakes and the potential for generative adversarial networks to create convincing fake audio. These examples illustrate the need for better defense mechanisms against social engineering attacks.


Social engineering is a major problem, yet little progress has been made in stopping it beyond the detection of email phishing. Social engineering attacks are launched via many vectors besides email, including phone, in-person conversation, and messaging. Detecting these non-email attacks requires a content-based approach that analyzes the meaning of the attack message. We observe that any social engineering attack must either ask a question whose answer is private or command the victim to perform a forbidden action.

Our approach uses natural language processing (NLP) techniques to detect questions and commands in messages and determine whether they are malicious. Question answering, a hot topic in information extraction, attempts to provide answers to factoid questions. Although the current state of the art in question answering is imperfect, we have found that even approximate answers are sufficient to determine whether the answer to a question is private. Commands are evaluated by summarizing their meaning as the combination of the sentence's main verb and its direct object; the resulting verb-object pairs are compared against a blacklist to see if they are malicious.

We have tested this approach on over 187,000 phishing and non-phishing emails. We discuss the false positives and false negatives and why they are not an issue in a system deployed to detect non-email attacks. In the talk, demos will be shown and tools will be released so that attendees can explore our approach for themselves.
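The verb-object blacklist check for commands can be illustrated with a minimal sketch. A real implementation would use a dependency parser (e.g. spaCy) to find the main verb and its direct object; here a tiny verb lexicon and a naive "object = last word" heuristic stand in for the parser, and the blacklist entries are illustrative, not the authors' actual list.

```python
# Simplified sketch of the verb-object blacklist check described above.
# Assumptions: the verb lexicon, the blacklist contents, and the naive
# object heuristic are all invented for illustration.

MALICIOUS_PAIRS = {            # hypothetical blacklist of (verb, object) pairs
    ("send", "password"),
    ("disable", "firewall"),
    ("wire", "money"),
    ("open", "attachment"),
}

VERBS = {"send", "disable", "wire", "open", "click", "forward"}

def verb_object_pair(sentence):
    """Return the first (verb, direct-object) pair found, else None.

    Stand-in for dependency parsing: take the first known verb and,
    naively, the final word of the sentence as its direct object.
    """
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    for i, w in enumerate(words):
        if w in VERBS and i + 1 < len(words):
            return (w, words[-1])
    return None

def is_malicious_command(sentence):
    """Flag a command whose verb-object summary appears on the blacklist."""
    return verb_object_pair(sentence) in MALICIOUS_PAIRS

print(is_malicious_command("Please send me your password"))  # True
print(is_malicious_command("Please send me the agenda"))     # False
```

The key design point survives the simplification: the whole command is reduced to one (verb, object) pair, so the blacklist stays small and matching is a set lookup rather than free-text pattern matching.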