
Leveraging Streaming-Based Outlier Detection and SliceLine to Stop Heavily Distributed Bot Attacks

Conference: Black Hat Asia 2023

2023-05-11

Authors: Antoine Vastel, Konstantina Kontoudi


Abstract

In this presentation, we will discuss how to leverage streaming-based outlier detection and SliceLine to quickly and safely generate large volumes of rules/signatures that can be used to block malicious traffic. While ML use has become more and more widespread, rules are still relevant. Indeed, companies have invested heavily in efficient rule engines capable of quickly evaluating a significant volume of rules. Moreover, rules are often more convenient to create, manipulate and interpret, so they remain valuable alongside ML approaches.

We demonstrate that while SliceLine was originally designed to identify subsets of data where ML models perform badly, it can be adapted to generate a large number of rules linked to an attack in an unsupervised way, i.e. without using labeled data. Additionally, we use the bot detection problem to illustrate how SliceLine can generate a huge volume of malicious signatures on the fly.

We will also present our optimized open-source Python implementation of SliceLine and show how it can be used in a particular, but difficult, subset of bot detection: distributed credential stuffing attacks, where attackers leverage thousands of infected IP addresses to conduct their attack and bypass traditional security mechanisms such as rate limiting policies.

Through a real-world example, we will first explain how streaming-based detection can be used to detect such attacks, and then how we use data modeling to apply SliceLine on server-side signals (HTTP headers, TLS fingerprints, IP address, etc.) to identify and generate blocking signatures linked to a distributed attack. This approach enabled us to block more than 285M malicious login attempts last year across 59 customers.

Finally, we will explain how this approach generalizes to other security use cases besides bot detection and how it can be used in different rule engines.
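To make the idea of generating blocking signatures from server-side signals more concrete, here is a minimal sketch. It is not SliceLine and not the authors' implementation: it is a brute-force stand-in that enumerates small conjunctions of field values and keeps those that are much more frequent in a suspected attack window than in baseline traffic. The function name `candidate_signatures`, the field names `ua` and `tls`, the thresholds, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def candidate_signatures(attack, baseline, max_len=2, min_support=3, min_ratio=5.0):
    """Return conjunctions of (field, value) pairs over-represented in `attack`.

    Simplified illustration only: real SliceLine prunes the search space
    and scores slices differently; here we enumerate exhaustively.
    """
    def count_slices(requests):
        counts = Counter()
        for req in requests:
            items = sorted(req.items())  # canonical order for each conjunction
            for k in range(1, max_len + 1):
                for combo in combinations(items, k):
                    counts[combo] += 1
        return counts

    attack_counts = count_slices(attack)
    baseline_counts = count_slices(baseline)

    results = []
    for sig, n_attack in attack_counts.items():
        if n_attack < min_support:
            continue  # too few requests to justify a rule
        attack_rate = n_attack / len(attack)
        # Add-one smoothing so slices unseen in the baseline do not divide by zero.
        baseline_rate = (baseline_counts[sig] + 1) / (len(baseline) + 1)
        ratio = attack_rate / baseline_rate
        if ratio >= min_ratio:
            results.append((sig, n_attack, ratio))

    # Largest slices first, then the most over-represented ones.
    results.sort(key=lambda r: (-r[1], -r[2]))
    return results


# Hypothetical traffic: the attack window is dominated by one TLS fingerprint.
baseline = [{"ua": "Chrome", "tls": "t1"}] * 15 + [{"ua": "Firefox", "tls": "t2"}] * 5
attack = [{"ua": "Chrome", "tls": "bad"}] * 10 + [{"ua": "Chrome", "tls": "t1"}] * 2

signatures = candidate_signatures(attack, baseline)
for sig, support, ratio in signatures:
    print(dict(sig), support, round(ratio, 1))
```

In this toy data the common `ua` value alone is not flagged, because it is just as frequent in the baseline; only conjunctions involving the unusual TLS fingerprint survive. Each surviving conjunction could then be translated into a rule for whatever rule engine is in use.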
