Don't @ Me: Hunting Twitter Bots at Scale

Conference: BlackHat USA 2018

2018-08-08

Summary

The presentation shows how targeted social network mapping can be used to identify and unravel botnets on Twitter. The researchers shared their methodology and findings with Twitter and the community, and open-sourced their code to encourage collaboration in identifying bots at scale.

Targeted social network mapping can be used to gather large Twitter datasets and accurately identify bots within them
Machine learning algorithms can be used to classify bots based on identifying characteristics
The researchers shared their methodology and findings with Twitter and the community, and open-sourced their code to encourage collaboration in identifying bots at scale

The researchers found that bots on Twitter can be classified into three types: spam bots, fake follower bots, and amplification bots. They used machine learning algorithms to classify accounts based on identifying characteristics and to output a probability that an account is a bot. They also split their data and evaluated different algorithms to test the model's accuracy. They conclude that by working together, the community can build healthier social networks that support the sharing of ideas.
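The classification step described above (split the data, fit a model, output a bot probability, measure accuracy) can be sketched roughly as follows. This is a minimal illustration and not the researchers' code: it uses a hand-written logistic regression rather than any of the algorithms they evaluated, the two features are hypothetical placeholders (for example, a scaled tweet rate and a retweet fraction), and the accounts are synthetic toy data generated only for the demo.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_test_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle the data and hold out a fraction for evaluation."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def fit_logistic(rows, labels, lr=0.1, epochs=500):
    """Fit weights and a bias with plain stochastic gradient descent."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def bot_probability(w, b, x):
    """Probability, between 0 and 1, that feature vector x belongs to a bot."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def accuracy(w, b, rows, labels, threshold=0.5):
    """Fraction of accounts whose thresholded prediction matches the label."""
    hits = sum((bot_probability(w, b, x) >= threshold) == bool(y)
               for x, y in zip(rows, labels))
    return hits / len(rows)


def make_toy_accounts(n=200, seed=1):
    """Synthetic, made-up accounts: label 1 (bot) gets higher feature values."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        is_bot = i % 2
        center = 0.8 if is_bot else 0.2
        # Two hypothetical features, each clipped to the range [0, 1].
        rows.append([min(1.0, max(0.0, rng.gauss(center, 0.1))) for _ in range(2)])
        labels.append(is_bot)
    return rows, labels


rows, labels = make_toy_accounts()
x_train, y_train, x_test, y_test = train_test_split(rows, labels)
weights, bias = fit_logistic(x_train, y_train)
print("held-out accuracy:", accuracy(weights, bias, x_test, y_test))
print("P(bot) for a suspicious-looking account:", bot_probability(weights, bias, [0.9, 0.9]))
```

Holding out part of the data keeps the accuracy figure honest, since it is measured on accounts the model never saw during fitting. Comparing several algorithms, as the talk describes, would mean swapping out `fit_logistic` and repeating the same evaluation.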

Abstract

Automated Twitter accounts have been making headlines for their ability to spread spam and malware as well as significantly influence online discussion and sentiment. In this talk, we explore the economy around Twitter bots and demonstrate how attendees can track down bots through a three-step methodology: building a dataset, identifying common attributes of bot accounts, and building a classifier to accurately identify bots at scale. We first demonstrate how to amass a large dataset of public Twitter accounts using the Twitter API, gathering basic profile information as well as public activity from each account. We go on to gather and map the "social graph" of each account, such as who the account is following and, likewise, who is following the account. After this dataset has been obtained, we explore how to identify bots within it. We show common techniques used by real-world bot operators to try to keep the bot "under the radar", which can in many cases be used to help fingerprint the bot. Finally, we demonstrate how we can tackle the bot problem at scale using data science to build a classifier that accurately identifies bots across our large global dataset.
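The social graph mapping step in the abstract can be pictured as a breadth-first crawl outward from a seed account. The sketch below is an assumption-laden illustration, not the presenters' tooling: `fetch_following` is a hypothetical callable standing in for whatever Twitter API client is used, `FAKE_NETWORK` is invented toy data, and the depth and size limits are arbitrary.

```python
from collections import deque


def map_social_graph(seed_account, fetch_following, max_depth=2, max_accounts=1000):
    """Breadth-first crawl of "who follows whom", starting from one account.

    fetch_following(account) is a hypothetical stand-in for an API call that
    returns the accounts `account` is following. The result maps each visited
    account to the list of accounts it follows.
    """
    graph = {}
    seen = {seed_account}
    queue = deque([(seed_account, 0)])
    while queue and len(graph) < max_accounts:
        account, depth = queue.popleft()
        following = list(fetch_following(account))
        graph[account] = following
        if depth >= max_depth:
            continue  # record this account's edges but do not expand further
        for other in following:
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return graph


def followers_of(graph, account):
    """Invert the edges: which crawled accounts follow `account`?"""
    return sorted(a for a, following in graph.items() if account in following)


# Invented toy network used in place of real API responses.
FAKE_NETWORK = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["alice", "dave"],
    "dave": [],
}


def fake_fetch(account):
    return FAKE_NETWORK.get(account, [])


graph = map_social_graph("alice", fake_fetch, max_depth=1)
print(sorted(graph))
print(followers_of(graph, "carol"))
```

A real crawl would also need to respect API rate limits and handle protected or suspended accounts, which this sketch leaves out.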

Materials:

Slides

Paper

Tags: