Social bots, automated accounts that pose as genuine people, are all over social media. They infiltrate popular topics as well as serious ones, such as discussion of the COVID-19 pandemic. These bots are not like obvious robocalls or spam emails; they are designed to be human-like and to interact with real social media users without those users' awareness. In fact, recent studies show that social media users find them largely indistinguishable from real humans.
Now a study by Stony Brook University and University of Pennsylvania researchers, published in Findings of the Association for Computational Linguistics (ACL), examines just how human these social spambots really are by estimating 17 human attributes of each bot using state-of-the-art machine learning and natural language processing. The findings shed light on how bots behave on social media platforms and interact with genuine accounts, as well as on the capabilities of current bot-generation technologies.
“This research gives us insight into how bots are able to engage with these platforms undetected,” explains lead author Salvatore Giorgi, a visiting scholar at Stony Brook University and a PhD student in the Department of Computer and Information Science at the University of Pennsylvania. “If a Twitter user thinks an account is human, then they may be more likely to engage with that account. Depending on the bot’s intent, the end result of this interaction could be innocuous, but it could also lead to engaging with potentially dangerous misinformation.”
The image shows the number of genuine human accounts (blue) and bot accounts (red) at different ages and personality scores within the study's data. The bot accounts have plausible ages and personalities, but only within an extremely narrow range of values (i.e., they all express the same human attributes), while the genuine human accounts span a wide range of values.
Senior author H. Andrew Schwartz, associate professor in the Department of Computer Science at Stony Brook University, describes the problem with bots this way: “Imagine you’re trying to find spies in a crowd, all with very good but also very similar disguises. Looking at each one individually, they look authentic and blend in extremely well. However, when you zoom out and look at the entire crowd, they are obvious because the disguise is just so common.
“The way we interact with social media, we are not zoomed out, we just see a few messages at once. This approach gives researchers and security analysts a big picture view to better see the common disguise of the social bots.”
The study looked at more than 3 million tweets authored by 3,000 bot accounts and an equal number of genuine accounts. Based only on the language from these tweets, the researchers estimated 17 features for each account: age, gender, five personality traits (openness to experience, conscientiousness, extraversion, agreeableness and neuroticism), eight emotions (such as joy, anger and fear), and positive/negative sentiment.
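To make that setup concrete, the sketch below shows one way each account could be represented: a single 17-dimensional vector of trait scores estimated from its tweets. This is only an illustration; the estimator is a placeholder rather than the study's actual models, and the emotion labels beyond joy, anger and fear are assumptions.

```python
import numpy as np

# 2 demographics + 5 personality traits + 8 emotions + 2 sentiment scores = 17.
# Only joy, anger and fear are named in the article; the other emotion
# labels here are assumed for illustration.
TRAIT_NAMES = (
    ["age", "gender"]
    + ["openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism"]
    + ["joy", "anger", "fear", "sadness", "disgust", "surprise", "trust", "anticipation"]
    + ["positive_sentiment", "negative_sentiment"]
)
assert len(TRAIT_NAMES) == 17

def estimate_traits(tweets: list[str]) -> np.ndarray:
    """Return one 17-dimensional trait vector for an account.

    Placeholder scores only: the study estimated each trait from the
    account's tweet text with language-based models, which would be
    substituted here in a real system.
    """
    rng = np.random.default_rng(abs(hash(" ".join(tweets))) % (2**32))
    return rng.random(len(TRAIT_NAMES))
```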
Their results showed that, individually, the bots look human, having reasonable values for their estimated demographics, emotions and personality traits. However, as a whole, the social bots look like clones of one another, in terms of their estimated values across all 17 attributes.
Overwhelmingly, the language bots use appeared to be characteristic of a person in their late 20s and with a very positive language tone.
The researchers’ analysis revealed that the uniformity of the social bots’ scores on the 17 human traits was so strong that they tested how well these traits would work as the only inputs to a bot detector.
Typically, bot detectors rely on many more features or on a complex combination of information from a bot’s social network and the images it posts. Schwartz and Giorgi found that by automatically clustering the accounts into two groups based only on these 17 traits, with no bot labels, one of the two groups ended up being almost entirely bots. In fact, they were able to use this technique to build an unsupervised bot detector with high accuracy.
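A minimal sketch of that unsupervised idea appears below, assuming the 17 per-account trait scores are already collected in a matrix with one row per account. It uses k-means with two clusters; the article does not specify the clustering algorithm or its settings, so this is one plausible instantiation rather than the authors' exact method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def cluster_accounts(X: np.ndarray) -> np.ndarray:
    """Split accounts into two clusters using only their 17 trait scores.

    Returns a 0/1 cluster label per account. No bot labels are used;
    the article reports that one of the two resulting clusters turned
    out to be almost entirely bots, which is what turns this simple
    split into a detector.
    """
    X_scaled = StandardScaler().fit_transform(X)  # put all 17 traits on a common scale
    return KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)

if __name__ == "__main__":
    # Toy example: 100 accounts x 17 traits of random data.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 17))
    labels = cluster_accounts(X)
    print(np.bincount(labels))  # sizes of the two clusters
```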
Schwartz and Giorgi believe the findings suggest a new strategy for detecting bots: while the language used by any one bot reflects convincingly human personality traits, the bots’ similarity to one another betrays their artificial nature.