Anatomy of an AI-Powered Malicious Social Botnet

What’s this?

This is a summary of Wednesday 6th December’s Data Ethics Club discussion, where we spoke and wrote about the paper Anatomy of an AI-Powered Malicious Social Botnet by Kai-Cheng Yang and Filippo Menczer. The summary was written by Jessica Woodgate, who tried to synthesise everyone’s contributions to this document and the discussion. “We” = “someone at Data Ethics Club”. Amy Joint, Vanessa Hanschke, Huw Day, Nina Di Cara and Natalie Thurlby helped with the final edit.

Have you interacted with bots in the past or do they play a part in your social media consumption? What were your experiences?

On social media, we have seen bots that post images but don’t have many interactive elements. We have also experienced social media bots that tell you how many people have unfollowed you, or remind you if you forget to add alternative text to a photo on Twitter (X has now blocked these). Outside of social media, we have interacted with HR bots on MS Teams, which can be asked questions like “how can I claim vouchers for glasses”. However, we’ve found that the HR bots struggle if you ask them questions about certain areas like menopause. The kinds of bots we have mostly interacted with have been tuned to be benign; however, bots can also be tuned to be malicious, such as those identified in the paper.

Interacting with malicious bots may be unpleasant, yet people online can be just as disagreeable. Troll farms are organised groups of people who are paid to post hateful comments and content on the internet, manipulating public opinion in a way that promotes the agenda of those in power. It’s been found that governments in at least 30 countries are using troll farms to spread propaganda. Shifting from humans to LLM bots seems like a natural next step for troll farms.

Replacing humans with bots in troll farms makes it much easier and cheaper to push agendas at a larger scale. Automating trolls reflects the “enshittification” of internet platforms, where the allocation of value between buyers and sellers shifts in a way that turns platforms into useless shells. First, buyers are prioritised. Once buyers are locked in, sellers are prioritised. Once sellers are locked in, shareholders are prioritised, and platforms become shadows of what once drew users to them. There is a danger of enshittification when companies put too much faith in chatbots. An example of too much faith being put in AI could be BT, which plans to cut 55,000 jobs and replace up to a fifth of them with AI.

Do you think LLMs will exacerbate misinformation on social media in particular or the internet in general? What do you think the internet will look like in 5 years’ time if/when LLM use becomes mainstream?

Humans are already quite adept at spreading misinformation, making it difficult to identify when bots are exacerbating the problem. This is made even more difficult when bots implement generative AI, as their output can be indistinguishable from that of humans. As generative AI improves, it will become increasingly challenging to distinguish between truth and misinformation. Considering how hard it is to regulate the spread of misinformation by humans, it seems likely that it will also be difficult to regulate generative AI spreading misinformation.

Misinformation may be exacerbated by too much trust being placed in LLMs when they have a tendency to hallucinate. LLMs struggle to admit when they don’t know something, instead making things up to fill in the gaps. The appearance of certainty plays on the human inclination to trust and follow rules, and results in too much faith being placed in the output of LLMs. Trust is also induced by the underlying style of communication of social media bots, which tends to be brief, with a lot of jokes and sarcasm. This informal use of language makes it even harder to spot lies or misleading content, as we come to treat the bots as if they were one of us. Humans placing too much trust in overconfident and inaccurate LLM output thus increases the potential for circulating misinformation.

When thinking about the future of the internet and how to prevent generative AI disseminating misinformation, we may look to events that have sparked similar worries in the past. The concerns around LLMs today are reminiscent of the early days of Wikipedia. Wikipedia’s rapid growth meant that the platform initially struggled to regulate trolls and problem contributors, resulting in worries about misinformation. Over time, perceptions of Wikipedia have improved as people are encouraged to cite where their information comes from, which can then be verified. Proper citation may be an avenue to improve the verification of generative AI output. On the other hand, we can’t just trust that anything cited is correct, and LLMs have been found to hallucinate fake citations. On Wikipedia, it is possible to view and edit sources; this isn’t currently possible with bots and LLMs, but adding it could perhaps make their output easier to verify.

Improving the citation ability of generative AI may be one way to prevent misinformation; another might be making it easier to distinguish between bots and humans. This could be done by altering the structure of language. “Algospeak” is a type of coded terminology used by humans on social media to avoid algorithmic detection when discussing sensitive topics. Certain letters may be replaced by numbers, or substitutes used in place of words that would be flagged (e.g. “unalive” instead of “dead”). On the technology side, there is potential to watermark generative AI output, for example by embedding metadata that is identifiable by tools but imperceptible to the human eye. Bots could be built with the purpose of watermarking other bots. Technologies like screenreaders, which help people with visual impairments, could also be harnessed, for example to force bots to reveal themselves.
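As a toy illustration of how algospeak-style substitution works, the sketch below (our own illustrative Python, not from the paper; the substitution table and function names are purely hypothetical) swaps flagged words for coded substitutes and crudely checks whether a post contains them. Real moderation evasion and bot detection are, of course, far more sophisticated.

```python
# Toy sketch of algospeak-style word substitution and a crude detector.
# The substitution table below is hypothetical and for illustration only.

ALGOSPEAK = {
    "dead": "unalive",   # coded substitute for a word that might be flagged
    "kill": "unalive",
    "sex": "seggs",
}

def to_algospeak(text: str) -> str:
    """Replace flagged words with their coded substitutes."""
    return " ".join(ALGOSPEAK.get(word.lower(), word) for word in text.split())

def contains_algospeak(text: str) -> bool:
    """Crude check: does the text use any known coded substitutes?"""
    coded_terms = set(ALGOSPEAK.values())
    return any(word.lower().strip(".,!?") in coded_terms for word in text.split())

if __name__ == "__main__":
    post = "The character is dead by the end of the episode"
    coded = to_algospeak(post)
    print(coded)                      # "The character is unalive by the end of the episode"
    print(contains_algospeak(coded))  # True
```

A fixed table like this only captures exact word swaps; in practice both the coded vocabulary and the detection heuristics shift constantly, which is part of why the paper’s authors call for better detection methods.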

If the problem of misinformation can be adequately addressed, there are some promising use cases for bots which use generative AI. In healthcare, for example, the technology could be used to improve assistance to patients. However, the more useful applications might be for things like performance reports and data analysis; it is often the case that simple solutions are more effective than exciting ones. For instance, it may be that building more swimming pools and bike lanes would reduce cardiovascular deaths more than technologies like digital twins.

Although we did find some positive potential for bots using generative AI, we also felt pessimistic about what the internet would look like in five years if/when generative AI becomes mainstream. There are issues with trust; if we don’t get a handle on the misinformation problem, it will be difficult to believe what we read on our feeds. The effects of misinformation aren’t isolated to the internet; we have seen how single tweets can have significant real-world impacts, such as a fake Eli Lilly tweet which caused the company’s stock to fall. We have also had some insight into how bots may be spreading misinformation to sway political campaigns, such as the recent referendum in Australia. These events hint at the effects we might expect in the future from bots using generative AI to disseminate misinformation.

To think about how people will respond to social media in the future, we can draw similarities between the evolution of social media and the evolution of newspapers. When print newspapers became cheaper to produce and more widespread, a division emerged between “highbrow” and “tabloid” papers. Some people switched off, and some people became more invested. With social media, people may likewise realise that the tools are broken and “log off” completely, or they could get drawn even further into the platforms. The former has already happened to several of us, who have stopped using Twitter/X. We found Discord to be a bit of a mixed bag as a platform, yet we liked its more community-based approach to moderation and have found smaller pockets within the platform that are more robust. Widespread disengagement with social media could be just as harmful as over-engagement: apathy can be weaponised, as people become less likely to vote, leading to more extreme outcomes. Overall, we felt that the future of how people interact with social media is uncertain, and we weren’t sure whether bots would push people to switch off or to invest more.

The authors call for more LLM detection methods, regulation and public awareness. What do you think needs to be put in place against social media bots? Whose responsibility is this?

Better regulation of the internet is crucial, but extremely difficult. This is partially because of how distributed the internet is geographically; making a law in one country might have no impact elsewhere. As things are, companies and organisations self-regulate, resulting in companies having a disproportionate amount of power compared to individuals.

Compared to how complicated regulating the internet is, approaches surrounding public awareness and education may be more straightforward. A lot of attention is paid to the ethics of generative AI, but the psychology of how people use it is perhaps overlooked. Many crises happen because of a human element that could have been avoided, for example Grenfell, the Titanic, and the Post Office scandal. Approaches to mitigate harm should start with psychology as a foundation, and then work up from there. Educational intervention should begin in schools, teaching children to think critically about what they see. This could be supported by greater promotion of subjects like philosophy and politics. We should also think about how chatbots are being used for older people, or people with dementia, who may find it more difficult to distinguish between real and fake information. Interacting with bots inevitably leads to some level of anthropomorphising; even if the bots are clunky, it can be difficult to keep in mind that they are bots.

Bonus Question: What change would you like to see on the basis of this piece? Who has the power to make that change?

In the future, we would like social media to be seen as a public utility: the provision of a social good that connects people in a meaningful way. However, there are several obstacles to getting there, including shifting how social media is monetised. The future of the internet does not seem to be valued highly enough by those with power, as insufficient action is being taken to prevent widespread misuse. If a business model has flaws, those flaws will only be amplified by adding automation. We would like to see social media companies and other businesses thinking longer term about the effects of generative AI and misinformation, prioritising our collective future over profit.

Attendees

  • Nina Di Cara, Research Associate, University of Bristol, ninadicara, @ninadicara

  • Amy Joint, freerange publisher, [@amyjointsci](https://twitter.com/amyjointsci)

  • Euan Bennet, Lecturer, University of Glasgow, @DrEuanBennet

  • Zoë Turner, Senior Data Scientist, @Lextuga007

  • Zosia Beckles, Research Information Analyst, University of Bristol

  • Kamilla Wells, Citizen Developer, Australian Public Service, Brisbane