Data Ethics Club: ChatGPT is Bullsh*t#

Article Summary#

The problem of large language models (LLMs) outputting false statements is increasingly important as LLMs are employed across more areas of society. Falsehoods generated by LLMs are commonly referred to as ‘hallucinations’. This paper argues that hallucination “is an inapt metaphor which will misinform the public, policymakers, and other interested parties”. To better address the topic, the paper suggests that the label ‘bullshit’ is more appropriate than ‘hallucination’, as LLMs are designed to give the impression that they are accurately representing the world. The paper distinguishes between ‘hard’ bullshit, which requires an active attempt to deceive the audience, and ‘soft’ bullshit, which requires only a lack of concern for the truth. LLM outputs are framed as soft bullshit at a minimum, and as hard bullshit if we view LLMs as having intentions, for example in virtue of how they are designed.

Discussion Summary#

Do you think that the label of ChatGPT as a bullshit machine is fair?#

Giving LLMs like ChatGPT labels is a path to improving understanding of the mechanics of how the tools work. Understanding the mechanics and defining the (limits of) LLMs’ capabilities is important to reduce harms and ensure they are used correctly. Users need to understand how models are designed in order to evaluate their output. Intentional metaphors can be helpful in conveying what a system is designed to do. However, if used inappropriately, metaphors can be misleading. For instance, the ‘learning’ part of ‘machine learning’ is actually something that looks more like recombining. Examples of the kinds of problems that can arise from misconceptions about the capabilities of AI can be seen in domains like digital health, where we are finding that people look to ChatGPT to diagnose problems, which it is not yet capable of doing reliably.

To encapsulate what is really going on when an LLM is prompted, it is important to understand the ‘temperature’ parameter, of which the paper provides a good explanation. The temperature parameter “defines the randomness of LLM response. The higher the temperature, the more diverse and creative the output”. In other words, temperature allows the model to be tuned to choose more randomly amongst likely words, rather than always choosing the most likely word. The effect of this, the paper conveys, is more “creative and human-like text” as well as a higher likelihood of falsehoods.
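
The mechanics are easy to see in miniature. Below is a minimal, hypothetical sketch of temperature-scaled sampling (the logits are made up, and no particular vendor’s implementation is assumed): raw scores are divided by the temperature before being converted into probabilities, so higher temperatures flatten the distribution and pick less likely words more often, while a temperature of zero reduces to always taking the top word.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample a next-token index from raw scores (logits), scaled by temperature.

    temperature == 0 means greedy decoding (always the most likely token);
    higher temperatures flatten the distribution, so less likely tokens
    are chosen more often.
    """
    rng = rng or np.random.default_rng()
    if temperature <= 0:
        return int(np.argmax(logits))           # greedy: always the top token
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                      # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))

# Toy example: three candidate tokens with made-up scores.
logits = [2.0, 1.0, 0.1]
print(sample_next_token(logits, temperature=0.0))  # always index 0
print(sample_next_token(logits, temperature=1.5))  # sometimes index 1 or 2
```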

In framing the effects of elements like temperature, the label ‘bullshit’ does seem appropriate. The paper argues that the temperature parameter shows that the goal of LLMs is not to convey helpful information, but “to provide a normal-seeming response to a prompt”. LLMs are designed to “give the impression” that the answer is accurate, rather than to give an accurate answer. Bullshit, understood as an indifference to the truth, encapsulates the way that LLMs are designed to prioritise the impression of accuracy over accuracy itself. Whilst bullshit might seem a bit clickbait-y, it is effective at drawing you in. Some of us experienced confirmation bias just looking at the title, feeling that it puts into words how we feel about the topic.

Those of us who are more sceptical of ChatGPT, rather than judging the paper as right or wrong, thought that it presents a useful paradigm within which to frame the conversation. Paradigm discussions in data science are really important, such as the machine learning vs. AI paradigms. In discussing paradigms, the analytical philosophy of language and precise definitions helps to scope areas and highlight misconceptions. It is fun to utilise a well-defined but playful word like bullshit in this analysis.

The crux of the paper contrasts bullshit with the term currently used when LLMs output false information – ‘hallucinate’. Hallucinate seems to be a term which has entered common language by slowly creeping in, rather than through the careful and thought-out selection of words that we see in philosophy. We wondered why it was adopted as the phrasing in the first place. One reason the term hallucinate may seem appropriate is that, in humans, hallucinations can have an element of truth in the individual’s life. This could mirror how, in LLMs, the ‘random’ output has some root in the training data.

Despite some elements of ‘hallucinate’ fitting what LLMs do, the term doesn’t quite capture what is happening. Hallucinate has misleading elements, such as the implication that it is something that ‘just happens’. Human hallucinations are without intention; bullshit is a less neutral term that encompasses the effect of intention. Intention includes both the output of the system itself and the work that goes into building the system. Saying that a system is hallucinating insinuates that its output is not the designers’ fault. Hallucinate thus propagates a false narrative that distances the system designers from the system’s output, thereby side-stepping accountability.

The false narrative around AI glosses over the human input in these systems, masking the role of intention by presenting a façade that no humans are involved. In reality, there is a lot of human work that goes into AI (as discussed in a previous Data Ethics Club). Users are duped into thinking that the technology works much better than it really does.

Labelling LLM outputs as bullshit is supported by the observation that common use cases for LLMs are areas where people already bullshit. We see LLMs used in everyday practices where convincingness is prioritised above accuracy, like emails, marketing, and research proposals. There was the lawyer who used fake cases generated by ChatGPT in his brief. We also saw fraud as a massive use case for LLMs, and wondered about liability: is there any responsibility for those who break the law by using LLMs in these ways?

Bullshit helps to tie the output and effects of LLMs to their human designers, highlighting the human responsibility in the creation of LLMs. Giving the interface of LLMs agency distracts from the designers involved in the system, and their bias. Using the label bullshit helps to reforge this connection to intention and responsibility, thereby improving awareness about underlying mechanics.

Whilst we saw more strengths to the label ‘bullshit’ than ‘hallucinate’, there are some aspects of ‘bullshit’ that we had difficulty with. It is not an easy word to use generically and has various connotations which need qualification. ‘Bullshit’ as a term is intended for people and seems a bit anthropomorphic, even with the disclaimers in the paper. ‘Bullshit machine’ could be better, or perhaps ‘bullshit facilitator’. We also considered the term ‘confabulation’, which is “a memory error consisting of the production of fabricated, distorted, or misinterpreted memories about oneself or the world”.

Although ChatGPT does have a tendency to spit out falsehoods, it might be unfair to call it a bullshit machine if 90% of the output is true. If it is just a predictive tool, which usually predicts correctly, its output might not necessarily be classed as bullshit. We expect much more from machines than from humans; as humans, we can go our entire lives believing statements to be true and telling other people those statements are true, only to find out one day that we are wrong. If someone bullshits us, and we believe it to be true and share it with others, we wouldn’t consider ourselves to be bullshitters.

Do you think ChatGPT is a soft or a hard bullshitter? I.e. do you think it has the intention to mislead its audience or not?#

We liked the clear distinction between soft and hard bullshitting, where a soft bullshitter need not have an intention to mislead, but a hard bullshitter does. We would welcome more of a distinction between a hard bullshitter and a liar. Hard bullshitting might be prevented from collapsing into lying if there are underlying ulterior motives.

Whether ChatGPT is a soft or hard bullshitter could depend on its version, with a difference in labels between earlier and later versions. Pre-reinforcement learning (RL), there was less effort dedicated to making sure that ChatGPT was not misleading people. The use of human-assisted RL (e.g. RL from human feedback) seems to play an important part in distinguishing between soft and hard bullshit.

On one hand, using human-assisted RL might make ChatGPT a soft bullshitter, because an effort (no matter how small) has been made not to mislead. Framing ChatGPT as a soft bullshitter could also be supported by the disclaimer at the bottom of the interface, alerting people that it might offer false information. If ChatGPT is devoid of intention, or the output isn’t presented as truth or knowledge, it could be labelled soft bullshit. However, ‘hard’ vs. ‘soft’ bullshit makes it seem like hard is worse; it should be made clear that soft is still bad and can be very disruptive. For example, if it is soft, we might care less about verifying its output, whereas if it were hard we would try to stop it being wrong.

On the other hand, the use of human-assisted RL may contribute to producing something closer to hard bullshit, as there is a transition from repeating probable words towards being ‘convincing’. RL is used both to make the output more truthful and to make it look more ‘truthy’. If we only care about intention when distinguishing between hard and soft bullshit, doing RL changes the appropriate label, because it changes the role of intention. Knowing ‘true’ vs. ‘false’ is not enough information to learn to give better answers, and designers have to make choices about what defines a better answer. The choices that need to be made when involving RL could be interpreted as the intention of the designers. When intention is involved, ChatGPT thus enters the space of hard bullshit, where the model is trying to convince.

Intention may belong to the system’s creators, or to the system itself. The idea of the system itself having intention is supported by the importance of process. In the same way that students should learn from the process of writing, it is the practice that matters, not how polished the end result is. This means that no matter how frequently ChatGPT is accurate, it may still be a bullshit machine if it doesn’t go through the right process. For example, telling the user how sure it is of its answer would increase the system’s accountability; not doing so skips part of that process. The significance of process is why the controversial ad for Gemini got pulled, in which a father used the tool to write an athlete a fan letter on behalf of his daughter. The advert arguably encourages “taking the easy way out instead of practicing self-expression”.

We found that the paper seemed to jump around a bit regarding who was being accused of bullshitting: the users, the creators, or something else. There also seems to be a scale of intentionality in bullshitting, from students avoiding doing their work to politicians misleading the public. Downstream, it is difficult to delineate responsibility for the use of the system between the users and the designers. For intentionality in LLMs, we wondered how far developers have ethical agency, and where the buck stops. The people who develop LLMs do not intend to mislead audiences; they intend to develop useful tools.

However, commercialising and selling tools for specific purposes they are unfit for could be classed as intentionally misleading. Designing bots to exhibit human-like qualities involves some intention to deceive. The intent behind the dataset an LLM was trained on is also important, for example whether it consists of academic articles or articles from tabloid newspapers. The information it was trained on would steer the intent; this would be the intent of the designers and sources, not of the LLM. A lack of availability of the underlying sources contributes to the bullshit factor.

The existence of an intention to mislead likely results from the industry fixation on innovation. The obsession with innovation, amongst other factors, has been found to be a significant barrier inhibiting ethical wisdom in the AI developer community. There is always a human at the top of the chain who is pulling the AI along; if we are looking to hold someone responsible, we just have to follow that chain.

Even when the intention to mislead is absent, knowing how to train LLMs to portray truth is complicated by the fact that the concept of truth is a philosophical minefield. If it was easy to define truth, we wouldn’t have a justice system or journalists. There are situations where we presume the truth is knowable, and situations where we presume truth is unknowable. When an LLM hallucinates, compared to when it tells the truth, it feels like something procedurally different is happening, but really it is exactly the same process.

The authors are correct to point out that the relationship to the truth is irrelevant to LLMs. LLMs aren’t tracking ‘truth’, as they have no commitment or connection to the whole picture. Whether LLM outputs are true or false, the intent is always the same. With the variation in the temperature parameter, we know that they frequently don’t give us their best guess, trading off accuracy against authenticity. This trade-off produces speech which looks human but is indifferent to the truth. Even with zero temperature, LLMs won’t make stuff up but will just go with the consensus of the text they are trained on. Going with the consensus seems to align more with parroting than with tracking truth.

Considering the difficulties with assigning intentionality and accountability, and with defining truth, we wondered whether the distinction between hard and soft bullshit remains useful. It is difficult to quantify each term, and you can’t extend either definition to all applications of LLMs. The difficulty of assigning labels to the systems themselves is compounded by irresponsible use of LLMs. We also wondered whether, once you’ve decided that LLMs are hard bullshitters, they could become soft bullshitters, e.g. by learning, or by labelling their own uncertainties with data. Accuracy can be a training goal.

What do you think of the implications of the anthropomorphising of AI tools? E.g. hallucination, learning, training, perception etc.#

To truly answer any of the questions we have discussed above, we must address anthropomorphism. As humans, we have a natural tendency to anthropomorphise; we can’t get away from our own human experience, conceptualising the world by reference to ourselves. We wondered if over-anthropomorphism is inherent across humanity, or if it is something seen especially in Anglican traditions. We see anthropomorphism in many systems other than LLMs, such as in robotics. In Robotic Process Automation (RPA) offices, workers will name their software bot (e.g. Bobby Bot) and talk about the bot as if it is a colleague, for example saying “Bobby’s having a bad day today” when describing lots of exceptions.

Anthropomorphism risks ascribing more intentionality than we mean to, similar to the effects of pareidolia, the tendency for perception to “impose a meaningful interpretation on nebulous stimulus”. We had some doubts as to whether it is possible to ascribe an intention to mislead to LLMs themselves, as this may itself be a case of anthropomorphism.

The problem with anthropomorphism is that it does a disservice to how the system works, which affects broad fields (e.g. science) by trivialising and bypassing the underlying mechanics. As well as disguising technical aspects, anthropomorphism has social repercussions. People respond well to human projections on non-human items, and can build human-like relationships with them, finding friendship and companionship (this is explored in a previous Data Ethics Club, the podcast Black Box, and TV show Humans).

To some extent, the dynamics of LLMs as bullshit machines are not so different from human relationships – we all have friends who are bullshitters, and we can navigate LLMs in similar ways to how we navigate those friendships. However, the similarity to human relationships poses a risk to people who are vulnerable and can induce trust where it might not be deserved. When we consider the technology in isolation, it is weird to think about whether or not we can trust it. However, when the tools are anthropomorphised, trusting them becomes a natural consequence. When that trust proves misplaced, such as when ChatGPT outputs falsehoods, trust in written language is lost, and people start to consume text differently.

What change would you like to see on the basis of this piece? Who has the power to make that change?#

If the technology will increase efficiency and save lives, then there are strong arguments in favour of using it. However, at the moment there is a wide range of responses to good and bad practice of AI in research and education. These responses need to be streamlined to align AI development with society. We know how to deal with cats, but not lions; we should not release lions into the city without knowing who can control them, or how.

We thought that the paper is kind of missing the ‘so what?’ element at the end, without showing the dangers of the technology. Lying is already a part of our society – why is ChatGPT different?

One reason ChatGPT is different is the scope of its repercussions; we have seen that the repercussions are large and affect many domains. For example, search engines appear to be devolving as the internet gradually fills with generated content, as discussed in a previous Data Ethics Club. Another destructive repercussion of LLMs is their environmental impact. The Washington Post estimates that generating a single 100-word email uses about a bottle of water, or enough electricity to power 14 LED light bulbs for an hour; if 1 in 10 working Americans generated one such email each week, it would use as much water as all Rhode Island households consume in 1.5 days, or as much electricity as all D.C. households consume in 20 days. There could be creative solutions to the environmental effects of AI, such as the data centre in Devon which uses the heat it generates to warm a public swimming pool.

To move forwards with technologies like ChatGPT, we thus need to be clear about how we are using them and what we are using them for. Many of us feel that LLMs are useful tools – there are cases of LLMs doing maths to a grad-student level of proficiency – and just want to clarify when they are useful. Whatever your opinion of them, LLMs are going to be used, so we need to decide when we can rely on them and what the good use cases are.

Correct language plays an important role in clarifying the usefulness of LLMs. Applying the label of bullshit machine to LLMs can inform how you use them, as you enter into interactions with the expectation that a lot of it might be bullshit. LLMs tell us things, and we should verify that the outputs are true, under the assumption that they are not. Only when we have verified the facts should we share them. There are tools you can buy to check literature and understand which parts of a textual artefact are true, e.g. Wolfram Alpha can be used as a truth checker.

Contemplating what might come in the future, and how to reduce harm, is informed by looking to scenarios where similar concerns have arisen before. Where we are now with ChatGPT could be paralleled with the worries that people had about Wikipedia when it started. The fears about Wikipedia were largely overblown, and perhaps this will also be true of the fears surrounding ChatGPT. In the future, ChatGPT could be used as a conversation starter, but not as the ‘main thing’. However, a key factor that differentiates the two is that Wikipedia has citations, making its sources evident. To close this gap, aside from including actual citations, LLMs could include ‘confidence ratings’ on the statements they generate, based on the probabilities of the words being strung together.
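
As a rough illustration of what such a rating might look like (this is our own sketch, not a feature of any existing product), many LLM APIs can report the log-probability the model assigned to each token it generated; averaging these gives a crude score for how probable the model found its own wording. As the rest of this discussion makes clear, though, a highly probable sentence is not the same thing as a true one.

```python
import math

def sequence_confidence(token_logprobs):
    """Crude confidence score for a generated statement.

    token_logprobs: per-token log-probabilities the model assigned to the
    tokens it actually generated (many LLM APIs can return these).
    Returns the geometric-mean token probability, a value in (0, 1]:
    closer to 1 means the model found its own wording highly probable.
    This measures fluency/likelihood, not factual accuracy.
    """
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_logprob)

# Toy example with made-up log-probabilities for two short outputs.
confident_looking = [-0.1, -0.2, -0.05, -0.15]
hesitant_looking = [-1.2, -2.5, -0.9, -3.1]
print(f"confident-looking output: {sequence_confidence(confident_looking):.2f}")
print(f"hesitant-looking output:  {sequence_confidence(hesitant_looking):.2f}")
```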

Attendees#

  • Huw Day, Data Scientist, Jean Golding Institute, University of Bristol, https://www.linkedin.com/in/huw-day/

  • Amy Joint, Programme Manager, ISRCTN clinical study registry

  • Vanessa Hanschke, PhD Interactive AI, University of Bristol

  • ZoĂ« Turner, Senior Data Scientist, The Strategy Unit (NHS)

  • Paul Matthews, Senior Lecturer, UWE Bristol, https://scholar.social/@paulusm

  • Virginia Scarlett, Data and Information Specialist, HHMI Janelia :grimacing:

  • Joe Slater, Philosophy, University of Glasgow

  • Chris Jones, Data Scientist

  • Joe Carver, Data Scientist, Brandwatch

  • Dani Shanley, Philosophy, Maastricht University

  • Mike Hicks, Philosophy, University of Glasgow

  • Kamilla Wells, Citizen Developer, Australian Public Service, Brisbane

  • Euan Bennet, Lecturer, University of Glasgow

  • Robin Dasler, data product manager, California

  • Helen Sheehan, PhD Student, University of Bristol

  • Matimba Swana, PhD Student, University of Bristol

  • Dan Levy, Data Analyst, BNSSG ICB (NHS, Bristol)