Social Biases in NLP Models as Barriers for Persons with Disabilities
What's this?
This is a summary of Wednesday 19th April's Data Ethics Club discussion, where we spoke and wrote about Social Biases in NLP Models as Barriers for Persons with Disabilities, a paper written by Ben Hutchinson, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong and Stephen Denuyl. The summary was written by Huw Day and Jessica Woodgate, who tried to synthesise everyone's contributions to this document and the discussion. "We" = "someone at Data Ethics Club". Nina Di Cara and Natalie Thurlby helped with the final edit.
Discussion
Opening thoughts
A lot of us did not enjoy this paper and were skeptical about aspects of it, particularly those on sentiment analysis. We did, however, enjoy the discussion after the fact.
We wondered: how do you define disability? How do you define conversations around different physical disabilities? Who makes these definitions?
The databases that the model in the paper is trained on are guidelines published by three US-based organizations: the Anti-Defamation League, SIGACCESS and the ADA National Network.
How up to date are these guidelines? What bias is encoded there? What's the half-life of this kind of dataset? Would you need to update this every couple of years as language evolves?
We also didn't like that there was no mention of mitigation methods.
In a meeting on the famous Stochastic Parrots paper, a worker from Kenya talked about their experience assisting with training ChatGPT. Workers aimed to make ChatGPT less toxic by getting moderators to score sentences on toxicity; the rationale was that if ChatGPT gains insight into what counts as toxic, it can aim to produce less content of that nature. Low wages and the workers' exposure to a lot of troubling content mean this can end up being a not very data-ethicsy way of doing data ethics.
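To make that mechanism a bit more concrete, here is a minimal, hypothetical sketch of the kind of pipeline that human toxicity labels ultimately feed into: a classifier fitted on sentences scored by moderators. The sentences, labels and model choice below are all made up for illustration; this is not OpenAI's actual setup.

```python
# Minimal sketch (not OpenAI's actual pipeline): fit a toxicity classifier
# on sentences that human moderators have labelled. Everything here is
# made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labelled data: 1 = a moderator rated the sentence toxic, 0 = not.
sentences = [
    "You are a wonderful person",
    "I hope you have a great day",
    "You are worthless and stupid",
    "Nobody wants you here",
]
labels = [0, 0, 1, 1]

# TF-IDF features plus logistic regression. The model only "knows" what the
# labellers told it, so their judgements (and any gaps or biases in them)
# carry straight through to whatever gets suppressed later.
toxicity_model = make_pipeline(TfidfVectorizer(), LogisticRegression())
toxicity_model.fit(sentences, labels)

# Probability that a new sentence is toxic, according to the labellers' view.
print(toxicity_model.predict_proba(["Have a lovely evening"])[:, 1])  # should lean low
print(toxicity_model.predict_proba(["You are stupid"])[:, 1])         # should lean high
```

The point of the sketch is that the labelling step is where the human judgement, and the human cost, sits; everything downstream inherits it.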
What are the consequences of "innocuous sentences discussing disability being suppressed" in online moderation contexts?
Context and Nuance
Marginalised communities may choose to "reclaim" certain language, and automatic moderation may prevent them from doing so. There could be discourse in groups where words are reclaimed (e.g. the word "queer" has had derogatory connotations in colloquial use but has been reclaimed as a neutral/positive identifier). It doesn't feel right to say you can't use that.
Making places safe is very important, but if we try to automate it we solidify it. Automation removes the flexibility of human judgement because, at its base, it is a rule-following system.
Communication broadens and becomes more complicated depending on context. Who are we speaking to? Children, adults, classes? How does the machine make a judgement? There are a lot of human factors to consider.
We are opting out, taking the path of least resistance, because there is so much content to moderate. Who gets to decide? It's always going to be the people with the power, not the people who are harmed.
There is some debate about person-centred language versus other language: "deaf person" versus "person who is deaf". This is not necessarily agreed within communities. Similar wider group discussions have taken place around "patients", "service users" and "clients". We may not be aware of the intention behind a term; we may be trying to do the right thing without necessarily examining those intentions and motivations.
Language that is seemingly negative, like swearing, may not be perceived as such by the person using it and could carry useful context. It might be typical language for some people in some groups/contexts and can also express heightened emotion.
Ismael was involved in Genomics England's Participant Panel, which created a guide on how to talk about the people whose data is curated at Genomics England.
Rachael noted that language evolves over time: the meanings of certain words change. How can we ensure that the language we use is not out of date? This reflects the importance of having diverse teams. Diversity could mean a lot of different things; it might also mean knowledge of internet sub-cultures and what is or is not offensive, which isn't always obvious. Paul's student did work investigating online communities for disabled artists, who called themselves "crips". Who says a word, and in what context, is really important to evaluating whether its usage is appropriate.
Accessibility Issues for Marginalised Groups
Already marginalised people could be prevented from commenting/tweeting/making videos. People will get around some of these checks to some extent with workarounds, e.g. "unalive", "le$$bean" (sketched below). How fast can sentiment analysis adapt? Is it faster than human moderators?
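As a toy illustration of why those workarounds succeed, here is a hypothetical keyword filter. The blocklist is entirely invented, and it deliberately includes an innocuous identity term to mirror the over-blocking concern discussed above.

```python
# Hypothetical blocklist-style moderation: the listed terms are invented for
# illustration, and deliberately include an innocuous identity term to show
# how over-blocking and easy circumvention go hand in hand.
BLOCKLIST = {"kill", "lesbian"}


def is_flagged(post: str) -> bool:
    # Naive word matching: anything not spelled exactly like a listed term passes.
    words = post.lower().split()
    return any(word.strip(".,!?") in BLOCKLIST for word in words)


print(is_flagged("I could kill for a coffee"))      # True: harmless post blocked
print(is_flagged("she tried to unalive herself"))   # False: serious post missed
print(is_flagged("proud le$$bean creator"))         # False: workaround slips through
```

Sentiment and toxicity models are more flexible than a literal blocklist, but they face the same moving target: by the time a model has seen enough examples of a workaround, the language has often moved on again.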
Is there an access issue? The time needed to make an input into any system can be more precious for disabled people, who are already at a disadvantage on platforms such as Twitter/Reddit/TikTok. Disabled people are less likely to report things as toxic because of this time pressure; people have a limited number of "spoons" or "spell slots".
Getting feedback is difficult here because it takes extra effort for those affected to self-advocate. For example, in the case of "long covid" (symptoms for 3 months or longer): after 12 months, x% of people are recovered - or have they just stopped engaging with healthcare?
Deborah Lupton examines "the digital patient experience economy" in The Commodification of Patient Opinion, which is relevant to this discussion.
The concept of sexual harassment in the workplace is an example of hermeneutic injustice: when someone's experiences are not well understood, by themselves or by others. It can also be that you've internalised mistreatment so you can't see it. Hermeneutics is the science of interpretation. A good book on this topic is ✨ Epistemic Injustice by Miranda Fricker.
Over-moderation silences already marginalised voices by denying people a space to speak to each other and stopping other people from hearing from them. It reinforces the medical model of disability over the social model, creating a feedback loop.
Capitalism devalues disabled people because they are often less able to make money. If only there were something we could do about that…
NLP Models vs Human Moderators
What would ideal content moderation look like? At what stage do we have human input? Volunteer moderators versus paid moderators aided by automation?
When do we decide to use NLP versus a person? In which situations is it actually better to use a person? This becomes a conversation about who is allowed to say what and when. At the very least, the paper shows that this type of bias is likely to come up when training NLP models. It is very difficult to see how this could be managed well without humans. A good example is moderators on subreddits, who decide on a set of rules about what is and is not okay.
Explainability comes into this too - how do we know what is being moderated, and how do we make this explicit? It's important for there to be accountability. Systems become more and more complex, but this shouldn't be confused with more sophistication; complexity makes it harder to explain what is or is not allowed.
Usage in a live context might be more useful than outsourcing decision-making across different contexts. Otherwise you end up with a language model that has learnt some weird combination of rules that don't actually apply globally, while trying to cover lots of different contexts and rule sets.
What are the consequences of these findings given the growing use of LLMs?
It's not just a question of safe versus unsafe; the scale is simply too great to moderate. People are using ChatGPT for everyday tasks. Should they be?
As we've already noted, language is full of values that change over time and that not everyone agrees on, so can we rely upon machines to resolve that for us?
One of us recalled an example of GPT-3.5 identifying women as nurses and men as doctors even when it doesn't make grammatical sense.
People don't know that these problems exist with LLMs. When you point them out they're obvious, but at the initial outset of LLMs this is not something that comes up very often, and that is a worry. Because the technology is proliferating so much, no one can be on top of what people are doing. Does "AI" stand for augmenting inequality in the era of covid-19 healthcare?
ChatGPT is retrainable, so feedback from people is used to retrain the model, but is this enough? There could be an arms race between people trying to train it to be toxic and others trying to train it to be wholesome.
People's understanding of these types of models is limited; in school we don't really learn about data science, just ICT or Maths. Maybe the next generation will be better equipped for this, but there are also people well past school age - how do we educate them about this? Do they want to be educated on it? On ChatGPT, Ismael has made a use policy for organisations/companies to make sure their staff use it safely.
This is an existing problem with misinformation on the internet - for example, everyone always believes Wikipedia without checking. ChatGPT doesn't make it very obvious that it could be "lying". There is a small warning, but it isn't always presented. There should be more documentation for users about the bias in the tool. The tool is legal in some countries but not in others; Italy, for example, banned it.
How can we show people that things might be wrong, in a way that for-profit companies are willing to acknowledge publicly? (Bad for shareholders!)
What change would you like to see on the basis of this piece? Who has the power to make that change?
Do we have to accept that these models are fundamentally unsafe? If the harms feel like an inevitable consequence, that makes them even more unacceptable.
Should we use people for moderation? If a word is deemed inappropriate, should the user still be able to use it if they claim the moderation (human or otherwise) is biased? Consult the groups that are most affected!
"We are not mistakes on pages, we are awesome novels with unorthodox beginnings."
Sentiment analysis is often not very accurate (for example, the word "tory" being scored as very toxic while racial slurs were not, in an analysis of abuse against MPs). A lot of language is very contextual, so it's not even feasible to agree on whether something is "toxic" or not.
The comparison between "I am a person" and "I am a deaf person" is a bit unfair, since people have different preferences for person-first language or otherwise. Language is very malleable, which makes moderation more difficult: e.g. the use of "unalive" on platforms such as TikTok in order to circumvent moderation.
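The paper's probe is easy to reproduce in spirit with off-the-shelf tools. A rough sketch is below; the probe sentences mirror the example discussed here, but the model is just whatever default the library downloads, not the systems the authors actually analysed, so the exact scores will differ.

```python
# Rough sketch of the kind of perturbation probe the paper describes:
# compare model sentiment for sentences that differ only in mentioning
# disability. The model is an off-the-shelf default, not the one the
# authors evaluated, so treat the output as illustrative only.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # downloads a default English model

probes = [
    "I am a person.",
    "I am a deaf person.",
    "I am a person who is deaf.",
]

for sentence in probes:
    result = sentiment(sentence)[0]
    # Any gap in label/score driven purely by the disability mention is the
    # kind of bias the paper measures.
    print(f"{sentence!r}: {result['label']} (score {result['score']:.3f})")
```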
We wondered how good something needs to be in order to be used in practice. How do you decide how good is good enough? It would be nice to moderate out toxicity automatically, but it's quite unlikely it'll work perfectly. What would be good enough for these kinds of models to be used in moderation settings?
Attendees
Name, Role, Affiliation, Where to find you, Emoji to describe your day
Natalie Zelenka, Data Scientist, University of Bristol, NatalieZelenka, @NatZelenka
Nina Di Cara, Research Associate, University of Bristol, ninadicara, @ninadicara
Huw Day, (Tired) PhDoer, University of Bristol, @disco_huw
Euan Bennet, Lecturer, University of Glasgow, @DrEuanBennet
Noshin Mohamed, Practice Learning Reviewer in QA for children's
Ismael Kherroubi Garcia, AI Ethics & Research Governance consultant, Kairoi, LinkedIn
Zoë Turner, Senior Data Scientist, NHS Midlands and Lancashire CSU, GitHub
Paul Matthews, Lecturer, UWE Bristol, HomePage, 🦣 @paulusm@scholar.social
Rachael Laidlaw, PhD Student in Interactive Artificial Intelligence, University of Bristol