Values in AI Image Systems#

What’s this?

This is a summary of Wednesday 13th March’s Data Ethics Club discussion. This week at Data Ethics Club we used two recent controversies to explore how values are embedded in the creation of new or alternative realities in AI image generation. The first controversy concerns an AI app called DignifAI that adds clothes to women’s bodies, written about in an article by Catherine Shuttleworth. The second relates to the launch of Google Gemini’s text-to-image feature, discussed in a thread from Margaret Mitchell. Catherine Shuttleworth joined our discussion and gave a brief presentation about her findings from writing the article. The article summary was written by Nina Di Cara and edited by Jessica Woodgate. The discussion summary was written by Jessica Woodgate, who tried to synthesise everyone’s contributions to this document and the discussion. “We” = “someone at Data Ethics Club”. Huw Day helped with the final edit.

Article Summary#

We’ve included a summary of both pieces of content here. Both examples highlight how the values embedded into AI systems have a huge impact on the outputs they create.

DignifAI#

DignifAI is a relatively new AI tool (less than a couple of months old at the time of writing – although is that young enough to be considered new at the current pace of AI development?) built on top of the Stable Diffusion model, released publicly in August 2023.

The tool claims to ‘dignify’ women by adding clothing and altering appearances using AI. Screenshots show the tool removing tattoos, making hair longer, adding cleavage and reducing waist size. From her research, the author Catherine Shuttleworth also noticed instances of skin tones being lightened, “traditional” feminine makeup being added, and exposed skin being covered (e.g. arms covered to the elbow or wrist). Recent posts from the tool’s owners on X show men also having tattoos, piercings and alternative hairstyles removed. In some cases men have been made to look slimmer.

Changing people’s appearance without their consent seems to be enforcing something very different to dignity, understood as respect for people. Whilst the initial outcry was about control of women’s bodies, the increase in posts about men illustrates how misogyny harms all of us eventually by enforcing unrealistic gender norms.

Gemini#

Gemini (previously known as Bard) is an AI tool developed by Google that recently launched a text-to-image generation component. This component was found to generate scenes in which historical figures like the American founding fathers were people of colour, or the pope was Black (as seen in Margaret Mitchell’s thread). This resulted in an outcry about the lack of “white representation” in images produced by the tool.

Dr Mitchell points out that AI systems can be developed to interpret user requests. Some users may be looking for historically accurate depictions. Others might be looking to generate pictures showing alternative versions of history, for example to mitigate the whitewashing of history that has occurred. One of the benefits of using a system of AI models is that it can be built to tailor its outputs to what users actually ask for. Gemini made the mistake of not asking for user input, instead assuming that all users wanted the same thing.
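As a purely illustrative sketch (not Google’s actual pipeline, and with no real Gemini API calls), the snippet below shows how a multi-model system might check for the user’s intent before deciding whether to rewrite a prompt for broader representation. The keyword list, function names, and the stubbed generator are all invented for the example.

```python
# Hypothetical sketch of intent-aware prompt handling for an image generator.
# Nothing here corresponds to a real Gemini API; the "generator" is a stub and
# the keyword list is a crude stand-in for a proper intent classifier.

from typing import Optional

HISTORICAL_KEYWORDS = {"founding fathers", "pope", "medieval", "viking", "1800s"}


def looks_historical(prompt: str) -> bool:
    """Very rough check for prompts that touch on the historical record."""
    text = prompt.lower()
    return any(keyword in text for keyword in HISTORICAL_KEYWORDS)


def build_final_prompt(prompt: str, stated_intent: Optional[str]) -> str:
    """Decide how to treat the prompt instead of assuming one answer for everyone.

    stated_intent is what the user chose when asked: "historical", "reimagined",
    or None if the system never asked.
    """
    if looks_historical(prompt) and stated_intent is None:
        # A system built to tailor requests would ask the user at this point,
        # rather than silently picking one interpretation for everyone.
        raise ValueError("Ambiguous historical prompt: ask the user for their intent.")
    if stated_intent == "reimagined":
        return prompt + ", depicting a diverse range of people"
    return prompt  # historical or unconstrained prompts pass through unchanged


def generate_image(prompt: str) -> str:
    """Stub for whatever image model sits at the end of the pipeline."""
    return f"[image generated from: {prompt}]"


if __name__ == "__main__":
    user_prompt = "a portrait of the American founding fathers"
    print(generate_image(build_final_prompt(user_prompt, stated_intent="historical")))
    print(generate_image(build_final_prompt(user_prompt, stated_intent="reimagined")))
```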

Whilst these aspirational values might be welcomed by some, they can also come across as tokenistic representations that make those excluded by diversity initiatives feel even further removed.

Discussion#

Which values do you see as being most prominent in each of these tools - and what harms or benefits can you see from them being expressed through these tools?#

Both tools embed values from exaggerated ends of the political spectrum, highlighting how AI image and video manipulation can be harnessed to push particular value systems. As AI-generated content becomes easier to produce and more convincing, these messages are amplified further, raising important questions about how we define and prioritise values. DignifAI’s justification is that they are making decisions to “bring decency back into the world”, advocating for more conservative styles of living. They have voiced that “the goal is for people to see that a degenerate lifestyle is ultimately fruitless” and that they will “commit unrelenting psychological warfare to our own ends”. When contacted by Rolling Stone about a particular doctored image, DignifAI denied their platform was behind it. This denial of accountability, together with the extreme nature of these views, suggests that it may be difficult to engage with such groups in a productive way.

DignifAI is an extreme example of a misogynistic narrative that can be seen in many parts of society. The social construct of the male gaze carries a paradoxical expectation of women. John Berger summarises this hypocrisy as “you painted a naked woman because you enjoyed looking at her, you put a mirror in her hand and you called the painting ‘Vanity’, thus morally condemning the woman whose nakedness you had depicted for your own pleasure”. The Madonna-whore complex in psychoanalytic literature describes a similar split, in which women are seen either as saintly Madonnas or debased whores. There are thus many unrealistic ideals about what a woman should be. In some instances, DignifAI removes tattoos whilst enhancing cleavage and reducing waists, muddying the stated aim to “bring decency back into the world” with conflicting messages about sexualising or desexualising women. Conceptions of power also play a role in how women are treated and who specifically is targeted; the biggest stories we hear tend to relate to powerful women (e.g. Taylor Swift).

Despite the risks of exaggerating particular value systems, there are good use cases for deepfakes. Sometimes individuals might want to alter photographs of themselves before sharing them with others, e.g. covering up a tattoo for a work photo. On the other hand, people might want to generate nudes of themselves. Both of these cases are valid because there is clear consent; the explicit consent of everyone involved is key to the acceptability of deepfakes. Deepfakes could also be useful for making videos appropriate for different ages or religious groups; VidAngel provides a service to filter entire categories, such as sex, violence, or language, out of videos. However, VidAngel was sued by Disney for copyright infringement, suggesting there are still issues with consent and content ownership here.

Lack of consent is central to the problems raised by DignifAI. There is no opportunity to provide consent unless you are doctoring images of yourself. This highlights how easily images posted publicly can be abused by others, raising new concerns about what we post on social media. We need to be much more aware of how our data is handled.

In addition to consent and ownership of images, we should consider where the boundary lies for acceptable editing. Colour touch-ups and the like are generally considered acceptable, but edits run into problems when they are seen as changing the “historical record”. This is exemplified in the Gemini case. Debate around altering historical artefacts has also emerged in literature, such as efforts to remove offensive language from Roald Dahl’s books. The acceptability of editing can also be affected by context. For example, the edited photo released by Kate Middleton generated more questions about image manipulation than the image alone would usually provoke, because of the ongoing scrutiny she was under.

Generally, it seems to us that whilst there are some acceptable cases of deepfakes and image doctoring, those cases are rare by comparison and morally outweighed by the harms. Cases like DignifAI demonstrate the propensity for harm when people’s images are doctored without their consent to push a narrative enforcing unrealistic gender norms.

The complex implications of deepfakes and generative AI tools necessitate asking important questions about their development and deployment. We need to be realistic about a tool’s potential: if there are significant negative effects, we should really ask ourselves whether we should be using it at all. Some form of bias will enter the pipeline at some stage, and part of the accountability process involves making an effort to identify where bias could emerge. The tendency to alter people to be skinnier, paler, and with bigger cleavage does not just occur on platforms like DignifAI; it reflects a wider issue of bias ingrained in generative AI, where big data means big mistakes. Comparing these issues makes us question whether it is worse to intentionally produce misogynistic images or to allow them to happen through negligence.

Research has demonstrated significant issues with AI tools, which makes us wonder how much we should be using them to “fix” things. Laban et al. (November 2023) conducted a study which found certain LLMs “flip” their answers on classification tasks 46% of the time when their responses are challenged, resulting in an average drop in accuracy of 17% between first and final predictions. High error rates combined with far-reaching influence suggest that they should not be so widely deployed. However, it is important to acknowledge how quickly these models are developing, and in March 2024 (the time of writing/discussion), November 2023 is a ‘long’ time ago.

Addressing the multifaceted issues with AI is a gigantic task. One approach is regulation; some of us think it is important that there is data legislation, and people whose job it is to enforce that legislation. Just because information is on the internet does not mean it should be free for people to use as they please. We should have the right to control our own images, but this is not guaranteed by current regulation; AI legislation is still in its infancy. The AI Act has recently been approved in the EU, but we questioned its reach, and wondered whether it provides proper protection for people being exploited through AI (e.g. violence against women and young people). Regulation restricting the use of other people’s data could theoretically help, but in practice obtaining proof of image ownership before it is appropriated to train AI is unfeasible. Even with laws in place, it is difficult to catch people in the act, and without explicit repercussions it is difficult to disincentivise people from using AI to exploit others. That said, the US is good at enforcing copyright laws.

Successful regulation around protection of image ownership and commitment to truth can be seen in journalism. There is a rigorous approval procedure that journalists must go through to use someone’s photograph in an article. If reporting on legal topics, journalists must reach out and ask for comment before publishing, and articles can be killed if there is a risk of legal ramifications. It would be good if there were a more established pipeline for legal support in the context of AI. Additionally, improvements could be made to reporting features on social media.

It is important, however, to note the limits of regulatory approaches. The processing speed of the legal system, combined with the roadblocks individuals face when making claims, means that even when something is put into law, effectively enforcing it is challenging. Nor is it necessarily true that an AI tool will be more biased or worse than hiring a human editor to sift through hundreds of hours of content; the propensity for harm exists on both sides (using AI or using humans). Regulatory approaches also need to strike a balance between enabling innovation and stifling it with over-restrictive legislation.

The statement from Google’s CEO about Gemini said that their aim is to provide “helpful, accurate, and unbiased information in our products” and that this has to also be their approach for emerging AI products. Is this a realistic goal?#

Whilst this is a laudable aim, it does not get to the root of the problem, and the statement on its own is insufficient to create real impact. We must also consider the authenticity of the company’s intention: whether it is responding to the market, or acting from a genuine desire to do good. Responding with incrementally “better” tools puts a bandage over deep and complex issues, demonstrating how Google is under the influence of the corrective pendulum. Claiming to provide “helpful, accurate, and unbiased information” is not simple, as these are abstract terms covering intricate topics. We wondered if it is even possible to provide truly unbiased information; the Gemini outrage hinged on Google being “unbiasedly biased” (a term which emerged in two discussion groups).

The images Gemini generated depicted some form of ideal (unbiasedly biased) world that is not historically accurate. This is problematic as it washes over historical issues which still persist in our society. On the other hand, using the documentation we have to recreate historical events “accurately” could reinforce inaccurate stereotypes (e.g. white Jesus). The immaturity of generative AI means we are not yet clear whether its purpose is to portray a genuine reality or some degree of fiction, and it is important to be clear about how much a model is generating an ideal world versus reading from the historical record. If the user is not asked to specify their preferences, generating a satisfactory response is challenging. However, being required to respond to explicit requests might also run into difficulties (e.g. requests for harmful content). These scenarios necessitate understanding where constraints should lie, which relates to the current issues we see in Gemini. Most likely, the problems with Gemini arise because constraints have been made too strong, demonstrating the imposition of technical censoring, rather than the model itself struggling with language complexity.

Uncertainty about the boundaries of truth and the proliferation of disinformation are destabilising for society (just look at how many conspiracy theories emerged when Kate Middleton disappeared from public view). Trust is being lost in a variety of media formats, and studies have found children to have a “dismaying” inability to distinguish between real and fake news. The pandemic led to children being educated at home, which may have exposed them to more inaccurate information. It is important to understand the impact that AI has on children, yet measurement surveys or feedback loops analysing how these tools affect people’s mindsets aren’t yet commonplace in the public domain.

Viewpoints on social media are becoming more extreme, whether through the coordination of particular groups, the amplification of voices, or the increasing polarisation of individual views. Issues surrounding polarisation are illustrated in the controversies around Gemini and DignifAI: both tools are generative, but with polar opposite objectives. This prompts us to question how we define accuracy, and who AI should be helpful to. Should these tools be as diverse as possible, promoting equity, or should they reflect the real world, including all of its social biases?

Margaret Mitchell’s thread shows a table for assessing unintended uses and users of tools - what do you think about this method for better understanding uses and how could it be used to improve AI image generation?#

Mitchell’s table is a good teaching tool but might need further refining, for example by combining it with project management tools like RACI. RACI helps clarify roles and responsibilities, including who is responsible for doing the work, who is accountable for it, who should be consulted, and who should be informed about different aspects of a project. Approaches like this should be carried out by groups of people working collaboratively, rather than by one person ploughing through them as a tick-box exercise.
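As a loose illustration (not taken from Mitchell’s thread or any standard RACI template), one way to combine the two would be to attach RACI roles to each anticipated misuse; every field and example value below is invented for the sketch.

```python
# Hypothetical sketch: a row of a misuse-assessment table with RACI roles attached.
# All fields and example values are invented for illustration.

from dataclasses import dataclass, field


@dataclass
class MisuseAssessmentRow:
    unintended_use: str                # what the tool could be misused for
    unintended_users: str              # who might misuse it that way
    mitigation: str                    # what the team plans to do about it
    responsible: str                   # RACI: carries out the mitigation
    accountable: str                   # RACI: owns the outcome
    consulted: list[str] = field(default_factory=list)   # RACI: asked for input
    informed: list[str] = field(default_factory=list)    # RACI: kept up to date


example_row = MisuseAssessmentRow(
    unintended_use="editing photos of people without their consent",
    unintended_users="harassment campaigns targeting individuals",
    mitigation="block uploads of images of identifiable third parties",
    responsible="trust and safety engineer",
    accountable="product lead",
    consulted=["legal team", "affected-community representatives"],
    informed=["customer support"],
)

print(example_row.accountable)  # -> product lead
```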

Attendees#

  • Nina Di Cara, Snr Research Associate, University of Bristol, ninadicara, @ninadicara

  • Huw Day, Data Scientist, Jean Golding Institute, @disco_huw

  • Vanessa Hanschke, PhD, University of Bristol, website

  • Catherine Shuttleworth, Journalist and Uni Student! [https://www.instagram.com/catherineros.e/]

  • Lucy Bowles, Data Scientist @ Brandwatch

  • Amy Joint, freerange publisher (until Monday!!), @AmyJointSci

  • Virginia Scarlett, Open Data Specialist at HHMI Janelia Research Campus

  • Noshin Mohamed, Service Manager for Quality Assurance in Children’s Services

  • Euan Bennet, Lecturer, University of Glasgow, @DrEuanBennet

  • Chris Jones, Data Scientist, Machine Learning Programs

  • Michelle Wan, PhD student, University of Cambridge

  • Helen Sheehan, PhD student, University of Bristol

  • Dan Whettam, PhD student - Computer Vision, University of Bristol

  • Lap Chow, Cancer Analyst, NHSE

  • Liam James-Fagg, Data & Insights Manager, allpay Ltd

  • Kamilla Wells, Citizen Developer, Australian Public Service, Brisbane

  • Robin Dasler, data software product manager, daslerr