Data Ethics Club discusses ‘Living in the Hidden Realms of AI: The Workers Perspective’ (26th May 21)

What’s this?

This is a summary of Wednesday 26th May’s Data Ethics Club discussion, where we spoke and wrote about the article ‘Living in the Hidden Realms of AI: The Workers Perspective’ by Sherry Stanley.

The summary was written by Huw Day, who tried to synthesise everyone’s contributions to this document and the discussion. “We” = “someone at Data Ethics Club”. Nina Di Cara and Natalie Thurlby helped with a final edit.

Mechanical Turks and the Sweatshops of Machine Learning

We discussed an article by Sherry Stanley - a “Turker”: someone who annotates data using Amazon Mechanical Turk (often used to label data for Machine Learning). Stanley discusses problems with the back end, Turkers’ pay, and working conditions, showing the human side of data that we often think of as very technical.

People need flexible work, but since the work is remote, contract-based and often time-sensitive, much of the power is left in the hands of those contracting the work, not the Turkers. If a Turker won’t do a task, there are three more who will take their place. This imbalance of supply and demand leaves Turkers in a weak negotiating position. Stanley talks about having notifications on her phone to wake her in the middle of the night if a valuable contract comes up.

Turkers might complete work but have it rejected by the contractors, denying them pay in the process, with limited power to protest. The workers often don’t know why their work was rejected, and technically their hard work could still be used, making the system open to abuse. This current system leads to exploited Turkers producing rushed data annotations for low levels of compensation - so how do we go about changing this?

How far should our ethical responsibility as data scientists reach?

Researchers tend to see Turkers as service providers rather than research participants. This viewpoint allows a degree of detachment; for instance, data annotators may not be seen as human research participants by research ethics committees. So how do we ensure responsibility throughout the research chain?

There is a lot of “variety” in ethical behaviour between different researchers. Any mechanism that relies on good faith is simply not enough. There needs to be a clear contract between requester and Turker. Perhaps a good solution is for grant proposals to include, as part of their ethical framework, a guarantee that the workers are treated ethically.

Outsourcing tech work can be compared to other contract work: the wage is paid per individual job and the work is precarious. Turkers’ rights are in some ways analogous with modern day slavery. The UK has requirements on corporations to look throughout their supply chain to see if people at any point are being exploited in the production of clothing. Why should the production of data be any different?

An institution wide approach would be more powerful than leaving decisions to individual researchers. There could be procedures in place (especially at public institutions like universities) to ensure that reliable and ethically collected/annotated data is used. Major researchers may advocate for ethical data integration but then use low-pay crowd workers for their own work. Accountability throughout the system is required, and we cannot rely on individuals for that.

One issue that was brought up was that if regulations are put in place in countries like the US and UK, work might get outsourced to countries with less strict standards. Whilst we would want to avoid denying work to Turkers in developing countries, it would be important to enforce accountability throughout the supply chain.

How should we recognise the contribution of annotators?

Simply acknowledging clickworkers is a good start. Unfortunately the distributed nature of crowdwork makes it difficult to credit workers as well as ensure accountability if things go wrong (e.g. bad annotations).

Some Turk surveys include personal information (e.g. mental health information) so scope for anonymity should be present. But as a general rule, credit (which comes hand in hand with accountability) for work done is vital.

We had a discussion about where you draw the line in crediting clickworkers. In an academic context, do you include them as a co-author/author? This might not be a hard rule, but some sort of industry standard may be appropriate. An important step would be to ask actual clickworkers how they would like to be credited!

Change and the power to make it

There are strong parallels between the situation of Turkers and that of ‘gig economy’ workers, whose rights have seen recent advances. Flexible and accessible opportunities to earn an income are important, but we do not want people to be exploited for them.

Any steps we as a society take need to involve closing loopholes in the law that let companies like Uber and Amazon get away with paying contractors less than minimum wage. This is a broader issue in the so-called “gig economy”.

Even more broadly though, exploitation is most rife in environments where the workers, whoever they are, don’t have better options. This naturally led our discussion to Universal Basic Income. Universal Basic Income (UBI) is the idea that every adult in a specific area would receive a standard, unconditional payment at regular intervals. If working was not a necessity, everyone who worked would want to make it worth their while. UBI is being trialled in Wales at the moment, so perhaps more research will aid our understanding in this area. In the past “researchers found the scheme left those happier and less stressed, but did not aid them in finding work.”

Probably a more realistic action for now is to support data annotators in the formation of unions, and back their calls as an industry for better rights and benefits. We rely on annotated data for so many parts of data science, and we can’t allow the people who make it to be invisible.


Note: this is not a full list of attendees, only those who felt comfortable sharing their names.

Name, Job title, Affiliation, Links to find you

  • Natalie Thurlby, Data Scientist, University of Bristol, NatalieThurlby, @StatalieT

  • Nina Di Cara, PhD Student, University of Bristol, ninadicara, @ninadicara

  • Huw Day, Maths PhDoer, @disco_huw

  • Roman Shkunov, Maths/CS undergrad, University of Bristol

  • Paul Lee, investor, @pclee27

  • Ola Michalec, Postdoc (a social scientist in computer science school), @Ola_Michalec

  • Sergio A. Araujo-Estrada, PostDoc, Aerospace Engineering, UoB

  • James Cussens, Lecturer in CS Dept, UoB

  • Robin Dasler, software product manager on hiatus, daslerr

  • Henry Addison, Interactive AI PhD student, UoB, henryaddison

  • Arianna Manzini, Research Associate, Centre for Ethics in Medicine, UoB

  • Vanessa Hanschke, PhD student, Interactive AI, UoB