Data Ethics Club: ESR: Ethics and Society Review of Artificial Intelligence Research (8th Sept 21)

What’s this?

This is a summary of Wednesday 8th September’s Data Ethics Club discussion, where we spoke and wrote about the article ESR: Ethics and Society Review of Artificial Intelligence Research, written by Michael S. Bernstein, Margaret Levi, David Magnus, Betsy Rajala, Debra Satz and Charla Waeiss.

The summary was written by Huw Day, who tried to synthesise everyone’s contributions to this document and the discussion. “We” = “someone at Data Ethics Club”. Nina Di Cara and Natalie Thurlby helped with a final edit.


This week, the group discussed the paper ESR: Ethics and Society Review of Artificial Intelligence Research, written by Michael S. Bernstein, Margaret Levi, David Magnus, Betsy Rajala, Debra Satz and Charla Waeiss. The abstract of the paper is below:

“Artificial intelligence (AI) research is routinely criticized for its real and potential impacts on society, and we lack adequate institutional responses to this criticism and to the responsibility that it reflects. AI research often falls outside the purview of existing feedback mechanisms such as the Institutional Review Board (IRB), which are designed to evaluate harms to human subjects rather than harms to human society. In response, we have developed the Ethics and Society Review board (ESR), a feedback panel that works with researchers to mitigate negative ethical and societal aspects of AI research. The ESR’s main insight is to serve as a requirement for funding: researchers cannot receive grant funding from a major AI funding program at our university until the researchers complete the ESR process for the proposal. In this article, we describe the ESR as we have designed and run it over its first year across 41 proposals. We analyze aggregate ESR feedback on these proposals, finding that the panel most commonly identifies issues of harms to minority groups, inclusion of diverse stakeholders in the research plan, dual use, and representation in data. Surveys and interviews of researchers who interacted with the ESR found that 58% felt that it had influenced the design of their research project, 100% are willing to continue submitting future projects to the ESR, and that they sought additional scaffolding for reasoning through ethics and society issues.”

With many members of the group having a hand in AI research, it was natural that many of us had opinions on the suggested regulations enforced by the ESR. Most seemed positive about the idea of more appropriate regulation of AI research, but it was difficult to agree on specifics. Since the type of research evolves as technical capabilities increase, it is vital that we keep having these sorts of discussions in order to maintain strong ethical standards in our research.

General views about the ESR, and how much it adds to a typical ethical review process for AI research

We thought it was a nice idea in theory and that it fills an obvious void in computer science research in general (that is, research that doesn’t include humans). Institutional ethical review boards have quite a narrow remit, seeing themselves as protecting research participants. This means they rarely ask questions about the long-term ethical implications of research. Ethical questions about AI now intersect with questions about protecting participants and their data, and the methods suggested in this paper would help with that.

However, we also discussed some critiques of the ESR process:

  • People who could be affected weren’t involved in the decision-making.

  • Weaknesses of review boards in general: a lack of transparency and a small number of reviewers. It’s quite hard to envisage unanticipated stakeholders, so mapping out the conflicts between stakeholders might be useful.

  • The paper doesn’t evaluate the process in terms of intended outcome, just on how the participants felt. So perhaps that means we need another review system.

  • Would researchers have to go through two separate ethical reviews simultaneously until they satisfy both sets of requirements?

  • If ethical review is not enforced across funding groups, then researchers might pick and choose funding boards so that they have the easiest time jumping through hoops.

We felt that the paper frames research as a risk, asking researchers to imagine possible negative consequences and seek to mitigate them, but we felt slightly disappointed that it didn’t cover how we do research in the first place.

By prompting us to ask, “Under what conditions should we really be doing AI research?”, the paper leads to a bigger theme: questioning the idea that science and technological progress are inevitable. There is this idea that if we don’t do it, someone else will. Following this line of thought, perhaps restricting research on ethical grounds will only negatively affect those adhering to these ethical standards.

There seems to be a framing by governments who want to use a technological narrative (the idea of technology for the state) in order to push for more funding. This paper does challenge this inevitability by saying the review board can say no, but in practice this was not used much. In one case study, on AI stress monitoring of employees, the researchers didn’t really respond to any of the listed concerns.

Agreeing on these sorts of things unanimously is impossible, but it is important that people try to remove personal bias, e.g. by not doing ethical reviews on projects with military funding if you’re unable to be unbiased there. Perhaps we need to start considering ethical frameworks sooner, as early as undergraduate research, or at least in PhDs.

On the Consequences and Moral Character of Science

Consequences of science are very complex and difficult to predict. There is a problem then, with the idea that we can judge science against societal goals. We have very little consensus on this.

One piece of work that touches on this is “The Moral Character of Cryptographic Work” by Philip Rogaway. In it, Rogaway states that “cryptography rearranges power” and so as a result is not only “an inherently political tool”, but also confers “an intrinsically moral dimension” on the field of cryptography. He later goes on to argue that the “inability to effectively address mass surveillance constitutes a failure of our field”. He finishes the abstract with the following sentence: “I plead for a reinvention of our disciplinary culture to attend not only to puzzles and math, but, also, to the societal implications of our work.”

On the contrary, maybe we need to concentrate on the actual crime. David Hilbert is not at fault for the nuclear bomb even if his mathematical work on Hilbert spaces helped to make it happen, so how far back does the accountability go?

We should not criminalise behaviours that lead to crimes, we should only criminalise the crimes themselves. Being in poverty and starving might lead to stealing food, but we should not criminalise being in poverty. We should not use laws to control behaviours. Some in the group argued we should approach ethical processes in the same way.

We can’t ask of every scientific advance, “what are the implications?” That’s too complicated. Some in the group argued that instead we should concentrate on making sure society is armed with the tools to control these bad things, which will inevitably happen. We can’t control a tech firm using our AI research to be racist. We should not blame ourselves for being curious; we should blame the tech firm for being bad.

Maybe it’s hard to separate the end use from the intention. There exists dual use technology, such as a stress recognition system, that may not have solely good intentions behind it. Perhaps you should design it with that in mind.

The aim is not to calculate the moral worth of one decision over another, but at least to consider intuitive moral decision making around some principles. But what principles? Can we ever agree on them? Academics need to be more humble and a bit more measured in what they say they can do (Ismael Kherroubi Garcia has done some excellent writing on this idea in data science, framed as ‘epistemic humility’).

The ESR is based around universities and research funding; would something like this work in other types of organisations - why/why not?

Stakeholder engagement is always critical and this is not really emphasised in the paper. Second or third-order stakeholders are likely to exist in large-scale businesses, for example:

  • Facebook’s impact on global politics has affected non-users too.

  • Deliveroo’s environmental impact is greater if they don’t increase prices for further deliveries.

Ironically, those who hold stakes are not always the stakeholders. Some questions we were left with were:

  • Who are the stakeholders in this ethical review?

  • What are the challenges associated with identifying these stakeholders, who they should be and what their different priorities are?

There will be difficulties with second or third-order stakeholders, who have nothing to do with a product such as an app but are still affected by it. For example, Facebook might have affected global politics regardless of whether or not you have an account. There will inevitably be a bit of hand-wringing about how we could practically apply this, but that does not mean we should ignore the problem entirely.

Is it always a good idea to think about ethics? Probably. Will it be difficult? Almost certainly.


Name, Role, Affiliation, Where to find you

  • Natalie Thurlby, Data Scientist, University of Bristol, NatalieThurlby, @StatalieT

  • Huw Day, PhDoer, University of Bristol, @disco_huw

  • Charlie Newey, Machine Learning Engineer @ Deliveroo, @charlienewey

  • Charles Radclyffe, EthicsGrade