Data Ethics Club: Henrietta Lacks, the Tuskegee Experiment, and Ethical Data Collection

Article Summary

The collection and interpretation of data can have long-lasting effects on the subjects of data collection and influence the lives of real people, creating a responsibility to gather and use data ethically. The video presents five examples of data collection going wrong, and discusses how such failures can be prevented in the future.

Illustrating the importance of voluntariness in data collection, Alexis St Martin was a fur trapper in the US who was accidentally shot and seriously injured. Dr. William Beaumont treated St Martin, but a small hole remained in his stomach, meaning that St Martin could no longer work as a fur trapper. As Dr. Beaumont’s servant, St Martin was coerced into several experiments in which pieces of food were dangled into the hole in his stomach to observe digestion. The findings led to huge strides in the scientific understanding of digestion. Yet the experiments were clearly unethical, as St Martin could not feasibly say no to them while Beaumont held undue power over him.

Subjects should also know what will happen to them during a study, as people have a right to understand all the facts relevant to their decision to participate. In 1932, the US Public Health Service began a secret 40-year study at the Tuskegee Institute on 600 black men under the guise of free medical care. Researchers were examining the long-term effects of syphilis and experimented with a range of drugs, even after it had become clear that penicillin was a highly effective treatment. In 1951, a tobacco farmer named Henrietta Lacks went to Johns Hopkins Hospital in Maryland and had cells collected from a tumour without her knowledge or consent. The cells were used to grow a new cell line (the HeLa line) which scientists used for in vitro experiments, as the cells could thrive and multiply outside of her body. The HeLa line is still used today for medical research into cancer and AIDS, was central to developing the polio vaccine, and provided the first human cells to be successfully cloned. Over time, the discoveries facilitated by the HeLa cell line became extremely lucrative for researchers, yet Lacks and her family have received no financial compensation. These examples illustrate the ethical issues that arise from misinforming or misleading participants. Many institutions today require that information be presented clearly and in a way appropriate to the subject’s comprehension level, respecting their dignity and autonomy.

Beneficence requires that researchers minimise the potential risk to subjects, and that any remaining risk be outweighed by potential benefits to the participants and the scientific community. During WWII, horrifying experiments were conducted on prisoners in Nazi concentration camps. In response, the Nuremberg Code was developed after the war, laying out ten principles that modern-day studies must adhere to.

Research ethics in the digital age brings increasing levels of complexity. In the TV show Parks and Recreation, a company comes to town to offer its residents free Wi-Fi. It eventually emerges that, in exchange, the company is tracking everybody’s internet history and mining their data. Data mining happens all the time in everyday life, tracking our shopping habits, social media activity, and search activity. Social media platforms profit from our data, selling targeted ads which can sometimes be quite unethical, for example by being discriminatory. The laws that protect our rights in this new landscape are still being developed, raising questions about privacy, coercion, informed consent, and reasonably clear explanations. Many of us use social media platforms because we feel there is no better option, without properly understanding what that use entails and how our interactions are profited from.

Discussion Summary

What training did you receive in ethics for data collection? Had you heard any of the stories in the video before?

Some of us have had little to no training in data collection or its ethics. During our biology degree, there was one class about medical ethics, which wasn’t even compulsory and did not cover data collection and research ethics. We were not taught how to approach research involving humans, or what to do if an ethical question arose, which left us feeling a bit underprepared when we moved into research roles. Some of us have received training on a very theoretical level with no clear outline of how to address the ethical issues inherent to data collection, such as the range of biases that can creep in. Some of us haven’t had to carry out data collection ourselves, but have often received data without being encouraged to examine where it came from. This draws parallels with the Henrietta Lacks case, where her cells and data have been widely used for many applications without much attention to where they originally came from.

Others have had data collection training in clinical settings, and quite intensive ethics training in anthropology. Ethics is especially emphasised in anthropology due to the discipline’s colonial history. A few of the examples in the video were included in our training, or we have come across them separately. In our statistics education we did not have much data collection ethics training, but when later working in pharmaceuticals we were given rigorous frameworks and exposed to more legal considerations.

Moving from pharmaceuticals to technology development, ethics was perhaps on people’s minds but was not explicitly prioritised. Ethics tended to be treated as a shared responsibility, with everybody expected to listen and give space to the views of different team members. We’ve found that in practice, ethics is not something that is always considered; in one project, we’ve had to hold ethics catch-ups with people to make sure that those whose data is being collected are centred. Participants in this project had a great deal of mistrust and felt that research was being done “to them”, not for them. Broken trust is very difficult to rebuild, and reputational damage is a real hazard to be aware of.

Ensuring that participants’ rights are front and centre of research is crucial to earning trust, yet tensions can arise between different rights that are difficult to navigate. For example, the right to drop out of a survey can sometimes conflict with the right to be anonymous: if a patient asks to drop out of a survey, but their records aren’t linked to their name or any identifying information, we cannot remove them. The right to anonymity can also impinge on the right to healthcare; in the case of anonymous data collection for rare diseases, if a cure for a particular person is discovered, anonymity may prevent that person from being directly offered the cure.

Anonymity can be challenging to guarantee. People have spotted themselves (or others) in data analytics reports due to unique combinations of symptoms. If enough features about a person are recorded, such as sexuality, demographics, and disability, it becomes possible to identify them even without their name.
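As a minimal sketch of this re-identification risk (using an entirely hypothetical toy dataset; the field names `age_band`, `postcode_area`, and `condition` are illustrative assumptions, not from the discussion), the snippet below counts how many records share each combination of quasi-identifiers. Any combination held by only one record is uniquely identifying, even though no names are stored:

```python
from collections import Counter

# Hypothetical records with no names attached, only quasi-identifiers.
records = [
    {"age_band": "30-39", "postcode_area": "BS1", "condition": "rare_x"},
    {"age_band": "30-39", "postcode_area": "BS1", "condition": "common_y"},
    {"age_band": "30-39", "postcode_area": "BS1", "condition": "common_y"},
    {"age_band": "40-49", "postcode_area": "BS2", "condition": "rare_x"},
]

# Count how many records share each exact combination of quasi-identifiers.
combos = Counter(tuple(sorted(r.items())) for r in records)

# A combination held by only one record is uniquely re-identifiable, even
# without a name (this is the intuition behind k-anonymity: every record
# should share its quasi-identifier combination with at least k-1 others).
for combo, k in combos.items():
    status = "UNIQUE (re-identifiable)" if k == 1 else f"shared by {k} records"
    print(dict(combo), "->", status)
```

In this toy example, both records with the rare condition are unique on their quasi-identifiers alone, so the people they describe (or anyone who knows those few facts about them) could recognise them in a report.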

The risk of breaking anonymity by including too much information about someone conflicts with the desire to represent a diverse range of people from different demographics. Demographics can influence the generalisability of findings in important ways. We have conducted research in specific regions where the breakdown of ethnicities may mean that any findings are specific to that region and not applicable elsewhere.

Attitudes towards anonymity have changed significantly over time. Historically, researchers would write about a patient and might name them. Today, patients are asked for consent if, say, a picture of their heart taken during surgery is to be used in a scientific article. Henrietta Lacks could have been anonymous, and her cells could have been acquired ethically and with consent, but she wasn’t offered this or even told she was participating. Even though Lacks’ identity was known, she didn’t receive the support she needed, making hers a crude case of exploitation.

If researchers do not or cannot ask participants whether they want to be identified, it should be assumed that participants want to remain anonymous. Sometimes researchers may not want participants to be able to identify themselves, for example psychiatric patients who may disagree with what practitioners have written about them. We might not even know that we don’t want to read something about ourselves (such as psychiatric reports). When it is not possible to ask participants, anonymity is the safer and more ethical option; however, researchers should do their best to ask.

Whilst anonymity is an appropriate default to preserve autonomy and human dignity, participants should be given the choice of whether to be known, and there are interesting situations where people may want to give up their anonymity. In the social sciences, people may want recognition for their work or insight. It may sometimes be important for participants to be able to give constructive feedback. A key pillar of data feminism is appropriately acknowledging everybody involved in the research process. An extreme example of this is the janitor who supposedly told JFK “I’m helping put a man on the moon”. Figuring out the limits of who should be acknowledged is challenging; at the very least, we should try to acknowledge people whose bodies contribute to research.

What are the attitudes you have witnessed/experienced towards ethical data collection? How has this varied amongst different people?

Different institutions can think about ethics differently depending on where they are. When working on a study involving student participants with collaborators in the USA, the ethics board was not worried about how data collection was handled if the students weren’t at their university. We’ve also noticed that data types and data handling become more diffuse when moving from the state level (e.g. hospitals) to the non-state level (e.g. gardening groups), where the lines between research and service blur. For example, when asking participants about mental health, researchers are required to adhere to strict ethical standards. Administrative staff in hospitals, however, may ask very intrusive questions without having the same level of ethics training.

Across different settings and backgrounds, people may speak diverse ethics “languages”. Structured learning would help people bridge definitional gaps and communicate better with one another. Depending on who you are talking to, different people see different opportunities behind ethical conversations and the kinds of questions that should be asked. Often, we’ve found that people are very keen to learn about ethics but don’t know how to go about it.

Approaches to ethics showcase different points of view in society, where some find it interesting, and others find it anti-innovation and bureaucratic. Some practitioners believe that ethics slows things down and runs behind everything else, seeing it as a “nice to have” rather than a “must have”. Ethics processes can be institutionalised, putting ethics into a (usually linguistically) marked box, which can result in people viewing it as an intrusion. Scientific progress is seen as a necessity, and there is a feeling that you “can’t make an enemy of the good”.

People often switch off when talking about ethics, seeing it as red tape. Often, people are uninterested in ethics because they think it is too theoretical and offers too little action. This illuminates a gap in education: people aren’t being taught how ethics applies to them, but instead have it presented as something quite abstract. Ethics can be pretty dry when taught in the abstract, and educators need to make it mean something to people. Rather than just talking about ethics, people should actually be doing things to make change happen.

The gap between theoretical ethical discussions and making change happen is exacerbated by the “black boxing” effect, where people only look at their part of the pipeline without seeing where data comes from, or where it goes afterwards. Bystander syndrome plays a big role in people failing to step in or question the data collection process. Ethical considerations are offloaded to others, who offload them again, meaning that nobody takes accountability. It’s important to get people to think about the ultimate purpose of the data they’re collecting or working with: what the end goal is, and what the application is.

Whilst it is important to adopt a wider perspective, it is not easy to foresee how research will evolve over longer time scales. Even if doctors had obtained Lacks’ consent, it would not have been possible to inform her about everything that would happen with her cells. Similarly, data collection that happens today may seem innocuous, but it is difficult to say where the data could go, what might happen to it, and what the unpredictable systemic effects will be. Consent is not a strong enough regulatory tool to encompass all of these possibilities.

Lots of the stories in the video present a tension between ethical frameworks, or obtaining proper consent, and scientific progress. Before Henrietta Lacks, no human cell line had been successfully taken and grown in culture. Now, HeLa can be used for all sorts of applications that couldn’t previously have been imagined. Lacks’ cells being taken, used, and found to be immortal was a series of happenstance. If cancer cells are taken by the NHS today, they are destroyed. The gastric example of Dr Beaumont and Alexis St Martin highlights a similar tension between ethics, consent, and scientific progress. We wondered whether people are entitled to their own bodily materials; we never sign an informed consent document when we get our hair cut!

When thinking through the ethical implications of a study, researchers should take a step back and look from a system level at where things may go. Perhaps the process should place more emphasis on being inquisitive about the data we receive. A systems perspective should cover who is being included; often the least “fixable” populations are the ones that get the least research, creating self-reinforcing cycles. From a participant perspective, we often don’t see the outcomes of data we have volunteered for research, or get told what its greater purpose is.

Being kept in the dark about research outcomes may make people less likely to volunteer to participate. We should perhaps be better at showing and communicating outcomes to ensure that people are kept informed about research progression. Transparency about how people are using data helps everybody involved learn: participants can see how scientific practice unfolds, and researchers are kept in touch with the broader system they are operating within. Proper communication of research comes down to funding, as money is needed to disseminate research in ways that make it accessible.

The line between “participant” and “user” on online platforms makes data collection ethics quite blurry. How do you think research ethics should be applied in these digital spaces?

The video was made almost 8 years ago, and it would be interesting to see how it would differ if it were made today. Things have changed a lot in the past 8 years, particularly in digital spaces: the handling of privacy, the sorts of data sources being captured, the kinds of data we interact with, and the regulation that exists for data handling. We’ve seen that regulation doesn’t always lead to ethical behaviour, but ethics can help to improve regulation.

Investigating the nuances of research ethics in digital spaces is aided by looking to other domains where ethics is complex and consent can be ambiguous. A lot of the time, anthropological research involves working with people who may be illiterate or unable to understand consent as framed from a Western perspective. In anthropological training, consent is seen as a legalistic protection of the institution, separate from ethics. These days, data law is very complex, and it seems impossible for consent to be both understandable and rigorous enough to cover all the cases it should; there is a trade-off between being understandable and being comprehensive.

The handling of data in research generally, and especially in the digital era, is not always approached with the mindset of asking what could go wrong; instead, it tends towards thinking positively and in terms of opportunities. People would rather have a bad something than a good nothing, and there is pressure to be productive with the data we have access to. An emphasis on production and advancement can bypass proper ethical assessment, as people are encouraged to move fast rather than to think about how not to break things. Looking at things retrospectively makes it easier to identify what went wrong and how, but it can be difficult to see where all the hazards are in the moment. We may learn about famous disasters, and recognise the mistakes that were made, but it is difficult for people to envision themselves making those same mistakes. It’s possible for people to get things really wrong without intending to be evil or to do bad science.

Attendees

  • Huw Day, Data Scientist, University of Bristol: LinkedIn, BlueSky

  • Jessica Woodgate, PhD Student, University of Bristol

  • Zoë Turner, Data Scientist, NHS 😃

  • Rosie Jones McVey, Anthropologist, University of Exeter

  • Chiara Singh, Research Development Associate, University of Bristol, https://www.linkedin.com/in/chiara-singh2020/

  • Tim Binding, Data Scientist, Plymouth City Council

  • Naomi Cornish, PhD student, University of Bristol