Data Ethics Club: The Myers-Briggs Test Has Been Debunked Time and Again. Why Do Companies Still Use It?

What’s this?

This is a summary of Wednesday 22nd May’s Data Ethics Club discussion, where we spoke and wrote about the article “The Myers-Briggs Test Has Been Debunked Time and Again. Why Do Companies Still Use It?”. The article summary was written by Amy Joint and edited by Jessica Woodgate. The discussion summary was written by Jessica Woodgate, who tried to synthesise everyone’s contributions to this document and the discussion. “We” = “someone at Data Ethics Club”. Huw Day, Vanessa Hanschke, Amy Joint, Nina Di Cara and Natalie Thurlby helped with the final edit.

Article Summary

The Myers-Briggs Type Indicator is a questionnaire indicating which distinct ‘personality type’ someone fits into – you might also know it as MBTI or 16 Personalities. It was first developed during World War Two by a mother and daughter aiming to help women entering the industrial workforce to find roles that fit them best. Since then, it has been used extensively in recruitment and teambuilding settings, making the Myers-Briggs Company £20 million annually.

Companies often receive far more applications to jobs than an individual hiring manager can consider, so they look for lawful ways to screen out candidates that do not discriminate on the basis of protected characteristics.

Beyond recruitment, MBTI is used to coach people in making career decisions based on their personality, to convince people they were born to do their current role, or to show them that their individual personality type is an important part of their wider team. However, the test has been debunked over and over again – including the idea that someone can be inherently either extrovert or introvert, intuiting or sensing, thinking or feeling, and judging or perceiving. There is little evidence that results are useful in determining managerial effectiveness, building teams, or providing coaching. Test results are often inconsistent when repeated. People’s reliance on, and attraction to, these “Sorting Hat”-like practices can be put down to the human need for systems that explain things simply, and the desire to fit together work and personal identities.

Considering the test was conceptualised to help women slot into a heteronormative, cis-gendered, patriarchal workplace in the 1940s, companies using this test today risk feeding historical bias within the workplace. By categorising people into one of 16 boxes, the test risks artificially limiting what people feel they are good at or can achieve, without the flexibility to admit they are still working out their strengths and what they are interested in professionally.

Discussion Summary

What are the consequences (positive and negative) of relying on personality type testing in the recruitment process?

We see positive consequences of using personality tests in contexts where there are not enough resources to go through each candidate individually, by helping to reduce the list of candidates quickly and cheaply. Using personality tests to reduce candidates might be more appropriate than looking solely at CVs as recruitment involves more than just examining someone’s skillset: it also involves assessing how someone will fit within a team and intuiting how they will be to work with. To some extent, MBTI can help with this more relational side of recruitment, as it holds some grounding in the natural human tendency to categorise and put structure to phenomena in an attempt to understand the world better.

Utilising methods that mimic human processes of categorisation is supported by evidence for the importance of these processes in how we socialise. Putting people into types allows us to make quick judgements about whether they are (un)trustworthy and if we can cooperate with them. Evidence of quick social judgements can be found as early as infancy; even before babies have a language to communicate, they can still respond to social cues. One type of these cues is ostensive cues (signals that precede social attention cues and alert infants to pay attention), which are the subject of ongoing research at the Anna Freud Centre. From the very beginning of our lives, we thus have various psychological processes which help us to respond to, and project structure onto, the world around us. This structure allows us to obtain information about personality and behaviour. It has been found that AI can also mimic something about implicit social categorisation processes, evidenced by predicting personality from eye movements and static facial images. Evidence for processes performing social categorisation lends support to the idea that we could develop systems in mimicry, like MBTI.

There are various benefits that arise from personality categorisation, as it enables us to explicitly assign labels to different groups. Some benefits of labelling can be seen by reference to other domains, like clinical diagnostic labels which are used to indicate diagnosis of a health condition. Clinical diagnostic labels have been found to validate concerns and improve access to treatment. Even if labels are not very accurate, they can provide a framework for communication. This framework helps to inform reactions when one applicant is a bit different to the others. As well as helping to communicate things with others, frameworks can be used to explore things about yourself; there is validity in time spent self-reflecting and self-affirming. Tests like Myers-Briggs provide a place to start from, which is useful for those who find introspection harder. Similarly, theories like love languages have been largely debunked, but provide a good place to start a conversation. Love languages can also be harnessed as a way of distinguishing ourselves. Likewise, MBTI can be seen as a way to advocate for individuality, as it was developed with the intention to help women enter the workforce.

However, whilst the test has its benefits, it shouldn’t be make or break as there are methodology issues which could lead to negative consequences. MBTI may have been beneficial in finding people jobs during WWII, but it seems unlikely that the same test is still accurate 80 years later. Analogously, IQ tests began with a laudable purpose which has been distilled and put to some questionable uses over the years. IQ tests stemmed from the Binet-Simon test, which was created to detect children who were falling behind developmentally and needed help. Though considered dubious and non-usable, mental tests were then appropriated for the army. The Binet-Simon test was later adapted by eugenicist Henry H. Goddard (who also introduced the term “moron” for clinical use). Goddard argued that “feeble-minded” people (those who didn’t perform well on the test) should be prevented from giving birth. Today, IQ tests are accepted as not being valid measurements of intelligence in a broad sense. Similarly, MBTI perhaps had good intentions when originally developed, but falls down in implementation so many years later.

In psychology, the MBTI test is generally considered as having no scientific basis. Reliability is low, with people obtaining different classifications when they retake the test. It seems to us that low reliability is related to the attempt to put people into buckets, where the list of buckets is ill-defined. There are alternative personality tests with better validity, for instance the PEN personality test, or the Big Five test, which has been the subject of longitudinal studies and has been found to be consistent and reliable. All these tests draw inspiration from Jungian theory, using statistical analysis to identify which concepts are significantly related to someone. Despite evidence that tests like the Big Five are more valid than MBTI, there are still issues with all tests as they rely on questionable methodologies like self-reporting. In the context of recruitment, however, this might not be more problematic than relying on CVs (which are also self-reported).
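To make the point about buckets concrete, here is a minimal Python sketch. It was not part of the discussion and is a toy model rather than an analysis of real MBTI data. It assumes that each of four dimensions is really a continuous trait, that each sitting of the test adds some independent noise, and that a "type" is just the sign of each observed score. The function name `simulate_retest` and its parameters (`n_people`, `n_dims`, `noise_sd`, `seed`) are hypothetical, chosen only for illustration.

```python
import random


def simulate_retest(n_people=10_000, n_dims=4, noise_sd=0.5, seed=0):
    """Return the share of simulated people whose type changes on retest.

    Toy assumptions (not taken from any real test):
    - each dimension is a continuous trait drawn from a standard normal;
    - each sitting observes the trait plus independent Gaussian noise;
    - the "type" on a dimension is whether the observed score is >= 0.
    """
    rng = random.Random(seed)
    changed = 0
    for _ in range(n_people):
        flipped = False
        for _ in range(n_dims):
            true_score = rng.gauss(0.0, 1.0)
            first = true_score + rng.gauss(0.0, noise_sd)
            second = true_score + rng.gauss(0.0, noise_sd)
            # A person near the cut-off can land on either side of it.
            if (first >= 0) != (second >= 0):
                flipped = True
        if flipped:
            changed += 1
    return changed / n_people


if __name__ == "__main__":
    for sd in (0.0, 0.25, 0.5, 1.0):
        share = simulate_retest(noise_sd=sd)
        print(f"noise_sd={sd}: {share:.1%} change type on retest")
```

Under these assumptions, nobody changes type when there is no measurement noise, and the share who change grows as the noise grows, even though each person's underlying traits are held fixed. Hard cut-offs turn small wobbles near the boundary into a different label.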

A major concern we have with personality tests like MBTI is that they massively oversimplify big psychological and philosophical questions. For instance: how extraversion interacts with neuroticism; what the difference is between someone’s experiences and their personality; whether we are the same person if most of our cells are renewed every seven to ten years; and whether this changes the makeup of our psychology.

Oversimplification also emerges in attempting to quantify the qualitative, as numbers are not expressive enough to describe a person in entirety. By attempting to distil qualities into numbers we seek to resolve uncertainties, thereby avoiding the reality of how complicated people are. Finding your own identity can be difficult enough. For instance, depression can reduce our sense of identity. It is okay to say you’re still working out who you are. If we ourselves don’t know who we are, we might find it unlikely that a test could know. In addition, tests are not sophisticated enough to identify whether people are purposefully misrepresenting themselves; someone psychopathic could present themselves as charming and friendly.

Complexity in defining identity is exacerbated by the fact that our personalities are dynamic, changing over time. Tests like MBTI cannot capture the fluidity of personality, as they only give us results from a point in time. This doesn’t take into account how people can switch from introverted to extroverted depending on the social context, the type of business or activities they are doing, their mood, how tired they are, and many other factors. Throughout our day, throughout our life, we change who we are. Perhaps personality tests should incorporate non-stationarity by adapting to us as we grow.

Whilst they have many issues, tests like MBTI are popular because organisations seem intent on obtaining shortcuts to find out who you are and how they can label you. Tesco, for instance, uses personality testing, as do many publishing jobs with high numbers of applicants. By putting people into artificial categories, personality tests bypass spending time getting to know you. This might have been appropriate when recruiting for the war effort, in a time of desperate need. However, the tests distil who you are into who you think you are on the day you take the test rather than who you really are.

Overall, whilst they might have some benefits and be fun to do, personality tests cannot capture the complexity of personality. They do not, therefore, seem like a good way of determining decisions. Rather than deterministic indicators of who we are, we should use personality tests as a tool to spark introspection.

How do personality tests contribute to bias when used within existing organisations and teams?

One form of bias that personality tests introduce is confirmation bias: once you’ve been assigned a label, it is easy to seek out evidence in support of that label. Just as with horoscopes and fortune cookies, if you believe it’s true then you will find links that confirm it. Even if we do not consciously label ourselves as a certain personality type, or the results aren’t accurate, confirmation bias can affect the way that others perceive us, or seep into the unconscious, so that results become self-fulfilling prophecies.

Confirmation bias can encourage labels to become integrated into your identity, either in the way others (e.g. employers) perceive you or your perception of yourself. When a label becomes a part of your identity, you are subject to certain norms that come along with that label, such as being expected to carry the conversation as an extrovert. It might be difficult to accept a label which is assigned to you, and the expectations that come along with that. On the other hand, people might use labels as an excuse, for example for introverts not engaging with others. We are not yet in a society where it is easy to ebb and flow between different labels.

Other forms of bias may emerge in how you perceive the labels of others. We’ve seen personality tests used to assist in understanding how to interact with someone if you perceive them as fitting in a certain category. On one hand, this could help to make sense of how to work with different types of people. On the other hand, assigning labels causes perception problems by encouraging binary views of the world. Perceiving the world in binary ways influences the way we approach others, judging people before you actually know them. Openly sharing your personality categorisation (such as in the example given in the article of people sharing their categories before every meeting) could give people an opportunity to make assumptions about what opinions you might have before you have even fully formed them yourself.

As there are many assumptions behind the MBTI categories, using MBTI to filter out certain personality types could result in discrimination and cultural bias. For example, MBTI may register people with autism as high introversion, and people with ADHD as high extroversion. Quantifying personalities varies in different countries and according to different lifestyles, as culture plays a part in shaping our personality. For instance, Brazilians score high while Nigerians score low for extraversion. Tests like MBTI were developed in a particular society with a particular group of people in mind; imposing that test on other cultures without considering their own values, practices and interpretation of personality has colonial undertones. There may also be gender differences in personality traits. Applying a standardised test without considering contextual nuances could thus result in disability, cultural, or gender discrimination; these problems can compound if test results are automated.

In social work, job applications are often now fully anonymised, with no names or ethnicity. There is a potential that MBTI could be introduced to the recruitment process to recover information about candidates. Introducing MBTI could present an opportunity for bias to enter the pipeline. However, it is also true that human interviewers will have their own biases and tendencies to hire people in their own image. Recruiters get too many applications to meaningfully engage with each one, and they have to make quick decisions which are susceptible to bias. There are frameworks and quotas in place that can help ensure diversity, but they can be a double-edged sword. If personality tests could be improved to be more objective, they could be a useful way to mitigate interviewer bias.

Considering the biases implied by personality tests and the tendency to overcategorise, we could envision dystopias where processes like personality testing are aggressively automated, testing infants and deciding their future from there. The idea of society pre-defining our roles is explored in the film Gattaca, where genotype profiling is used to qualify people for professional employment.

Are there ethical considerations for how organisations might store and use MBTI/personality typing data beyond the original tests? What duties do employers have for using these responsibly?

In regard to how MBTI data is used, we wondered how employers decide what “good” is with respect to the results of a personality test, and how that information is applied once it is obtained. If companies use it for recruitment, it seems likely that they would also use it for team evaluation. We wondered whether MBTI implies that a “good” group dynamic is having a mix of people, or that certain people are more suited for certain tasks. In our experience, having different personality types within a team is excellent for creativity and productivity; when putting a team together, it is important to look across the staff cohort. MBTI could help to create diverse groups, or it could be the case that employers implement it to create teams of uniform personalities. Either way, employers should also be listening to the voices of their employees, and hearing feedback about what teams are working well together.

In terms of data collection, whilst aggregating results might have useful applications, anonymisation is important. Ethical concerns are the same as for any other type of personal information; it is important that data usage is transparent, data is only stored when necessary, and it is clear why data needs to be stored. Personality data should only be used in the best interests of the people tested; for instance, someone shouldn’t be fired because there is an imbalance of a certain personality type.

Bonus question: What change would you like to see on the basis of this piece? Who has the power to make that change?

Personality tests have a dehumanising element to them, which we would like to see reduced. Whilst tests might be able to identify some similarities and differences between people, it is much better to understand your colleagues through your own personal experience of them. It seems to us that it would be much better to just ask people how they would respond to particular situations; to learn what their values are, independent of your perception of their personality type. The wrong questions are being asked: rather than asking “when do you do this”, personality tests ask “what do you do” without any context or situational cues. Perhaps it would be more appropriate to talk about social graces, discussing different issues, and seeing how people respond. We should want to support people to express themselves as individuals and provide space for people to develop themselves. Alternatives to personality tests could be having more in-depth interviews or trial periods. Diversity should also be sought in hiring panels and candidate pools. However, to be appealing to managers, solutions should be cheap and easy to implement.

In general, the unsound methodology leads us to think that organisations should not be using tests like MBTI. Personality tests measure a fixed point in time. If used in isolation, this could result in people being stuck with labels that don’t give much information about who they really are as they grow and their personality develops. However, instead of using them once and never revising the results, personality tests could be used in ways that harness the dynamic nature of personality, measuring shifting environments and adapting workplaces to changes in their employees. In practice, companies tend to use approaches like employee engagement surveys.

Attendees

  • Natalie Zelenka, Senior Fellow of Health Informatics, University College London, NatalieZelenka, @NatZelenka

  • Nina Di Cara, Research Associate, University of Bristol, ninadicara, @ninadicara

  • Huw Day, Data Scientist, Jean Golding Institute, @disco_huw

  • Amy Joint, Programme Manager, ISRCTN Clinical Trials Registry, @AmyJointSci

  • Vanessa Hanschke, PhD student, University of Bristol

  • Zoë Turner, Senior Data Scientist, The Strategy Unit (NHS Midlands and Lancashire), @lextuga007

  • Kamilla Wells, Citizen Developer, Australian Public Service, Brisbane

  • Marissa Ellis, Founder of www.diversily.com

  • Dina Molnar, PhD student in Digital Health and Care, University of Bristol

  • Callum Scott, PhD, University of York

  • Owen Bowden, Insight & Analytics Lead, Mencap

  • Robin Dasler, Data Product Manager