Data Ethics Club: AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking#
Article Summary#
Critical thinking is the ability to analyse, evaluate, and synthesise information to make reasoned decisions. Cognitive offloading occurs when individuals delegate cognitive tasks to external aids, which reduces engagement in deep and reflective thinking. AI tools provide a means for cognitive offloading by being delegated tasks such as memory retention, decision-making, and information retrieval. Cognitive offloading can free up cognitive capacity for more complex and creative activities; however, there is also growing concern that cognitive offloading with AI can reduce cognitive effort and foster "cognitive laziness", causing cognitive skills such as memory retention, analytical thinking, and problem-solving to atrophy.
The paper investigates the influence of AI tool usage on critical thinking skills, with a particular focus on cognitive offloading as a mediating variable. Using a mixed-methods approach combining quantitative and qualitative techniques, the study had 666 participants from the UK complete a structured questionnaire. The questionnaire included items from the Halpern Critical Thinking Assessment (HCTA) and Terenzini's self-reported measures of critical thinking development. Semi-structured interviews were then conducted with a subset of 50 participants. Participants were recruited through advertisements on social media platforms and categorised into three age groups: 17-25 years (young, 16.517%); 26-45 years (middle-aged, 48.199%); and 46 years and above (older, 35.285%). Findings indicate that higher use of AI tools is associated with reduced critical thinking skills, and that cognitive offloading plays a significant role in this relationship.
Discussion Summary#
What did you think of the paper's claims that "Higher AI tool usage is associated with reduced critical thinking skills" and "Cognitive offloading mediates the relationship between AI tool usage and critical thinking skills"?#
The use of the term "mediates" is ambiguous, and we were unclear whether the claim that "cognitive offloading mediates" is trivial or substantive. In statistics, a mediation model seeks to identify and explain the mechanism underlying a relationship between variables through the inclusion of a hypothetical "mediator" variable. Yet it is unclear whether this is the definition the paper is referring to. The language used in the paper creates distance and avoids identifying a direct cause, which is not very informative but perhaps deliberately so. Similar language is used in information systems (IS), a field at the intersection of technology and business with a touch of psychology, which examines how components such as computer programs, physical devices, and networks fit together. IS often discusses how things are connected rather than focusing on causal relationships.
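For readers unfamiliar with the statistical sense of the term, the sketch below shows what a regression-based (Baron-Kenny style) mediation analysis looks like. It is not the paper's analysis: the data are simulated and the variable names (ai_use, offloading, critical_thinking) are our own illustrative assumptions.

```python
# Minimal sketch of a regression-based (Baron-Kenny style) mediation analysis.
# Not the paper's analysis: data are simulated and variable names are assumed,
# purely to illustrate what "cognitive offloading mediates" would mean formally.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 666

ai_use = rng.normal(size=n)                                   # X: AI tool usage
offloading = 0.6 * ai_use + rng.normal(size=n)                # M: cognitive offloading
critical_thinking = -0.5 * offloading - 0.1 * ai_use + rng.normal(size=n)  # Y

# Total effect of X on Y (ignoring the mediator).
total = sm.OLS(critical_thinking, sm.add_constant(ai_use)).fit()

# Direct effect of X on Y, controlling for the mediator M.
direct = sm.OLS(critical_thinking,
                sm.add_constant(np.column_stack([ai_use, offloading]))).fit()

print("total effect:   ", round(total.params[1], 3))
print("direct effect:  ", round(direct.params[1], 3))
print("indirect effect:", round(total.params[1] - direct.params[1], 3))  # the mediated part
```

A substantive mediation claim would require the indirect effect to be meaningful and robust (for example, with bootstrapped confidence intervals), not merely that the three variables are pairwise correlated.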
The effects of LLM usage on cognitive offloading and critical thinking are still uncertain. It could be that core learning moments happen in the mundane and menial tasks which LLMs are now being used for. Small and seemingly insignificant steps can provide more granular information that is later useful for identifying outliers and informing critical thinking. Not having to search for information could prevent different thoughts from being connected; if a teacher is giving you the answers, you don't learn how to find the answers yourself.
Being made to continuously perform critical thinking, on the other hand, facilitates learning. When you learn the process, you can investigate whether the answers you have fit what you already know, as you learn how and when to ask questions. The four stages of competence or "conscious competence" learning model from psychology lays out the hierarchy of competence from unconscious incompetence to conscious incompetence, conscious competence, and finally unconscious competence.
In examining the relationship between cognitive offloading and critical thinking, it is important to note that we are not comparing apples with apples. There are key differences between the terms: cognitive offloading does not necessarily influence critical thinking abilities, nor is it necessarily bad. Qualifying cognitive offloading is difficult, especially through self-reporting, as people aren't always good at judging when they are cognitively offloading. Obtaining a detailed understanding of what cognitive offloading is, and when it may be good or bad, is essential for studying the effects of AI on cognitive offloading and critical thinking.
Interesting questions arise around when something becomes cognitive offloading, why it is cognitively offloaded, and when it is acceptable to offload something. It is easy to see why AI is such an appealing tool to so many people: our brains are constantly trying to optimise our use of energy, and AI provides a handy mechanism to short-cut many tasks. However, what is deemed "optimal" for knowledge work is as much a cultural concept as anything else and can vary across place and time.
The addition of AI into knowledge tasks is not fundamentally different to the progression of other technologies over time, such as the adoption of the calculator. Calculators, however, assist with a different sort of task to LLM-based AI tools. Language plays a very different role in our lives to numbers, as language is the means by which much of our communication happens. The focus on language rather than numbers suggests that LLMs may have a different effect on critical thinking to calculators. Our framing of critical thinking could be too rooted in the present and may not take the wider picture into account.
We can think about the effects of cognitive offloading on critical thinking in relation to all abstracting technologies, such as the difference between operating systems like Linux and Windows. People are forced to exercise different skills depending on the tools they are using and the level of abstraction those tools contain. Some types of offloading may create a lot of value, such as programming in Python instead of instruction-set programming. Other types of offloading can be problematic, such as using automated navigation systems on boats instead of natural navigation, which can cause serious issues when there are system failures. Cognitive offloading in health settings is one of the selling points of AI, promising to free up time in a resource-constrained area. Supporting over-worked healthcare workers is a valuable problem to address; however, those workers have essential skills that need to be maintained. If those skills atrophy due to over-dependence on AI, technical failures could be catastrophic, as there will be no informed oversight to correct AI mistakes.
The question of when it is acceptable to cognitively offload comes up a lot in education. In the US school system, a lot of what is being learnt revolves around the use of a particular tool that someone thinks is necessary. Deciding which tools to use, what should be cognitively offloaded, and what skills should be imparted is a really difficult decision and can be messy. Currently, the tools that students are expected to learn are LLM-based, a technology that society is still figuring out how to use.
Is this a problem of lack of critical appraisal/over-trust in AI output, or is it that more generally we don't critically appraise enough?#
From our experience in teaching, we have found that it can be hard to get people to think critically (sometimes it is hard enough just to get them interested in learning the concept!), so perhaps people in general don't use critical appraisal enough. Humans have a tendency to trust the information that is provided to them. Many people don't check, or don't know how to check, the sources of what they read on the internet. This carries over to the general reception of AI, where many accept it without critically appraising it.
People tend not to question LLMs because it is not communicated to them that the tools are statistical combinations of letters and words, presented in such a way as to appear very confident. The confidence conveyed in LLM outputs makes those outputs less likely to be appraised. Explainability is lacking in many machine learning (ML) models, giving them a mystical "black box" appearance that makes it challenging to see under the hood and think critically about those models.
The narrative that "nobody knows" how ML models make decisions is not something that companies are working to dispel; some companies are even encouraging the myth to make the technology appear more powerful than it really is. The language used by LLMs pulls the tools into a fictional realm where people can emotionally connect with them, as outputs can easily be prompted to sound like a psychologist or to mimic romantic relationships. It is important, however, that people know how to critique AI-generated outputs, as there are still glaring issues with the technology. For example, the foundations of some of the data LLMs are trained on are problematic, as companies use any data they can, including data that is biased or toxic.
A lot of the time, LLMs are a hammer in search of a nail. In their current state, LLMs are not particularly versatile but appear to be so due to their use of language, which leads to people deploying them in all sorts of contexts. LLM companies claim that LLMs take care of busy work and free up time for heavier cognitive work, but we wondered what exactly is meant by heavier work and whether everyone has opportunities for it. It does make sense that people will want to use LLMs in certain contexts, with the incentive that if you can cognitively offload lots of things, you can spend the rest of your time elsewhere, in places where that time may be more valuable. We have had jobs involving menial tasks and can see that AI could help with those tasks. However, we need to already have experience and know how to perform those tasks ourselves before we can know how to apply AI to them.
We wondered how education can teach critical thinking better and how we can get people to engage. Education will need to flex and change to adapt to AI, a part of which may involve getting people to continually practice critical thinking alongside using AI rather than trying to separate out the two. Teachers need to make sure they are bringing educational scenarios back to concrete examples of tangible problems, so that students can relate (as discussed in a previous Data Ethics Club).
From our teaching experiences, we have found that one of the most common ways AI is used in academia is for translation. Many international students carry out assignments in their native language and then use LLMs to translate what they have written into English. Cognitive offloading may not be the relevant term here, which exemplifies how AI is used for many other skills and functions that may not involve cognitive offloading. Narrowing in on cognitive offloading may thus miss some relevant aspects.
Using LLMs for translation will be intuitive for lots of students, and although university guidelines do say that it is cheating, we have found that many are not aware of that. Perhaps students should be able to write their assignments in the language they are studying in, or perhaps this matters more for some disciplines (e.g. philosophy, arts) than others (e.g. maths). AI translation may miss some conceptual nuances that a human would be able to identify.
There's a lot of associations in the paper, but is there much evidence of a causal relationship?#
The paper seems to imply a causal relationship without saying it outright. There may well be a causal relationship between AI usage, cognitive offloading, and critical thinking, but the paper does not do enough to substantiate these claims. The premise of the paper is that people aren't critically reviewing the output of AI tools, yet we were unclear as to whether the paper is attempting to prove this. There are plenty of statements in the paper that could be true but aren't backed up by the paper itself, as the evidence is insufficient. There seems to be an implicit expectation that the results connect cognitive offloading and critical thinking, yet there is not enough linking between the two. Although not listed as either of the proposed hypotheses, there also seems to be a hypothesis that AI might affect young people in a different way to older people, as the age groups are compared. Further studies need to be conducted to determine whether there is a relationship.
A lot of statistical tests are thrown at the problem, but not in the most informative way, and some of the correlations are not justified. The tests performed are not comprehensive and there are some missing questions for certain topics such as cognitive offloading. There is also much reuse of citations both in motivating the problem and backing up the method.
There are significant methodological issues with the participant sampling, as the sample is not very representative. The study recruited a lot of PhD students and used the internet to source participants, which means that the sample will be biased towards those who are more technologically savvy and thus more likely to be using AI. We wondered how people who don't use AI were contacted. There is minimal information about who was selected for the semi-structured interviews. There is also not much information about how many people were needed for each age group; the paper states that the required sample size was 384 participants and the actual sample size was 666.
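As an aside, 384 is the figure produced by the standard Cochran sample size formula under common defaults (95% confidence, 5% margin of error, maximum variability), though we cannot confirm that this is the calculation the paper used; the snippet below is purely illustrative.

```python
# Where a required sample size of 384 commonly comes from (our assumption,
# not confirmed by the paper): Cochran's formula with 95% confidence (z = 1.96),
# 5% margin of error, and maximum variability p = 0.5.
z, p, e = 1.96, 0.5, 0.05
n_required = (z**2 * p * (1 - p)) / e**2
print(round(n_required))  # 384
```

Note that this kind of calculation says nothing about how representative the sample is, only how large it needs to be under those assumptions.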
As the study relies on self-reported measures, we wondered if people were aware of the changes in their skills and whether this influenced their survey responses. Self-evaluation is fluid, which we have experienced ourselves insofar as we used to think we were good at critical appraisal but then began to doubt it. Responses may have been influenced by the Dunning-Kruger effect, a cognitive bias in which people with low ability give overly positive assessments of their ability. The more information you have on something, the better able you are to evaluate how much you know about it. It can be difficult to evaluate your own understanding of something, as understanding can range from having an overview to in-depth knowledge.
In health research, things have to be much more explicitly connected and accounted for, and through this lens the paper does not present enough evidence to prove a relationship. We have experienced people making technical claims about LLMs and using those claims as evidence of why they're bad, without proper justification. This echoes wider ML research practices, which have been criticised for lacking scientific rigour.
Whilst we may agree with some criticisms of LLMs, critiques and claims need to be accurate and robust. It is not convincing to argue that something is bad based on flimsy evidence. Propagating research without proper evidential support has real consequences and can be difficult to roll back once it is out there. Overturning bad science takes work and can make science look untrustworthy, as with the enduring falsehood that vaccines cause autism. It takes a lot more work to refute a lie than to make one, as conceptualised in Brandolini's law (or the bullshit asymmetry principle). Brandolini's law states that the amount of energy needed to refute bullshit is an order of magnitude bigger than the energy needed to produce it.
Given that many people are not good at or do not generally employ critical appraisal, we wondered what conclusions most people would draw from this paper. We found the paper to be long-winded and a bit confusing. Bad uses of statistics can undermine trust in the process of science, possibly resulting in disengagement. Many problems boil down to numeracy around statistics and understanding what the numbers really mean. Statistics can be hard to demystify, even for experts. We have found that explaining to someone why the conclusion they have reached is wrong is really hard.
What change would you like to see on the basis of this piece? Who has the power to make that change?#
Cognitive offloading is difficult to demonstrate and measure, so it may be hard to find more than anecdotal evidence. The nature of the study design means it is difficult to infer natural behaviours, as it relies on self-reporting and interviews rather than observation. Some of the issues could be because the paper is a bit stuck between methodologies: providing sound evidence of causality is difficult, and mixing methodologies detracts from the clarity of the paper. Observational studies could be more useful for examining how these things work in the real world and in different contexts, taking wider system factors into account.
We thought about how we would do this study better. An alternative methodology that looks at the settings in which cognitive offloading happens might be more appropriate for inferring causal connections. Another approach could be to accompany self-reporting on AI usage with objectively measurable observations and more standardised testing apparatus. For instance, a cross-over study compares two or more treatments by giving participants the treatments in a pre-determined sequence. This could look like splitting a coding class in two, giving one side access to LLMs and the other none, then swapping partway through. Their comparative performance could then be evaluated in terms of speed, quality of work, and so on (see the sketch below).
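To make the idea concrete, the sketch below shows how data from such a design might be analysed. The scores are simulated and the whole setup (a coding class working with and without LLM access) is our own illustrative assumption, not an existing study.

```python
# Minimal sketch of analysing a two-period cross-over design, under our own
# illustrative assumptions: each participant completes coding tasks once with
# LLM access and once without, in a counterbalanced order, and we compare
# within-person performance. Data are simulated, not from any real study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 40  # hypothetical class size

# Simulated task scores (0-100) for the same participants in both conditions.
score_without_llm = rng.normal(loc=65, scale=10, size=n)
score_with_llm = score_without_llm + rng.normal(loc=5, scale=8, size=n)

# Paired comparison: does the same person perform differently with LLM access?
differences = score_with_llm - score_without_llm
t_stat, p_value = stats.ttest_rel(score_with_llm, score_without_llm)

print(f"mean within-person difference: {differences.mean():.2f}")
print(f"paired t-test: t = {t_stat:.2f}, p = {p_value:.3f}")
```

A full cross-over analysis would also check for period and carry-over effects (for example, skills practised while using the LLM persisting into the no-LLM period), which is one reason the design is harder to run cleanly than a simple parallel-group split.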
A related study, "Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task" by Kosmyna et al. (2025), split participants into three groups: using an LLM, a search engine, or no tools. The study examined cognitive load during essay writing using electroencephalography (EEG) as well as essay scores from human teachers and AI, finding that LLM users showed under-engagement compared to the other groups.
Attendees#
Huw Day, Data Scientist, University of Bristol: LinkedIn, BlueSky
Jessica Woodgate, PhD Student, University of Bristol
Robin Dasler, data product manager, California
Fearghal Kavanagh, Software Developer (former Optometrist), London
Hessam Hessami, Data Scientist and Founder at Ethiquette AI
Emma Tonkin, data manager in digital health at Bristol
Joe Carver, Data Scientist, Brandwatch
Kamilla Wells, Citizen Developer / AI Product Manager, Brisbane
Chris Jones, Data Scientist