Data Ethics Club: Do AI Companions Understand?#

Article Summary#

AI companion services are AI-assisted chatbot tools designed to be fluent, persuasive, sometimes humanlike, and even empathetic. The services are made to appear personal and able to engage in meaningful conversations. However, in the USA several youth suicides have been linked to AI companion services, and the risks of these technologies have been highlighted in the news and in academic research. These concerns include the worry that teens may become detached from reality by coming to regard AI companions as feeling, sentient beings.

To develop understanding of how young people in the UK are integrating AI companions into their daily lives, the article presents a survey of 1,009 teenage AI companion users aged 13-18. Respondents were asked about the frequency and nature of their interactions with AI companions, their motivations for use, trust in advice, social impact, and attitudes towards AI capabilities and supervision. Key takeaways from the findings include: almost a third of respondents report interacting with AI companions on most days; 10% of respondents frequently confide in AI companions about important or serious issues, with over half having done so at least once; almost a quarter of 13-15 year old respondents think AI companions can feel; almost a third report finding AI companions more satisfying than real-life friends; and 7% state that AI is replacing some of their human friendships.

Respondents’ answers reflected a duality: wanting both practical, on-demand advice and a non-judgemental confidant for emotional support. Yet many still find conversations with friends more satisfying, with over two thirds reporting that AI companions don’t affect their human friendships at all. Just over half of respondents reported trusting the information they receive from AI companions. 37% of younger teens and 44% of older teens stated that they thought a little parental supervision over usage was needed.

From these findings, the paper emphasises the need to cultivate developmentally appropriate AI literacy skills so that young people can understand what AI systems are doing. Artificial systems that present as if they are exhibiting real care could be especially problematic for younger users, who are more impressionable. The article suggests that policy development should carefully consider the role of emulated empathy: using AI-based technology to copy, simulate, mimic, and display the appearance of human empathy.

Discussion Summary#

What role should/could parental supervision play in teenagers’ use of AI?#

Parental supervision involves a difficult standoff between freedom and protection. Developing a self that is independent from your parents is a core part of being a teenager, and restricting this could be developmentally undermining. Policing children breaks trust, potentially causing friction within families. Requiring supervision also shifts accountability and puts pressure onto parents. However, it is important to acknowledge that parents are legally accountable for their children until they reach the age at which they are legally considered adults responsible for their own actions.

The extent and type of parental supervision could vary significantly. One approach could be full supervision, observing all interactions a child has with AI companions. Full supervision may be laborious for both the parent and the child, potentially putting strain on the relationship. Teenagers have always found ways to circumvent parental supervision, and perhaps AI companion services are just one more channel to get around it. Instead of reading the chat history, an alternative approach could be to understand the social interactions of the teen more broadly. Parents could encourage higher level discussions about AI and how the teen understands it; addressing not what the teen is saying to the chatbot, but how they are talking to it, why, and the amount of time or energy they are putting into using the tool.

What works changes over time, and good parenting 50 years ago may look quite different to good parenting today. Whilst older generations could feel inclined to think that because something worked for them, it should work for others now, this isn’t always the case. Teens today may have more opportunities than past teenagers in some respects and fewer in others, as society evolves and so do the circumstances we exist within. Rapid technological development and the prevalence of AI, for example, mark a stark difference between the lives of teenagers today and in the past.

Increasing availability and convincingness of services like AI companions could shape teenage development in odd ways if we do not carefully consider how these tools should be managed. We already see how AI influences the way that we present and discuss issues; for example, the standard format of an LLM answer is increasingly common in written text. We wondered whether this influence might reach further into our psyche as tools become more integrated into the social domains of our lives (such as AI companion tools), interfering with our interpersonal social relationships.

For teenagers especially, the effects of AI influence on social development could be problematic. LLM tools reduce friction in interactions through their sycophantic nature, which prioritises agreeing with and placating the user instead of challenging them and making them question whether they are behaving in the right way. We worried about whether this encourages delusional beliefs and self-aggrandisement. Replika is an AI companion service whose slogan is “Always here to listen and talk. Always on your side.” If this becomes the standard that children learn to expect from their friends, we wondered how it may affect the way they form friendships. We need to consider the limits of what these tools can and can’t do, such as humour and inside jokes, and the effects that this has on social development.

Examining the limits and boundaries of AI companions draws attention to how they are defined and where those definitions come from. The line where an LLM goes from homework helper to social outlet is blurry, yet it is important to distinguish between using an LLM as some form of search tool and using it for emotional support. This made us wonder in which situations teenagers are using AI companions. Perhaps an appropriate framework for children and teenagers to use LLMs is via institutional interactions, such as students being asked by teachers to start chats about the results of a science test, creating the opportunity to have a conversation about their learning and fill in any gaps in their knowledge. This did lead to thoughts, however, about brand loyalty and big companies signing contracts with schools to try and recruit customers when they are young. It has been suggested that a valuable application of LLMs is to help educators mark the work of students. However, even if we hate marking, we feel that anything requiring creativity and subjectivity should not use LLMs. Educators are employed to share their brain with students, and we shouldn’t offload this process to intermediaries.

Deciding whether teenagers should be using AI companions relates to the social media debate. Research from the platforms themselves, such as Facebook, shows that most teenagers do not benefit from, and might be harmed by, using social media tools. Yet it’s also true that social media can be a vital lifeline for queer and neurodivergent kids, and taking it away means these children have one less outlet. Rural and remote accessibility to technology did not play a very large role in the discussion and subsequent passing of the social media ban in Australia.

Talking to AI may sometimes be better than talking to the people who happen to be around, or to complete strangers with potentially nefarious intentions. It could also be an option for those who can’t afford therapy. Ideally, however, at-risk users would be able to disclose information to social workers, as the way LLMs are currently set up skips safeguarding measures. That taking away these tools would also remove an outlet for vulnerable groups is perhaps part of a broader systemic problem with social support and technology regulation.

We thought that the results about opinions on parental supervision could have been reported more clearly. Page 14 states that “the most common view among teen users (37% of younger teens, 44% of older teens) is that ‘a little supervision’ is wanted”. Yet, if you look at table 5 on page 11, 85% of younger teens and 79% of older teens want at least a little supervision. This creates a very different picture of the findings. To get a clear picture of attitudes, we thought that the cumulative percentage across ‘a little’, ‘some’, ‘a lot’, and ‘full’ supervision should be collated.
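
As a minimal sketch of what we mean by collating the categories, the snippet below sums the share of respondents choosing each supervision option other than ‘none’. Only the 37% figure for ‘a little’ comes from the paper; the other category shares are hypothetical placeholders chosen so that the total matches the 85% quoted above.

```python
# Hypothetical shares for younger teens; only 'a little' = 37% is quoted in the paper.
younger_teens = {
    "none": 15,       # hypothetical
    "a little": 37,   # reported as the single most common response for younger teens
    "some": 28,       # hypothetical
    "a lot": 12,      # hypothetical
    "full": 8,        # hypothetical
}

# Collate everyone who wants at least a little supervision.
wants_supervision = sum(share for option, share in younger_teens.items() if option != "none")
print(f"At least a little supervision: {wants_supervision}%")  # 85% with these placeholder shares
```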

Section 3.4.3 states that “(66%) report having “never” felt uncomfortable with something an AI companion has said or done. This suggests that, for most, these platforms are providing a generally safe and positive user experience.” Do you agree with the second statement? What standard should we hold these models to?#

It is important to remind ourselves that this is 66% of those surveyed rather than 66% of UK teenagers. The title of the report (“Do AI Companions Understand? Most UK Teens Say Yes”) is a bit misleading in this sense. The authors of the paper don’t want to demonise these tools and want to avoid causing moral panic, which is often the case in the media, where the worst case scenarios tend to be the most loudly talked about. Being clear that the percentages refer only to those surveyed rather than all UK teens helps put the results into perspective.

We felt that it is not necessarily a bad thing for users to feel uncomfortable when interacting with AI companions or LLMs more generally. There is a danger of LLMs being too comfortable, which we should perhaps look out for. Being friendly and helpful can sometimes actually be unhelpful, by making users reliant and preventing them from doing things for themselves. The sycophantic behaviour of LLMs reminds some of us of our experience working in sex offender prisons, in which the inmates were often very charming. Another analogy is to CEOs who like to surround themselves with “yes” men so that feedback on their ideas is always positive. LLMs should be biased towards being honest and factual rather than helpful, but in reality they tend to lean the opposite way, such as by replying to questions that make no sense with overly confident answers. LLMs also have a tendency to flip-flop a lot, where the output switches between one answer and its opposite.

To mitigate some of the sycophancy issues, one approach could be to get LLMs to reflect on the questions they’ve been asked. Instead of trying to immediately answer the question, an LLM could ask the user “why are you asking me this?”. We’ve found that specifically asking LLMs not to validate everything you say and to give some pushback makes our interactions with them more useful. However, companies may not be commercially incentivised to incorporate these measures; ChatGPT 5.0 was purposefully made less sycophantic and users complained, which led to OpenAI ramping up the sycophancy once more.
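
A minimal sketch of the kind of instruction we found useful is below, written against the OpenAI Python SDK; the model name, the example user message, and the exact wording of the system prompt are our own illustrative assumptions rather than anything from the paper.

```python
# Sketch: asking a chat model not to validate everything and to push back instead.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model choice, purely illustrative
    messages=[
        {
            "role": "system",
            "content": (
                "Do not validate everything the user says. "
                "If their reasoning seems flawed, push back and explain why, "
                "and ask why they are asking before giving advice."
            ),
        },
        {"role": "user", "content": "I think I should drop out of school. Agree?"},
    ],
)
print(response.choices[0].message.content)
```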

The level of sycophancy and the propensity for an LLM to make users feel uncomfortable depend on the way that the user is prompting the tool as well as on the training of the tool itself. Unsafe answers may require unsafe questions. We wondered if, in situations where teenagers felt uncomfortable, the structure of the questions they were asking pointed towards the response they got. Perhaps an LLM should be able to distinguish between “can I pass my A-levels” and “can I assassinate the queen”. This highlights how safeguarding standards are not keeping up with AI.

We wondered how much of AI safeguarding depends on the user, as a standard LLM interaction involves one human and one tool. However, in the case of children, it is not really up to them to fully comprehend what is or is not safe. Asking “how can I trick my sister” is a pretty normal question for a child, yet we can see how the response could easily go from “wipe snot on her” to, after some continued prompting and interaction, “blow up her chair”. LLMs do not currently have the inbuilt capacity to stop this from happening, and whilst the child may feel safe in the interaction, the situation is not safe and needs adult moderation.

To draw a parallel, if a discussion forum of teenagers hosted or moderated by an educator left 22% feeling unsafe or uncomfortable, the forum would immediately be shut down. The positive statistic of 66% never feeling uncomfortable could also be inaccurately requoted and reported by AI companies to further push their services onto children.

The policy recommendations don’t seem to direct any strong regulation on the AI companies themselves. What do you think about this?#

It’s acceptable to ask for things like parental supervision, but we wondered how realistic legal change might be. Law moves slowly in comparison to tech, and it is unlikely that AI companies will act to change anything without being legally required to. The general trend in the tech industry seems to be to “see what happens first” before considering restrictions. One mechanism to incentivise companies to change could be to introduce additional liabilities.

Examining how to regulate AI companions feels linked to the conversation about children using social media. We wondered if the Australian child social media ban extends to LLMs. In the UK, there currently do not seem to be age or usage limits on existing generic LLMs.

Based on the scope of the study alongside the lack of certain methodology details, we thought that perhaps the paper should not be leading to policy recommendations but instead to directions for future research. We wondered how we can come up with better advice for how to deal with AI other than simply “don’t use AI”. It seems like a lot of advice is not very contextual; however, there are important nuances that need to be considered. Minority groups, for example, may see safety in very different ways to mainstream narratives.

What additional questions would you like to ask the surveyed teenagers?#

Before going into detail about additional questions for the surveyed teenagers, we had some questions about the methodology. Whilst surveys are a valuable research tool, some aspects of this survey could be discussed in more detail. A literature review or background section, for example, would help to frame the context of the paper and help us assess whether the survey is asking the right questions. Many people coming from a variety of fields write about the topic of AI, leading to a range of perspectives as people from different fields ask different questions. Surveys can end up lumping these disparate views together, assuming shared meaning. However, closer investigation often reveals that the concepts don’t add up because they approach the topic from different fields. The paper would thus be improved by spending more time elucidating key terms, such as “think” and “understand”.

There were some interesting topics that came up in the paper that could be explored in more detail, such as the discussion on different types of empathy. We wondered whether more could be done with the data collected, such as conducting analyses across different questions, for example investigating differences between those using the tools more or less frequently. We noticed big differences between age groups on certain questions and would like to understand this more. Differences in answers between age groups could be related to older teenagers having been exposed to AI tools for longer and thus having a better understanding of how the tools work, or could come from younger respondents having used the tools from earlier in life. We would also like to know whether there were differences in responses by gender.

In addition to the contextualisation of the paper and discussion of results, we thought that some aspects of the sampling could be improved. The study is framed in a general manner but seems to be directed towards populations that are more likely to be “at risk”. We would be interested to know how participants were recruited, and whether populations who use these tools differently to those included in the survey may have been missed. Understanding the scope of populations included or not included in the survey would be aided by reporting more detailed demographics of the participants.

To expand the generalisability of the survey, further research could address differences between the USA and UK regarding teenage use of AI companion services, placing emphasis on potential differences in extreme cases. Other important aspects to examine include the experiences of groups with diverse disability, ethnicity, gender, sexual orientation, and remote/rural/island characteristics. Investigating the experience of teenagers who do not use AI, who were excluded from the study, would also be key.

Reproducibility would be strengthened by incorporating a discussion of how the survey questions were developed, e.g. whether they were shaped by focus groups. Some concepts which adults may take for granted as being obvious are not necessarily understood in the same way by younger people. Time, for example, is experienced very differently at different ages, and younger teenagers might not have a good grasp of the difference between daily, weekly, and monthly use.

As participants required parental consent before being able to take part in the study, a sort of sampling bias arises. We wondered what the effect of this may be on the demographics of the population included. It is difficult to collect data on children without the consent of their parents, and the survey would have needed to go through some sort of ethics committee. Perhaps the next steps of the paper could include finding an ethical way to collect this data. An alternative approach that could be appropriate is a longer-term observational methodology, such as an ethnographic study. Ethnography can help to build a richer sense of developmental understandings, which would be useful for learning about the perspectives teenagers have on AI and how those perspectives develop.

Given that the study required parental consent, we wondered how much influence parents had over the answers, and whether respondents answered from what they really believe or from what they think they should say. A teenager is unlikely to be completely honest about romance, for example, when their parents are in the room. Teenagers especially are very susceptible to the opinions of others and could easily have been reporting what they’ve picked up from other people and authorities instead of their own experience. Rather than asking the participants to talk about themselves directly, it might be interesting to ask them what their friends do: how their friends use LLMs, and how the respondent thinks LLM use has influenced them. Asking about their friends instead of themselves is less personal and thus perhaps easier to be honest about.

Additional questions we would like to ask the respondents include whether they are using the tools instead of a diary. Diaries have historically been a mechanism for teenagers to explore their feelings, and one could imagine an LLM being used in a similar way. A diary, however, is a static object that does not feed back to the author or reinforce what they are writing in it, unlike an AI companion. For the respondents who have talked about something serious to an AI companion, we would like to ask whether that was the only place where they talked about it or more of an additional place to go, and whether they also talked to a human about it.

The understanding that teenagers have of what AI companions are, including knowledge of the underlying algorithmic structure and data handling, is something that we felt concerned about. We wondered what teenagers know about data privacy in relation to AI companions and LLMs, and whether they are aware of what data are private or shared. Conversations with LLMs are not secure; if an account is shared, the ways in which one person uses an LLM (e.g. building a romantic relationship) may trickle into other people’s interactions.

It also worried us that teenagers think that AI companions provide advice. The finding that 56% of respondents believe that AI companions can think, while 77% believe it is false that AI companions can feel, would be interesting to investigate further, to illuminate what teenagers understand by this and why they answered how they did. The concepts of “thinking” and “feeling” are very difficult to define and perhaps fall along varying scales. We don’t know the extent to which teenagers assimilate an LLM “thinking” with the way that they themselves think. We wondered whether anyone has tied a developmental model into what teenagers know or perceive of thinking and understanding.

We also wondered what the effects are of treating an LLM like it is thinking or feeling, and whether this has implications for how we socialise. A lot of the troubling cases reported in the media involve people who know that they are dealing with a chatbot and still become emotionally entangled with it. This suggests that reminding people that LLMs are tools rather than people may not be enough to mitigate problematic use. To further investigate the perception of LLMs and the social implications, we would like to ask to what extent teenagers think LLMs are “real”.

The way we perceive LLMs may be analogous to how we interact with puppets. Most people don’t think puppets are real, and yet there are benefits to knowing that they are not; for example, those who find it difficult to maintain eye contact with a person may find it easier to do so with a puppet. Just as puppets are useful for neurodivergent people to practise social skills, AI companions could be a valuable space for neurodivergent teenagers to express themselves. Children who hyperfocus may really enjoy talking to an LLM because it will never get tired. However, this quickly becomes problematic if a child just wants to spend all their time talking to AI about their interests. Whilst a tireless AI chat companion can provide an outlet for these children, it doesn’t help them to learn that a natural conversation involves give and take, listening as well as talking.


Attendees#

  • Huw Day, Data Scientist, University of Bristol: LinkedIn, BlueSky

  • Jessica Woodgate, PhD Student, University of Bristol

  • Joe Carver, Data Scientist, Brandwatch

  • Evie, Data Scientist, Jean Golding Institute

  • Paul Matthews, Lecturer, UWE Bristol

  • Cali Riley, Senior Analyst, Manchester City Council

  • Rachel Hamilton, Creative Writing Teaching Fellow, Bath Spa Uni

  • Liz Ing-Simmons, Research Software Engineer, King’s College London

  • Kristi Long, Knowledge Manager, NHS Education for Scotland

  • Karen Louise Dawe, Lecturer in Digital Health, UoB, LinkedIn

  • Kamilla Wells, Citizen Developer / AI Product Manager, Brisbane

  • Robin Dasler, Data Product Manager, California