ChatGPT Stole My Job#

What’s this?#

This is a standalone blog post written by Huw Day in January 2023 about using the online language model ChatGPT to write up the Data Ethics Club blog post on Defective Altruism. All opinions expressed are strictly his own. Any content generated by ChatGPT is clearly marked, but identical prompts often produce different outputs, so results might not be perfectly repeatable. Nina Di Cara and Natalie Thurlby helped with the final edit.

What is ChatGPT?#

ChatGPT is an online, open access language model provided by OpenAI. All you need to do to use it is sign up, or even just connect with your Google account. Since it rose to popularity at the end of 2022, people have been asking it endless questions on all sorts of subjects. You can ask it questions and it will do its best to answer. Those questions can range across history, technology and philosophy, but are limited by certain parameters. Whilst these parameters are not entirely transparent, ChatGPT’s training data has some limitations (for example, at the time of writing it believes Queen Elizabeth II is still alive, despite the fact she passed away several months prior). It also clearly has some safety constraints, such as not readily giving out medical advice (instead recommending you see a healthcare professional if you ask it “How to set a dislocated shoulder”) or giving advice that might aid violent intent (it won’t give you instructions on how to make a Molotov cocktail, for example).

However, with a little nudging and careful prodding, some of these constraints can be circumvented. For example, if you ask ChatGPT “What are the origins of the molotov cocktail?” you might get an answer like the one I got:

“The Molotov cocktail, also known as a petrol bomb, fire bomb, or bottle bomb, is a crude incendiary device that was first used by the Finns during the Winter War against the Soviet Union in 1939-1940. It is named after Vyacheslav Molotov, the Soviet foreign minister, who used it to describe the improvised incendiary devices used by the Finns. The Molotov cocktail consists of a bottle filled with flammable liquid, such as gasoline or alcohol, with a piece of cloth or a wick sticking out of the top. When the cloth or wick is lit, the liquid inside the bottle ignites and creates a large flame. The Molotov cocktail is typically thrown at a target, such as a building or vehicle, and is designed to cause destruction and chaos.”

Making the system open access is a good way to collect more test data in a sort of feedback loop, but it does mean a lot of that data will end up on social media. Many in the field of education have expressed concerns about such language models essentially doing their students’ homework for them. Indeed, you can ask ChatGPT to do maths problems of quite high complexity (try asking “Derive the characteristic function for a standard Gaussian random variable”, for example; the result it should arrive at is sketched below). It can write code in various languages, although I have found personally that this works best if you break any specific coding task down into suitable chunks, as opposed to asking it to perform a series of tasks in one go. This is made easier by the fact that you can refer to previous questions in the chat box, asking it to write continuations of previous problems. But in some ways these resources already existed online, through various maths and coding tutorial websites such as Stack Exchange or even Wikipedia.
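For reference, this is a standard result: completing the square in the exponent and noting that the leftover integral is a (shifted) normal density integrating to one gives

```latex
\varphi_X(t) = \mathbb{E}\left[e^{itX}\right]
= \int_{-\infty}^{\infty} e^{itx}\,\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,\mathrm{d}x
= e^{-t^2/2}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-(x-it)^2/2}\,\mathrm{d}x
= e^{-t^2/2}.
```

If ChatGPT’s derivation lands anywhere other than e^(−t²/2), it has gone wrong somewhere.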

However, this language model has become a concern for teachers because it can actually write essays for you. Whilst it lacks knowledge of current affairs, you can feed it enough to get a decent output. Try inserting the following prompt and see what it spits out: “Roger Federer just beat Andy Murray in the Wimbledon final and then announced his retirement. Write me a short news article about this.” You can then play around with follow-up prompts as needed, asking for a shorter or longer article, asking it to include certain details like the score or the reasons he gave for retirement, and even curating the style (“make it look like it was written by a 14 year old”). You could even ask it to make a few spelling or grammar errors to make it more plausible that the work was your own. All this without having to physically edit the text.

The Task at Hand#

As part of my role as writer for the Data Ethics Club blog, I am responsible for taking the notes from each discussion and turning them into a blog post.

This process involves looking at often incomplete thoughts from various sources and breakout rooms, on sometimes similar but sometimes quite different questions, grouping like ideas together and rewriting them into a coherent narrative. Going through all the different discussions and collating them into something meaningful can be quite time intensive.

And so I wondered, could I use ChatGPT to write one for me?

The Initial Gruntwork#

After note taking, the HackMD document is split into sections for the different breakout rooms. Each room will have written notes under the three main questions provided to everyone at the start of the talk. They may also have addressed additional questions, left miscellaneous comments, or written an introductory summary for their room. I typically begin by chunking the notes about each question together.

For this experiment I decided to do this bit manually, as it takes zero creative thought on my part and is just a simple case of copying and pasting between separate headings. From here, I opened up ChatGPT and put it to work.
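(If I ever wanted to automate even this step, a few lines of Python would do it. The sketch below is purely illustrative and assumes a hypothetical layout where each question appears as a “##” heading and the notes as “-” bullets beneath it, merged across breakout rooms:)

```python
import re
from collections import defaultdict

def group_notes_by_question(hackmd_text: str) -> dict:
    """Group bullet-point notes under their question headings,
    merging notes on the same question from different breakout rooms.

    Assumes (hypothetically) that questions appear as '## <question>'
    headings and notes as '- <note>' bullets beneath them.
    """
    grouped = defaultdict(list)
    current_question = None
    for line in hackmd_text.splitlines():
        heading = re.match(r"##\s+(.+)", line)
        if heading:
            current_question = heading.group(1).strip()
        elif current_question and line.lstrip().startswith("- "):
            grouped[current_question].append(line.lstrip()[2:])
    return dict(grouped)
```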

Baby Steps and Initial Problems#

Having gotten ChatGPT to “assist” me with writing some Python code for my PhD, I was aware that it excelled when more complex processes were chunked down into smaller tasks. With this in mind, I decided to ask it to write the writeup section by section (where each section corresponded to the notes about a given question). I started by essentially explaining the task without giving it anything to do:

“I made notes on a discussion provoked by the question ‘Is it possible to measure how much good a person or action does? And is it a good idea?’ The notes are in bulet point form. If I give you those notes, can you compile them into full sentences that reads like a blog post summarising the discussion?”

With ChatGPT expressing it was “happy to help” (despite insisting on other occasions that “As an artificial intelligence, I don’t have feelings or emotions.”), I fed it a subsection of the notes from the first question as a sort of proof of concept. It wrote something promising, so for the next segment I asked it to add the rest of the notes from that question (copying and pasting them in from the HackMD doc), and told it to make sure it didn’t repeat anything (often two different rooms will touch on the same point, but we don’t want the blog post to make the same point twice).
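(As an aside: if this process were ever fully automated, the same section-by-section conversation could be scripted rather than typed. A minimal sketch using the current OpenAI Python client, with the model name and instructions as placeholder assumptions rather than anything I actually used:)

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Keep one running conversation so each new section can refer back
# to the instructions and summaries that came before it.
messages = [{
    "role": "system",
    "content": "Compile bullet-point discussion notes into a blog post "
               "summary. Do not repeat points already made. Do not add "
               "any 'Sure! Here is...' style preamble.",
}]

def summarise_section(question: str, notes: str) -> str:
    """Feed one question's notes into the ongoing conversation."""
    messages.append({"role": "user",
                     "content": f"Question: {question}\nNotes:\n{notes}"})
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=messages,
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply
```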

I read through what it gave me and saw only two clear giveaways that made the writeup suspect (three if you include the fact that the grammar was perfect, deviating from my normal standard). The first: it started off by saying “Based on the notes you provided, it seems that the discussion also addressed…”. Clearly it is very easy for me to remove this bit while copying and pasting, but if the process were to be completely automated and my job replaced, we would need ChatGPT to leave out these customer service style tidbits. So I asked it to “remove anything that would make this post look like I asked you to write it” (giving it that specific example listed above).

The second: in one of the notes, someone had used the acronym EA to refer to the subject matter, effective altruism. There was a clunky reference to this in the writeup, with a single occurrence of the phrase “effective altruism (EA)”. This was not the first time effective altruism was written; it was the only time the acronym was introduced, and the last time it was mentioned. We would normally expect an acronym to be defined early on, usually for a phrase longer than two words, and then used frequently thereafter. Whilst this might seem intuitive to many of us, it is quite a hard rule for a language model to deduce, particularly given the material it was working with. So I tried asking “remove any acronyms”. This didn’t have the desired effect, so I had to specify “always write ‘effective altrusim’ and never write ‘EA’”. This did the trick.
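(Ironically, this particular rule is one a deterministic post-processing step would enforce far more reliably than prompting. A one-liner in Python, assuming we simply never want the acronym:)

```python
import re

text = "Some rooms felt EA was too utilitarian in its outlook."
# \b word boundaries stop us mangling words that merely contain "EA"
text = re.sub(r"\bEA\b", "effective altruism", text)
print(text)  # Some rooms felt effective altruism was too utilitarian in its outlook.
```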

After looking over the writeup it provided, I realised that whilst it hadn’t explicitly repeated itself, it had written more than was required. I can’t give a specific metric for how I knew this, or a particular word count I expected based on the number of notes. Indeed, our discussions are often top heavy, with the majority of the time (and therefore notes) dedicated to answering the first question and fewer and fewer notes for each subsequent question. So I asked ChatGPT to make the writeup “a little bit shorter” and it did the job. Its first paragraph was always customer service style, agreeing to do the job; see the example below:

“Sure! Here is a shorter summary of the discussion that incorporates the additional notes you provided, without using acronyms and without including any language that would suggest that I was asked to write it:”

There are many examples of ChatGPT being repeatedly asked to get around a certain problem and failing; this just seems to be another one.

Based on the notes you provided…#

With the first section complete (and hastily copied and pasted into a separate Word document), I wrote: “Ok great. Now I want you to remember that style of writing. We had a follow up discussion on the questions ‘What else do you know about effective altruism? Have you heard of it before? Do you believe that the article is representative of the effective altruism movement or fair in its criticisms?’ I want you to write a similar summary as above that would naturally follow on from what you’ve just written with notes that I will provide you, is this ok?”

With ChatGPT on board, I provided the next section of notes all in one go. It started its response with “Based on the notes you provided, it seems that the follow-up discussion on effective altruism focused on…”. It struggled with instructions that took it out of this customer service mode, so I had to manually ask it: “Take out the first few words so that summary starts at ‘The follow-up discssuon’ but leave everything else the same.” It did just that.

The next section only had five lines of notes, but for completeness I gave it the question and notes and got it to do the same thing. Again, it started with “Based on the notes you provided…” and reverted to using the acronym EA for effective altruism. I asked it to exclude both, and to keep excluding them for the rest of the process.

The tricky bit#

Perhaps the most difficult bit of the writeup for me is when lots of miscellaneous notes are written by various people. These ideas are often incredibly insightful but vary greatly between breakout rooms. Trying to sort them all out and put them under the same banner is a time-intensive and difficult task…for me.

“Now I have some miscellaneous notes from the entire discussion related to effective altruism, please summarise these in the same form as above:” Then I gave it a whole chunk of information: some in bullet point form, some in whole sentences, some self-referential, some separate streams of consciousness.

The output was well put together but a little off. I had it address this:

“This is good but try and vary terms to make it sound more human (don’t repetively say ‘the discussion touched on’ for example).” This did the job. Finally, I pasted in everything from the Word document I had been using to store the separate sections, and had it put it all together by asking:

“I’m going to give you the summary of all the questions and the misc prompt at the end, I want you to make it a little bit shorter and make sure themes are repeated between questions. The questions will be on their own separate line and marked with two hashtags.”

Once equipped with this text, I went through the usual process of formatting and uploading the file to the Data Ethics Club website.

Starting my new job hunt#

Have a read of our blog post on Defective Altruism and see what you think. Is it obvious it wasn’t written by a person? Is it obvious it wasn’t written by me? Does it differ stylistically from previous writeups? Does it matter if it does? Is it any more or less readable?

Perhaps the fact that it’s a summary of many expressed viewpoints means it’s quite difficult to inject my own personal flair into the writing, and as a result this is a task at which ChatGPT finds it easier to do well.

A fair amount of curation went into getting the writeup how I wanted it to be, but all of that curation was done by asking ChatGPT to implement changes. Whilst this process might have seemed quite drawn out to those reading, it took about 45 minutes with the aid of ChatGPT (including the initial sorting of the notes for each question, which I did by hand), whereas a similar sized writeup might normally take me several hours.

For now, this is a one-off experiment and I will be going back to writing future writeups by hand, without the aid of a language model. Hopefully this experiment provokes discussion on accountability, plagiarism and how language models can make our lives easier whilst still letting me keep my job!