When I recently summarized an article that claimed that Large Language Models (LLMs) are “bullshit”, I got a lot of strong reactions offline and online about that term, and a comment recommending the article “Beware of botshit: how to manage the epistemic risks of generative chatbots” (Thanks, Ian!). In that article, Hannigan, McCarthy and Spicer (2024) suggest using the term “botshit” to describe what can happen when users uncritically use the output of LLMs, and I spent the better part of a Baltic Sea crossing today enjoying that article.
I really really appreciated reading this article (and also the slide deck summarizing it!), as the authors start out by giving an excellent introduction to how LLMs work, and where the hallucinations actually come in. I have reimagined their table in the figure below (I always have to re-draw things to force myself to think them through, but I am also pretty sure that this figure will come in very handy whenever I need to teach anything related to LLMs again!).
(Btw, I just remembered how I learned about machine learning: Through the game “Chomp!” by Dan Wallace! Check it out if you want a fun and super easy scicomm game!)
In addition to the data input and unsupervised learning of the model in the first four steps, they describe the “Reinforcement Learning from Human Feedback” (RLHF) process used to fine-tune responses in ChatGPT. In this process, humans rank different responses to a prompt according to which one they would deem most desirable, and the model is adjusted accordingly, so that its output is not based purely on extrapolation from training data but matches “human users’ values”. This can be about the format of the output, reducing bias or toxic content, or avoiding fictional information.
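To make the ranking idea more concrete (the article itself stays at the conceptual level, so this is purely my own toy sketch, with invented features and numbers), here is roughly what training a tiny “reward model” on human preference pairs could look like. Real RLHF would then use such a reward model to further fine-tune the LLM itself, which is omitted here:

```python
# Toy sketch (my own illustration, not the authors' method or any real system):
# humans rank pairs of candidate responses, and a tiny "reward model" learns to
# score the human-preferred response higher.

import math
import random

# Each candidate response is represented by hand-made features, e.g.
# (politeness, factual_support, toxicity). These features are invented for the sketch.
preference_pairs = [
    # (features of preferred response, features of rejected response)
    ((0.9, 0.8, 0.0), (0.9, 0.2, 0.1)),   # prefer the better-supported answer
    ((0.7, 0.9, 0.0), (0.7, 0.9, 0.8)),   # prefer the non-toxic answer
    ((0.8, 0.7, 0.1), (0.2, 0.7, 0.1)),   # prefer the politer answer
]

weights = [0.0, 0.0, 0.0]  # parameters of the toy reward model

def reward(features):
    """Score a response: higher = more aligned with human preferences."""
    return sum(w * f for w, f in zip(weights, features))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Train with a standard pairwise (Bradley-Terry style) objective:
# maximise the probability that the preferred response gets the higher reward.
learning_rate = 0.5
for step in range(200):
    preferred, rejected = random.choice(preference_pairs)
    margin = reward(preferred) - reward(rejected)
    # gradient of -log sigmoid(margin) with respect to the weights
    grad_scale = sigmoid(margin) - 1.0
    weights = [w - learning_rate * grad_scale * (p - r)
               for w, p, r in zip(weights, preferred, rejected)]

print("learned weights:", [round(w, 2) for w in weights])
print("reward(well-supported, non-toxic):", round(reward((0.8, 0.9, 0.0)), 2))
print("reward(unsupported, toxic):       ", round(reward((0.8, 0.2, 0.9)), 2))
```

After a few hundred steps, the toy model scores the well-supported, non-toxic response higher than the toxic, unsupported one, which is exactly the kind of preference the human rankers expressed.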
It is easy to see how all of these steps can lead to undesirable outputs, i.e. hallucinations: for example, if the training dataset contains old, faulty, or biased data, or if it is “cleaned” in a way that introduces new problems. But the human feedback part can cause problems, too, for example by introducing values or biases that are misaligned with those of the final users.
Based on this, the authors “highlight that LLMs are designed to ‘predict’ responses rather than ‘know’ the meaning of these responses. LLMs are likened to ‘stochastic parrots’ (Bender et al., 2021) as they excel at regurgitating learned content without comprehending context or significance.” They produce “a technical word-salad on patterns of words in training data (which is itself a black box)”. There are two types of hallucinations: intrinsic hallucinations contradict the training data, while extrinsic hallucinations are made-up information that cannot be verified against the training data.
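Since the “stochastic parrot” image is so central here, a tiny toy example (again entirely my own, not from the article): a bigram model that only records which word tends to follow which in its training text. It can produce fluent-sounding output by predicting the next word, but nothing in it represents whether what it says is true:

```python
# A toy "stochastic parrot": it only knows which word tends to follow which
# in its (made-up) training text, and generates by sampling the next word.

import random
from collections import defaultdict

training_text = (
    "the chatbot answered the question the chatbot invented a source "
    "the student trusted the answer the student cited the source"
)

# Count which word follows which (this is all the "knowledge" the model has).
next_words = defaultdict(list)
tokens = training_text.split()
for current, following in zip(tokens, tokens[1:]):
    next_words[current].append(following)

def generate(start_word, length=8):
    """Sample a continuation word by word, purely from observed word patterns."""
    word, output = start_word, [start_word]
    for _ in range(length):
        candidates = next_words.get(word)
        if not candidates:
            break
        word = random.choice(candidates)  # predict, don't "know"
        output.append(word)
    return " ".join(output)

print(generate("the"))
# e.g. "the student cited the chatbot invented a source the answer" -
# plausible word patterns, recombined without any notion of truth or meaning.
```

Scaled up by many orders of magnitude and with far more sophisticated statistics, this is still the basic mode of operation the authors point to: prediction of plausible continuations, not knowledge of meaning.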
A lot of research has shown that bullshitters, in contrast to liars, have no concern for the truth and just make things up, which may then happen to be true, or not. They do it for prestige, to save face, or because of a workplace culture in which saying anything in grand words is valued over saying something of substance. Using LLM outputs uncritically, i.e. botshitting, is probably done out of similar motivations.
But where does this leave us if we still want to work with LLMs, but in a good way? The authors suggest two questions we need to answer in order to assess the risks of using LLMs for a given task:
This leads to four different cases (which, of course, as always, are not as distinct in practice as in the framework…), see also figure below:
I think this is a useful framework to think about how to assess and manage risks.
The next chapter is highly relevant: “Using chatbots with integrity”. The authors identify four main risks that are present in all four modes above, but each is most strongly associated with one of them:
Lastly, the authors present a chapter titled “learn to rely on me”. They compare learning to rely on LLMs to what happened when calculators first became available in schools: it was feared that they would destroy students’ maths learning, which turned out not to be the case. The authors present three types of “guardrails” that should be in place and can help mitigate the risks of using chatbots:
So much for the summary of this article. Phew! But this was a super useful read.
I am thinking about this mostly in the context of how higher education teaching should deal with LLMs, but also regarding a side-project of mine trying to understand how to use LLMs in the Scholarship of Teaching and Learning (SoTL), since this is where I actually need to provide guidance and guidelines. And I think that this article is very helpful for both, both in the way that LLMs and the processes behind them are explained, and in the risk assessment and mitigation framework.

We might need to think a bit about how we define “risk” in SoTL. It is very unlikely that people will die or be seriously harmed by a chatbot hallucinating and a teacher then botshitting about teaching and learning. But SoTL is about scholarship, so botshitting rather than following the scientific method does carry risks: the slippery slope of “oh, I’ll just use it for this, I would never use it for anything serious“, compromising the integrity of the teachers (who are all researchers, too), and botshit being taken and trusted as scholarly results, which they are not and cannot be, and built on by others. So my main interest now is to look further into the organization- and user-oriented guidelines that I think should be in place for using LLMs in SoTL, and to discuss those further. So if you have any ideas, please let’s discuss! :-)
T. R. Hannigan, I. P. McCarthy & A. Spicer (2024), Beware of Botshit: How to Manage the Epistemic Risks of Generative Chatbots, Business Horizons (I accessed the free pre-print here: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4678265, and a slide deck summarizing the article)