Generative AI and LLMs

TL;DR: My recommendation is that you do not use ChatGPT, NotebookLM, or similar large language model (LLM) tools in this class for substantive tasks (annotating/summarizing readings, writing solutions), though they can be useful for programming tasks. If you do use these tools, please do so only after independent effort, and clearly document how you’ve used them.

I am not opposed to the thoughtful use of LLMs. Contrary to the endless hype and marketing served by their creators and the media, these tools have strong limitations and over-reliance on them can greatly impede the discovery and learning processes. However, once you understand these limitations and have enough domain knowledge to look for red flags, they can be useful for certain (albeit limited) tasks, and learning how to engage with them responsibly is a legitimate professional skill.

The framing and policies on this page are heavily influenced by Andrew Heiss.

Bullshit

The fundamental problem with LLMs is that they are bullshit generators (Hannigan et al., 2024; Hicks et al., 2024). Bullshit, in the philosophical sense, is text produced without care for the truth (Frankfurt, 2005). It is not a lie, which exists specifically in opposition to the truth, or a mistake, which is subject to correction when its divergence from the truth is exposed; bullshit is agnostic to the truth and exists simply to make the author sound like an authority. Truth simply does not matter to a bullshitter.

Hannigan, T. R., McCarthy, I. P., & Spicer, A. (2024). Beware of botshit: How to manage the epistemic risks of generative chatbots. Bus. Horiz., 67, 471–486. https://doi.org/10.1016/j.bushor.2024.03.001
Hicks, M. T., Humphries, J., & Slater, J. (2024). ChatGPT is bullshit. Ethics Inf. Technol., 26, 1–10. https://doi.org/10.1007/s10676-024-09775-5
Frankfurt, H. G. (2005). On bullshit. Princeton, NJ: Princeton University Press.

LLMs literally exist only to produce bullshit. They use a predictive statistical model to guess which word is likely to come next; there is no reference to whether the underlying idea produced by this sequence of words is truthful or even coherent. This means that, once an LLM goes off track, it is subject to wild hallucinations, where it may invent concepts or artifacts (books, articles, etc.) that do not exist, merely because they seem plausible as a string of text. Given that LLMs are trained on publicly available text, an increasing amount of which is now itself generated by LLMs (so-called “AI slop”), uncritical use of these results just perpetuates the bullshit cycle.

As we are environmental scientists and engineers, there is another problem: the environmental impact of the computing needed to train and run LLMs. The more judicious we are with these tools, the better we can reserve their energy and water demands for the cases where they are actually useful.

Writing and Reading

Since LLMs only produce bullshit, they cannot help you with the process of writing or engaging with readings. Writing, whether prose or technical solutions, forces you to clarify your ideas and confront where they are vague or half-baked. This is a critical part of the educational experience! Producing plausible-looking but substantively empty text cannot achieve this goal.

If you have written your own initial text but would like to clean it up (grammar, concision, etc.), LLMs may be helpful, since the substance is already present in your text. Just make sure to edit the output carefully to ensure that your ideas are still present and clear. This type of critical engagement with LLM output can be useful, and it mirrors the need to engage critically with the statistical models and methods we work with in class.

If you do use an LLM at some part in your writing process, you should cite it and make clear how you engaged with the output. This includes:

  1. What prompt(s) did you use?
  2. How did the LLM output influence your writing or framing?

This not only makes it clear to me whether you used the LLM responsibly (otherwise, this borders on plagiarism), but it also protects you in case some bullshit made it into your answer that does not reflect your understanding. I will not bother trying to guess whether your writing is AI-generated1. Your work will be graded on its own merits, and since we’re looking for thoughtfulness and engagement in your written work, LLM-generated materials are likely to be penalized2.

1 While there are tools that purport to do this, they do not reliably work.

2 And if you did not disclose the role of LLMs in generating the work, this will not be a convincing reason for why you should not lose points.

Coding

Using LLMs for programming is a little different. There are many programming tasks (autocompletion of syntax, interpreting error messages) that are greatly facilitated by LLM tools such as ChatGPT or GitHub Copilot. Even people who already know what they’re doing tend to solve syntax and debugging problems by Googling or going to forums like Stack Overflow; LLMs are a shortcut for this approach.

However, if you’re trying to learn how to program (or to program in a new language or with a new toolkit), the use of LLMs, even for debugging, can be greatly detrimental to this process. Because the training data for LLMs often consist of didactic examples, the output code is often wildly inefficient3. There are also likely to be errors due to the generation mechanism: all the LLM can do is guess what the next line of code is, not reason about whether the overall logic of the code makes sense or whether it will run. It’s hard to track down these errors if you played no role in the development of the code. This is particularly true if you’re dependent on an LLM to think through debugging, since you won’t know how to find where the LLM went wrong.

3 This is not going to be a problem in our class, but might be in your future.

My suggestions for how to use LLMs for coding are:

  1. Try to write your own code first. At the very least, think through the logic of what a solution would look like and write down any relevant equations. Then try to write down a version of that in code form. It’s okay to use syntax-checking and autocomplete tools here, but try to think about what the command is and look at some documentation (or ask on Ed Discussion) if it’s not clear to you.
  2. If you run into errors, first see if they’re obvious. In Julia, for example, many errors are the result of not using broadcasting (see the short sketch at the end of this section). Being able to spot these common error messages is useful and fast.
  3. If you run into further errors, or cannot tell why a particular piece of code is not working, then feel free to use an LLM4. Just make sure you don’t simply copy and paste the output; if the code works, try to understand how it differed from your own so you don’t make the same mistakes next time.

4 Appropriately documented, of course.
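As a concrete illustration of the broadcasting errors mentioned above, here is a minimal Julia sketch (the vector and functions are just made-up examples, not anything from our assignments): calling a scalar function on a vector without the dot syntax raises a MethodError, and broadcasting fixes it.

```julia
# A common beginner error in Julia: calling a scalar function on a vector.
x = [1.0, 2.0, 3.0]

# exp(x)        # MethodError: no method matching exp(::Vector{Float64})

# Broadcasting with the dot syntax applies the function element-wise:
y = exp.(x)     # [2.718..., 7.389..., 20.085...]

# The same dot syntax works for operators, e.g. element-wise squaring:
z = x .^ 2      # [1.0, 4.0, 9.0]
```

If you see a MethodError complaining that there is no method of some function for a Vector (or Matrix) argument, a missing dot is a good first thing to check before reaching for an LLM.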