The Case for LLMs as Hallucination Engines
> This might explain why ChatGPT is so much worse at writing modern poetry (which is tightly restricted by copyright law) than it is at writing in older styles. For instance, it seems to me to be much better at writing a Samuel Johnson essay about kangaroos than it is at writing a modernist poem about same.
No, you've simply run into the RLHF mode collapse problem (https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-of-mode-collapse?commentId=tHhsnntni7WHFzR3x) interacting with byte-pair encoding (https://gwern.net/gpt-3#bpes). A GPT doesn't genuinely understand phonetics, because the preprocessing of the data destroys individual-letter information and replaces it with large half-word-sized chunks; then, during RLHF, it avoids writing anything which doesn't make use of the memorized pairs of rhymes because it's unsure what rhyming or nonrhyming poetry looks like.
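To make the byte-pair encoding point concrete, here is a minimal toy sketch of BPE merging. It is an illustration under simplifying assumptions, not the tokenizer any GPT model actually uses: real tokenizers work on bytes, are trained on huge corpora, and have vocabularies of tens of thousands of entries. The function names (`learn_merges`, `apply_merge`, `encode`) and the two-word corpus are made up for the example.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_merges(words, num_merges):
    """Learn up to `num_merges` BPE merges, starting from single characters."""
    corpus = [list(w) for w in words]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols in corpus:
            pairs.update(zip(symbols, symbols[1:]))
        if not pairs:
            break  # every word is already a single token
        # Most frequent pair; ties are broken lexicographically so the run is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        corpus = [apply_merge(symbols, best) for symbols in corpus]
    return merges


def encode(word, merges):
    """Tokenize `word` by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols


# Toy corpus (an assumption for illustration): two frequent rhyming words.
merges = learn_merges(["light"] * 5 + ["night"] * 5, num_merges=10)

few = merges[:3]  # an early, small vocabulary
print(encode("light", few), encode("night", few))
print(encode("light", merges), encode("night", merges))
```

With only the first few merges, both words still visibly share a suffix token. Once the vocabulary is large enough that each frequent word becomes a single token, the two words are just two unrelated symbols, and nothing in the token sequence says that they rhyme.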
(If you are skeptical, try asking ChatGPT this simple prompt: "Write a nonrhyming poem." a dozen times and count how many times it actually does so rather than rhyming. Last time I checked, the success rate was still well under 20%.)
The Ea-Nasir screenshot also shows the effects of the RLHF, I suspect. My advice would be to minimize use of GPT in favor of Claude for all historical scenarios involving anything less wholesome than _Barney & Friends_. While the model is not as good, the RLAIF of Claude seems to be a good deal less indiscriminate & crippling.
Love the idea of playing into LLM as Hallucinator…and into historiography as a form of hallucination…and the argument that one form of great history-learning might be a kind of hallucinating/making-up-history. (Same argument could be made in literature: learn by translating what is on the page into what might’ve been…or as Helen Vendler put it: taking seriously “the nonexistence of what is”--i.e., thinking about the different ways that Hamlet could’ve ended as a way to see more cool things about how it did.)
Makes me think of David Liss’s A Conspiracy of Paper, a detective story set during the crazy historical moment of the world’s first stock market crash and the shift to paper money, which I think Liss described writing (while a history grad student) because he could find no existing books that gave a real feel for the moment.
Maybe all historical fiction is a form of informed hallucination. (And, at least for me, one of the most thrilling ways to learn about history…because the context is right, even if some of the characters and their storylines are invented.)
This is fascinating!!! I'm wondering how this could be applied to a high school situation here in South Africa... I'll explain the situation... While studying Apartheid, our country's curriculum for grade 9s requires them to interview someone (older) who was impacted by Apartheid. This, of course, can be a traumatic conversation for the person being interviewed if approached in the wrong way, and I'm not convinced that all grade 9s have the EQ to handle such conversations. Seeing these simulations play out makes me wonder how we could use them to the benefit of our students... I also have a question with regard to the gameplay commands and how they work within each 'turn'. This really is amazing, thank you!
This is a fantastic example of playing to the strengths of LLMs. Have you tried automatically checking the results for (the most common) historical inaccuracies?
I wonder if one really needs such long well-crafted prompts, especially for ChatGPT 4. I did a quick experiment with the prompt "I want to the player in a text-based game where I'm in Damascus in May, 1348, a city in chaos due to the plague." As a non-expert I thought it went well: https://chat.openai.com/share/f0e6ac86-2738-4fdf-b0d8-057d8a3c4cf2
This is fascinating - thank you. Hard not to wonder whether the talking rat has been borrowed by ChatGPT from the BBC's Horrible Histories franchise.
I really like this idea. Maybe in a future version we can combine the idea of a knowledge graph with GPT. The knowledge graph would contain all the primary sources plus the referenced people, places, and events, which could then form the set of nodes that you can talk to. When you talk to a person, a place, or an event, it would draw on the closest primary source as source documentation (maybe enhanced with some pre-prompting) to create a very realistic chat experience.
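A rough sketch of how that lookup might work, purely as an illustration: the graph, node names, placeholder source texts, and the helpers `closest_source` and `build_prompt` are all hypothetical, and "closest" is assumed here to mean the fewest hops in the graph. A real system could just as well rank sources by text similarity, and would send the resulting prompt to whatever model API it uses.

```python
from collections import deque

# Hypothetical toy knowledge graph: each node lists its neighbours.
GRAPH = {
    "person:A": ["place:B"],
    "place:B": ["person:A", "source:1", "event:C"],
    "event:C": ["source:2", "place:B"],
    "source:1": ["place:B"],
    "source:2": ["event:C"],
}

# Placeholder texts standing in for real primary sources.
SOURCES = {
    "source:1": "<text of primary source 1>",
    "source:2": "<text of primary source 2>",
}


def closest_source(graph, sources, start):
    """Breadth-first search from `start`; return the nearest primary-source node, or None."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in sources:
            return node
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return None


def build_prompt(entity, source_text):
    """Assemble a pre-prompt that grounds the chat in one primary source."""
    return (
        f"You are role-playing {entity}. "
        "Stay consistent with this primary source:\n"
        f"{source_text}\n"
        "If the source does not cover a question, say that you are improvising."
    )


source_id = closest_source(GRAPH, SOURCES, "person:A")
prompt = build_prompt("person:A", SOURCES[source_id])
print(source_id)
print(prompt)
```

Breadth-first search is used only because it gives the nearest node in an unweighted graph with very little code; weighting edges by relevance would need a different search.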
Why leave hallucinations to chance? ;) The prompt could tell ChatGPT to randomly insert several authoritative-sounding but verifiably false facts, to give the students debunking challenges!
Loved this article! This perspective on using LLMs, I think, sheds so much light on how AI can be integrated into our education systems in the future instead of circumvented.
I was also thinking -- would be cool as an assignment to get students to create their own simulation prompts. Lots and lots of possibilities.
This is really fun. Before GPT, I had students watch The Seventh Seal, attend a lecture on the Black Death, and read a sheaf of primary sources. They then had to write a film review concerning the historical accuracy of Bergman’s film. Some of the ones paying more attention understood it was a metaphor for the time it was made…the Cold War, and anxiety about the destruction of humanity.
Loved reading this piece. When GPT came out I thought of so many ideas. One I was actually working on is an enhanced version of what you did, where you're able to see a physical representation of the historical person you want to talk to. Maybe you can have a look! :) https://www.timeless.cool/