The Case for LLMs as Hallucination Engines
> This might explain why ChatGPT is so much worse at writing modern poetry (which is tightly restricted by copyright law) than it is at writing in older styles. For instance, it seems to me to be much better at writing a Samuel Johnson essay about kangaroos than it is at writing a modernist poem about same.
No, you've simply run into the RLHF mode collapse problem (https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-of-mode-collapse?commentId=tHhsnntni7WHFzR3x) interacting with byte-pair encoding (https://gwern.net/gpt-3#bpes). A GPT doesn't genuinely understand phonetics, because the preprocessing of the data destroys individual-letter information and replaces it with large half-word-sized chunks. Then, during RLHF, it avoids writing anything that doesn't make use of its memorized pairs of rhymes, because it's unsure what rhyming or nonrhyming poetry looks like.
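To make the letter-information point concrete, here is a minimal sketch. It is not the real GPT tokenizer and not the BPE merge algorithm itself: it uses greedy longest-match over a tiny hand-picked vocabulary (`VOCAB` and `encode` are hypothetical names invented for illustration) as a stand-in for how a subword tokenizer chunks text. The point it shows is only that the model receives opaque token IDs, so two rhyming words may or may not share any ID.

```python
# Hypothetical toy vocabulary; real GPT vocabularies have tens of thousands
# of entries learned from data, not chosen by hand like this one.
VOCAB = ["night", "kn", "ight", "ite", "k", "b", "n", "i", "g", "h", "t", "e"]
TOKEN_ID = {tok: i for i, tok in enumerate(VOCAB)}


def encode(word: str) -> list[int]:
    """Greedy longest-match tokenization (a stand-in for BPE, not BPE itself).

    Raises ValueError if a character is not covered by the toy vocabulary.
    """
    ids = []
    pos = 0
    while pos < len(word):
        # Pick the longest vocabulary entry that matches at this position.
        match = max((t for t in VOCAB if word.startswith(t, pos)), key=len)
        ids.append(TOKEN_ID[match])
        pos += len(match)
    return ids


for word in ["night", "knight", "kite", "bite"]:
    print(word, encode(word))
```

In this toy setup "kite" and "bite" share the ID for "ite", so their rhyme is at least visible as a shared chunk, while "night" becomes a single ID that shares nothing with "kite" even though the two words rhyme. Any such rhyme would have to be learned as an association between otherwise unrelated IDs, which is consistent with the "memorized pairs" description above.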
(If you are skeptical, try asking ChatGPT this simple prompt: "Write a nonrhyming poem." a dozen times and count how many times it actually does so rather than rhyming. Last time I checked, the success rate was still well under 20%.)
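For anyone who wants to tally the results rather than eyeball them, here is a rough sketch, assuming the dozen responses have already been saved as strings. The function names are hypothetical, and the check is a deliberately crude spelling heuristic (nearby line endings that share their last two letters), not a phonetic rhyme detector: it will miss rhymes like "blue"/"through" and wrongly flag pairs like "cough"/"though", so treat its count as a first pass to spot-check by hand.

```python
import string


def end_words(poem: str) -> list[str]:
    """Return the lowercased final word of each non-empty line."""
    words = []
    for line in poem.splitlines():
        tokens = line.strip().strip(string.punctuation).split()
        if tokens:
            words.append(tokens[-1].strip(string.punctuation).lower())
    return words


def looks_rhymed(poem: str, suffix_len: int = 2, window: int = 2) -> bool:
    """Crude check: do any two nearby line endings share a spelling suffix?

    Each line is compared with the next `window` lines, which covers
    couplets (AABB) and alternating (ABAB) schemes.
    """
    words = end_words(poem)
    for i, first in enumerate(words):
        for second in words[i + 1 : i + 1 + window]:
            if first == second:
                continue  # a repeated word is repetition, not rhyme
            if min(len(first), len(second)) < suffix_len:
                continue
            if first[-suffix_len:] == second[-suffix_len:]:
                return True
    return False


def nonrhyming_rate(poems: list[str]) -> float:
    """Fraction of poems the heuristic judges to be unrhymed."""
    return sum(not looks_rhymed(p) for p in poems) / len(poems)


# Two hand-written sample inputs, only to show the call pattern.
samples = [
    "The cat sat on the mat,\nand wore a little hat.",
    "Rain on the window\nA kettle begins to hum",
]
print(nonrhyming_rate(samples))  # 0.5
```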
The Ea-Nasir screenshot also shows the effects of the RLHF, I suspect. My advice would be to minimize use of GPT in favor of Claude for all historical scenarios involving anything less wholesome than _Barney & Friends_. While the model is not as good, the RLAIF of Claude seems to be a good deal less indiscriminate & crippling.
This is fascinating - thank you. Hard not to wonder whether the talking rat has been borrowed by ChatGPT from the BBC's Horrible Histories franchise.
Love the idea of playing into LLM as Hallucinator…and into historiography as a form of hallucination…and the argument that one form of great history-learning might be a kind of hallucinating/making-up-history. (Same argument could be made in literature: learn by translating what is on the page into what might’ve been…or as Helen Vendler put it: taking seriously “the nonexistence of what is”--i.e., thinking about the different ways that Hamlet could’ve ended as a way to see more cool things about how it did.)
Makes me think of David Liss’s A Conspiracy of Paper, a detective story set during the crazy historical moment of the world’s first stock market crash and the shift to paper money, which I think Liss described writing (while a history grad student) because he could find no existing books that gave a real feel for the moment.
Maybe all historical fiction is a form of informed hallucination. (And, at least for me, one of the most thrilling ways to learn about history…because the context is right, even if some of the characters and their storylines are invented.)
This is fascinating!!! I'm wondering how this could be applied to a high school situation here in South Africa... I'll explain the situation... While studying Apartheid, our country's curriculum for grade 9s requires them to interview someone (older) who was impacted by Apartheid. This, of course, can be a traumatic conversation for the person being interviewed if approached in the wrong way, and I'm not convinced that all grade 9s have the EQ to handle such conversations. Seeing these simulations play out makes me wonder how we could use them to the benefit of our students... I also have a question with regard to the gameplay commands and how they work within each 'turn'. This really is amazing, thank you!
This is a fantastic example of playing to the strengths of LLMs. Have you tried automatically checking the results for (the most common) historical inaccuracies?
I wonder if one really needs such long well-crafted prompts, especially for ChatGPT 4. I did a quick experiment with the prompt "I want to the player in a text-based game where I'm in Damascus in May, 1348, a city in chaos due to the plague." As a non-expert I thought it went well: https://chat.openai.com/share/f0e6ac86-2738-4fdf-b0d8-057d8a3c4cf2
I really like this idea. Maybe in a future version we can combine the idea of a knowledge graph with GPT. The knowledge graph contains all the primary sources plus the referenced people, places, and events, which could then form the set of nodes that you can talk to. And when you talk to a person, a place, or an event, it will draw on the closest primary source as source documentation (maybe enhanced with some pre-prompting) to create a very realistic chat experience.
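A minimal sketch of how that lookup might work, under several assumptions: nodes are stored as plain objects with their source excerpts attached, "closest" is approximated by word overlap (Jaccard similarity) rather than embeddings, and the resulting prompt is simply handed to whatever chat model is in use. `Node`, `closest_source`, and `build_prompt` are hypothetical names, and the excerpts below are placeholders, not real primary sources.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """A person, place, or event in the (hypothetical) knowledge graph."""
    name: str
    kind: str  # "person", "place", or "event"
    sources: list[str] = field(default_factory=list)


def _words(text: str) -> set[str]:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))


def closest_source(node: Node, question: str) -> str:
    """Pick the attached excerpt with the highest word overlap (Jaccard).

    Raises ValueError if the node has no sources attached.
    """
    q = _words(question)

    def overlap(source: str) -> float:
        s = _words(source)
        union = q | s
        return len(q & s) / len(union) if union else 0.0

    return max(node.sources, key=overlap)


def build_prompt(node: Node, question: str) -> str:
    """Assemble a grounded role-play prompt to send to a chat model."""
    excerpt = closest_source(node, question)
    return (
        f"You are role-playing the {node.kind} '{node.name}'.\n"
        f"Ground your answer in this primary source excerpt:\n{excerpt}\n"
        f"Question: {question}"
    )


# Placeholder node and excerpts, only to show the call pattern.
merchant = Node(
    name="Placeholder Merchant",
    kind="person",
    sources=[
        "PLACEHOLDER excerpt about grain prices in the market",
        "PLACEHOLDER excerpt about funeral processions at the mosque",
    ],
)
print(build_prompt(merchant, "What are grain prices like in the market?"))
```

Word overlap is used here only because it needs no external libraries; a real version would more plausibly rank sources with an embedding model or a proper graph query.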
Why leave hallucinations to chance? ;) The prompt could tell ChatGPT to randomly insert several authoritative-sounding but verifiably false facts, to give the students debunking challenges!
This is really fun. Before GPT, I had students watch The Seventh Seal, attend a lecture on the Black Death, and read a sheaf of primary sources. They then had to write a film review concerning the historical accuracy of Bergman’s film. Some of the ones paying more attention understood it was a metaphor for the time it was made…the Cold War, and anxiety about the destruction of humanity.
I'm a life-time history teacher and fascinated by your Plague Simulators.
Would love to replicate the model in other settings / eras. But I'm new to AI.
Would you be willing to share a template that teachers could use to transpose the activity to another time and place?
What a fascinating exploration! Benjamin Breen’s journey using LLMs like ChatGPT in history classes opens doors to innovative teaching methods. His insight into simulating historical settings through AI, despite acknowledged inaccuracies, underscores the potential for unique learning experiences. The emphasis on humanities in this AI-driven educational future is a refreshing perspective, especially when considering its inherently textual nature.
Breen’s comparison between a high school student’s analysis and that of a history major beautifully illustrates the depth of understanding gained through historical training. The caution about potential pitfalls for educators in the short term and the necessity to adapt teaching methods resonates profoundly. The example showcasing improved results by refining prompts demonstrates the evolving landscape of educational tools.
Thank you, Benjamin Breen, for sharing your experiences and insights, navigating the exciting yet challenging terrain of integrating AI into history education. Your efforts in exploring these uncharted territories are commendable.
Loved this article! This perspective on using LLMs, I think, sheds so much light on how AI can be integrated into our education systems in the future instead of circumvented.
I was also thinking -- would be cool as an assignment to get students to create their own simulation prompts. Lots and lots of possibilities.
Loved reading this piece. When GPT came out I thought of so many ideas. One I was actually working on is an enhanced version of what you did, where you're able to see a physical representation of the historical person you want to talk to. Maybe you can have a look! :) https://www.timeless.cool/