15 Comments
Julia Imbruglia's avatar

This is really fascinating. My research team recently ran some experiments with LLM synthetic data, and the historians (including me) were asked to compare it with our historical archive. The computational linguists asked us about the accuracy and "plausibility" of the synthetic data, which led to interesting conversations about the nature of the archive, historical interpretation, and metaepistemological questions about capturing historical modes of plausible reasoning.

Maya Indira Ganesh's avatar

This is great, very compelling. Do you know the Turing Institute folks working on AI and Culture/Humanities and the Doing AI Differently project, Cody Kommers and Drew Hemment? This could be something they would want to fund.

Benjamin Breen's avatar

Thank you, Maya, and good to hear from you! I will follow up by email. I don't know them but would love to talk more.

Anselm Küsters's avatar

I really enjoyed this piece, particularly the ranked taxonomy! While reading point 4, I was reminded of Rohit Krishnan's recent Substack article about management flight simulators and his Vei project, which essentially does the same thing for recent institutional history. Krishnan takes the Enron email corpus, builds a chronological timeline from it and enables users to explore alternative scenarios by branching off from real historical decision points. LLM actors play out the people involved, writing messages, and so on. An interesting combination would be to run Krishnan's event-spine method on the type of probate records, parish registers etc. that you describe in point 4. The main difference from his enterprise use case is the absence of ground truth to validate against, but this might be less problematic for historians than for managers? The goal would be what you called "structured speculation" about the space of possible outcomes, rather than prediction. Here is the link to his piece: https://www.strangeloopcanon.com/p/can-we-build-a-management-flight

Quentin Hardy's avatar

“Talkie is — a free-floating, mid-Atlantic ghost of 19th century print culture,” is accurate, so I’m surprised you opened up with a teaser about accessing “a collective consciousness.” It’s obviously nothing like that.

Indirectly, serving this period-only mashup makes for an interesting comment on the practice of History. Would Hegel, whose influence reverberates and grows well beyond his decade, carry less weight than writing about horses, which were critical to much of everyday life? Does Kierkegaard register at all? Much of what matters, matters for what happened later.

Benjamin Breen's avatar

Collective consciousness is a hypothetical for future versions of a tool like this. The historical LLMs currently being made are restricted to readily available, digitized print sources, so they are necessarily more limited. Once a wider range of languages and sources (manuscripts!) are fed into historical LLMs, I think they will become a lot more interesting as an index of worldviews, as opposed to just representing, say, Anglo-American print culture.

Quentin Hardy's avatar

I appreciate what you're saying, but that's still a stretch way too far for me. Essentially all the LLMs have to go on is text, largely for people who could read. That's about 20% of the global population in 1900, 12% in 1800, and less before that. If you really want to stretch it a couple hundred years you could throw in lithos, woodblocks, and stained glass, but those are all elite works to convey a certain story from a small slice of experience. Even the Grimm Brothers didn't keep their originals, and that's fairly recent stuff. That will be true of print, no matter how many Armenian newspapers and Tagalog nationalist novels you throw in there.

Which is to say, even before we get deeply into what creates a collective consciousness, you'll never represent a vanished thing with an LLM proxy. It's the snows of yesteryear, all the way down.

Quentin Hardy's avatar

For what it's worth, I asked Talkie-1930, "who is Wallace Stevens." It responded, "Wallace Stevens is an American poet, born at Chelsea, Massachusetts, in 1881. He has published several volumes of verse, and has also written prose."

In fact, by 1930 Wallace Stevens, born in Reading, PA, had published only one volume, "Harmonium," in 1923. He did not publish his next volume, "Ideas of Order," until 1936. The birth error is perhaps acceptable, but the anachronism defeats the purpose of the resource.

There's a long way to go before this corpus is reliable even for investigations of what was known at the time.

Oliver Sourbut's avatar

So you're the one who, on our timeline, ends up creating the myriad simulations that make said hypothesis true...

Data Frank's avatar

If historical LLMs can reveal the "mental furniture" of an era, then MAP might eventually reveal the mental furniture of a creator. The more interesting question isn't what people did, but what assumptions kept generating the same actions week after week.

Jakob Ehe's avatar

The question underneath this is whether a model trained on 1920s text is capturing statistical patterns of language, or something closer to a worldview — and whether those are separable. Shannon would say the model can only give you the entropy of the corpus, not what the speakers believed. But the Talkie-1930 demo suggests the gap between those two things might be smaller than it sounds. I keep thinking about this from the other direction: the people who built the foundations of computing in the 40s and 50s were shaped by a very specific intellectual moment. How much of that worldview is still encoded in the abstractions we inherited from them?

Mephistophilis's avatar

It's an amazing idea, and I'm already playing around with Talkie-1930 now that I've discovered it (not easy to run, though). I think there is definitely scope for using historical models that are fluent but not contaminated by modern arguments and discourse to probe how far certain concepts arise inevitably from the structure of language itself, and how far they are features of the content of a particular era. "Consciousness" is one topic that seems particularly relevant (how does an LLM not trained on 100 years of modern science fiction handle artificial minds?), but things like ethical principles also seem like they'd be good to explore.

Deirdre Loughridge's avatar

“The output wouldn’t tell you what really happened in 1789. But it would generate a structured speculation about the space of possible Frances” is so wonderfully put. And I’d like to second the call for a Kircher LLM, which I would definitely query about music, sound amplification, and dragons.

Nikola Kondovski's avatar

Steampunk AI you say? I'm in

Born Yesterday's avatar

I love this idea so much. So, if your celebrity crush is, say, Galen, could one chat with him about the humors on AI? :)