5 Comments
Nov 16, 2023·Liked by Benjamin Breen

Very interesting indeed! I shared it on the social network formerly known as Twitter and on Bluesky. I’m just curious: what is the reference for the Catalan book? I couldn’t find it in the Library’s catalogue. Catalonia was a major consumer of and trader in Brazilian sugar and tobacco until the War of the Spanish Succession, and I did two weeks of research there earlier this year on that subject.

Nov 15, 2023·Liked by Benjamin Breen

Phenomenal. I wish I’d had this tool when I did my PhD in History.


Marvellous stuff, thank you!

Dec 7, 2023·edited Dec 7, 2023

Very interesting article. Btw, on the redaction test: any transformer-based NLP model should be promising for this type of task, because ‘cloze’ methods, or masking, are one of the primary ways many of them are trained. You might need to do some ‘fine-tuning’, that is, further train the basic GPT on an external data set to supply specific knowledge in less well-known areas. Some of the other models, like BERT or T5, might also do better because of their slightly different ‘cloze’ implementations. Limited training data may also be the challenge in identifying the 7 people in the photo. Some of the figures may have been too obscure to have had enough training examples, and some sort of augmentation could help.
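
To make the ‘cloze’ idea concrete, here is a minimal sketch in plain Python of how masked training examples can be built from a sentence. It is an illustration only, not how any particular model is actually trained: the function name `make_cloze`, the example sentence, and the `[MASK]` placeholder are assumptions made for this sketch, and real BERT-style masking is more involved (it sometimes swaps in a random word or leaves the word unchanged instead of always masking).

```python
import random

# Placeholder for a hidden word. The spelling follows BERT's convention,
# but any marker would do for this illustration.
MASK = "[MASK]"


def make_cloze(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens, cloze-style.

    Returns (masked_tokens, targets), where targets maps each masked
    position to the original word the model would be asked to recover.
    Hypothetical helper written for this sketch, not a library function.
    """
    rng = random.Random(seed)  # local RNG so the result is repeatable
    masked = []
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)   # the model sees the placeholder...
            targets[i] = tok      # ...and is scored on guessing this word
        else:
            masked.append(tok)
    return masked, targets


# Made-up sentence standing in for a redacted historical document.
sentence = "the letter was sent from Lisbon to the governor in Bahia".split()
masked, targets = make_cloze(sentence, mask_prob=0.3, seed=1)
print(" ".join(masked))
print(targets)
```

A redacted document looks a lot like one of these masked sentences, which is why filling in the blanks is a natural fit. The catch is that the model can only recover words it saw often enough in training, hence the suggestion to fine-tune on period-specific text.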
