AI@IA: The Internet Archive on AI

Here’s an interesting story by Internet Archive founder Brewster Kahle on how they are using artificial intelligence to extract lyrics from their collection of archived 78rpm records:

AI@IA — Extracting Words Sung on 100 year-old 78rpm records

Some interesting comments follow, where experts in the old content take issue with the parts the AI gets wrong, but there is also a positive note about what having lyrics adds to search capability, as well as to access for those who are hearing impaired.

It reminds me of the words of Delmar Larsen in our OEG Voices podcast, where he describes how LibreTexts sees the value of machine translation even though it is not perfect:

However, last year, two AI-based machine translation algorithms have gotten pretty good. Are they perfect? No, they are not. But they are pretty good. And the argument that we had here is it better to have a hundred thousand pages in a new language that’s 95% good versus 20 pages that are perfect in that language.

Also in that Internet Archive post, Brewster himself jumps in (I have to respect someone in his position who engages with readers via comments!) with a reference to a comment likely composed with an AI text generator.

But Wait, There is More!

I just got this announcement of a free online webinar from the Internet Archive: Generative AI Meets Open Culture: Opportunities, Challenges, and Ethical Considerations

When: 2023-05-02 17:00 UTC
Participate: Register here


With the rise of generative artificial intelligence (AI), there has been increasing interest in how AI can be used in the description, preservation and dissemination of cultural heritage. While AI promises immense benefits, it also raises important ethical considerations.

In this session, leaders from Internet Archive, Creative Commons, and Wikimedia Foundation will discuss how public interest values can shape the development and deployment of AI in cultural heritage, including how to ensure that AI reflects diverse perspectives, promotes cultural understanding, and respects ethical principles such as privacy and consent.

Join us for a thought-provoking discussion on the future of AI in cultural heritage, and learn how we can work together to create a more equitable and responsible future.

This one definitely looks worth attending. I am going; who else?


The session recording has just been published; apparently there were over 400 attendees.

Key links shared include:

The full log of the chat conversations, which were extensive, is also available!