What’s of interest? Outline for a European Books Data Commons – Open Future
Tell me more!
This new concept paper presents an outline for establishing a European Books Data Commons (EBDC)—a piece of public digital infrastructure designed to provide centralized access to large, high-quality datasets of digitized books from European libraries. It is conceived as a commons-based infrastructure governed collectively by the contributing libraries.
Authored by Paul Keller, this paper builds on a series of structured conversations about the idea of a European Books Data Commons that we convened together with Europeana during the first half of 2025. It addresses a critical gap in how Europe manages its digitized cultural heritage in the age of AI. It also builds on earlier work presented in Towards a Books Data Commons for AI Training, which explored the broader concept of creating shared infrastructure for making book collections available for AI model development while ensuring that libraries and cultural heritage institutions maintain control over their digitized materials and can fulfill their public service missions.
The EBDC proposal responds to the challenge that many European libraries face: their digitized collections of public domain books remain largely inaccessible for AI training and other innovative uses. By creating shared public infrastructure under library control, the EBDC would enable these institutions to optimize their collections for diverse uses—from individual access to bulk data provision for AI model development—while maintaining clear provenance and data quality.
Where is it? Outline for a European Books Data Commons – Open Future
This is one of many items I regularly tag in Pinboard as oegconnect, which are then automatically posted to Mastodon with the tag #OEGConnect. Do you know of something else we should share like this? Just reply below and we will check it out.
Or share it directly to the OEG Connect Sharing Zone.