Tagged for OEG Connect: Fairly Trained

cogdog · January 17, 2024, 6:18pm

What’s of interest? Fairly Trained

Tell me more!

We certify fair training data use in Generative AI.

Generative AI models are trained on the work of human creators.

Our mission is to make sure those creators are being treated fairly.

There is a divide emerging between two types of generative AI companies: those who get the consent of training data providers, and those who don’t, claiming they have no legal obligation to do so.

We believe there are many consumers and companies who would prefer to work with generative AI companies who train on data provided with the consent of its creators.

Fairly Trained exists to make it clear which companies take a more consent-based approach to training, and are therefore treating creators more fairly.

Our first certification is our Licensed Model (L) certification for AI providers. We plan to add more certifications over time.

The L certification can be obtained for any generative AI model that doesn’t use any copyrighted work without a license.

h/t to Nate Angell who shared this in the Creative Commons Slack community

Where is it?: https://www.fairlytrained.org/

This is one among many items I will regularly tag in Pinboard as oegconnect, and automatically post tagged as #OEGConnect to Mastodon. Do you know of something else we should share like this? Just reply below and we will check it out.

hmross · January 17, 2024, 6:52pm

Alan,

Thank you for sharing this. I’ve really been struggling with the dismissive attitude so many people have toward the creators of work being used to train AI. I find it particularly troubling that Creative Commons has a post on their website indicating that anything AI generated is in the public domain and that the materials can be used as such. If something was created by an AI that was trained on copyrighted work without the creator’s permission, how can we, especially those of us in the open community, treat it as public domain content? Attribution and gratitude are so important in the open movement.

cogdog · January 17, 2024, 7:27pm

We hear you, Heather, and it feels like trying to find firm footing on shaky and shifting ground.

I found it both interesting and disturbing to read how non-specific prompts (not naming the characters) in both Midjourney and DALL-E 3 could readily reproduce what most humans would identify as copyrighted images.

I do not think most are being dismissive of the problem of generative AI training on a vast amount of both copyrighted and variously licensed works and spitting out things that can never be directly traced to sources. It’s more about the greyness of existing laws and practices. The suggestion of it being public domain traces back, of course, to copyright law as we know it, which asserts that machines (and monkeys) cannot claim copyright.

The laws are going to lag behind the practices.

What we can do is stick to good practices like attribution and gratitude. For that matter, I could not help but notice the Fairly Trained web site tagged in this post uses imagery that has no attribution.

Only with some dedicated reverse image searching did I find that at least the second image, of computer circuits, is from Unsplash. Looking through the results, I found it used in TechCrunch and Yahoo with a linkless attribution of “Photo by Taylor Vick on Unsplash” (and I could only find one valid Taylor Vick profile, and it did not include this image). See how far down the hole I will go.

So can we do better attribution than technology news sites? Heck yes!

I say always strive to Attribute Everything (whether a license says so or if there is no license) with the old trusty TASL: Title, Author, Source, License. When I do use something from DALL-E or Craiyon, I still try that; at least you can have a publishable link, though for license I just have to write “indeterminate”.

This of course probably does not help much in working with faculty on OER. The best we can do is be explicit and transparent in identifying where AI is used. It all can change of course, but I’d say in good faith this is the best we can do now while the laws and licenses are fuzzy.

There was some good discussion in yesterday’s BCcampus FLO Panel on The Creative and Ethical Use of Artificial Intelligence in Post-Secondary Education, including @clintlalonde sharing an addition to their OER Publishing Guide: a new chapter on Generative AI.

It’s gonna be messy for a while!

cogdog · January 17, 2024, 7:59pm

Ahh, and there is always “more to the story”. As shared by @NateAngell, who originally tipped me off, it’s worth seeing who is supporting this effort of “trying to get a fairer deal for human creators.”

See the bottom of

hmross · January 17, 2024, 8:58pm

Alan,

I know it’s certainly not everyone in the open movement, but just this morning I was in a meeting with colleagues and felt like a lone voice saying, “Wait, if we’re going to encourage the use of AI, we need to have bigger conversations with instructors and students about copyright and the ethical use of the content created.” I missed the webinar, but I will check out the new chapter in the guide on AI.

cogdog · January 17, 2024, 11:07pm

Keep speaking up! BCcampus said they will post the archive, shared resources, and more in about 2 weeks, I think.