On or Off the AI Train?

cogdog · May 24, 2024, 3:21pm

My good colleague and outspoken critic @poritzj is suggesting a possible future event/activity to debate or flesh out “sides” if you will about what some (including me) see as a runaway train.

We know / acknowledge the problems, environment, ethics, mangling of intellectual property, fraud, social injustice- yet the train picks up speed (woah it can analyze a painting!)

Jonathan shared a recent Wired article (roundabout link to avoid the paywall) raising the issue of low-paid workers in Africa doing the grunt work of moderation and keyword training as “being modern-day slavery.”

And also, look! Iowa’s water being slurped to fuel the machines!

How do we sort all this? Are we without a means to do anything? I have no answers, but collectively, here, with some 1800 people inside this community, what can we do?

helen.beetham · May 24, 2024, 3:46pm

Hi Alan, thanks for inviting what I’m sure will be a host of responses.
I’ve been blogging and speaking from the ‘woah’ side for some time now, and I have several posts on the hidden labour of language models, including data workers in the global south. These from last summer:
https://helenbeetham.substack.com/p/labour-in-the-middle-layer
https://helenbeetham.substack.com/p/luckily-we-love-tedious-work

And a more recent one that touches on these issues:
https://helenbeetham.substack.com/p/human-intelligence

Also this on the environmental risks:
https://helenbeetham.substack.com/p/things-dont-only-get-better

I don’t often self-promote so shamelessly but your invitation was too tempting

Looking forward to the debate.

Helen

Read my latest writing at helenbeetham.substack.com

@helenbeetham | helen.beetham@gmail.com | h.a.beetham@wlv.ac.uk | helen.beetham@manchester.ac.uk

poritzj · May 24, 2024, 3:54pm

To give some more backstory to this post of Alan’s, here’s what I sent to him:

You often post things through the OEGlobal platform which are fun/positive uses of AI. And there are plenty of other people in this area who say they are doing valuable things with AI, some of whom I deeply respect.

But the more I learn and think about this technology, the more worried I am about it. I think it’s really bad for our community, for education in general, and for the world in general. And I do not think it is “inevitable.”

You know what would be fun: to have a debate, in the style of a presidential election … or, wait, presidential politics (in the US at least) has become such a shitshow that maybe I’m thinking of the substantive debates from when I was young … ok, how about the style of one of the famous Oxford debating society debates? Anyway, a structured thing with limited times and speakers who were respectful but speak without euphemism or artifice. Could be along the lines of

Pro-AI Position: AI is a revolutionary technology which will improve the lives of students and instructors and do so much more for so much less money (… maybe … I can’t even think of a good way to state the positive position!!!)

Anti-AI Position: AI is an absurdly overhyped cat fart which will do a tiny bit of real good in an extremely limited set of circumstances, while doing enormous harm to students, instructors, and the world: to use AI is to say you don’t mind the fascism, so long as the trains are on time, to make an historical analogy.

There could be something like 15 minutes for each side to present their main arguments, then a series of responses for 5 or 10 minutes which would have to be on topic to the points the other side made, then a final round where the moderator and/or the audience would drill down into points that they perceived were made but not answered by one or both sides, or which needed better explanation.

Alan responded that surely we could come up with something more reminiscent of the open community than a straight debate. So here’s what I then suggested:

Maybe a call goes out for folks interested in participating, where they have to submit ideas for one side or another of the debate – and they state if they would like to be on the live team. The teams could have open documents – maybe CryptPad – in which they assemble the main points the team will make. Those who want to be part of the live action would have to commit also to an organizational meeting where arguments could be discussed and prioritized and live roles assigned. Then in the actual session, the teams would present in some series of timed chunks (like: intros, presentation of main points: 12 min each team; 8 min each team for rebuttal; 5 min each for moderator follow-up questions; 5 min each to respond to audience questions from a shared doc), with the assigned roles they had decided upon amongst themselves. Maybe there would be two or three one minute comments, on each side, from audience members who were inspired during the event itself.

Timing would have to be very strict for something like this to work. We tend to be shy during webinars and let folks who raise their hands during Q&As just ramble on for ages, but if we wanted something like I described above, we’d have to run it like a classroom using some open pedagogy approach … and, in the classroom, the pedagogy may be open, but the instructor has to scaffold and control the class quite a bit to get things accomplished.

There could be open (CryptPad) docs which the teams would use during the live session to take notes on things they wanted to say in the rebuttals, and a common doc that the whole audience could use to write comments and ask for permission to speak up in one of those last timeslots.

Also an open doc for proposals or concrete summaries presented by both sides, and towards the end, the whole audience could be asked to vote on those statements.

What does the community think? I know @mahabali does amazingly successful and innovative open ped-style webinars, maybe she has some advice (and/or maybe she would be willing to participate)?
Alek Tarkowski (who doesn’t seem to be on here) did a cool thing in Europe using a digital democracy tool to discuss similar issues … but their approach took a bit more time beforehand (i.e., it was more of a heavyweight tool), and had less of the debate format which makes Alan a bit nervous but which I think might be a lot of fun in such a controversial area (if handled well and respectfully).

danmcguire · May 24, 2024, 4:04pm

I might be interested in this topic if you could connect it to OER, Open Educational Resources. One thing AI has apparently done is to wipe the term OER out of any discussions related to education.

cogdog · May 24, 2024, 4:19pm

THANK YOU SO MUCH, sorry for resorting to all caps, but I am overjoyed to see your message, Helen. Yes, I have seen many of your thoughtful and rich posts (and need to read more) – this is in no way self promotional.

cogdog · May 24, 2024, 5:39pm

Thanks Dan, we are definitely aiming to frame this around open education and implications for OER. And your first hand experience is invaluable.

cogdog · May 24, 2024, 5:41pm

Speaking of trains, and as if timing means something, I just got an email from a web forms service announcing their integration with ChatGPT 4o:

The GPT-4o hype train is real. It’s faster, it’s cheaper, and it’s just plain better than any of its predecessors.

At least they admit the “hype” part

CameronBenham · May 24, 2024, 7:14pm

Good point, Dan. My team at New America is hosting a Hackathon at the end of June to explore how AI might be able to better expand the field of OER, whether that’s through GPTs that curate OER, or a tool that is able to update OER to a student’s background, needs, interests, abilities, etc. I’d be happy to connect with you about it.

danmcguire · May 24, 2024, 8:34pm

Yes, Cameron, let’s talk. Hackathons usually frighten me, but I am currently leading a team that includes a few people who are less intimidated by them than me. See https://oertist.org/

And, welcome to OEGlobal Connect.
Dan

cogdog · May 25, 2024, 2:57am

Very interesting, Cameron (and ditto the welcome from @danmcguire), but can you share more about the hackathon? This would be extremely relevant to our community here. I poked around the New America site but could not find any more information.

cogdog · May 25, 2024, 2:59am

Psst @poritzj I found your cat

AjitaD · May 25, 2024, 6:19am

Wow! What a discussion! And I don’t know whether I should even be saying anything in front of all the stalwarts here. But it is a debate, so the urge. As far as debate is concerned, somehow it reminds me of the age-old topic given in English class for English essay writing- Technology- bane or boon? But my point is- is there even this option? Technology- in this case AI, is going to be there and continue to evolve until it reaches its stage of redundancy. So, let it evolve. Let’s see the possible directions that it may take, be vigilant w.r.t social and ethical issues and step in when required. It definitely makes life easier in most cases, we just need to balance it out.

lightweight · May 27, 2024, 8:56am

Riffing on your double-mention of ‘English’, @AjitaD, I wonder how big the corpus of data for LLM training is in other languages, especially indigenous ones. Vanishingly small, I suspect. I wonder if a great direction for our community might be to a) accept the economics of abundance of derivative learning materials (licensing-still-to-be-determined-by-the-bigtech-controlled-courts) in the Anglosphere and b) encourage & enable those outside of it to focus on inventing or remixing (via translation) OERs in their own languages…

I also wonder who has reviewed (curated) LLM-generated OERs? I wonder how much work by subject matter experts is required to make sure that LLM-generated materials are actually valid & accurate? More than a little, I suspect.

cogdog · May 27, 2024, 5:20pm

Thanks Dave, and it’s good to see the conversations here.

On the makeup of the corpus, it seems an open question whether we will ever really know (it seems to be closely guarded). The makeup of the corpus definitely matters, but also, and seemingly more mysterious: what influences what comes out? Anthropic is still working to figure that out.

I would agree, and know from other voices here, that translation might be a positive opportunity.

And finally I wonder what we mean by “LLM-generated OERs”? As LLMs get into our writing and research tools, aren’t we going to see a blur between what is purely LLM generated and what includes some portion or element of LLM generated stuff? I think it’s going to get grey, and hoping for an AI/not AI litmus test seems myth worthy.

There was also that recent study (~~can’t find it right away~~) that looked at the perception of LLM generated code versus the source from Stack Exchange / Overflow, like although the generated stuff was statistically more wrong (was it 30%?) people preferred it more (?) Like we can be satisfied with stuff that appears on the surface to be viable?

Later edit: Luckily I followed a mastodon message to a blog post that linked to the article

On which I always do my SIFT practice to go upstream to the linked research source from Purdue University, Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions

danmcguire · May 29, 2024, 5:13am

I show elementary teachers how to use AI to create openly licensed content for their students. We are currently working on developing tools to translate OER content into one of the 75 African languages other than English, French, and Arabic that have at least a million speakers.

Over 80% of the elementary students in Africa speak something other than those big three as a mother tongue. Of the 1.2 billion people in Africa, only about 7 million speak English natively.

There’s plenty of work to do with AI and OER.

It works better to teach kids how to read in their mother tongue than some other language. AI is useful in creating the OER digital reading material for elementary students in Africa.

OERIDEAL · June 4, 2024, 3:18am

I feel I need to go to a training on the ethics of AI and the world. I had no idea but I’m not surprised.