UNESCO Digital Learning Week: Call for emerging EduGPT Models

This recently dropped into my inbox … not sure how many people here are working at the intersection of AI and Open, and developing Generative Pre-training models specifically for education (EduGPT)?

But since we have been talking about AI for a while, and it is a hot topic generally and especially on Connect :wink:, I thought I would share.


There ought to be people here involved with this, although I found the call not very specific about what constitutes “Generative Pre-training models specifically for education”. I am thinking it is a GPT trained on educational content?

Of some relevance, I respect the perspective in Dan Cohen’s newsletter (which is chock full of links), Humane Ingenuity 47: AI Is Coming for Scholarship Next.

A perfect question for ChatGPT… :wink:

Generative Pre-training models specifically for education, also known as EduGPT, refer to language models that are designed and trained with a focus on educational content and applications. These models are built upon the foundation of the broader Generative Pre-trained Transformer (GPT) models, such as GPT-3, but they are adapted and fine-tuned to cater to the needs of the education sector.

EduGPT models aim to provide enhanced capabilities for generating educational materials, answering questions, assisting with learning tasks, and offering personalized educational experiences. They are trained on vast amounts of educational text from diverse sources such as textbooks, research papers, online educational resources, and other relevant materials. By leveraging the knowledge acquired during pre-training, EduGPT models can generate coherent and contextually relevant responses in a variety of educational domains.

These models have the potential to support various educational applications, including:

  1. Intelligent Tutoring Systems: EduGPT can act as a virtual tutor, providing explanations, examples, and interactive learning experiences tailored to individual students’ needs.
  2. Content Creation: EduGPT can assist in generating educational materials, such as lesson plans, quizzes, summaries, and study guides, saving educators time and effort.
  3. Language Assistance: Students can use EduGPT as a language model to improve their writing skills, receive feedback on essays, and ask questions related to grammar, vocabulary, or specific subjects.
  4. Information Retrieval: EduGPT can help students and teachers access relevant and up-to-date information by answering questions, providing explanations, and summarizing content from various educational sources.
  5. Personalized Learning: EduGPT models can adapt to the learner’s preferences, pace, and learning style, offering personalized recommendations, adaptive feedback, and tailored resources.

I have to say I was a bit confused about what exactly they were asking for, but I love the irony of your solution @jan!!

ChatGPT is wonderful for language-related tasks (describing, rephrasing, explaining), as long as you’re careful about very niche areas, such as translating botanical terms, which works terribly (it convincingly hallucinates nonsense which looks very plausible). For example, when I can’t remember an idiom, I describe it in plain language and ChatGPT brings it right back.

It’s also great for “explain concept XYZ like I’m 10 years old”.

And it can be fun, too. Here’s your response above, @isla, rephrased as Donald Trump :stuck_out_tongue_winking_eye:

Look, folks, I gotta tell you, I was totally puzzled by what the heck they were even asking for. But you know what? I absolutely love, and I mean love, the tremendous irony in your solution, Jan. It’s just incredible, believe me. Nobody does irony like me, folks. Nobody.

Here is EduGPT, as ChatGPT explains it to a 10-year-old:

Generative Pre-training models specifically for education, or EduGPT, are like super smart computers that are really good at understanding and talking in human language. They are designed to help students and teachers with all kinds of educational things.

Imagine you have a really smart virtual tutor who knows a lot about different subjects. It can explain things to you, help you with your homework, and even create lesson plans and quizzes for you. It’s like having a helpful friend who knows a lot about everything you’re learning in school.

EduGPT can also help you improve your writing. You can ask it questions about grammar and vocabulary, and it can give you feedback on your essays. It’s like having a language expert who can guide you in becoming a better writer.

These models can also help you find information quickly. If you have a question about a specific topic, you can ask EduGPT and it will give you an answer or find information from different sources. It’s like having a very knowledgeable librarian who can find the right books for you.

Another cool thing about EduGPT is that it can personalize your learning experience. It can understand how you learn best and give you suggestions that match your interests and learning style. It’s like having a teacher who knows exactly how to help you learn in a way that’s fun and effective for you.

So, EduGPT is a special computer program that can be your tutor, writing assistant, researcher, and personal learning guide all in one. It’s designed to make learning easier and more enjoyable for you.

This is the new era of how things are explained, eh? Well played, Jan.

But to dig into this, the crux of the issue for many is ChatGPT’s inability to connect back to sources, and its inability (so far) to provide un-hallucinated references:

They are trained on vast amounts of educational text from diverse sources such as textbooks, research papers, online educational resources, and other relevant materials.

What textbooks, papers, resources? What “other” materials? It’s vague. That, at least, is one value of the research service Elicit, whose training source is Semantic Scholar (if I am reading correctly).

Also worthy of some thought is @opencontent's question of whether the chat-like experience might change our notion of what a textbook is: not just a static published entity, but also something learners interact with through a query-like experience.

Now that gets interesting!

Due to the way LLMs work, it may never be possible to reference training data. When I think about this, it kind of makes sense. To use humans as an analogy, the mental model of the world in our head is so complex that it’s impossible to fully credit the training data. We would have to mention potentially hundreds or thousands of humans who have affected a particular response, from an elementary school teacher to the lady at the supermarket checkout counter who slightly influenced the weight we give to ideas. Moreover, we don’t quite remember who exactly taught us what. It’s quite likely that my mom taught me the word “mama” but how did I come to understand the concept of “engine”? No idea. Probably many, many influences.

Despite the inability to credit the training data, some large language models can provide references (or “further reading”) used in a response. Whenever I needed a reference, I used the ChatGPT “Browse with Bing” plugin. Annoyingly, this was disabled last week. Apparently the content owners didn’t like what the plugin was doing (i.e., accessing content without clicking on the ads). Looks like the interests of ChatGPT users may not be very well aligned with those of content creators.

There are still workarounds, such as using the “new Bing” (which is free as in beer but has other limitations, such as response length), or using one of the tools that leverage ChatGPT through API calls and provide extra functionality, such as references.
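For anyone curious how such tools can attach references at all, here is a minimal sketch of the general idea as I understand it: look up passages in a known collection first, then hand them to the model with numbered source labels so the answer can cite them. This is only an illustration under my own assumptions, not how any particular product works. The corpus entries, the `Passage` class, and the `retrieve` and `build_prompt` helpers are all made up for this example, the matching is naive word overlap, and no real model or API is called.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # human-readable citation (made up for this example)
    text: str


# Hypothetical mini-collection standing in for a curated set of open resources.
CORPUS = [
    Passage("Open Biology Textbook, ch. 8 (made-up)",
            "Photosynthesis converts light energy into chemical energy in plants."),
    Passage("Intro Physics Notes (made-up)",
            "An engine converts heat into mechanical work."),
    Passage("Grammar Guide (made-up)",
            "A verb describes an action or a state."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question: str, corpus: list[Passage], top_k: int = 2) -> list[Passage]:
    """Return up to top_k passages sharing at least one word with the question."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(p.text)), p) for p in corpus]
    scored = [(score, p) for score, p in scored if score > 0]
    # Sort by overlap score only, highest first (ties keep corpus order).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_prompt(question: str, passages: list[Passage]) -> str:
    """Build a prompt that lists the retrieved passages with numbered sources."""
    lines = ["Answer using only the numbered sources below and cite them by number."]
    for i, p in enumerate(passages, start=1):
        lines.append(f"[{i}] {p.source}: {p.text}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    question = "How do plants use photosynthesis?"
    print(build_prompt(question, retrieve(question, CORPUS)))
```

The point of the design is that the references come from the lookup step, not from the model's memory, so each cited source is at least a real item in the collection. Whether the generated answer actually reflects that source is a separate question.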

One thing to be aware of: ChatGPT sometimes makes a claim, provides a reference, and the reference has nothing to do with the response. In other words, it doesn’t just hallucinate responses, it can hallucinate a reference, too. And no, I wasn’t using Bing in "Creative" mode… :wink:

Buyer beware.

Correct, I did not phrase that well! Yes, LLMs cannot connect a single response they generate to the sources that provide it, as the response is not even constructed that way (this really works against our pre-conceptions)… the source is really “all of the content it is trained on”.

I meant being more clear for a specific EduGPT model about what it is drawing on for its responses.

Yes, I stopped looking to ChatGPT to provide meaningful references; it more or less generates something showing “referenciness” (~ Colbertian truthiness) in form and wording, perhaps pointing to authors and journals likely associated with the generated text.