Tagged for OEG Connect: Image Accessibility Creator

cogdog · April 8, 2025, 7:23pm

What’s of interest? Image Accessibility Creator

Tell me more!

Enhance the accessibility of your course images with our intuitive tool. Designed to effortlessly generate alt text and detailed descriptions, as well as extract text from slides and images that are not accessible, our tool simplifies the creation of inclusive content. Just upload your image and hit ‘Create Image Details.’ Within seconds, you’ll see the generated content appear on the right side of the screen. Need to tailor the descriptions further? Easily add more specifics with the ‘Add Details’ option and generate updated descriptions to perfectly meet your needs.

This tool currently utilizes the GPT-4o model provided by OpenAI. We are using their API platform, which ensures the data sent is not retained or stored to improve their training or models.

h/t Judith Sebesta in CCCOER Webinar “New Accessibility Requirements for Web Content and Mobile Applications for Public Entities” https://www.cccoer.org/webinar/new-accessibility-requirements-for-web-content-and-mobile-applications-for-public-entities/

Where is it?: https://asuo-ai-labs.streamlit.app/Image_Accessibility

This is one among many items I will regularly tag in Pinboard as oegconnect, and automatically post tagged as #OEGConnect to Mastodon. Do you know of something else we should share like this? Just reply below and we will check it out.

Or share it directly to the OEG Connect Sharing Zone

JudithSebesta · April 8, 2025, 7:57pm

Glad you passed on this tool, Alan! I have found it quite effective and hope others do as well.

cogdog · April 8, 2025, 8:31pm

That was a fantastic session, Judith, and I appreciate how you started with the accessibility actions you applied in the presentation.

I’ve seen a few mentions of this tool, and would hope/ask if others here might share some outputs and how effective they find its results.

My test was with this image I used elsewhere ni OEG Connect as a welcome post topic - I have incorporated the alt text in the image uploaded here.

The Long Description returned was quote well written!

The image features a colorful, stylized sculpture of a figure with its arms outstretched against a clear blue sky. The figure has a cheerful face with a wide smile, and its head is adorned with black geometric shapes. The left arm is clad in a red sleeve with a swirling black pattern, while the right arm has a yellow sleeve with large orange polka dots. The torso is divided into two sections: black and white vertical stripes in the center, resembling a suit, and the lower part displaying blue and white vertical stripes. Each arm ends with a gray hand, stylized and simplified. The sculpture exhibits a playful, cartoonish aesthetic. In the bottom right section of the torso is a signature, “Romero Britto.”

The alt text I had hand was written “Colorful statue of figure with outstretched legs and hands offering a welcome”-- This is what the Image Accessibility Creator suggested, definitely better.

Colorful sculpture of a stylized, smiling figure with outstretched arms, featuring a mix of patterns and colors including red, yellow, and blue.

As a tip for OEG Connect, you can easily upload images by dragging and dropping files into the editor (or use the image upload button), yet what you get for a default alt text is the file’s name, not useful at all.

One your image is uploaded, if you click the pencil under it on the right side preview, you can insert better alt text.

I would enjoy seeing examples of photos others load into the Image Accessibility Creator. Upload your image here, and use the alt text it suggests to add to your image. How well does it describe your image? Would you use this tool more often? Will you aim to include alt text everywhere you can?

wisalalim · April 8, 2025, 9:13pm

Dear OEG Team,

Thank you for sharing details about the Image Accessibility Creator tool. It’s exciting to see such innovative solutions designed to improve inclusivity in educational content.

This tool’s ability to effortlessly generate alt text, detailed descriptions, and extract inaccessible text from images is a significant advancement in creating content that caters to all learners. The intuitive features like customizing descriptions through the “Add Details” option further demonstrate its user-friendly nature.

The use of GPT-4o by OpenAI ensures reliability and efficiency, while safeguarding data privacy, which is a crucial aspect for educators and creators. Tools like this contribute greatly to compliance with modern accessibility standards, as highlighted in Judith Sebesta’s CCCOER Webinar.

I am particularly impressed by its potential to simplify accessibility efforts in teaching resources and provide valuable support for the open education community.

Thank you for encouraging us to share resources like this one. It’s inspiring to witness such contributions making a positive impact.

Best regards,

cogdog · April 8, 2025, 11:35pm

I would be interested to see your results for a local photograph or an image from your library. I did some searching for open licensed photos of Sudanese food, and found this photo:

Aseeda & Mulah flickr photo by Qusai Akoud shared under a Creative Commons (BY) license

If my research is correct “asida” is a soft dumpling of boiled flour and “mullah” is a kind of stew?

The Image accessibility creator described this:

The image depicts a close-up of two types of food items on a plate. In the foreground, a white, mold-shaped item with a smooth texture is positioned on the right. To its left, a thick, brown sauce is being poured onto the plate, partially covering the base of the white item. The plate itself is gold and reflective, creating a warm ambiance. The background is softly blurred, making the plated food the central focus of the image.

What I find interesting is the AI is used here purely to create a visual description, not to identify exactly what is in the photo, but it is able to know that this photo is a type of food. So in my few examples of testing, it is performing very well for drafting my alt text description (used above).

wisalalim · April 9, 2025, 5:18am

Asida and Mulah: A Traditional Sudanese Dish Reflecting Rich Heritage

Asida is one of the delicious traditional dishes in Sudan, made from sorghum flour, which is carefully ground, mixed with water, and left to ferment. Afterward, it is cooked over heat until it reaches a smooth and firm consistency. Asida, with its simple flavor and unique texture, serves as the perfect base for a variety of Mulah (stews).

Mulah Tagalia is one of the most famous sauces served with Asida. It consists of dried or minced meat cooked with oil, onions, and spices such as coriander, salt, and black pepper. Once the mixture boils, Wika (dried okra powder) is added, giving the Mulah its distinctive flavor and rich texture.

Asida and Mulah are a hearty breakfast meal, easy to digest, and are served on various occasions, showcasing the authenticity of Sudanese culture. In addition to their delicious taste, Asida holds deep social and cultural significance, symbolizing hospitality and generosity in Sudanese society.

The Importance of Asida and Mulah in Sudanese Heritage :

Tradition and Authenticity: Asida reflects the traditional Sudanese lifestyle, relying on local ingredients and sustainable food practices.
Social Gathering: It is served at family events and celebrations, strengthening social bonds.
Sustainability: It is made with locally available ingredients, making it an economical and sustainable choice.

abdulsyahid · April 9, 2025, 6:13pm

Just tried it and… PERFECT! Super smooth, fast, and actually kinda fun to use!
Loved how it whipped up alt text and long descriptions like a pro —my slides finally feel seen and understood .

Huge thanks to the amazing team behind this, and a big shoutout to @cogdog for the tip!
Accessibility just got a glow-up Let’s make the web kinder for everyone!
Any other tools like this? Drop them below—we’re all ears!

edtechdev · April 11, 2025, 5:10pm

The ASU tool was down sometimes, so I made a version in ChatGPT: ChatGPT - Alt Text Generator
One tiny advantage over the ASU one is in ChatGPT you can paste an image in, instead of having to download and upload it.
I forgot - ChatGPT can now edit images, too, if someone wanted to increase the contrast or change the color of text, etc.

John Swope also has one here: https://alt-text.streamlit.app/?template=Alt+Text+Generator

I tried making one in Copilot (our free school account), but I guess it can’t visually understand images in our version, even though it thinks it can.

Here’s an alt text generator that uses an open weight AI model (currently llama vision), but it’s not as good or easy to use as the others: HuggingChat

Gemini “Gems” might be an option, too. I just used Gemini 2.0 Flash in https://aistudio.google.com/ to OCR over 100 sign-in sheets.

cogdog · April 11, 2025, 7:45pm

Thanks Doug, lots of options and the ones I checked worked great.

cogdog · April 11, 2025, 8:09pm

Woah, Doug, the Gemini Gems was interesting. I had a ~2006 photo of me and my friend Brian Lamb

Yes! This is IT! flickr photo by cogdogblog shared into the public domain using Creative Commons Public Domain Dedication (CC0)

In AI Studio, I uploaded the image and asked for “alternative text for image”

I am a bit curious later to see what the sliders on the right do, as well as from the top, the “System Instructions” which suggests I can add something like a style or tone for the response??

I have 3 results returned:

Concise
Two men smiling outdoors in front of a restaurant window advertising “Grilled Fish Tacos”.

Probably what I would use in alt text as it conveys enough – and I replaced the default alt ttext of the meaningless file name with this in the photo above.

More Detailed
Two men standing side-by-side outdoors, smiling at the camera. The man on the left wears sunglasses, a beard, and a green plaid shirt. The man on the right wears glasses, a white button-up shirt, and leans against a peach-colored pillar. Behind them, “Grilled Fish Tacos” is written on a glass window.

Very Detailed
Two men pose for a photo outdoors in daylight. The man on the left has dark hair, a full beard, sunglasses, and smiles, wearing a green, yellow, and black plaid long-sleeved shirt over a white t-shirt. The man on the right has dark hair, eyeglasses, and smiles, wearing a white/off-white textured long-sleeved button-up shirt with rolled cuffs, a gold chain necklace, and a gold bracelet. He leans against a peach-colored textured stucco pillar. Behind them is a large glass window/door reflecting the outdoors, with “Grilled Fish Tacos” written on it in white script.

Heck I forgot I once wore gold

This does a solid job. It’s hard to see a significant different between any of these approaches, of course I have barely tried. They all return better written ones than I would have done.

Ahh, but then I got curious about the tab for “Experimental: Thoughts Produced by Model”

One might get carried away to think, “look it is showing me its reasoning” but it does model a decent approach to the description process - The main subject, overall setting, then details of each identified person/thing, a first synthesis, then a review agains criteria, and a revision. Does it really do this internallY?

Here’s a thinking process for creating the alt text:

Identify the main subject(s): The image clearly shows two men standing side-by-side outdoors.

Describe the setting: They appear to be outside a building, likely a restaurant or shop, given the glass window/door behind them with text. There’s a textured, peach-colored wall or pillar to the right. It’s daytime, and the lighting seems bright.

Describe the person on the left:

Appearance: White male, dark hair, beard and mustache, smiling.

Clothing: Wearing sunglasses, a green and black plaid long-sleeved shirt over a white t-shirt.

Pose: Standing facing the camera, smiling, left hand possibly in pocket (or just out of frame).

Describe the person on the right:

Appearance: White male, dark hair, clean-shaven (or very light stubble), smiling.

Clothing: Wearing eyeglasses, a white/off-white, textured, long-sleeved button-up shirt (sleeves rolled up slightly), a gold chain necklace, a gold bracelet on his left wrist, and possibly a small earring in his left ear.

Pose: Standing slightly angled towards the camera, leaning slightly against the pillar, smiling, left arm bent with hand near the other man.

Describe the background elements:

Behind the men is a large glass window or door.

Reflections are visible in the glass.

White text is written on the glass: “Grilled Fish Tacos”. Part of other text (“Ballast Po…”) is visible to the far left.

Inside, through the glass, hints of tables and chairs can be seen. A fish silhouette decoration is visible near the top left. An “EXIT” sign is reflected or visible near the top center.

To the right is a textured, peach-colored stucco pillar or wall.

A white electrical box or fixture is mounted on the pillar.

Synthesize the descriptions into coherent alt text: Start with the main subject and setting, then add details about each person and the background.

Initial draft: Two men standing outside a restaurant. The man on the left has a beard and sunglasses and wears a plaid shirt. The man on the right has glasses and wears a white shirt. “Grilled Fish Tacos” is written on the window behind them.

Adding detail: Two men smiling for the camera, standing outside in front of a glass window with “Grilled Fish Tacos” written on it. The man on the left wears sunglasses, a beard, and a green and black plaid shirt. The man on the right wears eyeglasses, a white button-up shirt, a gold chain, and a gold bracelet, leaning against a peach-colored pillar.

Refining wording and flow: Two men stand side-by-side outdoors, smiling at the camera. The man on the left has dark hair, a beard, and sunglasses, wearing a green and black plaid shirt over a white tee. The man on the right has dark hair and eyeglasses, wearing a white textured button-up shirt, a gold necklace, and a gold bracelet; he leans against a peach-colored stucco pillar. Behind them is a glass window or door with the words “Grilled Fish Tacos” in white script. Reflections and the interior of the building are visible through the glass.

Review against guidelines:

Is it descriptive? Yes.

Is it concise? Reasonably so, given the detail.

Does it identify the main subjects and context? Yes.

Does it avoid subjective interpretation (e.g., “happy”)? Uses “smiling” which is observable.

Is it accurate? Yes.

Final Version: Two men stand side-by-side outdoors, smiling at the camera. The man on the left has dark hair, a beard, and sunglasses, wearing a green and black plaid shirt over a white tee. The man on the right has dark hair and eyeglasses, wearing a white textured button-up shirt, a gold necklace, and a gold bracelet; he leans against a peach-colored stucco pillar. Behind them is a glass window/door with the words “Grilled Fish Tacos” in white script. Reflections are visible in the glass.

This reminds me of some great posts by Mike Caulfield on what reasoning means, the difference between it being a verb and a noun-- The Two Meanings of Reasoning and the Future Path of AI

Whether AI “reasons” or not is a bit irrelevant — the question is whether it produces valid reasons

and also Critical Reasoning with AI: How we know LLMs are applying reasoning patterns, and not just reverse image searching

Thanks again for sending me here, Doug!