
Our AIES 2025 Best Paper Award: Community-Centered AI Data Governance for Stuttered Speech
December 22, 2025
By Shaomei Wu
It’s been a little while since our last update — we’ve been busy! AImpower.org has been accelerating our technical, community, and advocacy efforts, amplifying marginalized voices in shaping emerging technologies.
We’ve accomplished a lot, and I’m excited to start sharing what we’ve been up to! Over the next few posts, I’ll dive into different areas of our recent progress — beginning with this one, which highlights our technical work and achievements.
Disfluency-Friendly Speech AI
A core focus of our technical work is co-developing disfluency-friendly speech AI models and products in partnership with the stuttering community. Unlike typical participatory approaches, where communities are only occasionally consulted, we put the community in the driver's seat at every stage: from data collection and governance to model evaluation and development.
- Community-led Data Governance: We published data governance principles and guidelines grounded in direct input from the stuttering community. We presented our findings at the AIES Conference in Madrid, Spain a couple of weeks ago, where they were recognized with the Best Paper Award!

- Speech Data Collection:
- With tremendous support from the stuttering community, we gathered English speech samples from 51 people who stutter across the US and Canada. This dataset includes both conversational speech and common voice commands, capturing a wide variety of stuttering patterns. Annotation is currently underway, and we will share more details once it’s ready for release — hopefully in Q1 2026!
- We have also released the StammerTalk dataset: 43 hours of spontaneous conversation and read voice commands from 66 Mandarin Chinese speakers who stutter. It is currently hosted on Hugging Face, and we welcome researchers and developers to request access and share their intended use cases. Since its release in June, the dataset has already been downloaded more than 100 times!

- Community-Centered Data Annotation: Annotating stuttered speech is far from straightforward — stuttering is both highly variable and deeply personal. To challenge the historical omission of people who stutter from this process, we co-designed our annotation guidelines with the stuttering community. Through interviews, workshops, and pilot sessions, we developed a framework grounded in the embodied knowledge of people who stutter. This approach not only improves annotation accuracy but also affirms the community's epistemic authority over their own speech. We currently have a research paper under review on this topic, and we are preparing the guidelines for publication as a resource for the broader AI community.
- Auditing Speech AI Performance: Using both existing and newly collected data, we audited major speech AI models, including OpenAI's Whisper and Meta's wav2vec. Our findings revealed that current ASR systems perform 1.7 to 4 times worse on stuttered speech than on fluent speech. Alarming patterns emerged: newer model versions sometimes regress on stuttered speech, and hallucinations occur 3–5x more often — occasionally producing offensive, dehumanizing errors like "oink oink oink."


- Improving Speech AI Models for Stuttering: We didn't stop at identifying these biases — we actively addressed them! By fine-tuning Whisper v2 on our stuttered speech dataset, we cut transcription errors in half (see the off-the-shelf model on the left and the fine-tuned model on the right in the figure below). Our fine-tuned model is now available on Hugging Face (and has been downloaded more than 20 times since its public release in October). The methodology behind the model is detailed in our paper, presented at the FAccT '25 conference.
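For readers curious about the mechanics behind audits like these: ASR accuracy is typically compared using word error rate (WER), and degenerate repetition outputs can be screened with simple heuristics. The sketch below is purely illustrative — the transcripts are made up, the `min_repeats` threshold is an assumption, and this is not our actual evaluation pipeline.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions + insertions + deletions)."""
    m, n = len(ref_words), len(hyp_words)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution or match
    return dp[m][n]

def wer(reference, hypothesis):
    """WER = word-level edits needed to turn the hypothesis into the reference,
    divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    return edit_distance(ref, hyp) / len(ref)

def repeated_token_run(transcript, min_repeats=3):
    """Crude flag for repetition-style hallucinations: the same token emitted
    min_repeats or more times in a row. Note that genuine stuttered speech also
    contains repetitions, so this is a screening heuristic, not a classifier."""
    tokens = transcript.lower().split()
    run = 1
    for prev, cur in zip(tokens, tokens[1:]):
        run = run + 1 if cur == prev else 1
        if run >= min_repeats:
            return True
    return False

if __name__ == "__main__":
    reference = "please turn on the kitchen lights"
    fluent_hyp = "please turn on the kitchen lights"
    stuttered_hyp = "please please turn on the the kitchen lights"  # toy ASR output
    print(f"fluent WER:    {wer(reference, fluent_hyp):.2f}")     # 0.00
    print(f"stuttered WER: {wer(reference, stuttered_hyp):.2f}")  # 0.33
    print("hallucination flag:", repeated_token_run("oink oink oink"))  # True
```

Ratios of such per-group WERs (stuttered vs. fluent) are what statements like "1.7 to 4 times worse" summarize.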


Inclusive Videoconferencing
- Product Development: We beta-launched the MVP of LibOrate, a videoconferencing companion app co-designed and co-developed with the stuttering community. We are currently recruiting participants who stutter for beta testing. If you are a person who stutters, please sign up here – your feedback is crucial to shaping the next iteration of the product!

Here are some encouraging testimonials from our current users:
“I can think there’s plenty of people who don’t stutter who would also find this really useful.”
– A, PWS
“The name tag is a really nice feature. It can serve as a simple, low-pressure reminder that I stutter.”
– S, PWS
“I’m definitely going to share this with my clients — adults, teenagers, and middle schoolers — and with others I know who stutter. This is great.”
– K, PWS and SLP
“I love it — the reminders are fantastic, and I like that it’s customizable.”
– T, PWS
“This product is empowering — it shows that something was designed specifically for people who stutter. That has never existed before. And I can see it being useful beyond just people who stutter.”
– J, PWS
“The waving features, especially the ‘I’m not done’ message, would be the most helpful for me because I feel like the problem I have is when people are interrupting me.”
– N, PWS
- Continuous Community Feedback: Over the coming weeks, our users will test LibOrate in everyday Zoom meetings, helping us better understand the app's longer-term impact and further improve its usability, accessibility, and design. We hope to publicly launch the app on the Zoom Marketplace by the end of the year!
- Cross-Cultural Research: In partnership with Chris Constantino (Florida State University), Prof. Yingting Chen, and Prof. Chi-Lan Yang (Japan), we are studying the cross-cultural effects of stuttering self-disclosure in virtual meetings. This project will advance both theoretical and applied understandings of how disclosure practices influence inclusion and participation in global digital spaces.




