
What’s New with AImpower.org: Sharing Our Technical Work, Progress, and Impact
December 22, 2025
Our Paper Wins Best Paper Award at AIES 2025!
We are thrilled to share that our paper, “Govern With, Not For: Understanding the Stuttering Community’s Preferences and Goals for Speech AI Data Governance in the US and China,” has been selected as a Best Paper Award winner at the AAAI/ACM Conference on AI, Ethics, and Society (AIES) 2025! This paper is the result of a close, sustained collaboration between us at AImpower.org (Jingjin Li and Shaomei Wu) and Prof. Norman Su, PhD students Peiyao Liu and Rebecca Lietz from the Authentic User Experience Lab (AUX Lab) at the University of California, Santa Cruz, as well as PhD student Ningjing Tang from Carnegie Mellon University.
This recognition is especially meaningful, not only for the AImpower research team but also for the entire stuttering community, whose insights, lived experiences, and leadership made this work possible.

What Is This Research About?
Modern speech AI relies heavily on large datasets collected from real people. But as we highlighted in our presentation, control over that data is not evenly distributed. A small number of corporations control most speech datasets, while marginalized groups, including people who stutter, remain under-represented or excluded.
Stuttered speech is not merely audio—it can reveal medical history, identity, and deeply personal aspects of self-presentation. Contributing voice data can therefore feel empowering and risky at the same time. Millions of people worldwide who stutter (1% of the global population) experience stigma that shapes how comfortable they are with sharing their speech.
Our work asks how to create data governance systems that recognize these realities and center the values of those who stutter. Our research questions focus on:
- RQ1: How can we use and govern disability-related data in an ethical, respectful, and power-sharing way that maximizes the community’s agency and control?
- RQ2: How do socio-geographical differences affect the community’s preferences and needs with respect to data governance?
How We Conducted the Study
We started by conducting interviews with eight stuttering advocates—community leaders, technologists, speech-language experts, and organizers. Their insights grounded our understanding of consent, privacy, long-term stewardship, and the realities of grassroots data work.
Based on the interview findings, we then conducted a survey study with 149 people who stutter (83 from China, 66 from the U.S.), exploring broader community attitudes toward data sharing, trust, governance needs, and perceived risks.
Key Insights
1. Transparency builds trust (more than money). Many advocates and members of the broader community shared that clarity of purpose and transparency about the data collection project mattered more than financial incentives.
“If it’s just framed as ‘we’re collecting speech samples for X research project,’ it can feel unclear or unmotivating.
But if it’s presented as ‘you have the chance to help create accessible AI for people who stutter — that’s exciting!’”
— Alex, stuttering community organizer
Across both countries, participants emphasized the importance of:
- Seeing clear community benefit
- Knowing who has access to their data
- Understanding how it will be used
2. Consent is not a form — it’s a relationship. Consent must go beyond a one-time checkbox. People who stutter want:
- Tiered consent options, allowing participants to specify what types of use they agree to (e.g., academic research, commercial development, training, public demonstrations)
- Ongoing communication and updates about how data is being stored, accessed, and reused over time
- The ability to withdraw or delete data long after initial contribution, recognizing that comfort levels and personal circumstances can change
One advocate described how consent often stops at the point of data collection, with little follow-up or accountability.
“I was never told how my data was used.” — Eric, stuttering community advocate
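One way to picture tiered, revocable consent in software is sketched below. This is a minimal illustration under our own assumptions, not a tool or schema from the paper: the `Use` categories come from the examples listed above, while `ConsentRecord`, its fields, and its methods are hypothetical names. Re-enrollment after withdrawal is deliberately left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional, Set, Tuple


class Use(Enum):
    """Use categories taken from the examples in the text above."""
    ACADEMIC_RESEARCH = "academic research"
    COMMERCIAL_DEVELOPMENT = "commercial development"
    TRAINING = "training"
    PUBLIC_DEMONSTRATION = "public demonstrations"


@dataclass
class ConsentRecord:
    """Hypothetical per-contributor consent state (illustrative only)."""
    contributor_id: str
    allowed_uses: Set[Use] = field(default_factory=set)
    withdrawn_at: Optional[datetime] = None
    # Append-only log of changes, so contributors can be told what changed and when.
    history: List[Tuple[datetime, str]] = field(default_factory=list)

    def _log(self, event: str) -> None:
        self.history.append((datetime.now(timezone.utc), event))

    def grant(self, use: Use) -> None:
        """Opt in to one specific type of use."""
        self.allowed_uses.add(use)
        self._log(f"granted: {use.value}")

    def revoke(self, use: Use) -> None:
        """Opt out of one specific type of use, at any time."""
        self.allowed_uses.discard(use)
        self._log(f"revoked: {use.value}")

    def withdraw(self) -> None:
        """Withdraw entirely; every use is refused from this point on."""
        self.allowed_uses.clear()
        self.withdrawn_at = datetime.now(timezone.utc)
        self._log("withdrawn")

    def permits(self, use: Use) -> bool:
        """A use is permitted only if it was granted and consent was not withdrawn."""
        return self.withdrawn_at is None and use in self.allowed_uses


record = ConsentRecord("contributor-001")
record.grant(Use.ACADEMIC_RESEARCH)
print(record.permits(Use.ACADEMIC_RESEARCH))       # True
print(record.permits(Use.COMMERCIAL_DEVELOPMENT))  # False
record.withdraw()
print(record.permits(Use.ACADEMIC_RESEARCH))       # False
```

The point of the design is that consent is checked at the moment of use rather than only at collection, and the `history` log gives stewards something concrete to report back to contributors.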
3. Voice data carries unique and lasting risks. Unlike text or survey responses, voice data is inherently identifiable—even when names are removed. Standard anonymization practices are insufficient for speech data.
“We manually removed names, but voices are still uniquely identifiable.” — Teresa, stuttering community organizer and data collector
Advocates highlighted three interrelated challenges:
- Stuttering patterns themselves can reveal identity, even in the absence of explicit personal information
- De-identification is labor-intensive and never fully reliable, particularly at scale
- Misuse or overexposure of voice data can reinforce stigma, rather than reduce it
4. Open access with caution. While advocates and survey participants broadly supported data sharing for public benefit, they expressed discomfort with uses that drifted away from original community intent.
“We had a request to use our data to make an art installation — and it’s like, no, that’s not why people gave us these recordings.”
— Natalie, speech-language pathology professor and data steward
For sensitive speech datasets from marginalized communities, ethical data sharing requires moving beyond default open-access norms. Community members favored:
- Purpose-limited data use
- Request-based or reviewed access
- Clear expectations around citation, attribution, and reuse
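The three preferences above could be combined into a simple request-screening step, sketched here. Everything in this example is an assumption made for illustration: `AccessRequest`, `screen_request`, and the sample purpose strings are hypothetical and do not describe any existing access protocol. Requests outside the stated purposes are routed to people for review rather than decided automatically.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical request to use a community speech dataset."""
    requester: str
    purpose: str
    accepts_citation_terms: bool


def screen_request(request: AccessRequest, stated_purposes: FrozenSet[str]) -> str:
    """Return a screening outcome; the final decision stays with human reviewers."""
    # Clear expectations around citation, attribution, and reuse come first.
    if not request.accepts_citation_terms:
        return "declined: citation and reuse terms not accepted"
    # Purpose-limited use: anything outside the purposes contributors
    # were told about is escalated to community review.
    if request.purpose not in stated_purposes:
        return "escalated: purpose differs from what contributors agreed to"
    return "forwarded: eligible for reviewer approval"


# Illustrative purpose list; a real project would define its own with contributors.
purposes = frozenset({"accessible speech recognition research"})

print(screen_request(
    AccessRequest("Lab A", "accessible speech recognition research", True), purposes))
print(screen_request(
    AccessRequest("Studio B", "art installation", True), purposes))
```

Even the matching case is only forwarded to reviewers, which keeps the access decision request-based instead of open by default.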
5. Community-led stewardship is essential but difficult without structural support. Unlike academic or corporate institutions, community-led projects often lack access to legal counsel, standardized governance templates, secure data infrastructure, and long-term funding for stewardship activities such as consent management, access review, and participant communication. Grassroots organizers expressed concern about having to rely on improvised consent processes due to limited resources:
“We had to get some stock language up because we ran out of time. Most of it came from ChatGPT.”
— Alex
Community groups are asked to carry legal and ethical risk without institutional backing. Without sustained investment in governance infrastructure, even well-intentioned, community-driven data initiatives face heightened liability, misalignment with participant expectations, and burnout among organizers.
6. Governance values travel across borders. Despite differences in law, culture, and stigma between the U.S. and China, participants shared strikingly similar governance priorities: transparency, agency, community benefit, and oversight of commercial or media use. Chinese participants, however, showed significantly greater acceptance of data use in academic research and commercial product development.
From Privacy Protection to Relational Care
Based on the findings, we emphasize a shift from traditional privacy-only governance to relational governance: governance not as a static contract, but as an ongoing relationship grounded in care, accountability, and trust. This means designing governance structures that evolve with community needs, recognizing data contributors as partners rather than subjects, and prioritizing mutual benefit over extraction. These are also the core next steps that AImpower is focusing on.
What Comes Next
Winning a Best Paper Award is an honor, but the real work lies ahead. Our future work aims to turn community values into concrete tools such as actionable consent templates tailored for speech data, community-approved access protocols, governance design workshops, and sustainable stewardship models for grassroots organizations.
AI systems will increasingly shape how stuttered voices are recognized, interpreted, and represented. Ethical governance must ensure that people who stutter remain at the center of those decisions.
Acknowledgments
This work would not exist without the stuttering advocates and community members in the U.S. and China who generously shared their experiences, concerns, and hopes. Special thanks to StammerTalk, Proud Stutter, SPACE, and all partners who assisted with recruitment. We are also grateful for support from NSF Award #2427710 and the Patrick J. McGovern Foundation.
Access the Paper
“Govern With, Not For: Understanding the Stuttering Community’s Preferences and Goals for Speech AI Data Governance in the US and China” (AIES 2025 Best Paper)



