CHI – W4A 2023 Trip Report

May 31, 2023

1. Share our organization and our work

We presented our paper “‘The World is Designed for Fluent People’: Benefits and Challenges with Videoconferencing Technologies for People Who Stutter” at CHI. Check out the 10-min presentation video and our slides.

Although I have done several conference presentations before, this CHI was the first time that I disclosed my stutter during my presentation (as you can see in the presentation video). While the audience was very supportive and gave me their full attention during my presentation, they also seemed a bit taken aback by this piece of information and did not know what to do with it. I do really appreciate the few brave ones who asked questions at the end 🙂

I want to assure everyone: “yes, you can talk to me in public settings”. I love answering questions and having meaningful conversations, even if I stutter!

I also tried requesting conference accommodations for my stutter for the first time at CHI this year. Everyone I interacted with during this process was trying their best to be helpful, but it was definitely a new thing for them as well. And my end experience was…okay… I will write a separate post about conference accessibility next week.

2. Connect with the broader research community

With over 4000 participants, CHI was a great place to catch up with many old friends in person, after a three-year break.

We joined the Cornell InfoSci Reunion dinner, energized by all the new faces since Niranjan and I left Ithaca.

We also connected with the authors of the other two full papers on stuttering, and had lots of good discussions! It is exciting to see three full papers on stuttering presented at this year’s CHI (only one in the past 20 years!). We hope to collectively raise broader awareness of and interest in this topic within the HCI/CSCW community, and to inspire more research at the intersection of stuttering and technology.

As the Program co-chair for Web4All, I was glad to see many new faces at W4A this year! The students were particularly inspiring: I was so glad to see the accessibility challenge awards and the best technical paper award all go to papers that were led by students!

Lots of fun and memories.

Photo collage of Shaomei with other conference participants and AImpower members, along with the landscapes of Hamburg and Austin.

3. Learn and get inspired

Beyond all the socializing and networking, I love conferences because I can learn about exciting new research in a short period of time! Although I couldn’t make it to all the sessions that sounded interesting, I did come across many papers that were inspiring, informative, and relevant to our work at AImpower. I will list a few of them below. Please share your favorites in the comments!

1) Speech & voice technology

As I mentioned, there were several other papers studying how people who stutter interact with technologies at this year’s CHI.

I really liked “From User Perceptions to Technical Improvement: Enabling People Who Stutter to Better Use Speech Recognition” by Colin Lea, Zifang Huang, Jaya Narain, Lauren Tooley, Dianna Yee, Tien Dung Tran, Panayiotis Georgiou, Jeffrey Bigham, and Leah Findlater from Apple Research. They found that, like everyone else, people who stutter see the utility of speech recognition systems and want to use them more. However, they face significant technological barriers: people who stutter are frequently cut off while speaking, and are more likely to be misunderstood by current systems.

The paper also benchmarked current automatic speech recognition (ASR) model performance on stuttered speech, and showed that the average word error rate (WER) for stuttered speech is 19.8%, about 4 times higher than what is reported (~5%) for the general public! For people with a severe stutter, the average WER is 49.2%, meaning that the system would misinterpret roughly every other word, which is basically unusable. A graph in the paper benchmarks the WER of Apple’s ASR systems on stuttered speech over the past 5 years, showing a range from 19.7% to 29.5%. While we do see a trend of improved performance, it is still way out of the acceptable range for commercial systems, raising fairness concerns about the consequences of deploying such systems before closing the performance disparity.
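For readers who have not worked with WER before, here is a minimal sketch of how the metric is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between what was said and what the system transcribed, divided by the number of words actually said. The function name and the example sentences are my own illustration, not taken from the paper, and real benchmarks usually apply more careful text normalization than the simple lowercasing used here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from transcript
            insertion = curr[j - 1] + 1   # extra word in transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: three of the six spoken words are transcribed wrongly.
print(word_error_rate("set a timer for ten minutes",
                      "set the time for two minutes"))  # 0.5
```

A WER of 0.5, as in this toy example, is roughly what the paper reports on average for people with a severe stutter.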

Indeed, as illustrated by Kimi Wenzel, Nitya Devireddy, Cam Davidson, and Geoff Kaufman from CMU, an ASR system with a higher-than-average WER is not only an annoyance, but could also cause real psychological harm to marginalized groups. In their paper, “Can Voice Assistants Be Microaggressors? Cross-Race Psychological Responses to Failures of Automatic Speech Recognition”, they tested a Wizard-of-Oz (WoZ) ASR system with low and high WER settings with Black and White participants, and found that Black participants under the high WER setting reported significantly higher levels of self-consciousness and lower levels of self-esteem after using the technology, while the same effect did not occur for White participants.

They argued that an ASR system with high WER is a type of microaggressor, and that frequently misinterpreting and misunderstanding someone’s speech is a form of microaggression that could do real damage to one’s self-esteem and mood, especially for social groups that experience microaggressions and discrimination more often than others.

A bad system is not just a bad system: it can cause real personal and social harm, which often goes unnoticed by “mainstream” users and developers!

These two papers made me even more convinced that we should NOT deploy current ASR systems broadly before closing the performance disparity or understanding its impact on different groups.

There were also several technical works that develop different forms of “digitalized speech therapy” for people who stutter. For example, Li Feng, Zeyu Xiong, Xinyi Li, and Mingming Fan from the Hong Kong University of Science and Technology (Guangzhou) presented a system called “CoPracTter” for Chinese people who stutter to practice their speech with social support from other community members. While I appreciate the spotlight on the stuttering community in China and see the technical merits of this work, I have some reservations about the focus on fluency and hope systems like that do not reinforce the current prejudice against stuttered speech.

Another example is “Speak in Public”, a tool developed by Francesco Vona, Francesca Pentimalli, Fabio Catania, Alberto Patti, and Franca Garzotto at Politecnico di Milano that uses VR, biosensors, and emotion recognition to simulate different real-life speaking situations for people who stutter as part of exposure therapy. I do really like how this tool focuses on uncovering the emotional challenges associated with stuttering rather than the observables, but I also wish it were positioned less as a “treatment” option and more as an “empathy” builder, as I believe that stuttering should be accepted as a valid form of human speech, not “treated” as if it is a disease.

Two papers from Japan showed promise for ASR without voice. Zixiong Su, Shitao Fang, and Jun Rekimoto from the University of Tokyo built “LipLearner”, a mobile application that lipreads short commands and phrases with one-shot learning (i.e. the user only needs to record one example for each new command). Jun Rekimoto also presented a system called “WESPER” that recognizes whispered speech and regenerates it in a synthetic voice in real time without previous training.
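To give a feel for what “one example per command” can mean in practice, here is a generic sketch of one-shot recognition by nearest-neighbor matching. This is not LipLearner’s actual implementation: I am assuming that some pretrained encoder (not shown) has already turned each recording into a fixed-length embedding vector, and the class name, the 0.8 similarity threshold, and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class OneShotCommandRecognizer:
    """Stores one example embedding per command and matches new inputs to it."""

    def __init__(self, min_similarity=0.8):  # threshold is an arbitrary assumption
        self.examples = {}
        self.min_similarity = min_similarity

    def enroll(self, command, embedding):
        # The single recorded example becomes the reference for this command.
        self.examples[command] = list(embedding)

    def recognize(self, embedding):
        # Return the most similar enrolled command, or None if nothing is close.
        best_command, best_score = None, self.min_similarity
        for command, example in self.examples.items():
            score = cosine_similarity(embedding, example)
            if score >= best_score:
                best_command, best_score = command, score
        return best_command


recognizer = OneShotCommandRecognizer()
recognizer.enroll("play music", [1.0, 0.0, 0.0])  # toy embeddings
recognizer.enroll("stop", [0.0, 1.0, 0.0])
print(recognizer.recognize([0.9, 0.1, 0.0]))  # play music
print(recognizer.recognize([0.5, 0.5, 0.7]))  # None
```

The appeal of this style of design is that adding a new command requires no retraining, only storing one more example.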

2) Autoethnography as a research method

I have been increasingly interested in autoethnography as a research method, especially for researchers with marginalized identities and/or with personal experiences of the research topic. I was glad to see a few papers demonstrating this method in this year’s CHI and will definitely draw inspiration from them in AImpower’s own research.

For example, in “Self-Tracking to Do Less: An Autoethnography of Long COVID That Informs the Design of Pacing Technologies”, Sarah Homewood from the University of Copenhagen used a Fitbit to track her activities over 18 months while living with long COVID. Her autoethnographic accounts explored and uncovered the design space for fitness-tracking technologies that support their users in “doing less” as a way to manage chronic illness and maintain long-term wellness.

Another interesting autoethnographic work is “Maintainers of Stability: The Labor of China’s Data-Driven Governance and Dynamic Zero-COVID” by Yuchen Chen (Univ. of Michigan), Yuling Sun (East China Normal University), and Silvia Lindtner (Univ. of Michigan). The authors described the “big data + iron feet” approach taken by the Chinese government during the Shanghai 2022 COVID lockdown, and the roles that technologies and volunteers played in it. Much of the data used in this paper was collected by Yuling Sun (the corresponding author) through autoethnographic writing about her personal experiences as a volunteer frontline worker during the four-month lockdown of her university campus from March to June 2022.

Autoethnography provides unique value for these papers as it gives the researchers access to fine-grained, private data that would otherwise be inaccessible to third parties. It also enables the researchers themselves to turn a challenging experience or aspect of their lives into novel scientific insights that contribute to our collective discourse of HCI research.

3) Accessibility

I might be biased, but there seemed to be a greater presence of accessibility-related work at CHI this year, which I found very encouraging. I did not get to see all of it, but here are a few papers that I saw and liked.

“Towards Inclusive Avatars: Disability Representation in Avatar Platforms” by Kelly Mack, Rai Ching Ling Hsu, Andrés Monroy-Hernández, Brian A. Smith, and Fannie Liu. The authors conducted pretty thorough research on how people with disabilities want to represent themselves and their disabilities on virtual platforms, and provided some good ideas about how to support users with disabilities in doing so, like showing assistive technologies, showing the full body instead of only the head, and offering ways to show invisible disabilities.
“Accessibility Barriers, Conflicts and Repairs: Understanding the Experience of Professionals with Disabilities in Hybrid Meetings” by Rahaf Alharbi, John Tang, and Karl Henderson. Another paper on the topic of disabilities and virtual meetings – it is hard! The authors interviewed people with a wide range of disabilities to uncover their challenges with hybrid meetings. I like how the authors provided recommendations for both the technology (e.g. allow users more control over their viewing experience) and meeting organizers (e.g. practice access check-ins). It is never just a technical problem!
“Counterventions: a reparative reflection on interventionist HCI” by Rua Mae Williams, Louanne Boyd, and Juan E. Gilbert. This is technically not an accessibility paper, but it tackles a broader issue in the tech community: an overfocus on “intervention”. The paper categorized common current narratives for accessibility work in HCI (i.e. charity, techno benevolence, mutual beneficence, edge-case for innovations, clinical intervention for human body) and suggested reparative readings of past research as a way to build counternarratives in accessibility research that are truly respectful and equitable to people with disabilities. Solid critical work; I recommend everyone give it a read.

While AImpower’s work has been geared towards the “experience” aspects of accessibility, i.e. creating new and empowering technological experiences for people with disabilities beyond existing guidelines and legal requirements, attending W4A reminded me that a lot of accessibility challenges are much more fundamental: most websites still fail or struggle to meet accessibility requirements, and many types of digital content are still not accessible to screen reader users.

The Web Content Accessibility Guidelines (WCAG) appeared frequently in W4A papers and follow-up discussions. They are widely regarded as the basic standard and a compliance framework for web accessibility. However, despite lots of effort put into developing these guidelines, and a whole “accessibility compliance” industry that implements and evaluates them for companies and governments, the current state of web accessibility is still far from satisfactory for both users and developers. The paper “A Platform to Check Website Compliance with Web Accessibility Standards” by Reinaldo Ferraz, Ana Duarte, João Bárbara, Adriano Pereira, and Wagner Meira found that less than 0.5% of Brazilian websites fully (or almost fully) meet the requirements defined in the guidelines. And even when developers want to make their sites accessible, it is challenging to understand and fully implement WCAG. In “Accessibility Metatesting: Comparing Nine Testing Tools”, the author, Jonathan Pool, tried nine different tools for catching accessibility issues on 121 web pages, only to find that each tool discovered numerous issues missed by all the other eight tools.
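The “issues missed by all the other tools” comparison can be pictured as simple set arithmetic. Below is a small sketch of that idea only; it is not the paper’s methodology, the tool names and issue identifiers are made up, and it assumes findings from different tools have already been mapped to comparable issue IDs (which is the hard part in practice).

```python
from collections import Counter


def issues_unique_to_each_tool(findings: dict[str, set[str]]) -> dict[str, set[str]]:
    """For each tool, return the issues that no other tool reported."""
    # Count how many tools reported each issue.
    reported_by = Counter(issue for issues in findings.values() for issue in issues)
    return {
        tool: {issue for issue in issues if reported_by[issue] == 1}
        for tool, issues in findings.items()
    }


# Hypothetical findings for a single page from three imaginary tools.
findings = {
    "tool_a": {"img-missing-alt", "low-contrast", "empty-link"},
    "tool_b": {"img-missing-alt", "missing-form-label"},
    "tool_c": {"low-contrast", "skipped-heading-level"},
}

for tool, unique in issues_unique_to_each_tool(findings).items():
    print(tool, sorted(unique))
```

If every tool ends up with a non-empty set, as the paper found across nine tools, then no single tool can stand in for the rest.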

I also have a deep appreciation for works to make different types of content accessible to screen reader users, including but not limited to chemical structural formulas, mathematical diagrams, 3D games, and data visualizations:

“AutoChemplete – Making Chemical Structural Formulas Accessible” by Merlin Knaeble, Gabriel Sailer, Zihan Chen, Thorsten Schwarz, Kailun Yang, Mario Nadj, Rainer Stiefelhagen, and Alexander Maedche
“Authoring Web-accessible Mathematical Diagrams” by David Austin and Volker Sorge
“Amaze3D: Making 3D Worlds Accessible to Blind Gamers” by Greg Gay and Matthew Ralston
“Understanding and Improving Drilled-Down Information Extraction from Online Data Visualizations for Screen-Reader Users” by Ather Sharif, Andrew Mingwei Zhang, Katharina Reinecke, and Jacob O. Wobbrock

4) Ethics & equity

I also really liked several papers that studied the ethics and power dynamics in technology, especially from the perspectives of marginalized communities.

One of my favorite papers at this year’s CHI is “Old Logics, New Technologies: Producing a Managed Workforce on On-Demand Service Platforms”, by Anubha Singh, Patricia Garcia, and Silvia Lindtner from the University of Michigan. The authors studied how on-demand delivery service platforms (e.g. Swiggy) control and manage their workforce in ways that force workers to endure conditions of extreme overwork and exhaustion despite their dissatisfaction and attempts to protest and resist. They showed how algorithms reshaped the labor process while reinforcing the mechanisms of the capitalist system, e.g. by providing financial incentives for earning 5-star ratings, disciplining workers through granular information capture, and reframing workers as “delivery executives” or “delivery partners” to create a sense of belonging to the high-tech industry. As a result, the “exploitation appears self-driven and voluntary, more intense, harder to notice, and therefore difficult to resist”. Systems like this also exploit the lower socio-economic status of the workforce: they are able to keep a large army of backup workers in reserve and to suspend or remove workers who try to organize and resist.

As AImpower recently started engaging with the disability community on the topic of technology and economic empowerment, I personally resonate with many of the observations made in this paper. While technologies like crowdsourcing platforms and AI have introduced employment opportunities for those who might have been rejected or undervalued by the traditional labor market, such technologies still run on top of the capitalistic values of maximizing profits over cost, and are becoming more efficient than ever at controlling and manipulating human workers.

Another interesting work in this category is “Playing with Power Tools: Design Toolkits and the Framing of Equity” by Adrian Petterson, Keith Cheng, and Priyank Chandra from the University of Toronto. They leveraged Nancy Fraser’s dimensions of justice to review 17 design toolkits that all aim to help designers create more equitable technology, and found both common frameworks and various limitations across these toolkits. They also offered some good suggestions for approaching equity through design that seem valuable for AImpower to reflect on and reference in our own technology design process.

This post is a lot longer than I expected! 😛

TL;DR: we learned a lot at CHI and W4A and look forward to incorporating these learnings into our upcoming research and technical work.

The next conference we will be attending is the National Stuttering Association’s annual conference in Fort Lauderdale, Florida in early July. We will host a workshop titled “Claim Your Virtual Presence”! I will share more details about the workshop on our blog soon and hope to see you there!


aimpowerwp
