Online speech perception experiments: Democratizing science and teaching.

Christina Y. Tzeng, San José State University

Full Transcript:

Christina Y. Tzeng:
All right. Thank you, Rachel, for the introduction at the beginning of the session, and to both you and Joshua for gathering us in this space. I also want to say thank you upfront to everyone who is still here at the last talk of the day. I am excited to share my experience conducting speech perception experiments online, highlighting the power of these online experiments to democratize science and teaching. I'll aim to achieve two objectives in my talk today. The first is to share some findings that add to a growing body of evidence that online speech perception experiments are highly efficient and yield robust data. The second objective is to share some thoughts on how online experiments, more broadly, can make science more accessible for both researchers and participants.

Christina Y. Tzeng:
In my work, I study how we, as listeners, overcome the enormous amount of variation that we encounter when we listen to different voices and utterances. Our experiments typically require participants to listen to auditory stimuli and make a response on a computer to each one. For in-person or in-lab experiments, this would typically require what's pictured on the left: a sound-attenuated, distraction-free booth, high-quality headphones, and specialized software and hardware.

Christina Y. Tzeng:
As a disclaimer, I have to state that my first real dive into the world of online experiments was in late 2019, which makes me a relatively new user of these online experimental methods. But this is when I started to wonder, “Is such a highly controlled listening environment really necessary?”

Christina Y. Tzeng:
In the interest of achieving this first objective, I'd like to share what are now published findings from my first foray into the online experiment world. This is work done in collaboration with my colleagues, Dr. Lynne Nygaard and Dr. Rachel Theodore, where we examined the time course of a phenomenon called lexically guided perceptual learning.

Christina Y. Tzeng:
We know that listeners use a whole host of cues to map the acoustics of the speech signal onto linguistic units. One of these cues is lexical knowledge. Imagine hearing a fricative sound that's between an S and an SH sound. If that ambiguous sound is embedded in the word on the left, the listener hears it as an S, as in dinosaur. But if that same ambiguous sound is instead embedded in the word on the right, the listener hears it as an SH, as in efficient. Listeners can thus be exposed to these ambiguous sounds in stable lexical contexts that bias them to hear either the S or the SH sound.

Christina Y. Tzeng:
What we then see are changes in the listener's representations of their S and SH categories. These changes in sound category representation are what we call lexically guided perceptual learning. In both the online and in-person versions of this task, the lexically guided perceptual learning paradigm takes about 20 minutes to complete. Listeners complete an exposure phase followed by a test phase. In the exposure phase, they complete a lexical decision task where they hear an ambiguous sound, such as a fricative between S and SH. One group hears this ambiguous sound embedded in words biasing them to hear it as an S, whereas another group is biased to hear that same sound as an SH. After exposure, the listeners complete a phonetic categorization task where they identify ambiguous sounds on a nonword continuum as either asi or ashi.
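
To make the structure of the paradigm concrete, here is a minimal sketch of how the two phases might be laid out in code. This is not the published implementation (the experiments were built in Gorilla), and the item words, continuum length, and repetition counts are all illustrative stand-ins.

```python
# Hypothetical sketch of the two-phase paradigm; item words, continuum
# length, and repetition counts are illustrative, not the published stimuli.
import random

# Exposure phase: lexical decision items. "?" marks the ambiguous fricative
# embedded in a word whose lexical context biases its interpretation.
S_BIASED_WORDS = ["dino?aur", "rehear?al"]      # context biases toward /s/
SH_BIASED_WORDS = ["effi?ient", "publi?ation"]  # context biases toward /sh/

def build_session(bias_group, continuum_steps=7, test_reps=10):
    """Return shuffled exposure and test trial lists for one participant."""
    words = S_BIASED_WORDS if bias_group == "s" else SH_BIASED_WORDS
    exposure = [{"phase": "exposure", "task": "lexical_decision", "word": w}
                for w in words]
    # Test phase: categorize steps along a nonword asi/ashi continuum.
    test = [{"phase": "test", "task": "categorization", "step": step}
            for step in range(1, continuum_steps + 1)
            for _ in range(test_reps)]
    random.shuffle(exposure)
    random.shuffle(test)
    return exposure, test

exposure_trials, test_trials = build_session("s")
```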

Christina Y. Tzeng:
We drew our samples from Prolific and executed the experiments in Gorilla. We completed a total of six experiments in this publication, but in the interest of time, I'll share the findings from one. What will appear here are the results of the phonetic categorization task at test, where, upon hearing ambiguous sounds on the asi/ashi continuum, we measured the likelihood that participants heard those sounds as asi. Here, we see robust evidence for lexically guided perceptual learning. Listeners were more likely to hear the ambiguous sounds as asi when they were biased to hear S during exposure, indicated by the red line, than when they were biased to hear the sounds as SH during exposure, shown here by the green line.
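
The analysis behind a figure like this can be sketched in a few lines: fit a psychometric (logistic) function to each group's proportion of asi responses across the continuum and compare where the curves cross 50%. The data below are simulated for illustration, not the published results.

```python
# Illustrative analysis sketch with simulated data, not the published results.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Psychometric function: P('asi' response) at continuum step x."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

steps = np.arange(1, 8)  # a 7-step asi/ashi continuum (illustrative)

# Simulated proportions of 'asi' responses per step for each exposure group.
p_s_biased = np.array([0.98, 0.95, 0.90, 0.75, 0.55, 0.30, 0.10])
p_sh_biased = np.array([0.95, 0.88, 0.70, 0.45, 0.25, 0.10, 0.05])

for label, props in [("S-biased", p_s_biased), ("SH-biased", p_sh_biased)]:
    (x0, k), _ = curve_fit(logistic, steps, props, p0=[4.0, -1.0])
    print(f"{label}: 50% boundary at step {x0:.2f}, slope {k:.2f}")

# Perceptual learning shows up as the S-biased group's boundary sitting
# closer to the ashi end, i.e., more of the continuum heard as 'asi'.
```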

Christina Y. Tzeng:
To showcase the high level of data quality that we see at the individual level, here are separate plots for each of the 70 participants at test, where we can see the expected psychometric curves for every single participant. We only excluded 5% of our participants across the six experiments due to failure to perform the task. We did have to exclude 16% of the total number of participants due to failure to pass the Woods et al. headphone check that Dr. Theodore described at the beginning of the session. But this was a small price to pay, given the speed of data collection. For example, we collected data from the 70 participants presented in Experiment 1 in under a single hour.
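
An automated version of that participant-level check might look like the sketch below: fit the same logistic function to each individual's responses and flag flat, reversed, or unfittable curves as task failures. The slope threshold is an illustrative choice, not the published exclusion criterion.

```python
# Hypothetical per-participant quality check; the threshold is illustrative.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

def failed_task(steps, prop_asi, min_abs_slope=0.3):
    """Flag a participant whose curve is flat, reversed, or unfittable."""
    try:
        (_, k), _ = curve_fit(logistic, steps, prop_asi,
                              p0=[4.0, -1.0], maxfev=5000)
    except RuntimeError:  # fit did not converge
        return True
    # 'asi' responses should fall across the continuum: expect k < 0 and a
    # reasonably steep slope; anything else suggests random responding.
    return k >= 0 or abs(k) < min_abs_slope
```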

Christina Y. Tzeng:
I hope what I've shared has supported the idea that online speech perception experiments are highly efficient and yield robust findings, even with auditory tasks that require fine-grained phonetic discriminations like the one I presented.

Christina Y. Tzeng:
I now want to turn to the idea that online experiments can provide us with two things in particular: access to a larger and more diverse pool of participants, and also more user-friendly experiment-building interfaces for our students and research mentees.

Christina Y. Tzeng:
This is the figure I showed earlier. We replicated the finding with another n of 70 participants using a second stimulus set, shown here on the right, meaning we ran a total of 150 participants within the span of about an hour and a half, which using in-person methods would have taken us weeks or even months.

Christina Y. Tzeng:
For his master's thesis, one of my student collaborators, Ulises Quintero, is interested in recruiting participants who speak English and a second language. In Prolific, if we use our standard inclusion criteria, including this criterion of speaking English plus another language, we automatically have access to over 3,000 participants, which is orders of magnitude greater than what we would have access to using in-person methods. For his undergraduate honors thesis, Justin Au built a talker ID task in Gorilla on his own, using primarily the tutorial support on Gorilla's website as a guide.
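
The prescreening logic itself is simple to picture. Prolific applies such filters through its study setup interface; the sketch below just illustrates the same idea on a made-up participant metadata table, so every column name and criterion here is a hypothetical stand-in, not Prolific's actual schema.

```python
# Hypothetical illustration of prescreening; not Prolific's actual schema.
import pandas as pd

pool = pd.DataFrame({
    "participant_id": ["p1", "p2", "p3", "p4"],
    "first_language": ["English", "English", "Spanish", "English"],
    "other_languages": [["Spanish"], [], ["English"], []],
    "approval_rate": [99, 97, 100, 88],
})

# Standard inclusion criteria plus "English and a second language".
eligible = pool[
    (pool["first_language"] == "English")
    & (pool["other_languages"].map(len) > 0)   # speaks a second language
    & (pool["approval_rate"] >= 95)            # illustrative quality criterion
]
print(f"{len(eligible)} eligible participants in this toy pool")
```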

Christina Y. Tzeng:
I'd like to close by addressing the two questions about auditory research more broadly that Rachel shared at the beginning of the session. The first is, “What do you think is the biggest challenge for auditory research online, and how do you overcome it?” As Jason mentioned, due to the pandemic, we have all been forced to some extent to embrace online methods more readily, but I think we are still very much in the process of establishing both the validity and the reliability of these methods. One way for us to do this is to run online and in-person experiments in parallel, so that we, not just as individual researchers but as a field, can be reassured that our tasks can be successfully transferred across these different platforms.

Christina Y. Tzeng:
The second question was, “What can auditory research gain most from online methods?” My take is that, given how quickly we can collect data from a whole range of different populations, we've essentially eliminated the data collection bottleneck. Adapting in-person experiments to the online world takes a lot of trial and error, and I'm still very much in that learning phase, but I think that the reduction of this bottleneck drastically changes the pace of auditory research and science more broadly.

Christina Y. Tzeng:
With that, I'd like to extend my gratitude to my recent collaborators as well as to all of you for your attention. I look forward to your questions and comments.

Rachel Theodore:
Excellent, Christina. Thank you so much for those really careful thoughts. Questions? Yeah, here's one for you, Christina. “I was wondering if, in your work, you've observed the use of different exposure phase methods besides lexical decision in an online world, such as story listening or cloze sentences, and if you've noticed any differences at test as a function of those exposure phase methods?”

Christina Y. Tzeng:
Thanks for that question. Again, coming back to the disclaimer that I'm a relatively new user of online methodology in general for auditory research, we've only done some pilot work using other kinds of exposure methodology. We're in the process of piloting a talker identification task, where, during the exposure phase, listeners will hear utterances spoken by specific talkers and have to indicate which talker they think they're hearing, with the ultimate goal of being able to identify the different voices in the task. So far, we haven't seen any kind of noticeable difference in performance between in-person/in-lab versions of that task and online versions. What we do notice is that, sometimes, participants will take self-initiated breaks. So one lesson we've learned is that, in addition to keeping the task relatively short, we build in some breaks so that they're not leaving the computer for an extended period of time. But the short response to that question is that, at least with talker identification tasks and this lexically guided perceptual learning task, we haven't seen reasons not to transfer these into the online world.
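
That lesson about breaks is easy to operationalize. Here is a minimal sketch, assuming a trial list represented as simple dictionaries; the break interval and message are arbitrary illustrative choices.

```python
# Sketch: interleave scheduled breaks into a long online task so that
# participants pause predictably instead of walking away mid-block.
def insert_breaks(trials, every=40, message="Take a short break."):
    """Insert a break screen after every `every` trials."""
    sequence = []
    for i, trial in enumerate(trials, start=1):
        sequence.append(trial)
        if i % every == 0 and i < len(trials):
            sequence.append({"type": "break", "message": message})
    return sequence

# A 200-trial talker-identification run gets breaks after trials 40, 80,
# 120, and 160.
schedule = insert_breaks([{"type": "trial", "n": n} for n in range(1, 201)])
```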
