Sur­pris­ing sounds influ­ence risky deci­sion making


Glo­ria W Feng — Yale University


Surprising sensory events are common in daily life but often behaviorally irrelevant. Here, we tested whether incidental surprises influence decision making across six online experiments built on the Gorilla platform.

Par­tic­i­pants (n=1200) made choic­es between risky and safe options in which each option pre­sen­ta­tion was pre­ced­ed by task-irrel­e­vant six-tone audi­to­ry sequences. In two exper­i­ments (each n=200), “com­mon” sequences heard before 75% of tri­als con­sist­ed of iden­ti­cal tones and “rare” sequences heard before 25% of tri­als end­ed with a nov­el deviant tone. Rare sequences simul­ta­ne­ous­ly increased risk tak­ing and increased switch­ing away from the option cho­sen on the pre­vi­ous trial.

Our com­pu­ta­tion­al mod­el cap­tured both changes with val­ue-inde­pen­dent risk-tak­ing and choice per­se­ver­a­tion para­me­ters, respec­tive­ly. When sequence prob­a­bil­i­ties were reversed such that rare sequences con­sist­ed of the six iden­ti­cal tones, par­tic­i­pants still increased option switch­ing after hear­ing these sequences but did not increase risk tak­ing. In two con­trol exper­i­ments, both effects were elim­i­nat­ed when sequences were pre­sent­ed in a pre­dictable man­ner. The choice switch­ing effect may arise not from tone nov­el­ty but from rec­og­niz­ing sur­pris­ing sequences.

Thus, we find evi­dence for two dis­so­cia­ble influ­ences of sen­so­ry sur­prise on deci­sion mak­ing. Aber­rant sen­so­ry pro­cess­ing is impli­cat­ed in psy­chi­atric dis­or­ders includ­ing schiz­o­phre­nia and psy­chosis. Our find­ings offer a new way to eval­u­ate patients and treat­ments by exam­in­ing the rela­tion­ships between sen­so­ry pre­dic­tion errors and behavior.

Alto­geth­er, we find that sur­pris­ing sounds sys­tem­at­i­cal­ly alter human behav­ior, iden­ti­fy­ing a pre­vi­ous­ly unrec­og­nized source of behav­ioral vari­abil­i­ty in every­day deci­sion making.

Full Tran­script:

Glo­ria Feng 0:00
Thank you, everyone. My name is Gloria and I will be talking about my project titled “Surprising sounds influence risky decision making”. So this project was conducted fully online and consists of four main experiments which I ran over the course of about a year. And I’m excited to share with you its results and also what it has taught me about online research.

So to start, imagine that you’re in a busy city. In this urban jungle, there are surprising sensory events everywhere. You can imagine the sound of a car honking for you to get out of the way, or the sound of a subway car approaching you. So in response to these surprising sounds, you might quickly change your behaviour. This could mean changing course on the pedestrian crossing or quickly jerking backwards away from the platform edge. In both of these instances, an immediate behavioural response to a surprising sound can be truly adaptive, because it can protect you from an immediate danger, or alert you to a potential reward. However, after being in these environments for a while, you might notice that most of the time, the abundant noises that you’re surrounded by are actually behaviourally irrelevant.

Think about a time when you were stuffed inside a crowded subway car on your morning commute. And while you’re trying to focus on doing a crossword puzzle, or as you’re writing up a message to a friend, you hear someone’s ringtone going off on the side, or the sound of a conversation happening in the background. These are also considered surprising sensory events, but it’s a little bit less intuitive what kind of immediate effects they might have on your behaviour, if any, and whether those effects are systematic or not. So this is the kind of thing that we were wondering about: whether task-irrelevant or behaviourally irrelevant surprising sounds really affect our behaviour. And we decided to look at this general question in the smaller domain of risky decision making.

And so now the question kind of becomes: do surprising sounds systematically affect our risky decision making, even when those surprising sounds are actually task irrelevant? So this is the way that we took a stab at this question. We asked participants to make choices between a risky gamble option, so like an unbiased coin flip, essentially, and a safe option on every trial. With a keyboard press, they can decide to choose the risky option, and after a short delay, they get to see whether they’ve won or lost. Or they can choose the safe option by pressing another key.

On every trial, participants can see one of three different types of trials. So they can either see a gain trial, which is at the top, which features either a potential gain or a smaller potential gain, or a loss trial, which involves only potential losses, or finally, a mixed trial, which contains a mixture of potential gains and potential losses. So this is a pretty standard paradigm used to measure and kind of capture people’s risk taking preferences. The key, though, is that we introduced auditory sequences to this paradigm.

So in the inter-trial interval, so in the three seconds before participants are shown their next set of options to make the choice, participants have to passively listen to a six-tone auditory sequence. And what’s important to note is that these auditory sequences are supposed to be task irrelevant, in the sense that whatever sounds they hear are not at all predictive of what they’re going to be shown next, and not predictive of the rewards that they’re going to get. So on a majority of trials, on 75% of trials, participants will hear what I’ll call a common sequence. So that’s on the bottom row here. And this common sequence, as shown in the graphic, consists of six identical tones. So I’m going to try to play that for everyone. And I hope it’s not too loud. Let me see. Okay, yeah, so I just played this; this is the common sequence, which consists of six identical tones. Now, on 25% of trials, on a minority of trials, people will actually hear a rare sequence. So this rare sequence will start off the same way as the common sequences do, with five tones, but at the end, it will have a different ending. So in this graphic here, it shows that it ends on a tone that has a different pitch. And so this rare sequence will sound like this.
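As an illustration of the sequence design just described, here is a minimal sketch of how such a trial schedule could be generated. The number of trials, the tone frequencies, and all names are hypothetical and not taken from the study; only the 75%/25% split and the six-tone structure come from the talk.

```python
import random

def make_schedule(n_trials=120, p_rare=0.25, seed=0):
    """Build a shuffled list of six-tone sequences (hypothetical parameters)."""
    rng = random.Random(seed)
    n_rare = round(n_trials * p_rare)
    labels = ["rare"] * n_rare + ["common"] * (n_trials - n_rare)
    rng.shuffle(labels)  # sequence type is unpredictable from trial to trial

    standard_hz, deviant_hz = 440.0, 660.0  # assumed pitches, for illustration only
    schedule = []
    for label in labels:
        tones = [standard_hz] * 6          # common: six identical tones
        if label == "rare":
            tones[-1] = deviant_hz         # rare: the final tone is a deviant
        schedule.append({"type": label, "tones": tones})
    return schedule

schedule = make_schedule()
```

Because the labels are shuffled independently of the gamble shown on each trial, the sequence type carries no information about the upcoming options or rewards, which is what makes the sounds task irrelevant.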

Okay, so now you can kind of imag­ine that some­times on a rare tri­al par­tic­i­pants would be like sur­prised, and we kind of want­ed to cap­ture or to analyse how does risky deci­sion mak­ing dif­fer on these rare tri­als as opposed to com­mon tri­als? Okay, so this is how we approached the way that we col­lect­ed our data.

So there were several factors that drew us into conducting all of our experiments online, because of the many advantages of doing online research, which include having access to large pools of participants. We also have, you know, the ability to collect large and also diverse samples. For example, in Prolific there’s the option to gender balance our samples, which is a very nice feature. And also, probably the biggest advantage is that it’s extremely time efficient to conduct studies online. Typically in the lab, if you wanted to collect a dataset of 100 participants, it could take months. And it can be incredibly expensive in time and money to run. But the fact that we can press a button and essentially collect our whole dataset in a day is a huge plus.

However, all of this flexibility and convenience does come at the expense of having the maximum amount of experimental control over our participants’ environment. In our case, the crux of our study was really to see how a specific sound manipulation can influence people’s behaviour. And so it’s extremely important for us to make sure that the sound manipulation is really doing its job, so that we know that our results can be trusted and are valid. So we came up with four different considerations, which have to do with addressing some of the common disadvantages of online research.

The first one is the question: is the sound even on? So this sounds trivial, almost. However, when people are doing experiments at home, they’re using different browsers, and they might have ad blockers on. So there’s no guarantee that they can hear the sounds; due to technical issues or something, they might not be able to. Another one is: does the audio have sufficient sound quality and clarity? So this is a big one, because we can’t control the types of audio equipment people use. And so the variability there is immense, and we want to find a way to constrain that. The third one is: are there distractions or background noise? Yeah, so participants could be doing this outside, they could be doing this in public or at home. And especially given that our study is all about studying how irrelevant surprising sounds affect people’s behaviour, we definitely want those irrelevant, surprising sounds not to come from their own environments, but from our task specifically.

And finally, we wanted to make sure that participants are following basic instructions. So this is not specific to our study, of course. In general, we want participants to be attentive, to comply with instructions and generally to be doing our task in good faith. So now I’ll show you how we structured our experiments. We used Gorilla as the hosting and experiment building platform for our experiments. And so you can see here I drew out a graphic summarising the experiment tree that participants progressed through. So in the first five minutes of the task, we have participants complete two screeners. Both of these were sourced from Gorilla’s Open Materials library, which is nice.

And so the first one is the browser autoplay sound check, so this one’s super basic. All it does is play two seconds of a music clip and ask people whether or not they can hear the music, yes or no. If they can’t, then it leads people through some instructions on how they can maybe disable an ad blocker or something to fix the problem. And if they still can’t, then participants are given the option to exit the study early and return their submission. We thought this would be adequate for addressing consideration one, which is whether the sound is on or not. Then people would progress into doing the headphone screen. So this is based on the loudness judgement test developed by Woods and colleagues. And essentially all it does is have participants listen to a sequence of three auditory tones, and then participants have to label which one sounds the quietest. What’s important to note is that this screener is really easy to pass if you’re wearing headphones, but it’s difficult to discriminate between the three tones if you are playing sound from your computer without wearing headphones. So essentially, those who achieved at least five out of six in accuracy for this screener would pass the screen.

So overall, these two screeners at the beginning resulted in around a 30% exclusion rate in our experiments. And so we collected enough data so that by the end, we were able to analyse 200 participants in our main risk taking task.
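As a back-of-the-envelope check on what a 30% exclusion rate implies for recruitment, the sketch below computes how many participants would need to start the study so that about 200 remain. The function name is hypothetical, and treating the exclusion rate as fixed and known in advance is a simplifying assumption; the talk does not say how recruitment targets were actually set.

```python
import math

def participants_to_recruit(target_n, exclusion_rate):
    """Smallest recruitment count whose expected survivors reach target_n."""
    return math.ceil(target_n / (1.0 - exclusion_rate))

# Figures from the talk: 200 analysed participants, roughly 30% excluded.
needed = participants_to_recruit(200, 0.30)
```

Under these assumptions, roughly 286 people would need to begin the screeners for each experiment.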

Okay, and very quickly, I’ll now talk about some of the specifications we used for Prolific. So we used Prolific as our main platform for recruiting our participants. For the device requirements, we just made explicit that desktop is required, and also that there’s audio in the experiment. And in the study description, we tried our best to be as upfront and clear as possible. This is not what we wrote for participants verbatim, but essentially, we wanted to get two messages across: we wanted to make sure people’s ad blocker was turned off, and also that headphones are a required part of doing this experiment. So we wanted to put that upfront before people even accepted the study and did the screeners. And lastly, for the pre-screening that we did on Prolific, we kept it quite loose, actually. We excluded participants from previous studies. So we had used Prolific to recruit participants to do pilot versions of earlier iterations of our experiment, and of course, we didn’t want to invite those same participants to come back into our main study.

Alright, so now that I’ve gone over the nuts and bolts of how we ran this experiment, I’ll talk about the results that we found. So in front of you, at the top, you see the paradigm descriptions for experiment one and experiment two, for each of which we collected 200 participants. They’re virtually identical in terms of structure: 75% of trials are common, and 25% of trials are rare with a deviant ending. The only difference is that for experiment two, we periodically switched the sides of the stimuli, left and right, every 10 trials or so. But otherwise, our predictions for the two experiments would be very similar.

So what we found was that surprising sounds increase people’s risk taking. So for the plot you see on the left here, what I’ve done is take the difference of the risk taking rate of rare trials minus common trials. So since these bars are significantly positive, that suggests that people are taking more risks on rare trials relative to common trials. And what’s nice is that experiments one and two are in agreement with each other on this result.
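The rare-minus-common difference described here can be sketched for a single participant as follows. The field names and the toy trial data are made up for illustration and are not results from the study.

```python
from statistics import mean

def risk_rate(trials, seq_type):
    """Proportion of trials of one sequence type on which the gamble was chosen."""
    return mean(t["chose_risky"] for t in trials if t["seq_type"] == seq_type)

def rare_minus_common(trials):
    """Positive values mean more risk taking after rare sequences."""
    return risk_rate(trials, "rare") - risk_rate(trials, "common")

# Toy data for one hypothetical participant (1 = chose the risky option).
trials = [
    {"seq_type": "common", "chose_risky": 1},
    {"seq_type": "common", "chose_risky": 0},
    {"seq_type": "common", "chose_risky": 0},
    {"seq_type": "common", "chose_risky": 0},
    {"seq_type": "rare", "chose_risky": 1},
    {"seq_type": "rare", "chose_risky": 0},
]
diff = rare_minus_common(trials)
```

Computing this difference per participant and then testing whether its average across participants is above zero is one way to arrive at a plot like the one described; the exact statistical test used in the study is not stated in the talk.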

But from these plots alone, we can’t tell whether this increase in risk taking is driven by only a subset of trials. So one question I had was: is this driven by gain trials only, for example? So what I did was combine these two datasets, so I had enough data, and I broke out all the trials into gain trials, mixed trials and loss trials. And then I plotted that against the rate at which people chose the risky option. So what you can see here, based on this kind of stair-step-looking pattern, is that in all three trial types, participants showed increased risk taking for rare versus common trials. So what this is suggesting now is that not only are people taking more risks in general, it’s happening in all the different trial types, irrespective of whether there are potential wins or potential losses at stake.

So with this kind of systematic effect on risk taking, we wanted to capture this in terms of a computational model. So one of the advantages of using this really basic risk taking paradigm is that it’s very well characterised computationally. So there’s a foundational theory called Prospect Theory, which captures people’s risk taking preferences as a function of people’s loss aversion (there’s a parameter for that), and there are parameters for risk aversion for gains and for losses. And finally, there’s a choice stochasticity parameter.
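To make the model structure concrete, here is a minimal sketch of a Prospect Theory choice rule with the parameters just listed, plus a value-independent bias term that applies only after rare sequences (the kind of “risky bias difference” parameter the talk adds on top of Prospect Theory). The power-law utility, the logistic choice rule, the additive placement of the bias, and all parameter names and values are assumptions for illustration; the talk does not give the exact equations.

```python
import math

def utility(x, alpha_gain, alpha_loss, loss_aversion):
    """Power-law subjective value, with losses scaled by loss aversion."""
    if x >= 0:
        return x ** alpha_gain
    return -loss_aversion * ((-x) ** alpha_loss)

def p_risky(gain, loss, safe, params, rare):
    """Probability of choosing a 50/50 gamble over the safe option."""
    a_g, a_l, lam = params["alpha_gain"], params["alpha_loss"], params["loss_aversion"]
    u_gamble = 0.5 * utility(gain, a_g, a_l, lam) + 0.5 * utility(loss, a_g, a_l, lam)
    u_safe = utility(safe, a_g, a_l, lam)
    # Assumed form: the bias shifts choice regardless of the values on offer.
    bias = params["bias_diff"] if rare else 0.0
    z = params["inv_temp"] * (u_gamble - u_safe) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameter values, chosen so the options are valued equally.
params = {"alpha_gain": 1.0, "alpha_loss": 1.0, "loss_aversion": 1.0,
          "inv_temp": 1.0, "bias_diff": 0.5}
p_common = p_risky(gain=10, loss=-10, safe=0, params=params, rare=False)
p_rare = p_risky(gain=10, loss=-10, safe=0, params=params, rare=True)
```

In this sketch, a positive bias raises the probability of gambling after rare sequences even when the gamble and the safe option have the same subjective value, which is the sense in which such a term is value independent.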

On top of Prospect Theory, we went ahead and added an additional risky bias difference parameter. Essentially, it captures a value-independent bias, the difference between risk taking for rare trials versus common trials. So for this risky bias difference parameter, a positive value would indicate increased risk taking for rare trials, whereas a negative value captures a bias towards the safe option. So on the left is a plot that you’ve already seen. And on the right, I took the model-derived risky bias difference parameter fit for the two experiments. And what we can see is that it’s significantly positive in both experiments, matching what we see in the model-independent analyses. So it’s quite reassuring, actually, that we found this.

Okay, so on the left, you see the experiment designs for experiments one and two. And what we found now is that following rare sequences, participants are increasing their risk taking. However, from these two experiments, the way that they’re designed, we’re not sure if people are taking more risks because they’re recognising that they’ve heard a rare sequence, since it happens 25% of the time, or if it’s because people are simply reacting to the deviant tone at the end of the rare sequence.

So what I did was devise two other experiments, experiments 3 and 4, such that now the rare sequence no longer ends in a novel ending; instead, it ends on the common tone. So the question now becomes: after rare sequences, do people still increase their risk taking? And as you can probably guess from the title, it actually completely eliminates the effect. So now I’m showing that when I, in a sense, remove the local surprise from the rare sequence, I actually get rid of the risk taking effect, which is a really cool and striking result.

All right, so let me summarise what I found. Incidental surprising sounds really do systematically increase risk taking. I showed that in experiments 1 and 2, and I showed that this effect is consistent across both behavioural and computational modelling based analyses. Next, I showed that the risk taking effect of surprise can be eliminated simply by slightly tweaking the statistics of the auditory surprise. And as I showed with the methods and how I built the experiment, headphone screeners were used to enhance data quality, and helped address the challenges, or you can call them disadvantages, of online research.

So that’s the thing about these disadvantages of online research, right, such as having poor control over the experimental setting or lacking experimenter supervision: these could all turn out to be a huge advantage in the end, once you’ve established your results, which is something I found. So essentially, the participants in my study were recruited from over 12 different countries, were in the presence of potential distractions, and were probably doing the experiment at different times of the day. And yet, despite all of that, we were still able to characterise clear systematic effects of surprising sounds on people’s risky decision making that were robust to all these varying conditions.

So doing this exper­i­ment online instead of in the lab def­i­nite­ly made things hard­er for us in some ways, because we had workarounds that we need­ed to do, but it ulti­mate­ly made the results feel a lot stronger and more gen­er­al­iz­able. So I think this gives me a lot of opti­mism about doing online research in the future. And it can be daunt­ing, but also real­ly reward­ing in the dis­cov­er­ies it allows us to make. Thank you very much. That’s the end.

Jo Ever­shed 15:37
Gloria, that’s amazing. I love that last point you were making, how we love the control of the lab: it feels safe and controlled, but it makes our results less robust and less reliable. And of course, taking the research online makes it harder to get it right, to get that data and to design your experiment so that it works and you believe the data. But when it does work, you feel much more confident that the result is robust and is going to persist into different environments. So that was a really lovely point at the end.

I did have one question for you. Once participants have passed the headphone screener at the beginning, how do you make sure that participants continue to play the task with sound throughout the whole experiment? Is that something you looked at?

Glo­ria Feng 16:28
Yeah, that’s a really good question, Jo. So one thing that I didn’t discuss in this talk, since I only focused on the pre-screeners that happened before the experiment, is that we actually had some listening checks during the actual main experiment, the main task. So we call these expectation checks, where participants have to play both tone sequences, and then they have to label which one they felt was common or not. So this was kind of a test of whether they’ve, you know, listened to the instructions on keeping their headphones in throughout the task, and also whether they’ve comprehended the task enough to distinguish between what is rare and what is common. So we asked this both at the beginning of the experiment and also at the end of the experiment.

And we did realise that, you know, some, I guess, bad actors could potentially not be wearing headphones throughout the whole experiment, and then when they see a question like this pop up, put the headphones back in, listen to it and answer it correctly. So in that way, the check is still corruptible. But we think that probably can’t be very common, I suppose. So, we did have this. And thankfully, after looking at my data to see people’s accuracy on these expectation questions, average accuracy was above 95%. So that was quite reassuring overall. So it seemed that after the screener, people were quite good at the task and also able to discriminate between the different sequences.

Jo Ever­shed 17:56
Yeah, so it sounds like we don’t get many bad actors on Prolific, which is what they promised us. I don’t know if you heard the talk from Prolific this morning; they do quite a lot of checks to make sure we get good quality participants from them. So that’s all very reassuring. There is one question in the Q&A, which is from Laura: this result seems opposite to what one would expect. How do you interpret the increased risk taking after a surprising sound, if participants might process it as a threat cue? Although your tones are fairly neutral, I guess.

Glo­ria Feng 18:26
Yeah, exactly. Um, thanks. That’s a great question. And that’s something that we had been thinking about a lot, which is: what is the valence of these surprising sounds that are occurring? So I guess we tried our best to keep that as neutral as possible, in that we weren’t using stimuli that are known to be aversive, such as screams or things like that. It is very interesting that it increases risk taking. I think the way that we were understanding this effect was that there was kind of an orienting response that participants might be getting when they hear the surprising tone, which we thought was consistent with approach behaviour. You can think of a naturalistic example, like a frog: suddenly a stimulus like a fly comes up, and the frog immediately decides to approach that stimulus. So based on the behavioural results, we found that this was kind of a value-independent approach motion, and we thought this was consistent with this bias towards risky decision making. Yeah.

Jo Ever­shed 19:34
Bril­liant, thank you, Glo­ria. Thank you so much for your time Gloria.

