Bryony Payne, UCL
@bryony_payne
My research looks into the perceptual bias we afford to voices that belong to us or to others and how this bias might be modulated by using the voices in a social context. With the challenge of moving our research online, we needed a socially interactive online environment that individual participants could access remotely.
To this end, we created a cooperative, two-player online game in which participants were able to choose a new synthesised voice to represent themselves and then use that voice to interact with another participant in a 30-minute drawing game. At test, we assessed whether social use of the voice modulated the degree of perceptual bias afforded to it via a perceptual matching paradigm. Specifically, we compared the bias demonstrated by participants who played this online game (n=44) to a control group (n=44) who had only brief exposure to the voices and did not play the game. Results show that participants afforded a perceptual bias to the synthesised voices they chose, but that the degree of bias was not modulated by social use of the voice. Here I present these results alongside the online tools, tasks, and platforms we used to attain them.
Full Transcript:
Bryony Payne:
Okay. Hi everyone. I’m a PhD student at UCL, and I’m going to be talking to you about the challenges of testing social interaction in isolation and how we overcame those challenges. So, briefly, for context: my research looks at voices, and voices are obviously a key part of our self-identity. They not only have great personal importance to us, but also great social importance, because it’s through our voice that we share ourselves with others and achieve our social and communicative goals.
Bryony Payne:
So broadly my research asks questions like, well, are we biased towards our own voice because it’s a voice that belongs to us? Can we give people a new voice that isn’t inherently their own, but get them to associate that voice with themselves and then show a bias for it? And how much of this bias is affected by whether or not they’ve had a chance to use the voice socially, given how important voices are to our social interactions?
Bryony Payne:
So the first part of my PhD looked to answer the first two of these questions. And we can see here from the results that when we give people a new voice, simply by telling them that this new voice is now theirs, that this voice belongs to them, reaction times to that voice are significantly quicker than reaction times to a voice we tell them belongs to a friend or a voice we tell them belongs to a stranger. And the fact that the self-voice is perceived more quickly than either of the other two voices is purely because it has now been deemed a more self-relevant stimulus than the others. As it becomes more self-relevant, it accrues a processing advantage that prioritizes it in our perception. This was done via a perceptual matching paradigm in Gorilla, which people can try via Gorilla Open Materials if they want to.
Bryony Payne:
But the question then became, well, what about using this voice socially? Rather than just giving participants a chance to hear the voice that we’ve suddenly told them is theirs and then measuring the bias towards it, what if we give participants a chance to use that new voice and then measure how they perceive it? So that was the main aim, but then the pandemic hit. Suddenly testing social interaction became very difficult, and the question really became: how can we create an online environment where people can interact using a new voice?
Bryony Payne:
And so my first top tip is to collaborate where you can, because that’s the only way this got done. We were lucky to collaborate with academics who work in AI, Angus Addlesee and Professor Verena Rieser. Together, combining our skills, we managed to build an online two-player game, creating an environment where we could host pairs of participants to come in, interact in a real-life interaction, and use a new voice that they’d chosen for themselves. Importantly, this new voice was a synthesized voice made by CereProc, who create human-sounding voices in a range of accents. And by using a synthesized voice in this task, we could give participants a huge amount of agency, not only in what the voice sounded like and what they wanted to be represented as, but also a huge amount of flexibility in what they wanted to say with that voice.
Bryony Payne:
The game itself that we created was called Drawing Conclusions, because it took the form of a drawing game, and it looked something like this. It was created as a Node.js app. The idea of the game was that pairs of participants would take on the role of either a narrator or an artist, and the narrator had to verbally describe to the artist how to draw a picture without telling the artist what it was they were drawing. So this is the screen that the narrator would have seen. And it went something like this: the narrator would choose their synthesized voice from a dropdown menu, choose a picture from a picture deck that we supplied to them, and then type instructions to tell the artist how to draw that picture. Importantly, these written instructions were then said aloud in the text-to-speech voice they had chosen for themselves.
Speaker 2:
Start by drawing a big rectangle in the middle of the screen.
Bryony Payne:
So it went something like that. The artist would then hear the narrator’s instructions and follow them accordingly. This process would happen iteratively until either the narrator was satisfied that the picture was complete or the artist had successfully guessed what it was they were drawing. So this was a bit like Pictionary, and it actually was a very fun way of getting participants to interact with a real-life human being, to achieve a social goal, and to achieve that goal using the voice they had just chosen for themselves.
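For anyone curious how such a relay can be wired up: below is a minimal sketch of the general pattern, not the actual Drawing Conclusions code. It assumes Socket.IO on a Node.js server, and the event names ('join', 'narrate', 'speech') and the synthesiseSpeech() helper are hypothetical; a real version would request audio from a TTS service such as CereProc’s API.

```javascript
// Minimal sketch of a two-player relay server, assuming Socket.IO.
// Event names and synthesiseSpeech() are hypothetical placeholders.
const http = require('http');
const { Server } = require('socket.io');

const server = http.createServer();
const io = new Server(server);

io.on('connection', (socket) => {
  // Pair narrator and artist in a shared room identified by a game code.
  socket.on('join', (gameId, role) => {
    socket.join(gameId);
    socket.data.role = role; // 'narrator' or 'artist'
  });

  // Narrator sends typed instructions plus their chosen voice.
  socket.on('narrate', async ({ gameId, text, voiceId }) => {
    const audioUrl = await synthesiseSpeech(text, voiceId); // hypothetical TTS call
    // Relay to everyone in the game so the artist hears the narrator's
    // instructions in the narrator's chosen synthetic voice.
    io.to(gameId).emit('speech', { text, audioUrl });
  });
});

async function synthesiseSpeech(text, voiceId) {
  // Placeholder: a real implementation would call a TTS API and
  // return the resulting audio as a buffer or URL.
  return `https://tts.example.com/speak?voice=${voiceId}&text=${encodeURIComponent(text)}`;
}

server.listen(3000);
```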
Bryony Payne:
So we’ve got the game, we’ve got the environment; how do we get participants there? We had quite a complicated setup, especially as it all needed to be run remotely. We needed to start participants in Gorilla, which was our main test platform, getting them to choose a voice and answer questions like why they chose that voice for themselves. We then needed to get them through to the drawing platform, the game platform we created, and then back again to Gorilla to answer questions about the bias towards that voice.
Bryony Payne:
And this is where the next tip comes in, which is really to manipulate tools to your needs. The tools exist; you just have to figure out how to make the best use of them. So Gorilla supplies a redirect mode, which allows you to transfer participants from Gorilla out to a third-party platform. You can then embed a link in that platform to send them back into Gorilla when you’re done. And importantly, the participant restarts where they left off, which is really helpful for ensuring continuity between your tasks.
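As a sketch of what the game server’s side of that hand-off can look like (assuming Express; the query-parameter name and the return link below are assumptions to check against your own Gorilla experiment settings):

```javascript
// Sketch of the hand-off on the game server's side, assuming Express.
// Gorilla's redirect can append a participant identifier to the outgoing
// URL; the parameter name and return link here are assumptions.
const express = require('express');
const app = express();

// Arrival from Gorilla: read the participant ID from the query string
// so the game data can be linked back up with the Gorilla data later.
app.get('/game', (req, res) => {
  const participantId = req.query.participant; // assumed parameter name
  res.send(renderGamePage(participantId));
});

// When the drawing game finishes, send the participant back to the
// Gorilla return link so they resume exactly where they left off.
app.get('/finished', (req, res) => {
  const returnUrl = 'https://app.gorilla.sc/...'; // your experiment's return link
  res.redirect(returnUrl);
});

function renderGamePage(participantId) {
  return `<html><body>Drawing Conclusions for ${participantId}</body></html>`;
}

app.listen(3000);
```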
Bryony Payne:
It was also important to think about how we were going to recruit participants. How were we actually going to get participants to do the study at all? Ordinarily, in person, we might recruit pairs of participants to come into the lab at preset times, but that, again, wasn’t possible. So instead we recruited online via Prolific. Obviously, Prolific is normally associated with the main benefit of recruiting hundreds of participants at once, but it’s also worth saying that you can use Prolific to recruit very small and very controlled numbers at a time.
Bryony Payne:
So we started a Prolific study and only opened up two available places. We then recruited participants two at a time into the drawing game, allowed them to complete the study (during which time the study was paused in Prolific), and then gradually increased the places by another two participants. This made for a very controlled way of getting participants through our study. We could track their progress through the study, and it also meant that if anybody withdrew, we had an immediate pool of people ready to take over.
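If you wanted to script this rather than adjusting places by hand in the Prolific interface, a sketch against Prolific’s REST API might look like the following. The endpoint and the total_available_places field follow Prolific’s documented API as I understand it, but verify them against the current docs before relying on this.

```javascript
// Sketch: open two more Prolific places so the next pair can enter the game.
// Assumes Node 18+ (global fetch) and a Prolific API token in the environment.
const PROLIFIC_TOKEN = process.env.PROLIFIC_TOKEN;
const STUDY_ID = process.env.STUDY_ID;

async function openTwoMorePlaces() {
  // Read the study's current number of available places...
  const study = await fetch(
    `https://api.prolific.com/api/v1/studies/${STUDY_ID}/`,
    { headers: { Authorization: `Token ${PROLIFIC_TOKEN}` } }
  ).then((r) => r.json());

  // ...then bump it by two for the next pair of participants.
  await fetch(`https://api.prolific.com/api/v1/studies/${STUDY_ID}/`, {
    method: 'PATCH',
    headers: {
      Authorization: `Token ${PROLIFIC_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      total_available_places: study.total_available_places + 2,
    }),
  });
}

openTwoMorePlaces().catch(console.error);
```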
Bryony Payne:
So overall, I think, in order to navigate people through multiple platforms and different tools, it’s really important to use really clear and well-piloted instructions. We also used video instructions for things like explaining how to play the game before they got to the game platform. Video instructions are just a really good way of getting people to take in a lot of information in one go, in a more engaging way. And it’s also important that, in Gorilla, they actually can’t skip past them, so they have to watch them before they move on.
Bryony Payne:
So we’ve got the participants there. The next question was, well, how are we going to keep them there? This was a very long study, about an hour on average, but some people took a lot longer. And to be honest, the game was the fun half, and that was the first half. So how do you keep people in your study after that, rather than have them just play the game and then go make a cup of tea? Firstly, we needed to make sure it ran as smoothly as possible before we even began. Obviously everyone has said pilot, pilot, pilot, and that’s very true. I also limit browsers: I find Chrome to be the least glitchy, especially when working across multiple platforms; Chrome presented the fewest issues.
Bryony Payne:
If you’re using auditory stimuli in a task, I sometimes find in Gorilla that the first auditory stimulus doesn’t play, or doesn’t play quite on time, and that can throw participants off. So I actually use a dummy sound at the beginning of tasks that include auditory stimuli. That dummy sound is normally just a period of silence that participants don’t even know has been included, but it means that by the time the first proper sound needs to play, it’s ready to go and runs more smoothly. I also make really good use of progress bars; participants are really grateful to have them in a study. And if I can’t have a progress bar on screen, because I don’t want one visible for that part, I always tell participants how long the next section of the study is going to take.
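On the dummy-sound trick: in Gorilla itself this is just a silent audio file added as the first stimulus. If you are building a web task by hand, the same warm-up can be done with the Web Audio API. This is a sketch of the general browser technique, not Gorilla’s internal code:

```javascript
// Warm up the browser's audio pipeline with a silent buffer so the first
// real stimulus plays without a start-up glitch or delay.
const ctx = new AudioContext();

function playSilentPrimer() {
  // One-channel buffer of zeros: half a second of pure silence.
  const buffer = ctx.createBuffer(1, ctx.sampleRate * 0.5, ctx.sampleRate);
  const source = ctx.createBufferSource();
  source.buffer = buffer;
  source.connect(ctx.destination);
  source.start(); // participants hear nothing, but the pipeline is now live
}

// Run once, from a user gesture (browsers block audio before interaction).
document.addEventListener('click', () => {
  if (ctx.state === 'suspended') ctx.resume();
  playSilentPrimer();
}, { once: true });
```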
Bryony Payne:
And if you’ve done all of those things, you should have good data, and these are the results from my study. Here you can see that in the drawing game group, the people who chose a voice and played the game prioritized the self-voice significantly more than the other voices they heard in that game. But in comparison to a control group who just chose a voice and didn’t play the game, the results are exactly the same: there’s no significant difference. So here we can show that there is perceptual prioritization of voices that we own or voices that we’ve used, but that prioritization isn’t modulated by whether we’ve used them, nor by how we’ve used them in an interaction with another person.
Bryony Payne:
So thanks very much to my lab, to Angus Addlesee, to CereProc for their synthesized voices, to Gorilla and Prolific. And if you’ve got any questions, please do drop me an email. Thank you very much.
Speaker 2:
Thank you very much, Bryony. If there are any questions for Bryony, do drop them in the Q&A. We have time to answer one now; otherwise it’s going to be me asking the question. So do keep questions coming into the Q&A, and I’m going to quickly ask Bryony a question. You talked about presenting sound in these sorts of paradigms, and obviously there was a whole session yesterday in which two people were talking about doing online testing with sounds. Have there been any other issues that you’ve run into in presenting auditory stimuli online?
Bryony Payne:
So I tried to create an intentional binding paradigm. I’m not sure if people here are familiar with that, but it relied on having very accurate onset and offset times for auditory stimuli. And I actually found that the biggest issue was that some browsers present a lag between when Gorilla, for instance, tells you that the sound has started and when the browser has actually played it. That’s mostly why I use Chrome, because it seems to have the least variation across participants in when the onset of the auditory stimulus is actually played. Whereas one browser, I think it was Safari, sometimes had up to 500 milliseconds of lag. And because my study relied on a very accurate and precise measure of time in milliseconds, a lag that big is a big issue. So I know that Gorilla has produced quite a lot of work and preprints about the timing accuracy of these things, but for the sake of minimizing it in my study, I always just use Chrome.
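If you want a rough feel for this lag in a given browser, a crude check is to time the gap between requesting playback and the browser reporting that playback has begun. This is an illustrative sketch only ('tone.wav' is a placeholder file), not how Gorilla measures timing internally; it measures the 'playing' event, not the true moment sound leaves the speakers.

```javascript
// Crude measure of playback lag: time from calling play() to the
// browser's 'playing' event. Run after a user gesture, or autoplay
// restrictions may block playback entirely.
const audio = new Audio('tone.wav'); // placeholder test sound

function measureOnsetLag() {
  return new Promise((resolve) => {
    const requested = performance.now();
    audio.addEventListener('playing', () => {
      resolve(performance.now() - requested);
    }, { once: true });
    audio.play();
  });
}

measureOnsetLag().then((lagMs) =>
  console.log(`Playback started ${lagMs.toFixed(1)} ms after play() was called`)
);
```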
Speaker 2:
Thank you very much. I think there are some more questions coming in, but I’m going to say thank you very much to Bryony, but also to all our speakers in this opening session of Buffet of Online Research. And we’re going to hand back over to Jo. Thank you very much.