Alex Fraser, University of Oxford
As online research has become more prevalent, researchers are investigating the possibility of replicating techniques that go beyond simple behavioural measurements. One method that has captured the imagination of researchers is leveraging the webcam to collect eye-tracking data. Several packages have been developed for collecting such data but have significant limitations due to extensive and potentially frustrating calibration procedures. Unfortunately, this can limit the accessibility of these packages when collecting data with specific populations, such as children and participants with neuro-developmental difficulties. To overcome this, we have looked at how gaze detection studies are conducted with infants, where researchers will manually score gaze direction from videos to minimise data loss.
Using these methods, we have developed GazeScorer, an automated gaze scoring package that can distinguish a Left, Right, and Central gaze location using basic image processing. Using videos collected through a Gorilla-hosted experiment, we have demonstrated a good level of inter-rater reliability between GazeScorer and a manual scorer. This opens the possibility of a hybrid-scoring system with minimal manual intervention in the short term. Future development will focus on utilising live webcam footage for data collection through the browser. This software would provide a potential resource for researchers who would benefit from gaze-based responses, but do not require high spatial resolution.
Full Transcript:
Alex Fraser:
Great. So thanks for having me. And I’m really excited to tell you about the project that we’ve been working on for the last year and a bit. But I thought I’d start by discussing my journey into online research and where I started.
Alex Fraser:
So in 2017 my department shut down without any real notice, and we lost access to lab space right in the middle of my PhD. I moved a lot of my research online and we were really impressed by the amount of data we were able to collect in such a short period of time. But we also wanted to look at how we could take this data collection into the homes of participants who are not able to get to the lab quite so easily, such as small children and diverse populations.
Alex Fraser:
But we were also interested in how we could move beyond the reaction times and the accuracy scores that we were getting very reliably in the browser. And so we contacted Gorilla and they had been working on this at the same time, and they built us an experiment for us to work with. And we started doing some piloting, but we found that the calibration was quite long and involved and it made it quite difficult for us to collect … Get the kind of eye tracking that we’d hoped to be able to do with these populations.
Alex Fraser:
So we took a bit of a shot in the dark and we started a new project with the Oxford Research Software Engineering team to see what we could do about making our own pipeline for analyzing webcam data. And when we sat down with them and established what we wanted, the main thing that we needed was something that had a very limited calibration. So we actually really wanted to minimize the calibration to as little as possible to make it as simple as possible for us to collect data.
Alex Fraser:
And as we discussed it more, we started thinking more about what we wanted. And so we were focused so much on trying to replicate an eye tracker, and a lab-based eye tracker, but we thought we may take a step back and actually consider doing something maybe a bit more simplistic, but maybe more reliable. And we looked to the manual scoring that is often done in infant research, and we wondered how we could do that a bit more efficiently and a bit less labor intensive. And so we decided to actually focus more on gaze orientation and so codifying a left and right look, compared to actually trying for a precise gaze location.
Alex Fraser:
So to do this means to collect a lot of video footage of participants following the target stimuli. And Sylvia discussed how she did that with children in the previous talk, but with the adults it was a lot more simple. We could just send them the same procedure and they generally were able to comply themselves and we didn’t have to supervise them as they were doing it. And we did this in Gorilla and collected a lot of footage online and we ended up with a series of videos like this. And what we were able to do is we could trim these videos down and synchronize them with the target stimuli. So we know approximately where they’re looking as they are watching the stimuli.
Alex Fraser:
Then we needed to break down the images into individual frames. And once we had those individual frames, we were able to do more image processing than [inaudible 00:03:12] we were using the webcam footage independently. But also, we needed to set down a baseline, ground truth, that we could compare to our automatic scorer. So to do this we went back to the manual scoring that we were trying to replicate. And so we put all of these images online into another Gorilla experiment, and we had an independent naive researcher who came in and manually scored all of the videos for their gaze orientation. Which took a fair amount of time and a lot of effort, but we got that done.
Alex Fraser:
And this meant that we were able to do a good comparison to our automated scorer. And looking at how we actually did our automated scoring, the first thing we need to do is identify the face in the image. So once we detect the face, we were able to cut it down and we could plot landmarks on to each image of every face. And specifically what we needed was the eye location, and we could actually isolate the eye and work with that independently.
Alex Fraser:
And as you can see, the eye itself is actually very small, only about 30 to 40 pixels, so there’s not a lot of space for us to work with. But what we were able to do is we were able to identify the iris by looking for essentially the darkest space that was within the target. And when we processed this we ended up with a shape like this. And when we have this shape we can then identify the middle of the shape, and we classify this as being the middle of the iris.
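The talk does not specify how the darkest region is extracted, so the following is only a minimal sketch of the idea, not the GazeScorer implementation. It assumes the eye crop is a small grayscale image given as rows of 0 to 255 intensities, treats every pixel within a hypothetical `tolerance` of the darkest value as part of the iris shape, and takes the centroid of those pixels as the iris middle. The function name `iris_centre` and its parameter are illustrative.

```python
def iris_centre(eye, tolerance=20):
    """Estimate the iris centre of a grayscale eye crop.

    `eye` is a list of rows, each a list of 0-255 intensities.
    `tolerance` is a hypothetical tuning parameter: pixels no more than
    this much brighter than the darkest pixel count as "iris".
    Returns the (x, y) centroid of those dark pixels.
    """
    darkest = min(value for row in eye for value in row)
    cutoff = darkest + tolerance

    xs, ys = [], []
    for y, row in enumerate(eye):
        for x, value in enumerate(row):
            if value <= cutoff:
                xs.append(x)
                ys.append(y)

    # The darkest pixel always passes the cut-off, so xs is never empty.
    return sum(xs) / len(xs), sum(ys) / len(ys)


# Toy 8x4 crop: bright background with a dark 2x2 blob.
example_eye = [
    [200, 200, 200, 200, 200, 200, 200, 200],
    [200, 200, 200, 200, 10, 15, 200, 200],
    [200, 200, 200, 200, 15, 10, 200, 200],
    [200, 200, 200, 200, 200, 200, 200, 200],
]
print(iris_centre(example_eye))
```

On real webcam crops, eyelashes and shadows are also dark, so a practical version would probably need smoothing or shape filtering before taking the centroid.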
Alex Fraser:
Now I said we [inaudible 00:04:39] wanted to try and minimize any calibration, and this is where we replace what a traditional eye tracking calibration would be. So instead of doing a traditional calibration that you would expect, we rather, we just specify where a central gaze is. So we know where the participant is looking when they’re looking straight ahead. We do this within the first frame of any video, and so then we can assign a buffer around the center of that image. And this is what we do instead of our calibration, any movement outside of the buffer region would be considered a codified look towards the left and the right.
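The buffer rule described here can be sketched in a few lines. This is an illustration under stated assumptions rather than the package's code: the iris position is reduced to a horizontal pixel coordinate per frame, the first frame is taken as the central gaze, and the buffer width (`buffer`, in pixels) is a hypothetical value. "left" and "right" refer to image coordinates; whether that matches the participant's left and right depends on whether the webcam footage is mirrored.

```python
def classify_gaze(iris_x, centre_x, buffer):
    """Label one frame relative to the central position.

    Positions within `buffer` pixels of `centre_x` count as "centre";
    anything further out is a look to the image-left or image-right.
    """
    if iris_x < centre_x - buffer:
        return "left"
    if iris_x > centre_x + buffer:
        return "right"
    return "centre"


def score_video(iris_xs, buffer=3.0):
    """Score every frame, using the first frame as the central gaze.

    `iris_xs` holds one horizontal iris coordinate per frame.
    `buffer` is an assumed default, not a value from the talk.
    """
    centre_x = iris_xs[0]
    return [classify_gaze(x, centre_x, buffer) for x in iris_xs]


print(score_video([20.0, 21.0, 15.0, 26.5, 20.5]))
```

Because the reference point comes from a single frame, this sketch inherits the assumption that the participant really is looking straight ahead at the start of the video and does not move their head afterwards.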
Alex Fraser:
And so now we have automated scoring and we have a manual scoring, we can actually compare the two. So in these visual plots, we can see how the top row, which is the manual scorer, is codifying the gaze, and the bottom row is also codifying at the same time. There’s a little bit of a lag, but generally they are following the same, they’re converging in their gaze orientation. So to quantify this a little bit more we did a Cohen’s kappa comparison between the two. And we set a minimum value that we wanted to accept as 0.6, which is a general consensus for Cohen’s kappa scores.
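For readers unfamiliar with the statistic, Cohen's kappa is (observed agreement minus chance agreement) divided by (1 minus chance agreement). The sketch below computes it for two lists of per-frame labels and checks the result against the 0.6 threshold mentioned in the talk. The label sequences are invented for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of labels")

    n = len(rater_a)
    # Proportion of frames on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented per-frame labels: L = left, C = centre, R = right.
manual = ["L", "L", "C", "C", "R", "R", "R", "C", "L", "R"]
automated = ["L", "L", "C", "R", "R", "R", "R", "C", "C", "R"]

kappa = cohens_kappa(manual, automated)
print(round(kappa, 3), kappa >= 0.6)
```

In this toy case the raters agree on 8 of 10 frames, chance agreement is 0.35, and kappa comes out at about 0.69, which would just pass the 0.6 criterion.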
Alex Fraser:
And what we found when we look at the data among the adults is that the vast majority of participants got above a Cohen’s kappa value of 0.6. And if anything, a lot of the participants are scoring well above 0.8, and almost approaching near-perfect agreement. This is only focusing on the static frames where the target is at its most extreme position, but this is still showing very good agreement. There are a couple of participants where we see one eye under-performed compared to the other, but in general we are doing very well. And looking at the sample that Sylvia collected before me, we see in the children very similar patterns of results, where the majority are getting very good agreement in both eyes, but there are a couple where the agreement is lower in one over the other. But this shows that we’re still seeing very high agreement in the optimal conditions we’re looking to work with.
Alex Fraser:
So to give you a bit of a summary in what we hope to do with this moving forward. Basically, how did we perform? I think we did very well. There was generally quite good agreement between the automatic and the manual scorer in these optimal conditions that we put down. When we look at more difficult bits where the eye is actually in movement because it’s following a target, the performance isn’t quite as good. And we’re looking at how we can improve that and what elements may impact the movement. And hopefully we can improve the algorithm. But it’s just basically just a first pass at the problem in the first instance. And hopefully we’ll be able to improve this before we can give access to people in the near future.
Alex Fraser:
But the other thing I wanted to discuss is where do we fit in within the current online research? And we heard a lot of great work done yesterday using WebGazer and the team working with mouse viewer. And in fact, we saw Kat Ellis who showed, she managed to get some good results with kids with Fragile X, who were able to go through the calibration that we were having trouble with. And so this is all very impressive, and we’re hoping that we can just be another resource that will fit into this new environment of online research. And hopefully people will be able to do good things with this in the future.
Alex Fraser:
Yeah, thank you to everyone on the team who’s worked with us. And our pre-print is available with a bit more detail about the data that I presented here. Feel free to email me if you have any questions.
Speaker 2:
Thanks very much, Alex. We have got a couple of questions in the chats, so I’m going to ask you one of them.
Alex Fraser:
Okay.
Speaker 2:
But then if you go down to them in the Q&A, that’d be really kind. Thank you. So the first one is from Catherine Ellis and she says, “How still do participants have to be?” So for example, if you were working with children, how careful would you need to be about that?
Alex Fraser:
We still need to maintain relatively little movement, so we do need to minimize movement as much as possible. But because we’re capturing all the features of the face as we are, one of the future things that we hope to be able to do is compensate for movement more.
Alex Fraser:
So yeah, as I said, this is still the very early, very preliminary stuff. Once we can work more with the face landmarks and accounting for movement, we’ll be able to establish ways of compensating for movement a little bit better. So that’s the goal in the future.
Speaker 2:
Brilliant. Thank you very much.


