
Analysing Eye-Tracking Data
Hayward Godwin
University of Southampton

Outline
Part 1
Eye-tracking measures: an overview
Data Viewer reports
The Organise-Analyse-Visualise approach in R
Part 2
Try it yourself!

Eye-Tracking Measures
An Overview
For a detailed review, see Rayner (2009)

Global versus Local measures

Global measures are computed at the overall (or global) level of a trial and ignore what was being fixated at any point in time
e.g., mean fixation duration for a trial

Local measures are computed for each object or stimulus in a trial, paying attention to what was being fixated at any point in time
e.g., mean fixation duration for target words in a reading study

Mean Fixation Duration (global)
(Mean duration of fixations)
[Figure: search for a blue square target; the display is scanned with fixations of 130, 125, 110, 90, 150 and 190 ms]
Mean Fixation Duration = (130 + 125 + 110 + 90 + 150 + 190) / 6 = 132.5 ms

Mean Fixation Duration (local)
(Mean duration of fixations on a specific object type)
[Figure: search for a blue square target; the target receives fixations of 110 and 190 ms]
Mean Fixation Duration for the target = (110 + 190) / 2 = 150 ms

Number of Fixations (global)
(Mean number of fixations)
[Figure: search for a blue square target; fixations of 125, 130, 110, 190, 90 and 150 ms]
Number of fixations = 6

Number of Fixations (local)
(Mean number of fixations on a specific object type)
[Figure: search for a blue square target; two of the six fixations (110 and 190 ms) fall on the target]
Number of fixations on the target = 2

Total Gaze Duration (global)
(Sum of fixation durations)
[Figure: search for a blue square target; fixations of 130, 125, 110, 90, 150 and 190 ms]
Total gaze duration = 130 + 125 + 110 + 90 + 150 + 190 = 795 ms

Total Gaze Duration (local)
(Sum of fixation durations on a specific object type)
[Figure: search for a blue square target; the target receives fixations of 110 and 190 ms]
Total gaze duration for the target = 110 + 190 = 300 ms

First-pass Gaze Duration
(Sum of fixation durations on the first visit, or pass, of an object)
[Figure: search for a blue square target; the target receives a 110 ms fixation on the first pass and a 190 ms fixation on a later, second pass]
First-pass gaze duration for the target = 110 ms
(the second fixation of 190 ms duration occurs on the second pass, so it is excluded)

Single Fixation Duration
(Mean of fixation durations when an object is only ever fixated once)
[Figure: search for a blue square target; the objects that are only ever fixated once are highlighted]
This is one of the cleanest measures there is in eye-tracking, since an object that is only fixated once lets us chart the time taken to fully process that object
Here, only two objects are ever fixated once; these are highlighted in the figure
Since the target object is fixated more than once, it does not contribute to this measure in this example

Proportion of objects fixated (global)
(Proportion of objects directly fixated)
[Figure: search for a blue square target; three of the five objects receive at least one fixation]
Proportion fixated = 3 / 5 = 0.6

Proportion of objects fixated (local)
(Proportion of objects directly fixated, broken down by object type)
[Figure: search for a blue square target; two of the four distractors and the target are fixated]
Proportion of distractors fixated = 2 / 4 = 0.5
Probability of fixating the target = 1 / 1 = 1

Saccade onset latency
(Time from display onset to the start of the first saccade)
[Figure: search for a blue square target; the first fixation lasts 130 ms]
If the display appears at time 0, then the saccade onset latency is 130 ms

Mean number of visits
(Mean number of times each object is visited)
[Figure: search for a blue square target; the three fixated objects are visited 1, 2 and 1 times]
Count up the number of times each object is visited and then divide by the number of objects that were visited
Do NOT include zero values for unvisited objects
Mean number of visits = (1 + 2 + 1) / 3 ≈ 1.33

Saccade Amplitude
(Mean amplitude of saccades)
[Figure: search for a blue square target; the five saccades have amplitudes of 1.2, 1.4, 2.2, 0.2 and 3.4]
Mean length of all saccades = (1.2 + 1.4 + 2.2 + 0.2 + 3.4) / 5 = 1.68

Verification Time
(Time between first fixating the target and the button press)
[Figure: search for a blue square target; the target is first fixated for 110 ms, two further fixations of 90 and 150 ms follow elsewhere, and the target is then re-fixated for 190 ms]
Find when the button press occurred. If it occurred 150 ms into the second fixation on the target (the 190 ms fixation), then:
Verification time = 110 + 90 + 150 + 150 = 500 ms (only the first 150 ms of the final fixation are counted)
A better way to do this is to find the time at which the first fixation on the target starts and subtract that value from the RT

Scanpath Ratio
(Sum of saccade lengths to the target divided by the shortest distance to the target)
[Figure: search for a blue square target; the saccades have lengths of 1.2, 1.4, 2.2, 0.2 and 3.4, and the shortest straight-line distance to the target is 5.2]
Scanpath ratio = (1.2 + 1.4 + 2.2 + 0.2 + 3.4) / 5.2 ≈ 1.62

Notes on Measures
There are many, many measures that can be run
Just because you can run them, it doesn't mean that you should
Focus on running only the measures that address your research questions and avoid running or reporting additional ones for the sake of it (i.e., avoid fishing!)

Data Viewer Reports

Fixation Report
One row of data for every fixation in your study (per trial, per participant)
You will typically need to use the fixation report if you are running visual search/scene perception studies
Use fixation reports to filter out fixations that coincide with other events, such as display changes, button-press responses, etc.
This can be done by filtering using the Interest Period (as you'll see in the tutorials), but often you'll end up removing some fixations you still want
Fixation reports can also be used to re-compute the size of interest areas and capture fixations that fell just outside of interest areas

Fixation Report: Important Columns
RECORDING_SESSION_LABEL: The recording session ID
TRIAL_INDEX: Trial number
CURRENT_FIX_INDEX: The fixation ID for the current fixation
CURRENT_FIX_DURATION: The duration of the current fixation
CURRENT_FIX_BUTTON_PRESS_X: The time during the current fixation that a button was pressed
CURRENT_FIX_INTEREST_AREA_LABEL: The interest area label of the current fixation ("." if the eyes are not on an IA)
CURRENT_FIX_NEAREST_INTEREST_AREA_LABEL: The nearest IA to the eyes
CURRENT_FIX_NEAREST_INTEREST_AREA_DISTANCE: The distance to the CENTRE of the nearest IA
You can also get NEXT_ and PREVIOUS_ versions of all measures

Interest Area Report
One row of data for every interest area in your study (per trial, per participant)
Reading researchers typically use this type of report
They typically set the interest period to the time period of the trial itself, enabling the filtering out of any unnecessary fixations

Interest Area Report: Important Columns
RECORDING_SESSION_LABEL: The recording session ID
TRIAL_INDEX: Trial number
IA_DWELL_TIME: Total time spent on the IA (sum of all fixations on the IA)
IA_FIRST_FIXATION_DURATION: Often referred to as First Fixation Duration in reading research. The duration of the first fixation on the interest area (first pass only; if the target region is skipped this will have no value)
IA_FIRST_RUN_DWELL_TIME: Often referred to as Gaze Duration in reading research. The sum of all fixations on the IA during the first pass. You also use this column for calculating Single Fixation Duration, but remove all occurrences where the IA was fixated more than once
IA_ID/IA_LABEL: The ID number and label for the interest area
IA_REGRESSION_IN: Returns 0 or 1
IA_REGRESSION_IN_COUNT: Returns the number of regressions in
IA_REGRESSION_OUT: Returns 0 or 1
IA_REGRESSION_OUT_COUNT: Returns the number of regressions out
IA_REGRESSION_PATH_DURATION: Often referred to as Go-Past Time in reading research. Sum of all fixations that occur before passing to the right of the target interest area (to a greater-numbered IA_ID)
IA_SKIP: Returns 0 or 1

Message Report
One row of data for every message that occurred during the study (per trial, per participant)
If you want an accurate view of when things happened during your study, the message report is the one to use
This is particularly important for gaze-contingent studies where display changes occur
You can technically get most of the messages that occur from the fixation report; however, some messages do get missed from the fixation report

Message Report: Important Columns
RECORDING_SESSION_LABEL: The recording session ID
TRIAL_INDEX: Trial number
CURRENT_MSG_LABEL: the message label
CURRENT_MSG_TEXT: the message text
CURRENT_MSG_TIME: the time the message occurred

Sample Report
One row of data for every sample recorded by the eye-tracker during the study (per trial, per participant)
If you have your Eyelink running at 1000 Hz, that gives you 1,000 rows of data per second of recording
Sample reports are typically tens of millions of rows in size
You'll only need to use a sample report if you have certain highly customised setups (e.g., moving displays) or want an idea of millisecond-by-millisecond pupil size (as is the case in pupillometry)

The Organise-Analyse-Visualise Approach in R

Data
In the past, data could easily be organised in Excel, analysed in SPSS, and visualised in SPSS/Excel/Sigmaplot
With the size and complexity of eye-tracking studies, this is no longer really possible
We can now do all three steps in R, making the transition between them easier:
Organise: data.table
Analyse: ezANOVA
Visualise: ggplot

Organising your Scripts for Reproducible Results
However you do things, it's best to have a consistent approach to organising your R scripts
I have two types of script:
ORGANISE__XYZ.R scripts that organise the data
ANALYSE__XYZ.R scripts that analyse and visualise the data
However you set up your own R scripts, find an approach and stick to it
This makes it easier to copy and paste existing scripts, and being consistent means you can go back to old work and understand it more easily

Organise: the data.table package
Why use data.table?
It does things very quickly
It extends (builds upon) data.frame objects, meaning that everything you can do to a data.frame object, you can do to a data.table
We're now going to go through some examples of what it can do and how to use it
I'll be giving out the example code later, so there's no need to type or run through it now

Create a data.frame
Create a normal data.frame; it will look something like the sketch below
It lists different trials for a set of participants and gives their RT (Reaction Time) in ms
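The slide shows a screenshot of the resulting table; a minimal sketch of how such a data.frame might be built (the column names ppt, trial, trialType and RT follow the ones used later in these slides; the trial-type labels and RT values are made up for illustration):

# Build a small example data set: 4 participants x 4 trials,
# alternating between two (made-up) trial types, with made-up RTs in ms
exampleDF <- data.frame(
  ppt       = rep(c("p1", "p2", "p3", "p4"), each = 4),
  trial     = rep(1:4, times = 4),
  trialType = rep(c("present", "absent"), times = 8),
  RT        = round(rnorm(16, mean = 600, sd = 80))
)
head(exampleDF)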

Convert data.frame to data.table
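The conversion itself is a single call; a sketch, assuming the exampleDF object from the previous slide:

library(data.table)

# Convert the existing data.frame into a data.table
DT <- as.data.table(exampleDF)

class(DT)  # "data.table" "data.frame" -- it is still a data.frame underneath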

Add Keys
For large data sets you will want to set keys
When data are keyed, they can be processed faster
A key is set on one or more columns in your data.table
When a column is part of the key, data.table can group the data by that column more rapidly
In our example, let's set participant ID (ppt) and trialType as keys, using the setkey command, so we can group the data by these values more rapidly
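A sketch of the setkey call for the example table DT:

# Key the table by participant and trial type so that grouping and
# filtering on these columns is faster
setkey(DT, ppt, trialType)

key(DT)  # check which columns are currently keyed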

Basic Syntax
{WHERE} allows you to select only certain rows. In other words, you can get the command you run to focus only on the data cells WHERE certain conditions are met
{SELECT} is where you tell data.table which columns or values you want back. In other words, you SELECT certain values
{GROUPBY} allows you to group the output data in different ways. This is a bit like pivot tables in Excel.
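Putting the three parts together, a data.table query has the general shape DT[WHERE, SELECT, by = GROUPBY]. A small illustrative sketch using the example table DT (the trial == 1 filter is arbitrary):

DT[trial == 1,               # WHERE:    only rows from trial 1
   .(meanRT = mean(RT)),     # SELECT:   the mean of the RT column
   by = trialType]           # GROUP BY: one row per trial type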

Getting means
How about the mean RT overall?
In other words, we are SELECTing the mean of the RT column
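A sketch of the command, using the example table DT:

# SELECT the mean of the RT column across the whole table
DT[, mean(RT)]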

Getting means
Overall RT isn't that interesting. Let's GROUP BY trialType:
In other words, we are SELECTing the mean of the RT column but GROUPING BY the trialType column
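A sketch of the command:

# SELECT the mean RT, GROUPing BY trial type
DT[, .(meanRT = mean(RT)), by = trialType]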

Getting means
Now let's group by participant and trialType:
In other words, we are SELECTing the mean of the RT column but GROUPING BY the trialType and ppt columns
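A sketch of the command:

# SELECT the mean RT, GROUPing BY both participant and trial type
DT[, .(meanRT = mean(RT)), by = .(ppt, trialType)]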

Getting means
But what if we only want the means for trials 3 and 4? How do we do that? We use WHERE!
(Reminder: == means "is equal to")
In other words, we are SELECTing the mean of the RT column, GROUPING BY the trialType and ppt columns, but only including values WHERE trial is 3 or 4
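A sketch of the command:

# Only rows WHERE trial is 3 or 4, then the mean RT by participant and trial type
DT[trial == 3 | trial == 4,
   .(meanRT = mean(RT)),
   by = .(ppt, trialType)]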

Adding Columns
data.table also offers more convenient syntax for adding columns
For example, you can add a newColumn column with a value of 1 in a single command
You can combine this with WHERE and GROUP BY commands, as sketched below
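A sketch of both versions (newColumn comes from the slide; lateMeanRT is a made-up column name used purely to illustrate combining WHERE, GROUP BY and column assignment):

# Add a column called newColumn with the value 1 in every row
DT[, newColumn := 1]

# Combine WHERE, GROUP BY and := : for trials 3 and 4 only, store each
# participant's mean RT over those trials (other rows are left as NA)
DT[trial > 2, lateMeanRT := mean(RT), by = ppt]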

Joins and Merges
Suppose we forgot to include information about which condition each participant was in. How do we get that in there?
We can use a join!
A join in data science is a special type of operation that combines two datasets
To do this, create a new data.table listing the participant ID and the condition, and follow the steps on the next slide
Joins (or merges) hunt down identical column names and then join the data from one table with that from another

Performing the Join
Create a new data.table (cDT) containing the condition information and set its key
To perform the join, it's then a single command
We then have our joined-up data, joinedDT (sketched below)
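A minimal sketch of the whole join, using the table names that appear on the slides (DT, cDT, joinedDT); the condition labels are made up:

# A small table mapping each participant to a condition
cDT <- data.table(ppt       = c("p1", "p2", "p3", "p4"),
                  condition = c("A", "A", "B", "B"))
setkey(cDT, ppt)

# Both tables are keyed on ppt, so the join is a single command:
# every row of DT is matched to its participant's condition
joinedDT <- DT[cDT]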

Other Types of Join
We've just done our first join!
Note that we've joined on just one column here, but there is no theoretical limit to how many columns you can join by at once
There are many types of join that you may want to use (e.g., left, right, natural, outer, full, Cartesian product, etc.)
The main point is to make sure that the column names match in the tables you are trying to join, or else things will go horribly wrong

Analysing Data
Worked Example

Worked Example: Mean Fixation Durations (global)
Let's begin by taking data from a fixation report
We'll analyse it, compute mean fixation durations (global), run an ANOVA, and then plot a graph
The data and scripts required are on the website, but let's walk through it together first

Computing Mean Fixation Durations (global)
Example from a fixation report
First we compute the by-trial, by-participant means
This gives us the mean fixation duration for each participant on each trial
Then we take the mean of these to get means by participant
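The slide shows screenshots of the two commands; a sketch of what they might look like, assuming the fixation report is read into a data.table called fixDT (the Data Viewer column names are those listed earlier, plus a TRIAL_TYPE column from the experiment; fixDT, byTrial, byPpt and meanFixDur are made-up names, as is the file name):

# Read the fixation report (tab-delimited text exported from Data Viewer)
fixDT <- fread("fixation_report.txt")

# Step 1: mean fixation duration for each participant on each trial
byTrial <- fixDT[, .(meanFixDur = mean(CURRENT_FIX_DURATION)),
                 by = .(RECORDING_SESSION_LABEL, TRIAL_INDEX, TRIAL_TYPE)]

# Step 2: average the by-trial means to get one value per participant,
# per trial type
byPpt <- byTrial[, .(meanFixDur = mean(meanFixDur)),
                 by = .(RECORDING_SESSION_LABEL, TRIAL_TYPE)]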

Computing Mean Fixation Durations (global)
Example from a fixation report
This is what we now have: each participant (RECORDING_SESSION_LABEL), grouped by TRIAL_TYPE, with a DV (mean fixation duration)
What next?

Computing Mean Fixation Durations (global)
Example from a fixation report
Now we analyse the data using ezANOVA!
This is from the ez package
Note: make sure that all columns that are factors in your ANOVA are coded as factors in R before proceeding

Computing Mean Fixation Durations (global)
Example from a fixation report
ezANOVA syntax. The call needs:
The dependent variable column (dv)
A list of within-subjects factors (within)
A list of between-subjects factors (between)
The column containing participant IDs (wid)
The data.table name (data)
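A sketch of what the call might look like for the data prepared above (byPpt and meanFixDur are the made-up names from the earlier sketch; there are no between-subjects factors in this example):

library(ez)

# Make sure the participant and factor columns are coded as factors first
anovaResults <- ezANOVA(
  data   = byPpt,                    # the data.table name
  dv     = meanFixDur,               # the dependent variable column
  wid    = RECORDING_SESSION_LABEL,  # the column containing participant IDs
  within = .(TRIAL_TYPE)             # a list of within-subjects factors
  # between = .(...)                 # between-subjects factors would go here
)
anovaResults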

Computing Mean Fixation Durations (global)
Example from a fixation report
Here, we want to see if the within-subjects variable TRIAL_TYPE influences fixation durations, so we run the ANOVA on the by-participant means
Most of the output should be self-explanatory (it's significant!)
Note that ges is generalised eta-squared, a measure of effect size (remember: APA format wants effect sizes now). Cite this paper when you use it: http://www.uv.es/friasnav/Bakeman2005

Computing Mean Fixation Durations (global)
Example from a fixation report
Let's plot it!
To produce a plot, we first use ezStats to get the descriptive means
The nice thing here is that ezStats has the same syntax as ezANOVA (i.e., you can copy and paste)
Take a look at the values it returns (sketched below)
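A sketch of the ezStats call, with the same arguments as the ezANOVA sketch above (descStats is a made-up name):

# Descriptive statistics (N, Mean, SD, FLSD) for each cell of the design
descStats <- ezStats(
  data   = byPpt,
  dv     = meanFixDur,
  wid    = RECORDING_SESSION_LABEL,
  within = .(TRIAL_TYPE)
)
descStats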

Computing Mean Fixation Durations (global)
Example from a fixation report
Now, let's plot it! We use ggplot to do the plotting. The key parts of the plotting code are:
The data.table containing the means for plotting
Setting up the aesthetics of the plot, with x being the values plotted along the x-axis and y being the values plotted on the y-axis
Drawing points (as opposed to bars/lines)
Controlling the axes and making the plot APA format
Saving the plot to disk
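A sketch of what the plotting code might look like, using the descStats table from the ezStats sketch above (the file name is made up, and theme_classic() is just one way of approximating an APA-style look):

library(ggplot2)

meanPlot <- ggplot(descStats,
                   aes(x = TRIAL_TYPE, y = Mean)) +  # aesthetics: x and y values
  geom_point(size = 3) +                             # draw points, not bars/lines
  labs(x = "Trial type",
       y = "Mean fixation duration (ms)") +          # control the axis labels
  theme_classic()                                    # plain, APA-like appearance

ggsave("mean_fixation_durations.png", meanPlot,
       width = 6, height = 4)                        # save the plot to disk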

Graphing with ggplot
There's a very large number of options when plotting with ggplot
We will only cover very basic ones here
More information can be found at:
http://www.cookbook-r.com/Graphs/
http://ggplot2.org/
And elsewhere online

Computing Mean Fixation Durations (local)
Example from a fixation report
Next, we want to see if the within-subjects variable TRIAL_TYPE influences fixation durations AND if fixation durations differ for each interest area type
We have two types of interest area: TARGET and DISTRACTOR
We therefore compute local mean fixation durations, comparing target and distractor fixation durations
We also now need to remove fixations that did not fall on an interest area
The column to use is CURRENT_FIX_INTEREST_AREA_LABEL

Computing Mean Fixation Durations (local)
Example from a fixation report
Same process as before: compute by-trial means and then by-participant means
The only difference now is that we're removing fixations that didn't land on an interest area (i.e., those WHERE CURRENT_FIX_INTEREST_AREA_LABEL is ".")
We're also now GROUPING BY the CURRENT_FIX_INTEREST_AREA_LABEL column
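A sketch of the two commands, continuing from the fixDT sketch used for the global analysis (byTrialLocal and byPptLocal are made-up names):

# Drop fixations that did not land on an interest area (labelled "."),
# and add the interest area label to the grouping columns
byTrialLocal <- fixDT[CURRENT_FIX_INTEREST_AREA_LABEL != ".",
                      .(meanFixDur = mean(CURRENT_FIX_DURATION)),
                      by = .(RECORDING_SESSION_LABEL, TRIAL_INDEX, TRIAL_TYPE,
                             CURRENT_FIX_INTEREST_AREA_LABEL)]

# Then average the by-trial means to get by-participant means
byPptLocal <- byTrialLocal[, .(meanFixDur = mean(meanFixDur)),
                           by = .(RECORDING_SESSION_LABEL, TRIAL_TYPE,
                                  CURRENT_FIX_INTEREST_AREA_LABEL)]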

Computing Mean Fixation Durations (local)
Example from a fixation report
Now it's time to run the ANOVA
This is done the same way as before, just with one more within-subjects factor
But the results are similar: only TRIAL_TYPE is significant
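A sketch of the call, assuming the byPptLocal table from the previous sketch:

# Same as before, with the interest area label as a second within-subjects factor
localAnova <- ezANOVA(
  data   = byPptLocal,
  dv     = meanFixDur,
  wid    = RECORDING_SESSION_LABEL,
  within = .(TRIAL_TYPE, CURRENT_FIX_INTEREST_AREA_LABEL)
)
localAnova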

Computing Mean Fixation Durations (local)
Example from a fixation report
Next, we get the means as before
Again, we are now adding CURRENT_FIX_INTEREST_AREA_LABEL to our list of grouping within-subjects factor columns
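A sketch of the ezStats call for the local analysis (localStats is a made-up name):

localStats <- ezStats(
  data   = byPptLocal,
  dv     = meanFixDur,
  wid    = RECORDING_SESSION_LABEL,
  within = .(TRIAL_TYPE, CURRENT_FIX_INTEREST_AREA_LABEL)
)
localStats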

Sneak Peek at the Graph
Note that this graph has two panels, or in ggplot's language two facets: one for DISTRACTOR_A objects and one for TARGET objects
How do we get it to do that? The facet_wrap command will create facets for every level of CURRENT_FIX_INTEREST_AREA_LABEL
You're not limited to creating facets for only one column. Try out facet_wrap(TRIAL_TYPE~CURRENT_FIX_INTEREST_AREA_LABEL) and see what happens
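A sketch of how the faceted plot might be produced, re-using the localStats table from the previous sketch (facetPlot and the file name are made up):

facetPlot <- ggplot(localStats, aes(x = TRIAL_TYPE, y = Mean)) +
  geom_point(size = 3) +
  facet_wrap(~CURRENT_FIX_INTEREST_AREA_LABEL) +  # one panel per interest area label
  labs(x = "Trial type", y = "Mean fixation duration (ms)") +
  theme_classic()

ggsave("mean_fixation_durations_local.png", facetPlot, width = 8, height = 4)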

Writing it up
When writing up eye-tracking data, don't just assume the reader knows why you examined each measure
Given the complexity and number of possible measures, it's vital that you are extremely clear, both in your own head and when you write things up, about why each measure was examined and what that measure is telling you
If people start complaining that you've explained it too much and that it's bordering on being patronising, then you're doing it right

Writing it up

From Godwin, Hyde, Taunton, Calver, Blake & Liversedge (2013)

Simple approach:
Begin by stating what the measure has been shown to demonstrate in the past
Make a prediction for that measure in your own study
Then describe how you examined it
Finally, describe what it showed

Don't just bombard the reader.

Writing it up

From Sheridan & Reingold (2013)

Writing it up

From Sheridan & Reingold (2013)

Writing it up
From Fitzsimmons & Drieghe (2013)

Writing it up
From Fitzsimmons & Drieghe (2013)

The bigger picture
This approach forms part of a larger picture when writing up your work
Let's just note a few pointers before finishing

The bigger picture
Introduction
First paragraph: general context of the work, previewing the main points
Middle paragraphs: existing research on the topic, highlighting what has been missed or not done (either at all, or not done well) before
Ending paragraphs: say how your work will overcome the limitations of previous work, clearly noting how what you have done fills a gap in the existing literature and human knowledge. Tell them why your work is awesome. State your research question(s). Applied relevance also gets noted if relevant

The bigger picture
Results
First paragraph: describe what you are going to do in your results and why
Second paragraph: describe how you cleaned your eye-tracking data
Middle paragraphs: go through each of your measures in the same order as you predicted them in your introduction. For each one, state WHY you are analysing that one, WHAT it shows you, and whether it confirms or rejects your predictions

The bigger picture
Discussion
First paragraph: re-state what you did in the study and remind the reader of your goals and research questions
Middle paragraphs: go through each of your measures in the same order as you predicted them in your introduction. For each one, state WHY you analysed that one, what the outcome was, and WHAT THAT MEANS in relation to your predictions
Later paragraphs: draw the results together for an overall picture. State applied implications if necessary. Suggest future studies that would be cool. Never end by saying something along the lines of "more research is needed".

The rest of today
Next up:
Head to the website (http://wiki.psychwire.co.uk/) and go through the "Part 4: Data Viewer" section
Then go through the "Part 5: Data Analysis" section, which will outline the bits we've gone through above and some extra pieces here and there
That's it.
