Extended Framework for Usability Testing in e-Navigation Systems

P. Zalewski & B. Muczyński
Maritime University of Szczecin, Szczecin, Poland

ABSTRACT: The paper presents a framework for usability testing of ECDIS equipment based on the IMO's Guidelines on Software Quality Assurance and Human-centred Design for e-Navigation. By incorporating eye-tracking techniques into the procedure it was possible to measure the visual attention distribution and the cognitive workload. The presented method could be used to evaluate the usability of any e-navigation system, which is necessary to ensure that seafarers are able to successfully perform the primary operations of systems upgraded with e-navigation functions, regardless of the type and specifications of the system and of the users' knowledge of and experience with the system. The initial results are presented and discussed, as the study is still ongoing.

http://www.transnav.eu
The International Journal on Marine Navigation and Safety of Sea Transportation, Volume 10, Number 1, March 2016
DOI: 10.12716/1001.10.01.04

1 INTRODUCTION

E-navigation systems are expected to enhance the safety and security of navigation. With each new generation of navigational equipment, producers try to provide new features and extended functionality. However, additional e-navigation functions can make it more difficult to understand a system's primary information and, if usability is poor, may hamper the operation of the system's primary functions. For example, additional information in an Electronic Chart Display and Information System (ECDIS) may impede the route monitoring function. Therefore, usability rating methods must be developed to ensure that seafarers are able to successfully perform the primary operations of systems upgraded with e-navigation functions, regardless of the type and specifications of the system and of the users' knowledge of and experience with the system.

2 USABILITY TESTING

The International Organization for Standardization (ISO) defines usability as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use". When considering the usability of a software application, usability means a target user's perception of the effectiveness (fitness for purpose) and efficiency (work or time required to use) of the given interface. Over the last 30 years, ISO and various public and private research organizations such as NASA or SA Technologies have developed several methods of rating and quantifying task load (TL) (Hart 2006, NASA 1986) and situation awareness (SA) (Endsley 1988, Endsley 2013). Nowadays these methods are common industry standards used when assessing a task's, system's, or team's effectiveness, or other aspects of performance. They utilize many task-specific techniques to achieve the rating goal. For example, the monitoring of eye tracking in combination with
SAGAT is one of the specific techniques developed at the Maritime University of Szczecin (MUS) for the evaluation, improvement, and general usability testing (UT) of a ship's navigation equipment (Muczyński et al. 2013, Zalewski et al. 2012).
ISO/TR 16982:2002 provides information on human-centred UT methods, comprising various task-specific rating techniques, which can be used for the design and evaluation of e-navigation displays. It details the advantages, disadvantages, and other factors relevant to using each UT method. A digest of these methods is presented in Table 1.
In March 2015, the International Maritime Organization issued the circular "Guideline on Software Quality Assurance and Human-centred Design for e-Navigation", officially introducing UT methods into future electronic equipment for marine navigation. Appendix 3 of this guideline presents a UT process based on the ECDIS example as one closely aligned with the testing of future e-navigation systems. This UT example aligns with the integration and testing stage of a Human-Centred Design (HCD) process for evaluating the performance of essential tasks by competent users. The selection of test participants is important and has a bearing on the quality of test results. If tasks require operations based on navigational experience or knowledge, then appropriate participants should be selected. Tasks that are generally performed by less experienced or knowledgeable personnel should be similarly tested.
The UT activity involves the following steps:
1 planning;
2 preparation;
3 undertaking and controlling tests;
4 evaluation of results; and
5 use of feedback.
A UT plan should be developed by defining scenarios and identifying the most important or critical tasks that users must perform. Users and the test environment are identified at this stage. A goal-based approach should be used when setting the tasks, with the aim of facilitating flexible yet practical assessment of the target e-navigation system. The following steps can be part of the goal-based approach:
1 definition of goals based on the context of use of the system, which may come from functions stipulated in internationally agreed performance standards;
2 specification of functional requirements, or of the criteria to be satisfied in order to conform to the goals, taking into account the relevant performance standards and user requirements;
3 specification of usability requirements that must be achieved during testing, based on the aspects of effectiveness, efficiency, and satisfaction; and
4 preparation of tests that will assist in verifying the extent to which the system conforms to the identified goals.
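As an illustration only (the guideline prescribes no data format), the traceability from goals through functional requirements to test tasks implied by the steps above can be sketched as a small data structure; all names and the coverage helper are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class TestTask:
    description: str
    requirements: list = field(default_factory=list)  # functional requirements exercised

@dataclass
class Goal:
    statement: str                      # from the context of use / performance standards
    functional_requirements: list = field(default_factory=list)
    tasks: list = field(default_factory=list)

def requirement_coverage(goal):
    """Fraction of the goal's functional requirements exercised by at least one task."""
    covered = {r for task in goal.tasks for r in task.requirements}
    return len(covered & set(goal.functional_requirements)) / len(goal.functional_requirements)
```

Preparing tests (step 4) then amounts to adding tasks until every functional requirement of every goal is covered at least once.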
Care has to be taken to guarantee the reproducibility of the test on different types of equipment and with the same settings and scenarios. This means it is advisable to avoid test scenarios where it is necessary to use specific functionality available only in a single type of ECDIS.
Table 1. UT methods that can be applied while designing e-navigation products (ISO 2012)
_______________________________________________
Name of the method   Direct involvement of users
  Short description of method
_______________________________________________
Observation of users   Yes
  Collection of information in a precise and systematic way about the behaviour and the performance of users, in the context of specific tasks during user activity.
Performance-related measurements   Yes
  Collection of quantifiable performance measurements in order to understand the impacts of usability issues.
Critical incident analysis   Yes
  Systematic collection of specific events (positive or negative).
Questionnaires   Yes
  Indirect evaluation methods which gather users' opinions about the user interface in predefined questionnaires.
Interviews   Yes
  Similar to questionnaires but with greater flexibility, involving face-to-face interaction with the interviewee.
Thinking aloud   Yes
  Involves having users continuously verbalize their ideas, beliefs, expectations, doubts, discoveries, etc. during their use of the system being tested.
Collaborative design and evaluation   Yes
  Methods which allow different types of participants (users, product developers, human factors specialists, etc.) to collaborate in the evaluation or design of systems.
Creativity methods   Yes/No
  Methods which involve the elicitation of new product and system features, usually extracted from group interactions. In the context of human-centred approaches, members of such groups are often users.
Simulation   Yes/No
  Use of computer simulation modelling tools for initial evaluations.
Document-based methods   No
  Examination of existing documents by the usability specialist to form a professional judgement of the system.
Model-based approaches   No
  Use of abstract representations of the evaluated product to allow the prediction of users' performance.
Expert evaluation   No
  Evaluation based on the knowledge, expertise, and practical experience in ergonomics of the usability specialist.
Automated evaluation   No
  Algorithms focused on usability criteria or using ergonomic knowledge-based systems which diagnose the deficiencies of a product compared to predefined rules.
_______________________________________________
3 EYE TRACKING

Eye tracking is a set of techniques and methods for measuring the position of the subject's eyes in relation to the visual scene; in this way a gaze point is obtained. Eye tracking itself has not been included in the IMO's "Guideline on Software Quality Assurance and Human-centred Design for e-Navigation", but it meets the criteria for both the "Observation of users" and "Performance-related measurements" methods and can be used efficiently in "Simulation" methods. It has been proved a valid method for usability testing in much previous research (Ehmke, Wilson 2007, Goldberg, Kotval 1999, Strandvall 2009). A few studies have reported the usefulness of this technique in the ship's bridge environment (Papachristos et al. 2012, Lützhöft, Dukic 2007). Jacob and Karn (2003) report the four most common eye tracking measures used in usability studies:
1 Fixation: a relatively stable eye-in-head position within some threshold of dispersion (typically ~2°), over some minimum duration (typically 100-200 ms), and with a velocity below some threshold (typically 15-100 degrees per second).
2 Gaze duration: the cumulative duration and average spatial location of a series of consecutive fixations within an area of interest. Gaze duration typically includes several fixations and may include the relatively small amount of time taken by the short saccades between these fixations. A fixation occurring outside the area of interest marks the end of the gaze. In some studies this measure is called "dwell", "glance" or "fixation cycle".
3 Area of interest: an area of a display or visual environment that is of interest to the research or design team and is thus defined by them (not by the participant).
4 Scan path: the spatial arrangement of a sequence of fixations.
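The fixation definition above (a dispersion threshold plus a minimum duration) corresponds to the classic dispersion-threshold identification (I-DT) algorithm. A minimal sketch, assuming gaze samples as (time in ms, x, y) tuples in degrees; the threshold values are the typical ones quoted above, and the sample format is an illustrative assumption:

```python
def detect_fixations(samples, max_dispersion=2.0, min_duration=100.0):
    """I-DT sketch. samples: list of (t_ms, x_deg, y_deg), time-ordered.
    Returns fixations as (start_ms, end_ms, centroid_x, centroid_y)."""
    fixations, i = [], 0
    while i < len(samples):
        j = i
        # Grow the window while dispersion stays under the threshold.
        while j + 1 < len(samples):
            xs = [s[1] for s in samples[i:j + 2]]
            ys = [s[2] for s in samples[i:j + 2]]
            # dispersion = (max x - min x) + (max y - min y)
            if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                break
            j += 1
        if samples[j][0] - samples[i][0] >= min_duration:
            window = samples[i:j + 1]
            cx = sum(s[1] for s in window) / len(window)
            cy = sum(s[2] for s in window) / len(window)
            fixations.append((samples[i][0], samples[j][0], cx, cy))
            i = j + 1   # continue after the fixation
        else:
            i += 1      # too short: slide the window forward
    return fixations
```

In practice, commercial analysis software applies the same idea with vendor-specific thresholds and velocity filtering.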
Depending on the equipment used and the type of study, a number of other measures can be used. A detailed list is given in (Holmqvist et al. 2011) and includes the number, duration, and frequency of blinks, saccade direction and velocity, and microsaccades. A comprehensive eye tracking study can give insight into search efficiency (e.g. poor search efficiency due to a poor arrangement of display elements), the importance of specific interface elements, task difficulty, and the participant's stress and cognitive workload.
The usefulness of eye tracking techniques is hampered by several technical difficulties related to both data collection and data analysis. Despite technological advancement, still around 10-20% of the population cannot be tracked reliably; this is usually the case with older participants who have some kind of visual impairment and have to use either glasses or contact lenses. Another problem is related to the fact that each eye position is given in vertical and horizontal coordinates in a system that is fixed in the eye tracker (head) frame. This means that when using a stationary eye tracker, the participant should restrict head movements to a small area (about a cubic foot). When a mobile eye tracker is used, which is the only reliable solution when conducting a study on a ship's bridge simulator, fixation coordinates have to be transformed into a ship's bridge coordinate system. This has to be done manually, using frame-by-frame analysis or dedicated software that allows fixation-by-fixation mapping. Both methods are very laborious and time-consuming and present a serious drawback for any study with a considerable number of participants and long scenarios. The last major difficulty is related to data interpretation. Eye tracking data analysis can proceed either top-down, based on cognitive theory or design hypotheses, or bottom-up, based entirely on observation of the data without predefined theories relating eye movements to cognitive activity (Goldberg et al. 2002). For a usability study it is important to closely examine the data stream and relate it to the current task and environment. For example, when considering long fixations during an ECDIS usability test, all external factors have to be identified before a statement about the higher difficulty of the task can be made.
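The head-frame problem described above can be illustrated with the planar case: if the corners of a bridge display are located in the scene-camera image, gaze points can be re-projected into fixed display coordinates with a 3x3 planar homography. This is a sketch of the general idea only, not of any vendor's mapping implementation, and the matrix values used below are illustrative assumptions:

```python
def apply_homography(H, x, y):
    """Map a scene-camera gaze point (x, y) into fixed display
    coordinates using a 3x3 planar homography H (list of rows)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v

def map_fixations(H, fixations):
    """Transform a list of (t, x, y) fixations into the display frame."""
    return [(t, *apply_homography(H, x, y)) for t, x, y in fixations]
```

In a real study, H would have to be re-estimated per video frame (e.g. from fiducial markers around the display), which is exactly why the manual frame-by-frame mapping is so laborious.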
4 FRAMEWORK FOR ECDIS USABILITY TESTING

In the environment of the Full Mission Ship's Bridge Simulator (FMBS) at MUS, the UT process for two Kongsberg-manufactured ECDISes, SeaMap 10 and K-Bridge 7.0, was conducted in accordance with the recommendations set by IMO. The goal was defined as "to plan and display the ship's route for the intended voyage and to plot and monitor positions throughout the voyage", based on SOLAS regulation V/19.2.1.4. Similarly, the functional requirements for the ECDISes were defined based on the IMO's ECDIS performance standard (IMO 2006). The following functional requirements, related to the handling of the nautical data necessary for safe navigation, together with their sub-requirements, were taken into account:
1 chart data handling (for instance: change display orientation, mode, etc.);
2 own ship data handling (for instance: read position, speed, etc.); and
3 tracked target (TT) and radar data handling (for instance: show TT symbols overlaid on the ECDIS chart area, etc.).
In the case of ECDIS, "usability" can be evaluated in terms of user effectiveness and efficiency for each of the tasks and overall satisfaction with the system (for example through subjective evaluation by TL and SA). As highlighted in Table 2, the measures of effectiveness were related to the difficulty and completeness of task execution. The achievement rate was used as a measure of "effectiveness" and quantified on four levels: "1. Smoothly", "2. Not smoothly", "3. With errors", "4. With suggestions". Usability outcomes were based on the dialogue principles, as identified in (ISO 9241-110:2006), using UT methods based on (IMO MSC.1/Circ.1512, ISO/TR 16982:2002).
The specific scenarios and test tasks were created to satisfy the functional requirements. The following are the tasks for a basic display handling scenario:
Task 1: Adjust display modes and scale to meet the operator's needs
Task 2: Obtain information about a lighthouse
Task 3: Measure the bearing and distance to a landmark
Task 4: Overlay a tracked target symbol and obtain information about the target
Quantitative performance criteria, such as the time taken to complete tasks, and questionnaires, which assist with an overall subjective system evaluation, were included in line with the criteria set in Table 2. These were necessary because, as one can easily deduce from Table 2, to differentiate, for example, between "Achieved not smoothly" and "Not achieved", a time limit for the completion of the specific task must be set.
Table 2. Achievement criteria for the generic usability rating, based on (IMO MSC.1/Circ.1512)
_______________________________________________
Achievement level   Criteria
_______________________________________________
Achieved
 1. Smoothly
  1. Participants understood the information correctly and operated properly with confidence.
  2. Participants made some mistakes but noticed the mistakes immediately and achieved the goal smoothly.
 2. Not smoothly
  1. Participants completed the task properly by themselves, but with some hesitation or confusion.
  2. Participants took time to find the first action or to recover from errors, but completed the task with a small number of interactions.
Not achieved
 3. With errors
  1. Participants could not understand the information correctly.
  2. It took a large number of interactions to achieve the goal, even if they completed the task properly.
 4. With suggestions
  1. Participants could not complete the task by themselves and needed suggestions from the instructor or moderator.
_______________________________________________
Based on the study presented in (Muczyński et al. 2013), participants were divided into groups with different experience. Two factors were taken into account:
1 general seafaring experience;
2 ECDIS experience.
Since both ECDIS generic and ECDIS type-specific courses are mandatory, it was considered whether the participant had taken the type-specific course for a given ECDIS type, and how long it had been since the participant last worked with the same type of ECDIS.
Each participant was given the same set of tasks with no time limit. To supplement the IMO recommendations, eye tracking data were collected for each participant throughout the initial study. The mobile eye tracker "SMI Eye Tracking Glasses" was used for the data collection. The data were analysed using the Semantic Gaze Mapping function provided by the SMI BeGaze software. The number, location, frequency, and duration of fixations were recorded and mapped onto the ECDIS interface. The interface was divided into three main areas of interest: the chart area, alarms and sensor data, and menus (Fig. 1). Because most of the menus are displayed on the screen only after a specific button has been clicked, the size of the areas of interest was not constant. Where relevant and advisable, fixations were identified with an accuracy down to a single menu, submenu, button, or individual item of graphical or numerical information.
Figure 1. Main areas of interest on the ECDIS interface: chart area, menu area, and alarms and sensor data
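Mapping a fixation to one of these areas of interest reduces to a point-in-rectangle test; the sketch below assumes pixel rectangles (left, top, right, bottom) whose values are purely illustrative, and in practice the menu rectangle would change whenever a menu opens:

```python
# Assumed AOI layout in screen pixels: left menu strip, central chart
# area, right alarms-and-sensors panel (values are illustrative only).
AOIS = {
    "menus": (0, 0, 200, 1080),
    "chart area": (200, 0, 1600, 1080),
    "alarms and sensor data": (1600, 0, 1920, 1080),
}

def classify_fixation(x, y, aois=AOIS):
    """Return the name of the first AOI rectangle containing (x, y)."""
    for name, (left, top, right, bottom) in aois.items():
        if left <= x <= right and top <= y <= bottom:
            return name
    return "outside"
```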
The initial framework for usability testing includes four stages, as given in Table 3.
Table 3. Four stages used for ECDIS usability rating
_______________________________________________
Stage 1   Achievement level   Graded on a scale of 1-4 as shown in Table 2
Stage 2   Time   Total time required to accomplish a given task
Stage 3   Eye tracking measures   All relevant data captured with the eye tracker
Stage 4   Scanpath analysis   Detailed analysis of the participant's visual attention distribution
_______________________________________________
In stage 1, each task is rated on a scale from 1 to 4 according to the IMO guidelines (Table 2). During the first iteration of the study, each task was evaluated by an experienced instructor. For the next iteration, dedicated computer software is being developed. This will help to make the data analysis faster and more efficient. It will also remove the subjectivity from the evaluation process and increase the reproducibility of this study.
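Such software would have to encode the Table 2 criteria as explicit rules. One hypothetical rule set is sketched below; all field names and the interaction and time limits are illustrative assumptions that would need calibration per task, not values from the study:

```python
def achievement_level(completed, needed_hints, error_count,
                      interactions, expected_interactions,
                      time_s, time_limit_s):
    """Map a logged task record to the four Table 2 achievement levels."""
    if not completed or needed_hints:
        return 4          # "With suggestions": needed help or did not finish
    if error_count > 0 or interactions > 2 * expected_interactions:
        return 3          # "With errors": misunderstandings or excessive interactions
    if time_s > time_limit_s:
        return 2          # "Not smoothly": hesitation, recovery, slow first action
    return 1              # "Smoothly"
```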
Stage 2 is concerned with the time required to accomplish each task. This is closely related to the first stage when discriminating between level 1 and level 2. It is suggested that time be treated independently, because by itself it provides an indirect measure of the number of steps required for each task.
In stage 3, all relevant eye tracking measures are taken into account and evaluated. These data provide a basis for a cognitive evaluation and should help in identifying those tasks that are the most demanding and result in an increased workload for the participant. In the FMBS K-ECDIS study the following measures were considered:
- total number of fixations per task,
- fixation frequency,
- fixation duration,
- location of fixations in a given area of the interface,
- gaze duration,
- number of blinks,
- duration of blinks.
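A sketch of how the listed measures could be aggregated from a fixation log is shown below. The record layout (task id, start, end in ms for fixations; task id and duration in ms for blinks) is an assumption for illustration, not an export format of the analysis software used:

```python
def task_measures(fixations, blinks, task_id, task_length_s):
    """Aggregate the stage-3 measures for one task.
    fixations: list of (task_id, start_ms, end_ms);
    blinks: list of (task_id, duration_ms)."""
    fx = [f for f in fixations if f[0] == task_id]
    bl = [b for b in blinks if b[0] == task_id]
    durations = [end - start for _, start, end in fx]
    return {
        "fixation_count": len(fx),
        "fixation_frequency_hz": len(fx) / task_length_s,
        "mean_fixation_ms": sum(durations) / len(durations) if fx else 0.0,
        "blink_count": len(bl),
        "mean_blink_ms": sum(b[1] for b in bl) / len(bl) if bl else 0.0,
    }
```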
These measures have to be interpreted and compared both between participants and against a baseline scenario. The baseline scenario should be designed to provide a low cognitive workload environment for each participant and should be used to collect baseline eye tracking measures. The collected data are later compared to the baseline scenario and used to infer changes in cognitive workload, task difficulty, and design issues.
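One simple way to sketch the baseline comparison is to express each test-scenario measure as a z-score against the same participant's baseline recordings; the 2 SD flagging threshold below is an illustrative assumption, not a value from the study:

```python
def zscore_vs_baseline(value, baseline_values):
    """Standard score of `value` against the baseline sample (n >= 2)."""
    n = len(baseline_values)
    mean = sum(baseline_values) / n
    var = sum((v - mean) ** 2 for v in baseline_values) / (n - 1)  # sample variance
    return (value - mean) / (var ** 0.5)

def flag_elevated(value, baseline_values, threshold=2.0):
    """True if the measure deviates from baseline by more than `threshold` SD."""
    return abs(zscore_vs_baseline(value, baseline_values)) > threshold
```

A formal analysis would instead use a proper statistical test over an adequate sample, as the conclusions of the paper note.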
Via the stage 4 analysis it is possible to identify all distractors and errors during a given task by close examination of scanpaths. For example, during a task of measuring the bearing and distance to a given landmark, the scanpath shows precisely where the participant's attention was focused at any given moment (Fig. 2). On a typical scanpath, fixations are represented as circles, where the size of each circle corresponds to the fixation's duration and colour intensity is used for ordering: more recent fixations are shown in a vivid and opaque colour. Without the eye tracking technique it is only possible to register a participant's actions and evaluate whether they were correct or incorrect. By incorporating the eye tracker into the study it is possible to register and evaluate the participant's attention distribution. This reveals not only the actions but also the intentions of the participant and makes it possible to recreate the search process on a cognitive level.

Due to the complex nature of scanpaths, it is not feasible to create and analyse a single scanpath for a complex task. For this kind of analysis, each complex task should be divided into a set of simple sub-tasks.
Figure 2. Scanpath for the task: bearing and distance measurement

Figure 3. Scanpath for the task: bearing and distance measurement; a situation where the participant had a problem with setting an appropriate chart scale
Each sub-task should be clearly defined, but only if it can be proved to be an indispensable step in the given task. For example, when a route creation task is considered, it can be divided into the following sub-tasks: checking and setting default route parameters, opening a route window, defining a waypoint, saving a route, and validating a route.
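Once a complex task has been split into sub-tasks, each sub-task's scanpath can be encoded as a string of AOI labels and compared across participants, e.g. with an edit distance. Both the letter coding and the edit-distance comparison are common scanpath-analysis choices sketched here as an illustration, not methods stated in the study:

```python
# Letter codes for the three main AOIs of the ECDIS interface.
AOI_CODES = {"chart area": "C", "menus": "M", "alarms and sensor data": "A"}

def scanpath_string(aoi_sequence, codes=AOI_CODES):
    """Encode a sequence of fixated AOI names as a compact string."""
    return "".join(codes.get(aoi, "x") for aoi in aoi_sequence)

def edit_distance(a, b):
    """Levenshtein distance between two scanpath strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]
```

A small distance between two participants' sub-task scanpaths suggests a similar search strategy; a large distance may point to the distractors and errors mentioned above.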
5 CONCLUSIONS

The developed framework extends the usability test procedure described in IMO's "Guideline on Software Quality Assurance and Human-centred Design for e-Navigation". By incorporating eye-tracking techniques into the procedure it is possible to measure visual attention distribution and cognitive workload. This allows for a detailed analysis of the interface and the identification of major design flaws.
At this stage of the study the obtained qualitative and quantitative measures are preliminary and cannot be used for reliable estimations of the usability rating of the SeaMap 10 and K-Bridge 7.0 interfaces. The first stage of this study was necessary to develop the described framework and to verify its validity. To draw an unambiguous conclusion, it is necessary to conduct the study on a considerable number of participants, so that the sample size is large enough to describe the variability of each measure and the measure's significance for the usability rating.
The described procedure requires a considerable amount of time to analyse the eye tracking data, as well as specialized research equipment to obtain precise and reliable eye movement data.
REFERENCES

Ehmke, C., Wilson, S., Identifying web usability problems from eye-tracking data, in Proceedings of the 21st British HCI Group Annual Conference on People and Computers: HCI...but not as we know it, Volume 1 (BCS-HCI '07), British Computer Society, Swinton, UK, 2007, pp. 119-128.
Endsley, M.R., Situation awareness global assessment technique (SAGAT), Proceedings of the National Aerospace and Electronics Conference (NAECON), pp. 789-795, New York: IEEE, 1988.
Endsley, M.R., Situation Awareness Oriented Design, in Lee, J.D. and Kirlik, A. (eds.), The Oxford Handbook of Cognitive Engineering, New York: Oxford University Press, 2013.
Goldberg, J.H., Kotval, X.P., Computer interface evaluation using eye movements: methods and constructs, International Journal of Industrial Ergonomics, vol. 24, no. 6, pp. 631-645, Oct. 1999.
Goldberg, J.H., Stimson, M.J., Lewenstein, M., Scott, N. & Wichansky, A.M., Eye tracking in web search tasks: design implications, in Proceedings of the Eye Tracking Research & Applications Symposium 2002, pp. 51-58, New York: ACM.
Hart, S.G., NASA Task Load Index (NASA-TLX); 20 Years Later, NASA Ames Research Center, Moffett Field, CA, 2006.
Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., and van de Weijer, J., Eye Tracking: A Comprehensive Guide to Methods and Measures, Oxford University Press, 2011.
IMO MSC.1/Circ.1512, Guideline on Software Quality Assurance and Human-Centred Design for e-navigation, IMO, London, 2015.
IMO MSC.232(82), Revised ECDIS Performance Standards, IMO, London, 2006.
IMO SOLAS, International Convention for the Safety of Life at Sea, 1974, as amended, Consolidated Edition, IMO, London, 2014.
ISO 9241-110:2006, Ergonomics of human-system interaction, Part 110: Dialogue principles, ISO, 2006.
ISO/TR 16982:2002, Ergonomics of human-system interaction: Usability methods supporting human-centred design, ISO, 2002.
Jacob, R.J.K. and Karn, K.S., Commentary on Section 4. Eye tracking in human-computer interaction and usability research: Ready to deliver the promises, in The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Research, Elsevier, 2003, pp. 573-605.
Lützhöft, M., Dukic, T., Show me where you look and I'll tell you if you're safe: Eye tracking of maritime watchkeepers, Proceedings of the 39th Nordic Ergonomics Society Conference, Lysekil, Sweden, 2007.
Muczyński, B., Gucma, M., Bilewski, M., Zalewski, P., Using eye tracking data for evaluation and improvement of training process in ship's navigational bridge simulator, Scientific Journals Maritime University of Szczecin, 33(105), 2013, pp. 75-78.
NASA, NASA Task Load Index (TLX) v. 1.0 Manual, NASA Ames Research Center, Moffett Field, 1986.
Papachristos, D., Koutsabasis, P., & Nikitakos, N., Usability evaluation at the ship's bridge: A multi-method approach, 4th International Symposium on Ship Operations, Management and Economics, Athens, Greece, 2012.
Strandvall, T., Eye Tracking in Human-Computer Interaction and Usability Research, in Human-Computer Interaction, INTERACT 2009, vol. 5727, T. Gross, J. Gulliksen, P. Kotzé, L. Oestreicher, P. Palanque, R. Prates, and M. Winckler, Eds., Springer Berlin Heidelberg, 2009, pp. 936-937.
Zalewski, P., Tomczak, A., Gralak, R., Simulation Analysis of ECDIS' Route Exchange Functionality Impact on Navigation Safety, The European Navigation Conference 2012, Gdańsk, 25-27.04.2012, Annual of Navigation, 19/2012/part