Extended Framework for Usability Testing in e-Navigation Systems

P. Zalewski & B. Muczyński
Maritime University of Szczecin, Szczecin, Poland

ABSTRACT: The paper presents the framework of usability testing of ECDIS equipment based on the IMO's Guidelines on Software Quality Assurance and Human-centred Design for e-Navigation. By incorporating eye tracking techniques into the procedure it was possible to measure the visual attention distribution and the cognitive workload. The presented method could be used to evaluate the usability of any e-navigation system, which is necessary to ensure that seafarers are able to successfully perform the primary operations of systems upgraded with e-navigation functions, regardless of the type and specifications of the system and the users' knowledge of and experience with the system. The initial results are presented and discussed, as the study is still ongoing.

http://www.transnav.eu
the International Journal on Marine Navigation and Safety of Sea Transportation
Volume 10, Number 1, March 2016
DOI: 10.12716/1001.10.01.04

1 INTRODUCTION

E-navigation systems are expected to enhance the safety and security of navigation. With each new generation of navigational equipment, producers try to provide new features and extended functionality. However, additional e-navigation functions can make it more difficult to understand a system's primary information and, with poor usability, may hamper the operation of the system's primary functions. For example, additional information in an Electronic Chart Display and Information System (ECDIS) may impede the route monitoring function. Therefore, usability rating methods must be developed to ensure that seafarers are able to successfully perform the primary operations of systems upgraded with e-navigation functions, regardless of the type and specifications of the system and the users' knowledge of and experience with the system.

2 USABILITY TESTING

The International Organization for Standardization (ISO) defines usability as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use". When considering the usability of a software application, usability means the target user's perception of the effectiveness (fitness for purpose) and efficiency (work or time required to use) of the given interface. Over the last 30 years, ISO and various public and private research organizations, such as NASA or SA Technologies, have developed several methods of rating and quantifying task load (TL) (Hart 2006, NASA 1986) and situation awareness (SA) (Endsley 1988, Endsley 2013). Nowadays these methods are common industry standards used when assessing a task, a system, or a team's effectiveness, or other aspects of performance. They utilize many task-specific techniques to achieve the rating goal. For example, the monitoring of eye tracking in combination with
SAGAT is one of the specific techniques developed at the Maritime University of Szczecin (MUS) for the evaluation, improvement and general usability testing (UT) of a ship's navigation equipment (Muczynski et al. 2013, Zalewski et al. 2012).
ISO/TR 16982:2002 provides information on human-centred UT methods, comprising various task-specific rating techniques, which can be used for the design and evaluation of e-navigation displays. It details the advantages, disadvantages and other factors relevant to using each UT method. A digest of these methods is presented in Table 1.
In March 2015, the International Maritime Organization issued the circular "Guideline on Software Quality Assurance and Human-centred Design for e-Navigation", officially introducing UT methods into future electronic equipment for marine navigation. Appendix 3 of this guideline presents a UT process based on the ECDIS example as one closely aligned with the testing of future e-navigation systems. This UT example aligns with the integration and testing stage of a Human-Centred Design (HCD) process for evaluating the performance of essential tasks by competent users. The selection of test participants is important and has a bearing on the quality of test results. If tasks require operations based on navigational experience or knowledge, then appropriate participants should be selected. Tasks that are generally performed by less experienced or knowledgeable personnel should be similarly tested.
The UT activity involves the following steps:
1 planning;
2 preparation;
3 undertaking and controlling tests;
4 evaluation of results; and
5 use of feedback.
A UT plan should be developed by defining scenarios and identifying the most important or critical tasks that users must perform. Users and the test environment are identified at this stage. A goal-based approach should be used when setting the tasks, with the aim of facilitating flexible yet practical assessment of the target e-navigation system. The following steps can be a part of the goal-based approach:
1 definition of goals based on the context of use of the system, which may come from functions stipulated in internationally agreed performance standards;
2 specifying functional requirements or the criteria to be satisfied in order to conform to the goals, taking into account the relevant performance standards and user requirements;
3 specifying usability requirements that must be achieved during testing, based on the aspects of effectiveness, efficiency and satisfaction; and
4 preparation of tests that will assist in verifying the extent to which the system conforms with the identified goals.
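The four planning steps above can be recorded as a simple data structure, sketched below. The structure itself is an assumption for illustration only (it is not part of the IMO guideline); the example goal and requirements paraphrase the SOLAS and ECDIS items used later in the paper.

```python
# Hypothetical record of a goal-based UT plan; the field layout mirrors
# steps 1-4 above and is an illustrative assumption, not a standard format.
from dataclasses import dataclass, field


@dataclass
class UsabilityTestPlan:
    goal: str                                                    # step 1
    functional_requirements: list = field(default_factory=list)  # step 2
    usability_requirements: list = field(default_factory=list)   # step 3
    tasks: list = field(default_factory=list)                    # step 4


plan = UsabilityTestPlan(
    goal="Plan and display the ship's route and monitor positions",
    functional_requirements=["chart data handling",
                             "own ship data handling"],
    usability_requirements=["effectiveness", "efficiency", "satisfaction"],
    tasks=["Adjust display modes and scale",
           "Obtain information about a lighthouse"],
)
```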
Care has to be taken to guarantee the reproducibility of the test on different types of equipment and with the same settings and scenarios. This means that it is advisable to avoid test scenarios where it is necessary to use specific functionality available only in a single type of ECDIS.
Table 1. UT methods that can be applied while designing e-navigation products (ISO 2012)
_______________________________________________
Name of        Direct       Short description of method
the method     involvement
               of users
_______________________________________________
Observation    Yes          Collection of information in a
of users                    precise and systematic way
                            about the behaviour and the
                            performance of users, in the
                            context of specific tasks
                            during user activity.
Performance-   Yes          Collection of quantifiable
related                     performance measurements in
measurements                order to understand the
                            impacts of usability issues.
Critical       Yes          Systematic collection of
incident                    specific events (positive or
analysis                    negative).
Questionnaires Yes          Indirect evaluation methods
                            which gather users' opinions
                            about the user interface in
                            predefined questionnaires.
Interviews     Yes          Similar to questionnaires but
                            with greater flexibility,
                            involving face-to-face
                            interaction with the
                            interviewee.
Thinking       Yes          Involves having users
aloud                       continuously verbalize their
                            ideas, beliefs, expectations,
                            doubts, discoveries, etc.
                            during their use of the
                            system being tested.
Collaborative  Yes          Methods which allow different
design and                  types of participants (users,
evaluation                  product developers, human
                            factors specialists, etc.) to
                            collaborate in the evaluation
                            or design of systems.
Creativity     Yes/No       Methods which involve the
methods                     elicitation of new product
                            and system features, usually
                            extracted from group
                            interactions. In the context
                            of human-centred approaches,
                            members of such groups are
                            often users.
Simulation     Yes/No       Use of computer simulation
                            modelling tools for initial
                            evaluations.
Document-      No           Examination of existing
based methods               documents by the usability
                            specialist to form a
                            professional judgement of
                            the system.
Model-based    No           Use of abstract
approaches                  representations of the
                            evaluated product to allow
                            the prediction of users'
                            performance.
Expert         No           Evaluation based on the
evaluation                  knowledge, expertise and
                            practical experience in
                            ergonomics of the usability
                            specialist.
Automated      No           Algorithms focused on
evaluation                  usability criteria or using
                            ergonomic knowledge-based
                            systems which diagnose the
                            deficiencies of a product
                            compared to predefined rules.
_______________________________________________
3 EYE TRACKING

Eye tracking is a set of techniques and methods to measure the position of the subject's eyes in relation to the visual scene. In this way a gaze point is obtained. Eye tracking itself has not been included in the IMO's "Guideline on Software Quality Assurance and Human-centred Design for e-Navigation", but it meets the criteria for both the "Observation of users" and "Performance-related measurements" methods and can be used efficiently in "Simulation" methods. It has been proved a valid method for usability testing in much previous research (Ehmke, Wilson 2007, Goldberg, Kotval 1999, Strandvall 2009). A few studies have reported the usefulness of this technique in the ship's bridge environment (Papachristos et al. 2012, Lutzhoft, Dukic 2007). Jacob and Karn (2003) report the four most common eye tracking measures used in usability studies:
1 Fixation: a relatively stable eye-in-head position within some threshold of dispersion (typically ~2°) over some minimum duration (typically 100-200 ms), and with a velocity below some threshold (typically 15-100 degrees per second).
2 Gaze duration: the cumulative duration and average spatial location of a series of consecutive fixations within an area of interest. Gaze duration typically includes several fixations and may include the relatively small amount of time for the short saccades between these fixations. A fixation occurring outside the area of interest marks the end of the gaze. In some studies, this measure is called "dwell", "glance" or "fixation cycle".
3 Area of interest: an area of a display or visual environment that is of interest to the research or design team and thus defined by them (not by the participant).
4 Scan path: the spatial arrangement of a sequence of fixations.
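The fixation definition above (a dispersion threshold over a minimum duration) corresponds to the classic dispersion-threshold identification (I-DT) algorithm. The sketch below is a minimal illustration of that idea using the thresholds quoted in the text (~2° dispersion, ≥ 100 ms); the input format and sampling rate are assumptions, not details of the study's SMI pipeline.

```python
# Minimal dispersion-threshold (I-DT) fixation filter.
# samples: list of (t_seconds, x_deg, y_deg) gaze points, assumed already
# converted to degrees of visual angle (a hypothetical preprocessing step).
def detect_fixations(samples, max_dispersion=2.0, min_duration=0.1):
    fixations = []
    i = 0
    while i < len(samples):
        j = i
        # Grow the window while all points stay within the dispersion limit,
        # measured as (max(x) - min(x)) + (max(y) - min(y)).
        while j + 1 < len(samples):
            window = samples[i:j + 2]
            xs = [p[1] for p in window]
            ys = [p[2] for p in window]
            if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                break
            j += 1
        duration = samples[j][0] - samples[i][0]
        if duration >= min_duration:
            window = samples[i:j + 1]
            fixations.append({
                "start": samples[i][0],
                "duration": duration,
                "x": sum(p[1] for p in window) / len(window),  # centroid
                "y": sum(p[2] for p in window) / len(window),
            })
            i = j + 1
        else:
            i += 1  # too short to be a fixation; slide the window
    return fixations
```

A velocity criterion, as mentioned in the text, could be added as a further filter on consecutive samples; it is omitted here for brevity.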
Depending on the equipment used and the type of study, a number of other measures can be used. A detailed list is given in (Holmqvist et al. 2011) and includes the number, duration and frequency of blinks, saccade direction and velocity, and microsaccades. A comprehensive eye tracking study can give insight into search efficiency (e.g. due to poor arrangement of display elements), the importance of a specific interface element, task difficulty, and the participant's stress and cognitive workload.
The usefulness of eye tracking techniques is hampered by several technical difficulties related to both data collection and data analysis. Despite technological advancement, still around 10-20% of the population cannot be tracked reliably; this is usually the case with older participants who have some kind of visual impairment and have to use either glasses or contact lenses. Another problem is related to the fact that each eye position is given in vertical and horizontal coordinates in a system that is fixed in the eye tracker (the head frame). This means that when using a stationary eye tracker, the participant should restrict head movements to a small area (about a cubic foot). When a mobile eye tracker is used, which is the only reliable solution when conducting a study on a ship's bridge simulator, fixation coordinates have to be transformed into a ship's bridge coordinate system. This has to be done manually, using frame-by-frame analysis or dedicated software that allows for fixation-by-fixation mapping. Both methods are very laborious and time-consuming and present a serious drawback for any study with a considerable number of participants and long scenarios. The last major difficulty is related to data interpretation. Eye tracking data analysis can proceed either top-down, based on cognitive theory or design hypotheses, or bottom-up, based entirely on observation of the data without predefined theories relating eye movements to cognitive activity (Goldberg et al. 2002). For a usability study it is important to closely examine the data stream and relate it to the current task and environment. For example, when considering long fixations during an ECDIS usability test, all external factors have to be identified before a statement about the higher difficulty of the task can be made.
4 FRAMEWORK FOR ECDIS USABILITY TESTING

In the environment of the Full Mission Ship's Bridge Simulator (FMBS) at MUS, the UT process of two Kongsberg-manufactured ECDISes, SeaMap 10 and K-Bridge 7.0, was conducted in accordance with the recommendations set by the IMO. The goal was defined as "to plan and display the ship's route for the intended voyage and to plot and monitor positions throughout the voyage", based on SOLAS regulation V/19.2.1.4. Similarly, functional requirements for the ECDISes were defined based on the IMO's ECDIS performance standard (IMO 2006). Functional requirements related to the nautical data handling necessary for safe navigation were taken into account, with the following sub-requirements:
1 Chart data handling (for instance: change display orientation, mode, etc.);
2 Own ship data handling (for instance: read position, speed, etc.); and
3 Tracked target (TT) and radar data handling (for instance: show TT symbols overlaid on the ECDIS chart area, etc.).
In the case of ECDIS, "usability" can be evaluated in terms of user effectiveness and efficiency for each of the tasks, and overall satisfaction with the system (for example, through subjective evaluation by TL and SA). As highlighted in Table 2, the measures of effectiveness were related to the difficulty and completeness of the task execution. The achievement rate was used as a measure of "effectiveness" and quantified by four levels: "1. Smoothly", "2. Not smoothly", "3. With errors", "4. With suggestions". Usability outcomes were based on the dialogue principles, as identified in ISO 9241-110:2006, using UT methods based on IMO MSC.1/Circ.1512 and ISO/TR 16982:2002.
Specific scenarios and test tasks were created to satisfy the functional requirements. The following are the tasks for a basic display handling scenario:
Task 1: Adjust display modes and scale to meet the operator's needs
Task 2: Obtain information about a lighthouse
Task 3: Measure the bearing and distance to a landmark
Task 4: Overlay a tracked target symbol and obtain information about the target
Quantitative performance criteria, such as the time taken to complete tasks, and questionnaires which assist with the overall subjective system evaluation were included in line with the criteria set in Table 2. These were necessary because, as one can easily deduce while studying Table 2, to differentiate between "Achieved not smoothly" and "Not achieved", for example, a time limit for the completion of the specific task must be set.
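As a hypothetical illustration of how such a time limit could enter the rating, the sketch below maps simple observations of a test run onto the four achievement levels of Table 2. The inputs and the exact decision order are invented for the sketch; the study set its own criteria and limits.

```python
# Illustrative mapping of task observations onto the four achievement
# levels (1 = smoothly ... 4 = with suggestions). All inputs are
# hypothetical observations recorded by the test moderator.
def achievement_level(completed_alone, time_s, hesitated, time_limit_s):
    if not completed_alone:
        return 4  # "With suggestions": needed help from the instructor
    if time_s > time_limit_s:
        return 3  # "With errors": completed, but not within the time limit
    if hesitated:
        return 2  # "Not smoothly": completed alone, with hesitation
    return 1      # "Smoothly": confident, correct operation
```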
Table 2. Achievement criteria for the generic usability rating based on (IMO MSC.1/Circ.1512)
_______________________________________________
Achievement level          Criteria
_______________________________________________
Achieved  1. Smoothly      1. Participants understood the
                           information correctly and
                           operated properly with
                           confidence.
                           2. Participants made some
                           mistakes but noticed the
                           mistakes immediately and
                           achieved the goal smoothly.
          2. Not smoothly  1. Participants completed the
                           task properly by themselves,
                           but with some hesitation or
                           confusion.
                           2. Participants took time to
                           find the first action or to
                           recover from errors but
                           completed the task with a
                           small number of interactions.
Not       3. With errors   1. Participants could not
achieved                   understand the information
                           correctly.
                           2. It took a large number of
                           interactions to achieve the
                           goal even if they completed
                           the task properly.
          4. With          1. Participants could not
          suggestions      complete the task by
                           themselves and needed
                           suggestions from the
                           instructor or moderator.
_______________________________________________
Based on the study presented in (Muczynski et al. 2013), participants were divided into groups with different experience. Two factors were taken into account:
1 General seafaring experience
2 ECDIS experience
Since both ECDIS generic and ECDIS type-specific courses are mandatory, it was considered whether the participant had taken the type-specific course for a given ECDIS type, and when the participant had last worked with the same type of ECDIS.
Each participant was given the same set of tasks with no time limitation. To supplement the IMO recommendations, eye tracking data were collected for each participant throughout the initial study. The mobile eye tracker "SMI Eye Tracking Glasses" was used for the data collection. The data was analysed using the Semantic Gaze Mapping function provided by the SMI BeGaze software. The number, location, frequency and duration of fixations were recorded and mapped onto the ECDIS interface. The interface was divided into three main Areas of Interest: the chart area, the alarms and sensor data, and the menus (Fig. 1). Because most of the menus are displayed on the screen after a specific button has been clicked, the size of the Areas of Interest was not constant. Where relevant and advisable, fixations were identified with an accuracy down to a single menu, submenu, button, or individual graphical or numerical information.
Figure 1. Main areas of interest on the ECDIS interface: chart area, menu area, and alarms and sensor data
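Assigning a fixation to an Area of Interest reduces, for fixed rectangular regions, to a point-in-rectangle test, as sketched below. The rectangles are illustrative placeholder coordinates, not the actual SeaMap 10 or K-Bridge 7.0 screen layouts; in the study this mapping was done with BeGaze's Semantic Gaze Mapping, and the AOI sizes varied as menus opened and closed.

```python
# Hypothetical AOI layout in screen pixels: (x_min, y_min, x_max, y_max).
# The three regions mirror Figure 1; the coordinates are invented.
AOIS = {
    "chart_area":     (0,    0,   1200, 900),
    "menus":          (1200, 0,   1600, 600),
    "alarms_sensors": (1200, 600, 1600, 900),
}


def map_fixation_to_aoi(x, y, aois=AOIS):
    """Return the name of the AOI containing point (x, y), or None."""
    for name, (x0, y0, x1, y1) in aois.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None  # fixation fell outside every defined AOI
```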
The initial framework for usability testing includes 4 stages, as given in Table 3.
Table 3. Four stages used for ECDIS usability rating
_______________________________________________
Stage 1  Achievement level   Graded on scale 1-4 as
                             shown in Table 2
Stage 2  Time                Total time required to
                             accomplish a given task
Stage 3  Eye tracking        All relevant data captured
         measures            with the eye tracker
Stage 4  Scan path analysis  Detailed analysis of
                             participant's visual
                             attention distribution
_______________________________________________
In stage 1, each task is rated on a scale from 1 to 4 according to the IMO guidelines (Table 2). During the first iteration of the study, each task was evaluated by an experienced instructor. For the next iteration, dedicated computer software is being developed. This will help to make the data analysis faster and more efficient. It will also remove the subjectivity of the evaluation process and increase the reproducibility of the study.
Stage 2 is concerned with the time required to accomplish each task. This is closely related to the first stage when discriminating between level 1 and level 2. It is suggested to treat the time independently because, by itself, it provides an indirect measure of the number of steps required for each task.
In stage 3, all relevant eye tracking measures are taken into account and evaluated. These data provide a basis for a cognitive evaluation and should help in the identification of those tasks that are the most demanding and result in an increased workload for the participant. In the FMBS K-ECDIS study the following measures were considered:
- total number of fixations per task,
- fixation frequency,
- fixation duration,
- location of fixations in a given area of the interface,
- gaze duration,
- number of blinks,
- duration of blinks.
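Several of the stage 3 measures can be derived from a per-task fixation list once each fixation carries a duration and an AOI label. The sketch below aggregates fixation count, frequency, mean duration and per-AOI distribution; the field names and input format are assumptions for illustration, not the BeGaze export format.

```python
# Illustrative per-task aggregation of the eye tracking measures listed
# above. fixations: list of dicts with "duration" (s) and "aoi" keys;
# task_time: task completion time in seconds (the stage 2 measure).
def summarize_task(fixations, task_time):
    count = len(fixations)
    per_aoi = {}
    for f in fixations:
        per_aoi[f["aoi"]] = per_aoi.get(f["aoi"], 0) + 1
    return {
        "fixation_count": count,
        "fixation_frequency": count / task_time if task_time else 0.0,
        "mean_fixation_duration": (
            sum(f["duration"] for f in fixations) / count if count else 0.0
        ),
        "fixations_per_aoi": per_aoi,  # location distribution over AOIs
    }
```

Blink counts and durations, also listed above, would be aggregated the same way from a separate blink-event stream.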