1 INTRODUCTION
Analyses of ship traffic are important, e.g. to estimate emissions of greenhouse gases, to monitor
fleet efficiency and to conduct studies on ship safety. Automatic Identification System (AIS) data
have become an integral part of these studies, as they provide positional and operational
information for a large part of the shipping fleet.
AIS is a communication system that uses the maritime Very High Frequency (VHF) bands to transmit
ship movement and technical data at specified intervals. This includes static data, such as the
ship's name, draught, destination and Estimated Time of Arrival (ETA), as well as dynamic data from
the ship's sensors, such as speed and position (ITU 2014). A typical use of AIS is to exchange
information between vessels that are in the same area, to automatically identify other ships and
avoid high-risk situations. It is also used in traffic monitoring, to provide guidance by vessel
traffic services (VTS) and by many other shore-side users. The development of AIS was a joint
project between the International Maritime Organization (IMO) and the International Association of
Marine Aids to Navigation and Lighthouse Authorities (IALA). The International Convention for
Safety of Life at Sea (SOLAS) states that all ships of 300 gross tonnage and upwards engaged in
international voyages, cargo ships of 500 gross tonnage and upwards not engaged on international
voyages, as well as all passenger ships built after 2002, or operated after 2008, should have an
AIS (IMO 2002). This essentially means that all larger ships engaged in global shipping should have
AIS equipment. National requirements will normally also require ships not covered by IMO
regulations to carry AIS transmitters. This means that more than 85 000 ships worldwide will
transmit AIS data (Mantell 2014).
AIS data are gathered by AIS receivers, which can be found on board ships, on buoys, on land (IALA
2011) and more recently on satellites (hereafter S-AIS).
Expanding the Possibilities of AIS Data with Heuristics
B.B. Smestad & B.E. Asbjørnslett
Norwegian University of Science and Technology, Trondheim, Norway
Ø.J. Rødseth
Sintef Ocean, Trondheim, Norway
ABSTRACT: Automatic Identification System (AIS) is primarily used as a tracking system for ships,
but with the launch of satellites to collect these data, new and previously untested possibilities
are emerging. This paper presents the development of heuristics for establishing the specific ship
type using information retrieved from AIS data alone. These heuristics expand the possibilities of
AIS data, as the specific ship type is vital for several transportation research cases, such as
emission analyses of ship traffic and studies on slow steaming. The presented method for developing
heuristics can be used for a wider range of vessels. These heuristics may form the basis of
large-scale studies on ship traffic using AIS data when it is not feasible or desirable to use
commercial ship data registers.
http://www.transnav.eu
the International Journal
on Marine Navigation
and Safety of Sea Transportation
Volume 11
Number 2
June 2017
DOI:10.12716/1001.11.02.10
Land-based AIS receivers can normally detect AIS messages up to 40-50 nautical miles offshore
(Skauen 2013), so ships further offshore will remain undetected by land-based AIS receivers. In
2005, researchers from the Norwegian Defence Research Establishment published the first study
investigating whether satellites could be used to gather AIS signals (Wahl 2005). In 2008, a
follow-up study by Høye et al. (2008) found that AIS signals could be detected by satellite-based
AIS receivers positioned at altitudes of up to 1000 km. However, since the AIS system was not
initially designed for space-based receivers, but rather to be a ship-to-ship communication system,
there were some problems. A satellite will have a much larger coverage area than AIS receivers were
designed for, which could lead to interference problems between the different ships' AIS signals.
According to the study, the result could be that some AIS messages would not be detected by the
satellite. In practice this leads to more reliable satellite coverage in areas with less traffic,
while heavily trafficked areas can have interference problems. In 2010, the Norwegian AIS satellite
AISSat-1 was launched. This satellite is in a sun-synchronous polar orbit at 630 km altitude
(Eriksen 2010). The satellite transmits the AIS messages it receives to Svalbard Ground Station at
each passing. Eriksen et al. (2010) state that over a time span of 24 hours, areas along the
equator are covered two to three times, while the High North and South are covered up to 15 times.
In 2014, AISSat-2 was launched to give extended coverage. This gave a higher update rate to the
Svalbard Ground Station, as well as a higher global detection rate.
The use of AIS data in studies on maritime transportation has become increasingly prevalent. Smith
et al. (2014) prepared a report as a part of the World Shipping Efficiency Indices project funded
by the International Council on Clean Transportation. The study combined global S-AIS data from
2011 with technical ship data from sources like Clarksons World Fleet Register, and the Second IMO
Greenhouse Gas Study (Buhaug 2009). The S-AIS data provided operational characteristics, such as
speed and loading condition. In addition, estimates of the distance travelled were derived from the
S-AIS data. Data from Clarksons World Fleet Register provided technical specifications, such as the
ship type (for instance LNG tanker or crude oil tanker) for each individual ship.
The Third Greenhouse Gas (GHG) study by Smith et al. (2014) had an advantage over the preceding
studies, as it could utilize S-AIS data. These data were used to get more precise activity measures
and better emissions estimates for each ship. The per-ship estimates were then aggregated to the
total emissions for each ship type. In the previous study, emissions were estimated by using the
annual average activity for the different ship types.
Categorizing ships into ship type and size category is vital to perform studies on operational
efficiency and greenhouse gas emissions. Knowing the design speed is necessary for developing
speed-relative fuel consumption models for ships, where the design speed is the speed giving the
"optimal" trade-off between speed and fuel consumption. The design speed depends, among other
things, on the block coefficient of the ship, which in turn is largely given by the ship type.
Previous studies, such as Smith et al. (2014), have used commercial vessel databases to retrieve
the ship type for each specific ship in the study. However, using external databases to retrieve
the ship type can be costly as these databases require a subscription. On the other hand, manual
retrieval of the ship type from open databases can be time consuming. The combination of these two
factors may inhibit studies on maritime transportation using estimation based on AIS data.
In the SESAME Straits project (SESAME 2017), the challenge was to give guidance to ships headed for
and in the Straits of Malacca and Singapore and to estimate possible fuel savings by suggesting
more efficient speeds to the ships. A problem, however, is to find enough information about the
ships to do a reasonable estimation of fuel use and fuel savings for different speeds. This
information can be bought, but in just five days, more than 3000 different ships were recorded by
the AIS stations in the area. As the market for such services is limited and quite cost sensitive,
it was not very attractive to buy the information.
The research questions that emerged when faced with these challenges were: How well can AIS data
alone identify the ship type and size? Can heuristics for identifying the ship type for any ship be
constructed? The objective of this study was to establish heuristics for identifying the ship type
for a large proportion of the world fleet, using S-AIS data.
The method for constructing the heuristics is outlined in Section 2, while the heuristics
parameters can be found in Section 3. The performance of these heuristics is provided in Section 4,
while the results and the validity of the heuristics are discussed in Section 5. A conclusion is
given in the final section.
2 METHOD
Satellite AIS data spanning the time period of May 1st 2014 to September 15th 2014 were retrieved.
The S-AIS data had been collected using the two satellites AISSat-1 and AISSat-2, and were provided
by the Norwegian Coastal Administration for use in the SESAME Straits research project. AISSat-2
data were only available after its launch in July 2014.
These S-AIS data included static and/or dynamic AIS messages for 85,108 ships, identified by unique
MMSI numbers. 43,671 of these ships had both dynamic and static data. Mantell et al. (2014) stated
that the total world fleet consisted of 88,483 ships as of May 2014. Approximately 95% of the world
fleet is present in our data, and about half of the world fleet is represented with both dynamic
and static data. These S-AIS data are shown as group A in Figure 1.
We developed heuristics for a selection of ship types with high relevance to international shipping
(Table 1). This selection is in line with the selection in other studies such as Smith et al.
(2014).
Table 1. AIS vessel groups, ship types and sizes in this study
____________________________________________________________________
AIS vessel group   Ship type              Ship size
____________________________________________________________________
Tankers            LNG and LPG Carriers   General, Q-Flex and Q-Max
                   Oil Tankers            UL & VLCC
Cargo ships        Container vessels      Panamax
                   Bulk carriers          Panamax
____________________________________________________________________
The Clarksons Group provides a database where a selection of vessels of each ship type and size
category are listed by the ship's name (Clarksons 2015). These data are shown in Figure 1 as group
B. The ships are only identified by their name, and not by a more unique identifier such as their
IMO or MMSI number. Vessel characteristics were also retrieved from the S-AIS data by matching the
name of the ship from the vessel database to the name registered in the S-AIS data. The ships that
were present in both the S-AIS data and the vessel database form a candidate group, the
intersection of the two groups, which is shown as group C in Figure 1.
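The name-based matching that produces group C can be sketched as follows. This is a minimal illustration and not the authors' implementation: the normalisation rule, the MMSI values and the vessel name NORDIC STAR are assumptions made for the example.

```python
def normalise_name(name: str) -> str:
    """Upper-case and collapse whitespace (an assumed normalisation rule)."""
    return " ".join(name.upper().split())


def candidate_group(register_names, ais_names_by_mmsi):
    """Return {mmsi: name} for AIS ships whose name also appears in the register."""
    wanted = {normalise_name(n) for n in register_names}
    return {
        mmsi: name
        for mmsi, name in ais_names_by_mmsi.items()
        if normalise_name(name) in wanted
    }


# Hypothetical inputs: a register list (group B) and AIS names keyed by MMSI (group A).
register = ["Santa Regina", "Orissa"]
ais = {111: "SANTA REGINA", 222: "santa  regina", 333: "NORDIC STAR"}

group_c = candidate_group(register, ais)
print(group_c)
```

Two different MMSI numbers carrying the same name both end up in the candidate group, which is why a name-only match has to be followed by a cleaning step.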
Figure 1. The process used for constructing the heuristics.
The vessels are matched between the Clarksons vessel sheets and the S-AIS data based on their name,
and not their unique IMO number, so there is a possibility that ships from other ship classes, with
the same ship name, are included in the candidate group. To mitigate this source of errors, a data
cleaning process was required. In the data cleaning, ships with dimensions outside the expected
interval for the ship type in question were removed. For instance, cargo ships and tankers are
typically classified into different size categories, which often correspond to the maximum
dimensions of important seaways and ports, such as the Panama Canal and the Suez Canal. If a ship
was categorized as a Panamax ship in the Clarksons vessel database, but had reported a width or
draught exceeding the set of maximum dimensions in the Panama Canal in the S-AIS data, it was not
included in the training group.
The initial version of the method used maximum observed speed as one of the parameters for
classification. Early testing of this heuristic showed that some ships were misidentified. As an
example, a 274 m long and 48 m broad oil tanker had a maximum observed speed of 20 knots. Because
of the relatively high speed, this vessel was classified as an LNG carrier. However, speed
recordings from AIS data are most commonly speed over ground, and not speed relative to the water.
These recordings may thus be a result of particularly favorable wind and current conditions, and
not necessarily errors in speed data. To find the frequency of the different speed recordings, all
reported speeds were bucketed in one-knot intervals. Out of 165 speed recordings for this vessel,
there was only one record of the maximum recorded speed of 20 knots. The highest speed, amongst
those with the highest frequency, was 14 knots. The data showed that the vessel had this speed on
ten occasions. To avoid these rare occurrences of high speed, a new constraint was put in the
heuristics: for a maximum speed to be valid, the vessel should have ten or more AIS records of
having that speed.
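The ten-record rule can be illustrated with a short sketch. It assumes that bucketing means rounding each speed down to a whole knot, which the text does not specify, and it uses synthetic speed reports shaped like the tanker example rather than the actual recordings.

```python
import math
from collections import Counter


def valid_max_speed(speeds_knots, min_records=10):
    """Highest one-knot bucket holding at least `min_records` observations.

    Returns None if no bucket is populated often enough.
    """
    buckets = Counter(math.floor(s) for s in speeds_knots)
    frequent = [knot for knot, count in buckets.items() if count >= min_records]
    return max(frequent) if frequent else None


# Synthetic data: 165 reports in total, a single 20-knot report,
# ten reports in the 14-knot bucket, the rest slower.
speeds = [20.0] + [14.2] * 10 + [12.5] * 154

raw_max = max(speeds)                  # 20.0, driven by one outlier
supported_max = valid_max_speed(speeds)  # 14, backed by ten records
print(raw_max, supported_max)
```

With this rule the single 20-knot report no longer decides the classification, while a speed that is reported repeatedly still counts.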
After the data had been cleaned, we used the resulting ships as a training group for the
heuristics, shown by group T in Figure 1. Using this training group, common dimensional traits and
operational characteristics for each ship type were derived by inspection, and ultimately used to
form the heuristics. This was repeated for every ship type in Table 1. The process of making
heuristics for Panamax bulk carriers is used as an example and outlined below.
The heuristics, each consisting of constraints on dimensions, draught, speed and AIS vessel group,
were applied on the full set of S-AIS data (group A) as a performance test. The performance of each
heuristic was checked by manually confirming the specific ship type of all ships classified by the
heuristic, using online ship databases. These are databases where the ship type of a single ship
can be found using the IMO or MMSI number. The accuracy of a heuristic was defined as the number of
ships correctly identified by ship type, divided by the total number of ships identified.
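As a small worked example of this accuracy definition, the sketch below computes the share of ships picked by a heuristic whose manually verified type agrees with the heuristic. The ship identifiers and verified types are invented for illustration. In classification terms this measure corresponds to precision, since ships the heuristic failed to pick do not enter the calculation.

```python
def heuristic_accuracy(predicted_type, verified_types):
    """Correctly identified ships divided by all ships the heuristic identified.

    `verified_types` maps each identified ship to its manually confirmed type.
    """
    if not verified_types:
        raise ValueError("the heuristic identified no ships")
    correct = sum(1 for t in verified_types.values() if t == predicted_type)
    return correct / len(verified_types)


# Hypothetical verification result for four ships picked by a bulk carrier heuristic.
verified = {
    "ship_a": "bulk carrier",
    "ship_b": "bulk carrier",
    "ship_c": "container vessel",
    "ship_d": "bulk carrier",
}
accuracy = heuristic_accuracy("bulk carrier", verified)
print(accuracy)
```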
2.1 Developing heuristics for Panamax Bulk Carriers
The candidate group for the heuristic training group was made by identifying all Panamax bulk
carriers present in both the Clarksons vessel database and the S-AIS data. Out of the 2459
Panamax-sized bulk carriers in the Clarksons vessel database at the time of retrieval (spring
2015), 2200 ships were also present in the S-AIS data.
2.1.1 Data Cleaning
2.1.1.1 Erroneous ship dimensions
The breadth was required not to exceed 34 m, as the maximum width of the Panama Canal is 33.5 m.
The extra 0.5 m was allowed, as some Panamax bulk carriers seemed to be registered with a width of
34 m in the S-AIS data, probably due to a rounding error. This constraint is illustrated by the top
horizontal line in Figure 2. Many vessels in the candidate group exceeded this breadth. The fact
that seemingly Panamax vessels could exceed this constraint can be attributed to the lack of a
unique identifier in the vessel sheets, as described earlier. In other words, these may have been
non-Panamax vessels having the same name as the Panamax vessels in the Clarksons vessel sheets. To
ensure that only Panamax vessels were present in the training group, an additional breadth
constraint of minimum 30 m was added. Vessels below this breadth would fall into other ship
categories. This constraint is illustrated by the bottom horizontal line in Figure 2. These breadth
requirements reduced the candidate group to 1668 ships.
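The breadth cleaning step amounts to a simple interval filter, sketched below. Treating both bounds as inclusive is an assumption, made because ships registered with a breadth of 34 m were meant to be kept. The candidate ships and their breadths are invented for illustration.

```python
MIN_BREADTH_M = 30.0  # lower bound: narrower ships fall into other categories
MAX_BREADTH_M = 34.0  # 33.5 m canal width plus the 0.5 m rounding allowance


def within_panamax_breadth(breadth_m: float) -> bool:
    """True if the reported breadth lies inside the allowed Panamax interval."""
    return MIN_BREADTH_M <= breadth_m <= MAX_BREADTH_M


# Hypothetical candidate group: ship label -> breadth reported in the S-AIS data.
candidates = {"ship_a": 32.0, "ship_b": 34.0, "ship_c": 45.0, "ship_d": 28.0}
kept = {name: b for name, b in candidates.items() if within_panamax_breadth(b)}
print(sorted(kept))
```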
Figure 2. Length and breadth for the candidate group of Panamax-sized bulk carriers. The horizontal
lines indicate the maximum and minimum allowed breadth.
After the breadth constraints were enforced, three vessels in the resulting group had a length over
250 m. The dimensions of these three vessels were manually inspected in an open ship database, to
check for any errors. The longest ship, Vishva Anand, actually had a length of 229 m, not the 332 m
it was recorded with in the S-AIS data. The second longest ship was a container vessel
misidentified as the bulk carrier Santa Regina, as they shared their name. The last vessel was the
259 m long bulk carrier Orissa. This is an exceptionally long bulk carrier, with a breadth of only
32 m. Since these three vessels were either wrongly registered or exceptionally large, they were
excluded. After these exclusions, 1665 vessels remained. The rest of the vessels in the candidate
group had reasonable sizes, and we had no reason to suspect that their dimensions were erroneous.
These ships could now be used as a training group for the heuristics.
2.1.2 Heuristic training
2.1.2.1 Maximum speed constraint
Because of the high utilization of the ship's volume in bulk carriers, it was expected that the
maximum speed as registered in the S-AIS data would be lower than for other dimensionally similar
vessels, such as container vessels. As many as 92% of the container vessels had an observed maximum
speed of 15.9 knots or more, while 92% of the Panamax bulk carriers had an observed maximum speed
of 15 knots or less. There was a group of ships reporting speeds up to 18 knots, which can be seen
in Figure 3. This can be due to especially favorable wind and current conditions. It can also be
due to other ships being misidentified as bulk carriers. Because of these findings, the maximum
recorded speed allowed in the heuristic was set to 15 knots.
Figure 3. Maximum speed and length of the training group of Panamax-sized bulk carriers.
2.1.2.2 Draught constraint
Bulk carriers typically carry unpacked dry cargo. The main cargo types are coal, iron ore, cereals,
sugar or cement. They have a high utilization of their volume, as the cargo is held in several
transverse cargo holds over the full ship breadth. Because of the high utilization of the ship's
volume, a high maximum draught and large differences between maximum (when the ship is fully
loaded) and minimum (when the ship sails without cargo) draught are expected.
Figure 4 shows the draught, length and breadth of the ships in the training group. There was no
apparent correlation between these variables. However, all of the ships in the training group had a
maximum draught above 5 m, so this constraint was included in the heuristic.
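The constraints derived so far can be combined into a single check, sketched below under stated assumptions. The final heuristic parameters are given in Section 3 and are not reproduced here. Reusing the breadth interval from the data cleaning step, the field names, the "cargo" group label and the example ships are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShipSummary:
    """Per-ship values aggregated from S-AIS messages (hypothetical field names)."""
    ais_group: str             # AIS vessel group, e.g. "cargo" or "tanker"
    breadth_m: float           # reported breadth
    max_valid_speed_kn: float  # maximum speed backed by ten or more records
    max_draught_m: float       # largest reported draught


def looks_like_panamax_bulk_carrier(ship: ShipSummary) -> bool:
    """Apply the constraints discussed above; a sketch, not the published heuristic."""
    return (
        ship.ais_group == "cargo"
        and 30.0 <= ship.breadth_m <= 34.0   # assumed: breadth interval from cleaning
        and ship.max_valid_speed_kn <= 15.0  # maximum speed constraint
        and ship.max_draught_m > 5.0         # draught constraint
    )


slow_cargo = ShipSummary("cargo", 32.2, 14.0, 14.5)
fast_cargo = ShipSummary("cargo", 32.2, 18.0, 12.0)
print(looks_like_panamax_bulk_carrier(slow_cargo))
print(looks_like_panamax_bulk_carrier(fast_cargo))
```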
The scatterplot in Figure 5 shows the lack of apparent correlation between the maximum change in
draught over the recording period and either breadth or length.