1 INTRODUCTION
Analysis of global trade and markets is usually based on official customs data, which are relatively macroscopic and lag behind events, and are therefore suited to macro-level analysis (see Fu et al. 2006; Liu 2014; Chen 2015). The alternative is to aggregate inter-regional seaborne trade flows by tracking vessel movements over time (see Nossum 1996; Haji S 2013; Adland R et al. 2017), which is microscopic and flexible. According to Nossum, Fearnleys, a shipbroking firm, summarized the shipping information provided by shipbrokers to produce reviews of the world bulk trades. With the emergence of AIS technology, the real-time broadcasting of AIS messages has made shipping data more timely, convenient and objective. AIS data were used to summarize and analyze the global container trade circulation in Haji S (2013) and the global crude oil trade circulation in Adland R (2017). Both sets of statistics were compared with official data, and both studies mentioned the impact of the difficulty in determining the actual shipment size of ships.
AIS data do not include the true shipment size of a ship; the only information related to shipment size is the deadweight ton, which is the shipment size when the ship is fully loaded. In actual transportation, however, ships often sail empty or only partly loaded, and it is difficult to obtain the centimeter deadweight ton that would reflect the actual shipment size. Ren et al. (2017) established a load-state recognition system for ships based on a neural network. However, that study addresses inland ships and recognizes only the load state, so it cannot give an accurate estimate of the shipment size. In this paper, a BP neural network is used to estimate the true shipment size of ships, so as to improve the accuracy of AIS-based trade volume estimates and marine market analysis. The method is aimed at iron ore carriers engaged in import and export trade.
Estimation of Shipment Size in Seaborne Iron Ore Trade
X. Zhou & Q. Hu
Shanghai Maritime University, Shanghai, China
ABSTRACT: Shipment size is important in AIS-based trade volume estimates but is not available from AIS data. A method for estimating shipment size based on AIS (Automatic Identification System) data and a BP neural network is proposed. The ship's length, width, designed draught, current draught and deadweight ton are the input parameters and the actual shipment size of the ship is the output value; the BP neural network is trained to estimate the actual shipment size of iron ore carriers. The AIS data are then used to calculate the iron ore trade volume in 2018. Compared with customs data, the annual error of the import volume of China is less than 0.5%. The result shows that the proposed method is accurate and practical.
http://www.transnav.eu
the International Journal
on Marine Navigation
and Safety of Sea Transportation
Volume 13
Number 4
December 2019
DOI:10.12716/1001.13.04.11
2 DATA SOURCE AND PARAMETER ANALYSIS
2.1 Data source
All AIS data used in this paper are provided by ShangHai Maili Marine Technology Co., Ltd (http://www.hifleet.com) and are stored after being received and decoded.
The ships involved in this paper are all ocean-going bulk carriers used for international iron ore trade, and information on such ships is included in Lloyd's maritime archives. Therefore, to guard against AIS data being entered incorrectly by sailors, this paper uses Lloyd's maritime archives to improve the static information in the AIS data.
The actual shipment sizes of the ships used in this paper are derived from the Lineup of the port agency LBH, which includes ship name, departure port, departure time, shipment size, destination, etc. To exclude the impact of other factors, the selected ships are all bulk carriers sailing directly from Australia to China, and ships unloading at multiple ports have been excluded.
The Lloyd's maritime files are matched with the ship AIS data by Maritime Mobile Service Identity (MMSI) and ship name to obtain more complete and accurate static information (name, call sign, MMSI, IMO, vessel type, designed draft, length, width, etc.) and dynamic information (longitude, latitude, speed, draft, etc.). The true shipment size of each ship is then matched from the Lineup through the name, departure port and departure time.
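The two matching steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the field names (`mmsi`, `name`, `port`, `departure`, `shipment_t`), the name normalization, and the choice to compare departure times at day resolution are assumptions made for the example, not details given in this paper.

```python
def norm_name(name):
    """Normalize a ship name for matching: upper-case, collapse whitespace."""
    return " ".join(name.upper().split())


def merge_static(ais_records, register_records):
    """Overwrite AIS static fields with register values when MMSI and name agree."""
    register = {(r["mmsi"], norm_name(r["name"])): r for r in register_records}
    merged = []
    for rec in ais_records:
        ref = register.get((rec["mmsi"], norm_name(rec["name"])))
        # Register values take precedence; unmatched records are kept unchanged.
        merged.append({**rec, **ref} if ref else dict(rec))
    return merged


def attach_shipment(ais_records, lineup_records):
    """Attach the Lineup shipment size via (name, departure port, departure day)."""
    lineup = {
        (norm_name(r["name"]), r["port"], r["departure"].date()): r["shipment_t"]
        for r in lineup_records
    }
    matched = []
    for rec in ais_records:
        key = (norm_name(rec["name"]), rec["port"], rec["departure"].date())
        if key in lineup:  # voyages without a Lineup entry are dropped
            matched.append({**rec, "shipment_t": lineup[key]})
    return matched
```

Keying both lookups on normalized names keeps small spelling differences (case, extra spaces) from breaking the match; stricter or fuzzier rules may be needed on real data.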
2.2 Characteristic extraction
AIS data include static information, dynamic information and voyage-related information. There is no obvious linear relationship among these kinds of information, so appropriate information must be selected for nonlinear modeling in order to find the nonlinear mapping between a ship's AIS messages and its actual shipment size. The BP neural network has clear advantages in modeling nonlinear relations and is widely used in information fusion (see Hu X et al 2011), track prediction (see Gan S et al 2016; Rong Zhen 2017), information recognition (see Zhang et al 2009; Zhu et al 2012) and other studies.
After analysis and screening, the following features are selected:
1 Length and width of vessel. These are major factors affecting the shipment size because they are closely related to the ship's cargo capacity.
2 Deadweight ton. Deadweight ton = displacement of the fully loaded ship - weight of the empty ship. It reflects the shipment size when the ship is fully loaded.
3 Designed draft. It is the draft when the ship is fully loaded.
4 Current draft. It changes when the shipment size changes.
The ratio of current draft to designed draft and the ratio of shipment size to deadweight ton are theoretically positively correlated.
The above five features (length and width counted separately) are taken as the inputs and the true shipment size as the output of the BP neural network, and the mapping between the features and the true shipment size is obtained through supervised learning.
2.3 Data processing and storage
The AIS data are matched with the Lineup data through the ship name, departure port and departure time to obtain the actual shipment size. The AIS data with actual shipment size are stored in a MySQL database, with MMSI set as the primary key for convenient querying and updating. The storage structure is shown in Figure 1.
Data preprocessing:
1 Dynamic data. The draft of the ship is entered manually by the crew, so it may be entered incorrectly, updated late, or left at the value for the previous voyage. For records with an unreasonable draft, the ship's historical track is looked up on http://www.hifleet.com through its MMSI and its draft changes after departure are observed. If the draft changes before the ship arrives at the next port, the later value is used.
2 Static data. The AIS data are matched with Lloyd's maritime archives through MMSI and ship name, so as to improve the accuracy of the AIS static data. This preprocessing ensures the accuracy and integrity of the training data.
Figure 1. Data storage sample
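The draft-correction rule for dynamic data can be illustrated with a short sketch. The paper does not state what makes a draft "unreasonable", so the criterion used here (non-positive, or more than 5% above the designed draft) and the function names are hypothetical choices for illustration only.

```python
def is_unreasonable(draft, design_draft, tolerance=1.05):
    """Assumed criterion: non-positive, or above the designed draft by more than 5%."""
    return draft <= 0 or draft > design_draft * tolerance


def corrected_draft(reports, departure, arrival, design_draft):
    """Return the last plausible draft reported between departure and arrival.

    reports: iterable of (timestamp, draft) pairs; timestamps only need to be
    mutually comparable (numbers or datetime objects).
    """
    voyage = sorted((t, d) for t, d in reports if departure <= t < arrival)
    plausible = [d for _, d in voyage if not is_unreasonable(d, design_draft)]
    # "Choose the latter": the most recent plausible value on the voyage wins.
    return plausible[-1] if plausible else None
```

For example, a ship that still broadcasts a stale ballast draft at departure but updates it mid-voyage would be assigned the updated value.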
3 ESTIMATION MODEL OF SHIPMENT SIZE BASED ON BP NEURAL NETWORK
3.1 Network structure
The BP neural network is a network trained with the back-propagation learning algorithm (see Yu 2011), proposed by the team of scientists led by Rumelhart and McClelland in 1986. It can store a large number of input-output mapping relations. Its learning rule is the steepest descent method: the weights and thresholds of the network are continuously adjusted by back propagation to minimize the network error. A BP neural network consists of a series of simple, closely interconnected units. Its topology includes an input layer, a hidden layer and an output layer. Here we build a BP neural network with one hidden layer, five input layer nodes and one output layer node. The structure is shown in Figure 2.
$x_1, \ldots, x_5$ are the inputs: length, width, designed draft, current draft and DWT (deadweight ton); $y$ is the output, the true shipment size; $\omega_{ij}$ is the weight from node $i$ to node $j$, and $\omega_{jk}$ is the weight from node $j$ to node $k$. Each neuron node has a certain number of inputs and a unique output.
Figure 2. BP neural network structure
3.2 Data standardization
A BP neural network depends strongly on normalization. To avoid the influence of differences in magnitude among the input data, the data must be normalized. Common methods include min-max standardization and the z-score method. The former is adopted here. The conversion function is as follows:
$$x^* = \frac{x - \min}{\max - \min} \tag{1}$$
$x$ is the sample data, $\min$ is the minimum value of the sample data, $\max$ is the maximum value of the sample data, and $x^*$ is the converted data. Standardization maps the original data to values between 0 and 1. After prediction, the output must be denormalized to recover values with practical meaning.
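Equation (1) and its inverse can be written as a pair of small helpers. This is a minimal sketch; the function names are illustrative, and it assumes the minimum and maximum are taken from the training sample and reused when denormalizing the network output.

```python
def fit_min_max(values):
    """Return (min, max) of the sample; both are needed again to denormalize."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for constant data")
    return lo, hi


def normalize(x, lo, hi):
    """Equation (1): map x into [0, 1]."""
    return (x - lo) / (hi - lo)


def denormalize(x_star, lo, hi):
    """Inverse of equation (1): recover the value in original units."""
    return x_star * (hi - lo) + lo
```

Each input feature and the output would be scaled with its own `(lo, hi)` pair, since deadweight tons and drafts differ by several orders of magnitude.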
3.3 Hidden layer node
Too few nodes reduce the accuracy of the network, while too many nodes may lead to overfitting. The principle for determining the number of hidden layer nodes is to take as compact a structure as possible while meeting the accuracy requirement. Here the number of nodes is chosen with an empirical expression (see Zhang et al 2008):
$$M = \sqrt{n + m} + a \tag{2}$$
$M$ is the number of hidden layer nodes, $n$ is the number of input layer nodes, $m$ is the number of output layer nodes, and $a$ is an integer between 1 and 10. Here $n$ is 5 and $m$ is 1, so the optimum range of $M$ is 3~12.
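The candidate range can be checked numerically. The sketch below evaluates equation (2) for every admissible $a$; truncating the result to an integer is an assumption made here so that the output matches the stated range of 3~12.

```python
import math


def hidden_node_candidates(n_inputs, n_outputs, a_values=range(1, 11)):
    """Candidate hidden-layer sizes M = sqrt(n + m) + a, truncated to integers."""
    base = math.sqrt(n_inputs + n_outputs)
    return sorted({int(base + a) for a in a_values})


candidates = hidden_node_candidates(5, 1)  # sqrt(6) is about 2.45
```

With five inputs and one output, `candidates` runs from 3 to 12, and each value can then be tried in turn to find the most compact network that meets the accuracy requirement.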
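3.4 Network training
The steps of the BP algorithm are as follows (see Wang 2013):
Step 1 Initialize the parameters, such as the network structure, the number of layers, the number of nodes in each layer, and the weights and thresholds, and set an appropriate learning rate and transfer function.
Step 2 Calculate the hidden layer output $H$. $n$ is the number of input layer nodes, $l$ is the number of hidden layer nodes, $\theta_j$ is the threshold of hidden layer neuron $j$, and $f(x)$ is the transfer function of the hidden layer, the sigmoid function.
$$H_j = f\left(\sum_{i=1}^{n} \omega_{ij} x_i + \theta_j\right), \quad j = 1, 2, \ldots, l \tag{3}$$
$$f(x) = \frac{1}{1 + e^{-x}} \tag{4}$$
Step 3 Calculate the output $Y$. $m$ is the number of output layer nodes, $\theta_k$ is the threshold of output layer neuron $k$, and $\varphi(x)$ is the transfer function of the output layer.
$$Y_k = \varphi\left(\sum_{j=1}^{l} H_j \omega_{jk} + \theta_k\right), \quad k = 1, 2, \ldots, m \tag{5}$$
$$\varphi(x) = x \tag{6}$$
Step 4 Error calculation. $y_k$ is the desired output and $Y_k$ is the actual output.
$$e_k = y_k - Y_k, \quad k = 1, 2, \ldots, m \tag{7}$$
Step 5 Update the weights and thresholds, where $\eta$ is the learning rate.
$$\omega_{ij} = \omega_{ij} + \eta H_j (1 - H_j)\, x_i \sum_{k=1}^{m} \omega_{jk} e_k, \quad i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, l \tag{8}$$
$$\omega_{jk} = \omega_{jk} + \eta H_j e_k, \quad j = 1, 2, \ldots, l;\ k = 1, 2, \ldots, m \tag{9}$$
$$\theta_j = \theta_j + \eta H_j (1 - H_j) \sum_{k=1}^{m} \omega_{jk} e_k, \quad j = 1, 2, \ldots, l \tag{10}$$
$$\theta_k = \theta_k + \eta e_k, \quad k = 1, 2, \ldots, m \tag{11}$$
Step 6 Recalculate the error with the updated weights and thresholds. If the error is not yet below the set error, return to Step 2 and repeat until it is.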
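Steps 1 to 6 can be sketched as a small pure-Python class. This is an illustrative sketch rather than the implementation used in this paper: it treats the thresholds as additive terms, updates after every sample, initializes weights uniformly in [-0.5, 0.5], and uses a mean squared error stopping test, all of which are assumptions. The names `BPNetwork`, `train_step` and `fit` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class BPNetwork:
    """One hidden layer with sigmoid units and a linear output layer."""

    def __init__(self, n, l, m, eta=0.1, seed=0):
        rng = random.Random(seed)
        self.n, self.l, self.m, self.eta = n, l, m, eta
        # Step 1: initial weights (w_ih[i][j], w_ho[j][k]) and thresholds.
        self.w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(l)] for _ in range(n)]
        self.w_ho = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(l)]
        self.th_h = [0.0] * l
        self.th_o = [0.0] * m

    def forward(self, x):
        # Step 2: hidden output; Step 3: linear network output.
        H = [sigmoid(sum(self.w_ih[i][j] * x[i] for i in range(self.n)) + self.th_h[j])
             for j in range(self.l)]
        Y = [sum(H[j] * self.w_ho[j][k] for j in range(self.l)) + self.th_o[k]
             for k in range(self.m)]
        return H, Y

    def train_step(self, x, y):
        H, Y = self.forward(x)
        e = [y[k] - Y[k] for k in range(self.m)]  # Step 4
        # Back-propagated term, computed before any weight is changed.
        back = [H[j] * (1 - H[j]) * sum(self.w_ho[j][k] * e[k] for k in range(self.m))
                for j in range(self.l)]
        # Step 5: update weights and thresholds.
        for j in range(self.l):
            for i in range(self.n):
                self.w_ih[i][j] += self.eta * back[j] * x[i]
            for k in range(self.m):
                self.w_ho[j][k] += self.eta * H[j] * e[k]
            self.th_h[j] += self.eta * back[j]
        for k in range(self.m):
            self.th_o[k] += self.eta * e[k]
        return sum(err * err for err in e)

    def fit(self, samples, target_error=1e-4, max_epochs=5000):
        # Step 6: repeat until the mean squared error falls below the set error.
        mse = float("inf")
        for _ in range(max_epochs):
            mse = sum(self.train_step(x, y) for x, y in samples) / len(samples)
            if mse < target_error:
                break
        return mse
```

For the configuration described in this paper, the network would be created as `BPNetwork(n=5, l=4, m=1)` and trained on normalized feature vectors paired with normalized shipment sizes.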
3.5 Simulation and results
A total of 1650 records of Australia-China and Brazil-China iron ore ships are fed into the BP neural network. The number of hidden layer nodes is 4. Of these records, 1600 are randomly selected as training data and 50 as testing data. Training and testing are carried out according to the above process. The test results are shown in Figures 3 and 4.
Figure 3. Comparative analysis of prediction results
Figure 4. Error percentage
Figure 3 compares the predicted results with the expected results of the neural network. The graph shows that the predictions follow the expected values closely. To quantify the comparison, the coefficient of determination $R^2$ is used to express the goodness of fit of the model: the closer the value is to 1, the better the fit. Figure 4 shows the error percentage of the predicted results. In this test, $R^2$ is above 0.99 and the error is concentrated within 0.04%. The result is satisfactory.
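The two evaluation measures can be computed as follows. The paper does not spell out its formula for $R^2$, so the standard definition (one minus the ratio of residual to total sum of squares) is assumed here, and the error percentage is assumed to be the signed relative error times 100.

```python
def r_squared(actual, predicted):
    """Coefficient of determination, standard definition (assumed)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def error_percent(actual, predicted):
    """Signed relative error of each prediction, in percent."""
    return [(p - a) / a * 100.0 for a, p in zip(actual, predicted)]
```

A perfect prediction gives `r_squared` of 1, while always predicting the sample mean gives 0, which is why values close to 1 indicate a good fit.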
4 EXAMPLE VERIFICATION
4.1 Verification process
Figure 5. Model accuracy testing process
To verify the feasibility of the proposed method, an experiment was carried out (Figure 5). The AIS data of bulk carriers leaving Australian ports for China in February 2018 are matched with the Lineup of iron ore ships in Australian ports in February 2018, giving a data set of iron ore ships that left Australia in February. The data set contains the ships' identity information (MMSI, name), the model input parameters (length, width, designed draft, current draft, deadweight ton) and the actual shipment size. Statistics and analysis are performed in the following four experimental scenarios:
Scenario 1 Preprocess the AIS data of iron ore ships to repair incomplete and inaccurate data. Replace the actual shipment size with the deadweight ton, and sum the deadweight tons of all ships week by week to obtain the 1st trade volume of Australia-China in February.
Scenario 2 Preprocess the AIS data of iron ore ships. Feed the input parameters (length, width, designed draft, current draft, deadweight ton) into the well-trained network. Replace the actual shipment size with the network's output, and compile the weekly statistics to obtain the 2nd trade volume of Australia-China in February.
Scenario 3 Select the AIS data of iron ore ships without preprocessing. Feed the input parameters (length, width, designed draft, current draft, deadweight ton) into the well-trained network. Replace the actual shipment size with the network's output, and compile the weekly statistics to obtain the 3rd trade volume of Australia-China in February. With a large amount of AIS data, preprocessing takes a great deal of time, so this scenario is set up to test the robustness of the model and observe whether it still predicts well when data are missing.
Scenario 4 Compile the weekly statistics with the true shipment sizes to obtain the 4th trade volume of Australia-China in February.
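The weekly statistics shared by all four scenarios amount to grouping voyages by week and summing one column. The sketch below assumes ISO calendar weeks and hypothetical field names (`departure`, `dwt`, `predicted`, `actual`); the paper does not specify how weeks are delimited.

```python
from collections import defaultdict


def weekly_totals(voyages, field):
    """Sum `field` over voyages, grouped by (ISO year, ISO week) of departure."""
    totals = defaultdict(float)
    for voyage in voyages:
        year_week = tuple(voyage["departure"].isocalendar()[:2])
        totals[year_week] += voyage[field]
    return dict(sorted(totals.items()))
```

Scenario 1 would call `weekly_totals(voyages, "dwt")`, Scenarios 2 and 3 would use the network output column computed from preprocessed and raw data respectively, and Scenario 4 would use the true shipment size.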
4.2 Verification result
Fifty records are randomly selected, and the shipment size of every ship under the different experimental scenarios is shown in Figure 6.
Figure 6. Shipment size under four experimental scenarios
As shown in Figure 6, when the data are preprocessed they are relatively complete, and the network predictions agree well with the actual shipment sizes. With incomplete data, a few results are unsatisfactory, but most of the network predictions are good, indicating that the model has a certain degree of robustness. The deadweight ton is generally larger than the true shipment size.
The statistics of the Australia-China iron ore seaborne trade volume in February 2018 are as follows:
Table 1. Statistics under different experimental scenarios (million tons)
Calculate the prediction error $e$ according to the following formula ($x_p$ is the predicted value, $x_t$ is the true value):
$$e = \frac{x_p - x_t}{x_t} \tag{12}$$
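Equation (12) can be applied to the scenario totals as in the sketch below. The dictionary keys and the choice of the true-shipment-size total (Scenario 4) as the reference are illustrative assumptions about how the comparison is organized.

```python
def prediction_error(predicted, true):
    """Equation (12): signed relative error of a predicted volume."""
    return (predicted - true) / true


def scenario_errors(volumes, reference="scenario_4"):
    """Relative error of every scenario total against the reference total."""
    true = volumes[reference]
    return {name: prediction_error(value, true)
            for name, value in volumes.items() if name != reference}
```

A negative result means the estimate falls short of the true volume, and a positive result means it overshoots.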
Table 2. Statistics error comparison under different experimental scenarios
The statistical results show that the statistical error is smallest, -0.93%, in the case of complete data; in the case of incomplete data, the statistical error is -1.88%. Both are smaller in magnitude than the statistical error obtained with deadweight tons, which is 3.34%.
5 PRACTICAL APPLICATION
After the actual shipment size is predicted by the well-trained network, the AIS data are summarized to obtain the seaborne trade volume of iron ore in 2018:
Figure 7. Iron ore circulation in 2018
(Line transparency indicates the trade volume. The major trading countries are labeled, and the font size of each country name is related to its trade volume. Lines are links between countries where trade takes place, not vessels' tracks.)
Table 3. The top ten trade routes
Table 4. The top 10 countries in terms of net exports and net imports
To verify the accuracy of the data, the import volume of China (mainland) is compared with the customs statistics. According to customs statistics, the annual import volume is 1064.78 million tons, and the statistical error of the AIS-based statistics relative to this figure is 0.5%. Figure 8 shows the monthly comparison.
Figure 8. China's monthly iron ore imports in 2018
The primary reasons for the difference between the two statistical results are:
1 In the AIS-based statistics, a ship is included in the import statistics as soon as it berths. This time differs slightly from the time used in the customs statistics.
2 Dropout of the AIS signal.
6 CONCLUSION
This paper proposes a method for estimating the shipment size of iron ore carriers based on AIS data and a BP neural network. A total of 1650 records were used to train and test the neural network. Weekly AIS-based statistics of the Australia-China trade volume in February were compiled under four scenarios to show the superiority of the proposed method. Iron ore carriers' AIS data were then fed into the well-trained network to obtain their shipment sizes, and the global iron ore circulation in 2018 was derived with the AIS-based trade volume statistics method. The import volume of China was selected for comparison with the customs statistics, and the annual statistical error is less than 0.5%.
This study can improve the accuracy of seaborne trade volume statistics. It has practical significance for relevant departments, companies and individuals in predicting the market and making decisions. This paper has realized the prediction of ships' shipment sizes, and the results are satisfactory. The next step will be to predict the cargo category and the corresponding volume, in order to provide a more detailed and effective reference for actual decision-making.
REFERENCES:
[1] Fu Xiaoqi, Xie Wen, Zheng Guihuan, et al. Forecast and analysis of China's import and export in 2006 [J]. Management Review, 2006, 18(1): 24-27+65.
[2] Chen Wei. Import and export trade prediction based on linear ARIMA and nonlinear BP neural network combination model [J]. Statistics and Decision-making, 2015(22): 47-49.
[3] Liu Xianfeng. Prediction of oil trade volume from 2014 to 2017 based on wavelet analysis and ARIMA combination model [J]. Theory and Practice of Finance and Economics, 2014(4): 117-121.
[4] Nossum, B. "The evolution of dry bulk shipping, 1945-1990". Self-published. Oslo. 1996.
[5] Haji S, O'Keeffe E, Smith T. Estimating the global container shipping network using data and models [J]. Estimating the Global Container Shipping Network Using Data & Models.
[6] Adland R, Jia H, Strandenes S P. Are AIS-based trade volume estimates reliable? The case of crude oil exports [J]. Maritime Policy & Management, 2017, 44(1): 1-9.
[7] Ren Jie, Zhang Ao, Yan Liping, et al. Research on the status identification of inland river ships [J]. Instrumentation Technology, 2017(6): 38-40.
[8] ShangHai Maili Marine Technology Co., Ltd. http://www.hifleet.com
[9] Hu X, Lin C. A Preliminary Study on Targets Association Algorithm of Radar and AIS Using BP Neural Network [J]. Procedia Engineering, 2011, 15: 1441-1445.
[10] Gan S, Liang S, Li K, et al. Ship trajectory prediction for intelligent traffic management using clustering and ANN [C] // UKACC, International Conference on Control.
[11] Zhen Rong, Jin Yongxing, Hu Qinyou, et al. Vessel Behavior Prediction Based on AIS Data and BP Neural Network [J]. China Maritime Industry, 2017, 40(2): 6-10.
[12] Zhu Jinshan, Sun Licheng, Yin Jianchuan, et al. Model and simulation of ship signal recognition based on BP neural network [J]. Journal of Applied Science and Engineering, 2012, 20(3): 455-463.
[13] Zhang Y W, Cui W B, Wu G T, et al. Improvement analysis of BP neural network for ship remote monitoring system [J]. China Maritime, 2009, 32(2): 14-19. (in Chinese)
[14] Yu Tao. BP network adaptive learning rate algorithm analysis [D]. Dalian University of Technology, 2011.
[15] Zhang Qingqing, He Xingshi. Improved method for node selection of BP neural network and its application [J]. Journal of Xi'an University of Technology, 2008, 22(4): 502-505.
[16] Wang Xiaochuan. Analysis of 43 cases of MATLAB neural network [M]. Beijing University of Aeronautics and Astronautics Press, 2013.