69
1 INTRODUCTION
Maritime transportation plays an important role in
the global economy and nearly 90% of trade is
realized by maritime transportation. Route design is
one of the main tasks faced by navigators. The quality
of route is related to the navigation safety and
economic benefits of ships. Generally, the route
should be able to satisfy two main objectives, which
are avoiding dangerous areas and ensuring economic
benefits, respectively. For maritime transport, it is
important to design safe routes quickly and easily.
Ship route design is usually completed by the
captain and chief officer. Although this approach will
not cause problems in most cases, the manually
designed route is still affected by the professional
quality, sailing experience and personal emotion of
the crew. Therefore, is not necessarily the optimal
route.
With the development of computer technology,
mapping technology and information technology, the
current marine data can accurately and completely
describe the navigation environmental information. In
this case, the researchers began to use heuristic
algorithms to automatically generate ship route and
many methods and procedures have been developed
to determine the best route available (Dijkstra 1959, E.
Kobayashi 2011, Choi et al. 2015). However, in
complex situations, it is difficult to realize in the
actual navigation environment, and it is difficult to
accurately and completely represent the obstacles in
the complex environment with many obstacles. The
crew still rely more on sailing experience and
reference routes to complete the route design.
In this paper, a path optimization method based
on historical information generation is proposed,
which can be a planned route in a certain area of a
ship of different types, sizes and headings. The
maritime department or the shipping company can
use this method to obtain an optimized route and
provide reference for the route planning of the ship. A
path optimization method based on historical
Ship Route Planning Using Historical Trajectories
Derived from AIS Data
Y.K. He, D. Zhang, J.F. Zhang & M.Y. Zhang
Wuhan University of Technology, Wuhan, China
National Engineering Research Center for Water Transport Safety, Wuhan, China
T.W. Li
ChangJiang Waterway Transportation Monitoring and Emergency Center, Wuhan, China
ABSTRACT: Ship route planning is one of the key issues in enhancing traffic safety and efficiency. Many route
planning methods have been developed, but most of them are based on the information from charts. This paper
proposes a method to generate shipping routes based on historical ship tracks. The ship's historical route
information was obtained by processing the AIS data. From which the ship turning point was extracted and
clustered as nodes. The ant colony algorithm was used to generate the optimize route. The ship AIS data of the
Three Gorges dam area was selected as a case study. The ships’ optimized route was generated, and further
compared with the actual ship's navigation trajectory. The results indicate that there is space of improvement
for some of the trajectories, especially near the turning areas.
http://www.transnav.eu
the International Journal
on Marine Navigation
and Safety of Sea Transportation
Volume 13
Number 1
March 2019
DOI: 10.12716/1001.13.01.06
70
information generation is proposed, which can be a
planned route in a certain area of a ship of different
types, sizes and headings.
The paper is organized as follows: section one
mainly introduces the research on route planning, and
section two introduces the route generation method.
In the third section, the AIS data of the three gorges
region of China were used to generate air routes and
discuss. Finally, the fourth part is the conclusion of
this paper.
2 RELATED STUDIES
2.1 AIS data in maritime transportation
The ship AIS equipment transmits information at
different frequencies according to the equipment level
and the navigation status of the ship in accordance
with the regulations of the local maritime
administration. The format of the information is
strictly defined, including the actual position, speed,
heading, and the direction of the ship, as well as the
static or semi-static information such as ship's name,
call sign, draught, ship size (Tetreault 2005).
At present, all the ships above a certain tonnage
are forced to install the AIS system. And a lot of small
ships also installed the AIS system. A large amount of
AIS data has been used in many studies in maritime
transport, such as risk assessment analysis
(Montewka and Kujala 2014, Hänninen and Kujala
2014), ship collision avoidance (Wang et al. 2013,
Zhang et al. 2015), traffic flow analysis (Xiao et al.
2015), etc.
AIS data can be used to generate ship historical
routes. Zhang et al. deleted the abnormal AIS data by
determining the state of the ship and used the
regression model to smooth the route generated by
the AIS data (Zhang et al. 2018). Wang et al. proposed
a modified clustering algorithm to extract and analyze
ship routes (Wang Gao and Yang 2017). To solve the
problem of missing AIS data, Sang et al. proposed a
curve fitting method to restore the vessel trajectory
(Sang et al. 2015). Although there are some problems
in AIS data, such as noise and absence, it can still
generate the ship's historical route with a satisfactory
accuracy.
2.2 Ship route planning
There are many ship route design methods based on
heuristic algorithm, such as Dijkstra algorithm,
genetic algorithm, ant colony algorithm, A*
algorithm. In order to avoid problems such as local
optimal solution and long calculation time, many
researchers have proposed improved methods
combining other relevant algorithms (Lee et al. 2018,
Vettor and Guedes Soares 2016, Roh 2013).
A lot of the above methods are cell-based, which
means that obstacles need to be gridded. Therefore, it
is difficult to accurately and completely express
obstacles in the complex environment of obstacles
such as bridges, shallow waterways, ship anchorage,
reefs and navigation marks.
Xiao et al.'s study on the traffic flow pattern of
ships shows that different types and sizes of ships
have different route positions, which is especially
obvious in waterways with traffic separation
management (Xiao et al. 2015). Most existing methods
do not take into account the route differences of
different ships.
The method proposed in this paper avoids the grid
description of obstacles and can design different
routes according to ship type, size or other special
requirements, which overcomes the limitations of
existing methods.
3 METHOD FOR THE ROUTE PLANNING
In this section, the route design method proposed is
introduced. This method obtains the ship's historical
route through AIS data, and then clusters the
optimization to design the route.
3.1 Extraction of historical routes
This section describes how to get historical routes
from AIS data. The time interval at which the AIS
device sends the message is short, and the message
contains the location information of the ship.
Therefore, the historical route of the ship can be
obtained by processing the AIS data. The AIS data
includes the ship's MMSI and time. It can classify
historical routes and select different historical route
information according to actual needs.
3.1.1 AIS data selection and classification
This paper mainly uses dynamic and static AIS
data. Dynamic data provides the latitude and
longitude of the ship at each moment and can be used
to map the ship's historical routes. Static data is
mainly used to classify ships. It should be noted that
since the ship's static data only divides the ship into
passenger ships, cargo ships, oil tankers, tugs and
official ships, it is necessary to use the MMSI number
to inquire about the specific ship type.
The purpose of classifying AIS data is to obtain
different types of historical routes. The AIS data
includes the ship's MMSI. The ship can be classified
by the unique MMSI to obtain ship route information
of different types, sizes and draughts. At the same
time, according to the special requirements of the
local maritime administration, the route information
of special ships or dangerous goods ships can be
selected. Further, considering the hydrological
environment of the waters, the route information of
the dry season, the flat-water period and the flood
season can be obtained by selecting the AIS data of
different seasons.
3.1.2 Outliers removal
The outliers handled in this article is "particularly
egregious" error data. Since the turning points will be
clustered, the influence of random errors can be
ignored. The processing of the abnormal data is
71
divided into two steps, which are removal and
replacement.
For the abnormal data of the velocity, the
confidence range is set by determining its maximum
and minimum values. The scope of credibility is
determined based on actual conditions. The speed
range of ships in different navigation areas is
inconsistent, different types of ships have different
speed ranges as well. The choice of speed range
should be determined by reference to the ship's
performance of the same type and size and the
regulations of the local maritime administration.
Figure 1 shows the speed in the AIS data of a bulk
carrier in an inland area, and the confidence range is
set to be between 1 and 20 knots.
Figure 1. Analysis of abnormal velocity data
The position abnormal data, that is, the longitude
or latitude, is abnormal, and the position deviates
from the route. In this paper, a confidence range is
established, and it is considered that the abnormal
data is outside the current trusted range. The
confidence range is determined by the position of the
previous moment, the position of the latter moment,
and the speed.
Depending on the speed, the maximum distance
the ship can navigate in a fixed time span can be
calculated, during which time the ship will not move
to a further position. Since the starting and ending
points of the movement of the ship during this time
are certain (i.e. the position at the previous moment
and the position at the next moment), the credible
range should be the ellipse with the starting point and
the end point as the focus. The sum of the distance
from any point on the ellipse to the focus is the
maximum distance the ship can move.
Figure 2. Analysis of abnormal position data
The long axis a and the short axis b of this ellipse
are:

11
2
nn n
vt t
a


(1)

2
2
11
22
nn n
vt t
l
b









(2)
where: t
n-1 is the time at the previous moment; tn+1 is
the time at the later moment; l is the distance between
the position of the ship at the previous moment and
later moment; v is the speed of the ship.
3.1.3 Outliers restoration
After removing the outliers, the missing speed
data and location data should be restored.
AIS devices send dynamic data at intervals
ranging from 2 seconds to 3 minutes. The time
interval of sending dynamic data is very short when
the ship's navigation state are changing greatly, such
as large angle steering and acceleration or
deceleration. On the contrary, the time interval of
sending dynamic data is longer when the change of
navigation status is small. Therefore, the missing data
can be obtained simply by linear interpolation.
Suppose that the point to be restored is n, the point
at the previous moment is n-1, and the point at the
latter moment is n+1, then the longitude, latitude and
velocity of the point are:

11
11
11
nn
nn nn
nn
vv
vv tt
tt







(3)

11
11
11
nn
nn nn
nn
La La
La La t t
tt







(4)

11
11
11
nn
nn nn
nn
Lo Lo
Lo Lo t t
tt







(5)
where: V is the velocity at this point, La is the latitude,
and Lo is the longitude.
3.1.4 Generation and screening of historical routes
The processed AIS data contains more accurate
longitude and latitude information at each moment of
the ship and can be used to describe the ship's
trajectory. Ship routes on short-haul flights lack
reference value and should be deleted. In addition,
due to the failure of AIS equipment or poor signal, the
ship has too few AIS position points in the route. Such
routes may lack key information and should be
deleted as well. The route identification method with
less AIS information is to calculate the frequency of
the AIS message of the route. When the frequency is
lower than a threshold, it is considered that the route
lacks key information. The interval at which AIS
72
devices send packets varies from 2 seconds to 3
minutes. This is determined by the ship's heading
status and is highly variable. Therefore, this paper
selects a number of ships with different speeds and
counts the time interval at which AIS equipment
sends messages.
However, in the same waters with the same
navigation environment, the time interval for the AIS
device to send packets should converge in a range.
Assuming that Xt is a specimen for the time interval
of sending packets in a certain area, according to
Chebyshev's theorem, the time interval for sending
packets can be determined as:
0, 5
t
t
x
DX

(6)
where:
𝑥
̅
𝑡
is the average value;
𝐷𝑋
𝑡
is the standard
deviation. Then, the minimum frequency of messages
sent by the AIS device every hour is:
3600
5
t
t
n
x
DX

(7)
When the number of AIS packets in the route is
less than such threshold, it indicates that the route
may lack key location information and needs to be
deleted. It should be noted that the frequency of the
AIS message calculated above refers to the dynamic
data packet including the location information, and
does not include other AIS messages such as static
data and voyage data.
3.2 Determine the turning point of the ship
Turning point refers to the point at which the ship
turns. The main method of route design is to
determine the turning point of the ship, and the route
between the turning points constitutes the route. After
obtaining the historical route, the optimized route is
generated by selecting the turning point of the ship
therein. Due to the large amount of data, the route
generated by using all the steering points of the ship
is too complicated. The calculation amount is huge.
The local optimal solution is easily generated, which
is not in accordance with the actual situation. There-
fore, the steering points need to be clustered before
the route is generated.
In the database, the ship's historical route can be
selected according to special requirements, and then
their turning points are clustered. In this way, routes
with different characteristics can be designed. For
example, designing routes with different headings or
routes in different seasons.
3.2.1 Clustering method
Density Based Spatial Clustering of Applications
with Noise (DBSCAN) is a spatial clustering
algorithm that is widely used in many applications. It
is capable of finding arbitrarily shaped clusters in the
presence of noise data and performs well without the
prior knowledge about the number of clusters. The
characteristics of DBSCAN determine its suitability
for clustering spatial data. Therefore, the DBSCAN
algorithm is selected to cluster the turning points.
The clustering effect of the DBSCAN algorithm is
related to the Eps radius (Reps) and the neighborhood
density threshold (MinPts). Start at any point
(unvisited) and find all nearby points within the
distance of Reps. If the number of nearby points is not
less than MinPts, the current point forms a cluster
with its nearby points. The starting point is marked as
visited. If the number of nearby points is less than
MinPts, the point is marked as a noise point. When no
new points are added to any cluster, the clustering
terminates.
In Figure 3, when MinPts is 3, a, b, c, d are core
points, and e, f are noise points.
Figure 3. The schemes of DBSCAN cluster algorithm
3.2.2 Algorithm parameters
When clustering the turning points, three
parameters need to be considered: steering angle,
Reps and MinPts in the DBSCAN algorithm. The
steering angle and Reps are determined based on
historical route conditions. MinPts is determined
based on the generated turning point density. When
the angle of the historical route changes small, the
large ship steering angle and small Reps should be
selected. When the historical route angle changes
greatly, the small ship steering angle and the large
Reps should be selected.
It should be noted that the historical routes
generated in the previous section are connected by
intermittent points, so the steering angle of each point
cannot be obtained directly. The steering angle is
calculated by the heading difference between the
position point in the AIS data and the previous point
position.
After turning to point clustering, a cluster
containing multiple turning points is needed. The
cluster needs to be converted into points with certain
coordinates. The average longitude and average
latitude of the turning points in the cluster should be
calculated to determine the turning points
represented by the cluster:
1
m
i
i
La La
(8)
73
1
m
i
i
Lo Lo
(9)
3.2.3 Clustering effect adjustment
In the clustering process described above, the
clustering parameters selected for the entire route are
consistent. However, the angle changes in various
parts of the route may be inconsistent, or even vary
greatly. Using the same parameters may cause the
turning points in some areas to be too sparse, while
the turning points in some areas are too dense. So for
the area where the steering point density is small, the
steering angle should be reduced. In the area where
the turning point is dense, the steering angle should
increase.
3.3 Route generation
After obtaining the turning point of the ship, treating
it as a node, the Dijkstra algorithm and the ant colony
algorithm are used to generate the route.
When there are too many data points, the
traditional ant colony algorithm has the
disadvantages of weak global search ability, slow
convergence speed and low search efficiency. It is
easy to generate local optimal solutions. If only the
Dijkstra algorithm is used, the generated route will
always be close to the outside and turn frequently,
which is not in line with the actual situation.
Therefore, this paper first uses the Dijkstra algorithm
to plan the initial path and then uses the ant colony
algorithm to optimize the path.
3.3.1 Initial route generation
The Dijkstra algorithm is essentially a traversal
algorithm that calculates the shortest path by
calculating the length of many paths between two
points (Dijkstra 1959). The Dijkstra algorithm starts
from the starting point and selects the nearest node
each time, until it reaches the end point.
When using the Dijkstra algorithm, the problem in
the Figure-4 will appear. Due to no other restrictions,
the route generated by the Dijkstra algorithm may
pass through obstacles or beyond the deep water
channel. When calculating the distance between two
points, set the path length through the obstacle or
beyond the deep water channel to infinity. In this way
it is ensured that the designed route does not pass
through obstacles or outside the deep water channel.
Figure 4. Unsafe path
The steps of Dijkstra algorithm are as follows:
Step 1: Create a point set S= {v
0}, U = {v0, v1, …, vn}. <