ISSN 2083-6473
ISSN 2083-6481 (electronic version)
 

 

 

Editor-in-Chief

Associate Editor
Tomasz Neumann
 

Published by
TransNav, Faculty of Navigation
Gdynia Maritime University
3, John Paul II Avenue
81-345 Gdynia, POLAND
www: http://www.transnav.eu
e-mail: transnav@am.gdynia.pl
Named Entity Disambiguation for Maritime-related Data Retrieved from Heterogenous Sources
Poznań University of Economics and Business, Poznań, Poland
ABSTRACT: The article concerns the integration and disambiguation of data related to the maritime domain. It describes a system that collects data about several kinds of maritime-related entities (vessels, vessel types, ports, companies, etc.) from different internet sources, merges the data, and feeds it into a single database. This process is not trivial, and several challenges must be addressed to conduct it successfully. First, different sources may refer to the same entity in different ways, for example by using different text strings. Additionally, some of these references may be ambiguous, i.e. a single reference may potentially point to more than one entity. To enable efficient analysis of data coming from different sources, such ambiguities must be resolved automatically as a preprocessing step, before the data is uploaded to the database and used in further computations. The aim of the disambiguation process is to assign an artificial, unique identifier to each entity and then, where possible, to automatically attach that identifier to every data item related to the entity. The article discusses the methods developed for resolving such ambiguities and presents their evaluation.
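The general idea of assigning artificial identifiers to entities that are referenced by different text strings can be illustrated with a minimal Python sketch. It is not the method evaluated in the article: the `EntityRegistry` class, the `normalize` and `similarity` helpers, the 0.9 threshold and the use of the standard-library `difflib.SequenceMatcher` as the similarity measure are all assumptions made only for illustration.

```python
from difflib import SequenceMatcher
from itertools import count


def normalize(name):
    """Lower-case a reference and collapse punctuation and whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity in [0, 1] between two normalized references."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


class EntityRegistry:
    """Hypothetical registry mapping artificial identifiers to entity names."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cut-off, not taken from the article
        self.entities = {}          # artificial id -> first name seen
        self._next_id = count(1)

    def resolve(self, reference):
        """Return the id of the single matching entity, a fresh id when
        nothing matches, or None when the reference is ambiguous."""
        matches = [
            entity_id
            for entity_id, name in self.entities.items()
            if similarity(reference, name) >= self.threshold
        ]
        if len(matches) == 1:
            return matches[0]
        if matches:
            return None  # more than one candidate: leave for further rules
        new_id = next(self._next_id)
        self.entities[new_id] = reference
        return new_id
```

In this sketch, `resolve("Port of Gdynia")` and `resolve("PORT OF GDYNIA.")` called on the same registry yield the same identifier, because both strings normalize to the same text. A reference that matches several known entities is returned as `None` rather than guessed, which mirrors the "if possible" qualification in the abstract.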
Citation note:
Małyszko J., Abramowicz W., Stróżyna M.: Named Entity Disambiguation for Maritime-related Data Retrieved from Heterogenous Sources. TransNav, the International Journal on Marine Navigation and Safety of Sea Transportation, Vol. 10, No. 3, doi:10.12716/1001.10.03.12, pp. 465-477, 2016
