Scalable Disambiguation System Capturing Individualities of Mentions
Date Issued
2017-05-27
Date Available
2019-04-10T11:33:54Z
Abstract
Entity disambiguation, or mapping a phrase to its canonical representation in a knowledge base, is a fundamental step in many natural language processing applications. Existing techniques based on global ranking models fail to capture the individual peculiarities of words and hence struggle to meet the accuracy-time requirements of many real-world applications. In this paper, we propose a new system that learns specialized features and models for disambiguating each ambiguous phrase in the English language. We train and validate hundreds of thousands of learning models for this purpose using a Wikipedia hyperlink dataset with more than 170 million labelled annotations. The computationally intensive training required for this approach can be distributed over a cluster. In addition, our approach supports fast queries and efficient updates, and its accuracy compares favorably with other state-of-the-art disambiguation systems.
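The per-mention idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features or learning models are used, so a small multinomial naive Bayes classifier over context words stands in for them here, and all names (`MentionModel`, `train`, `disambiguate`) and the toy annotations are hypothetical. The point is only the structure: one small model per ambiguous phrase, trained on that phrase's own labelled annotations, so that a query touches a single model and an update retrains a single model.

```python
from collections import Counter, defaultdict
import math


class MentionModel:
    """Toy per-mention classifier (assumption: naive Bayes over context words)."""

    def __init__(self):
        self.entity_counts = Counter()           # annotations seen per entity
        self.word_counts = defaultdict(Counter)  # context word counts per entity
        self.vocab = set()                       # all context words for this mention

    def update(self, context_words, entity):
        # Incremental update: only this mention's model is touched.
        self.entity_counts[entity] += 1
        self.word_counts[entity].update(context_words)
        self.vocab.update(context_words)

    def predict(self, context_words):
        total = sum(self.entity_counts.values())
        vocab_size = len(self.vocab) or 1
        best_entity, best_score = None, -math.inf
        for entity, n in self.entity_counts.items():
            counts = self.word_counts[entity]
            denom = sum(counts.values()) + vocab_size
            # Log prior plus add-one smoothed log likelihood of each context word.
            score = math.log(n / total)
            for word in context_words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_entity, best_score = entity, score
        return best_entity


def train(annotations):
    """Build one model per mention from (mention, context, entity) triples."""
    models = defaultdict(MentionModel)
    for mention, context, entity in annotations:
        models[mention.lower()].update(context.lower().split(), entity)
    return dict(models)


def disambiguate(models, mention, context):
    """Look up the mention's own model; return None for unseen mentions."""
    model = models.get(mention.lower())
    return model.predict(context.lower().split()) if model else None


# Hypothetical annotations in the style of Wikipedia hyperlinks.
ANNOTATIONS = [
    ("Jaguar", "the jaguar hunts in the rainforest", "Jaguar_(animal)"),
    ("Jaguar", "jaguar released a new luxury car", "Jaguar_Cars"),
    ("Jaguar", "a jaguar is a big cat", "Jaguar_(animal)"),
    ("Apple", "apple unveiled a new phone", "Apple_Inc."),
]

if __name__ == "__main__":
    models = train(ANNOTATIONS)
    print(disambiguate(models, "Jaguar", "a fast car"))
    print(disambiguate(models, "Jaguar", "big cat in the rainforest"))
```

Because each mention has its own independent model, training in this sketch could be partitioned by mention across workers, which is one plausible reading of how the training described above distributes over a cluster.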
Type of Material
Conference Publication
Publisher
Springer
Start Page
365
End Page
379
Series
Lecture Notes in Computer Science (volume 10318)
Copyright (Published Version)
2017 Springer
Language
English
Status of Item
Not peer reviewed
Journal
Gracia, J., Bond, F., McCrae, J. P. et al. (eds.). Language, Data, and Knowledge: First International Conference, LDK 2017, Galway, Ireland, June 19-20, 2017, Proceedings
Conference Details
Language, Data, and Knowledge - First International Conference (LDK 2017), Galway, Ireland, 19-20 June 2017
ISBN
9783319598871
ISSN
0302-9743
This item is made available under a Creative Commons License
File(s)
Name
ajwani_ldk17.pdf
Size
368.24 KB
Format
Adobe PDF
Checksum (MD5)
0e1383c613ac6f30897751dbfd7415df
Owning collection