Sentence-Level Event Classification in Unstructured Texts
Author(s)
Date Issued
2008-09
Date Available
2021-07-30T16:12:38Z
Abstract
Correctly classifying sentences that describe events is an important task for many natural language applications, such as Question Answering (QA) and Text Summarisation. In this paper, we treat event detection as a sentence-level text classification problem. We compare the performance of two approaches to this task: a Support Vector Machine (SVM) classifier and a Language Modeling (LM) approach. We also investigate a rule-based method that uses hand-crafted lists of ‘trigger’ terms derived from WordNet. We use two datasets in our experiments and test each approach on six event types, i.e., Die, Attack, Injure, Meet, Transport and Charge-Indict. Our experimental results indicate that although the trained SVM classifier consistently outperforms the language modeling approach, our rule-based system marginally outperforms the trained SVM classifier on three of the six event types. We also observe that overall performance is greatly affected by the type of corpus used to train the algorithms. Specifically, we find that a homogeneous training corpus containing many instances of a specific event type (e.g., Die events in the recent Iraqi war) produces a poorer-performing classifier than one trained on a heterogeneous dataset containing more diverse instances of the event (e.g., Die events in many different settings, such as traffic accidents and natural disasters). Our heterogeneous dataset is provided by the ACE (Automatic Content Extraction) initiative, while our novel homogeneous dataset consists of news articles and annotated Die events from the Iraq Body Count (IBC) database. Overall, our results show that the techniques presented here are effective solutions to this event classification task, achieving F1 scores of over 90%.
External Notes
Technical report numbers ucd-csi-2008-06 and ucd-csi-2008-07 are identical; only one copy has been retained.
Sponsorship
Irish Research Council for Science, Engineering and Technology
Type of Material
Technical Report
Publisher
University College Dublin. School of Computer Science and Informatics
Series
UCD CSI Technical Reports
ucd-csi-2008-6
UCD CSI Technical Reports
ucd-csi-2008-7
Copyright (Published Version)
2008 the Authors
Language
English
Status of Item
Not peer reviewed
This item is made available under a Creative Commons License