Time Series Classification by Sequence Learning in All-Subsequence Space

DC Field: Value (Language)
dc.contributor.author: Nguyen, Thach Le
dc.contributor.author: Gsponer, Severin
dc.contributor.author: Ifrim, Georgiana
dc.date.accessioned: 2017-05-26T15:22:02Z
dc.date.available: 2017-05-26T15:22:02Z
dc.date.issued: 2017-04-22
dc.identifier.uri: http://hdl.handle.net/10197/8547
dc.description: 2017 IEEE International Conference on Data Engineering, San Diego, California, USA, 19-22 April 2017 (en)
dc.description.abstract: Existing approaches to time series classification can be grouped into shape-based (numeric) and structure-based (symbolic). Shape-based techniques use the raw numeric time series with Euclidean or Dynamic Time Warping distance and a 1-Nearest Neighbor classifier. They are accurate, but computationally intensive. Structure-based methods discretize the raw data into symbolic representations, then extract features for classifiers. Recent symbolic methods have outperformed numeric ones in both accuracy and efficiency. Most approaches employ a bag-of-symbolic-words representation, but typically the word length is fixed across all time series, an issue identified as a major weakness in the literature. Moreover, there have been no prior attempts to use efficient sequence-learning techniques to go beyond single words, to features based on variable-length sequences of words or symbols. We study an efficient linear classification approach, SEQL, originally designed for classification of symbolic sequences. SEQL learns discriminative subsequences from training data by exploiting the all-subsequence space using greedy gradient descent. We explore different discretization approaches, from none at all to increasing smoothing of the original data, and study the effect of these transformations on the accuracy of SEQL classifiers. We propose two adaptations of SEQL for time series data: SAX-VSEQL, which can deal with X-axis offsets by learning variable-length symbolic words, and SAX-VFSEQL, which can deal with X-axis and Y-axis offsets by learning fuzzy variable-length symbolic words. Our models are linear classifiers in rich feature spaces. Their predictions are based on the most discriminative subsequences learned during training and can be investigated for interpreting the classification decision. (en)
dc.description.sponsorship: Science Foundation Ireland (en)
dc.language.iso: en (en)
dc.relation.ispartof: 2017 IEEE 33rd International Conference on Data Engineering (ICDE)
dc.subject: Machine learning (en)
dc.subject: Statistics (en)
dc.title: Time Series Classification by Sequence Learning in All-Subsequence Space (en)
dc.type: Conference Publication (en)
dc.status: Peer reviewed (en)
dc.identifier.doi: 10.1109/ICDE.2017.142
dc.neeo.contributor: Nguyen|Thach Le|aut|
dc.neeo.contributor: Gsponer|Severin|aut|
dc.neeo.contributor: Ifrim|Georgiana|aut|
dc.date.updated: 2017-01-27T14:09:14Z
dc.rights.license: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/ (en)
item.grantfulltext: open
item.fulltext: With Fulltext
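
Two of the building blocks named in the abstract are easy to illustrate. The shape-based baseline is 1-Nearest Neighbor classification under Dynamic Time Warping (DTW) distance. Below is a minimal textbook sketch in Python, not the authors' implementation; the function names `dtw` and `knn1_predict` and the toy data are illustrative.

```python
import numpy as np

def dtw(a, b):
    """Classic O(n*m) dynamic-programming DTW with squared-difference
    local cost; returns the square root of the accumulated cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Best of the insertion, deletion, and match moves.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])

def knn1_predict(train_X, train_y, query):
    """1-NN: return the label of the training series closest to the query."""
    dists = [dtw(x, query) for x in train_X]
    return train_y[int(np.argmin(dists))]

train_X = [np.array([0.0, 1.0, 2.0, 1.0, 0.0]), np.array([2.0, 2.0, 2.0, 2.0, 2.0])]
train_y = ["peak", "flat"]
print(knn1_predict(train_X, train_y, np.array([0.0, 0.9, 2.1, 1.0, 0.1])))  # -> peak
```

The structure-based pipeline starts by discretizing the numeric series into symbolic words; the SAX prefix in SAX-VSEQL/SAX-VFSEQL refers to Symbolic Aggregate approXimation, which z-normalizes the series, averages it over fixed segments (Piecewise Aggregate Approximation), and maps segment means to letters via Gaussian breakpoints. A minimal sketch under the standard SAX conventions follows, again with illustrative names rather than the paper's code:

```python
import numpy as np

# Standard SAX breakpoints splitting N(0, 1) into 4 equiprobable bins.
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"

def sax(series, n_segments):
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / x.std()              # z-normalize
    segments = np.array_split(x, n_segments)  # PAA segments
    # searchsorted maps each segment mean to its breakpoint bin -> a letter.
    return "".join(ALPHABET[np.searchsorted(BREAKPOINTS, seg.mean())]
                   for seg in segments)

print(sax([1.0, 1.2, 0.9, 3.1, 3.0, 2.8, 0.1, 0.2], n_segments=4))  # -> bcda
```

SEQL then learns discriminative subsequences directly in the all-subsequence space over such symbolic words by greedy gradient descent, per the abstract; the sketches above only cover the representations that learning operates on.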
Appears in Collections: Insight Research Collection
Files in This Item:
File | Size | Format
insight_publication.pdf | 467.37 kB | Adobe PDF

Scopus citations: 8 (last week: 0, last month: 0), checked on Sep 12, 2020
Page views: 1,054 (last week: 2, last month: 13), checked on May 19, 2022
Downloads: 799, checked on May 19, 2022

