Time Series Classification by Sequence Learning in All-Subsequence Space
|Title:||Time Series Classification by Sequence Learning in All-Subsequence Space||Authors:||Nguyen, Thach Le
|Permanent link:||http://hdl.handle.net/10197/8547||Date:||22-Apr-2017||Online since:||2017-05-26T15:22:02Z||Abstract:||Existing approaches to time series classification can be grouped into shape-based (numeric) and structure-based (symbolic). Shape-based techniques use the raw numeric time series with Euclidean or Dynamic Time Warping distance and a 1-Nearest Neighbor classifier. They are accurate, but computationally intensive. Structure-based methods discretize the raw data into symbolic representations, then extract features for classifiers. Recent symbolic methods have outperformed numeric ones regarding both accuracy and efficiency. Most approaches employ a bag-of-symbolic-words representation, but typically the word-length is fixed across all time series, an issue identified as a major weakness in the literature. Also, there are no prior attempts to use efficient sequence learning techniques to go beyond single words, to features based on variable-length sequences of words or symbols. We study an efficient linear classification approach, SEQL, originally designed for classification of symbolic sequences. SEQL learns discriminative subsequences from training data by exploiting the all-subsequence space using greedy gradient descent. We explore different discretization approaches, from none at all to increasing smoothing of the original data, and study the effect of these transformations on the accuracy of SEQL classifiers. We propose two adaptations of SEQL for time series data: SAX-VSEQL, which can deal with X-axis offsets by learning variable-length symbolic words, and SAX-VFSEQL, which can deal with X-axis and Y-axis offsets by learning fuzzy variable-length symbolic words. Our models are linear classifiers in rich feature spaces.
Their predictions are based on the most discriminative subsequences learned during training, and can be investigated for interpreting the classification decision.||Funding Details:||Science Foundation Ireland||Type of material:||Conference Publication||Keywords:||Machine learning; Statistics||DOI:||10.1109/ICDE.2017.142||Language:||en||Status of Item:||Peer reviewed||Is part of:||2017 IEEE 33rd International Conference on Data Engineering (ICDE)||Conference Details:||2017 IEEE International Conference on Data Engineering, San Diego, California, USA, 19-22 April 2017|
|Appears in Collections:||Insight Research Collection|
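The abstract describes discretizing a numeric series into symbols and then learning over the space of all subsequences. The sketch below is one possible illustration of that idea, not the authors' implementation: it assumes a SAX-style discretization (suggested only by the method names SAX-VSEQL and SAX-VFSEQL), and the helper names `znorm`, `paa`, `sax_word`, and `all_subsequences` are hypothetical. It omits the learning step entirely and only shows how a symbolic word and its candidate variable-length subsequence features might be produced.

```python
from statistics import NormalDist, mean, pstdev


def znorm(series):
    """Z-normalize a series; a constant series maps to all zeros."""
    mu, sd = mean(series), pstdev(series)
    if sd == 0:
        return [0.0] * len(series)
    return [(x - mu) / sd for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: mean of each near-equal segment.

    Assumes 1 <= n_segments <= len(series) so no segment is empty.
    """
    n = len(series)
    return [
        mean(series[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]


def sax_word(series, n_segments, alphabet="abcd"):
    """Map a numeric series to a symbolic word (SAX-style, assumed here).

    Breakpoints split the standard normal distribution into
    len(alphabet) equiprobable regions.
    """
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / a) for j in range(1, a)]
    symbols = []
    for value in paa(znorm(series), n_segments):
        # The symbol index is the number of breakpoints below the value.
        index = sum(value > b for b in breakpoints)
        symbols.append(alphabet[index])
    return "".join(symbols)


def all_subsequences(word, max_len=None):
    """Return the set of contiguous subsequences of a symbolic word."""
    n = len(word)
    limit = n if max_len is None else max_len
    return {
        word[i:j]
        for i in range(n)
        for j in range(i + 1, min(n, i + limit) + 1)
    }


if __name__ == "__main__":
    word = sax_word([0, 0, 0, 0, 10, 10, 10, 10], n_segments=4, alphabet="ab")
    print(word)
    print(sorted(all_subsequences(word)))
```

In a classifier along the lines the abstract describes, each subsequence would be a candidate feature and a linear model would assign weights to the most discriminative ones. Enumerating every subsequence, as done here, is only practical for short words; the abstract indicates that SEQL searches this space greedily rather than materializing it.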
This item is available under the Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.