Artificial Intelligence Methods and Evaluation Strategies for Detecting Future Customer Needs from User Generated Content
Author(s)
Date Issued
2024
Date Available
2025-11-28T14:46:45Z
Embargo end date
2025-04-10
Abstract
Research has shown that listening to customers when developing products leads to increased satisfaction, which drives profits. In recent years, approaches have been applied to automatically extract customer needs from User Generated Content, such as social media (e.g. Twitter) and blog websites (e.g. Quora). Specifically, these studies use techniques from Artificial Intelligence, and in particular Machine Learning, to gather these requirements. The majority of these studies focus on summarizing currently discussed customer needs for specific product models (e.g. current needs for the Samsung Galaxy mobile). However, there is a lack of studies that focus on the challenging task of predicting needs for product categories that will be of importance in the future and are therefore unaddressed (e.g. future needs for mobile phones). Doing so would allow businesses to discover the “Next Big Thing”. Therefore, this thesis investigates techniques and evaluation approaches that address the task of predicting future customer needs for product categories from Reddit content. The main method used to solve the task is keyphrase ranking/classification, which orders or classifies candidate keyphrases according to how likely they are to be future customer needs. The thesis first presents two ground truth datasets to train and evaluate algorithms run over Reddit. To curate these datasets, needs are extracted from Mintel GNPD, a large database of new-to-market product descriptions. Human annotators take part in the curation process to ensure the quality of the datasets. Two separate algorithms for future customer needs prediction are then presented. The first algorithm is a rule-based approach that applies various text mining techniques to rank keyphrases that are likely to be future customer needs. This approach is evaluated on the task of predicting future customer needs for one product category: Toothpaste.
The second algorithm is a machine learning approach that uses Multivariate Time Series Classification techniques to classify candidate keyphrase instances, each represented by 1263 univariate time series features. These 1263 features come from 10 families of features, all selected for the task of predicting future customer needs. During the evaluation, this approach is applied to 15 product categories, all in the area of Consumer Packaged Goods. Finally, a user study is undertaken with participants from a multi-billion dollar (USD) company to evaluate whether they found the output of the machine learning algorithm useful in discovering new product opportunities. This user study also proposes an evaluation methodology that would allow companies to take part in evaluation studies with researchers without disclosing any proprietary information about the products they plan to make.
Type of Material
Doctoral Thesis
Qualification Name
Doctor of Philosophy (Ph.D.)
Publisher
University College Dublin. School of Computer Science
Copyright (Published Version)
2024 the Author
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
File(s)
Name
Kilroy2024.pdf
Size
6.97 MB
Format
Adobe PDF
Checksum (MD5)
a7561b55663ec400e2e3ee64652fb895
Owning collection