How Early Rewards Influence Choice: Targeting model-free processing through reward timing

Files in This Item:
File: 7834041.pdf (3.77 MB, Adobe PDF)
Title: How Early Rewards Influence Choice: Targeting model-free processing through reward timing
Authors: Garaialde, Diego
Permanent link: http://hdl.handle.net/10197/12806
Date: 2021
Online since: 2022-04-29T13:24:30Z
Abstract: Many people state an intention to perform a behaviour, yet these intentions often do not come to fruition. The problem is particularly pronounced when there is a long delay between the intention and the behaviour, or when a strong automatic impulse acts against the intention. According to dual-process theories, this intention-behaviour gap results from a conflict between two systems: a habitual model-free system and a deliberate model-based system. Interventions usually target the model-based system by providing the information needed to convince individuals that the behaviour is desirable or beneficial. This approach mostly ignores the model-free system, leaving a large part of the decision-making process outside the intervention. The early reward strategy targets the model-free system directly and takes into account the known mechanisms by which reward information is processed. In particular, it focuses on how reward timing affects decision making within a sequence of actions. Temporal discounting and temporal difference learning both reduce the value of a reward according to how far it is placed from the first action in the sequence, so placing the reward as close to the start of the sequence as possible should minimise this reduction. The early reward strategy was tested across four experiments and was found to alter behaviour in the way predicted by the theory. Two of the experiments took a computational approach, using reinforcement learning algorithms to predict behaviour and comparing the predictions against participant responses. The other two took a more applied approach, using tasks more representative of real-world action sequences to test the extent to which behaviour was affected by early rewards.
Whether the reward was monetary or gamified, placing a reward earlier in a sequence significantly increased how often that sequence was selected compared with other reward placements. The results have important implications for anyone attempting to incentivise new behaviours, offering a theory-driven approach to maximising the effectiveness of a reward, particularly for the model-free system. Consideration of reward timing should therefore be integral to any incentive system that involves sequences of actions, with a strong emphasis on providing rewards as early in the interaction as possible.
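
The reasoning about reward placement in the abstract can be illustrated with a minimal sketch. It is not the model used in the thesis: the exponential discount factor `gamma`, the learning rate `alpha`, the four-step chain of actions, and both function names are assumptions chosen only for illustration. The first function gives the discounted value of a single reward as seen from the first action; the second runs tabular TD(0) over a fixed action sequence and returns the learned value of the starting state.

```python
def discounted_value_at_start(reward, reward_step, gamma=0.9):
    """Value of one reward, seen from the first action, under exponential
    discounting (an assumed form; other discount curves exist)."""
    return reward * gamma ** reward_step


def td0_start_value(sequence_length, reward_step, reward=1.0,
                    gamma=0.9, alpha=0.1, episodes=50):
    """Tabular TD(0) on a fixed chain of states 0..sequence_length.

    The reward is delivered on the transition out of state `reward_step`.
    Returns the learned value of state 0 (the first action) after training.
    """
    # One extra entry for the terminal state, whose value stays 0.
    values = [0.0] * (sequence_length + 1)
    for _ in range(episodes):
        for state in range(sequence_length):
            r = reward if state == reward_step else 0.0
            td_error = r + gamma * values[state + 1] - values[state]
            values[state] += alpha * td_error
    return values[0]


if __name__ == "__main__":
    # Hypothetical four-action sequence: reward after the first vs the last action.
    for step in (0, 3):
        print(step,
              discounted_value_at_start(1.0, step),
              td0_start_value(4, step))
```

Under these assumptions, a reward after the first action keeps its full value at the start of the sequence, while a reward after the last action is both discounted and slower to propagate back to the first state during learning.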
Type of material: Doctoral Thesis
Publisher: University College Dublin. School of Information and Communication Studies
Qualification Name: Ph.D.
Copyright (published version): 2021 the Author
Keywords: Dual process theory; Rewards; Temporal discounting; Temporal difference learning
Language: en
Status of Item: Peer reviewed
This item is made available under a Creative Commons License: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
Appears in Collections: Information and Communication Studies Theses
