Web robot detection in scholarly Open Access institutional repositories

Author(s)
Greene, Joseph  
Uri
http://hdl.handle.net/10197/7682
Date Issued
2016-07
Date Available
2016-08-14T01:00:36Z
Abstract
Purpose -- This paper investigates the impact of web robots on usage statistics collected by Open Access institutional repositories (IRs), and techniques for mitigating their effects.
Design/methodology/approach -- A review of the literature provides a comprehensive list of web robot detection techniques. Reviews of system documentation and open source code are carried out, along with personal interviews, to compare the robot detection techniques used in the major IR platforms. An empirical test based on a simple random sample of downloads with 96.20% certainty is undertaken to measure the accuracy of an IR's web robot detection at a large Irish university.
Findings -- While web robot detection is not ignored in IRs, there are areas where the two main systems could be improved. The technique tested here is found to have successfully detected 94.18% of web robots visiting the site over a two-year period (recall), with a precision of 98.92%. Due to the high level of robot activity in repositories, correctly labelling more robots has an exponential effect on the accuracy of usage statistics.
Limitations -- This study is performed on one repository using a single system. Future studies across multiple sites and platforms are needed to determine the accuracy of web robot detection in OA repositories generally.
Originality/value -- This is the only study to date to have investigated web robot detection in IRs. It puts forward the first empirical benchmarking of accuracy in IR usage statistics.
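The recall and precision figures quoted in the abstract follow the standard definitions, which can be sketched as below. This is an illustrative sketch only, not the paper's method or code: the function names, the counts, and the 85% robot share used in the example are all hypothetical, and the overcount function additionally assumes that no human download is mislabelled as a robot.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Return (precision, recall) for a robot detector.

    true_pos:  robot downloads correctly labelled as robots
    false_pos: human downloads wrongly labelled as robots
    false_neg: robot downloads that were missed (counted as human)
    """
    flagged = true_pos + false_pos   # everything the detector called a robot
    actual = true_pos + false_neg    # everything that really was a robot
    if flagged == 0 or actual == 0:
        raise ValueError("precision and recall are undefined for empty counts")
    return true_pos / flagged, true_pos / actual


def human_overcount(robot_share: float, recall: float) -> float:
    """Fraction by which reported human downloads exceed true human downloads.

    Assumes (hypothetically) perfect precision, so the only error is
    missed robots being counted as human downloads.
    """
    if not 0.0 <= robot_share < 1.0:
        raise ValueError("robot_share must be in [0, 1)")
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be in [0, 1]")
    missed_robots = robot_share * (1.0 - recall)
    true_humans = 1.0 - robot_share
    return missed_robots / true_humans


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    precision, recall = precision_recall(true_pos=90, false_pos=10, false_neg=30)
    print(f"precision={precision:.2%} recall={recall:.2%}")

    # Recall from the abstract; the 85% robot share is an assumed value.
    print(f"overcount={human_overcount(robot_share=0.85, recall=0.9418):.1%}")
```

Under these assumptions, the share of missed robots is divided by the comparatively small share of genuine human downloads, which is one way to see why small gains in recall can matter a great deal when robot traffic dominates.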
Type of Material
Journal Article
Publisher
Emerald
Journal
Library Hi Tech
Volume
34
Issue
3
Start Page
500
End Page
520
Subjects

Open Access

Institutional reposit...

Usage statistics

Downloads

Web robots

DOI
10.1108/LHT-04-2016-0048
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
File(s)
Name
RobotsLibHiTechAcceptedPostPrintRepository2016-06-20.pdf
Size
621.94 KB
Format
Adobe PDF
Checksum (MD5)
03b06d04aacca1658d3c66868b18de6f
Owning collection
UCD Library Staff Research Collection

Item descriptive metadata is released under a CC-0 (public domain) license: https://creativecommons.org/public-domain/cc0/.
All other content is subject to copyright.
