University College Dublin

A Sentence-level Hierarchical BERT Model for Document Classification with Limited Labelled Data

Author(s)
Lu, Jinghui  
Henchion, Maeve  
Bacher, Ivan  
MacNamee, Brian  
Uri
http://hdl.handle.net/10197/25404
Date Issued
2021-10-09
Date Available
2024-02-09T15:41:51Z
Abstract
Training deep learning models with limited labelled data is an attractive scenario for many NLP tasks, including document classification. With the recent emergence of BERT, deep learning language models can achieve reasonably good performance in document classification with few labelled instances, but there is little evidence on the utility of applying BERT-like models to long document classification. This work introduces a long-text-specific model, the Hierarchical BERT Model (HBM), that learns sentence-level features of the text and works well in scenarios with limited labelled data. Evaluation experiments demonstrate that HBM can achieve higher performance in document classification than the previous state-of-the-art methods with only 50 to 200 labelled instances, especially when documents are long. As an additional benefit, a user study shows that the salient sentences identified by a trained HBM are useful as explanations for labelling documents.
Sponsorship
Science Foundation Ireland
Teagasc
Type of Material
Book Chapter
Publisher
Springer
Series
Lecture Notes in Computer Science
12986
Copyright (Published Version)
2021 Springer
Subjects
Document classificati...
Low-shot learning
BERT
DOI
10.1007/978-3-030-88942-5_18
Language
English
Status of Item
Peer reviewed
Journal
Soares, C. and Torgo, L. (eds.). Discovery Science: 24th International Conference, DS 2021, Halifax, NS, Canada, October 11–13, 2021: Proceedings
ISBN
978-3-030-88941-8
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by/3.0/ie/
File(s)
Name
2106.06738.pdf
Size
1.03 MB
Format
Adobe PDF
Checksum (MD5)
08d69348aca1c8116eddba1a0cd716fa

Owning collection
Computer Science Research Collection
Mapped collections
Insight Research Collection

Item descriptive metadata is released under a CC-0 (public domain) license: https://creativecommons.org/public-domain/cc0/.
All other content is subject to copyright.

For all queries please contact research.repository@ucd.ie.
