
An Ensemble Approach to Identifying Informative Constraints for Semi-Supervised Clustering

File(s)
UCD-CSI-2007-6.pdf (191.2 KB)
Author(s)
Greene, Derek 
Cunningham, Pádraig 
URI
http://hdl.handle.net/10197/12355
Date Issued
04 May 2007
Date Available
28 July 2021
Abstract
A number of clustering algorithms have been proposed for use in tasks where a limited degree of supervision is available. This prior knowledge is frequently provided in the form of pairwise must-link and cannot-link constraints. While the incorporation of pairwise supervision has the potential to improve clustering accuracy, the composition and cardinality of the constraint sets can significantly affect the level of improvement. We demonstrate that it is often possible to correctly “guess” a large number of constraints without supervision, using the co-associations between pairs of objects in an ensemble of clusterings. Along the same lines, we establish that constraints based on pairs with uncertain co-associations are particularly informative, if known. An evaluation on text data shows that this provides an effective criterion for identifying constraints, leading to a reduction in the level of supervision required to direct a clustering algorithm to an accurate solution.
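The co-association idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy ensemble, and the thresholds 0.2 and 0.8 are assumptions chosen for illustration only. The sketch computes, for each pair of objects, the fraction of clusterings in the ensemble that place both objects in the same cluster, then treats consistently co-clustered pairs as guessed must-link constraints, consistently separated pairs as guessed cannot-link constraints, and ranks the remaining pairs by how close their co-association is to 0.5, as candidates for supervision.

```python
from itertools import combinations


def coassociation(ensemble):
    """Map each pair (i, j), i < j, to the fraction of clusterings
    in which objects i and j share a cluster label."""
    n = len(ensemble[0])
    return {
        (i, j): sum(labels[i] == labels[j] for labels in ensemble) / len(ensemble)
        for i, j in combinations(range(n), 2)
    }


def split_pairs(coassoc, low=0.2, high=0.8):
    """Split pairs into guessed must-link, guessed cannot-link, and
    uncertain pairs. The thresholds are illustrative assumptions."""
    must = [p for p, v in coassoc.items() if v >= high]
    cannot = [p for p, v in coassoc.items() if v <= low]
    # Most uncertain first: co-association closest to 0.5.
    uncertain = sorted(
        (p for p, v in coassoc.items() if low < v < high),
        key=lambda p: abs(coassoc[p] - 0.5),
    )
    return must, cannot, uncertain


# Hypothetical ensemble: four clusterings of five objects,
# each given as a list of cluster labels.
ensemble = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
]

coassoc = coassociation(ensemble)
must_link, cannot_link, uncertain = split_pairs(coassoc)

print("guessed must-link:  ", must_link)
print("guessed cannot-link:", cannot_link)
print("query first:        ", uncertain[0])
```

In this toy ensemble, objects 0 and 1 are always clustered together, so the pair can be guessed as must-link without supervision, while objects 2 and 4 are together in exactly half of the clusterings, making that pair the first candidate to ask an oracle about.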
Type of Material
Technical Report
Publisher
University College Dublin. School of Computer Science and Informatics
Series
UCD CSI Technical Reports
UCD-CSI-2007-6
Copyright (Published Version)
2007 the Authors
Keywords
  • Clustering algorithms...
  • Machine learning
  • Semi-supervised clust...
  • Ensemble-based cluste...
  • Text corpora

Web versions
https://web.archive.org/web/20080226040105/http:/csiweb.ucd.ie/Research/TechnicalReports.html
Language
English
Status of Item
Not peer reviewed
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
Owning collection
Computer Science and Informatics Technical Reports