  • Publication
    How Many Topics? Stability Analysis for Topic Models
    Topic modeling refers to the task of discovering the underlying thematic structure in a text corpus, where the output is commonly presented as a report of the top terms appearing in each topic. Despite the diversity of topic modeling algorithms that have been proposed, a common challenge in successfully applying these techniques is the selection of an appropriate number of topics for a given corpus. Choosing too few topics will produce results that are overly broad, while choosing too many will result in the over-clustering of a corpus into many small, highly similar topics. In this paper, we propose a term-centric stability analysis strategy to address this issue, the idea being that a model with an appropriate number of topics will be more robust to perturbations in the data. Using a topic modeling approach based on matrix factorization, evaluations performed on a range of corpora show that this strategy can successfully guide the model selection process.
      Scopus© Citations: 135
  • Publication
    Down the (White) Rabbit Hole: The Extreme Right and Online Recommender Systems
    In addition to hosting user-generated video content, YouTube provides recommendation services, where sets of related and recommended videos are presented to users, based on factors such as co-visitation count and prior viewing history. This article is specifically concerned with extreme right (ER) video content, portions of which contravene hate laws and are thus illegal in certain countries, which are recommended by YouTube to some users. We develop a categorization of this content based on various schema found in a selection of academic literature on the ER, which is then used to demonstrate the political articulations of YouTube's recommender system, particularly the narrowing of the range of content to which users are exposed and the potential impacts of this. For this purpose, we use two data sets of English and German language ER YouTube channels, along with channels suggested by YouTube's related video service. A process is observable whereby users accessing an ER YouTube video are likely to be recommended further ER content, leading to immersion in an ideological bubble in just a few short clicks. The evidence presented in this article supports a shift of the almost exclusive focus on users as content creators and protagonists in extremist cyberspaces to also consider online platform providers as important actors in these same spaces.
      Scopus© Citations: 117
  • Publication
    An Analysis of the Coherence of Descriptors in Topic Modeling
    In recent years, topic modeling has become an established method in the analysis of text corpora, with probabilistic techniques such as latent Dirichlet allocation (LDA) commonly employed for this purpose. However, it might be argued that adequate attention is often not paid to the issue of topic coherence, the semantic interpretability of the top terms usually used to describe discovered topics. Nevertheless, a number of studies have proposed measures for analyzing such coherence, where these have been largely focused on topics found by LDA, with matrix decomposition techniques such as Non-negative Matrix Factorization (NMF) being somewhat overlooked in comparison. This motivates the current work, where we compare and analyze topics found by popular variants of both NMF and LDA in multiple corpora in terms of both their coherence and associated generality, using a combination of existing and new measures, including one based on distributional semantics. Two out of three coherence measures find NMF to regularly produce more coherent topics, with higher levels of generality and redundancy observed with the LDA topic descriptors. In all cases, we observe that the associated term weighting strategy plays a major role. The results observed with NMF suggest that this may be a more suitable topic modeling method when analyzing certain corpora, such as those associated with niche or non-mainstream domains.
      Scopus© Citations: 227
  • Publication
    Online Social Media in the Syria Conflict: Encompassing the Extremes and the In-Betweens
    The Syria conflict has been described as the most socially mediated in history, with online social media playing a particularly important role. At the same time, the ever-changing landscape of the conflict leads to difficulties in applying analysis approaches taken by other studies of online political activism. In this paper, we propose an approach motivated by the Grounded Theory method, which is used within the social sciences to perform analysis in situations where key prior assumptions or the proposal of an advance hypothesis may not be possible. We apply this method to analyze Twitter and YouTube activity of a range of protagonists to the conflict in an attempt to reveal additional insights into the relationships between them. By means of a network representation that combines multiple data views, we uncover communities of accounts falling into four categories that broadly reflect the situation on the ground in Syria. A detailed analysis of selected communities within the anti-regime categories is provided, focusing on their central actors, preferred online platforms, and activity surrounding real world events. Our findings indicate that social media activity in Syria is considerably more convoluted than reported in many other studies of online political activism, suggesting that alternative analysis approaches can play an important role in this type of scenario.
      Scopus© Citations: 22
  • Publication
    Development and Validation of the Operations Procedures and Manual for a 2U CubeSat, EIRSAT-1, with Three Novel Payloads
    The CubeSat standard, relatively short launch timescale, and orders of magnitude difference in cost in comparison to large scale missions have allowed universities and smaller institutions to develop space missions. The Educational Irish Research Satellite (EIRSAT-1) is a 2U CubeSat being developed in University College Dublin (UCD) as part of the second round of the European Space Agency (ESA) Education Office’s Fly Your Satellite! (FYS) Programme. EIRSAT-1 is a student-led project to build, test, launch and operate Ireland’s first satellite. CubeSats typically use commercial off-the-shelf (COTS) components to facilitate new teams in developing a satellite on a rapid timescale. While some of the EIRSAT-1 subsystems are COTS procured from AAC Clyde Space, EIRSAT-1 has three novel experiments on-board which have been developed in UCD. The spacecraft’s Antenna Deployment Module has also been designed and built in-house. The on-board computer (OBC), procured from AAC Clyde Space, has been adapted to interface with these novel hardware components, accompanied by in-house developed software and firmware. All of these innovative subsystems complicate the CubeSat functionality, making it essential to document and rigorously test the operations procedures for EIRSAT-1. In preparation for launch with these novel spacecraft subsystems, the EIRSAT-1 Operations Manual is being developed and incrementally verified. The Operations Manual contains the procedures to command and control the satellite, account for nominal and non-nominal scenarios, guide the operator in determining the cause of any anomalies observed during the mission, and facilitate recovery. A series of operations development tests (ODTs) has been designed and conducted for a robust verification process. Each procedure is written up by a member of the EIRSAT-1 Operations Team in the EIRSAT-1 Operations Manual format. During an ODT, an in-flight scenario is considered in which the procedure under test is required. The procedure is then followed by a team member who has not been involved in the procedure development process. The feedback from these tests and from the operators is used to improve the procedures and continually update the Operations Manual. This paper will present the approach to operations development used by the EIRSAT-1 team and discuss the lessons learned for CubeSat operations development, testing and pre-flight verification.
  • Publication
    Development of the Ground Segment Communication System for the EIRSAT-1 CubeSat
    The Educational Irish Research Satellite (EIRSAT-1) is a student-led project to design, build and test Ireland’s first satellite. As part of the development, a ground segment (GS) has also been designed alongside the spacecraft. The ground segment will support two-way communications with the spacecraft throughout the mission. Communication with the satellite will occur in the very high frequency (VHF) and the ultra high frequency (UHF) bands for the uplink and downlink respectively. Different modulation schemes have been implemented for both uplink and downlink as part of the GS system. Uplink incorporates an Audio Frequency Shift-Keying (AFSK) scheme, while downlink incorporates a Gaussian Minimum Shift-Keying (GMSK) scheme. In order for the spacecraft to successfully receive a telecommand (TC) transmitted from the ground station, a framing protocol is required. AX.25 was selected as the data link layer protocol. A hardware terminal node controller (TNC) executes both the AX.25 framing and the AFSK modulation. Keep It Simple Stupid (KISS) framing software was developed to allow data to be accepted by the TNC. A software defined radio (SDR) approach has been chosen for the downlink. GNURadio is software that allows flowcharts to be built to undertake the required signal processing of the received signal, the demodulation of the signal and the decoding of data. This paper provides a detailed account of the software developed for the ground segment communication system. A review of the AX.25 and KISS framing protocols is presented. The GNURadio flowcharts that handle the signal processing and data decoding are broken down and each constituent is explained. To ensure the reliability and robustness of the system, a suite of tests was undertaken, the results of which are also presented.
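
The KISS framing step mentioned in the last abstract above can be illustrated with a minimal sketch. This is not the EIRSAT-1 team's software: it only follows the publicly documented KISS byte-stuffing rules (frame delimiter 0xC0, escape 0xDB, and a leading command byte of 0x00 for a data frame on port 0). The function names `kiss_encode` and `kiss_decode` are hypothetical, and the sketch assumes the payload is an already-built AX.25 frame.

```python
# KISS special bytes, as defined by the KISS TNC protocol.
FEND = 0xC0    # frame delimiter
FESC = 0xDB    # escape marker
TFEND = 0xDC   # follows FESC to represent a literal FEND
TFESC = 0xDD   # follows FESC to represent a literal FESC
DATA_PORT0 = 0x00  # command byte: data frame on TNC port 0


def kiss_encode(payload: bytes) -> bytes:
    """Wrap a payload (assumed to be an AX.25 frame) in a KISS data frame."""
    out = bytearray([FEND, DATA_PORT0])
    for byte in payload:
        if byte == FEND:
            out += bytes([FESC, TFEND])   # escape a delimiter inside the data
        elif byte == FESC:
            out += bytes([FESC, TFESC])   # escape the escape byte itself
        else:
            out.append(byte)
    out.append(FEND)
    return bytes(out)


def kiss_decode(frame: bytes) -> bytes:
    """Recover the payload from a single, complete KISS data frame."""
    if len(frame) < 3 or frame[0] != FEND or frame[-1] != FEND:
        raise ValueError("not a complete KISS frame")
    if frame[1] != DATA_PORT0:
        raise ValueError("this sketch only handles port-0 data frames")
    out = bytearray()
    body = iter(frame[2:-1])
    for byte in body:
        if byte != FESC:
            out.append(byte)
            continue
        escaped = next(body, None)        # byte that follows the escape marker
        if escaped == TFEND:
            out.append(FEND)
        elif escaped == TFESC:
            out.append(FESC)
        else:
            raise ValueError("invalid KISS escape sequence")
    return bytes(out)
```

For example, `kiss_encode(b"\xc0")` yields `b"\xc0\x00\xdb\xdc\xc0"`, and `kiss_decode` reverses it. A real ground station would also need to handle streams containing several frames and the other KISS command bytes, which this sketch leaves out.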