Current projects

Industry classification

Machine learning-based industry classification according to the North American Industry Classification System (NAICS) involves the use of diverse ML algorithms and NLP techniques to automatically categorise businesses into specific industry codes. NAICS is a widely adopted system for classifying establishments based on their primary business activities. Traditionally, industry classification has been a manual and time-consuming process, but with the integration of machine learning, the task becomes more efficient and precise.
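As a rough illustration of the idea, and not the system used in this project, the sketch below trains a multinomial Naive Bayes classifier on a few business descriptions and predicts a two-digit NAICS sector code. The training texts are invented for the example, and the function names (`train_naive_bayes`, `predict`) are hypothetical. The sector codes are real NAICS sectors (72 Accommodation and Food Services, 54 Professional, Scientific, and Technical Services, 23 Construction).

```python
import math
from collections import Counter, defaultdict

# Toy training data: invented descriptions, labelled with 2-digit NAICS sectors.
TRAIN = [
    ("family restaurant serving pizza and pasta", "72"),
    ("hotel with restaurant and bar", "72"),
    ("software consulting and engineering services", "54"),
    ("legal and accounting consulting services", "54"),
    ("residential building construction contractor", "23"),
    ("road and bridge construction", "23"),
]


def tokenize(text):
    """Lowercase whitespace tokenisation (deliberately simplistic)."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


MODEL = train_naive_bayes(TRAIN)
print(predict(MODEL, "pizza restaurant and bar"))
```

A production classifier would need far more data, full six-digit codes and richer text representations. The sketch only shows the mapping from free-text description to industry code.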

News Polygraph

News Polygraph is a collaborative research project dedicated to advancing research on media and news credibility. The research alliance works on fact-checking and on detecting and debunking online disinformation. As project lead, I design, develop and deploy an AI system for disinformation detection and monitoring that leverages information about sources and about how information propagates.

Past projects

Explainable Depression Detection on Social Media

Depression is a serious health and social issue that afflicts many individuals in modern society, and its prevalence is predicted to increase globally. People with depression are likely to express their feelings and mental states on social media before seeing health professionals. Most existing black-box deep learning methods for depression detection have largely focused on improving classification performance. However, explaining model decisions is imperative in health research because decisions are often high-stakes and can be a matter of life and death. Reliable automatic diagnosis of mental health problems, including depression, should be supported by credible explanations that justify the models' predictions. This project aims to develop automatic, efficient solutions for depression detection. Potentially, the outcomes of this project can help patients self-assess their risk of mental disorders and better understand their experiences, and can help health professionals provide tailored, timely therapy, thereby achieving remission and preventing relapse efficiently and effectively.

Human-Robot Collaborative AI for Advanced Manufacturing and Engineering, S$1,962,800 (Oct 2021-present)

This collaborative research project, involving A*STAR, Nanyang Technological University, National University of Singapore and Singapore University of Technology and Design, seeks to enable a fundamental shift in human-machine interaction so that intelligent machines can work alongside humans as partners, interacting in a natural, human-like manner. In particular, I focus on developing a multi-layer, multi-domain and multi-modal commonsense knowledge representation, together with a suite of reasoning strategies that leverage such knowledge to support the higher-level inference involved in perception and task understanding, collaborative dialogue, human mental state inference, task concept and script learning, and task planning and execution.

Governmental Requests for Information Decomposition, Undisclosed amount (Jan-Mar 2022)

Requests for Information (RFIs), which seek clarification of questions asked, are an essential communication process in decision-making. I am interested in accelerating and optimising the handling of RFIs with a minimum of human supervision. This project develops, implements and demonstrates computational solutions for optimally matching a textual question with limited context to the most relevant subset of information resources to answer the question. This project is funded by the Defence Science and Technology Laboratory (Dstl), an executive agency of the Ministry of Defence of the UK.
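To make the matching task concrete, here is a minimal sketch that ranks candidate resources against a question by TF-IDF cosine similarity. This is a generic baseline chosen for illustration, not the method developed in the project. The function names (`rank_resources`, `tfidf_vectors`) and the example resources are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase tokens with surrounding punctuation stripped."""
    tokens = (t.strip(".,?!").lower() for t in text.split())
    return [t for t in tokens if t]


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) and the IDF table for docs."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency.
    idf = {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}
    vectors = [
        {tok: tf * idf[tok] for tok, tf in Counter(toks).items()}
        for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_resources(question, resources, top_k=2):
    """Return the top_k (resource, score) pairs, most similar first."""
    vectors, idf = tfidf_vectors(resources)
    # Question terms unseen in the resources carry no weight.
    q_vec = {
        tok: tf * idf[tok]
        for tok, tf in Counter(tokenize(question)).items()
        if tok in idf
    }
    scored = sorted(
        ((cosine(q_vec, v), i) for i, v in enumerate(vectors)),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [(resources[i], score) for score, i in scored[:top_k]]


RESOURCES = [  # invented examples
    "Port logistics and shipping schedules report",
    "Regional weather forecast archive",
    "Satellite imagery of coastal shipping lanes",
]
QUESTION = "Which shipping schedules apply to the port?"
for resource, score in rank_resources(QUESTION, RESOURCES):
    print(f"{score:.3f}  {resource}")
```

Lexical overlap is a weak signal when the question has limited context, which is precisely the difficulty the project addresses. The sketch is only a starting point for comparison.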

Bot/Cyborg detection, Undisclosed amount (Jan 2020-Mar 2022)

Bot accounts are often used to deliberately spread fake news and disinformation on social media. I explore correlations between social media posts generated by bots and those containing rumours, and I research methods for identifying bots, cyborgs and accounts controlled by multiple humans based on textual content, posting behaviour and user profiles. This project is funded by the Defence Science and Technology Laboratory (Dstl), an executive agency of the Ministry of Defence of the UK.

Context-aware message-level rumour detection with weak supervision

This research project focuses on message-level early rumour detection (ERD) on social media by exploiting weak supervision and contextual information. Weak supervision is a branch of Machine Learning (ML) in which noisy and less precise sources (e.g. data patterns) are leveraged to label training data when high-quality labelled data is limited. This is intended to reduce the cost and increase the efficiency of hand-labelling large-scale data. The aim is to study whether it is possible to identify rumours before they go viral and to develop an architecture for ERD at the individual post level. To this end, the project addresses three major bottlenecks of state-of-the-art ERD: 1) labelled data scarcity and class imbalance; 2) enormous amounts of noisy data; and 3) limited context.
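The general idea of weak supervision can be sketched with a few heuristic labelling functions whose noisy votes are combined by majority vote. The cue phrases, the function names and the voting scheme below are illustrative assumptions, not the heuristics or the label model used in this project.

```python
from collections import Counter

RUMOUR, NON_RUMOUR, ABSTAIN = 1, 0, -1


def lf_hedging_words(post):
    """Vote RUMOUR if the post contains hedging cues (assumed list)."""
    cues = ("unconfirmed", "reportedly", "allegedly", "rumour")
    return RUMOUR if any(c in post.lower() for c in cues) else ABSTAIN


def lf_official_source(post):
    """Vote NON_RUMOUR if the post cites an official source (assumed list)."""
    cues = ("official statement", "confirmed by", "press release")
    return NON_RUMOUR if any(c in post.lower() for c in cues) else ABSTAIN


def lf_verification_question(post):
    """Vote RUMOUR if the post asks whether something is true."""
    text = post.strip().lower()
    return RUMOUR if text.endswith("?") and "is it true" in text else ABSTAIN


LABELLING_FUNCTIONS = [
    lf_hedging_words,
    lf_official_source,
    lf_verification_question,
]


def weak_label(post):
    """Majority vote over non-abstaining functions; ABSTAIN on ties or no votes."""
    votes = [lf(post) for lf in LABELLING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting evidence, leave unlabelled
    return ranked[0][0]


POSTS = [  # invented examples
    "Reportedly the bridge has collapsed, is it true?",
    "Confirmed by the council in an official statement.",
    "Nice weather today",
]
print([weak_label(p) for p in POSTS])
```

Posts that receive a label can then serve as noisy training data for a downstream classifier, while abstentions remain unlabelled.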

SETA: ubiquitous data and service ecosystem for better metropolitan mobility

The objective of this project is to provide effective solutions for intelligent and sustainable mobility - i.e. the smarter, greener and more efficient movement of people and goods. SETA will provide a radical change from transport as a series of separate modal journeys to an integrated, reactive, intelligent mobility system. It will provide always-on, pervasive services to citizens and businesses, as well as to decision-makers, to support safe, sustainable, effective, efficient and resilient mobility. The project lasts three years (2016-2019), with €5.5m of funding from EU Horizon 2020, of which €1.2m is for Sheffield. Professor Fabio Ciravegna is the project director.

Football Whispers

This is an industrial project that mines information from millions of social media messages and predicts football transfers by analysing related rumours on social media. Football Whispers is the world’s first football transfer predictor, built by football fans for football fans. We sort the dead certs from long shots. We show you who’s really moving and who’s staying put; all with our unique Football Whispers Index.