-
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations
Authors:
Md Tahmid Rahman Laskar,
Sawsan Alqahtani,
M Saiful Bari,
Mizanur Rahman,
Mohammad Abdullah Matin Khan,
Haidar Khan,
Israt Jahan,
Amran Bhuiyan,
Chee Wei Tan,
Md Rizwan Parvez,
Enamul Hoque,
Shafiq Joty,
Jimmy Huang
Abstract:
Large Language Models (LLMs) have recently gained significant attention due to their remarkable capabilities in performing diverse tasks across various domains. However, a thorough evaluation of these models is crucial before deploying them in real-world applications to ensure they produce reliable performance. Despite the well-established importance of evaluating LLMs in the community, the complexity of the evaluation process has led to varied evaluation setups, causing inconsistencies in findings and interpretations. To address this, we systematically review the primary challenges and limitations causing these inconsistencies and unreliable evaluations in various steps of LLM evaluation. Based on our critical review, we present our perspectives and recommendations to ensure LLM evaluations are reproducible, reliable, and robust.
Submitted 4 July, 2024;
originally announced July 2024.
-
Unraveling Contagion Origins: Optimal Estimation through Maximum-Likelihood and Starlike Tree Approximation in Markovian Spreading Models
Authors:
Pei-Duo Yu,
Chee Wei Tan,
Liang Zheng,
Chao Zhao
Abstract:
Identifying the source of epidemic-like spread in networks is crucial for tasks like removing internet viruses or finding the rumor source in online social networks. The challenge lies in tracing the source from a snapshot observation of infected nodes. How do we accurately pinpoint the source? Utilizing snapshot data, we apply a probabilistic approach, focusing on the graph boundary and the observed time, to detect sources via an effective maximum likelihood algorithm. A novel starlike tree approximation extends applicability to general graphs, demonstrating versatility. We highlight the utility of the Gamma function for analyzing the asymptotic behavior of the likelihood ratio between nodes. Comprehensive evaluations confirm algorithmic effectiveness in diverse network scenarios, advancing rumor source detection in large-scale network analysis and information dissemination strategies.
Submitted 21 March, 2024;
originally announced March 2024.
-
Towards Efficient and Certified Recovery from Poisoning Attacks in Federated Learning
Authors:
Yu Jiang,
Jiyuan Shen,
Ziyao Liu,
Chee Wei Tan,
Kwok-Yan Lam
Abstract:
Federated learning (FL) is vulnerable to poisoning attacks, where malicious clients manipulate their updates to affect the global model. Although various methods exist for detecting those clients in FL, identifying malicious clients requires sufficient model updates, and hence by the time malicious clients are detected, FL models have already been poisoned. Thus, a method is needed to recover an accurate global model after malicious clients are identified. Current recovery methods rely on (i) all historical information from participating FL clients and (ii) the initial model unaffected by the malicious clients, leading to a high demand for storage and computational resources. In this paper, we show that highly effective recovery can still be achieved based on (i) selective historical information rather than all historical information and (ii) a historical model that has not been significantly affected by malicious clients rather than the initial model. In this scenario, while maintaining comparable recovery performance, we can accelerate the recovery speed and decrease memory consumption. Following this concept, we introduce Crab, an efficient and certified recovery method, which relies on selective information storage and adaptive model rollback. Theoretically, we demonstrate that the difference between the global model recovered by Crab and the one recovered by train-from-scratch can be bounded under certain assumptions. Our empirical evaluation, conducted across three datasets, multiple machine learning models, and a variety of untargeted and targeted poisoning attacks, reveals that Crab is both accurate and efficient, and consistently outperforms previous approaches in terms of both recovery speed and memory consumption.
Submitted 19 January, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
Series2Vec: Similarity-based Self-supervised Representation Learning for Time Series Classification
Authors:
Navid Mohammadi Foumani,
Chang Wei Tan,
Geoffrey I. Webb,
Hamid Rezatofighi,
Mahsa Salehi
Abstract:
We argue that time series analysis is fundamentally different in nature to either vision or natural language processing with respect to the forms of meaningful self-supervised learning tasks that can be defined. Motivated by this insight, we introduce a novel approach called Series2Vec for self-supervised representation learning. Unlike other self-supervised methods in time series, which carry the risk of positive sample variants being less similar to the anchor sample than series in the negative set, Series2Vec is trained to predict the similarity between two series in both temporal and spectral domains through a self-supervised task. Series2Vec relies primarily on the consistency of the unsupervised similarity step, rather than the intrinsic quality of the similarity measurement, without the need for hand-crafted data augmentation. To further encourage the network to learn similar representations for similar time series, we propose a novel approach that applies order-invariant attention to each representation within the batch during training. Our evaluation of Series2Vec on nine large real-world datasets, along with the UCR/UEA archive, shows enhanced performance compared to current state-of-the-art self-supervised techniques for time series. Additionally, our extensive experiments show that Series2Vec performs comparably with fully supervised training and offers high efficiency in datasets with limited labeled data. Finally, we show that the fusion of Series2Vec with other representation learning models leads to enhanced performance for time series classification. Code and models are open-source at https://github.com/Navidfoumani/Series2Vec.
Submitted 12 December, 2023; v1 submitted 6 December, 2023;
originally announced December 2023.
-
Large Language Model-Driven Classroom Flipping: Empowering Student-Centric Peer Questioning with Flipped Interaction
Authors:
Chee Wei Tan
Abstract:
Reciprocal questioning is essential for effective teaching and learning, fostering active engagement and deeper understanding through collaborative interactions, especially in large classrooms. Can large language models (LLMs), such as OpenAI's GPT (Generative Pre-trained Transformer) series, assist in this? This paper investigates a pedagogical approach of classroom flipping based on flipped interaction in LLMs. Flipped interaction involves using language models to prioritize generating questions instead of answers to prompts. We demonstrate how traditional classroom flipping techniques, including Peer Instruction and Just-in-Time Teaching (JiTT), can be enhanced through flipped interaction techniques, creating student-centric questions for hybrid teaching. In particular, we propose a workflow to integrate prompt engineering with clicker and JiTT quizzes by a poll-prompt-quiz routine and a quiz-prompt-discuss routine to empower students to self-regulate their learning capacity and enable teachers to swiftly personalize training pathways. We develop an LLM-driven chatbot software that digitizes various elements of classroom flipping and facilitates the assessment of students using these routines to deliver peer-generated questions. We applied our LLM-driven chatbot software to teach both undergraduate and graduate students from 2020 to 2022, where it helped bridge the gap between teachers and students in remote teaching during the COVID-19 pandemic years. LLM-driven classroom flipping can be particularly beneficial in large class settings to optimize teaching pace and enable engaging classroom experiences.
Submitted 14 November, 2023;
originally announced November 2023.
-
Copilot for Xcode: Exploring AI-Assisted Programming by Prompting Cloud-based Large Language Models
Authors:
Chee Wei Tan,
Shangxin Guo,
Man Fai Wong,
Ching Nam Hang
Abstract:
This paper presents an AI-assisted programming tool called Copilot for Xcode for program composition and design to support human software developers. By seamlessly integrating cloud-based Large Language Models (LLMs) with Apple's local development environment, Xcode, this tool enhances productivity and unleashes creativity for software development in the Apple software ecosystem (e.g., iOS apps, macOS). Leveraging advanced natural language processing (NLP) techniques, Copilot for Xcode effectively processes source code tokens and patterns within code repositories, enabling features such as code generation, autocompletion, documentation, and error detection. Software developers can also query and make "small" decisions for program composition, some of which can be made simultaneously, and this is facilitated through prompt engineering in a chat interface of Copilot for Xcode. Finally, we present simple case studies as evidence of the effectiveness of utilizing NLP in Xcode to prompt popular LLM services like OpenAI ChatGPT for program composition and design.
Submitted 8 July, 2023;
originally announced July 2023.
-
FedDRL: A Trustworthy Federated Learning Model Fusion Method Based on Staged Reinforcement Learning
Authors:
Leiming Chen,
Weishan Zhang,
Cihao Dong,
Sibo Qiao,
Ziling Huang,
Yuming Nie,
Zhaoxiang Hou,
Chee Wei Tan
Abstract:
Traditional federated learning uses the number of samples to calculate the weights of each client model and uses this fixed weight value to fuse the global model. However, in practical scenarios, each client's device and data heterogeneity leads to differences in the quality of each client's model. Thus the contribution to the global model is not wholly determined by the sample size. In addition, if clients intentionally upload low-quality or malicious models, using these models for aggregation will lead to a severe decrease in global model accuracy. Traditional federated learning algorithms do not address these issues. To solve these problems, we propose FedDRL, a two-stage model fusion approach based on reinforcement learning. In the first stage, our method filters out malicious models and selects trusted client models to participate in the model fusion. In the second stage, the FedDRL algorithm adaptively adjusts the weights of the trusted client models and aggregates the optimal global model. We also define five model fusion scenarios and compare our method with two baseline algorithms in those scenarios. The experimental results show that our algorithm has higher reliability than other algorithms while maintaining accuracy.
Submitted 19 March, 2024; v1 submitted 25 July, 2023;
originally announced July 2023.
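The sample-size weighting that the FedDRL abstract above attributes to traditional federated learning can be sketched as follows. This is a minimal illustration only, not the FedDRL method: the function name `fuse_by_sample_count` and the flat-list model representation are assumptions made for the example.

```python
def fuse_by_sample_count(client_models, client_sizes):
    """Fuse client models with fixed weights proportional to sample counts.

    client_models: list of equal-length lists of floats (hypothetical flat
                   parameter vectors, one per client).
    client_sizes:  number of training samples held by each client.
    """
    total = sum(client_sizes)
    dim = len(client_models[0])
    # Each global parameter is the sample-weighted mean of the client values.
    return [
        sum(model[k] * n for model, n in zip(client_models, client_sizes)) / total
        for k in range(dim)
    ]


global_model = fuse_by_sample_count([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_model)
```

Because the weights depend only on sample counts, a client holding many samples but uploading a poor or malicious model still dominates the fused result, which is the weakness the abstract describes.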
-
Contagion Source Detection in Epidemic and Infodemic Outbreaks: Mathematical Analysis and Network Algorithms
Authors:
Chee Wei Tan,
Pei-Duo Yu
Abstract:
This monograph provides an overview of the mathematical theories and computational algorithm design for contagion source detection in large networks. By leveraging network centrality as a tool for statistical inference, we can accurately identify the source of contagions, trace their spread, and predict future trajectories. This approach provides fundamental insights into surveillance capability and asymptotic behavior of contagion spreading in networks. Mathematical theory and computational algorithms are vital to understanding contagion dynamics, improving surveillance capabilities, and developing effective strategies to prevent the spread of infectious diseases and misinformation.
Submitted 8 July, 2023;
originally announced July 2023.
-
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
Authors:
Man Fai Wong,
Shangxin Guo,
Ching Nam Hang,
Siu Wai Ho,
Chee Wei Tan
Abstract:
This paper provides a comprehensive review of the literature concerning the utilization of Natural Language Processing (NLP) techniques, with a particular focus on transformer-based large language models (LLMs) trained using Big Code, within the domain of AI-assisted programming tasks. LLMs, augmented with software naturalness, have played a crucial role in facilitating AI-assisted programming applications, including code generation, code completion, code translation, code refinement, code summarization, defect detection, and clone detection. Notable examples of such applications include the GitHub Copilot powered by OpenAI's Codex and DeepMind AlphaCode. This paper presents an overview of the major LLMs and their applications in downstream tasks related to AI-assisted programming. Furthermore, it explores the challenges and opportunities associated with incorporating NLP techniques with software naturalness in these applications, with a discussion on extending AI-assisted programming capabilities to Apple's Xcode for mobile software development. This paper also presents the challenges of and opportunities for incorporating NLP techniques with software naturalness, empowering developers with advanced coding assistance and streamlining the software development process.
Submitted 4 July, 2023;
originally announced July 2023.
-
Improving Position Encoding of Transformers for Multivariate Time Series Classification
Authors:
Navid Mohammadi Foumani,
Chang Wei Tan,
Geoffrey I. Webb,
Mahsa Salehi
Abstract:
Transformers have demonstrated outstanding performance in many applications of deep learning. When applied to time series data, transformers require effective position encoding to capture the ordering of the time series data. The efficacy of position encoding in time series analysis is not well-studied and remains controversial, e.g., whether it is better to inject absolute position encoding or relative position encoding, or a combination of them. In order to clarify this, we first review existing absolute and relative position encoding methods when applied in time series classification. We then propose a new absolute position encoding method dedicated to time series data called time Absolute Position Encoding (tAPE). Our new method incorporates the series length and input embedding dimension in absolute position encoding. Additionally, we propose a computationally efficient implementation of Relative Position Encoding (eRPE) to improve generalisability for time series. We then propose a novel multivariate time series classification (MTSC) model combining tAPE/eRPE and convolution-based input encoding named ConvTran to improve the position and data embedding of time series data. The proposed absolute and relative position encoding methods are simple and efficient. They can be easily integrated into transformer blocks and used for downstream tasks such as forecasting, extrinsic regression, and anomaly detection. Extensive experiments on 32 multivariate time-series datasets show that our model is significantly more accurate than state-of-the-art convolution and transformer-based models. Code and models are open-sourced at https://github.com/Navidfoumani/ConvTran.
Submitted 26 May, 2023;
originally announced May 2023.
-
An Approach to Multiple Comparison Benchmark Evaluations that is Stable Under Manipulation of the Comparate Set
Authors:
Ali Ismail-Fawaz,
Angus Dempster,
Chang Wei Tan,
Matthieu Herrmann,
Lynn Miller,
Daniel F. Schmidt,
Stefano Berretti,
Jonathan Weber,
Maxime Devanne,
Germain Forestier,
Geoffrey I. Webb
Abstract:
The measurement of progress using benchmark evaluations is ubiquitous in computer science and machine learning. However, common approaches to analyzing and presenting the results of benchmark comparisons of multiple algorithms over multiple datasets, such as the critical difference diagram introduced by Demšar (2006), have important shortcomings and, we show, are open to both inadvertent and intentional manipulation. To address these issues, we propose a new approach to presenting the results of benchmark comparisons, the Multiple Comparison Matrix (MCM), that prioritizes pairwise comparisons and precludes the means of manipulating experimental results in existing approaches. MCM can be used to show the results of an all-pairs comparison, or to show the results of a comparison between one or more selected algorithms and the state of the art. MCM is implemented in Python and is publicly available.
Submitted 19 May, 2023;
originally announced May 2023.
-
Proximity Forest 2.0: A new effective and scalable similarity-based classifier for time series
Authors:
Matthieu Herrmann,
Chang Wei Tan,
Mahsa Salehi,
Geoffrey I. Webb
Abstract:
Time series classification (TSC) is a challenging task due to the diversity of types of features that may be relevant for different classification tasks, including trends, variance, frequency, magnitude, and various patterns. To address this challenge, several alternative classes of approach have been developed, including similarity-based, features and intervals, shapelets, dictionary, kernel, neural network, and hybrid approaches. While kernel, neural network, and hybrid approaches perform well overall, some specialized approaches are better suited for specific tasks. In this paper, we propose a new similarity-based classifier, Proximity Forest version 2.0 (PF 2.0), which outperforms previous state-of-the-art similarity-based classifiers across the UCR benchmark and outperforms state-of-the-art kernel, neural network, and hybrid methods on specific datasets in the benchmark that are best addressed by similarity-based methods. PF 2.0 incorporates three recent advances in time series similarity measures -- (1) computationally efficient early abandoning and pruning to speed up elastic similarity computations; (2) a new elastic similarity measure, Amerced Dynamic Time Warping (ADTW); and (3) cost function tuning. It rationalizes the set of similarity measures employed, reducing the eight base measures of the original PF to three and using the first derivative transform with all similarity measures, rather than a limited subset. We have implemented both PF 1.0 and PF 2.0 in a single C++ framework, making the PF framework more efficient.
Submitted 13 April, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Deep Learning for Time Series Classification and Extrinsic Regression: A Current Survey
Authors:
Navid Mohammadi Foumani,
Lynn Miller,
Chang Wei Tan,
Geoffrey I. Webb,
Germain Forestier,
Mahsa Salehi
Abstract:
Time Series Classification and Extrinsic Regression are important and challenging machine learning tasks. Deep learning has revolutionized natural language processing and computer vision and holds great promise in other fields such as time series analysis where the relevant features must often be abstracted from the raw data but are not known a priori. This paper surveys the current state of the art in the fast-moving field of deep learning for time series classification and extrinsic regression. We review different network architectures and training methods used for these tasks and discuss the challenges and opportunities when applying deep learning to time series data. We also summarize two critical applications of time series classification and extrinsic regression, human activity recognition and satellite earth observation.
Submitted 19 December, 2023; v1 submitted 5 February, 2023;
originally announced February 2023.
-
EuclidNet: Deep Visual Reasoning for Constructible Problems in Geometry
Authors:
Man Fai Wong,
Xintong Qi,
Chee Wei Tan
Abstract:
In this paper, we present a deep learning-based framework for solving geometric construction problems through visual reasoning, which is useful for automated geometry theorem proving. Constructible problems in geometry often ask for the sequence of straightedge-and-compass constructions to construct a given goal given some initial setup. Our EuclidNet framework leverages the neural network architecture Mask R-CNN to extract the visual features from the initial setup and goal configuration with extra points of intersection, and then generate possible construction steps as intermediary data models that are used as feedback in the training process for further refinement of the construction step sequence. This process is repeated recursively until either a solution is found, in which case we backtrack the path for a step-by-step construction guide, or the problem is identified as unsolvable. Our EuclidNet framework is validated on complex Japanese Sangaku geometry problems, demonstrating its capacity to leverage backtracking for deep visual reasoning of challenging problems.
Submitted 27 December, 2022;
originally announced January 2023.
-
Parameterizing the cost function of Dynamic Time Warping with application to time series classification
Authors:
Matthieu Herrmann,
Chang Wei Tan,
Geoffrey I. Webb
Abstract:
Dynamic Time Warping (DTW) is a popular time series distance measure that aligns the points in two series with one another. These alignments support warping of the time dimension to allow for processes that unfold at differing rates. The distance is the minimum sum of costs of the resulting alignments over any allowable warping of the time dimension. The cost of an alignment of two points is a function of the difference in the values of those points. The original cost function was the absolute value of this difference. Other cost functions have been proposed. A popular alternative is the square of the difference. However, to our knowledge, this is the first investigation of both the relative impacts of using different cost functions and the potential to tune cost functions to different tasks. We do so in this paper by using a tunable cost function λ_γ with parameter γ. We show that higher values of γ place greater weight on larger pairwise differences, while lower values place greater weight on smaller pairwise differences. We demonstrate that training γ significantly improves the accuracy of both the DTW nearest neighbor and Proximity Forest classifiers.
Submitted 28 March, 2023; v1 submitted 24 January, 2023;
originally announced January 2023.
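The DTW definition in the abstract above (minimum sum of pointwise alignment costs over all allowable warpings) can be sketched with a tunable cost as follows. This is a minimal illustration under stated assumptions: the cost is taken to be |a - b| raised to the power γ, which matches the two cases the abstract names (absolute difference at γ = 1, squared difference at γ = 2) but is not confirmed by the abstract as the exact form of λ_γ. No warping window is applied, and the function name `dtw` is hypothetical.

```python
import math


def dtw(a, b, gamma=1.0):
    """DTW distance between two numeric sequences.

    The pointwise cost |x - y| ** gamma is an assumed form of the tunable
    cost function; gamma=1 gives absolute difference, gamma=2 gives squared.
    """
    n, m = len(a), len(b)
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1]) ** gamma
            # Extend the cheapest of the three allowed predecessor alignments.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


print(dtw([0, 0], [1, 3], gamma=1))
print(dtw([0, 0], [1, 3], gamma=2))
```

Raising γ makes the larger of the two pairwise differences (3) account for a bigger share of the total distance, which is the weighting effect the abstract describes.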
-
DeepTrace: Learning to Optimize Contact Tracing in Epidemic Networks with Graph Neural Networks
Authors:
Chee Wei Tan,
Pei-Duo Yu,
Siya Chen,
H. Vincent Poor
Abstract:
Digital contact tracing aims to curb epidemics by identifying and mitigating public health emergencies through technology. Backward contact tracing, which tracks the sources of infection, proved crucial in places like Japan for identifying COVID-19 infections from superspreading events. This paper presents a novel perspective of digital contact tracing as online graph exploration and addresses the forward and backward contact tracing problem as a maximum-likelihood (ML) estimation problem using iterative epidemic network data sampling. The challenge lies in the combinatorial complexity and rapid spread of infections. We introduce DeepTrace, an algorithm based on a Graph Neural Network (GNN) that iteratively updates its estimations as new contact tracing data is collected, learning to optimize the maximum likelihood estimation by utilizing topological features to accelerate learning and improve convergence. The contact tracing process combines either BFS or DFS to expand the network and trace the infection source, ensuring comprehensive and efficient exploration. Additionally, the GNN model is fine-tuned through a two-phase approach: pre-training with synthetic networks to approximate likelihood probabilities and fine-tuning with high-quality data to refine the model. Using COVID-19 variant data, we illustrate that DeepTrace surpasses current methods in identifying superspreaders, providing a robust basis for a scalable digital contact tracing strategy.
Submitted 24 June, 2024; v1 submitted 2 November, 2022;
originally announced November 2022.
-
FRANS: Automatic Feature Extraction for Time Series Forecasting
Authors:
Alexey Chernikov,
Chang Wei Tan,
Pablo Montero-Manso,
Christoph Bergmeir
Abstract:
Feature extraction methods help in dimensionality reduction and capture relevant information. In time series forecasting (TSF), features can be used as auxiliary information to achieve better accuracy. Traditionally, features used in TSF are handcrafted, which requires domain knowledge and significant data-engineering work. In this research, we first introduce a notion of static and dynamic features, which then enables us to develop our autonomous Feature Retrieving Autoregressive Network for Static features (FRANS) that does not require domain knowledge. The method is based on a CNN classifier that is trained to create for each series a collective and unique class representation either from parts of the series or, if class labels are available, from a set of series of the same class. This makes it possible to discriminate between series with similar behaviour but from different classes, and makes the features extracted from the classifier maximally discriminatory. We explore the interpretability of our features, and evaluate the prediction capabilities of the method within the forecasting meta-learning environment FFORMA. Our results show that our features lead to improvement in accuracy in most situations. Once trained, our approach creates features orders of magnitude faster than statistical methods.
Submitted 14 September, 2022;
originally announced September 2022.
-
Epidemic Source Detection in Contact Tracing Networks: Epidemic Centrality in Graphs and Message-Passing Algorithms
Authors:
Pei-Duo Yu,
Chee Wei Tan,
Hung-Lin Fu
Abstract:
We study the epidemic source detection problem in contact tracing networks modeled as a graph-constrained maximum likelihood estimation problem using the susceptible-infected model in epidemiology. Based on a snapshot observation of the infection subgraph, we first study finite degree regular graphs and regular graphs with cycles separately, thereby establishing a mathematical equivalence in maximal likelihood ratio between the case of finite acyclic graphs and that of cyclic graphs. In particular, we show that the optimal solution of the maximum likelihood estimator can be refined to distances on graphs based on a novel statistical distance centrality that captures the optimality of the nonconvex problem. An efficient contact tracing algorithm is then proposed to solve the general case of finite degree-regular graphs with multiple cycles. Our performance evaluation on a variety of graphs shows that our algorithms outperform the existing state-of-the-art heuristics using contact tracing data from the SARS-CoV 2003 and COVID-19 pandemics by correctly identifying the superspreaders on some of the largest superspreading infection clusters in Singapore and Taiwan.
Submitted 25 February, 2022; v1 submitted 18 January, 2022;
originally announced January 2022.
-
An Overview of Healthcare Data Analytics With Applications to the COVID-19 Pandemic
Authors:
Zhe Fei,
Yevgen Ryeznik,
Oleksandr Sverdlov,
Chee Wei Tan,
Weng Kee Wong
Abstract:
In the era of big data, standard analysis tools may be inadequate for making inference and there is a growing need for more efficient and innovative ways to collect, process, analyze and interpret the massive and complex data. We provide an overview of challenges in big data problems and describe how innovative analytical methods, machine learning tools and metaheuristics can tackle general healthcare problems with a focus on the current pandemic. In particular, we give applications of modern digital technology, statistical methods, data platforms and data integration systems to improve diagnosis and treatment of diseases in clinical research and novel epidemiologic tools to tackle infection source problems, such as finding Patient Zero in the spread of epidemics. We make the case that analyzing and interpreting big data is a very challenging task that requires a multi-disciplinary effort to continuously create more effective methodologies and powerful tools to transfer data information into knowledge that enables informed decision making.
Submitted 25 November, 2021;
originally announced November 2021.
-
Classification of multivariate weakly-labelled time-series with attention
Authors:
Surayez Rahman,
Chang Wei Tan
Abstract:
This research identifies a gap in weakly-labelled multivariate time-series classification (TSC), where state-of-the-art TSC models do not perform well. Weakly labelled time-series are time-series containing noise and significant redundancies. In response to this gap, this paper proposes an approach of exploiting context relevance of subsequences from previous subsequences to improve classification accuracy. To achieve this, state-of-the-art Attention algorithms are evaluated in combination with the top CNN models for TSC (FCN and ResNet), in a CNN-LSTM architecture. Attention is a popular strategy for context extraction with exceptional performance in modern sequence-to-sequence tasks. This paper shows how attention algorithms can be used for improved weakly labelled TSC by evaluating models on a multivariate EEG time-series dataset obtained using a commercial Emotiv headset from participants performing various activities while driving. These time-series are segmented into sub-sequences and labelled to allow supervised TSC.
Submitted 17 September, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
MultiRocket: Multiple pooling operators and transformations for fast and effective time series classification
Authors:
Chang Wei Tan,
Angus Dempster,
Christoph Bergmeir,
Geoffrey I. Webb
Abstract:
We propose MultiRocket, a fast time series classification (TSC) algorithm that achieves state-of-the-art performance with a tiny fraction of the time and without the complex ensembling structure of many state-of-the-art methods. MultiRocket improves on MiniRocket, one of the fastest TSC algorithms to date, by adding multiple pooling operators and transformations to improve the diversity of the features generated. In addition to processing the raw input series, MultiRocket also applies first order differences to transform the original series. Convolutions are applied to both representations, and four pooling operators are applied to the convolution outputs. When benchmarked using the University of California Riverside TSC benchmark datasets, MultiRocket is significantly more accurate than MiniRocket, and competitive with the best ranked current method in terms of accuracy, HIVE-COTE 2.0, while being orders of magnitude faster.
Submitted 21 February, 2022; v1 submitted 31 January, 2021;
originally announced February 2021.
-
Time Series Extrinsic Regression
Authors:
Chang Wei Tan,
Christoph Bergmeir,
Francois Petitjean,
Geoffrey I. Webb
Abstract:
This paper studies Time Series Extrinsic Regression (TSER): a regression task of which the aim is to learn the relationship between a time series and a continuous scalar variable; a task closely related to time series classification (TSC), which aims to learn the relationship between a time series and a categorical class label. This task generalizes time series forecasting (TSF), relaxing the requirement that the value predicted be a future value of the input series or primarily depend on more recent values.
In this paper, we motivate and study this task, and benchmark existing solutions and adaptations of TSC algorithms on a novel archive of 19 TSER datasets which we have assembled. Our results show that the state-of-the-art TSC algorithm Rocket, when adapted for regression, achieves the highest overall accuracy compared to adaptations of other TSC algorithms and state-of-the-art machine learning (ML) algorithms such as XGBoost, Random Forest and Support Vector Regression. More importantly, we show that much research is needed in this field to improve the accuracy of ML models. We also find evidence that further research has excellent prospects of improving upon these straightforward baselines.
Submitted 3 February, 2021; v1 submitted 22 June, 2020;
originally announced June 2020.
-
Pain Intensity Estimation from Mobile Video Using 2D and 3D Facial Keypoints
Authors:
Matthew Lee,
Lyndon Kennedy,
Andreas Girgensohn,
Lynn Wilcox,
John Song En Lee,
Chin Wen Tan,
Ban Leong Sng
Abstract:
Managing post-surgical pain is critical for successful surgical outcomes. One of the challenges of pain management is accurately assessing the pain level of patients. Self-reported numeric pain ratings are limited because they are subjective, can be affected by mood, and can influence the patient's perception of pain when making comparisons. In this paper, we introduce an approach that analyzes 2D and 3D facial keypoints of post-surgical patients to estimate their pain intensity level. Our approach leverages the previously unexplored capabilities of a smartphone to capture a dense 3D representation of a person's face as input for pain intensity level estimation. Our contributions are a data collection study with post-surgical patients to collect ground-truth labeled sequences of 2D and 3D facial keypoints for developing a pain estimation algorithm, a pain estimation model that uses multiple instance learning to overcome inherent limitations in facial keypoint sequences, and the preliminary results of the pain estimation model using 2D and 3D features with comparisons of alternate approaches.
Submitted 16 June, 2020;
originally announced June 2020.
-
Monash University, UEA, UCR Time Series Extrinsic Regression Archive
Authors:
Chang Wei Tan,
Christoph Bergmeir,
Francois Petitjean,
Geoffrey I. Webb
Abstract:
Time series research has attracted a lot of interest in the last decade, especially for Time Series Classification (TSC) and Time Series Forecasting (TSF). Research in TSC has greatly benefited from the University of California Riverside and University of East Anglia (UCR/UEA) Time Series Archives. On the other hand, the advancement in Time Series Forecasting relies on time series forecasting competitions such as the Makridakis competitions, NN3 and NN5 Neural Network competitions, and a few Kaggle competitions. Each year, thousands of papers proposing new algorithms for TSC and TSF utilize these benchmarking archives. These algorithms are designed for these specific problems, but may not be useful for tasks such as predicting the heart rate of a person using photoplethysmogram (PPG) and accelerometer data. We refer to this problem as Time Series Extrinsic Regression (TSER), where we are interested in a more general methodology of predicting a single continuous value from univariate or multivariate time series. This prediction can be from the same time series or not directly related to the predictor time series, and does not necessarily need to be a future value or depend heavily on recent values. To the best of our knowledge, research into TSER has received much less attention in the time series research community and there are no models developed for general time series extrinsic regression problems. Most models are developed for a specific problem. Therefore, we aim to motivate and support the research into TSER by introducing the first TSER benchmarking archive. This archive contains 19 datasets from different domains, with varying number of dimensions, unequal length dimensions, and missing values. In this paper, we introduce the datasets in this archive and run an initial benchmark on existing models.
Submitted 19 October, 2020; v1 submitted 19 June, 2020;
originally announced June 2020.
-
Detecting Driver's Distraction using Long-term Recurrent Convolutional Network
Authors:
Chang Wei Tan,
Mahsa Salehi,
Geoffrey Mackellar
Abstract:
In this study we demonstrate a novel Brain Computer Interface (BCI) approach to detect driver distraction events to improve road safety. We use a commercial wireless headset that generates EEG signals from the brain. We collected real EEG signals from participants who undertook a 40-minute driving simulation and were required to perform different tasks while driving. These signals are segmented into short windows and labelled using a time series classification (TSC) model. We studied different TSC approaches and designed a Long-term Recurrent Convolutional Network (LRCN) model for this task. Our results showed that our LRCN model performs better than the state-of-the-art TSC models at detecting driver distraction events.
Submitted 13 April, 2020;
originally announced April 2020.
-
Time series classification for varying length series
Authors:
Chang Wei Tan,
Francois Petitjean,
Eamonn Keogh,
Geoffrey I. Webb
Abstract:
Research into time series classification has tended to focus on the case of series of uniform length. However, it is common for real-world time series data to have unequal lengths. Differing time series lengths may arise from a number of fundamentally different mechanisms. In this work, we identify and evaluate two classes of such mechanisms -- variations in sampling rate relative to the relevant signal and variations between the start and end points of one time series relative to one another. We investigate how time series generated by each of these classes of mechanism are best addressed for time series classification. We perform extensive experiments and provide practical recommendations on how variations in length should be handled in time series classification.
Submitted 9 October, 2019;
originally announced October 2019.
-
Elastic bands across the path: A new framework and methods to lower bound DTW
Authors:
Chang Wei Tan,
Francois Petitjean,
Geoffrey I. Webb
Abstract:
There has been renewed recent interest in developing effective lower bounds for Dynamic Time Warping (DTW) distance between time series. These have many applications in time series indexing, clustering, forecasting, regression and classification. One of the key time series classification algorithms, the nearest neighbor algorithm with DTW distance (NN-DTW) is very expensive to compute, due to the quadratic complexity of DTW. Lower bound search can speed up NN-DTW substantially. An effective and tight lower bound quickly prunes off unpromising nearest neighbor candidates from the search space and minimises the number of the costly DTW computations. The speed up provided by lower bound search becomes increasingly critical as training set size increases. Different lower bounds provide different trade-offs between computation time and tightness. Most existing lower bounds interact with DTW warping window sizes. They are very tight and effective at smaller warping window sizes, but become looser as the warping window increases, thus reducing the pruning effectiveness for NN-DTW. In this work, we present a new class of lower bounds that are tighter than the popular Keogh lower bound, while requiring similar computation time. Our new lower bounds take advantage of the DTW boundary condition, monotonicity and continuity constraints to create a tighter lower bound. Of particular significance, they remain relatively tight even for large windows. A single parameter to these new lower bounds controls the speed-tightness trade-off. We demonstrate that these new lower bounds provide an exceptional balance between computation time and tightness for the NN-DTW time series classification task, resulting in greatly improved efficiency for NN-DTW lower bound search.
Submitted 14 February, 2019; v1 submitted 28 August, 2018;
originally announced August 2018.
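For readers unfamiliar with lower-bound search, the following is a minimal sketch of the baseline idea the abstract above builds on: the Keogh lower bound used to prune candidates in NN-DTW. It is not the paper's new bound. The function names (`dtw`, `lb_keogh`, `nn_dtw`), the squared-difference cost, and the assumption of equal-length series with a Sakoe-Chiba warping window are illustrative choices, not taken from the source.

```python
import math

def dtw(a, b, window):
    """Windowed DTW with squared point costs (assumes len(a) == len(b))."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells within the warping window around the diagonal are reachable.
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def lb_keogh(query, candidate, window):
    """Keogh lower bound: squared distance from each query point to the
    min/max envelope of the candidate within the warping window."""
    total = 0.0
    n = len(candidate)
    for i, q in enumerate(query):
        segment = candidate[max(0, i - window):min(n, i + window + 1)]
        upper, lower = max(segment), min(segment)
        if q > upper:
            total += (q - upper) ** 2
        elif q < lower:
            total += (q - lower) ** 2
    return total

def nn_dtw(query, train, window):
    """Nearest-neighbour search that skips the full DTW computation whenever
    the lower bound already meets or exceeds the best distance so far."""
    best_dist, best_idx, pruned = math.inf, -1, 0
    for idx, series in enumerate(train):
        if lb_keogh(query, series, window) >= best_dist:
            pruned += 1
            continue
        d = dtw(query, series, window)
        if d < best_dist:
            best_dist, best_idx = d, idx
    return best_idx, best_dist, pruned

train = [[0, 0, 0, 0], [1, 2, 3, 4], [10, 10, 10, 10]]
best_idx, best_dist, pruned = nn_dtw([1, 2, 3, 5], train, window=1)
print(best_idx, best_dist, pruned)
```

Because the bound never exceeds the true windowed DTW distance, pruning does not change the nearest neighbour that is returned; a tighter bound simply prunes more candidates.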
-
Joint Beamforming and Power Control in Coordinated Multicell: Max-Min Duality, Effective Network and Large System Transition
Authors:
Yichao Huang,
Chee Wei Tan,
Bhaskar D. Rao
Abstract:
This paper studies joint beamforming and power control in a coordinated multicell downlink system that serves multiple users per cell to maximize the minimum weighted signal-to-interference-plus-noise ratio. The optimal solution and a distributed algorithm with geometrically fast convergence rate are derived by employing the nonlinear Perron-Frobenius theory and the multicell network duality. The iterative algorithm, though operating in a distributed manner, still requires instantaneous power updates within the coordinated cluster through the backhaul. The backhaul information exchange and message passing may become prohibitive as the number of transmit antennas and the number of users increase. In order to derive an asymptotically optimal solution, random matrix theory is leveraged to design a distributed algorithm that only requires statistical information. The advantage of our approach is that there is no instantaneous power update through the backhaul. Moreover, by using nonlinear Perron-Frobenius theory and random matrix theory, an effective primal network and an effective dual network are proposed to characterize and interpret the asymptotic solution.
Submitted 27 June, 2013; v1 submitted 11 March, 2013;
originally announced March 2013.
-
Rooting out the Rumor Culprit from Suspects
Authors:
Wenxiang Dong,
Wenyi Zhang,
Chee Wei Tan
Abstract:
Suppose that a rumor originating from a single source among a set of suspects spreads in a network. How can this rumor source be rooted out? With the a priori knowledge of suspect nodes and an observation of infected nodes, we construct a maximum a posteriori (MAP) estimator to identify the rumor source using the susceptible-infected (SI) model. The a priori suspect set and its associated connectivity bring about new ingredients to the problem, and thus we propose to use local rumor center, a generalized concept based on rumor centrality, to identify the source from suspects. For regular tree-type networks of node degree δ, we characterize Pc(n), the correct detection probability of the estimator upon observing n infected nodes, in both the finite and asymptotic regimes. First, when every infected node is a suspect, Pc(n) asymptotically grows from 0.25 to 0.307 with δ from 3 to infinity, a result first established in Shah and Zaman (2011, 2012) via a different approach; and it monotonically decreases with n and increases with δ. Second, when the suspects form a connected subgraph of the network, Pc(n) asymptotically significantly exceeds the a priori probability if δ>2, and reliable detection is achieved as δ becomes large; furthermore, it monotonically decreases with n and increases with δ. Third, when there are only two suspects, Pc(n) is asymptotically at least 0.75 if δ>2; and it increases with the distance between the two suspects. Fourth, when there are multiple suspects, among all possible connection patterns, the pattern in which they form a connected subgraph of the network achieves the smallest detection probability. Our analysis leverages ideas from Pólya's urn model in probability theory and sheds insight into the behavior of the rumor spreading process not only in the asymptotic regime but also for the general finite-n regime.
Submitted 9 May, 2013; v1 submitted 26 January, 2013;
originally announced January 2013.
-
Optimal Charging of Electric Vehicles in Smart Grid: Characterization and Valley-Filling Algorithms
Authors:
Niangjun Chen,
Chee Wei Tan,
Tony Q. S. Quek
Abstract:
Electric vehicles (EVs) offer an attractive long-term solution to reduce the dependence on fossil fuel and greenhouse gas emission. However, a fleet of EVs with different battery charging rate constraints that is distributed across a smart power grid network requires a coordinated charging schedule to minimize the power generation and EV charging costs. In this paper, we study a joint optimal power flow (OPF) and EV charging problem that augments the OPF problem with charging EVs over time. While the OPF problem is generally nonconvex and nonsmooth, it has recently been shown that the OPF problem can be solved optimally for most practical power networks using its convex dual problem. Building on this zero duality gap result, we study a nested optimization approach to decompose the joint OPF and EV charging problem. We characterize the optimal offline EV charging schedule to be a valley-filling profile, which allows us to develop an optimal offline algorithm with computational complexity that is significantly lower than centralized interior point solvers. Furthermore, we propose a decentralized online algorithm that dynamically tracks the valley-filling profile. Our algorithms are evaluated on the IEEE 14 bus system, and the simulations show that the online algorithm performs near-optimally ($<1\%$ relative difference from the offline optimal solution) under different settings.
Submitted 7 April, 2013; v1 submitted 23 August, 2012;
originally announced August 2012.
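As an illustration of what a valley-filling profile looks like, here is a minimal sketch under strongly simplified assumptions: a single aggregate EV load, a known base-load profile, no per-vehicle charging rate limits, and no power-flow constraints. It is not the paper's algorithm. The function name `valley_fill` and the bisection on a common "fill level" are illustrative choices, not taken from the source.

```python
def valley_fill(base_load, energy, tol=1e-9):
    """Find a level such that topping every time slot up to that level
    delivers exactly `energy` in total, then return the level and the
    per-slot charging amounts max(0, level - base_load[t])."""
    lo = min(base_load)            # at this level nothing is delivered
    hi = max(base_load) + energy   # at this level at least `energy` is delivered
    while hi - lo > tol:
        level = (lo + hi) / 2
        delivered = sum(max(0.0, level - b) for b in base_load)
        if delivered < energy:
            lo = level
        else:
            hi = level
    level = (lo + hi) / 2
    return level, [max(0.0, level - b) for b in base_load]

# Hypothetical base load over four time slots and a total EV demand of 3 units.
level, charging = valley_fill([5.0, 3.0, 2.0, 4.0], energy=3.0)
print(level, charging)
```

Charging is concentrated in the slots with the lowest base load, so the combined profile is flattened from below, which is the sense in which the demand "fills the valleys".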
-
On the Sum-Capacity with Successive Decoding in Interference Channels
Authors:
Yue Zhao,
Chee Wei Tan,
A. Salman Avestimehr,
Suhas N. Diggavi,
Gregory J. Pottie
Abstract:
In this paper, we investigate the sum-capacity of the two-user Gaussian interference channel with Gaussian superposition coding and successive decoding. We first examine an approximate deterministic formulation of the problem, and introduce the complementarity conditions that capture the use of Gaussian coding and successive decoding. In the deterministic channel problem, we find the constrained sum-capacity and its achievable schemes with the minimum number of messages, first in symmetric channels, and then in general asymmetric channels. We show that the constrained sum-capacity oscillates as a function of the cross link gain parameters between the information theoretic sum-capacity and the sum-capacity with interference treated as noise. Furthermore, we show that if the number of messages of either of the two users is fewer than the minimum number required to achieve the constrained sum-capacity, the maximum achievable sum-rate drops to that with interference treated as noise. We provide two algorithms (a simple one and a finer one) to translate the optimal schemes in the deterministic channel model to the Gaussian channel model. We also derive two upper bounds on the sum-capacity of the Gaussian Han-Kobayashi schemes, which automatically upper bound the sum-capacity using successive decoding of Gaussian codewords. Numerical evaluations show that, similar to the deterministic channel results, the constrained sum-capacity in the Gaussian channels oscillates between the sum-capacity with Han-Kobayashi schemes and that with single message schemes.
Submitted 28 March, 2011; v1 submitted 28 February, 2011;
originally announced March 2011.