Showing 1–50 of 63 results for author: Wong, N

Searching in archive cs.
  1. arXiv:2408.10631  [pdf, other]

    cs.LG cs.AI cs.CL

    LLM-Barber: Block-Aware Rebuilder for Sparsity Mask in One-Shot for Large Language Models

    Authors: Yupeng Su, Ziyi Guan, Xiaoqun Liu, Tianlai Jin, Dongkuan Wu, Graziano Chesi, Ngai Wong, Hao Yu

    Abstract: Large language models (LLMs) have grown significantly in scale, leading to a critical need for efficient model pruning techniques. Existing post-training pruning techniques primarily focus on measuring weight importance on converged dense models to determine salient weights to retain. However, they often overlook the changes in weight importance during the pruning process, which can lead to perfor…

    Submitted 20 August, 2024; originally announced August 2024.

  2. arXiv:2407.13623  [pdf, other]

    cs.CL cs.AI

    Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

    Authors: Chaofan Tao, Qian Liu, Longxu Dou, Niklas Muennighoff, Zhongwei Wan, Ping Luo, Min Lin, Ngai Wong

    Abstract: Research on scaling large language models (LLMs) has primarily focused on model parameters and training data size, overlooking the role of vocabulary size. We investigate how vocabulary size impacts LLM scaling laws by training models ranging from 33M to 3B parameters on up to 500B characters with various vocabulary configurations. We propose three complementary approaches for predicting the compu…

    Submitted 26 July, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: 26 pages, 12 figures. Add more related work

  3. arXiv:2406.11909  [pdf, other]

    cs.LG cs.AI

    Mixture-of-Subspaces in Low-Rank Adaptation

    Authors: Taiqiang Wu, Jiahao Wang, Zhe Zhao, Ngai Wong

    Abstract: In this paper, we introduce a subspace-inspired Low-Rank Adaptation (LoRA) method, which is computationally efficient, easy to implement, and readily applicable to large language, multimodal, and diffusion models. Initially, we equivalently decompose the weights of LoRA into two subspaces, and find that simply mixing them can enhance performance. To study such a phenomenon, we revisit it through a…

    Submitted 5 July, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: work in progress

  4. arXiv:2405.12398  [pdf, other]

    cs.LG

    ASMR: Activation-sharing Multi-resolution Coordinate Networks For Efficient Inference

    Authors: Jason Chun Lok Li, Steven Tin Sui Luo, Le Xu, Ngai Wong

    Abstract: Coordinate network or implicit neural representation (INR) is a fast-emerging method for encoding natural signals (such as images and videos) with the benefits of a compact neural representation. While numerous methods have been proposed to increase the encoding capabilities of an INR, an often overlooked aspect is the inference efficiency, usually measured in multiply-accumulate (MAC) count. This…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: ICLR 2024 (v3: 21 pages, 11 figures, Project Page: https://github.com/stevolopolis/asmr.git)

  5. arXiv:2405.10531  [pdf, other]

    cs.LG cs.CV

    Nonparametric Teaching of Implicit Neural Representations

    Authors: Chen Zhang, Steven Tin Sui Luo, Jason Chun Lok Li, Yik-Chung Wu, Ngai Wong

    Abstract: We investigate the learning of implicit neural representation (INR) using an overparameterized multilayer perceptron (MLP) via a novel nonparametric teaching perspective. The latter offers an efficient example selection framework for teaching nonparametrically defined (viz. non-closed-form) target functions, such as image functions defined by 2D grids of pixels. To address the costly training of I…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: ICML 2024 (24 pages, 13 figures)

  6. arXiv:2405.05573  [pdf, other]

    cs.CV cs.CR

    Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive Triggers

    Authors: Binxiao Huang, Jason Chun Lok, Chang Liu, Ngai Wong

    Abstract: Poisoning-based backdoor attacks expose vulnerabilities in the data preparation stage of deep neural network (DNN) training. The DNNs trained on the poisoned dataset will be embedded with a backdoor, making them behave well on clean data while outputting malicious predictions whenever a trigger is applied. To exploit the abundant information contained in the input data to output label mapping, our…

    Submitted 9 May, 2024; originally announced May 2024.

  7. arXiv:2405.02356  [pdf, other]

    cs.LG cs.AI

    Stochastic Multivariate Universal-Radix Finite-State Machine: a Theoretically and Practically Elegant Nonlinear Function Approximator

    Authors: Xincheng Feng, Guodong Shen, Jianhao Hu, Meng Li, Ngai Wong

    Abstract: Nonlinearities are crucial for capturing complex input-output relationships especially in deep neural networks. However, nonlinear functions often incur various hardware and compute overheads. Meanwhile, stochastic computing (SC) has emerged as a promising approach to tackle this challenge by trading output precision for hardware simplicity. To this end, this paper proposes a first-of-its-kind sto…

    Submitted 2 May, 2024; originally announced May 2024.

  8. arXiv:2404.10179  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG

    Scaling Instructable Agents Across Many Simulated Worlds

    Authors: SIMA Team, Maria Abi Raad, Arun Ahuja, Catarina Barros, Frederic Besse, Andrew Bolt, Adrian Bolton, Bethanie Brownfield, Gavin Buttimore, Max Cant, Sarah Chakera, Stephanie C. Y. Chan, Jeff Clune, Adrian Collister, Vikki Copeman, Alex Cullum, Ishita Dasgupta, Dario de Cesare, Julia Di Trapani, Yani Donchev, Emma Dunleavy, Martin Engelcke, Ryan Faulkner, Frankie Garcia, Charles Gbadamosi, et al. (68 additional authors not shown)

    Abstract: Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground language in perception and embodied actions, in order to accomplish complex tasks. The Scalable, Instructable, Multiworld Agent (SIMA) project tackles this by training agents to follow free-form instructio…

    Submitted 17 April, 2024; v1 submitted 13 March, 2024; originally announced April 2024.

  9. arXiv:2404.02657  [pdf, other]

    cs.CL cs.AI

    Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models

    Authors: Taiqiang Wu, Chaofan Tao, Jiahao Wang, Zhe Zhao, Ngai Wong

    Abstract: Kullback-Leibler divergence has been widely used in Knowledge Distillation (KD) to compress Large Language Models (LLMs). Contrary to prior assertions that reverse Kullback-Leibler (RKL) divergence is mode-seeking and thus preferable over the mean-seeking forward Kullback-Leibler (FKL) divergence, this study empirically and theoretically demonstrates that neither mode-seeking nor mean-seeking prope…

    Submitted 16 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: work in progress

  10. arXiv:2403.19238  [pdf, other]

    cs.CV cs.AI eess.IV

    Taming Lookup Tables for Efficient Image Retouching

    Authors: Sidi Yang, Binxiao Huang, Mingdeng Cao, Yatai Ji, Hanzhong Guo, Ngai Wong, Yujiu Yang

    Abstract: The widespread use of high-definition screens in edge devices, such as end-user cameras, smartphones, and televisions, is spurring a significant demand for image enhancement. Existing enhancement models often optimize for high performance while falling short of reducing hardware inference time and power consumption, especially on edge devices with constrained computing and storage resources. To th…

    Submitted 13 July, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted by ECCV2024

  11. arXiv:2402.14866  [pdf, other]

    cs.LG cs.AI cs.CL

    APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models

    Authors: Ziyi Guan, Hantao Huang, Yupeng Su, Hong Huang, Ngai Wong, Hao Yu

    Abstract: Large Language Models (LLMs) have greatly advanced the natural language processing paradigm. However, the high computational load and huge model sizes pose a grand challenge for deployment on edge devices. To this end, we propose APTQ (Attention-aware Post-Training Mixed-Precision Quantization) for LLMs, which considers not only the second-order information of each layer's weights, but also, for t…

    Submitted 15 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: 6 pages, 2 figures, published to DAC 2024: 61st IEEE/ACM Design Automation Conference. (DAC'24)

  12. arXiv:2402.11417  [pdf, other]

    cs.CL cs.AI cs.LG

    LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models

    Authors: Yifan Yang, Jiajun Zhou, Ngai Wong, Zheng Zhang

    Abstract: Various parameter-efficient fine-tuning (PEFT) techniques have been proposed to enable computationally efficient fine-tuning while maintaining model performance. However, existing PEFT methods are still limited by the growing number of trainable parameters with the rapid deployment of Large Language Models (LLMs). To address this challenge, we present LoRETTA, an ultra-parameter-efficient framewor…

    Submitted 17 February, 2024; originally announced February 2024.

  13. arXiv:2312.17018  [pdf, other]

    cs.CV cs.LG

    Learning Spatially Collaged Fourier Bases for Implicit Neural Representation

    Authors: Jason Chun Lok Li, Chang Liu, Binxiao Huang, Ngai Wong

    Abstract: Existing approaches to Implicit Neural Representation (INR) can be interpreted as a global scene representation via a linear combination of Fourier bases of different frequencies. However, such universal basis functions can limit the representation capability in local regions where a specific component is unnecessary, resulting in unpleasant artifacts. To this end, we introduce a learnable spatial…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 11 pages, 13 figures, Accepted at the 38th AAAI Conference on Artificial Intelligence (AAAI-24)

  14. arXiv:2312.09922  [pdf, other]

    cs.CV cs.AI

    A Unifying Tensor View for Lightweight CNNs

    Authors: Jason Chun Lok Li, Rui Lin, Jiajun Zhou, Edmund Yin Mun Lam, Ngai Wong

    Abstract: Despite the decomposition of convolutional kernels for lightweight CNNs being well studied, existing works that rely on tensor network diagrams or hyperdimensional abstraction lack geometry intuition. This work devises a new perspective by linking a 3D-reshaped kernel tensor to its various slice-wise and rank-1 decompositions, permitting a straightforward connection between various tensor approxim…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: 4 pages, 3 figures, accepted in 2023 IEEE 15th International Conference on ASIC (ASICON 2023)

  15. arXiv:2312.06101  [pdf, other]

    eess.IV cs.CV

    Hundred-Kilobyte Lookup Tables for Efficient Single-Image Super-Resolution

    Authors: Binxiao Huang, Jason Chun Lok Li, Jie Ran, Boyu Li, Jiajun Zhou, Dahai Yu, Ngai Wong

    Abstract: Conventional super-resolution (SR) schemes make heavy use of convolutional neural networks (CNNs), which involve intensive multiply-accumulate (MAC) operations, and require specialized hardware such as graphics processing units. This contradicts the regime of edge AI that often runs on devices strained by power, computing, and storage resources. Such a challenge has motivated a series of lookup ta…

    Submitted 8 May, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  16. arXiv:2311.08125  [pdf, other]

    cs.LG

    Lite it fly: An All-Deformable-Butterfly Network

    Authors: Rui Lin, Jason Chun Lok Li, Jiajun Zhou, Binxiao Huang, Jie Ran, Ngai Wong

    Abstract: Most deep neural networks (DNNs) consist fundamentally of convolutional and/or fully connected layers, wherein the linear transform can be cast as the product between a filter matrix and a data matrix obtained by arranging feature tensors into columns. The lately proposed deformable butterfly (DeBut) decomposes the filter matrix into generalized, butterflylike factors, thus achieving network compr…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 7 pages, 3 figures, accepted as a brief paper in IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

  17. arXiv:2306.14262  [pdf, other]

    cs.CV

    A Spectral Perspective towards Understanding and Improving Adversarial Robustness

    Authors: Binxiao Huang, Rui Lin, Chaofan Tao, Ngai Wong

    Abstract: Deep neural networks (DNNs) are incredibly vulnerable to crafted, imperceptible adversarial perturbations. While adversarial training (AT) has proven to be an effective defense approach, the AT mechanism for robustness improvement is not fully understood. This work investigates AT from a spectral perspective, adding new insights to the design of effective defenses. In particular, we show that AT i…

    Submitted 25 June, 2023; originally announced June 2023.

  18. arXiv:2306.11123  [pdf, other]

    eess.SP cs.CV

    To Fold or Not to Fold: Graph Regularized Tensor Train for Visual Data Completion

    Authors: Le Xu, Lei Cheng, Ngai Wong, Yik-Chung Wu

    Abstract: Tensor train (TT) representation has achieved tremendous success in visual data completion tasks, especially when it is combined with tensor folding. However, folding an image or video tensor breaks the original data structure, leading to local information loss as nearby pixels may be assigned into different dimensions and become far away from each other. In this paper, to fully preserve the local…

    Submitted 19 June, 2023; originally announced June 2023.

  19. arXiv:2305.15365  [pdf, other]

    cs.CV

    Boundary Attention Mapping (BAM): Fine-grained saliency maps for segmentation of Burn Injuries

    Authors: Mahla Abdolahnejad, Justin Lee, Hannah Chan, Alex Morzycki, Olivier Ethier, Anthea Mo, Peter X. Liu, Joshua N. Wong, Colin Hong, Rakesh Joshi

    Abstract: Burn injuries can result from mechanisms such as thermal, chemical, and electrical insults. A prompt and accurate assessment of burns is essential for deciding definitive clinical treatments. Currently, the primary approach for burn assessments, via visual and tactile observations, is approximately 60%-80% accurate. The gold standard is biopsy and a close second would be non-invasive methods like…

    Submitted 24 May, 2023; originally announced May 2023.

  20. arXiv:2305.09098  [pdf, other]

    cs.CL cs.LG

    Weight-Inherited Distillation for Task-Agnostic BERT Compression

    Authors: Taiqiang Wu, Cheng Hou, Shanshan Lao, Jiayi Li, Ngai Wong, Zhe Zhao, Yujiu Yang

    Abstract: Knowledge Distillation (KD) is a predominant approach for BERT compression. Previous KD-based methods focus on designing extra alignment losses for the student model to mimic the behavior of the teacher model. These methods transfer the knowledge in an indirect way. In this paper, we propose a novel Weight-Inherited Distillation (WID), which directly transfers knowledge from the teacher. WID does…

    Submitted 20 March, 2024; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: 9 pages, 4 figures, NAACL2024 findings

  21. arXiv:2303.14893  [pdf, other]

    cs.CV

    Context-Aware Transformer for 3D Point Cloud Automatic Annotation

    Authors: Xiaoyan Qian, Chang Liu, Xiaojuan Qi, Siew-Chong Tan, Edmund Lam, Ngai Wong

    Abstract: 3D automatic annotation has received increased attention since manually annotating 3D point clouds is laborious. However, existing methods are usually complicated, e.g., pipelined training for 3D foreground/background segmentation, cylindrical object proposals, and point completion. Furthermore, they often overlook the inter-object feature relation that is particularly informative to hard samples…

    Submitted 26 March, 2023; originally announced March 2023.

  22. arXiv:2303.13763  [pdf, other]

    cs.LG cs.AI

    Edge-free but Structure-aware: Prototype-Guided Knowledge Distillation from GNNs to MLPs

    Authors: Taiqiang Wu, Zhe Zhao, Jiahao Wang, Xingyu Bai, Lei Wang, Ngai Wong, Yujiu Yang

    Abstract: Distilling high-accuracy Graph Neural Networks (GNNs) to low-latency multilayer perceptrons (MLPs) on graph tasks has become a hot research topic. However, MLPs rely exclusively on the node features and fail to capture the graph structural information. Previous methods address this issue by processing graph edges into extra inputs for MLPs, but such graph structures may be unavailable for various…

    Submitted 27 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: 8 pages, 4 figures, 9 tables

  23. DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference

    Authors: Jiajun Zhou, Jiajun Wu, Yizhao Gao, Yuhao Ding, Chaofan Tao, Boyu Li, Fengbin Tu, Kwang-Ting Cheng, Hayden Kwok-Hay So, Ngai Wong

    Abstract: To accelerate the inference of deep neural networks (DNNs), quantization with low-bitwidth numbers is actively researched. A prominent challenge is to quantize the DNN models into low-bitwidth numbers without significant accuracy degradation, especially at very low bitwidths (< 8 bits). This work targets an adaptive data representation with variable-length encoding called DyBit. DyBit can dynamica…

    Submitted 24 February, 2023; originally announced February 2023.

  24. arXiv:2212.12732  [pdf, other]

    cs.CV

    Frequency Regularization for Improving Adversarial Robustness

    Authors: Binxiao Huang, Chaofan Tao, Rui Lin, Ngai Wong

    Abstract: Deep neural networks are incredibly vulnerable to crafted, human-imperceptible adversarial perturbations. Although adversarial training (AT) has proven to be an effective defense approach, we find that the AT-trained models heavily rely on the input low-frequency content for judgment, accounting for the low standard accuracy. To close the large gap between the standard and robust accuracies during…

    Submitted 24 December, 2022; originally announced December 2022.

    Comments: accepted by AAAI 2023 workshop

  25. arXiv:2211.11602  [pdf, other]

    cs.LG cs.HC cs.MA

    Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback

    Authors: Josh Abramson, Arun Ahuja, Federico Carnevale, Petko Georgiev, Alex Goldin, Alden Hung, Jessica Landon, Jirka Lhotka, Timothy Lillicrap, Alistair Muldal, George Powell, Adam Santoro, Guy Scully, Sanjana Srivastava, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu

    Abstract: An important goal in artificial intelligence is to create agents that can both interact naturally with humans and learn from their feedback. Here we demonstrate how to use reinforcement learning from human feedback (RLHF) to improve upon simulated, embodied agents trained to a base level of competency with imitation learning. First, we collected data of humans interacting with agents in a simulate…

    Submitted 21 November, 2022; originally announced November 2022.

  26. arXiv:2210.08701  [pdf, other]

    cs.LG cs.CV

    ODG-Q: Robust Quantization via Online Domain Generalization

    Authors: Chaofan Tao, Ngai Wong

    Abstract: Quantizing neural networks to low-bitwidth is important for model deployment on resource-limited edge hardware. Although a quantized network has a smaller model size and memory footprint, it is fragile to adversarial attacks. However, few methods study the robustness and training efficiency of quantized networks. To this end, we propose a new method by recasting robust quantization as an online do…

    Submitted 16 October, 2022; originally announced October 2022.

  27. arXiv:2208.13571  [pdf, other]

    cs.LG cs.AI

    PECAN: A Product-Quantized Content Addressable Memory Network

    Authors: Jie Ran, Rui Lin, Jason Chun Lok Li, Jiajun Zhou, Ngai Wong

    Abstract: A novel deep neural network (DNN) architecture is proposed wherein the filtering and linear transform are realized solely with product quantization (PQ). This results in a natural implementation via content addressable memory (CAM), which transcends regular DNN layer operations and requires only simple table lookup. Two schemes are developed for the end-to-end PQ prototype training, namely, throug…

    Submitted 13 August, 2022; originally announced August 2022.

  28. arXiv:2207.09805  [pdf, other]

    cs.CV

    Multimodal Transformer for Automatic 3D Annotation and Object Detection

    Authors: Chang Liu, Xiaoyan Qian, Binxiao Huang, Xiaojuan Qi, Edmund Lam, Siew-Chong Tan, Ngai Wong

    Abstract: Despite a growing number of datasets being collected for training 3D object detection models, significant human effort is still required to annotate 3D boxes on LiDAR scans. To automate the annotation and facilitate the production of various customized datasets, we propose an end-to-end multimodal transformer (MTrans) autolabeler, which leverages both LiDAR scans and images to generate precise 3D…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: 14 pages, 4 figures

  29. arXiv:2206.07807  [pdf, other]

    cs.CL

    How Adults Understand What Young Children Say

    Authors: Stephan C. Meylan, Ruthe Foushee, Nicole H. Wong, Elika Bergelson, Roger P. Levy

    Abstract: Children's early speech often bears little resemblance to that of adults, and yet parents and other caregivers are able to interpret that speech and react accordingly. Here we investigate how these adult inferences as listeners reflect sophisticated beliefs about what children are trying to communicate, as well as how children are likely to pronounce words. Using a Bayesian framework for modeling…

    Submitted 16 March, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: 24 pages, 8 figures, 3 tables

  30. arXiv:2205.13274  [pdf, other]

    cs.LG cs.AI

    Evaluating Multimodal Interactive Agents

    Authors: Josh Abramson, Arun Ahuja, Federico Carnevale, Petko Georgiev, Alex Goldin, Alden Hung, Jessica Landon, Timothy Lillicrap, Alistair Muldal, Blake Richards, Adam Santoro, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan

    Abstract: Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research. However, evaluating these interactions is challenging: collecting online human-agent interactions is slow and expensive, yet faster proxy metrics often do not correlate well with interactive evaluation. In this paper, we assess the merits of these existing evaluation metrics and prese…

    Submitted 14 July, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

  31. arXiv:2205.09065  [pdf, other]

    cs.LG eess.SP

    Multilayer Perceptron Based Stress Evolution Analysis under DC Current Stressing for Multi-segment Wires

    Authors: Tianshu Hou, Peining Zhen, Ngai Wong, Quan Chen, Guoyong Shi, Shuqi Wang, Hai-Bao Chen

    Abstract: Electromigration (EM) is one of the major concerns in the reliability analysis of very large scale integration (VLSI) systems due to the continuous technology scaling. Accurately predicting the time-to-failure of integrated circuits (IC) becomes increasingly important for modern IC design. However, traditional methods are often not sufficiently accurate, leading to undesirable over-design especial…

    Submitted 17 May, 2022; originally announced May 2022.

    Comments: The paper will be published in IEEE Transactions on COMPUTER-AIDED DESIGN of Integrated Circuits and Systems

  32. arXiv:2204.06907  [pdf, other]

    eess.AS cs.SD

    Lombard Effect for Bilingual Speakers in Cantonese and English: importance of spectro-temporal features

    Authors: Maximilian Karl Scharf, Sabine Hochmuth, Lena L. N. Wong, Birger Kollmeier, Anna Warzybok

    Abstract: For a better understanding of the mechanisms underlying speech perception and the contribution of different signal features, computational models of speech recognition have a long tradition in hearing research. Due to the diverse range of situations in which speech needs to be recognized, these models need to be generalizable across many acoustic conditions, speakers, and languages. This contribut…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: Submitted to INTERSPEECH2022

  33. arXiv:2203.15700  [pdf, other]

    cs.CV

    MAP-Gen: An Automated 3D-Box Annotation Flow with Multimodal Attention Point Generator

    Authors: Chang Liu, Xiaoyan Qian, Xiaojuan Qi, Edmund Y. Lam, Siew-Chong Tan, Ngai Wong

    Abstract: Manually annotating 3D point clouds is laborious and costly, limiting the training data preparation for deep learning in real-world object detection. While a few previous studies tried to automatically generate 3D bounding boxes from weak labels such as 2D boxes, the quality is sub-optimal compared to human annotators. This work proposes a novel autolabeler, called multimodal attention point gener…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: 6 pages, 4 figures, accepted by ICPR 2022

  34. arXiv:2203.15189  [pdf, other]

    cs.CV

    Coarse to Fine: Image Restoration Boosted by Multi-Scale Low-Rank Tensor Completion

    Authors: Rui Lin, Cong Chen, Ngai Wong

    Abstract: Existing low-rank tensor completion (LRTC) approaches aim at restoring a partially observed tensor by imposing a global low-rank constraint on the underlying completed tensor. However, such a global rank assumption suffers the trade-off between restoring the originally details-lacking parts and neglecting the potentially complex objects, making the completion performance unsatisfactory on both sid…

    Submitted 28 March, 2022; originally announced March 2022.

  35. arXiv:2203.13556  [pdf, other]

    cs.CV cs.LG

    Deformable Butterfly: A Highly Structured and Sparse Linear Transform

    Authors: Rui Lin, Jie Ran, King Hung Chiu, Graziano Chesi, Ngai Wong

    Abstract: We introduce a new kind of linear transform named Deformable Butterfly (DeBut) that generalizes the conventional butterfly matrices and can be adapted to various input-output dimensions. It inherits the fine-to-coarse-grained learnable hierarchy of traditional butterflies and when deployed to neural networks, the prominent structures and sparsity in a DeBut layer constitutes a new way for network…

    Submitted 25 March, 2022; originally announced March 2022.

  36. arXiv:2203.10705  [pdf, other]

    cs.CL cs.CV

    Compression of Generative Pre-trained Language Models via Quantization

    Authors: Chaofan Tao, Lu Hou, Wei Zhang, Lifeng Shang, Xin Jiang, Qun Liu, Ping Luo, Ngai Wong

    Abstract: The increasing size of generative Pre-trained Language Models (PLMs) has greatly increased the demand for model compression. Despite various methods to compress BERT or its variants, there are few attempts to compress generative PLMs, and the underlying difficulty remains unclear. In this paper, we compress generative PLMs by quantization. We find that previous quantization methods fail on generat…

    Submitted 16 July, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: ACL 2022

  37. arXiv:2203.08739  [pdf, other]

    cs.CV cs.LG

    What Do Adversarially trained Neural Networks Focus: A Fourier Domain-based Study

    Authors: Binxiao Huang, Chaofan Tao, Rui Lin, Ngai Wong

    Abstract: Although many fields have witnessed the superior performance brought about by deep learning, the robustness of neural networks remains an open issue. Specifically, a small adversarial perturbation on the input may cause the model to produce a completely different output. Such poor robustness implies many potential hazards, especially in security-critical applications, e.g., autonomous driving and…

    Submitted 16 March, 2022; originally announced March 2022.

  38. arXiv:2112.03763  [pdf, other]

    cs.LG

    Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning

    Authors: DeepMind Interactive Agents Team, Josh Abramson, Arun Ahuja, Arthur Brussee, Federico Carnevale, Mary Cassin, Felix Fischer, Petko Georgiev, Alex Goldin, Mansi Gupta, Tim Harley, Felix Hill, Peter C Humphreys, Alden Hung, Jessica Landon, Timothy Lillicrap, Hamza Merzic, Alistair Muldal, Adam Santoro, Guy Scully, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu

    Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. We show that imitation learning of human-human interactions in a…

    Submitted 2 February, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

  39. arXiv:2110.07185  [pdf, other]

    astro-ph.HE cs.LG

    VLBInet: Radio Interferometry Data Classification for EHT with Neural Networks

    Authors: Joshua Yao-Yu Lin, Dominic W. Pesce, George N. Wong, Ajay Uppili Arasanipalai, Ben S. Prather, Charles F. Gammie

    Abstract: The Event Horizon Telescope (EHT) recently released the first horizon-scale images of the black hole in M87. Combined with other astronomical data, these images constrain the mass and spin of the hole as well as the accretion rate and magnetic flux trapped on the hole. An important question for the EHT is how well key parameters, such as trapped magnetic flux and the associated disk models, can be…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: 10 pages, 7 figures

  40. arXiv:2110.02059  [pdf, other]

    cs.CL cs.AI cs.LG

    Multi-Relational Graph based Heterogeneous Multi-Task Learning in Community Question Answering

    Authors: Zizheng Lin, Haowen Ke, Ngo-Yin Wong, Jiaxin Bai, Yangqiu Song, Huan Zhao, Junpeng Ye

    Abstract: Various data mining tasks have been proposed to study Community Question Answering (CQA) platforms like Stack Overflow. The relatedness between some of these tasks provides useful learning signals to each other via Multi-Task Learning (MTL). However, due to the high heterogeneity of these tasks, few existing works manage to jointly solve them in a unified framework. To tackle this challenge, we de…

    Submitted 3 September, 2021; originally announced October 2021.

    Comments: Full paper of CIKM 2021

  41. arXiv:2107.12808  [pdf, other]

    cs.LG cs.AI cs.MA

    Open-Ended Learning Leads to Generally Capable Agents

    Authors: Open Ended Learning Team, Adam Stooke, Anuj Mahajan, Catarina Barros, Charlie Deck, Jakob Bauer, Jakub Sygnowski, Maja Trebacz, Max Jaderberg, Michael Mathieu, Nat McAleese, Nathalie Bradley-Schmieg, Nathaniel Wong, Nicolas Porcel, Roberta Raileanu, Steph Hughes-Fitt, Valentin Dalibard, Wojciech Marian Czarnecki

    Abstract: In this work we create agents that can perform well beyond a single, individual task, that exhibit much wider generalisation of behaviour to a massive, rich space of challenges. We define a universe of tasks within an environment domain and demonstrate the ability to train agents that are generally capable across this vast space and beyond. The environment is natively multi-agent, spanning the con…

    Submitted 31 July, 2021; v1 submitted 27 July, 2021; originally announced July 2021.

  42. arXiv:2105.04218  [pdf, other]

    cs.LG

    Exploiting Elasticity in Tensor Ranks for Compressing Neural Networks

    Authors: Jie Ran, Rui Lin, Hayden K. H. So, Graziano Chesi, Ngai Wong

    Abstract: Elasticities in depth, width, kernel size and resolution have been explored in compressing deep neural networks (DNNs). Recognizing that the kernels in a convolutional neural network (CNN) are 4-way tensors, we further exploit a new elasticity dimension along the input-output channels. Specifically, a novel nuclear-norm rank minimization factorization (NRMF) approach is proposed to dynamically and…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: 8 pages, 5 figures

  43. arXiv:2105.03679  [pdf, other]

    eess.IV cs.LG

    EZCrop: Energy-Zoned Channels for Robust Output Pruning

    Authors: Rui Lin, Jie Ran, Dongpeng Wang, King Hung Chiu, Ngai Wong

    Abstract: Recent results have revealed an interesting observation in a trained convolutional neural network (CNN), namely, the rank of a feature map channel matrix remains surprisingly constant despite the input images. This has led to an effective rank-based channel pruning algorithm, yet the constant rank phenomenon remains mysterious and unexplained. This work aims at demystifying and interpreting such r…

    Submitted 11 May, 2021; v1 submitted 8 May, 2021; originally announced May 2021.

  44. arXiv:2103.11645  [pdf, other]

    cs.CV

    AET-EFN: A Versatile Design for Static and Dynamic Event-Based Vision

    Authors: Chang Liu, Xiaojuan Qi, Edmund Lam, Ngai Wong

    Abstract: The neuromorphic event cameras, which capture the optical changes of a scene, have drawn increasing attention due to their high speed and low power consumption. However, the event data are noisy, sparse, and nonuniform in the spatial-temporal domain with an extremely high temporal resolution, making it challenging to design backend algorithms for event-based vision. Existing methods encode events…

    Submitted 22 March, 2021; originally announced March 2021.

    Comments: 10 pages, 6 figures

    ACM Class: I.2.10

  45. arXiv:2102.07444  [pdf, other]

    cs.CV

    FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation

    Authors: Chaofan Tao, Rui Lin, Quan Chen, Zhaoyang Zhang, Ping Luo, Ngai Wong

    Abstract: Learning convolutional neural networks (CNNs) with low bitwidth is challenging because performance may drop significantly after quantization. Prior arts often discretize the network weights by carefully tuning hyper-parameters of quantization (e.g. non-uniform stepsize and layer-wise bitwidths), which are complicated and sub-optimal because the full-precision and low-precision models have a large…

    Submitted 20 February, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

  46. arXiv:2012.05672  [pdf, other]

    cs.LG cs.AI cs.MA

    Imitating Interactive Intelligence

    Authors: Josh Abramson, Arun Ahuja, Iain Barr, Arthur Brussee, Federico Carnevale, Mary Cassin, Rachita Chhaparia, Stephen Clark, Bogdan Damoc, Andrew Dudzik, Petko Georgiev, Aurelia Guy, Tim Harley, Felix Hill, Alden Hung, Zachary Kenton, Jessica Landon, Timothy Lillicrap, Kory Mathewson, Soňa Mokrá, Alistair Muldal, Adam Santoro, Nikolay Savinov, Vikrant Varma, Greg Wayne, et al. (4 additional authors not shown)

    Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central cha…

    Submitted 20 January, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

  47. arXiv:2011.02265  [pdf, other]

    cs.CV

    S3-Net: A Fast and Lightweight Video Scene Understanding Network by Single-shot Segmentation

    Authors: Yuan Cheng, Yuchao Yang, Hai-Bao Chen, Ngai Wong, Hao Yu

    Abstract: Real-time understanding in video is crucial in various AI applications such as autonomous driving. This work presents a fast single-shot segmentation strategy for video scene understanding. The proposed net, called S3-Net, quickly locates and segments target sub-scenes, meanwhile extracts structured time-series semantic features as inputs to an LSTM-based spatio-temporal model. Utilizing tensoriza…

    Submitted 4 November, 2020; originally announced November 2020.

    Comments: WACV2021

  48. arXiv:2009.01719  [pdf, other]

    cs.CL cs.AI

    Grounded Language Learning Fast and Slow

    Authors: Felix Hill, Olivier Tieleman, Tamara von Glehn, Nathaniel Wong, Hamza Merzic, Stephen Clark

    Abstract: Recent work has shown that large text-based neural language models, trained with conventional supervised learning objectives, acquire a surprising propensity for few- and one-shot learning. Here, we show that an embodied agent situated in a simulated 3D world, and endowed with a novel dual-coding external memory, can exhibit similar one-shot word learning when trained with conventional reinforceme…

    Submitted 14 October, 2020; v1 submitted 3 September, 2020; originally announced September 2020.

  49. arXiv:2005.09382  [pdf, other]

    cs.CL

    Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text

    Authors: Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley

    Abstract: Recent work has described neural-network-based agents that are trained with reinforcement learning (RL) to execute language-like commands in simulated worlds, as a step towards an intelligent agent or robot that can be instructed by human users. However, the optimisation of multi-goal motor policies via deep RL from scratch requires many episodes of experience. Consequently, instruction-following…

    Submitted 19 May, 2020; originally announced May 2020.

  50. Variational Inference with Parameter Learning Applied to Vehicle Trajectory Estimation

    Authors: Jeremy N. Wong, David J. Yoon, Angela P. Schoellig, Timothy D. Barfoot

    Abstract: We present parameter learning in a Gaussian variational inference setting using only noisy measurements (i.e., no groundtruth). This is demonstrated in the context of vehicle trajectory estimation, although the method we propose is general. The paper extends the Exactly Sparse Gaussian Variational Inference (ESGVI) framework, which has previously been used for large-scale nonlinear batch state est…

    Submitted 9 July, 2020; v1 submitted 21 March, 2020; originally announced March 2020.

    Comments: IEEE Robotics and Automation Letters (RA-L). 8 pages, 4 figures