Showing 1–50 of 78 results for author: Chan, B

  1. arXiv:2407.03311  [pdf, other]

    cs.RO cs.AI cs.LG

    Value-Penalized Auxiliary Control from Examples for Learning without Rewards or Demonstrations

    Authors: Trevor Ablett, Bryan Chan, Jayce Haoran Wang, Jonathan Kelly

    Abstract: Learning from examples of success is an appealing approach to reinforcement learning that eliminates many of the disadvantages of using hand-crafted reward functions or full expert-demonstration trajectories, both of which can be difficult to acquire, biased, or suboptimal. However, learning from examples alone dramatically increases the exploration challenge, especially for complex tasks. This wo… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Submitted to the Conference on Robot Learning (CoRL'24), Munich, Germany, Nov. 6-9, 2024

  2. arXiv:2405.19943  [pdf, other]

    cs.CV

    Multi-View People Detection in Large Scenes via Supervised View-Wise Contribution Weighting

    Authors: Qi Zhang, Yunfei Gong, Daijie Chen, Antoni B. Chan, Hui Huang

    Abstract: Recent deep learning-based multi-view people detection (MVD) methods have shown promising results on existing datasets. However, current methods are mainly trained and evaluated on small, single scenes with a limited number of multi-view frames and fixed camera views. As a result, these methods may not be practical for detecting people in larger, more complex scenes with severe occlusions and came… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: AAAI 2024

  3. arXiv:2405.08886  [pdf, other]

    cs.LG stat.ML

    The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks

    Authors: Ziquan Liu, Yufei Cui, Yan Yan, Yi Xu, Xiangyang Ji, Xue Liu, Antoni B. Chan

    Abstract: In safety-critical applications such as medical imaging and autonomous driving, where decisions have profound implications for patient health and road safety, it is imperative to maintain both high adversarial robustness to protect against potential adversarial attacks and reliable uncertainty quantification in decision-making. With extensive research focused on enhancing adversarial robustness th… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: ICML2024

  4. arXiv:2405.02560  [pdf, other]

    cs.RO

    A Pilot Study on the Comparison of Prefrontal Cortex Activities of Robotic Therapies on Elderly with Mild Cognitive Impairment

    Authors: King Tai Henry Au-Yeung, William Wai Lam Chan, Kwan Yin Brian Chan, Hongjie Jiang, Junpei Zhong

    Abstract: Demographic shifts have led to an increase in mild cognitive impairment (MCI), and this study investigates the effects of cognitive training (CT) and reminiscence therapy (RT) conducted by humans or socially assistive robots (SARs) on prefrontal cortex activation in elderly individuals with MCI, aiming to determine the most effective therapy-modality combination for promoting cognitive function. T… ▽ More

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: submitted to IEEE on affective computing

  5. arXiv:2404.11895  [pdf, other]

    cs.CV

    FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models

    Authors: Wei Wu, Qingnan Fan, Shuai Qin, Hong Gu, Ruoyu Zhao, Antoni B. Chan

    Abstract: Precise image editing with text-to-image models has attracted increasing interest due to their remarkable generative capabilities and user-friendly nature. However, such attempts face the pivotal challenge of misalignment between the intended precise editing target regions and the broader area impacted by the guidance in practice. Despite excellent methods leveraging attention mechanisms that have… ▽ More

    Submitted 13 August, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: Accepted by ECCV-2024

  6. arXiv:2404.09504  [pdf, other]

    cs.CV

    Learning Tracking Representations from Single Point Annotations

    Authors: Qiangqiang Wu, Antoni B. Chan

    Abstract: Existing deep trackers are typically trained with large-scale video frames with annotated bounding boxes. However, these bounding boxes are expensive and time-consuming to annotate, in particular for large-scale datasets. In this paper, we propose to learn tracking representations from single point annotations (i.e., 4.5x faster to annotate than the traditional bounding box) in a weakly supervised… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR2024-L3DIVU

  7. arXiv:2403.10955  [pdf, other]

    cs.RO

    Agonist-Antagonist Pouch Motors: Bidirectional Soft Actuators Enhanced by Thermally Responsive Peltier Elements

    Authors: Trevor Exley, Rashmi Wijesundara, Nathan Tan, Akshay Sunkara, Xinyu He, Shuopu Wang, Bonnie Chan, Aditya Jain, Luis Espinosa, Amir Jafari

    Abstract: In this study, we introduce a novel Mylar-based pouch motor design that leverages the reversible actuation capabilities of Peltier junctions to enable agonist-antagonist muscle mimicry in soft robotics. Addressing the limitations of traditional silicone-based materials, such as leakage and phase-change fluid degradation, our pouch motors filled with Novec 7000 provide a durable and leak-proof solu… ▽ More

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: submitted to IROS 2024, 7 pages, 9 figures

  8. arXiv:2403.10236  [pdf, other]

    cs.CV

    A Fixed-Point Approach to Unified Prompt-Based Counting

    Authors: Wei Lin, Antoni B. Chan

    Abstract: Existing class-agnostic counting models typically rely on a single type of prompt, e.g., box annotations. This paper aims to establish a comprehensive prompt-based counting framework capable of generating density maps for concerned objects indicated by various prompt types, such as box, point, and text. To achieve this goal, we begin by converting prompts from different modalities into prompt mask… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted by AAAI 2024

  9. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  10. arXiv:2402.17514  [pdf, other]

    cs.CV

    Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM

    Authors: Jia Wan, Qiangqiang Wu, Wei Lin, Antoni B. Chan

    Abstract: The existing crowd counting models require extensive training data, which is time-consuming to annotate. To tackle this issue, we propose a simple yet effective crowd counting method by utilizing the Segment-Everything-Everywhere Model (SEEM), an adaptation of the Segmentation Anything Model (SAM), to generate pseudo-labels for training crowd counting models. However, our initial investigation rev… ▽ More

    Submitted 15 August, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted to ECCV 2024

  11. arXiv:2402.10236  [pdf, other]

    cs.MA cs.AI cs.LG

    Discovering Sensorimotor Agency in Cellular Automata using Diversity Search

    Authors: Gautier Hamon, Mayalen Etcheverry, Bert Wang-Chak Chan, Clément Moulin-Frier, Pierre-Yves Oudeyer

    Abstract: The research field of Artificial Life studies how life-like phenomena such as autopoiesis, agency, or self-regulation can self-organize in computer simulations. In cellular automata (CA), a key open-question has been whether it is possible to find environment rules that self-organize robust "individuals" from an initial state with no prior existence of things like "bodies", "brain", "perception… ▽ More

    Submitted 14 February, 2024; originally announced February 2024.

  12. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  13. arXiv:2312.09982  [pdf, other]

    cs.PL cs.AI cs.LG cs.PF

    ACPO: AI-Enabled Compiler-Driven Program Optimization

    Authors: Amir H. Ashouri, Muhammad Asif Manzoor, Duc Minh Vu, Raymond Zhang, Ziwen Wang, Angel Zhang, Bryan Chan, Tomasz S. Czajkowski, Yaoqing Gao

    Abstract: The key to performance optimization of a program is to decide correctly when a certain transformation should be applied by a compiler. This is an ideal opportunity to apply machine-learning models to speed up the tuning process; while this realization has been around since the late 90s, only recent advancements in ML enabled a practical application of ML to compilers as an end-to-end framework.… ▽ More

    Submitted 11 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Preprint version of ACPO (12 pages)

    ACM Class: I.2.5; D.3.0; I.2.6

  14. arXiv:2312.00455  [pdf]

    cs.AI cs.LG nlin.CG

    Meta-Diversity Search in Complex Systems, A Recipe for Artificial Open-Endedness ?

    Authors: Mayalen Etcheverry, Bert Wang-Chak Chan, Clément Moulin-Frier, Pierre-Yves Oudeyer

    Abstract: Can we build an artificial system that would be able to generate endless surprises if ran "forever" in Minecraft? While there is not a single path toward solving that grand challenge, this article presents what we believe to be some working ingredients for the endless generation of novel increasingly complex artifacts in Minecraft. Our framework for an open-ended system includes two components: a… ▽ More

    Submitted 1 December, 2023; originally announced December 2023.

  15. arXiv:2311.01589  [pdf, other]

    cs.LG

    A Statistical Guarantee for Representation Transfer in Multitask Imitation Learning

    Authors: Bryan Chan, Karime Pereida, James Bergstra

    Abstract: Transferring representation for multitask imitation learning has the potential to provide improved sample efficiency on learning new tasks, when compared to learning from scratch. In this work, we provide a statistical guarantee indicating that we can indeed achieve improved sample efficiency on the target task when a representation is trained using sufficiently diverse source tasks. Our theoretic… ▽ More

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: Accepted by NeurIPS 2023 Workshop on Robot Learning

  16. arXiv:2309.13648  [pdf, other]

    cs.CR q-fin.TR

    Don't Let MEV Slip: The Costs of Swapping on the Uniswap Protocol

    Authors: Austin Adams, Benjamin Y Chan, Sarit Markovich, Xin Wan

    Abstract: We present the first in-depth empirical characterization of the costs of trading on a decentralized exchange (DEX). Using quoted prices from the Uniswap Labs interface for two pools -- USDC-ETH (5bps) and PEPE-ETH (30bps) -- we evaluate the efficiency of trading on DEXs. Our main tool is slippage -- the difference between the realized execution price of a trade, and its quoted price -- which we br… ▽ More

    Submitted 17 April, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: 31 pages, 7 tables, 2 figures

  17. arXiv:2308.09091  [pdf, other]

    cs.CV

    Edit Temporal-Consistent Videos with Image Diffusion Model

    Authors: Yuanzhi Wang, Yong Li, Xiaoya Zhang, Xin Liu, Anbo Dai, Antoni B. Chan, Zhen Cui

    Abstract: Large-scale text-to-image (T2I) diffusion models have been extended for text-guided video editing, yielding impressive zero-shot video editing performance. Nonetheless, the generated videos usually show spatial irregularities and temporal inconsistencies as the temporal characteristics of videos have not been faithfully modeled. In this paper, we propose an elegant yet effective Temporal-Consisten… ▽ More

    Submitted 29 December, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: 10 pages, 7 figures

    Journal ref: ACM TOMM 2024, Codes: https://github.com/mdswyz/TCVE

  18. arXiv:2305.03601  [pdf, other]

    cs.CV cs.AI

    Human Attention-Guided Explainable Artificial Intelligence for Computer Vision Models

    Authors: Guoyang Liu, Jindi Zhang, Antoni B. Chan, Janet H. Hsiao

    Abstract: We examined whether embedding human attention knowledge into saliency-based explainable AI (XAI) methods for computer vision models could enhance their plausibility and faithfulness. We first developed new gradient-based XAI methods for object detection models to generate object-specific explanations by extending the current methods for image classification models. Interestingly, while these gradi… ▽ More

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: 14 pages, 18 figures

    MSC Class: 68T45

    ACM Class: I.2.0; I.4.0

  19. arXiv:2304.06354  [pdf, other]

    cs.CV

    ODAM: Gradient-based instance-specific visual explanations for object detection

    Authors: Chenyang Zhao, Antoni B. Chan

    Abstract: We propose the gradient-weighted Object Detector Activation Maps (ODAM), a visualized explanation technique for interpreting the predictions of object detectors. Utilizing the gradients of detector targets flowing into the intermediate feature maps, ODAM produces heat maps that show the influence of regions on the detector's decision for each predicted attribute. Compared to previous works classif… ▽ More

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: 2023 International Conference on Learning Representations

  20. arXiv:2304.05639  [pdf, other]

    cs.NE nlin.CG

    Towards Large-Scale Simulations of Open-Ended Evolution in Continuous Cellular Automata

    Authors: Bert Wang-Chak Chan

    Abstract: Inspired by biological and cultural evolution, there have been many attempts to explore and elucidate the necessary conditions for open-endedness in artificial intelligence and artificial life. Using a continuous cellular automata called Lenia as the base system, we built large-scale evolutionary simulations using parallel computing framework JAX, in order to achieve the goal of never-ending evolu… ▽ More

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted to GECCO 2023

  21. arXiv:2304.00571  [pdf, other]

    cs.CV

    DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks

    Authors: Qiangqiang Wu, Tianyu Yang, Ziquan Liu, Baoyuan Wu, Ying Shan, Antoni B. Chan

    Abstract: In this paper, we study masked autoencoder (MAE) pretraining on videos for matching-based downstream tasks, including visual object tracking (VOT) and video object segmentation (VOS). A simple extension of MAE is to randomly mask out frame patches in videos and reconstruct the frame pixels. However, we find that this simple baseline heavily relies on spatial cues while ignoring temporal relations… ▽ More

    Submitted 6 April, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

    Comments: CVPR 2023; V2: fixed typos in Table-2

  22. arXiv:2303.11135  [pdf, other]

    cs.LG cs.CV

    TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization

    Authors: Ziquan Liu, Yi Xu, Xiangyang Ji, Antoni B. Chan

    Abstract: Recent years have seen the ever-increasing importance of pre-trained models and their downstream training in deep learning research and applications. At the same time, the defense for adversarial examples has been mainly investigated in the context of training from random initialization on simple classification tasks. To better exploit the potential of pre-trained models in adversarial robustness,… ▽ More

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: CVPR2023

  23. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo… ▽ More

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  24. arXiv:2301.00051  [pdf, other]

    cs.LG cs.AI cs.RO

    Learning from Guided Play: Improving Exploration for Adversarial Imitation Learning with Simple Auxiliary Tasks

    Authors: Trevor Ablett, Bryan Chan, Jonathan Kelly

    Abstract: Adversarial imitation learning (AIL) has become a popular alternative to supervised imitation learning that reduces the distribution shift suffered by the latter. However, AIL requires effective exploration during an online reinforcement learning phase. In this work, we show that the standard, naive approach to exploration can manifest as a suboptimal local maximum if a policy learned with AIL suf… ▽ More

    Submitted 12 October, 2023; v1 submitted 30 December, 2022; originally announced January 2023.

    Comments: In IEEE Robotics and Automation Letters (RA-L) and presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'23), Detroit, MI, USA, Oct. 1-5, 2023. arXiv admin note: substantial text overlap with arXiv:2112.08932

    Journal ref: IEEE Robotics and Automation Letters (RA-L), Vol. 8, No. 3, pp. 1263-1270, Jan. 2023

  25. arXiv:2212.07906  [pdf, other]

    cs.NE cs.AI nlin.CG

    Flow-Lenia: Towards open-ended evolution in cellular automata through mass conservation and parameter localization

    Authors: Erwan Plantec, Gautier Hamon, Mayalen Etcheverry, Pierre-Yves Oudeyer, Clément Moulin-Frier, Bert Wang-Chak Chan

    Abstract: The design of complex self-organising systems producing life-like phenomena, such as the open-ended evolution of virtual creatures, is one of the main goals of artificial life. Lenia, a family of cellular automata (CA) generalizing Conway's Game of Life to continuous space, time and states, has attracted a lot of attention because of the wide diversity of self-organizing patterns it can generate.… ▽ More

    Submitted 24 March, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

  26. arXiv:2210.05118  [pdf, other]

    cs.LG cs.CV stat.ML

    Boosting Adversarial Robustness From The Perspective of Effective Margin Regularization

    Authors: Ziquan Liu, Antoni B. Chan

    Abstract: The adversarial vulnerability of deep neural networks (DNNs) has been actively investigated in the past several years. This paper investigates the scale-variant property of cross-entropy loss, which is the most commonly used loss function in classification tasks, and its impact on the effective margin and adversarial robustness of deep neural networks. Since the loss function is not invariant to l… ▽ More

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: BMVC 2022

  27. arXiv:2207.08389  [pdf, other]

    cs.PL cs.AI cs.LG cs.NE cs.PF

    MLGOPerf: An ML Guided Inliner to Optimize Performance

    Authors: Amir H. Ashouri, Mostafa Elhoushi, Yuzhe Hua, Xiang Wang, Muhammad Asif Manzoor, Bryan Chan, Yaoqing Gao

    Abstract: For the past 25 years, we have witnessed an extensive application of Machine Learning to the Compiler space; the selection and the phase-ordering problem. However, limited works have been upstreamed into the state-of-the-art compilers, i.e., LLVM, to seamlessly integrate the former into the optimization pipeline of a compiler to be readily deployed by the user. MLGO was among the first of such pro… ▽ More

    Submitted 19 July, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: Version 2: Added the missing Table 6. The short version of this work is accepted at ACM/IEEE CASES 2022

    ACM Class: I.2.5; D.3.0; I.2.6

  28. arXiv:2207.01190  [pdf, other]

    cs.LG

    Pareto Optimization for Active Learning under Out-of-Distribution Data Scenarios

    Authors: Xueying Zhan, Zeyu Dai, Qingzhong Wang, Qing Li, Haoyi Xiong, Dejing Dou, Antoni B. Chan

    Abstract: Pool-based Active Learning (AL) has achieved great success in minimizing labeling cost by sequentially selecting informative unlabeled samples from a large unlabeled data pool and querying their labels from oracle/annotators. However, existing AL sampling strategies might not work well in out-of-distribution (OOD) data scenarios, where the unlabeled data pool contains some data samples that do not… ▽ More

    Submitted 4 July, 2022; originally announced July 2022.

  29. arXiv:2205.12753  [pdf, other]

    cs.CV cs.LG

    An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation

    Authors: Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Xiangyang Ji, Antoni B. Chan

    Abstract: The performance of machine learning models under distribution shift has been the focus of the community in recent years. Most of current methods have been proposed to improve the robustness to distribution shift from the algorithmic perspective, i.e., designing better training algorithms to help the generalization in shifted test distributions. This paper studies the distribution shift problem fro… ▽ More

    Submitted 25 May, 2022; originally announced May 2022.

  30. arXiv:2205.01551  [pdf, other]

    cs.CV

    Cross-View Cross-Scene Multi-View Crowd Counting

    Authors: Qi Zhang, Wei Lin, Antoni B. Chan

    Abstract: Multi-view crowd counting has been previously proposed to utilize multi-cameras to extend the field-of-view of a single camera, capturing more people in the scene, and improve counting performance for occluded people or those in low resolution. However, the current multi-view paradigm trains and tests on the same single scene and camera-views, which limits its practical application. In this paper,… ▽ More

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: CVPR 2021

  31. On Distinctive Image Captioning via Comparing and Reweighting

    Authors: Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan

    Abstract: Recent image captioning models are achieving impressive results based on popular metrics, i.e., BLEU, CIDEr, and SPICE. However, focusing on the most popular metrics that only consider the overlap between the generated captions and human annotation could result in using common words and phrases, which lacks distinctiveness, i.e., many similar images have the same caption. In this paper, we aim to… ▽ More

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: 20 pages. arXiv admin note: substantial text overlap with arXiv:2007.06877

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI 2022)

  32. arXiv:2203.13450  [pdf, other]

    cs.LG

    A Comparative Survey of Deep Active Learning

    Authors: Xueying Zhan, Qingzhong Wang, Kuan-hao Huang, Haoyi Xiong, Dejing Dou, Antoni B. Chan

    Abstract: While deep learning (DL) is data-hungry and usually relies on extensive labeled data to deliver good performance, Active Learning (AL) reduces labeling costs by selecting a small proportion of samples from unlabeled data for labeling and training. Therefore, Deep Active Learning (DAL) has risen as a feasible solution for maximizing model performance under a limited labeling cost/budget in recent y… ▽ More

    Submitted 19 July, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: 24 pages

  33. arXiv:2203.04232  [pdf, other]

    cs.CV

    A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds

    Authors: Yan Xia, Qiangqiang Wu, Wei Li, Antoni B. Chan, Uwe Stilla

    Abstract: Recent works on 3D single object tracking treat the task as a target-specific 3D detection task, where an off-the-shelf 3D detector is commonly employed for the tracking. However, it is non-trivial to perform accurate target-specific detection since the point cloud of objects in raw LiDAR scans is usually sparse and incomplete. In this paper, we address this issue by explicitly leveraging temporal… ▽ More

    Submitted 11 February, 2023; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: Accepted by IEEE Transactions on Intelligent Transportation Systems 2023

  34. arXiv:2112.08932  [pdf, other]

    cs.LG cs.AI cs.RO

    Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning

    Authors: Trevor Ablett, Bryan Chan, Jonathan Kelly

    Abstract: Effective exploration continues to be a significant challenge that prevents the deployment of reinforcement learning for many physical systems. This is particularly true for systems with continuous and high-dimensional state and action spaces, such as robotic manipulators. The challenge is accentuated in the sparse rewards setting, where the low-level state information required for the design of d… ▽ More

    Submitted 19 July, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: In Proceedings of the Neural Information Processing Systems (NeurIPS'21) Deep Reinforcement Learning Workshop, Sydney, Australia, Dec. 13, 2021

  35. arXiv:2110.04931  [pdf, other]

    cs.CV

    BEV-Net: Assessing Social Distancing Compliance by Joint People Localization and Geometric Reasoning

    Authors: Zhirui Dai, Yuepeng Jiang, Yi Li, Bo Liu, Antoni B. Chan, Nuno Vasconcelos

    Abstract: Social distancing, an essential public health measure to limit the spread of contagious diseases, has gained significant attention since the outbreak of the COVID-19 pandemic. In this work, the problem of visual social distancing compliance assessment in busy public areas, with wide field-of-view cameras, is considered. A dataset of crowd scenes with people annotations under a bird's eye view (BEV… ▽ More

    Submitted 12 October, 2021; v1 submitted 10 October, 2021; originally announced October 2021.

    Comments: Published as a conference paper at International Conference on Computer Vision, 2021

  36. arXiv:2108.09151  [pdf, other]

    cs.CV cs.CL cs.LG

    Group-based Distinctive Image Captioning with Memory Attention

    Authors: Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan

    Abstract: Describing images using natural language is widely known as image captioning, which has made consistent progress due to the development of computer vision and natural language generation techniques. Though conventional captioning models achieve high accuracy based on popular metrics, i.e., BLEU, CIDEr, and SPICE, the ability of captions to distinguish the target image from other similar images is… ▽ More

    Submitted 7 April, 2022; v1 submitted 20 August, 2021; originally announced August 2021.

    Comments: Accepted at ACM MM 2021 (oral)

  37. arXiv:2107.03374  [pdf, other]

    cs.LG

    Evaluating Large Language Models Trained on Code

    Authors: Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, et al. (33 additional authors not shown)

    Abstract: We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J sol… ▽ More

    Submitted 14 July, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

    Comments: corrected typos, added references, added authors, added acknowledgements

  38. arXiv:2107.01622  [pdf, other]

    cs.LG

    Multiple-criteria Based Active Learning with Fixed-size Determinantal Point Processes

    Authors: Xueying Zhan, Qing Li, Antoni B. Chan

    Abstract: Active learning aims to achieve greater accuracy with less training data by selecting the most useful data samples from which it learns. Single-criterion based methods (i.e., informativeness and representativeness based methods) are simple and efficient; however, they lack adaptability to different real-world scenarios. In this paper, we introduce a multiple-criteria based active learning algorith… ▽ More

    Submitted 4 July, 2021; originally announced July 2021.

  39. arXiv:2102.03497  [pdf, other]

    cs.LG stat.ML

    Weight Rescaling: Effective and Robust Regularization for Deep Neural Networks with Batch Normalization

    Authors: Ziquan Liu, Yufei Cui, Jia Wan, Yu Mao, Antoni B. Chan

    Abstract: Weight decay is often used to ensure good generalization in the training practice of deep neural networks with batch normalization (BN-DNNs), where some convolution layers are invariant to weight rescaling due to the normalization. In this paper, we demonstrate that the practical usage of weight decay still has some unsolved problems in spite of existing theoretical work on explaining the effect o… ▽ More

    Submitted 17 June, 2022; v1 submitted 5 February, 2021; originally announced February 2021.

    Comments: Preprint

  40. arXiv:2101.11353  [pdf, other]

    cs.LG cs.CV

    Variational Nested Dropout

    Authors: Yufei Cui, Yu Mao, Ziquan Liu, Qiao Li, Antoni B. Chan, Xue Liu, Tei-Wei Kuo, Chun Jason Xue

    Abstract: Nested dropout is a variant of dropout operation that is able to order network parameters or features based on the pre-defined importance during training. It has been explored for: I. Constructing nested nets: the nested nets are neural networks whose architectures can be adjusted instantly during testing time, e.g., based on computational constraints. The nested dropout implicitly ranks the netwo…

    Submitted 17 June, 2022; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: 20 pages, 17 figures

  41. arXiv:2012.00946  [pdf, other]

    cs.CV

    Wide-Area Crowd Counting: Multi-View Fusion Networks for Counting in Large Scenes

    Authors: Qi Zhang, Antoni B. Chan

    Abstract: Crowd counting in single-view images has achieved outstanding performance on existing counting datasets. However, single-view counting is not applicable to large and wide scenes (e.g., public parks, long subway platforms, or event spaces) because a single camera cannot capture the whole scene in adequate detail for counting, e.g., when the scene is too large to fit into the field-of-view of the ca…

    Submitted 2 May, 2022; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: Accepted to IJCV

  42. arXiv:2010.10906  [pdf, other]

    cs.CL cs.LG

    German's Next Language Model

    Authors: Branden Chan, Stefan Schweter, Timo Möller

    Abstract: In this work we present the experiments which led to the creation of our BERT and ELECTRA based German language models, GBERT and GELECTRA. By varying the input training data, model size, and the presence of Whole Word Masking (WWM) we were able to attain SoTA performance across a set of document classification and named entity recognition (NER) tasks for both models of base and large size. We ad…

    Submitted 3 December, 2020; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: Accepted by COLING2020

  43. arXiv:2010.08161  [pdf, other]

    cs.LG

    ALdataset: a benchmark for pool-based active learning

    Authors: Xueying Zhan, Antoni Bert Chan

    Abstract: Active learning (AL) is a subfield of machine learning (ML) in which a learning algorithm could achieve good accuracy with fewer training samples by interactively querying a user/oracle to label new data points. Pool-based AL is well-motivated in many ML tasks, where unlabeled data is abundant, but their labels are hard to obtain. Although many pool-based AL methods have been developed, the lack of…

    Submitted 16 October, 2020; originally announced October 2020.

  44. arXiv:2008.08157  [pdf, other]

    cs.RO cs.LG eess.SY

    Heteroscedastic Uncertainty for Robust Generative Latent Dynamics

    Authors: Oliver Limoyo, Bryan Chan, Filip Marić, Brandon Wagstaff, Rupam Mahmood, Jonathan Kelly

    Abstract: Learning or identifying dynamics from a sequence of high-dimensional observations is a difficult challenge in many domains, including reinforcement learning and control. The problem has recently been studied from a generative perspective through latent dynamics: high-dimensional observations are embedded into a lower-dimensional space in which the dynamics can be learned. Despite some successes, l…

    Submitted 11 July, 2022; v1 submitted 18 August, 2020; originally announced August 2020.

    Comments: In IEEE Robotics and Automation Letters (RA-L) and presented at the IEEE International Conference on Intelligent Robots and Systems (IROS'20), Las Vegas, USA, October 25-29, 2020

    Journal ref: IEEE Robotics and Automation Letters (RA-L), Vol. 5, No. 4, pp. 6654-6661, Oct. 2020

  45. arXiv:2008.02965  [pdf, other]

    cs.LG stat.ML

    Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations

    Authors: Ziquan Liu, Yufei Cui, Antoni B. Chan

    Abstract: Using weight decay to penalize the L2 norms of weights in neural networks has been a standard training practice to regularize the complexity of networks. In this paper, we show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with positively homogeneous activation functions, such as linear, ReLU and max-pooling function…

    Submitted 8 June, 2022; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: 14 pages, 5 figures, Accepted by ICML 2021 Workshop on Adversarial Machine Learning

  46. Tracking-by-Counting: Using Network Flows on Crowd Density Maps for Tracking Multiple Targets

    Authors: Weihong Ren, Xinchao Wang, Jiandong Tian, Yandong Tang, Antoni B. Chan

    Abstract: State-of-the-art multi-object tracking (MOT) methods follow the tracking-by-detection paradigm, where object trajectories are obtained by associating per-frame outputs of object detectors. In crowded scenes, however, detectors often fail to obtain accurate detections due to heavy occlusions and high crowd density. In this paper, we propose a new MOT paradigm, tracking-by-counting, tailored for cro…

    Submitted 18 July, 2020; originally announced July 2020.

    Comments: 14 pages

  47. arXiv:2007.06877  [pdf, other]

    cs.CV cs.CL cs.LG

    Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets

    Authors: Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan

    Abstract: A wide range of image captioning models has been developed, achieving significant improvement based on popular metrics, such as BLEU, CIDEr, and SPICE. However, although the generated captions can accurately describe the image, they are generic for similar images and lack distinctiveness, i.e., cannot properly describe the uniqueness of each image. In this paper, we aim to improve the distinctiven…

    Submitted 14 July, 2020; originally announced July 2020.

    Report number: Accepted at ECCV 2020 (oral)

  48. Fine-Grained Crowd Counting

    Authors: Jia Wan, Nikil Senthil Kumar, Antoni B. Chan

    Abstract: Current crowd counting algorithms are only concerned about the number of people in an image, which lacks low-level fine-grained information of the crowd. For many practical applications, the total number of people in an image is not as useful as the number of people in each sub-category. E.g., knowing the number of people waiting in line or browsing can help retail stores; knowing the number of peo…

    Submitted 12 July, 2020; originally announced July 2020.

  49. arXiv:2007.03891  [pdf, other]

    cs.CV

    Single-Frame based Deep View Synchronization for Unsynchronized Multi-Camera Surveillance

    Authors: Qi Zhang, Antoni B. Chan

    Abstract: Multi-camera surveillance has been an active research topic for understanding and modeling scenes. Compared to a single camera, multi-cameras provide larger field-of-view and more object cues, and the related applications are multi-view counting, multi-view tracking, 3D pose estimation or 3D reconstruction, etc. It is usually assumed that the cameras are all temporally synchronized when designing…

    Submitted 2 May, 2022; v1 submitted 8 July, 2020; originally announced July 2020.

    Comments: Accepted to IEEE TNNLS

  50. arXiv:2006.05127  [pdf, other]

    cs.CV

    Over-crowdedness Alert! Forecasting the Future Crowd Distribution

    Authors: Yuzhen Niu, Weifeng Shi, Wenxi Liu, Shengfeng He, Jia Pan, Antoni B. Chan

    Abstract: In recent years, vision-based crowd analysis has been studied extensively due to its practical applications in the real world. In this paper, we formulate a novel crowd analysis problem, in which we aim to predict the crowd distribution in the near future given sequential frames of a crowd video without any identity annotations. Studying this research problem will benefit applications concerned with f…

    Submitted 9 June, 2020; originally announced June 2020.