Showing 51–100 of 179 results for author: Hua, X

  1. arXiv:2207.09332  [pdf, other]

    cs.CV

    Rethinking IoU-based Optimization for Single-stage 3D Object Detection

    Authors: Hualian Sheng, Sijia Cai, Na Zhao, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Min-Jian Zhao, Gim Hee Lee

    Abstract: Since Intersection-over-Union (IoU) based optimization maintains the consistency of the final IoU prediction metric and losses, it has been widely used in both regression and classification branches of single-stage 2D object detectors. Recently, several 3D object detection methods adopt IoU-based optimization and directly replace the 2D IoU with 3D IoU. However, such a direct computation in 3D is…

    Submitted 20 July, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted by ECCV2022. The code is available at https://github.com/hlsheng1/RDIoU

  2. arXiv:2207.09055  [pdf, other]

    cs.CV

    Box-supervised Instance Segmentation with Level Set Evolution

    Authors: Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Xiansheng Hua, Lei Zhang

    Abstract: In contrast to the fully supervised methods using pixel-wise mask labels, box-supervised instance segmentation takes advantage of the simple box annotations, which has recently attracted a lot of research attentions. In this paper, we propose a novel single-shot box-supervised instance segmentation approach, which integrates the classical level set model with deep neural network delicately. Specif…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: 17 pages, 4 figures, ECCV 2022

  3. arXiv:2207.04829  [pdf, ps, other]

    cs.IT eess.SP

    Low-complexity Joint Phase Adjustment and Receive Beamforming for Directional Modulation Networks via IRS

    Authors: Rongen Dong, Shaohua Jiang, Xinhai Hua, Yin Teng, Feng Shu, Jiangzhou Wang

    Abstract: Intelligent reflecting surface (IRS) is a revolutionary and low-cost technology for boosting the spectrum and energy efficiencies in future wireless communication network. In order to create controllable multipath transmission in the conventional line-of-sight (LOS) wireless communication environment, an IRS-aided directional modulation (DM) network is considered. In this paper, to improve the tra…

    Submitted 11 July, 2022; originally announced July 2022.

  4. arXiv:2207.02812  [pdf, other]

    cs.CV

    Towards Counterfactual Image Manipulation via CLIP

    Authors: Yingchen Yu, Fangneng Zhan, Rongliang Wu, Jiahui Zhang, Shijian Lu, Miaomiao Cui, Xuansong Xie, Xian-Sheng Hua, Chunyan Miao

    Abstract: Leveraging StyleGAN's expressivity and its disentangled latent codes, existing methods can achieve realistic editing of different visual attributes such as age and gender of facial images. An intriguing yet challenging problem arises: Can generative models achieve counterfactual editing against their learnt priors? Due to the lack of counterfactual samples in natural datasets, we investigate this…

    Submitted 12 July, 2022; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: This paper has been accepted to ACM MM 2022, code may be found here: https://github.com/yingchen001/CF-CLIP

  5. arXiv:2206.14923  [pdf, other]

    cs.CV cs.LG

    On Non-Random Missing Labels in Semi-Supervised Learning

    Authors: Xinting Hu, Yulei Niu, Chunyan Miao, Xian-Sheng Hua, Hanwang Zhang

    Abstract: Semi-Supervised Learning (SSL) is fundamentally a missing label problem, in which the label Missing Not At Random (MNAR) problem is more realistic and challenging, compared to the widely-adopted yet naive Missing Completely At Random assumption where both labeled and unlabeled data share the same class distribution. Different from existing SSL solutions that overlook the role of "class" in causing…

    Submitted 29 June, 2022; originally announced June 2022.

    Journal ref: ICLR 2022

  6. arXiv:2206.11476  [pdf, other]

    cs.CV

    Dynamic Scene Deblurring Based on Continuous Cross-Layer Attention Transmission

    Authors: Xia Hua, Mingxin Li, Junxiong Fei, Yu Shi, JianGuo Liu, Hanyu Hong

    Abstract: The deep convolutional neural networks (CNNs) using attention mechanism have achieved great success for dynamic scene deblurring. In most of these networks, only the features refined by the attention maps can be passed to the next layer and the attention maps of different layers are separated from each other, which does not make full use of the attention information from different layers in the CN…

    Submitted 28 January, 2023; v1 submitted 23 June, 2022; originally announced June 2022.

  7. arXiv:2206.07662  [pdf, other]

    cs.CV

    SP-ViT: Learning 2D Spatial Priors for Vision Transformers

    Authors: Yuxuan Zhou, Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Lei Zhang, Margret Keuper, Xiansheng Hua

    Abstract: Recently, transformers have shown great potential in image classification and established state-of-the-art results on the ImageNet benchmark. However, compared to CNNs, transformers converge slowly and are prone to overfitting in low-data regimes due to the lack of spatial inductive biases. Such spatial inductive biases can be especially beneficial since the 2D structure of an input image is not w…

    Submitted 15 June, 2022; originally announced June 2022.

    ACM Class: I.4

  8. arXiv:2205.05675  [pdf, other]

    cs.CV eess.IV

    NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

    Authors: Yawei Li, Kai Zhang, Radu Timofte, Luc Van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, et al. (86 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that achieved improvement of e…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Validation code of the baseline model is available at https://github.com/ofsoundof/IMDN. Validation of all submitted models is available at https://github.com/ofsoundof/NTIRE2022_ESR

  9. arXiv:2204.11278  [pdf, ps, other]

    eess.SP cs.IT stat.ML

    Unsupervised Learning Discriminative MIG Detectors in Nonhomogeneous Clutter

    Authors: Xiaoqiang Hua, Yusuke Ono, Linyu Peng, Yuting Xu

    Abstract: Principal component analysis (PCA) is a commonly used pattern analysis method that maps high-dimensional data into a lower-dimensional space maximizing the data variance, that results in the promotion of separability of data. Inspired by the principle of PCA, a novel type of learning discriminative matrix information geometry (MIG) detectors in the unsupervised scenario are developed, and applied…

    Submitted 8 May, 2022; v1 submitted 24 April, 2022; originally announced April 2022.

    Comments: 14 pages, 6 figures

    Journal ref: IEEE Transactions on Communications 70, 4107-4120, 2022

  10. arXiv:2204.07300  [pdf, other]

    cs.CV

    Dense Learning based Semi-Supervised Object Detection

    Authors: Binghui Chen, Pengyu Li, Xiang Chen, Biao Wang, Lei Zhang, Xian-Sheng Hua

    Abstract: Semi-supervised object detection (SSOD) aims to facilitate the training and deployment of object detectors with the help of a large amount of unlabeled data. Though various self-training based and consistency-regularization based SSOD methods have been proposed, most of them are anchor-based detectors, ignoring the fact that in many real-world applications anchor-free detectors are more demanded.…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: CVPR 2022

  11. Spatial Likelihood Voting with Self-Knowledge Distillation for Weakly Supervised Object Detection

    Authors: Ze Chen, Zhihang Fu, Jianqiang Huang, Mingyuan Tao, Rongxin Jiang, Xiang Tian, Yaowu Chen, Xian-sheng Hua

    Abstract: Weakly supervised object detection (WSOD), which is an effective way to train an object detection model using only image-level annotations, has attracted considerable attention from researchers. However, most of the existing methods, which are based on multiple instance learning (MIL), tend to localize instances to the discriminative parts of salient objects instead of the entire content of all ob…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: arXiv admin note: text overlap with arXiv:2006.12884

    Journal ref: Image and Vision Computing, Volume 116, 2021, 104314, ISSN 0262-8856

  12. arXiv:2204.00826  [pdf, other]

    cs.CV

    Online Convolutional Re-parameterization

    Authors: Mu Hu, Junyi Feng, Jiashen Hua, Baisheng Lai, Jianqiang Huang, Xiaojin Gong, Xiansheng Hua

    Abstract: Structural re-parameterization has drawn increasing attention in various computer vision tasks. It aims at improving the performance of deep models without introducing any inference-time cost. Though efficient during inference, such models rely heavily on the complicated training-time blocks to achieve high accuracy, leading to large extra training cost. In this paper, we present online convolutio…

    Submitted 2 April, 2022; originally announced April 2022.

    Comments: Accepted by CVPR 2022

  13. arXiv:2204.00754  [pdf, other]

    cs.CV

    Homography Loss for Monocular 3D Object Detection

    Authors: Jiaqi Gu, Bojian Wu, Lubin Fan, Jianqiang Huang, Shen Cao, Zhiyu Xiang, Xian-Sheng Hua

    Abstract: Monocular 3D object detection is an essential task in autonomous driving. However, most current methods consider each 3D object in the scene as an independent training sample, while ignoring their inherent geometric relations, thus inevitably resulting in a lack of leveraging spatial constraints. In this paper, we propose a novel method that takes all the objects into consideration and explores th…

    Submitted 1 April, 2022; originally announced April 2022.

    Comments: 8 pages, 5 figures. Accepted to CVPR 2022

  14. arXiv:2204.00707  [pdf, other]

    cs.CL

    Efficient Argument Structure Extraction with Transfer Learning and Active Learning

    Authors: Xinyu Hua, Lu Wang

    Abstract: The automation of extracting argument structures faces a pair of challenges on (1) encoding long-term contexts to facilitate comprehensive understanding, and (2) improving data efficiency since constructing high-quality argument structures is time-consuming. In this work, we propose a novel context-aware Transformer-based argument structure prediction model which, on five different domains, signif…

    Submitted 1 April, 2022; originally announced April 2022.

    Comments: Findings of ACL 2022, long paper

  15. Dynamic Supervisor for Cross-dataset Object Detection

    Authors: Ze Chen, Zhihang Fu, Jianqiang Huang, Mingyuan Tao, Shengyu Li, Rongxin Jiang, Xiang Tian, Yaowu Chen, Xian-sheng Hua

    Abstract: The application of cross-dataset training in object detection tasks is complicated because the inconsistency in the category range across datasets transforms fully supervised learning into semi-supervised learning. To address this problem, recent studies focus on the generation of high-quality missing annotations. In this study, we first point out that it is not enough to generate high-quality ann…

    Submitted 31 March, 2022; originally announced April 2022.

    Journal ref: Neurocomputing, Volume 469, 2022, Pages 310-320, ISSN 0925-2312

  16. arXiv:2203.09771  [pdf, other]

    cs.CV

    Beyond a Video Frame Interpolator: A Space Decoupled Learning Approach to Continuous Image Transition

    Authors: Tao Yang, Peiran Ren, Xuansong Xie, Xiansheng Hua, Lei Zhang

    Abstract: Video frame interpolation (VFI) aims to improve the temporal resolution of a video sequence. Most of the existing deep learning based VFI methods adopt off-the-shelf optical flow algorithms to estimate the bidirectional flows and interpolate the missing frames accordingly. Though having achieved a great success, these methods require much human experience to tune the bidirectional flows and often…

    Submitted 18 March, 2022; originally announced March 2022.

  17. arXiv:2203.07111  [pdf, other]

    cs.CV

    Disentangled Representation Learning for Text-Video Retrieval

    Authors: Qiang Wang, Yanhao Zhang, Yun Zheng, Pan Pan, Xian-Sheng Hua

    Abstract: Cross-modality interaction is a critical component in Text-Video Retrieval (TVR), yet there has been little examination of how different influencing factors for computing interaction affect performance. This paper first studies the interaction paradigm in depth, where we find that its computation can be split into two terms, the interaction contents at different granularity and the matching functi…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: 22 pages, 11 figures, Tech report

  18. arXiv:2203.00962  [pdf, other]

    cs.CV

    Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation

    Authors: Zhaozheng Chen, Tan Wang, Xiongwei Wu, Xian-Sheng Hua, Hanwang Zhang, Qianru Sun

    Abstract: Extracting class activation maps (CAM) is arguably the most standard step of generating pseudo masks for weakly-supervised semantic segmentation (WSSS). Yet, we find that the crux of the unsatisfactory pseudo masks is the binary cross-entropy loss (BCE) widely used in CAM. Specifically, due to the sum-over-class pooling nature of BCE, each pixel in CAM may be responsive to multiple classes co-occu…

    Submitted 2 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR 2022

  19. arXiv:2202.04250  [pdf, other]

    cs.NI eess.SP

    GenAD: General Representations of Multivariate Time Series for Anomaly Detection

    Authors: Xiaolei Hua, Lin Zhu, Shenglin Zhang, Zeyan Li, Su Wang, Dong Zhou, Shuo Wang, Chao Deng

    Abstract: The reliability of wireless base stations in China Mobile is of vital importance, because the cell phone users are connected to the stations and the behaviors of the stations are directly related to user experience. Although the monitoring of the station behaviors can be realized by anomaly detection on multivariate time series, due to complex correlations and various temporal patterns of multivar…

    Submitted 8 February, 2022; originally announced February 2022.

  20. Probabilistic chip-collecting games with modulo winning conditions

    Authors: Joshua Harrington, Xuwen Hua, Xufei Liu, Alex Nash, Rodrigo Rios, Tony W. H. Wong

    Abstract: Let $a$, $b$, and $n$ be integers with $0<a<b<n$. In a certain two-player probabilistic chip-collecting game, Alice tosses a coin to determine whether she collects $a$ chips or $b$ chips. If Alice collects $a$ chips, then Bob collects $b$ chips, and vice versa. A player is announced the winner when they have accumulated a number of chips that is a multiple of $n$. In this paper, we settle two conj…

    Submitted 25 January, 2022; originally announced January 2022.

    MSC Class: 05C81

    Journal ref: Discrete Applied Mathematics 324 (2023), 93-98
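The game rules stated in the abstract of entry 20 can be explored with a small simulation. This is only an illustrative sketch, not the authors' analysis: the function name `play_game`, the fair-coin assumption, the `max_rounds` cutoff, and the `"both"` outcome for simultaneous multiples of $n$ are all assumptions introduced here.

```python
import random


def play_game(a, b, n, rng, max_rounds=10_000):
    """Simulate one chip-collecting game with 0 < a < b < n.

    Each round a coin toss (assumed fair) decides whether Alice takes a
    chips and Bob takes b, or the other way round. A player wins once
    their total is a multiple of n. Returns "Alice", "Bob", "both"
    (hypothetical label for a simultaneous hit), or None if nobody has
    won within max_rounds.
    """
    alice = bob = 0
    for _ in range(max_rounds):
        if rng.random() < 0.5:
            alice, bob = alice + a, bob + b
        else:
            alice, bob = alice + b, bob + a
        alice_wins = alice % n == 0
        bob_wins = bob % n == 0
        if alice_wins and bob_wins:
            return "both"
        if alice_wins:
            return "Alice"
        if bob_wins:
            return "Bob"
    return None


# Estimate outcome frequencies for one hypothetical parameter choice.
rng = random.Random(0)
outcomes = [play_game(2, 3, 7, rng) for _ in range(1000)]
```

With $a=1$, $b=2$, $n=3$ the two totals always sum to a multiple of 3, so under these assumptions the players can only hit a multiple of $n$ together.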

  21. Offline-Online Associated Camera-Aware Proxies for Unsupervised Person Re-identification

    Authors: Menglin Wang, Jiachen Li, Baisheng Lai, Xiaojin Gong, Xian-Sheng Hua

    Abstract: Recently, unsupervised person re-identification (Re-ID) has received increasing research attention due to its potential for label-free applications. A promising way to address unsupervised Re-ID is clustering-based, which generates pseudo labels by clustering and uses the pseudo labels to train a Re-ID model iteratively. However, most clustering-based methods take each cluster as a pseudo identity…

    Submitted 1 October, 2022; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: Accepted to TIP

  22. arXiv:2112.14380  [pdf, other]

    cs.CV cs.LG

    Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification

    Authors: Beier Zhu, Yulei Niu, Xian-Sheng Hua, Hanwang Zhang

    Abstract: We address the overlooked unbiasedness in existing long-tailed classification methods: we find that their overall improvement is mostly attributed to the biased preference of tail over head, as the test distribution is assumed to be balanced; however, when the test is as imbalanced as the long-tailed training data -- let the test respect Zipf's law of nature -- the tail bias is no longer beneficia…

    Submitted 28 December, 2021; originally announced December 2021.

  23. arXiv:2112.09999  [pdf, ps, other]

    math.CO

    Zero forcing number versus general position number in tree-like graphs

    Authors: Hongbo Hua, Xinying Hua, Sandi Klavžar

    Abstract: Let ${\rm Z}(G)$ and ${\rm gp}(G)$ be the zero forcing number and the general position number of a graph $G$, respectively. Known results imply that ${\rm gp}(T)\ge {\rm Z}(T) + 1$ holds for every nontrivial tree $T$. It is proved that the result extends to block graphs. For connected, unicyclic graphs $G$ it is proved that ${\rm gp}(G) \ge {\rm Z}(G)$. The result extends neither to bicyclic graph…

    Submitted 18 December, 2021; originally announced December 2021.

  24. arXiv:2111.15603  [pdf, other]

    cs.CV

    Human Imperceptible Attacks and Applications to Improve Fairness

    Authors: Xinru Hua, Huanzhong Xu, Jose Blanchet, Viet Nguyen

    Abstract: Modern neural networks are able to perform at least as well as humans in numerous tasks involving object classification and image generation. However, small perturbations which are imperceptible to humans may significantly degrade the performance of well-trained deep neural networks. We provide a Distributionally Robust Optimization (DRO) framework which integrates human-based image quality assess…

    Submitted 30 November, 2021; originally announced November 2021.

  25. arXiv:2111.10032  [pdf, other]

    cs.CV

    Meta Clustering Learning for Large-scale Unsupervised Person Re-identification

    Authors: Xin Jin, Tianyu He, Xu Shen, Tongliang Liu, Xinchao Wang, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua

    Abstract: Unsupervised Person Re-identification (U-ReID) with pseudo labeling recently reaches a competitive performance compared to fully-supervised ReID methods based on modern clustering algorithms. However, such clustering-based scheme becomes computationally prohibitive for large-scale datasets. How to efficiently leverage endless unlabeled data with limited computing resources for better U-ReID is und…

    Submitted 6 August, 2022; v1 submitted 18 November, 2021; originally announced November 2021.

    Comments: Accepted by ACMMM2022

  26. arXiv:2111.03298  [pdf, ps, other]

    math.CO

    Relating the total domination number and the annihilation number for quasi-trees and some composite graphs

    Authors: Hongbo Hua, Xinying Hua, Sandi Klavžar, Kexiang Xu

    Abstract: The total domination number $γ_{t}(G)$ of a graph $G$ is the cardinality of a smallest set $D\subseteq V(G)$ such that each vertex of $G$ has a neighbor in $D$. The annihilation number $a(G)$ of $G$ is the largest integer $k$ such that there exist $k$ different vertices in $G$ with the degree sum at most $m(G)$. It is conjectured that $γ_{t}(G)\leq a(G)+1$ holds for every nontrivial connected grap…

    Submitted 23 April, 2022; v1 submitted 5 November, 2021; originally announced November 2021.
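The two invariants defined in the abstract of entry 26 can be computed by brute force on small graphs. The sketch below is an illustration under stated assumptions, not the paper's method: the function names and the adjacency-dictionary representation are hypothetical, $m(G)$ is taken to be the number of edges, and the total domination search is exponential, so it only suits tiny examples.

```python
from itertools import combinations


def total_domination_number(adj):
    """Smallest |D| such that every vertex has a neighbour in D.

    adj maps each vertex to the set of its neighbours. Returns None if
    no such set exists (for example, when a vertex is isolated).
    """
    vertices = list(adj)
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            chosen = set(candidate)
            if all(adj[v] & chosen for v in vertices):
                return size
    return None


def annihilation_number(adj):
    """Largest k such that some k vertices have degree sum <= m(G).

    Taking the k smallest degrees minimises the sum, so a greedy pass
    over the sorted degree sequence is enough.
    """
    degrees = sorted(len(neighbours) for neighbours in adj.values())
    edge_count = sum(degrees) // 2  # assumes m(G) is the edge count
    total = 0
    k = 0
    for degree in degrees:
        if total + degree > edge_count:
            break
        total += degree
        k += 1
    return k


# Path on four vertices: 0 - 1 - 2 - 3
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
gamma_t = total_domination_number(path4)
a = annihilation_number(path4)
```

For this path both values come out as 2, which is consistent with the conjectured bound $γ_{t}(G)\leq a(G)+1$ quoted in the abstract.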

  27. arXiv:2111.00406  [pdf, other]

    cs.CV

    PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting

    Authors: Xiaoshuang Chen, Yiru Zhao, Yu Qin, Fei Jiang, Mingyuan Tao, Xiansheng Hua, Hongtao Lu

    Abstract: Crowd counting aims to learn the crowd density distributions and estimate the number of objects (e.g. persons) in images. The perspective effect, which significantly influences the distribution of data points, plays an important role in crowd counting. In this paper, we propose a novel perspective-aware approach called PANet to address the perspective problem. Based on the observation that the siz…

    Submitted 18 August, 2022; v1 submitted 31 October, 2021; originally announced November 2021.

    Comments: The paper is under consideration at Computer Vision and Image Understanding

  28. arXiv:2110.15082  [pdf, other]

    cs.CV

    SpineOne: A One-Stage Detection Framework for Degenerative Discs and Vertebrae

    Authors: Jiabo He, Wei Liu, Yu Wang, Xingjun Ma, Xian-Sheng Hua

    Abstract: Spinal degeneration plagues many elders, office workers, and even the younger generations. Effective pharmic or surgical interventions can help relieve degenerative spine conditions. However, the traditional diagnosis procedure is often too laborious. Clinical experts need to detect discs and vertebrae from spinal magnetic resonance imaging (MRI) or computed tomography (CT) images as a preliminary…

    Submitted 10 November, 2021; v1 submitted 28 October, 2021; originally announced October 2021.

  29. arXiv:2110.13675  [pdf, other]

    cs.CV

    Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

    Authors: Jiabo He, Sarah Erfani, Xingjun Ma, James Bailey, Ying Chi, Xian-Sheng Hua

    Abstract: Bounding box (bbox) regression is a fundamental task in computer vision. So far, the most commonly used loss functions for bbox regression are the Intersection over Union (IoU) loss and its variants. In this paper, we generalize existing IoU-based losses to a new family of power IoU losses that have a power IoU term and an additional power regularization term with a single power parameter $α$. We…

    Submitted 22 January, 2022; v1 submitted 26 October, 2021; originally announced October 2021.
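To make the "power IoU term" in the abstract of entry 29 concrete, here is a minimal sketch for axis-aligned 2D boxes. It assumes the simplest form, `1 - IoU**alpha`, and omits the additional power regularization term the abstract mentions; the function names, the `(x1, y1, x2, y2)` box layout, and the default `alpha` value are illustrative assumptions, not details taken from the listing.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def power_iou_loss(box_a, box_b, alpha=3.0):
    """Sketch of a power IoU loss: 1 - IoU**alpha (regularizer omitted).

    With alpha = 1 this reduces to the ordinary IoU loss 1 - IoU.
    """
    return 1.0 - box_iou(box_a, box_b) ** alpha


loss_plain = power_iou_loss((0, 0, 2, 2), (1, 1, 3, 3), alpha=1.0)
loss_power = power_iou_loss((0, 0, 2, 2), (1, 1, 3, 3), alpha=3.0)
```

Because IoU lies in [0, 1], raising it to a power greater than 1 shrinks it, so the power loss is never smaller than the plain IoU loss for the same pair of boxes.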

  30. arXiv:2110.05096  [pdf, other]

    cs.LG

    Density-Based Clustering with Kernel Diffusion

    Authors: Chao Zheng, Yingjie Chen, Chong Chen, Jianqiang Huang, Xian-Sheng Hua

    Abstract: Finding a suitable density function is essential for density-based clustering algorithms such as DBSCAN and DPC. A naive density corresponding to the indicator function of a unit $d$-dimensional Euclidean ball is commonly used in these algorithms. Such density suffers from capturing local features in complex datasets. To tackle this issue, we propose a new kernel diffusion density function, which…

    Submitted 14 October, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

  31. arXiv:2108.10723  [pdf, other]

    cs.CV

    Improving 3D Object Detection with Channel-wise Transformer

    Authors: Hualian Sheng, Sijia Cai, Yuan Liu, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Min-Jian Zhao

    Abstract: Though 3D object detection from point clouds has achieved rapid progress in recent years, the lack of flexible and high-performance proposal refinement remains a great hurdle for existing state-of-the-art two-stage detectors. Previous works on refining 3D proposals have relied on human-designed components such as keypoints sampling, set abstraction and multi-scale feature fusion to produce powerfu…

    Submitted 14 September, 2021; v1 submitted 22 August, 2021; originally announced August 2021.

    Comments: Accepted by ICCV2021

  32. arXiv:2107.13269  [pdf, other]

    cs.CV

    Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth

    Authors: Chenhang He, Jianqiang Huang, Xian-Sheng Hua, Lei Zhang

    Abstract: Current geometry-based monocular 3D object detection models can efficiently detect objects by leveraging perspective geometry, but their performance is limited due to the absence of accurate depth information. Though this issue can be alleviated in a depth-based model where a depth estimation module is plugged to predict depth information before 3D box reasoning, the introduction of such module dr…

    Submitted 28 July, 2021; originally announced July 2021.

    Comments: 10 pages, 8 figures

  33. arXiv:2107.11055  [pdf, other]

    cs.CV

    Transporting Causal Mechanisms for Unsupervised Domain Adaptation

    Authors: Zhongqi Yue, Qianru Sun, Xian-Sheng Hua, Hanwang Zhang

    Abstract: Existing Unsupervised Domain Adaptation (UDA) literature adopts the covariate shift and conditional shift assumptions, which essentially encourage models to learn common features across domains. However, due to the lack of supervision in the target domain, they suffer from the semantic loss: the feature will inevitably lose non-discriminative semantics in source domain, which is however discrimina…

    Submitted 28 July, 2021; v1 submitted 23 July, 2021; originally announced July 2021.

    Comments: ICCV 2021 Oral

  34. arXiv:2107.00181  [pdf, other]

    cs.LG cs.AI cs.CV

    Revisiting Knowledge Distillation: An Inheritance and Exploration Framework

    Authors: Zhen Huang, Xu Shen, Jun Xing, Tongliang Liu, Xinmei Tian, Houqiang Li, Bing Deng, Jianqiang Huang, Xian-Sheng Hua

    Abstract: Knowledge Distillation (KD) is a popular technique to transfer knowledge from a teacher model or ensemble to a student model. Its success is generally attributed to the privileged information on similarities/consistency between the class distributions or intermediate feature representations of the teacher model and the student model. However, directly pushing the student model to mimic the probabi…

    Submitted 30 June, 2021; originally announced July 2021.

    Comments: Accepted by CVPR 2021

  35. arXiv:2106.00791  [pdf, other]

    cs.CL

    DYPLOC: Dynamic Planning of Content Using Mixed Language Models for Text Generation

    Authors: Xinyu Hua, Ashwin Sreevatsa, Lu Wang

    Abstract: We study the task of long-form opinion text generation, which faces at least two distinct challenges. First, existing neural generation models fall short of coherence, thus requiring efficient content planning. Second, diverse types of information are needed to guide the generator to cover both subjective and objective content. To this end, we propose DYPLOC, a generation framework that conducts d…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: Accepted at ACL 2021. Project page: https://xinyuhua.github.io/Resources/acl21/

  36. MIG Median Detectors with Manifold Filter

    Authors: Xiaoqiang Hua, Linyu Peng

    Abstract: In this paper, we propose a class of median-based matrix information geometry (MIG) detectors with a manifold filter and apply them to signal detection in nonhomogeneous environments. As customary, the sample data is assumed to be modeled as Hermitian positive-definite (HPD) matrices, and the geometric median of a set of HPD matrices is interpreted as an estimate of the clutter covariance matrix (…

    Submitted 26 May, 2021; originally announced May 2021.

    Comments: 22 pages, 12 figures

    Journal ref: Signal Processing 188, 108176, 2021

  37. arXiv:2105.11876  [pdf, other]

    cs.IR

    Criterion-based Heterogeneous Collaborative Filtering for Multi-behavior Implicit Recommendation

    Authors: Xiao Luo, Daqing Wu, Yiyang Gu, Chong Chen, Luchen Liu, Jinwen Ma, Ming Zhang, Minghua Deng, Jianqiang Huang, Xian-Sheng Hua

    Abstract: Recent years have witnessed the explosive growth of interaction behaviors in multimedia information systems, where multi-behavior recommender systems have received increasing attention by leveraging data from various auxiliary behaviors such as tip and collect. Among various multi-behavior recommendation methods, non-sampling methods have shown superiority over negative sampling methods. However,…

    Submitted 25 July, 2023; v1 submitted 25 May, 2021; originally announced May 2021.

    Comments: Accepted by ACM Transactions on Knowledge Discovery from Data (TKDD)

  38. Attention-guided Temporally Coherent Video Object Matting

    Authors: Yunke Zhang, Chi Wang, Miaomiao Cui, Peiran Ren, Xuansong Xie, Xian-sheng Hua, Hujun Bao, Qixing Huang, Weiwei Xu

    Abstract: This paper proposes a novel deep learning-based video object matting method that can achieve temporally coherent matting results. Its key component is an attention-based temporal aggregation module that maximizes image matting networks' strength for video matting networks. This module computes temporal correlations for pixels adjacent to each other along the time axis in feature space, which is ro…

    Submitted 29 July, 2021; v1 submitted 24 May, 2021; originally announced May 2021.

    Comments: 10 pages, 6 figures, MM '21 camera-ready

  39. arXiv:2104.14430  [pdf, other]

    cs.CV

    Discriminative-Generative Dual Memory Video Anomaly Detection

    Authors: Xin Guo, Zhongming Jin, Chong Chen, Helei Nie, Jianqiang Huang, Deng Cai, Xiaofei He, Xiansheng Hua

    Abstract: Recently, people tried to use a few anomalies for video anomaly detection (VAD) instead of only normal data during the training process. A side effect of data imbalance occurs when a few abnormal data face a vast number of normal data. The latest VAD works use triplet loss or data re-sampling strategy to lessen this problem. However, there is still no elaborately designed structure for discriminat…

    Submitted 29 April, 2021; originally announced April 2021.

  40. arXiv:2104.01429  [pdf, other]

    cs.CV

    Graph Contrastive Clustering

    Authors: Huasong Zhong, Jianlong Wu, Chong Chen, Jianqiang Huang, Minghua Deng, Liqiang Nie, Zhouchen Lin, Xian-Sheng Hua

    Abstract: Recently, some contrastive learning methods have been proposed to simultaneously learn representations and clustering assignments, achieving significant improvements. However, these methods do not take the category information and clustering objective into consideration, thus the learned representations are not optimal for clustering and the performance might be limited. Towards this issue, we fir…

    Submitted 3 April, 2021; originally announced April 2021.

    Comments: 10

  41. arXiv:2104.00875  [pdf, other]

    cs.CV

    Half-Real Half-Fake Distillation for Class-Incremental Semantic Segmentation

    Authors: Zilong Huang, Wentian Hao, Xinggang Wang, Mingyuan Tao, Jianqiang Huang, Wenyu Liu, Xian-Sheng Hua

    Abstract: Despite their success for semantic segmentation, convolutional neural networks are ill-equipped for incremental learning, i.e., adapting the original segmentation model as new classes are available but the initial training data is not retained. Actually, they are vulnerable to catastrophic forgetting problem. We try to address this issue by "inverting" the trained segmentation network to synthesize…

    Submitted 1 April, 2021; originally announced April 2021.

  42. arXiv:2103.15537  [pdf, other]

    cs.CV

    Cloth-Changing Person Re-identification from A Single Image with Gait Prediction and Regularization

    Authors: Xin Jin, Tianyu He, Kecheng Zheng, Zhiheng Yin, Xu Shen, Zhen Huang, Ruoyu Feng, Jianqiang Huang, Xian-Sheng Hua, Zhibo Chen

    Abstract: Cloth-Changing person re-identification (CC-ReID) aims at matching the same person across different locations over a long-duration, e.g., over days, and therefore inevitably meets challenge of changing clothing. In this paper, we focus on handling well the CC-ReID problem under a more challenging setting, i.e., just from a single image, which enables high-efficiency and latency-free pedestrian ide…

    Submitted 31 March, 2022; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: Accepted by CVPR 2022. arXiv admin note: text overlap with arXiv:2002.02295 by other authors

  43. arXiv:2103.09013  [pdf, other]

    cs.CV

    Dense Interaction Learning for Video-based Person Re-identification

    Authors: Tianyu He, Xin Jin, Xu Shen, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua

    Abstract: Video-based person re-identification (re-ID) aims at matching the same person across video clips. Efficiently exploiting multi-scale fine-grained features while building the structural interaction among them is pivotal for its success. In this paper, we propose a hybrid framework, Dense Interaction Learning (DenseIL), that takes the principal advantages of both CNN-based and Attention-based archit…

    Submitted 16 August, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

    Comments: ICCV 2021, Oral

  44. arXiv:2103.01737  [pdf, other]

    cs.AI

    Distilling Causal Effect of Data in Class-Incremental Learning

    Authors: Xinting Hu, Kaihua Tang, Chunyan Miao, Xian-Sheng Hua, Hanwang Zhang

    Abstract: We propose a causal framework to explain the catastrophic forgetting in Class-Incremental Learning (CIL) and then derive a novel distillation method that is orthogonal to the existing anti-forgetting techniques, such as data replay and feature/label distillation. We first 1) place CIL into the framework, 2) answer why the forgetting happens: the causal effect of the old data is lost in new trainin…

    Submitted 7 March, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

  45. arXiv:2103.00887  [pdf, other]

    cs.CV cs.AI

    Counterfactual Zero-Shot and Open-Set Visual Recognition

    Authors: Zhongqi Yue, Tan Wang, Hanwang Zhang, Qianru Sun, Xian-Sheng Hua

    Abstract: We present a novel counterfactual framework for both Zero-Shot Learning (ZSL) and Open-Set Recognition (OSR), whose common challenge is generalizing to the unseen-classes by only training on the seen-classes. Our idea stems from the observation that the generated samples for unseen-classes are often out of the true distribution, which causes severe recognition rate imbalance between the seen-class…

    Submitted 1 March, 2021; originally announced March 2021.

    Comments: Accepted by CVPR 2021

  46. arXiv:2102.02696  [pdf, other]

    cs.CV

    Active Boundary Loss for Semantic Segmentation

    Authors: Chi Wang, Yunke Zhang, Miaomiao Cui, Peiran Ren, Yin Yang, Xuansong Xie, XianSheng Hua, Hujun Bao, Weiwei Xu

    Abstract: This paper proposes a novel active boundary loss for semantic segmentation. It can progressively encourage the alignment between predicted boundaries and ground-truth boundaries during end-to-end training, which is not explicitly enforced in commonly used cross-entropy loss. Based on the predicted boundaries detected from the segmentation results using current network parameters, we formulate the…

    Submitted 3 February, 2022; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: 7 pages, 6 figures; Accepted by AAAI 2022

  47. Target Detection within Nonhomogeneous Clutter via Total Bregman Divergence-Based Matrix Information Geometry Detectors

    Authors: Xiaoqiang Hua, Yusuke Ono, Linyu Peng, Yongqiang Cheng, Hongqiang Wang

    Abstract: Information divergences are commonly used to measure the dissimilarity of two elements on a statistical manifold. Differentiable manifolds endowed with different divergences may possess different geometric properties, which can result in totally different performances in many practical applications. In this paper, we propose a total Bregman divergence-based matrix information geometry (TBD-MIG) de…

    Submitted 7 August, 2021; v1 submitted 26 December, 2020; originally announced December 2020.

    Comments: 15 pages, 8 figures

    Journal ref: IEEE Transactions on Signal Processing, 69, 4326-4340, 2021

  48. arXiv:2012.10674  [pdf, other]

    cs.CV

    Camera-aware Proxies for Unsupervised Person Re-Identification

    Authors: Menglin Wang, Baisheng Lai, Jianqiang Huang, Xiaojin Gong, Xian-Sheng Hua

    Abstract: This paper tackles the purely unsupervised person re-identification (Re-ID) problem that requires no annotations. Some previous methods adopt clustering techniques to generate pseudo labels and use the produced labels to train Re-ID models progressively. These methods are relatively simple but effective. However, most clustering-based methods take each cluster as a pseudo identity class, neglectin…

    Submitted 5 February, 2021; v1 submitted 19 December, 2020; originally announced December 2020.

    Comments: Accepted to AAAI 2021. Code is available at: https://github.com/Terminator8758/CAP-master

  49. arXiv:2012.04265  [pdf, other]

    cs.CV

    Learning to Generate Content-Aware Dynamic Detectors

    Authors: Junyi Feng, Jiashen Hua, Baisheng Lai, Jianqiang Huang, Xi Li, Xian-sheng Hua

    Abstract: Model efficiency is crucial for object detection. Most previous works rely on either hand-crafted design or auto-search methods to obtain a static architecture, regardless of the difference of inputs. In this paper, we introduce a new perspective of designing efficient detectors, which is automatically generating sample-adaptive model architecture on the fly. The proposed method is named content-aware…

    Submitted 8 December, 2020; originally announced December 2020.

    Comments: 10 pages, 7 figures

  50. arXiv:2011.13322  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition

    Authors: Zhen Huang, Xu Shen, Xinmei Tian, Houqiang Li, Jianqiang Huang, Xian-Sheng Hua

    Abstract: Skeleton-based human action recognition has attracted much attention with the prevalence of accessible depth sensors. Recently, graph convolutional networks (GCNs) have been widely used for this task due to their powerful capability to model graph data. The topology of the adjacency graph is a key factor for modeling the correlations of the input skeletons. Thus, previous methods mainly focus on t…

    Submitted 19 August, 2021; v1 submitted 26 November, 2020; originally announced November 2020.

    Comments: ACM MM 2020