
Showing 1–50 of 137 results for author: Hua, X

Searching in archive cs.
  1. arXiv:2408.15516  [pdf, other]

    cs.NI

    Predicting Parameter Change's Effect on Cellular Network Time Series

    Authors: Mingjie Li, Yongqian Sun, Xiaolei Hua, Renkai Yu, Xinwen Fan, Lin Zhu, Junlan Feng, Dan Pei

    Abstract: The cellular network provides convenient network access for ever-growing mobile phones. During the continuous optimization, operators can adjust cell parameters to enhance the Quality of Service (QoS) flexibly. A precise prediction of the parameter change's effect can help operators make proper parameter adjustments. This work focuses on predicting cell status (like the workload and QoS) after adj…

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2408.14520  [pdf, other]

    cs.LG cs.AI cs.SI

    Towards Graph Prompt Learning: A Survey and Beyond

    Authors: Qingqing Long, Yuchen Yan, Peiyan Zhang, Chen Fang, Wentao Cui, Zhiyuan Ning, Meng Xiao, Ning Cao, Xiao Luo, Lingjun Xu, Shiyue Jiang, Zheng Fang, Chong Chen, Xian-Sheng Hua, Yuanchun Zhou

    Abstract: Large-scale "pre-train and prompt learning" paradigms have demonstrated remarkable adaptability, enabling broad applications across diverse domains such as question answering, image recognition, and multimodal retrieval. This approach fully leverages the potential of large-scale pre-trained models, reducing downstream data requirements and computational costs while enhancing model applicability ac…

    Submitted 29 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

    Comments: 19 pages, 2 figures

  3. arXiv:2408.12247  [pdf, other]

    cs.AI

    Enhanced Fine-Tuning of Lightweight Domain-Specific Q&A Model Based on Large Language Models

    Authors: Shenglin Zhang, Pengtian Zhu, Minghua Ma, Jiagang Wang, Yongqian Sun, Dongwen Li, Jingyu Wang, Qianying Guo, Xiaolei Hua, Lin Zhu, Dan Pei

    Abstract: Large language models (LLMs) excel at general question-answering (Q&A) but often fall short in specialized domains due to a lack of domain-specific knowledge. Commercial companies face the dual challenges of privacy protection and resource constraints when involving LLMs for fine-tuning. This paper proposes a novel framework, Self-Evolution, designed to address these issues by leveraging lightweigh…

    Submitted 22 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

  4. arXiv:2407.14081  [pdf, other]

    cs.LG cs.AI cs.IR cs.SI

    DisenSemi: Semi-supervised Graph Classification via Disentangled Representation Learning

    Authors: Yifan Wang, Xiao Luo, Chong Chen, Xian-Sheng Hua, Ming Zhang, Wei Ju

    Abstract: Graph classification is a critical task in numerous multimedia applications, where graphs are employed to represent diverse types of multimedia data, including images, videos, and social networks. Nevertheless, in real-world scenarios, labeled graph data can be limited or scarce. To address this issue, we focus on the problem of semi-supervised graph classification, which involves both supervised…

    Submitted 9 August, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

    Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS 2024)

  5. arXiv:2407.09057  [pdf, other]

    cs.CV

    PersonificationNet: Making customized subject act like a person

    Authors: Tianchu Guo, Pengyu Li, Biao Wang, Xiansheng Hua

    Abstract: Customized generation, which uses as few as 3-5 user-provided images to train a model to synthesize new images of a specified subject, has recently shown significant potential. Though subsequent applications enhance the flexibility and diversity of customized generation, fine-grained control over the given subject acting like the person's pose is still understudied. In this paper, we propose a Personifi…

    Submitted 12 July, 2024; originally announced July 2024.

  6. arXiv:2407.03106  [pdf, other]

    cs.CV

    Anti-Collapse Loss for Deep Metric Learning Based on Coding Rate Metric

    Authors: Xiruo Jiang, Yazhou Yao, Xili Dai, Fumin Shen, Xian-Sheng Hua, Heng-Tao Shen

    Abstract: Deep metric learning (DML) aims to learn a discriminative high-dimensional embedding space for downstream tasks like classification, clustering, and retrieval. Prior literature predominantly focuses on pair-based and proxy-based methods to maximize inter-class discrepancy and minimize intra-class diversity. However, these methods tend to suffer from the collapse of the embedding space due to their…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: accepted by IEEE Transactions on Multimedia

  7. arXiv:2405.17477  [pdf, other]

    cs.LG cs.AI

    OLLIE: Imitation Learning from Offline Pretraining to Online Finetuning

    Authors: Sheng Yue, Xingyuan Hua, Ju Ren, Sen Lin, Junshan Zhang, Yaoxue Zhang

    Abstract: In this paper, we study offline-to-online Imitation Learning (IL) that pretrains an imitation policy from static demonstration data, followed by fast finetuning with minimal environmental interaction. We find the naïve combination of existing offline IL and online IL methods tends to behave poorly in this context, because the initial discriminator (often used in online IL) operates randomly and di…

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: International Conference on Machine Learning (ICML)

  8. arXiv:2405.17476  [pdf, other]

    cs.LG cs.AI

    How to Leverage Diverse Demonstrations in Offline Imitation Learning

    Authors: Sheng Yue, Jiani Liu, Xingyuan Hua, Ju Ren, Sen Lin, Junshan Zhang, Yaoxue Zhang

    Abstract: Offline Imitation Learning (IL) with imperfect demonstrations has garnered increasing attention owing to the scarcity of expert data in many real-world domains. A fundamental problem in this scenario is how to extract positive behaviors from noisy data. In general, current approaches to the problem select data building on state-action similarity to given expert demonstrations, neglecting precious…

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: International Conference on Machine Learning (ICML)

  9. arXiv:2405.17474  [pdf, other]

    cs.LG cs.AI

    Federated Offline Policy Optimization with Dual Regularization

    Authors: Sheng Yue, Zerui Qin, Xingyuan Hua, Yongheng Deng, Ju Ren

    Abstract: Federated Reinforcement Learning (FRL) has been deemed as a promising solution for intelligent decision-making in the era of Artificial Internet of Things. However, existing FRL approaches often entail repeated interactions with the environment during local updating, which can be prohibitively expensive or even infeasible in many real-world domains. To overcome this challenge, this paper proposes…

    Submitted 28 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: IEEE International Conference on Computer Communications (INFOCOM)

  10. arXiv:2405.17471  [pdf, other]

    cs.LG cs.AI

    Momentum-Based Federated Reinforcement Learning with Interaction and Communication Efficiency

    Authors: Sheng Yue, Xingyuan Hua, Lili Chen, Ju Ren

    Abstract: Federated Reinforcement Learning (FRL) has garnered increasing attention recently. However, due to the intrinsic spatio-temporal non-stationarity of data distributions, the current approaches typically suffer from high interaction and communication costs. In this paper, we introduce a new FRL algorithm, named $\texttt{MFPO}$, that utilizes momentum, importance sampling, and additional server-side…

    Submitted 28 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: IEEE International Conference on Computer Communications (INFOCOM)

  11. arXiv:2405.11496  [pdf, other]

    cs.CV cs.IR

    DEMO: A Statistical Perspective for Efficient Image-Text Matching

    Authors: Fan Zhang, Xian-Sheng Hua, Chong Chen, Xiao Luo

    Abstract: Image-text matching has been a long-standing problem, which seeks to connect vision and language through semantic understanding. Due to the capability to manage large-scale raw data, unsupervised hashing-based approaches have gained prominence recently. They typically construct a semantic similarity structure using the natural distance, which subsequently provides guidance to the model optimizatio…

    Submitted 19 May, 2024; originally announced May 2024.

  12. arXiv:2404.19282  [pdf, other]

    cs.MM

    Dual Dynamic Threshold Adjustment Strategy for Deep Metric Learning

    Authors: Xiruo Jiang, Yazhou Yao, Sheng Liu, Fumin Shen, Liqiang Nie, Xiansheng Hua

    Abstract: Loss functions and sample mining strategies are essential components in deep metric learning algorithms. However, existing loss functions or mining strategies often necessitate the incorporation of additional hyperparameters, notably the threshold, which defines whether the sample pair is informative. The threshold provides a stable numerical standard for determining whether to retain the pairs.…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: accepted by ACM Transactions on Multimedia Computing, Communications, and Applications

  13. arXiv:2402.13714  [pdf, other]

    q-bio.QM cs.AI cs.LG

    An Evaluation of Large Language Models in Bioinformatics Research

    Authors: Hengchuang Yin, Zhonghui Gu, Fanhao Wang, Yiparemu Abuduhaibaier, Yanqiao Zhu, Xinming Tu, Xian-Sheng Hua, Xiao Luo, Yizhou Sun

    Abstract: Large language models (LLMs) such as ChatGPT have gained considerable interest across diverse research communities. Their notable ability for text completion and generation has inaugurated a novel paradigm for language-interfaced problem solving. However, the potential and efficacy of these models in bioinformatics remain incompletely explored. In this work, we study the performance of LLMs on a wide…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Under review

  14. arXiv:2402.11242  [pdf, other]

    cs.LG cs.AI

    Learning with Imbalanced Noisy Data by Preventing Bias in Sample Selection

    Authors: Huafeng Liu, Mengmeng Sheng, Zeren Sun, Yazhou Yao, Xian-Sheng Hua, Heng-Tao Shen

    Abstract: Learning with noisy labels has gained increasing attention because the inevitable imperfect labels in real-world scenarios can substantially hurt the deep model performance. Recent studies tend to regard low-loss samples as clean ones and discard high-loss ones to alleviate the negative impact of noisy labels. However, real-world datasets contain not only noisy labels but also class imbalance. The…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: accepted by IEEE Transactions on Multimedia

  15. arXiv:2401.06936  [pdf, other]

    cs.LG physics.comp-ph

    Accelerated Sampling of Rare Events using a Neural Network Bias Potential

    Authors: Xinru Hua, Rasool Ahmad, Jose Blanchet, Wei Cai

    Abstract: In the field of computational physics and material science, the efficient sampling of rare events occurring at atomic scale is crucial. It aids in understanding mechanisms behind a wide range of important phenomena, including protein folding, conformal changes, chemical reactions and materials diffusion and deformation. Traditional simulation methods, such as Molecular Dynamics and Monte Carlo, of…

    Submitted 12 January, 2024; originally announced January 2024.

  16. arXiv:2312.09525  [pdf, other]

    cs.CV

    Hierarchical Graph Pattern Understanding for Zero-Shot VOS

    Authors: Gensheng Pei, Fumin Shen, Yazhou Yao, Tao Chen, Xian-Sheng Hua, Heng-Tao Shen

    Abstract: The optical flow guidance strategy is ideal for obtaining motion information of objects in the video. It is widely utilized in video segmentation tasks. However, existing optical flow-based methods have a significant dependency on optical flow, which results in poor performance when the optical flow estimation fails for a particular scene. The temporal consistency provided by the optical flow coul…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: accepted by IEEE Transactions on Image Processing

    Journal ref: IEEE Transactions on Image Processing 2023

  17. arXiv:2311.14395  [pdf, other]

    cs.LG cs.CV

    Multi-scale Semantic Correlation Mining for Visible-Infrared Person Re-Identification

    Authors: Ke Cheng, Xuecheng Hua, Hu Lu, Juanjuan Tu, Yuanquan Wang, Shitong Wang

    Abstract: The main challenge in the Visible-Infrared Person Re-Identification (VI-ReID) task lies in how to extract discriminative features from different modalities for matching purposes. While existing works primarily focus on minimizing the modal discrepancies, the modality information cannot be thoroughly leveraged. To solve this problem, a Multi-scale Semantic Correlation Mining network (MSCM…

    Submitted 24 November, 2023; originally announced November 2023.

  18. arXiv:2311.02342  [pdf, other]

    cs.CV

    Proposal-Level Unsupervised Domain Adaptation for Open World Unbiased Detector

    Authors: Xuanyi Liu, Zhongqi Yue, Xian-Sheng Hua

    Abstract: Open World Object Detection (OWOD) combines open-set object detection with incremental learning capabilities to handle the challenge of the open and dynamic visual world. Existing works assume that a foreground predictor trained on the seen categories can be directly transferred to identify the unseen categories' locations by selecting the top-k most confident foreground predictions. However, the…

    Submitted 4 November, 2023; originally announced November 2023.

  19. arXiv:2309.12028  [pdf, other]

    cs.LG cs.AI cs.SI

    Dynamic Hypergraph Structure Learning for Traffic Flow Forecasting

    Authors: Yusheng Zhao, Xiao Luo, Wei Ju, Chong Chen, Xian-Sheng Hua, Ming Zhang

    Abstract: This paper studies the problem of traffic flow forecasting, which aims to predict future traffic conditions on the basis of road networks and traffic conditions in the past. The problem is typically solved by modeling complex spatio-temporal correlations in traffic data using spatio-temporal graph neural networks (GNNs). However, the performance of these methods is still far from satisfactory sinc…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: Accepted by 2023 IEEE 39th International Conference on Data Engineering (ICDE 2023)

  20. arXiv:2308.09694  [pdf, other]

    cs.CV

    Invariant Training 2D-3D Joint Hard Samples for Few-Shot Point Cloud Recognition

    Authors: Xuanyu Yi, Jiajun Deng, Qianru Sun, Xian-Sheng Hua, Joo-Hwee Lim, Hanwang Zhang

    Abstract: We tackle the data scarcity challenge in few-shot point cloud recognition of 3D objects by using a joint prediction from a conventional 3D model and a well-trained 2D model. Surprisingly, such an ensemble, though seemingly trivial, has hardly been shown effective in recent 2D-3D models. We find out the crux is the less effective training for the "joint hard samples", which have high confidence predi…

    Submitted 18 August, 2023; originally announced August 2023.

  21. arXiv:2307.15271  [pdf, other]

    cs.CV

    Anatomy-Aware Lymph Node Detection in Chest CT using Implicit Station Stratification

    Authors: Ke Yan, Dakai Jin, Dazhou Guo, Minfeng Xu, Na Shen, Xian-Sheng Hua, Xianghua Ye, Le Lu

    Abstract: Finding abnormal lymph nodes in radiological images is highly important for various medical tasks such as cancer metastasis staging and radiotherapy planning. Lymph nodes (LNs) are small glands scattered throughout the body. They are grouped or defined to various LN stations according to their anatomical locations. The CT imaging appearance and context of LNs in different stations vary significant…

    Submitted 27 July, 2023; originally announced July 2023.

  22. arXiv:2307.08249  [pdf, other]

    cs.CV

    Random Boxes Are Open-world Object Detectors

    Authors: Yanghao Wang, Zhongqi Yue, Xian-Sheng Hua, Hanwang Zhang

    Abstract: We show that classifiers trained with random region proposals achieve state-of-the-art Open-world Object Detection (OWOD): they can not only maintain the accuracy of the known objects (w/ training labels), but also considerably improve the recall of unknown ones (w/o training labels). Specifically, we propose RandBox, a Fast R-CNN based architecture trained on random proposals at each training ite…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: ICCV 2023

  23. arXiv:2307.01098  [pdf]

    physics.med-ph cs.AI

    Automated identification and quantification of myocardial inflammatory infiltration in digital histological images to diagnose myocarditis

    Authors: Yanyun Liu, Xiumeng Hua, Shouping Zhu, Congrui Wang, Xiao Chen, Yu Shi, Jiangping Song, Weihua Zhou

    Abstract: This study aims to develop a new computational pathology approach that automates the identification and quantification of myocardial inflammatory infiltration in digital HE-stained images to provide a quantitative histological diagnosis of myocarditis. 898 HE-stained whole slide images (WSIs) of myocardium from 154 heart transplant patients diagnosed with myocarditis or dilated cardiomyopathy (DCM)…

    Submitted 22 May, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 21 pages, 5 figures, 6 tables, 25 references

  24. arXiv:2306.04979  [pdf, other]

    cs.LG cs.AI

    CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification

    Authors: Nan Yin, Li Shen, Mengzhu Wang, Long Lan, Zeyu Ma, Chong Chen, Xian-Sheng Hua, Xiao Luo

    Abstract: Although graph neural networks (GNNs) have achieved impressive achievements in graph classification, they often need abundant task-specific labels, which could be extensively costly to acquire. A credible solution is to explore additional labeled graphs to enhance unsupervised learning on the target domain. However, how to apply GNNs to domain adaptation remains unsolved owing to the insufficient…

    Submitted 29 July, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

  25. arXiv:2305.17898  [pdf]

    cs.CV cs.AI

    Convolutional neural network based on sparse graph attention mechanism for MRI super-resolution

    Authors: Xin Hua, Zhijiang Du, Hongjian Yu, Jixin Maa

    Abstract: Magnetic resonance imaging (MRI) is a valuable clinical tool for displaying anatomical structures and aiding in accurate diagnosis. Medical image super-resolution (SR) reconstruction using deep learning techniques can enhance lesion analysis and assist doctors in improving diagnostic efficiency and accuracy. However, existing deep learning-based SR methods predominantly rely on convolutional neura…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: 12 pages, 6 figures

    ACM Class: I.4.5

  26. arXiv:2305.11421  [pdf, other]

    cs.CV cs.AI

    PastNet: Introducing Physical Inductive Biases for Spatio-temporal Video Prediction

    Authors: Hao Wu, Wei Xiong, Fan Xu, Xiao Luo, Chong Chen, Xian-Sheng Hua, Haixin Wang

    Abstract: In this paper, we investigate the challenge of spatio-temporal video prediction, which involves generating future videos based on historical data streams. Existing approaches typically utilize external information such as semantic maps to enhance video prediction, which often neglect the inherent physical knowledge embedded within videos. Furthermore, their high computational demands could impede…

    Submitted 24 May, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: 11

    MSC Class: 14J60 (Primary) 14F05; 14J26 (Secondary) ACM Class: I.2.6; I.5

  27. arXiv:2305.03944  [pdf, other]

    cs.CV

    Structural and Statistical Texture Knowledge Distillation for Semantic Segmentation

    Authors: Deyi Ji, Haoran Wang, Mingyuan Tao, Jianqiang Huang, Xian-Sheng Hua, Hongtao Lu

    Abstract: Existing knowledge distillation works for semantic segmentation mainly focus on transferring high-level contextual knowledge from teacher to student. However, low-level texture knowledge is also of vital importance for characterizing the local structural pattern and global statistical property, such as boundary, smoothness, regularity and color contrast, which may not be well addressed by high-lev…

    Submitted 5 July, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

    Comments: Accepted to CVPR 2022

  28. arXiv:2304.11688  [pdf, other]

    cs.LG cs.AI cs.IR

    TGNN: A Joint Semi-supervised Framework for Graph-level Classification

    Authors: Wei Ju, Xiao Luo, Meng Qu, Yifan Wang, Chong Chen, Minghua Deng, Xian-Sheng Hua, Ming Zhang

    Abstract: This paper studies semi-supervised graph classification, a crucial task with a wide range of applications in social network analysis and bioinformatics. Recent works typically adopt graph neural networks to learn graph-level representations for classification, failing to explicitly leverage features derived from graph topology (e.g., paths). Moreover, when labeled data is scarce, these methods are…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: Accepted by Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI 2022)

  29. Dynamic Flows on Curved Space Generated by Labeled Data

    Authors: Xinru Hua, Truyen Nguyen, Tam Le, Jose Blanchet, Viet Anh Nguyen

    Abstract: The scarcity of labeled data is a long-standing challenge for many machine learning tasks. We propose our gradient flow method to leverage the existing dataset (i.e., source) to generate new samples that are close to the dataset of interest (i.e., target). We lift both datasets to the space of probability distributions on the feature-Gaussian manifold, and then develop a gradient flow method that…

    Submitted 31 January, 2023; originally announced February 2023.

    Report number: 5213182

    Journal ref: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, 2023, 3803--3811

  30. FECANet: Boosting Few-Shot Semantic Segmentation with Feature-Enhanced Context-Aware Network

    Authors: Huafeng Liu, Pai Peng, Tao Chen, Qiong Wang, Yazhou Yao, Xian-Sheng Hua

    Abstract: Few-shot semantic segmentation is the task of learning to locate each pixel of the novel class in the query image with only a few annotated support images. The current correlation-based methods construct pair-wise feature correlations to establish the many-to-many matching because the typical prototype-based approaches cannot learn fine-grained correspondence relations. However, the existing metho…

    Submitted 19 January, 2023; originally announced January 2023.

    Comments: accepted by IEEE Transactions on Multimedia

  31. arXiv:2301.02299  [pdf, other]

    cs.CL cs.AI cs.LG

    Sequentially Controlled Text Generation

    Authors: Alexander Spangher, Xinyu Hua, Yao Ming, Nanyun Peng

    Abstract: While GPT-2 generates sentences that are remarkably human-like, longer documents can ramble and do not follow human-like writing structure. We study the problem of imposing structure on long-range text. We propose a novel controlled text generation task, sequentially controlled text generation, and identify a dataset, NewsDiscourse as a starting point for this task. We develop a sequential control…

    Submitted 5 January, 2023; originally announced January 2023.

    Comments: 19 pages. 10 pages main body, 3 pages references, 6 pages appendix

    Journal ref: Findings of the 2022 Conference on Empirical Methods in Natural Language Processing

  32. arXiv:2212.01579  [pdf, other]

    cs.CV

    Box2Mask: Box-supervised Instance Segmentation via Level-set Evolution

    Authors: Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Risheng Yu, Xiansheng Hua, Lei Zhang

    Abstract: In contrast to fully supervised methods using pixel-wise mask labels, box-supervised instance segmentation takes advantage of simple box annotations, which has recently attracted increasing research attention. This paper presents a novel single-shot instance segmentation approach, namely Box2Mask, which integrates the classical level-set evolution model into deep neural network learning to achieve…

    Submitted 3 December, 2022; originally announced December 2022.

    Comments: 29 pages, 7 figures, 14 tables. arXiv admin note: text overlap with arXiv:2207.09055

  33. arXiv:2211.07143  [pdf]

    eess.IV cs.CV

    WSC-Trans: A 3D network model for automatic multi-structural segmentation of temporal bone CT

    Authors: Xin Hua, Zhijiang Du, Hongjian Yu, Jixin Ma, Fanjun Zheng, Cheng Zhang, Qiaohui Lu, Hui Zhao

    Abstract: Cochlear implantation is currently the most effective treatment for patients with severe deafness, but mastering cochlear implantation is extremely challenging because the temporal bone has extremely complex and small three-dimensional anatomical structures, and it is important to avoid damaging the corresponding structures when performing surgery. The spatial location of the relevant anatomical t…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: 10 pages, 7 figures

  34. arXiv:2208.03462  [pdf, other]

    cs.CV

    Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization

    Authors: Jiaxin Qi, Kaihua Tang, Qianru Sun, Xian-Sheng Hua, Hanwang Zhang

    Abstract: Out-Of-Distribution generalization (OOD) is all about learning invariance against environmental changes. If the context in every class is evenly distributed, OOD would be trivial because the context can be easily removed due to an underlying principle: class is invariant to context. However, collecting such a balanced dataset is impractical. Learning on imbalanced data makes the model biased to cont…

    Submitted 31 March, 2023; v1 submitted 6 August, 2022; originally announced August 2022.

    Comments: Accepted by ECCV 2022

  35. arXiv:2207.13378  [pdf, other]

    cs.CV

    Identifying Hard Noise in Long-Tailed Sample Distribution

    Authors: Xuanyu Yi, Kaihua Tang, Xian-Sheng Hua, Joo-Hwee Lim, Hanwang Zhang

    Abstract: Conventional de-noising methods rely on the assumption that all samples are independent and identically distributed, so the resultant classifier, though disturbed by noise, can still easily identify the noises as the outliers of training distribution. However, the assumption is unrealistic in large-scale data that is inevitably long-tailed. Such imbalanced training data makes a classifier less dis…

    Submitted 31 March, 2023; v1 submitted 27 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022 (Oral); datasets and code are available at https://github.com/yxymessi/H2E-Framework

  36. arXiv:2207.13259  [pdf, other]

    cs.CV cs.AI cs.LG

    Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition

    Authors: Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Xian-Sheng Hua, Lei Zhang

    Abstract: Transformer-based methods have recently achieved great advancement on 2D image-based vision tasks. For 3D video-based tasks such as action recognition, however, directly applying spatiotemporal transformers on video data will bring heavy computation and memory burdens due to the largely increased number of patches and the quadratic complexity of self-attention computation. How to efficiently and e…

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: Accepted by ECCV22

  37. arXiv:2207.09332  [pdf, other]

    cs.CV

    Rethinking IoU-based Optimization for Single-stage 3D Object Detection

    Authors: Hualian Sheng, Sijia Cai, Na Zhao, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Min-Jian Zhao, Gim Hee Lee

    Abstract: Since Intersection-over-Union (IoU) based optimization maintains the consistency of the final IoU prediction metric and losses, it has been widely used in both regression and classification branches of single-stage 2D object detectors. Recently, several 3D object detection methods adopt IoU-based optimization and directly replace the 2D IoU with 3D IoU. However, such a direct computation in 3D is…

    Submitted 20 July, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted by ECCV 2022. The code is available at https://github.com/hlsheng1/RDIoU

  38. arXiv:2207.09055  [pdf, other]

    cs.CV

    Box-supervised Instance Segmentation with Level Set Evolution

    Authors: Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Xiansheng Hua, Lei Zhang

    Abstract: In contrast to the fully supervised methods using pixel-wise mask labels, box-supervised instance segmentation takes advantage of the simple box annotations, which has recently attracted a lot of research attention. In this paper, we propose a novel single-shot box-supervised instance segmentation approach, which integrates the classical level set model with deep neural network delicately. Specif…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: 17 pages, 4 figures, ECCV 2022

  39. arXiv:2207.04829  [pdf, ps, other]

    cs.IT eess.SP

    Low-complexity Joint Phase Adjustment and Receive Beamforming for Directional Modulation Networks via IRS

    Authors: Rongen Dong, Shaohua Jiang, Xinhai Hua, Yin Teng, Feng Shu, Jiangzhou Wang

    Abstract: Intelligent reflecting surface (IRS) is a revolutionary and low-cost technology for boosting the spectrum and energy efficiencies in future wireless communication networks. In order to create controllable multipath transmission in the conventional line-of-sight (LOS) wireless communication environment, an IRS-aided directional modulation (DM) network is considered. In this paper, to improve the tra…

    Submitted 11 July, 2022; originally announced July 2022.

  40. arXiv:2207.02812  [pdf, other]

    cs.CV

    Towards Counterfactual Image Manipulation via CLIP

    Authors: Yingchen Yu, Fangneng Zhan, Rongliang Wu, Jiahui Zhang, Shijian Lu, Miaomiao Cui, Xuansong Xie, Xian-Sheng Hua, Chunyan Miao

    Abstract: Leveraging StyleGAN's expressivity and its disentangled latent codes, existing methods can achieve realistic editing of different visual attributes such as age and gender of facial images. An intriguing yet challenging problem arises: Can generative models achieve counterfactual editing against their learnt priors? Due to the lack of counterfactual samples in natural datasets, we investigate this…

    Submitted 12 July, 2022; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: This paper has been accepted to ACM MM 2022; code may be found here: https://github.com/yingchen001/CF-CLIP

  41. arXiv:2206.14923  [pdf, other]

    cs.CV cs.LG

    On Non-Random Missing Labels in Semi-Supervised Learning

    Authors: Xinting Hu, Yulei Niu, Chunyan Miao, Xian-Sheng Hua, Hanwang Zhang

    Abstract: Semi-Supervised Learning (SSL) is fundamentally a missing label problem, in which the label Missing Not At Random (MNAR) problem is more realistic and challenging, compared to the widely-adopted yet naive Missing Completely At Random assumption where both labeled and unlabeled data share the same class distribution. Different from existing SSL solutions that overlook the role of "class" in causing…

    Submitted 29 June, 2022; originally announced June 2022.

    Journal ref: ICLR 2022

  42. arXiv:2206.11476

    cs.CV

    Dynamic Scene Deblurring Based on Continuous Cross-Layer Attention Transmission

    Authors: Xia Hua, Mingxin Li, Junxiong Fei, Yu Shi, JianGuo Liu, Hanyu Hong

    Abstract: The deep convolutional neural networks (CNNs) using attention mechanism have achieved great success for dynamic scene deblurring. In most of these networks, only the features refined by the attention maps can be passed to the next layer and the attention maps of different layers are separated from each other, which does not make full use of the attention information from different layers in the CN…

    Submitted 28 January, 2023; v1 submitted 23 June, 2022; originally announced June 2022.

  43. arXiv:2206.07662

    cs.CV

    SP-ViT: Learning 2D Spatial Priors for Vision Transformers

    Authors: Yuxuan Zhou, Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Lei Zhang, Margret Keuper, Xiansheng Hua

    Abstract: Recently, transformers have shown great potential in image classification and established state-of-the-art results on the ImageNet benchmark. However, compared to CNNs, transformers converge slowly and are prone to overfitting in low-data regimes due to the lack of spatial inductive biases. Such spatial inductive biases can be especially beneficial since the 2D structure of an input image is not w…

    Submitted 15 June, 2022; originally announced June 2022.

    ACM Class: I.4

  44. arXiv:2205.05675

    cs.CV eess.IV

    NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

    Authors: Yawei Li, Kai Zhang, Radu Timofte, Luc Van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, et al. (86 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that achieved improvement of e…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Validation code of the baseline model is available at https://github.com/ofsoundof/IMDN. Validation of all submitted models is available at https://github.com/ofsoundof/NTIRE2022_ESR

  45. arXiv:2204.11278

    eess.SP cs.IT stat.ML

    Unsupervised Learning Discriminative MIG Detectors in Nonhomogeneous Clutter

    Authors: Xiaoqiang Hua, Yusuke Ono, Linyu Peng, Yuting Xu

    Abstract: Principal component analysis (PCA) is a commonly used pattern analysis method that maps high-dimensional data into a lower-dimensional space maximizing the data variance, that results in the promotion of separability of data. Inspired by the principle of PCA, a novel type of learning discriminative matrix information geometry (MIG) detectors in the unsupervised scenario are developed, and applied…

    Submitted 8 May, 2022; v1 submitted 24 April, 2022; originally announced April 2022.

    Comments: 14 pages, 6 figures

    Journal ref: IEEE Transactions on Communications 70, 4107-4120, 2022

  46. arXiv:2204.07300

    cs.CV

    Dense Learning based Semi-Supervised Object Detection

    Authors: Binghui Chen, Pengyu Li, Xiang Chen, Biao Wang, Lei Zhang, Xian-Sheng Hua

    Abstract: Semi-supervised object detection (SSOD) aims to facilitate the training and deployment of object detectors with the help of a large amount of unlabeled data. Though various self-training based and consistency-regularization based SSOD methods have been proposed, most of them are anchor-based detectors, ignoring the fact that in many real-world applications anchor-free detectors are more demanded.…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: CVPR 2022

  47. Spatial Likelihood Voting with Self-Knowledge Distillation for Weakly Supervised Object Detection

    Authors: Ze Chen, Zhihang Fu, Jianqiang Huang, Mingyuan Tao, Rongxin Jiang, Xiang Tian, Yaowu Chen, Xian-sheng Hua

    Abstract: Weakly supervised object detection (WSOD), which is an effective way to train an object detection model using only image-level annotations, has attracted considerable attention from researchers. However, most of the existing methods, which are based on multiple instance learning (MIL), tend to localize instances to the discriminative parts of salient objects instead of the entire content of all ob…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: arXiv admin note: text overlap with arXiv:2006.12884

    Journal ref: Image and Vision Computing, Volume 116, 2021, 104314, ISSN 0262-8856

  48. arXiv:2204.00826

    cs.CV

    Online Convolutional Re-parameterization

    Authors: Mu Hu, Junyi Feng, Jiashen Hua, Baisheng Lai, Jianqiang Huang, Xiaojin Gong, Xiansheng Hua

    Abstract: Structural re-parameterization has drawn increasing attention in various computer vision tasks. It aims at improving the performance of deep models without introducing any inference-time cost. Though efficient during inference, such models rely heavily on the complicated training-time blocks to achieve high accuracy, leading to large extra training cost. In this paper, we present online convolutio…

    Submitted 2 April, 2022; originally announced April 2022.

    Comments: Accepted by CVPR 2022

  49. arXiv:2204.00754

    cs.CV

    Homography Loss for Monocular 3D Object Detection

    Authors: Jiaqi Gu, Bojian Wu, Lubin Fan, Jianqiang Huang, Shen Cao, Zhiyu Xiang, Xian-Sheng Hua

    Abstract: Monocular 3D object detection is an essential task in autonomous driving. However, most current methods consider each 3D object in the scene as an independent training sample, while ignoring their inherent geometric relations, thus inevitably resulting in a lack of leveraging spatial constraints. In this paper, we propose a novel method that takes all the objects into consideration and explores th…

    Submitted 1 April, 2022; originally announced April 2022.

    Comments: 8 pages, 5 figures. Accepted to CVPR 2022

  50. arXiv:2204.00707

    cs.CL

    Efficient Argument Structure Extraction with Transfer Learning and Active Learning

    Authors: Xinyu Hua, Lu Wang

    Abstract: The automation of extracting argument structures faces a pair of challenges on (1) encoding long-term contexts to facilitate comprehensive understanding, and (2) improving data efficiency since constructing high-quality argument structures is time-consuming. In this work, we propose a novel context-aware Transformer-based argument structure prediction model which, on five different domains, signif…

    Submitted 1 April, 2022; originally announced April 2022.

    Comments: Findings of ACL 2022, long paper