
Showing 1–50 of 1,318 results for author: Zhou, Z

Searching in archive cs. Search in all archives.
  1. arXiv:2409.03024  [pdf, other]

    cs.LG

    NUMOSIM: A Synthetic Mobility Dataset with Anomaly Detection Benchmarks

    Authors: Chris Stanford, Suman Adari, Xishun Liao, Yueshuai He, Qinhua Jiang, Chenchen Kuai, Jiaqi Ma, Emmanuel Tung, Yinlong Qian, Lingyi Zhao, Zihao Zhou, Zeeshan Rasheed, Khurram Shafique

    Abstract: Collecting real-world mobility data is challenging. It is often fraught with privacy concerns, logistical difficulties, and inherent biases. Moreover, accurately annotating anomalies in large-scale data is nearly impossible, as it demands meticulous effort to distinguish subtle and complex patterns. These challenges significantly impede progress in geospatial anomaly detection research by restrict…

    Submitted 4 September, 2024; originally announced September 2024.

  2. arXiv:2409.02567  [pdf, other]

    cs.CV

    Evaluation Study on SAM 2 for Class-agnostic Instance-level Segmentation

    Authors: Tiantian Zhang, Zhangjun Zhou, Jialun Pei

    Abstract: Segment Anything Model (SAM) has demonstrated powerful zero-shot segmentation performance in natural scenes. The recently released Segment Anything Model 2 (SAM2) has further heightened researchers' expectations towards image segmentation capabilities. To evaluate the performance of SAM2 on class-agnostic instance-level segmentation tasks, we adopt different prompt strategies for SAM2 to cope with…

    Submitted 4 September, 2024; originally announced September 2024.

  3. arXiv:2409.01977  [pdf, other]

    cs.LG

    Counterfactual Fairness by Combining Factual and Counterfactual Predictions

    Authors: Zeyu Zhou, Tianci Liu, Ruqi Bai, Jing Gao, Murat Kocaoglu, David I. Inouye

    Abstract: In high-stake domains such as healthcare and hiring, the role of machine learning (ML) in decision-making raises significant fairness concerns. This work focuses on Counterfactual Fairness (CF), which posits that an ML model's outcome on any individual should remain unchanged if they had belonged to a different demographic group. Previous works have proposed methods that guarantee CF. Notwithstand…

    Submitted 3 September, 2024; originally announced September 2024.

  4. arXiv:2409.01807  [pdf, other]

    cs.CV

    EPRecon: An Efficient Framework for Real-Time Panoptic 3D Reconstruction from Monocular Video

    Authors: Zhen Zhou, Yunkai Ma, Junfeng Fan, Shaolin Zhang, Fengshui Jing, Min Tan

    Abstract: Panoptic 3D reconstruction from a monocular video is a fundamental perceptual task in robotic scene understanding. However, existing efforts suffer from inefficiency in terms of inference speed and accuracy, limiting their practical applicability. We present EPRecon, an efficient real-time panoptic 3D reconstruction framework. Current volumetric-based reconstruction methods usually utilize multi-v…

    Submitted 3 September, 2024; originally announced September 2024.

  5. arXiv:2409.00997  [pdf, other]

    cs.CL

    DataSculpt: Crafting Data Landscapes for LLM Post-Training through Multi-objective Partitioning

    Authors: Keer Lu, Zheng Liang, Xiaonan Nie, Da Pan, Shusen Zhang, Keshi Zhao, Weipeng Chen, Zenan Zhou, Guosheng Dong, Wentao Zhang, Bin Cui

    Abstract: The effectiveness of long-context modeling is important for Large Language Models (LLMs) in various applications. Despite their potential, LLMs' efficacy in processing long context does not consistently meet expectations, posing significant challenges for efficient management of prolonged sequences in training. This difficulty is compounded by the scarcity of comprehensive and diverse training dat…

    Submitted 2 September, 2024; originally announced September 2024.

  6. arXiv:2409.00843  [pdf, other]

    econ.GN cs.CE cs.CY q-fin.CP stat.ML

    Global Public Sentiment on Decentralized Finance: A Spatiotemporal Analysis of Geo-tagged Tweets from 150 Countries

    Authors: Yuqi Chen, Yifan Li, Kyrie Zhixuan Zhou, Xiaokang Fu, Lingbo Liu, Shuming Bao, Daniel Sui, Luyao Zhang

    Abstract: In the digital era, blockchain technology, cryptocurrencies, and non-fungible tokens (NFTs) have transformed financial and decentralized systems. However, existing research often neglects the spatiotemporal variations in public sentiment toward these technologies, limiting macro-level insights into their global impact. This study leverages Twitter data to explore public attention and sentiment acr…

    Submitted 1 September, 2024; originally announced September 2024.

  7. arXiv:2409.00730  [pdf, other]

    cs.LG stat.ML

    Generating Physical Dynamics under Priors

    Authors: Zihan Zhou, Xiaoxue Wang, Tianshu Yu

    Abstract: Generating physically feasible dynamics in a data-driven context is challenging, especially when adhering to physical priors expressed in specific equations or formulas. Existing methodologies often overlook the integration of physical priors, resulting in violation of basic physical laws and suboptimal performance. In this paper, we introduce a novel framework that seamlessly incorporates physica…

    Submitted 1 September, 2024; originally announced September 2024.

  8. arXiv:2409.00497  [pdf, other]

    quant-ph cs.CR

    Security Loophole Induced by Photorefractive Effect in Continuous-variable Quantum Key Distribution System

    Authors: Zehao Zhou, Peng Huang, Tao Wang, Guihua Zeng

    Abstract: Modulators based on the Mach-Zehnder interferometer (MZI) structure are widely used in continuous-variable quantum key distribution (CVQKD) systems. MZI-based variable optical attenuator (VOA) and amplitude modulator can reshape the waveform and control the intensity of coherent state signal to realize secret key information modulation in CVQKD system. However, these devices are not ideal, interna…

    Submitted 31 August, 2024; originally announced September 2024.

  9. arXiv:2408.17396  [pdf, other]

    cs.LG stat.ML

    Fairness-Aware Estimation of Graphical Models

    Authors: Zhuoping Zhou, Davoud Ataee Tarzanagh, Bojian Hou, Qi Long, Li Shen

    Abstract: This paper examines the issue of fairness in the estimation of graphical models (GMs), particularly Gaussian, Covariance, and Ising models. These models play a vital role in understanding complex relationships in high-dimensional data. However, standard GMs can result in biased outcomes, especially when the underlying data involves sensitive characteristics or protected groups. To address this, we…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: 32 Pages, 9 Figures

  10. arXiv:2408.15663  [pdf, other]

    cs.RO

    NeuroVE: Brain-inspired Linear-Angular Velocity Estimation with Spiking Neural Networks

    Authors: Xiao Li, Xieyuanli Chen, Ruibin Guo, Yujie Wu, Zongtan Zhou, Fangwen Yu, Huimin Lu

    Abstract: Vision-based ego-velocity estimation is a fundamental problem in robot state estimation. However, the constraints of frame-based cameras, including motion blur and insufficient frame rates in dynamic settings, readily lead to the failure of conventional velocity estimation techniques. Mammals exhibit a remarkable ability to accurately estimate their ego-velocity during aggressive movement. Hence,…

    Submitted 28 August, 2024; originally announced August 2024.

  11. arXiv:2408.15251  [pdf, other]

    cs.CV cs.LG

    TrajFM: A Vehicle Trajectory Foundation Model for Region and Task Transferability

    Authors: Yan Lin, Tonglong Wei, Zeyu Zhou, Haomin Wen, Jilin Hu, Shengnan Guo, Youfang Lin, Huaiyu Wan

    Abstract: Vehicle trajectories provide valuable movement information that supports various downstream tasks and powers real-world applications. A desirable trajectory learning model should transfer between different regions and tasks without retraining, thus improving computational efficiency and effectiveness with limited training data. However, a model's ability to transfer across regions is limited by th…

    Submitted 9 August, 2024; originally announced August 2024.

  12. arXiv:2408.15079  [pdf, other]

    cs.CL cs.AI

    BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline

    Authors: Guosheng Dong, Da Pan, Yiding Sun, Shusen Zhang, Zheng Liang, Xin Wu, Yanjun Shen, Fan Yang, Haoze Sun, Tianpeng Li, Mingan Lin, Jianhua Xu, Yufan Zhang, Xiaonan Nie, Lei Su, Bingning Wang, Wentao Zhang, Jiaxin Mao, Zenan Zhou, Weipeng Chen

    Abstract: The general capabilities of Large Language Models (LLMs) rely heavily on the composition and selection of extensive pretraining datasets, treated as commercial secrets by several institutions. To mitigate this issue, we open-source the details of a universally applicable data processing pipeline and validate its effectiveness and potential by introducing a competitive LLM baseline. Specifically, the…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: 19 pages, 6 figures

  13. arXiv:2408.14254  [pdf, other]

    q-bio.NC cs.LG

    Integrated Brain Connectivity Analysis with fMRI, DTI, and sMRI Powered by Interpretable Graph Neural Networks

    Authors: Gang Qu, Ziyu Zhou, Vince D. Calhoun, Aiying Zhang, Yu-Ping Wang

    Abstract: Multimodal neuroimaging modeling has become a widely used approach but confronts considerable challenges due to heterogeneity, which encompasses variability in data types, scales, and formats across modalities. This variability necessitates the deployment of advanced computational methods to integrate and interpret these diverse datasets within a cohesive analytical framework. In our research, we…

    Submitted 26 August, 2024; originally announced August 2024.

  14. arXiv:2408.14185  [pdf, other]

    cs.AI cs.RO

    DynamicRouteGPT: A Real-Time Multi-Vehicle Dynamic Navigation Framework Based on Large Language Models

    Authors: Ziai Zhou, Bin Zhou, Hao Liu

    Abstract: Real-time dynamic path planning in complex traffic environments presents challenges, such as varying traffic volumes and signal wait times. Traditional static routing algorithms like Dijkstra and A* compute shortest paths but often fail under dynamic conditions. Recent Reinforcement Learning (RL) approaches offer improvements but tend to focus on local optima, risking dead-ends or boundary issues.…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: This paper is 12 pages long and represents the initial draft, version 1

  15. arXiv:2408.12121  [pdf, other]

    cs.HC cs.AI

    Emotion-Agent: Unsupervised Deep Reinforcement Learning with Distribution-Prototype Reward for Continuous Emotional EEG Analysis

    Authors: Zhihao Zhou, Qile Liu, Jiyuan Wang, Zhen Liang

    Abstract: Continuous electroencephalography (EEG) signals are widely used in affective brain-computer interface (aBCI) applications. However, not all continuously collected EEG signals are relevant or meaningful to the task at hand (e.g., wandering thoughts). On the other hand, manually labeling the relevant parts is nearly impossible due to varying engagement patterns across different tasks and individuals…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: 11 pages, 4 figures, 4 tables, submitted to AAAI 2025

  16. arXiv:2408.11449  [pdf, other]

    cs.AI

    Enabling Small Models for Zero-Shot Classification through Model Label Learning

    Authors: Jia Zhang, Zhi Zhou, Lan-Zhe Guo, Yu-Feng Li

    Abstract: Vision-language models (VLMs) like CLIP have demonstrated impressive zero-shot ability in image classification tasks by aligning text and images but suffer inferior performance compared with task-specific expert models. On the contrary, expert models excel in their specialized domains but lack zero-shot ability for new tasks. How to obtain both the high performance of expert models and zero-shot a…

    Submitted 21 August, 2024; originally announced August 2024.

  17. arXiv:2408.10943  [pdf, other]

    cs.CL

    SysBench: Can Large Language Models Follow System Messages?

    Authors: Yanzhao Qin, Tao Zhang, Tao Zhang, Yanjun Shen, Wenjing Luo, Haoze Sun, Yan Zhang, Yujing Qiao, Weipeng Chen, Zenan Zhou, Wentao Zhang, Bin Cui

    Abstract: Large Language Models (LLMs) have become instrumental across various applications, with the customization of these models to specific scenarios becoming increasingly critical. The system message, a fundamental component of LLMs, consists of carefully crafted instructions that guide the behavior of the model to meet intended goals. Despite the recognized potential of system messages to optimize AI-driven…

    Submitted 20 August, 2024; originally announced August 2024.

  18. arXiv:2408.10939  [pdf, other]

    cs.LG stat.ML

    Conformalized Interval Arithmetic with Symmetric Calibration

    Authors: Rui Luo, Zhixin Zhou

    Abstract: Uncertainty quantification is essential in decision-making, especially when joint distributions of random variables are involved. While conformal prediction provides distribution-free prediction sets with valid coverage guarantees, it traditionally focuses on single predictions. This paper introduces novel conformal prediction methods for estimating the sum or average of unknown labels over specif…

    Submitted 20 August, 2024; originally announced August 2024.

  19. arXiv:2408.10861  [pdf, other]

    cs.RO cs.HC

    DVRP-MHSI: Dynamic Visualization Research Platform for Multimodal Human-Swarm Interaction

    Authors: Pengming Zhu, Zhiwen Zeng, Weijia Yao, Wei Dai, Huimin Lu, Zongtan Zhou

    Abstract: In recent years, there has been a significant amount of research on algorithms and control methods for distributed collaborative robots. However, the emergence of collective behavior in a swarm is still difficult to predict and control. Nevertheless, human interaction with the swarm helps render the swarm more predictable and controllable, as human operators can utilize intuition or knowledge that…

    Submitted 20 August, 2024; originally announced August 2024.

  20. arXiv:2408.10653  [pdf, other]

    cs.CV

    UIE-UnFold: Deep Unfolding Network with Color Priors and Vision Transformer for Underwater Image Enhancement

    Authors: Yingtie Lei, Jia Yu, Yihang Dong, Changwei Gong, Ziyang Zhou, Chi-Man Pun

    Abstract: Underwater image enhancement (UIE) plays a crucial role in various marine applications, but it remains challenging due to the complex underwater environment. Current learning-based approaches frequently lack explicit incorporation of prior knowledge about the physical processes involved in underwater image formation, resulting in limited optimization despite their impressive enhancement results. T…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: Accepted by DSAA CIVIL 2024

  21. arXiv:2408.09143  [pdf, ps, other]

    math.NA cs.LG

    Point Source Identification Using Singularity Enriched Neural Networks

    Authors: Tianhao Hu, Bangti Jin, Zhi Zhou

    Abstract: The inverse problem of recovering point sources represents an important class of applied inverse problems. However, there is still a lack of neural network-based methods for point source identification, mainly due to the inherent solution singularity. In this work, we develop a novel algorithm to identify point sources, utilizing a neural network combined with a singularity enrichment technique. W…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: 22 pages

  22. arXiv:2408.09074  [pdf, ps, other]

    cs.LG math.OC

    Gradient-Variation Online Learning under Generalized Smoothness

    Authors: Yan-Feng Xie, Peng Zhao, Zhi-Hua Zhou

    Abstract: Gradient-variation online learning aims to achieve regret guarantees that scale with the variations in the gradients of online functions, which has been shown to be crucial for attaining fast convergence in games and robustness in stochastic optimization, hence receiving increased attention. Existing results often require the smoothness condition by imposing a fixed bound on the gradient Lipschitz…

    Submitted 16 August, 2024; originally announced August 2024.

  23. arXiv:2408.08588  [pdf, other]

    cs.IT eess.SP

    Movable Antenna for Wireless Communications: Prototyping and Experimental Results

    Authors: Zhenjun Dong, Zhiwen Zhou, Zhiqiang Xiao, Chaoyue Zhang, Xinrui Li, Hongqi Min, Yong Zeng, Shi Jin, Rui Zhang

    Abstract: Movable antenna (MA), which can flexibly change the position of the antenna in three-dimensional (3D) continuous space, is an emerging technology for achieving full spatial performance gains. In this paper, a prototype of an MA communication system with ultra-accurate movement control is presented to verify the performance gain of MA in practical environments. The prototype utilizes the feedback control…

    Submitted 16 August, 2024; originally announced August 2024.

  24. arXiv:2408.07663  [pdf, other]

    cs.CL cs.AI

    Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions

    Authors: Quan Liu, Zhenhong Zhou, Longzhu He, Yi Liu, Wei Zhang, Sen Su

    Abstract: Large language models are susceptible to jailbreak attacks, which can result in the generation of harmful content. While prior defenses mitigate these risks by perturbing or inspecting inputs, they ignore competing objectives, the underlying cause of alignment failures. In this paper, we propose Alignment-Enhanced Decoding (AED), a novel defense that employs adaptive decoding to address the root c…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 15 pages, 5 figures

  25. arXiv:2408.07543  [pdf, other]

    cs.CV cs.CL

    MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark

    Authors: Minxuan Zhou, Hao Liang, Tianpeng Li, Zhiyu Wu, Mingan Lin, Linzhuang Sun, Yaqi Zhou, Yan Zhang, Xiaoqin Huang, Yicong Chen, Yujing Qiao, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou

    Abstract: With the development of Multimodal Large Language Models (MLLMs), the evaluation of multimodal models in the context of mathematical problems has become a valuable research field. Multimodal visual-textual mathematical reasoning serves as a critical indicator for evaluating the comprehension and complex multi-step quantitative reasoning abilities of MLLMs. However, previous multimodal math benchma…

    Submitted 23 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

  26. arXiv:2408.06294  [pdf, other]

    cs.HC

    AniBalloons: Animated Chat Balloons as Affective Augmentation for Social Messaging and Chatbot Interaction

    Authors: Pengcheng An, Chaoyu Zhang, Haichen Gao, Ziqi Zhou, Yage Xiao, Jian Zhao

    Abstract: Despite being prominent and ubiquitous, message-based interaction is limited in nonverbally conveying emotions. Besides emoticons or stickers, messaging users continue seeking richer options for affective communication. Recent research explored using chat balloons' shape and color to communicate emotional states. However, little work explored whether and how chat-balloon animations could be design…

    Submitted 14 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

    Comments: under the 2nd review after minor revision by International Journal of Human-Computer Studies

  27. arXiv:2408.05751  [pdf, other]

    cs.IR cs.CV

    Advancing Re-Ranking with Multimodal Fusion and Target-Oriented Auxiliary Tasks in E-Commerce Search

    Authors: Enqiang Xu, Xinhui Li, Zhigong Zhou, Jiahao Ji, Jinyuan Zhao, Dadong Miao, Songlin Wang, Lin Liu, Sulong Xu

    Abstract: In the rapidly evolving field of e-commerce, the effectiveness of search re-ranking models is crucial for enhancing user experience and driving conversion rates. Despite significant advancements in feature representation and model architecture, the integration of multimodal information remains underexplored. This study addresses this gap by investigating the computation and fusion of textual and v…

    Submitted 11 August, 2024; originally announced August 2024.

  28. arXiv:2408.05669  [pdf, other]

    cs.CV cs.AI

    StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model

    Authors: Ziyin Zhou, Ke Sun, Zhongxi Chen, Huafeng Kuang, Xiaoshuai Sun, Rongrong Ji

    Abstract: The rapid progress in generative models has given rise to the critical task of AI-Generated Content Stealth (AIGC-S), which aims to create AI-generated images that can evade both forensic detectors and human inspection. This task is crucial for understanding the vulnerabilities of existing detection methods and developing more robust techniques. However, current adversarial attacks often introduce…

    Submitted 10 August, 2024; originally announced August 2024.

  29. arXiv:2408.04916  [pdf, other]

    cs.LG

    PTrajM: Efficient and Semantic-rich Trajectory Learning with Pretrained Trajectory-Mamba

    Authors: Yan Lin, Yichen Liu, Zeyu Zhou, Haomin Wen, Erwen Zheng, Shengnan Guo, Youfang Lin, Huaiyu Wan

    Abstract: Vehicle trajectories provide crucial movement information for various real-world applications. To better utilize vehicle trajectories, it is essential to develop a trajectory learning approach that can effectively and efficiently extract rich semantic information, including movement behavior and travel purposes, to support accurate downstream applications. However, creating such an approach presen…

    Submitted 9 August, 2024; originally announced August 2024.

  30. arXiv:2408.04172  [pdf, other]

    cs.CV cs.MM

    MultiColor: Image Colorization by Learning from Multiple Color Spaces

    Authors: Xiangcheng Du, Zhao Zhou, Yanlong Wang, Zhuoyao Wang, Yingbin Zheng, Cheng Jin

    Abstract: Deep networks have shown impressive performance in image restoration tasks, such as image colorization. However, we find that previous approaches rely on the digital representation from a single color model with a specific mapping function, a.k.a. color space, during the colorization pipeline. In this paper, we first investigate the modeling of different color spaces, and find each of them exhi…

    Submitted 7 August, 2024; originally announced August 2024.

  31. arXiv:2408.03892  [pdf, other]

    cs.SE cs.AI

    MORTAR: A Model-based Runtime Action Repair Framework for AI-enabled Cyber-Physical Systems

    Authors: Renzhi Wang, Zhehua Zhou, Jiayang Song, Xuan Xie, Xiaofei Xie, Lei Ma

    Abstract: Cyber-Physical Systems (CPSs) are increasingly prevalent across various industrial and daily-life domains, with applications ranging from robotic operations to autonomous driving. With recent advancements in artificial intelligence (AI), learning-based components, especially AI controllers, have become essential in enhancing the functionality and efficiency of CPSs. However, the lack of interpreta…

    Submitted 7 August, 2024; originally announced August 2024.

  32. arXiv:2408.03650  [pdf, other]

    cs.MM

    Towards Multimodal Emotional Support Conversation Systems

    Authors: Yuqi Chu, Lizi Liao, Zhiyuan Zhou, Chong-Wah Ngo, Richang Hong

    Abstract: The integration of conversational artificial intelligence (AI) into mental health care promises a new horizon for therapist-client interactions, aiming to closely emulate the depth and nuance of human conversations. Despite the potential, the current landscape of conversational AI is markedly limited by its reliance on single-modal data, constraining the systems' ability to empathize and provide e…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  33. Unsupervised Domain Adaption Harnessing Vision-Language Pre-training

    Authors: Wenlve Zhou, Zhiheng Zhou

    Abstract: This paper addresses two vital challenges in Unsupervised Domain Adaptation (UDA) with a focus on harnessing the power of Vision-Language Pre-training (VLP) models. Firstly, UDA has primarily relied on ImageNet pre-trained models. However, the potential of VLP models in UDA remains largely unexplored. The rich representation of VLP models holds significant promise for enhancing UDA tasks. To addre…

    Submitted 4 August, 2024; originally announced August 2024.

  34. Decide: Knowledge-Based Version Incompatibility Detection in Deep Learning Stacks

    Authors: Zihan Zhou, Zhongkai Zhao, Bonan Kou, Tianyi Zhang

    Abstract: Version incompatibility issues are prevalent when reusing or reproducing deep learning (DL) models and applications. Compared with official API documentation, which is often incomplete or out-of-date, Stack Overflow (SO) discussions possess a wealth of version knowledge that has not been explored by previous approaches. To bridge this gap, we present Decide, a web-based visualization of a knowledg…

    Submitted 4 August, 2024; originally announced August 2024.

  35. arXiv:2408.01428  [pdf, other]

    cs.CV cs.AI

    Transferable Adversarial Facial Images for Privacy Protection

    Authors: Minghui Li, Jiangxiong Wang, Hao Zhang, Ziqi Zhou, Shengshan Hu, Xiaobing Pei

    Abstract: The success of deep face recognition (FR) systems has raised serious privacy concerns due to their ability to enable unauthorized tracking of users in the digital world. Previous studies proposed introducing imperceptible adversarial noises into face images to deceive those face recognition models, thus achieving the goal of enhancing facial privacy protection. Nevertheless, they heavily rely on u…

    Submitted 17 July, 2024; originally announced August 2024.

    Comments: Accepted by ACM MM 2024

  36. arXiv:2408.01122  [pdf, other]

    cs.CL

    CFBench: A Comprehensive Constraints-Following Benchmark for LLMs

    Authors: Tao Zhang, Yanjun Shen, Wenjing Luo, Yan Zhang, Hao Liang, Tao Zhang, Fan Yang, Mingan Lin, Yujing Qiao, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou

    Abstract: The adeptness of Large Language Models (LLMs) in comprehending and following natural language instructions is critical for their deployment in sophisticated real-world applications. Existing evaluations mainly focus on fragmented constraints or narrow scenarios, but they overlook the comprehensiveness and authenticity of constraints from the user's perspective. To bridge this gap, we propose CFBen…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: 15 pages, 10 figures

  37. arXiv:2407.21531  [pdf, other]

    cs.SD cs.CL cs.MM eess.AS

    Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation

    Authors: Ziya Zhou, Yuhang Wu, Zhiyue Wu, Xinyue Zhang, Ruibin Yuan, Yinghao Ma, Lu Wang, Emmanouil Benetos, Wei Xue, Yike Guo

    Abstract: Symbolic Music, akin to language, can be encoded in discrete symbols. Recent research has extended the application of large language models (LLMs) such as GPT-4 and Llama2 to the symbolic music domain including understanding and generation. Yet scant research explores the details of how these LLMs perform on advanced music understanding and conditioned generation, especially from the multi-step re…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: Accepted by ISMIR2024

  38. arXiv:2407.21369  [pdf, other]

    cs.SE

    An LLM-based Readability Measurement for Unit Tests' Context-aware Inputs

    Authors: Zhichao Zhou, Yutian Tang, Yun Lin, Jingzhu He

    Abstract: Automated test techniques usually generate unit tests with higher code coverage than manual tests. However, the readability of automated tests is crucial for code comprehension and maintenance. The readability of unit tests involves many aspects. In this paper, we focus on test inputs. The central limitation of existing studies on input readability is that they focus on test codes alone without ta…

    Submitted 18 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  39. arXiv:2407.21002  [pdf, other]

    cs.CV cs.AI

    XHand: Real-time Expressive Hand Avatar

    Authors: Qijun Gan, Zijie Zhou, Jianke Zhu

    Abstract: Hand avatars play a pivotal role in a wide array of digital interfaces, enhancing user immersion and facilitating natural interaction within virtual environments. While previous studies have focused on photo-realistic hand rendering, little attention has been paid to reconstructing the hand geometry with fine details, which is essential to rendering quality. In the realms of extended reality and gami…

    Submitted 30 July, 2024; originally announced July 2024.

  40. arXiv:2407.20635  [pdf, other]

    cs.RO cs.AI

    Autonomous Improvement of Instruction Following Skills via Foundation Models

    Authors: Zhiyuan Zhou, Pranav Atreya, Abraham Lee, Homer Walke, Oier Mees, Sergey Levine

    Abstract: Intelligent instruction-following robots capable of improving from autonomously collected experience have the potential to transform robot learning: instead of collecting costly teleoperated demonstration data, large-scale deployment of fleets of robots can quickly collect larger quantities of autonomous data that can collectively improve their performance. However, autonomous improvement requires…

    Submitted 30 July, 2024; originally announced July 2024.

  41. arXiv:2407.20242  [pdf, other]

    cs.CY cs.AI cs.RO

    The Threats of Embodied Multimodal LLMs: Jailbreaking Robotic Manipulation in the Physical World

    Authors: Hangtao Zhang, Chenyu Zhu, Xianlong Wang, Ziqi Zhou, Yichen Wang, Lulu Xue, Minghui Li, Shengshan Hu, Leo Yu Zhang

    Abstract: Embodied artificial intelligence (AI) represents an artificial intelligence system that interacts with the physical world through sensors and actuators, seamlessly integrating perception and action. This design enables AI to learn from and operate within complex, real-world environments. Large Language Models (LLMs) deeply explore language instructions, playing a crucial role in devising plans for…

    Submitted 15 August, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: Preliminary version (17 pages, 4 figures). Work in progress, revisions ongoing. Appreciate understanding and welcome any feedback

  42. arXiv:2407.20223  [pdf, other]

    cs.CV cs.RO

    Correspondence-Free SE(3) Point Cloud Registration in RKHS via Unsupervised Equivariant Learning

    Authors: Ray Zhang, Zheming Zhou, Min Sun, Omid Ghasemalizadeh, Cheng-Hao Kuo, Ryan Eustice, Maani Ghaffari, Arnie Sen

    Abstract: This paper introduces a robust unsupervised SE(3) point cloud registration method that operates without requiring point correspondences. The method frames point clouds as functions in a reproducing kernel Hilbert space (RKHS), leveraging SE(3)-equivariant features for direct feature space registration. A novel RKHS distance metric is proposed, offering reliable performance amidst noise, outliers,…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: 10 pages, to be published in ECCV 2024

  43. arXiv:2407.19215  [pdf, ps, other]

    cs.CR

    Learning Sparse Parity with Noise in Linear Samples

    Authors: Xue Chen, Wenxuan Shu, Zhaienhe Zhou

    Abstract: We revisit the learning parity with noise problem with a sparse secret that involves at most $k$ out of $n$ variables. Let $η$ denote the noise rate such that each label gets flipped with probability $η$. In this work, we show algorithms in the low-noise setting and high-noise setting separately. We present an algorithm of running time $O(η\cdot n/k)^k$ for any $η$ and $k$ satisfying $n>k/η$. This…

    Submitted 27 July, 2024; originally announced July 2024.

  44. arXiv:2407.17457  [pdf, other]

    cs.CV cs.RO

    CSCPR: Cross-Source-Context Indoor RGB-D Place Recognition

    Authors: Jing Liang, Zhuo Deng, Zheming Zhou, Min Sun, Omid Ghasemalizadeh, Cheng-Hao Kuo, Arnie Sen, Dinesh Manocha

    Abstract: We present a new algorithm, Cross-Source-Context Place Recognition (CSCPR), for RGB-D indoor place recognition that integrates global retrieval and reranking into a single end-to-end model. Unlike prior approaches that primarily focus on the RGB domain, CSCPR is designed to handle the RGB-D data. We extend the Context-of-Clusters (CoCs) for handling noisy colorized point clouds and introduce two n…

    Submitted 24 July, 2024; originally announced July 2024.

  45. arXiv:2407.16697  [pdf, other]

    cs.CV

    AbdomenAtlas: A Large-Scale, Detailed-Annotated, & Multi-Center Dataset for Efficient Transfer Learning and Open Algorithmic Benchmarking

    Authors: Wenxuan Li, Chongyu Qu, Xiaoxi Chen, Pedro R. A. S. Bassi, Yijia Shi, Yuxiang Lai, Qian Yu, Huimin Xue, Yixiong Chen, Xiaorui Lin, Yutong Tang, Yining Cao, Haoqi Han, Zheyuan Zhang, Jiawei Liu, Tiezheng Zhang, Yujiu Ma, Jincheng Wang, Guang Zhang, Alan Yuille, Zongwei Zhou

    Abstract: We introduce the largest abdominal CT dataset (termed AbdomenAtlas) of 20,460 three-dimensional CT volumes sourced from 112 hospitals across diverse populations, geographies, and facilities. AbdomenAtlas provides 673K high-quality masks of anatomical structures in the abdominal region annotated by a team of 10 radiologists with the help of AI algorithms. We start by having expert radiologists manu…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: Published in Medical Image Analysis

  46. arXiv:2407.16637  [pdf, other]

    cs.CL cs.AI cs.LG

    Course-Correction: Safety Alignment Using Synthetic Preferences

    Authors: Rongwu Xu, Yishuo Cai, Zhenhong Zhou, Renjie Gu, Haiqin Weng, Yan Liu, Tianwei Zhang, Wei Xu, Han Qiu

    Abstract: The risk of harmful content generated by large language models (LLMs) becomes a critical concern. This paper presents a systematic study on assessing and improving LLMs' capability to perform the task of course-correction, i.e., the model can steer away from generating harmful content autonomously. To start with, we introduce the C$^2$-Eval benchmark for quantitative assessment an…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: Dataset and script will be available at https://github.com/pillowsofwind/Course-Correction

  47. arXiv:2407.16406  [pdf, other]

    cs.CV cs.LG

    Hi-EF: Benchmarking Emotion Forecasting in Human-interaction

    Authors: Haoran Wang, Xinji Mai, Zeng Tao, Yan Wang, Jiawen Yu, Ziheng Zhou, Xuan Tong, Shaoqi Yan, Qing Zhao, Shuyong Gao, Wenqiang Zhang

    Abstract: Affective Forecasting, a research direction in psychology that predicts individuals' future emotions, is often constrained by numerous external factors like social influence and temporal distance. To address this, we transform Affective Forecasting into a Deep Learning problem by designing an Emotion Forecasting paradigm based on two-party interactions. We propose a novel Emotion Forecasting (EF) t…

    Submitted 23 July, 2024; originally announced July 2024.

  48. Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

    Authors: Letian Gong, Huaiyu Wan, Shengnan Guo, Xiucheng Li, Yan Lin, Erwen Zheng, Tianyi Wang, Zeyu Zhou, Youfang Lin

    Abstract: The rapid growth of location-based services (LBS) has yielded massive amounts of data on human mobility. Effectively extracting meaningful representations for user-generated check-in sequences is pivotal for facilitating various downstream services. However, the user-generated check-in data are simultaneously influenced by the surrounding objective circumstances and the user's subjective intention…

    Submitted 25 July, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted as a regular paper at IEEE TKDE

  49. arXiv:2407.15590  [pdf, other]

    cs.CV

    All rivers run into the sea: Unified Modality Brain-like Emotional Central Mechanism

    Authors: Xinji Mai, Junxiong Lin, Haoran Wang, Zeng Tao, Yan Wang, Shaoqi Yan, Xuan Tong, Jiawen Yu, Boyang Wang, Ziheng Zhou, Qing Zhao, Shuyong Gao, Wenqiang Zhang

    Abstract: In the field of affective computing, fully leveraging information from a variety of sensory modalities is essential for the comprehensive understanding and processing of human emotions. Inspired by the process through which the human brain handles emotions and the theory of cross-modal plasticity, we propose UMBEnet, a brain-like unified modal affective processing network. The primary design of UM…

    Submitted 22 July, 2024; originally announced July 2024.

  50. arXiv:2407.15366  [pdf, other]

    cs.CL cs.AI cs.CY

    Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias

    Authors: Rongwu Xu, Zi'an Zhou, Tianwei Zhang, Zehan Qi, Su Yao, Ke Xu, Wei Xu, Han Qiu

    Abstract: The common toxicity and societal bias in contents generated by large language models (LLMs) necessitate strategies to reduce harm. Present solutions often demand white-box access to the model or substantial training, which is impractical for cutting-edge commercial LLMs. Moreover, prevailing prompting methods depend on external tool feedback and fail to simultaneously lessen toxicity and bias. Mot…

    Submitted 22 July, 2024; originally announced July 2024.