Showing 1–50 of 253 results for author: Wei, S

Searching in archive cs.
  1. arXiv:2409.02817  [pdf, other]

    cs.CR cs.LG

    Obsidian: Cooperative State-Space Exploration for Performant Inference on Secure ML Accelerators

    Authors: Sarbartha Banerjee, Shijia Wei, Prakash Ramrakhyani, Mohit Tiwari

    Abstract: Trusted execution environments (TEEs) for machine learning accelerators are indispensable in secure and efficient ML inference. Optimizing workloads through state-space exploration for the accelerator architectures improves performance and energy consumption. However, such explorations are expensive and slow due to the large search space. Current research has to use fast analytical models that for…

    Submitted 4 September, 2024; originally announced September 2024.

  2. arXiv:2409.01710  [pdf, other]

    cs.MM

    Privacy-Preserving Multimedia Mobile Cloud Computing Using Protective Perturbation

    Authors: Zhongze Tang, Mengmei Ye, Yao Liu, Sheng Wei

    Abstract: Mobile cloud computing has been adopted in many multimedia applications, where the resource-constrained mobile device sends multimedia data (e.g., images) to remote cloud servers to request computation-intensive multimedia services (e.g., image recognition). While significantly improving the performance of the mobile applications, the cloud-based mechanism often causes privacy concerns as the mult…

    Submitted 3 September, 2024; originally announced September 2024.

  3. arXiv:2408.14732  [pdf, other]

    cs.CV cs.GR

    OctFusion: Octree-based Diffusion Models for 3D Shape Generation

    Authors: Bojun Xiong, Si-Tong Wei, Xin-Yang Zheng, Yan-Pei Cao, Zhouhui Lian, Peng-Shuai Wang

    Abstract: Diffusion models have emerged as a popular method for 3D generation. However, it is still challenging for diffusion models to efficiently generate diverse and high-quality 3D shapes. In this paper, we introduce OctFusion, which can generate 3D shapes with arbitrary resolutions in 2.5 seconds on a single Nvidia 4090 GPU, and the extracted meshes are guaranteed to be continuous and manifold. The key…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: Technical Report

  4. arXiv:2408.07719  [pdf, other]

    cs.LG cs.AI

    Operator Feature Neural Network for Symbolic Regression

    Authors: Yusong Deng, Min Wu, Lina Yu, Jingyi Liu, Shu Wei, Yanjie Li, Weijun Li

    Abstract: Symbolic regression is a task aimed at identifying patterns in data and representing them through mathematical expressions, generally involving skeleton prediction and constant optimization. Many methods have achieved some success, however they treat variables and symbols merely as characters of natural language without considering their mathematical essence. This paper introduces the operator fea…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 12 pages

  5. arXiv:2408.05842  [pdf, other]

    cs.AI cs.HC

    Evolving Virtual World with Delta-Engine

    Authors: Hongqiu Wu, Zekai Xu, Tianyang Xu, Shize Wei, Yan Wang, Jiale Hong, Weiqi Wu, Hai Zhao, Min Zhang, Zhezhi He

    Abstract: In this paper, we focus on the \emph{virtual world}, a cyberspace where people can live in. An ideal virtual world shares great similarity with our real world. One of the crucial aspects is its evolving nature, reflected by individuals' capability to grow and thereby influence the objective world. Such dynamics is unpredictable and beyond the reach of existing systems. For this, we propose a speci…

    Submitted 2 September, 2024; v1 submitted 11 August, 2024; originally announced August 2024.

  6. arXiv:2407.19845  [pdf, other]

    cs.LG cs.CR

    BackdoorBench: A Comprehensive Benchmark and Analysis of Backdoor Learning

    Authors: Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, Mingli Zhu, Ruotong Wang, Li Liu, Chao Shen

    Abstract: As an emerging approach to explore the vulnerability of deep neural networks (DNNs), backdoor learning has attracted increasing interest in recent years, and many seminal backdoor attack and defense algorithms are being developed successively or concurrently, in the status of a rapid arms race. However, mainly due to the diverse settings, and the difficulties of implementation and reproducibility…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Substantial extensions based on our previous conference version "Backdoorbench: A comprehensive benchmark of backdoor learning" published at NeurIPS D&B Track 2022. 20 backdoor attack algorithms, 32 backdoor defense algorithms, 11000+ pairs of attack-against-defense evaluations, 10 analyses, 18 analysis tools

  7. arXiv:2407.17584  [pdf]

    cs.CY

    Theorizing neuro-induced relationships between cognitive diversity, motivation, grit and academic performance in multidisciplinary engineering education context

    Authors: Duy Duong-Tran, Siqing Wei, Li Shen

    Abstract: Nowadays, engineers need to tackle many unprecedented challenges that are often complex, and, most importantly, cannot be exhaustively compartmentalized into a single engineering discipline. In other words, most engineering problems need to be solved from a multidisciplinary approach. However, conventional engineering programs usually adopt pedagogical approaches specifically tailored to tradition…

    Submitted 13 May, 2024; originally announced July 2024.

    Comments: 10 pages, 2 figures

  8. arXiv:2407.16364  [pdf, other]

    cs.CV

    Harmonizing Visual Text Comprehension and Generation

    Authors: Zhen Zhao, Jingqun Tang, Binghong Wu, Chunhui Lin, Shu Wei, Hao Liu, Xin Tan, Zhizhong Zhang, Can Huang, Yuan Xie

    Abstract: In this work, we present TextHarmony, a unified and versatile multimodal generative model proficient in comprehending and generating visual text. Simultaneously generating images and texts typically results in performance degradation due to the inherent inconsistency between vision and language modalities. To overcome this challenge, existing approaches resort to modality-specific data for supervi…

    Submitted 23 July, 2024; originally announced July 2024.

  9. arXiv:2407.13038  [pdf, other]

    cs.CV cs.LG

    Universal Facial Encoding of Codec Avatars from VR Headsets

    Authors: Shaojie Bai, Te-Li Wang, Chenghui Li, Akshay Venkatesh, Tomas Simon, Chen Cao, Gabriel Schwartz, Ryan Wrench, Jason Saragih, Yaser Sheikh, Shih-En Wei

    Abstract: Faithful real-time facial animation is essential for avatar-mediated telepresence in Virtual Reality (VR). To emulate authentic communication, avatar animation needs to be efficient and accurate: able to capture both extreme and subtle expressions within a few milliseconds to sustain the rhythm of natural conversations. The oblique and incomplete views of the face, variability in the donning of he…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: SIGGRAPH 2024 (ACM Transactions on Graphics (TOG))

    Journal ref: ACM Trans. Graph. 43, 4, Article 93 (July 2024), 22 pages.

  10. arXiv:2407.11615  [pdf, other]

    cs.LG cs.AI

    Graph Dimension Attention Networks for Enterprise Credit Assessment

    Authors: Shaopeng Wei, Beni Egressy, Xingyan Chen, Yu Zhao, Fuzhen Zhuang, Roger Wattenhofer, Gang Kou

    Abstract: Enterprise credit assessment is critical for evaluating financial risk, and Graph Neural Networks (GNNs), with their advanced capability to model inter-entity relationships, are a natural tool to get a deeper understanding of these financial networks. However, existing GNN-based methodologies predominantly emphasize entity-level attention mechanisms for contagion risk aggregation, often overlookin…

    Submitted 16 July, 2024; originally announced July 2024.

  11. arXiv:2407.04458  [pdf, other]

    cs.CV cs.AI

    Robust Multimodal Learning via Representation Decoupling

    Authors: Shicai Wei, Yang Luo, Yuji Wang, Chunbo Luo

    Abstract: Multimodal learning robust to missing modality has attracted increasing attention due to its practicality. Existing methods tend to address it by learning a common subspace representation for different modality combinations. However, we reveal that they are sub-optimal due to their implicit constraint on intra-class representation. Specifically, the sample with different modalities within the same…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: ECCV2024 17 pages

  12. arXiv:2406.14844  [pdf, other]

    cs.LG cs.AI

    DN-CL: Deep Symbolic Regression against Noise via Contrastive Learning

    Authors: Jingyi Liu, Yanjie Li, Lina Yu, Min Wu, Weijun Li, Wenqiang Li, Meilan Hao, Yusong Deng, Shu Wei

    Abstract: Noise ubiquitously exists in signals due to numerous factors including physical, electronic, and environmental effects. Traditional methods of symbolic regression, such as genetic programming or deep learning models, aim to find the most fitting expressions for these signals. However, these methods often overlook the noise present in real-world data, leading to reduced fitting accuracy. To tackle…

    Submitted 20 June, 2024; originally announced June 2024.

  13. arXiv:2406.07852  [pdf, other]

    cs.CV

    DiffPop: Plausibility-Guided Object Placement Diffusion for Image Composition

    Authors: Jiacheng Liu, Hang Zhou, Shida Wei, Rui Ma

    Abstract: In this paper, we address the problem of plausible object placement for the challenging task of realistic image composition. We propose DiffPop, the first framework that utilizes plausibility-guided denoising diffusion probabilistic model to learn the scale and spatial relations among multiple objects and the corresponding scene image. First, we train an unguided diffusion model to directly learn…

    Submitted 11 June, 2024; originally announced June 2024.

  14. arXiv:2406.06893  [pdf, other]

    stat.ML cs.IT cs.LG

    Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot

    Authors: Zixuan Wang, Stanley Wei, Daniel Hsu, Jason D. Lee

    Abstract: The transformer architecture has prevailed in various deep learning settings due to its exceptional capabilities to select and compose structural information. Motivated by these capabilities, Sanford et al. proposed the sparse token selection task, in which transformers excel while fully-connected networks (FCNs) fail in the worst case. Building upon that, we strengthen the FCN lower bound to an a…

    Submitted 10 June, 2024; originally announced June 2024.

  15. arXiv:2406.06375  [pdf, other]

    cs.SD cs.AI eess.AS

    MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing

    Authors: Yu-Fen Huang, Nikki Moran, Simon Coleman, Jon Kelly, Shun-Hwa Wei, Po-Yin Chen, Yun-Hsin Huang, Tsung-Ping Chen, Yu-Chia Kuo, Yu-Chi Wei, Chih-Hsuan Li, Da-Yu Huang, Hsuan-Kai Kao, Ting-Wei Lin, Li Su

    Abstract: In cross-modal music processing, translation between visual, auditory, and semantic content opens up new possibilities as well as challenges. The construction of such a transformative scheme depends upon a benchmark corpus with a comprehensive data infrastructure. In particular, the assembly of a large-scale cross-modal dataset presents major challenges. In this paper, we present the MOSA (Music m…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024. 14 pages, 7 figures. Dataset is available on: https://github.com/yufenhuang/MOSA-Music-mOtion-and-Semantic-Annotation-dataset/tree/main and https://zenodo.org/records/11393449

  16. arXiv:2406.05410  [pdf, other]

    cs.AI cs.CL

    MLLM-SR: Conversational Symbolic Regression base Multi-Modal Large Language Models

    Authors: Yanjie Li, Weijun Li, Lina Yu, Min Wu, Jingyi Liu, Wenqiang Li, Shu Wei, Yusong Deng

    Abstract: Formulas are the language of communication between humans and nature. It is an important research topic of artificial intelligence to find expressions from observed data to reflect the relationship between each variable in the data, which is called a symbolic regression problem. The existing symbolic regression methods directly generate expressions according to the given observation data, and we c…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 13 pages,

  17. arXiv:2406.03009  [pdf, other]

    cs.CL cs.AI

    Unveiling Selection Biases: Exploring Order and Token Sensitivity in Large Language Models

    Authors: Sheng-Lun Wei, Cheng-Kuang Wu, Hen-Hsen Huang, Hsin-Hsi Chen

    Abstract: In this paper, we investigate the phenomena of "selection biases" in Large Language Models (LLMs), focusing on problems where models are tasked with choosing the optimal option from an ordered sequence. We delve into biases related to option order and token usage, which significantly impact LLMs' decision-making processes. We also quantify the impact of these biases through an extensive empirical…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted as a long findings paper at ACL 2024

  18. arXiv:2406.02930  [pdf, other]

    cs.CV

    P2PFormer: A Primitive-to-polygon Method for Regular Building Contour Extraction from Remote Sensing Images

    Authors: Tao Zhang, Shiqing Wei, Yikang Zhou, Muying Luo, Wenling You, Shunping Ji

    Abstract: Extracting building contours from remote sensing imagery is a significant challenge due to buildings' complex and diverse shapes, occlusions, and noise. Existing methods often struggle with irregular contours, rounded corners, and redundancy points, necessitating extensive post-processing to produce regular polygonal building contours. To address these challenges, we introduce a novel, streamlined…

    Submitted 5 June, 2024; originally announced June 2024.

  19. arXiv:2406.01326  [pdf, other]

    cs.CV

    TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

    Authors: Weichao Zhao, Hao Feng, Qi Liu, Jingqun Tang, Shu Wei, Binghong Wu, Lei Liao, Yongjie Ye, Hao Liu, Houqiang Li, Can Huang

    Abstract: Tables contain factual and quantitative data accompanied by various structures and contents that pose challenges for machine comprehension. Previous methods generally design task-specific architectures and objectives for individual tasks, resulting in modal isolation and intricate workflows. In this paper, we present a novel large vision-language model, TabPedia, equipped with a concept synergy me…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 20 pages, 8 figures

  20. arXiv:2406.00939  [pdf, ps, other]

    cs.IT

    Bounds on f-Divergences between Distributions within Generalized Quasi-$\varepsilon$-Neighborhood

    Authors: Xinchun Yu, Shuangqing Wei, Shao-Lun Huang, Xiao-Ping Zhang

    Abstract: A general reverse Pinsker's inequality is derived to give an upper bound on f-divergences in terms of total variational distance when two distributions are close measured under our proposed generalized local information geometry framework. In addition, relationships between two f-divergences equipped with functions that are third order differentiable are established in terms of the lower and upper…

    Submitted 2 June, 2024; originally announced June 2024.

  21. arXiv:2405.20291  [pdf, other]

    cs.CR cs.CV cs.LG

    Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness

    Authors: Weilin Lin, Li Liu, Shaokui Wei, Jianze Li, Hui Xiong

    Abstract: The security threat of backdoor attacks is a central concern for deep neural networks (DNNs). Recently, without poisoned data, unlearning models with clean data and then learning a pruning mask have contributed to backdoor defense. Additionally, vanilla fine-tuning with those clean data can help recover the lost clean accuracy. However, the behavior of clean unlearning is still under-explored, and…

    Submitted 30 May, 2024; originally announced May 2024.

  22. arXiv:2405.17221  [pdf, other]

    cs.AI cs.AR

    Efficient Orchestrated AI Workflows Execution on Scale-out Spatial Architecture

    Authors: Jinyi Deng, Xinru Tang, Zhiheng Yue, Guangyang Lu, Qize Yang, Jiahao Zhang, Jinxi Li, Chao Li, Shaojun Wei, Yang Hu, Shouyi Yin

    Abstract: Given the increasing complexity of AI applications, traditional spatial architectures frequently fall short. Our analysis identifies a pattern of interconnected, multi-faceted tasks encompassing both AI and general computational processes. In response, we have conceptualized "Orchestrated AI Workflows," an approach that integrates various tasks with logic-driven decisions into dynamic, sophisticat…

    Submitted 21 May, 2024; originally announced May 2024.

  23. arXiv:2405.16112  [pdf, other]

    cs.CR cs.CV

    Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor

    Authors: Shaokui Wei, Hongyuan Zha, Baoyuan Wu

    Abstract: Data-poisoning backdoor attacks are serious security threats to machine learning models, where an adversary can manipulate the training dataset to inject backdoors into models. In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned. Unlike most existing methods that primarily detect and remove/unlearn suspicious samp…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 13 pages, 5 figures and 5 tables

  24. arXiv:2405.14620  [pdf, other]

    cs.LG

    Closed-form Symbolic Solutions: A New Perspective on Solving Partial Differential Equations

    Authors: Shu Wei, Yanjie Li, Lina Yu, Min Wu, Weijun Li, Meilan Hao, Wenqiang Li, Jingyi Liu, Yusong Deng

    Abstract: Solving partial differential equations (PDEs) in Euclidean space with closed-form symbolic solutions has long been a dream for mathematicians. Inspired by deep learning, Physics-Informed Neural Networks (PINNs) have shown great promise in numerically solving PDEs. However, since PINNs essentially approximate solutions within the continuous function space, their numerical solutions fall short in bo…

    Submitted 23 May, 2024; originally announced May 2024.

  25. arXiv:2405.11985  [pdf, other]

    cs.CV

    MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

    Authors: Jingqun Tang, Qi Liu, Yongjie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao Liu, Xiang Bai, Can Huang

    Abstract: Text-Centric Visual Question Answering (TEC-VQA) in its proper format not only facilitates human-machine interaction in text-centric visual environments but also serves as a de facto gold proxy to evaluate AI models in the domain of text-centric scene understanding. Nonetheless, most existing TEC-VQA benchmarks have focused on high-resource languages like English and Chinese. Despite pioneering wo…

    Submitted 11 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

  26. arXiv:2405.10890  [pdf, other]

    astro-ph.IM astro-ph.GA cs.AI

    A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model

    Authors: Mingxiang Fu, Yu Song, Jiameng Lv, Liang Cao, Peng Jia, Nan Li, Xiangru Li, Jifeng Liu, A-Li Luo, Bo Qiu, Shiyin Shen, Liangping Tu, Lili Wang, Shoulin Wei, Haifeng Yang, Zhenping Yi, Zhiqiang Zou

    Abstract: The exponential growth of astronomical datasets provides an unprecedented opportunity for humans to gain insight into the Universe. However, effectively analyzing this vast amount of data poses a significant challenge. Astronomers are turning to deep learning techniques to address this, but the methods are limited by their specific training sets, leading to considerable duplicate workloads too. He…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 26 pages, 10 figures, to be published on Chinese Physics C

  27. arXiv:2405.03136  [pdf, other]

    cs.CR

    FOBNN: Fast Oblivious Binarized Neural Network Inference

    Authors: Xin Chen, Zhili Chen, Benchang Dong, Shiwen Wei, Lin Chen, Daojing He

    Abstract: The superior performance of deep learning has propelled the rise of Deep Learning as a Service, enabling users to transmit their private data to service providers for model execution and inference retrieval. Nevertheless, the primary concern remains safeguarding the confidentiality of sensitive user data while optimizing the efficiency of secure protocols. To address this, we develop a fast oblivi…

    Submitted 5 May, 2024; originally announced May 2024.

  28. arXiv:2404.12803  [pdf, other]

    cs.CV cs.LG

    TextSquare: Scaling up Text-Centric Visual Instruction Tuning

    Authors: Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang

    Abstract: Text-centric visual question answering (VQA) has made great strides with the development of Multimodal Large Language Models (MLLMs), yet open-source models still fall short of leading models like GPT4V and Gemini, partly due to a lack of extensive, high-quality instruction tuning data. To this end, we introduce a new approach for creating a massive, high-quality instruction-tuning dataset, Square…

    Submitted 19 April, 2024; originally announced April 2024.

  29. arXiv:2404.07219  [pdf, other]

    cs.IR cs.LG

    Leave No One Behind: Online Self-Supervised Self-Distillation for Sequential Recommendation

    Authors: Shaowei Wei, Zhengwei Wu, Xin Li, Qintong Wu, Zhiqiang Zhang, Jun Zhou, Lihong Gu, Jinjie Gu

    Abstract: Sequential recommendation methods play a pivotal role in modern recommendation systems. A key challenge lies in accurately modeling user preferences in the face of data sparsity. To tackle this challenge, recent methods leverage contrastive learning (CL) to derive self-supervision signals by maximizing the mutual information of two augmented views of the original user behavior sequence. Despite th…

    Submitted 17 April, 2024; v1 submitted 22 March, 2024; originally announced April 2024.

  30. arXiv:2404.06330  [pdf, other]

    cs.LG cs.AI

    Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning

    Authors: Yanjie Li, Weijun Li, Lina Yu, Min Wu, Jingyi Liu, Wenqiang Li, Meilan Hao, Shu Wei, Yusong Deng

    Abstract: The mathematical formula is the human language to describe nature and is the essence of scientific research. Finding mathematical formulas from observational data is a major demand of scientific research and a major challenge of artificial intelligence. This area is called symbolic regression. Originally symbolic regression was often formulated as a combinatorial optimization problem and solved us…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 21 pages

  31. arXiv:2404.05047  [pdf, other]

    cs.LG cs.CR

    Initial Exploration of Zero-Shot Privacy Utility Tradeoffs in Tabular Data Using GPT-4

    Authors: Bishwas Mandal, George Amariucai, Shuangqing Wei

    Abstract: We investigate the application of large language models (LLMs), specifically GPT-4, to scenarios involving the tradeoff between privacy and utility in tabular data. Our approach entails prompting GPT-4 by transforming tabular data points into textual format, followed by the inclusion of precise sanitization instructions in a zero-shot manner. The primary objective is to sanitize the tabular data i…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 8 pages, Paper accepted at 2024 IEEE International Joint Conference on Neural Networks (IJCNN)

  32. arXiv:2404.05043  [pdf, other]

    cs.LG cs.CR

    Optimizing Privacy and Utility Tradeoffs for Group Interests Through Harmonization

    Authors: Bishwas Mandal, George Amariucai, Shuangqing Wei

    Abstract: We propose a novel problem formulation to address the privacy-utility tradeoff, specifically when dealing with two distinct user groups characterized by unique sets of private and utility attributes. Unlike previous studies that primarily focus on scenarios where all users share identical private and utility attributes and often rely on auxiliary datasets or manual annotations, we introduce a coll…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 10 pages, Paper accepted at 2024 IEEE International Joint Conference on Neural Networks (IJCNN)

  33. arXiv:2404.01064  [pdf, other]

    cs.CV

    Roadside Monocular 3D Detection via 2D Detection Prompting

    Authors: Yechi Ma, Shuoquan Wei, Churun Zhang, Wei Hua, Yanan Li, Shu Kong

    Abstract: The problem of roadside monocular 3D detection requires detecting objects of interested classes in a 2D RGB frame and predicting their 3D information such as locations in bird's-eye-view (BEV). It has broad applications in traffic control, vehicle-vehicle communication, and vehicle-infrastructure cooperative perception. To approach this problem, we present a novel and simple method by prompting th…

    Submitted 4 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  34. arXiv:2403.18201  [pdf, other]

    cs.CV

    Few-shot Online Anomaly Detection and Segmentation

    Authors: Shenxing Wei, Xing Wei, Zhiheng Ma, Songlin Dong, Shaochen Zhang, Yihong Gong

    Abstract: Detecting anomaly patterns from images is a crucial artificial intelligence technique in industrial applications. Recent research in this domain has emphasized the necessity of a large volume of training data, overlooking the practical scenario where, post-deployment of the model, unlabeled data containing both normal and abnormal samples can be utilized to enhance the model's performance. Consequ…

    Submitted 26 March, 2024; originally announced March 2024.

  35. arXiv:2403.15273  [pdf, other]

    cs.CL

    Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs

    Authors: Xiaobin Zhang, Liangjun Zang, Qianwen Liu, Shuchong Wei, Songlin Hu

    Abstract: Event temporal relation (TempRel) is a primary subject of the event relation extraction task. However, the inherent ambiguity of TempRel increases the difficulty of the task. With the rise of prompt engineering, it is important to design effective prompt templates and verbalizers to extract relevant knowledge. The traditional manually designed templates struggle to extract precise temporal knowled…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 8 pages,6 figures.Accepted to the International Joint Conference on Neural Networks (IJCNN2024)

  36. arXiv:2403.07240  [pdf, other]

    cs.CV

    Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning

    Authors: Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, Yunchao Wei

    Abstract: This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images despite limited training data. Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries. However, the rapid advancements in synthesis technology have led to specific artifacts…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 9 pages, 4 figures, AAAI24

  37. arXiv:2403.05111  [pdf, other]

    eess.IV cs.CV

    From Registration Uncertainty to Segmentation Uncertainty

    Authors: Junyu Chen, Yihao Liu, Shuwen Wei, Zhangxing Bian, Aaron Carass, Yong Du

    Abstract: Understanding the uncertainty inherent in deep learning-based image registration models has been an ongoing area of research. Existing methods have been developed to quantify both transformation and appearance uncertainties related to the registration process, elucidating areas where the model may exhibit ambiguity regarding the generated deformation. However, our study reveals that neither uncert…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE ISBI'24 ((c) IEEE). Code available at https://bit.ly/42VOZER

  38. arXiv:2403.03215  [pdf, other]

    cs.RO

    A Safety-Critical Framework for UGVs in Complex Environments: A Data-Driven Discrepancy-Aware Approach

    Authors: Skylar X. Wei, Lu Gan, Joel W. Burdick

    Abstract: This work presents a novel data-driven multi-layered planning and control framework for the safe navigation of a class of unmanned ground vehicles (UGVs) in the presence of unknown stationary obstacles and additive modeling uncertainties. The foundation of this framework is a novel robust model predictive planner, designed to generate optimal collision-free trajectories given an occupancy grid map…

    Submitted 5 March, 2024; originally announced March 2024.

  39. arXiv:2402.18603  [pdf, other]

    cs.LG cs.AI cs.CL

    MMSR: Symbolic Regression is a Multimodal Task

    Authors: Yanjie Li, Jingyi Liu, Weijun Li, Lina Yu, Min Wu, Wenqiang Li, Meilan Hao, Su Wei, Yusong Deng

    Abstract: Mathematical formulas are the crystallization of human wisdom in exploring the laws of nature for thousands of years. Describing the complex laws of nature with a concise mathematical formula is a constant pursuit of scientists and a great challenge for artificial intelligence. This field is called symbolic regression. Symbolic regression was originally formulated as a combinatorial optimization p…

    Submitted 14 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 12 page

  40. Learning Invariant Inter-pixel Correlations for Superpixel Generation

    Authors: Sen Xu, Shikui Wei, Tao Ruan, Lixin Liao

    Abstract: Deep superpixel algorithms have made remarkable strides by substituting hand-crafted features with learnable ones. Nevertheless, we observe that existing deep superpixel methods, serving as mid-level representation operations, remain sensitive to the statistical properties (e.g., color distribution, high-level semantics) embedded within the training dataset. Consequently, learnable features exhibi…

    Submitted 9 April, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI24

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 6351-6359 (2024)

  41. arXiv:2402.17403  [pdf, other]

    cs.CV

    Sora Generates Videos with Stunning Geometrical Consistency

    Authors: Xuanyi Li, Daquan Zhou, Chenxu Zhang, Shaodong Wei, Qibin Hou, Ming-Ming Cheng

    Abstract: The recently developed Sora model [1] has exhibited remarkable capabilities in video generation, sparking intense discussions regarding its ability to simulate real-world phenomena. Despite its growing popularity, there is a lack of established metrics to evaluate its fidelity to real-world physics quantitatively. In this paper, we introduce a new benchmark that assesses the quality of the generat…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 5 pages, 3 figures

  42. arXiv:2402.02364  [pdf, other]

    cs.LG cs.AI cs.CL

    The Developmental Landscape of In-Context Learning

    Authors: Jesse Hoogland, George Wang, Matthew Farrugia-Roberts, Liam Carroll, Susan Wei, Daniel Murfet

    Abstract: We show that in-context learning emerges in transformers in discrete developmental stages, when they are trained on either language modeling or linear regression tasks. We introduce two methods for detecting the milestones that separate these stages, by probing the geometry of the population loss in both parameter space and function space. We study the stages revealed by these new methods using a…

    Submitted 4 February, 2024; originally announced February 2024.

  43. arXiv:2402.00667  [pdf, other]

    cs.CL

    Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning

    Authors: Jitao Sang, Yuhang Wang, Jing Zhang, Yanxu Zhu, Chao Kong, Junhong Ye, Shuyu Wei, Jinlin Xiao

    Abstract: This paper presents a follow-up study to OpenAI's recent superalignment work on Weak-to-Strong Generalization (W2SG). Superalignment focuses on ensuring that high-level AI systems remain consistent with human values and intentions when dealing with complex, high-risk tasks. The W2SG framework has opened new possibilities for empirical research in this evolving field. Our study simulates two phases…

    Submitted 1 February, 2024; originally announced February 2024.

  44. arXiv:2401.17571  [pdf, other]

    eess.IV cs.CV

    Is Registering Raw Tagged-MR Enough for Strain Estimation in the Era of Deep Learning?

    Authors: Zhangxing Bian, Ahmed Alshareef, Shuwen Wei, Junyu Chen, Yuli Wang, Jonghye Woo, Dzung L. Pham, Jiachen Zhuo, Aaron Carass, Jerry L. Prince

    Abstract: Magnetic Resonance Imaging with tagging (tMRI) has long been utilized for quantifying tissue motion and strain during deformation. However, a phenomenon known as tag fading, a gradual decrease in tag visibility over time, often complicates post-processing. The first contribution of this study is to model tag fading by considering the interplay between $T_1$ relaxation and the repeated application…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: Accepted to SPIE Medical Imaging 2024 (oral)

  45. arXiv:2401.15002   

    cs.CV

    BackdoorBench: A Comprehensive Benchmark and Analysis of Backdoor Learning

    Authors: Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, Mingli Zhu, Ruotong Wang, Li Liu, Chao Shen

    Abstract: As an emerging and vital topic for studying deep neural networks' vulnerability (DNNs), backdoor learning has attracted increasing interest in recent years, and many seminal backdoor attack and defense algorithms are being developed successively or concurrently, in the status of a rapid arms race. However, mainly due to the diverse settings, and the difficulties of implementation and reproducibili…

    Submitted 11 August, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: We have uploaded a new version, which can be accessed at arXiv:2407.19845

  46. arXiv:2401.14424  [pdf, other]

    cs.LG cs.AI

    Discovering Mathematical Formulas from Data via GPT-guided Monte Carlo Tree Search

    Authors: Yanjie Li, Weijun Li, Lina Yu, Min Wu, Jingyi Liu, Wenqiang Li, Meilan Hao, Shu Wei, Yusong Deng

    Abstract: Finding a concise and interpretable mathematical formula that accurately describes the relationship between each variable and the predicted value in the data is a crucial task in scientific research, as well as a significant challenge in artificial intelligence. This problem is referred to as symbolic regression, which is an NP-hard problem. In the previous year, a novel symbolic regression method…

    Submitted 30 January, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: 24 pages

  47. arXiv:2401.13578  [pdf, other]

    cs.CR

    WPDA: Frequency-based Backdoor Attack with Wavelet Packet Decomposition

    Authors: Zhengyao Song, Yongqiang Li, Danni Yuan, Li Liu, Shaokui Wei, Baoyuan Wu

    Abstract: This work explores an emerging security threat against deep neural networks (DNNs) based image classification, i.e., backdoor attack. In this scenario, the attacker aims to inject a backdoor into the model by manipulating training data, such that the backdoor could be activated by a particular trigger and bootstraps the model to make a target prediction at inference. Currently, most existing data…

    Submitted 24 May, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: 15 pages, 15 figures

    ACM Class: I.4.9

  48. arXiv:2401.11002  [pdf, other]

    cs.CV cs.AI

    Fast Registration of Photorealistic Avatars for VR Facial Animation

    Authors: Chaitanya Patel, Shaojie Bai, Te-Li Wang, Jason Saragih, Shih-En Wei

    Abstract: Virtual Reality (VR) bears promise of social interactions that can feel more immersive than other media. Key to this is the ability to accurately animate a personalized photorealistic avatar, and hence the acquisition of the labels for headset-mounted camera (HMC) images need to be efficient and accurate, while wearing a VR headset. This is challenging due to oblique camera views and differences i…

    Submitted 18 July, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: ECCV 2024. Project page: https://chaitanya100100.github.io/FastRegistration/

  49. arXiv:2401.10632  [pdf, other]

    cs.LG

    Interventional Fairness on Partially Known Causal Graphs: A Constrained Optimization Approach

    Authors: Aoqi Zuo, Yiqing Li, Susan Wei, Mingming Gong

    Abstract: Fair machine learning aims to prevent discrimination against individuals or sub-populations based on sensitive attributes such as gender and race. In recent years, causal inference methods have been increasingly used in fair machine learning to measure unfairness by causal effects. However, current methods assume that the true causal graph is given, which is often not true in real-world applicatio…

    Submitted 8 March, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted to ICLR24

  50. arXiv:2401.06772  [pdf]

    cs.CL

    Semantic Parsing for Question Answering over Knowledge Graphs

    Authors: Sijia Wei, Wenwen Zhang, Qisong Li, Jiang Zhao

    Abstract: In this paper, we introduce a novel method with graph-to-segment mapping for question answering over knowledge graphs, which helps understanding question utterances. This method centers on semantic parsing, a key approach for interpreting these utterances. The challenges lie in comprehending implicit entities, relationships, and complex constraints like time, ordinality, and aggregation within que…

    Submitted 27 January, 2024; v1 submitted 1 December, 2023; originally announced January 2024.

    Comments: arXiv admin note: text overlap with arXiv:2401.02968