Showing 1–50 of 2,443 results for author: Chen, C

Searching in archive cs.

  1. arXiv:2409.03236  [pdf, other]

    cs.CV

    Unveiling Context-Related Anomalies: Knowledge Graph Empowered Decoupling of Scene and Action for Human-Related Video Anomaly Detection

    Authors: Chenglizhao Chen, Xinyu Liu, Mengke Song, Luming Li, Xu Yu, Shanchen Pang

    Abstract: Detecting anomalies in human-related videos is crucial for surveillance applications. Current methods primarily include appearance-based and action-based techniques. Appearance-based methods rely on low-level visual features such as color, texture, and shape. They learn a large number of pixel patterns and features related to known scenes during training, making them effective in detecting anomali…

    Submitted 5 September, 2024; originally announced September 2024.

    Comments: 13 pages, 9 figures

  2. arXiv:2409.03164  [pdf, other]

    cs.LG cs.GR

    A Scalable Matrix Visualization for Understanding Tree Ensemble Classifiers

    Authors: Zhen Li, Weikai Yang, Jun Yuan, Jing Wu, Changjian Chen, Yao Ming, Fan Yang, Hui Zhang, Shixia Liu

    Abstract: The high performance of tree ensemble classifiers benefits from a large set of rules, which, in turn, makes the models hard to understand. To improve interpretability, existing methods extract a subset of rules for approximation using model reduction techniques. However, by focusing on the reduced rule set, these methods often lose fidelity and ignore anomalous rules that, despite their infrequenc…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: 15 pages, 10 figures

  3. arXiv:2409.03106  [pdf, other]

    cs.CV

    Spatial Diffusion for Cell Layout Generation

    Authors: Chen Li, Xiaoling Hu, Shahira Abousamra, Meilong Xu, Chao Chen

    Abstract: Generative models, such as GANs and diffusion models, have been used to augment training sets and boost performance in different tasks. We focus on generative models for cell detection instead, i.e., locating and classifying cells in given pathology images. One important piece of information that has been largely overlooked is the spatial pattern of the cells. In this paper, we propose a spatial-pattern-…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: 12 pages, 4 figures, accepted by MICCAI 2024

  4. arXiv:2409.01556  [pdf, other]

    cs.CL cs.AI

    Benchmarking Cognitive Domains for LLMs: Insights from Taiwanese Hakka Culture

    Authors: Chen-Chi Chang, Ching-Yuan Chen, Hung-Shin Lee, Chih-Cheng Lee

    Abstract: This study introduces a comprehensive benchmark designed to evaluate the performance of large language models (LLMs) in understanding and processing cultural knowledge, with a specific focus on Hakka culture as a case study. Leveraging Bloom's Taxonomy, the study develops a multi-dimensional framework that systematically assesses LLMs across six cognitive domains: Remembering, Understanding, Apply…

    Submitted 2 September, 2024; originally announced September 2024.

    Comments: Submitted to O-COCOSDA 2024

  5. arXiv:2409.00712  [pdf, other]

    cs.CR

    Unveiling the Bandwidth Nightmare: CDN Compression Format Conversion Attacks

    Authors: Ziyu Lin, Zhiwei Lin, Ximeng Liu, Zuobing Ying, Cheng Chen

    Abstract: Content Delivery Networks (CDNs) are designed to enhance network performance and protect against web attack traffic for their hosting websites. And the HTTP compression request mechanism primarily aims to reduce unnecessary network transfers. However, we find that the specification failed to consider the security risks introduced when CDNs meet compression requests. In this paper, we present a nov…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: 10 pages

  6. arXiv:2409.00116  [pdf, other]

    cs.CL cs.LG

    FedMCP: Parameter-Efficient Federated Learning with Model-Contrastive Personalization

    Authors: Qianyi Zhao, Chen Qu, Cen Chen, Mingyuan Fan, Yanhao Wang

    Abstract: With increasing concerns and regulations on data privacy, fine-tuning pretrained language models (PLMs) in federated learning (FL) has become a common paradigm for NLP tasks. Despite being extensively studied, the existing methods for this problem still face two primary challenges. First, the huge number of parameters in large-scale PLMs leads to excessive communication and computational overhead.…

    Submitted 28 August, 2024; originally announced September 2024.

  7. arXiv:2408.17180  [pdf, other]

    cs.AI cs.GT cs.IR cs.LG cs.MA

    Identifying and Clustering Counter Relationships of Team Compositions in PvP Games for Efficient Balance Analysis

    Authors: Chiu-Chou Lin, Yu-Wei Shih, Kuei-Ting Kuo, Yu-Cheng Chen, Chien-Hua Chen, Wei-Chen Chiu, I-Chen Wu

    Abstract: How can balance be quantified in game settings? This question is crucial for game designers, especially in player-versus-player (PvP) games, where analyzing the strength relations among predefined team compositions-such as hero combinations in multiplayer online battle arena (MOBA) games or decks in card games-is essential for enhancing gameplay and achieving balance. We have developed two advance…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: TMLR 09/2024 https://openreview.net/forum?id=2D36otXvBE

  8. arXiv:2408.16706  [pdf, other]

    cs.PL cs.SE

    Incremental Context-free Grammar Inference in Black Box Settings

    Authors: Feifei Li, Xiao Chen, Xi Xiao, Xiaoyu Sun, Chuan Chen, Shaohua Wang, Jitao Han

    Abstract: Black-box context-free grammar inference presents a significant challenge in many practical settings due to limited access to example programs. The state-of-the-art methods, Arvada and Treevada, employ heuristic approaches to generalize grammar rules, initiating from flat parse trees and exploring diverse generalization sequences. We have observed that these approaches suffer from low quality and…

    Submitted 29 August, 2024; originally announced August 2024.

  9. arXiv:2408.16673  [pdf, other]

    cs.LG cs.AI

    Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity

    Authors: Ziniu Li, Congliang Chen, Tian Xu, Zeyu Qin, Jiancong Xiao, Ruoyu Sun, Zhi-Quan Luo

    Abstract: Large language models rely on Supervised Fine-Tuning (SFT) to specialize in downstream tasks. Cross Entropy (CE) loss is the de facto choice in SFT, but it often leads to overfitting and limited output diversity due to its aggressive updates to the data distribution. This paper aims to address these issues by introducing the maximum entropy principle, which favors models with flatter distributions…

    Submitted 29 August, 2024; originally announced August 2024.

  10. arXiv:2408.16310  [pdf, other]

    cs.CV

    Bootstrap Segmentation Foundation Model under Distribution Shift via Object-Centric Learning

    Authors: Luyao Tang, Yuxuan Yuan, Chaoqi Chen, Kunze Huang, Xinghao Ding, Yue Huang

    Abstract: Foundation models have made incredible strides in achieving zero-shot or few-shot generalization, leveraging prompt engineering to mimic the problem-solving approach of human intelligence. However, when it comes to some foundation models like Segment Anything, there is still a challenge in performing well on out-of-distribution data, including camouflaged and medical images. Inconsistent prompting…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: This work is accepted by ECCV 2024 EVAL-FoMo Workshop

  11. arXiv:2408.15451  [pdf, other]

    cs.LG cs.CR stat.ME

    Certified Causal Defense with Generalizable Robustness

    Authors: Yiran Qiao, Yu Yin, Chen Chen, Jing Ma

    Abstract: While machine learning models have proven effective across various scenarios, it is widely acknowledged that many models are vulnerable to adversarial attacks. Recently, there have emerged numerous efforts in adversarial defense. Among them, certified defense is well known for its theoretical guarantees against arbitrary adversarial perturbations on input within a certain range (e.g., $l_2$ ball).…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: Submitted to AAAI

  12. arXiv:2408.14968  [pdf, other]

    cs.IR cs.CL

    MRSE: An Efficient Multi-modality Retrieval System for Large Scale E-commerce

    Authors: Hao Jiang, Haoxiang Zhang, Qingshan Hou, Chaofeng Chen, Weisi Lin, Jingchang Zhang, Annan Wang

    Abstract: Providing high-quality item recall for text queries is crucial in large-scale e-commerce search systems. Current Embedding-based Retrieval Systems (ERS) embed queries and items into a shared low-dimensional space, but uni-modality ERS rely too heavily on textual features, making them unreliable in complex contexts. While multi-modality ERS incorporate various data sources, they often overlook indi…

    Submitted 27 August, 2024; originally announced August 2024.

  13. arXiv:2408.14594  [pdf, other]

    cs.CV

    MMR: Evaluating Reading Ability of Large Multimodal Models

    Authors: Jian Chen, Ruiyi Zhang, Yufan Zhou, Ryan Rossi, Jiuxiang Gu, Changyou Chen

    Abstract: Large multimodal models (LMMs) have demonstrated impressive capabilities in understanding various types of image, including text-rich images. Most existing text-rich image benchmarks are simple extraction-based question answering, and many LMMs now easily achieve high scores. This means that current benchmarks fail to accurately reflect performance of different models, and a natural idea is to bui…

    Submitted 26 August, 2024; originally announced August 2024.

  14. arXiv:2408.14520  [pdf, other]

    cs.LG cs.AI cs.SI

    Towards Graph Prompt Learning: A Survey and Beyond

    Authors: Qingqing Long, Yuchen Yan, Peiyan Zhang, Chen Fang, Wentao Cui, Zhiyuan Ning, Meng Xiao, Ning Cao, Xiao Luo, Lingjun Xu, Shiyue Jiang, Zheng Fang, Chong Chen, Xian-Sheng Hua, Yuanchun Zhou

    Abstract: Large-scale "pre-train and prompt learning" paradigms have demonstrated remarkable adaptability, enabling broad applications across diverse domains such as question answering, image recognition, and multimodal retrieval. This approach fully leverages the potential of large-scale pre-trained models, reducing downstream data requirements and computational costs while enhancing model applicability ac…

    Submitted 29 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

    Comments: 19 pages, 2 figures

  15. arXiv:2408.14393  [pdf, other]

    cs.IR cs.LG

    CURE4Rec: A Benchmark for Recommendation Unlearning with Deeper Influence

    Authors: Chaochao Chen, Jiaming Zhang, Yizhao Zhang, Li Zhang, Lingjuan Lyu, Yuyuan Li, Biao Gong, Chenggang Yan

    Abstract: With increasing privacy concerns in artificial intelligence, regulations have mandated the right to be forgotten, granting individuals the right to withdraw their data from models. Machine unlearning has emerged as a potential solution to enable selective forgetting in models, particularly in recommender systems where historical data contains sensitive user information. Despite recent advances in…

    Submitted 26 August, 2024; originally announced August 2024.

  16. arXiv:2408.14279  [pdf, other]

    cs.CV

    Learning Local Pattern Modularization for Point Cloud Reconstruction from Unseen Classes

    Authors: Chao Chen, Yu-Shen Liu, Zhizhong Han

    Abstract: It is challenging to reconstruct 3D point clouds in unseen classes from single 2D images. Instead of an object-centered coordinate system, current methods generalize global priors learned from seen classes to reconstruct 3D shapes of unseen classes in a viewer-centered coordinate system. However, reconstruction accuracy and interpretability still need improvement. To resolve this issue, w…

    Submitted 4 September, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

    Comments: 14 pages, 11 figures, accepted by ECCV 2024

  17. arXiv:2408.14119  [pdf, other]

    cs.CL cs.AI

    Contrastive Learning Subspace for Text Clustering

    Authors: Qian Yong, Chen Chen, Xiabing Zhou

    Abstract: Contrastive learning has been frequently investigated to learn effective representations for text clustering tasks. While existing contrastive learning-based text clustering methods only focus on modeling instance-wise semantic similarity relationships, they ignore contextual information and underlying relationships among all instances that need to be clustered. In this paper, we propose a novel…

    Submitted 26 August, 2024; originally announced August 2024.

  18. arXiv:2408.13735  [pdf, other]

    cs.CV

    MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image Segmentation

    Authors: Chaowei Chen, Li Yu, Shiquan Min, Shunfang Wang

    Abstract: State Space Models (SSMs), especially Mamba, have shown great promise in medical image segmentation due to their ability to model long-range dependencies with linear computational complexity. However, accurate medical image segmentation requires the effective learning of both multi-scale detailed feature representations and global contextual dependencies. Although existing works have attempted to…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: 8 pages, 5 figures

  19. arXiv:2408.13454  [pdf, other]

    cs.CV

    AdaOcc: Adaptive-Resolution Occupancy Prediction

    Authors: Chao Chen, Ruoyu Wang, Yuliang Guo, Cheng Zhao, Xinyu Huang, Chen Feng, Liu Ren

    Abstract: Autonomous driving in complex urban scenarios requires 3D perception to be both comprehensive and precise. Traditional 3D perception methods focus on object detection, resulting in sparse representations that lack environmental detail. Recent approaches estimate 3D occupancy around vehicles for a more comprehensive scene representation. However, dense 3D occupancy prediction increases computationa…

    Submitted 23 August, 2024; originally announced August 2024.

  20. arXiv:2408.12935  [pdf, other]

    cs.AI

    Trustworthy, Responsible, and Safe AI: A Comprehensive Architectural Framework for AI Safety with Challenges and Mitigations

    Authors: Chen Chen, Ziyao Liu, Weifeng Jiang, Goh Si Qi, KwoK-Yan Lam

    Abstract: AI Safety is an emerging area of critical importance to the safe adoption and deployment of AI systems. With the rapid proliferation of AI and especially with the recent advancement of Generative AI (or GAI), the technology ecosystem behind the design, development, adoption, and deployment of AI systems has drastically changed, broadening the scope of AI Safety to address impacts on public safety…

    Submitted 23 August, 2024; originally announced August 2024.

  21. arXiv:2408.12085  [pdf, other]

    eess.SY cs.SI math.DS

    Controllability and Observability of Temporal Hypergraphs

    Authors: Anqi Dong, Xin Mao, Can Chen

    Abstract: Numerous complex systems, such as those arising in ecological networks, genomic contact networks, and social networks, exhibit higher-order and time-varying characteristics, which can be effectively modeled using temporal hypergraphs. However, analyzing and controlling temporal hypergraphs poses significant challenges due to their inherent time-varying and nonlinear nature, while most existing meth…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: 6 pages, 3 figures

    MSC Class: 93Bxx; 92Dxx; 05C65

  22. arXiv:2408.11950  [pdf]

    cs.CR cs.PF

    Evaluation of Hash Algorithm Performance for Cryptocurrency Exchanges Based on Blockchain System

    Authors: Abel C. H. Chen

    Abstract: The blockchain system has emerged as one of the focal points of research in recent years, particularly in applications and services such as cryptocurrencies and smart contracts. In this context, the hash value serves as a crucial element in linking blocks within the blockchain, ensuring the integrity of block contents. Therefore, hash algorithms represent a vital security technology for ensuring t…

    Submitted 8 August, 2024; originally announced August 2024.

  23. Power-Domain Interference Graph Estimation for Multi-hop BLE Networks

    Authors: Haifeng Jia, Yichen Wei, Yibo Pi, Cailian Chen

    Abstract: Traditional wisdom for network management allocates network resources separately for the measurement and communication tasks. Heavy measurement tasks may compete for limited resources with communication tasks and significantly degrade overall network performance. It is therefore challenging for the interference graph, deemed as incurring heavy measurement overhead, to be used in practice in wireless n…

    Submitted 22 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

    Comments: This paper is accepted for publication in the ACM Transactions on Sensor Networks (TOSN), and is an extension of our conference paper accepted at EWSN'23 (arXiv:2312.16807)

  24. arXiv:2408.11173  [pdf, other]

    cs.PF cs.OS

    Delegation with Trust<T>: A Scalable, Type- and Memory-Safe Alternative to Locks

    Authors: Noaman Ahmad, Ben Baenen, Chen Chen, Jakob Eriksson

    Abstract: We present Trust<T>, a general, type- and memory-safe alternative to locking in concurrent programs. Instead of synchronizing multi-threaded access to an object of type T with a lock, the programmer may place the object in a Trust<T>. The object is then no longer directly accessible. Instead a designated thread, the object's trustee, is responsible for applying any requested operations to the obje…

    Submitted 20 August, 2024; originally announced August 2024.

  25. arXiv:2408.10901  [pdf, other]

    cs.CV cs.AI cs.LG

    A Grey-box Attack against Latent Diffusion Model-based Image Editing by Posterior Collapse

    Authors: Zhongliang Guo, Lei Fang, Jingyu Lin, Yifei Qian, Shuai Zhao, Zeyu Wang, Junhao Dong, Cunjian Chen, Ognjen Arandjelović, Chun Pong Lau

    Abstract: Recent advancements in generative AI, particularly Latent Diffusion Models (LDMs), have revolutionized image synthesis and manipulation. However, these generative techniques raise concerns about data misappropriation and intellectual property infringement. Adversarial attacks on machine learning models have been extensively studied, and a well-established body of research has extended these techn…

    Submitted 2 September, 2024; v1 submitted 20 August, 2024; originally announced August 2024.

    Comments: 21 pages, 7 figures, 10 tables

  26. arXiv:2408.10673  [pdf, other]

    cs.CR

    Iterative Window Mean Filter: Thwarting Diffusion-based Adversarial Purification

    Authors: Hanrui Wang, Ruoxi Sun, Cunjian Chen, Minhui Xue, Lay-Ki Soon, Shuo Wang, Zhe Jin

    Abstract: Face authentication systems have brought significant convenience and advanced developments, yet they have become unreliable due to their sensitivity to inconspicuous perturbations, such as adversarial attacks. Existing defenses often exhibit weaknesses when facing various attack algorithms and adaptive attacks or compromise accuracy for enhanced security. To address these challenges, we have devel…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: Under review

  27. arXiv:2408.10556  [pdf, other]

    cs.AI cs.LG

    Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks

    Authors: Yun Qu, Boyuan Wang, Jianzhun Shao, Yuhang Jiang, Chen Chen, Zhenbin Ye, Lin Liu, Junfeng Yang, Lin Lai, Hongyang Qin, Minwen Deng, Juchao Zhuo, Deheng Ye, Qiang Fu, Wei Yang, Guang Yang, Lanxiao Huang, Xiangyang Ji

    Abstract: The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications. However, existing datasets often fall short in their simplicity and lack of realism. To address this gap, we propose Hokoff, a comprehens…

    Submitted 20 August, 2024; originally announced August 2024.

  28. arXiv:2408.10349  [pdf, other]

    cs.LG cs.CV

    AIR: Analytic Imbalance Rectifier for Continual Learning

    Authors: Di Fang, Yinan Zhu, Runze Fang, Cen Chen, Ziqian Zeng, Huiping Zhuang

    Abstract: Continual learning enables AI models to learn new data sequentially without retraining in real-world scenarios. Most existing methods assume the training data are balanced, aiming to reduce the catastrophic forgetting problem that models tend to forget previously generated data. However, data imbalance and the mixture of new and old data in real-world scenarios lead the model to ignore categories…

    Submitted 19 August, 2024; originally announced August 2024.

    ACM Class: I.2.6

  29. arXiv:2408.09865  [pdf, other]

    cs.LG cs.CL cs.IR

    MAPLE: Enhancing Review Generation with Multi-Aspect Prompt LEarning in Explainable Recommendation

    Authors: Ching-Wen Yang, Che Wei Chen, Kun-da Wu, Hao Xu, Jui-Feng Yao, Hung-Yu Kao

    Abstract: Explainable Recommendation task is designed to receive a pair of user and item and output explanations to justify why an item is recommended to a user. Many models treat review-generation as a proxy of explainable recommendation. Although they are able to generate fluent and grammatical sentences, they suffer from generality and hallucination issues. We propose a personalized, aspect-controlled mo…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 8 main pages, 10 pages for appendix. Under review

  30. arXiv:2408.09393  [pdf, other]

    cs.LG cs.AI cs.DC

    Federated Graph Learning with Structure Proxy Alignment

    Authors: Xingbo Fu, Zihan Chen, Binchi Zhang, Chen Chen, Jundong Li

    Abstract: Federated Graph Learning (FGL) aims to learn graph learning models over graph data distributed in multiple data owners, which has been applied in various applications such as social recommendation and financial fraud detection. Inherited from generic Federated Learning (FL), FGL similarly has the data heterogeneity issue where the label distribution may vary significantly for distributed graph dat…

    Submitted 18 August, 2024; originally announced August 2024.

    Comments: Accepted by KDD 2024

  31. arXiv:2408.09362  [pdf, other]

    cs.CV

    Angle of Arrival Estimation with Transformer: A Sparse and Gridless Method with Zero-Shot Capability

    Authors: Zhaoxuan Zhu, Chulong Chen, Bo Yang

    Abstract: Automotive Multiple-Input Multiple-Output (MIMO) radars have gained significant traction in Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles (AV) due to their cost-effectiveness, resilience to challenging operating conditions, and extended detection range. To fully leverage the advantages of MIMO radars, it is crucial to develop an Angle of Arrival (AOA) algorithm that delivers hi…

    Submitted 18 August, 2024; originally announced August 2024.

    Comments: 8 pages, 8 figures

  32. arXiv:2408.08946  [pdf, other]

    cs.CY

    Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges

    Authors: Baixiang Huang, Canyu Chen, Kai Shu

    Abstract: Accurate attribution of authorship is crucial for maintaining the integrity of digital content, improving forensic investigations, and mitigating the risks of misinformation and plagiarism. Addressing the imperative need for proper authorship attribution is essential to uphold the credibility and accountability of authentic authorship. The rapid advancements of Large Language Models (LLMs) have bl…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 12 pages for the main paper. More resources and a curated list of papers are available and regularly updated at https://llm-authorship.github.io

  33. arXiv:2408.08872  [pdf, other]

    cs.CV cs.AI cs.CL

    xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

    Authors: Le Xue, Manli Shu, Anas Awadalla, Jun Wang, An Yan, Senthil Purushwalkam, Honglu Zhou, Viraj Prabhu, Yutong Dai, Michael S Ryoo, Shrikant Kendre, Jieyu Zhang, Can Qin, Shu Zhang, Chia-Chih Chen, Ning Yu, Juntao Tan, Tulika Manoj Awalgaonkar, Shelby Heinecke, Huan Wang, Yejin Choi, Ludwig Schmidt, Zeyuan Chen, Silvio Savarese, Juan Carlos Niebles , et al. (2 additional authors not shown)

    Abstract: This report introduces xGen-MM (also known as BLIP-3), a framework for developing Large Multimodal Models (LMMs). The framework comprises meticulously curated datasets, a training recipe, model architectures, and a resulting suite of LMMs. xGen-MM, short for xGen-MultiModal, expands the Salesforce xGen initiative on foundation AI models. Our models undergo rigorous evaluation across a range of tas…

    Submitted 28 August, 2024; v1 submitted 16 August, 2024; originally announced August 2024.

  34. arXiv:2408.08342  [pdf, other]

    cs.GR cs.CV

    CT4D: Consistent Text-to-4D Generation with Animatable Meshes

    Authors: Ce Chen, Shaoli Huang, Xuelin Chen, Guangyi Chen, Xiaoguang Han, Kun Zhang, Mingming Gong

    Abstract: Text-to-4D generation has recently been demonstrated viable by integrating a 2D image diffusion model with a video diffusion model. However, existing models tend to produce results with inconsistent motions and geometric structures over time. To this end, we present a novel framework, coined CT4D, which directly operates on animatable meshes for generating consistent 4D content from arbitrary user…

    Submitted 15 August, 2024; originally announced August 2024.

  35. arXiv:2408.08208  [pdf, other]

    cs.IR cs.AI

    LLM4DSR: Leveraing Large Language Model for Denoising Sequential Recommendation

    Authors: Bohao Wang, Feng Liu, Jiawei Chen, Yudi Wu, Xingyu Lou, Jun Wang, Yan Feng, Chun Chen, Can Wang

    Abstract: Sequential recommendation systems fundamentally rely on users' historical interaction sequences, which are often contaminated by noisy interactions. Identifying these noisy interactions accurately without additional information is particularly difficult due to the lack of explicit supervisory signals to denote noise. Large Language Models (LLMs), equipped with extensive open knowledge and semantic…

    Submitted 15 August, 2024; originally announced August 2024.

  36. arXiv:2408.08205  [pdf, other]

    cs.CV cs.CR cs.MM

    A Multi-task Adversarial Attack Against Face Authentication

    Authors: Hanrui Wang, Shuo Wang, Cunjian Chen, Massimo Tistarelli, Zhe Jin

    Abstract: Deep-learning-based identity management systems, such as face authentication systems, are vulnerable to adversarial attacks. However, existing attacks are typically designed for single-task purposes, which means they are tailored to exploit vulnerabilities unique to the individual target rather than being adaptable for multiple users or systems. This limitation makes them unsuitable for certain at…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: Accepted by ACM Transactions on Multimedia Computing, Communications, and Applications

  37. arXiv:2408.08066  [pdf, other]

    cs.IR

    Mamba Retriever: Utilizing Mamba for Effective and Efficient Dense Retrieval

    Authors: Hanqi Zhang, Chong Chen, Lang Mei, Qi Liu, Jiaxin Mao

    Abstract: In the information retrieval (IR) area, dense retrieval (DR) models use deep learning techniques to encode queries and passages into embedding space to compute their semantic relations. It is important for DR models to balance both efficiency and effectiveness. Pre-trained language models (PLMs), especially Transformer-based PLMs, have been proven to be effective encoders of DR models. However, th…

    Submitted 22 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

  38. arXiv:2408.07999  [pdf, other]

    cs.CV

    Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement

    Authors: Wenxuan Li, Qin Zou, Chi Chen, Bo Du, Long Chen

    Abstract: In the realm of autonomous driving, accurately detecting occluded or distant objects, referred to as weak positive samples, presents significant challenges. These challenges predominantly arise during query initialization, where an over-reliance on heatmap confidence often results in a high rate of false positives, consequently masking weaker detections and impairing system performance. To alleviate…

    Submitted 15 August, 2024; originally announced August 2024.

  39. arXiv:2408.07703  [pdf, other]

    cs.CV

    Knowledge Distillation with Refined Logits

    Authors: Wujie Sun, Defang Chen, Siwei Lyu, Genlang Chen, Chun Chen, Can Wang

    Abstract: Recent research on knowledge distillation has increasingly focused on logit distillation because of its simplicity, effectiveness, and versatility in model compression. In this paper, we introduce Refined Logit Distillation (RLD) to address the limitations of current logit distillation methods. Our approach is motivated by the observation that even high-performing teacher models can make incorrect…

    Submitted 19 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

    Comments: 11 pages, 7 figures

  40. arXiv:2408.07536  [pdf, other]

    cs.NI

    Context-aware Container Orchestration in Serverless Edge Computing

    Authors: Peiyuan Guan, Chen Chen, Ziru Chen, Lin X. Cai, Xing Hao, Amir Taherkordi

    Abstract: Adopting serverless computing to edge networks benefits end-users from the pay-as-you-use billing model and flexible scaling of applications. This paradigm extends the boundaries of edge computing and remarkably improves the quality of services. However, due to the heterogeneous nature of computing and bandwidth resources in edge networks, it is challenging to dynamically allocate different resour…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: This paper has been accepted by the IEEE GLOBECOM 2024 Conference

  41. arXiv:2408.06995  [pdf, other]

    cs.CV

    Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models

    Authors: Cheng Chen, Christina Giannoula, Andreas Moshovos

    Abstract: Diffusion models are emerging models that generate images by iteratively denoising random Gaussian noise using deep neural networks. These models typically exhibit high computational and memory demands, necessitating effective post-training quantization for high-performance inference. Recent works propose low-bitwidth (e.g., 8-bit or 4-bit) quantization for diffusion models, however 4-bit integer…

    Submitted 13 August, 2024; originally announced August 2024.

  42. arXiv:2408.06303  [pdf, other]

    cs.CL cs.CV

    Long-Form Answers to Visual Questions from Blind and Low Vision People

    Authors: Mina Huh, Fangyuan Xu, Yi-Hao Peng, Chongyan Chen, Hansika Murugu, Danna Gurari, Eunsol Choi, Amy Pavel

    Abstract: Vision language models can now generate long-form answers to questions about images - long-form visual question answers (LFVQA). We contribute VizWiz-LF, a dataset of long-form answers to visual questions posed by blind and low vision (BLV) users. VizWiz-LF contains 4.2k long-form answers to 600 visual questions, collected from human expert describers and six VQA models. We develop and annotate fu…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: COLM 2024

  43. arXiv:2408.05002  [pdf, other]

    cs.SE

    An Empirical Study on Challenges for LLM Developers

    Authors: Xiang Chen, Chaoyang Gao, Chunyang Chen, Guangbei Zhang, Yong Liu

    Abstract: In recent years, large language models (LLMs) have seen rapid advancements, significantly impacting various fields such as natural language processing and software engineering. These LLMs, exemplified by OpenAI's ChatGPT, have revolutionized the way we approach language understanding and generation tasks. However, in contrast to traditional software development practices, LLM development introduc…

    Submitted 11 August, 2024; v1 submitted 6 August, 2024; originally announced August 2024.

    Comments: 29 pages, 15 figures

  44. arXiv:2408.04883  [pdf, other]

    cs.CV

    ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation

    Authors: Mengcheng Lan, Chaofeng Chen, Yiping Ke, Xinjiang Wang, Litong Feng, Wayne Zhang

    Abstract: Open-vocabulary semantic segmentation requires models to effectively integrate visual representations with open-vocabulary semantic labels. While Contrastive Language-Image Pre-training (CLIP) models shine in recognizing visual concepts from text, they often struggle with segment coherence due to their limited localization ability. In contrast, Vision Foundation Models (VFMs) excel at acquiring sp…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: Accepted to ECCV 2024. Code available at https://github.com/mc-lan/ProxyCLIP

  45. arXiv:2408.04798  [pdf, other]

    cs.HC

    Manipulable Semantic Components: a Computational Representation of Data Visualization Scenes

    Authors: Zhicheng Liu, Chen Chen, John Hooker

    Abstract: Various data visualization applications such as reverse engineering and interactive authoring require a vocabulary that describes the structure of visualization scenes and the procedure to manipulate them. A few scene abstractions have been proposed, but they are restricted to specific applications for a limited set of visualization types. A unified and expressive model of data visualization scene…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: To appear at IEEE Transactions on Visualization & Computer Graphics (Proceedings IEEE VIS'24), 2025. The paper has been selected for the VIS 2024 Honorable Mention award

  46. arXiv:2408.04224  [pdf, other]

    cs.CV

    Cross-View Meets Diffusion: Aerial Image Synthesis with Geometry and Text Guidance

    Authors: Ahmad Arrabi, Xiaohan Zhang, Waqas Sultani, Chen Chen, Safwan Wshah

    Abstract: Aerial imagery analysis is critical for many research fields. However, obtaining frequent high-quality aerial images is not always accessible due to its high effort and cost requirements. One solution is to use the Ground-to-Aerial (G2A) technique to synthesize aerial images from easily collectible ground images. However, G2A is rarely studied, because of its challenges, including but not limited…

    Submitted 20 August, 2024; v1 submitted 8 August, 2024; originally announced August 2024.

  47. arXiv:2408.03608  [pdf, other]

    cs.LG cs.CV stat.ME

    Mixstyle-Entropy: Domain Generalization with Causal Intervention and Perturbation

    Authors: Luyao Tang, Yuxuan Yuan, Chaoqi Chen, Xinghao Ding, Yue Huang

    Abstract: Despite the considerable advancements achieved by deep neural networks, their performance tends to degenerate when the test environment diverges from the training ones. Domain generalization (DG) solves this issue by learning representations independent of domain-related information, thus facilitating extrapolation to unseen environments. Existing approaches typically focus on formulating tailored…

    Submitted 22 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: Accepted by BMVC 2024

  48. arXiv:2408.02965  [pdf, other]

    cs.LG math.DS physics.comp-ph

    Data-Driven Stochastic Closure Modeling via Conditional Diffusion Model and Neural Operator

    Authors: Xinghao Dong, Chuanqi Chen, Jin-Long Wu

    Abstract: Closure models are widely used in simulating complex multiscale dynamical systems such as turbulence and the earth system, for which direct numerical simulation that resolves all scales is often too expensive. For those systems without a clear scale separation, deterministic and local closure models often lack enough generalization capability, which limits their performance in many real-world appl…

    Submitted 6 August, 2024; originally announced August 2024.

    MSC Class: 68T01

  49. arXiv:2408.02615  [pdf, other]

    cs.CV

    LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba

    Authors: Yunxiang Fu, Chaoqi Chen, Yizhou Yu

    Abstract: Recent Transformer-based diffusion models have shown remarkable performance, largely attributed to the ability of the self-attention mechanism to accurately capture both global and local contexts by computing all-pair interactions among input tokens. However, their quadratic complexity poses significant computational challenges for long-sequence inputs. Conversely, a recent state space model calle…

    Submitted 5 August, 2024; originally announced August 2024.

  50. X.509 Information Security Certification Based on Post-Quantum Cryptography

    Authors: Abel C. H. Chen

    Abstract: In recent years, with the advancement of quantum computing, mainstream asymmetric cryptographic methods in the current Public Key Infrastructure (PKI) systems are gradually being threatened. Therefore, this study explores X.509 security certificates based on Post-Quantum Cryptography (PQC) and discusses implemented solutions. This study compares mainstream asymmetric cryptographic methods (includi…

    Submitted 4 August, 2024; originally announced August 2024.

    Comments: The manuscript was submitted to arXiv on 6 May 2024, but it was rejected on 11 July 2024. The appeal was submitted on 11 July 2024, and it was accepted on 2 August 2024. The manuscript is written in Chinese.