Showing 1–50 of 765 results for author: Ding, Y

  1. arXiv:2409.03439  [pdf, other]

    cs.RO cs.AI cs.PL

    KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale

    Authors: Wei Gao, Jingqiang Wang, Xinv Zhu, Jun Zhong, Yue Shen, Youshuang Ding

    Abstract: We would like industrial robots to handle unstructured environments with cameras and perception pipelines. In contrast to traditional industrial robots that replay offline-crafted trajectories, online behavior planning is required for these perception-guided industrial applications. Aside from perception and planning algorithms, deploying perception-guided manipulators also requires substantial ef…

    Submitted 5 September, 2024; originally announced September 2024.

  2. arXiv:2409.00670  [pdf, other]

    cs.LG cs.SI

    Towards Faster Graph Partitioning via Pre-training and Inductive Inference

    Authors: Meng Qin, Chaorui Zhang, Yu Gao, Yibin Ding, Weipeng Jiang, Weixi Zhang, Wei Han, Bo Bai

    Abstract: Graph partitioning (GP) is a classic problem that divides the node set of a graph into densely-connected blocks. Following the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large-language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning) based on a novel pre-training & refinement paradigm. We first conduct the offline pre-training of a deep gra…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: Champion winner of IEEE HPEC 2024 Graph Challenge (https://graphchallenge.mit.edu/champions)

  3. arXiv:2408.14416  [pdf, ps, other]

    cs.LG cs.DC

    Hyperdimensional Computing Empowered Federated Foundation Model over Wireless Networks for Metaverse

    Authors: Yahao Ding, Wen Shang, Minrui Xu, Zhaohui Yang, Ye Hu, Dusit Niyato, Mohammad Shikh-Bahaei

    Abstract: The Metaverse, a burgeoning collective virtual space merging augmented reality and persistent virtual worlds, necessitates advanced artificial intelligence (AI) and communication technologies to support immersive and interactive experiences. Federated learning (FL) has emerged as a promising technique for collaboratively training AI models while preserving data privacy. However, FL faces challenge…

    Submitted 26 August, 2024; originally announced August 2024.

  4. arXiv:2408.14032  [pdf, other]

    cs.CV

    More Pictures Say More: Visual Intersection Network for Open Set Object Detection

    Authors: Bingcheng Dong, Yuning Ding, Jinrong Zhang, Sifan Zhang, Shenglan Liu

    Abstract: Open Set Object Detection has seen rapid development recently, but it continues to pose significant challenges. Language-based methods, grappling with the substantial modal disparity between textual and visual modalities, require extensive computational resources to bridge this gap. Although integrating visual prompts into these frameworks shows promise for enhancing performance, it always comes w…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: 7 pages

  5. arXiv:2408.13614  [pdf, other]

    eess.AS cs.CY

    As Biased as You Measure: Methodological Pitfalls of Bias Evaluations in Speaker Verification Research

    Authors: Wiebke Hutiri, Tanvina Patel, Aaron Yi Ding, Odette Scharenborg

    Abstract: Detecting and mitigating bias in speaker verification systems is important, as datasets, processing choices and algorithms can lead to performance differences that systematically favour some groups of people while disadvantaging others. Prior studies have thus measured performance differences across groups to evaluate bias. However, when comparing results across studies, it becomes apparent that t…

    Submitted 24 August, 2024; originally announced August 2024.

    Comments: Accepted to Interspeech 2024 (oral)

  6. arXiv:2408.13024  [pdf, other]

    cs.CV

    Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding

    Authors: Xianqiang Gao, Pingrui Zhang, Delin Qu, Dong Wang, Zhigang Wang, Yan Ding, Bin Zhao, Xuelong Li

    Abstract: 3D Object Affordance Grounding aims to predict the functional regions on a 3D object and has laid the foundation for a wide range of applications in robotics. Recent advances tackle this problem via learning a mapping between 3D regions and a single human-object interaction image. However, the geometric structure of the 3D object and the object in the human-object interaction image are not always…

    Submitted 23 August, 2024; originally announced August 2024.

  7. arXiv:2408.11480  [pdf, other]

    eess.IV cs.CV

    OAPT: Offset-Aware Partition Transformer for Double JPEG Artifacts Removal

    Authors: Qiao Mo, Yukang Ding, Jinhua Hao, Qiang Zhu, Ming Sun, Chao Zhou, Feiyu Chen, Shuyuan Zhu

    Abstract: Deep learning-based methods have shown remarkable performance in single JPEG artifacts removal task. However, existing methods tend to degrade on double JPEG images, which are prevalent in real-world scenarios. To address this issue, we propose Offset-Aware Partition Transformer for double JPEG artifacts removal, termed as OAPT. We conduct an analysis of double JPEG compression that results in up…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: 14 pages, 9 figures. Codes and models are available at https://github.com/QMoQ/OAPT.git

  8. arXiv:2408.11067  [pdf, other]

    cs.NE cs.AI cs.LG

    Toward End-to-End Bearing Fault Diagnosis for Industrial Scenarios with Spiking Neural Networks

    Authors: Yongqi Ding, Lin Zuo, Mengmeng Jing, Kunshan Yang, Biao Chen, Yunqian Yu

    Abstract: Spiking neural networks (SNNs) transmit information via low-power binary spikes and have received widespread attention in areas such as computer vision and reinforcement learning. However, there have been very few explorations of SNNs in more practical industrial scenarios. In this paper, we focus on the application of SNNs in bearing fault diagnosis to facilitate the integration of high-performan…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: 13 pages, 10 figures

  9. arXiv:2408.10605  [pdf, other]

    cs.CV cs.AI

    MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration

    Authors: Yanbo Ding, Shaobin Zhuang, Kunchang Li, Zhengrong Yue, Yu Qiao, Yali Wang

    Abstract: Despite recent advancements in text-to-image generation, most existing methods struggle to create images with multiple objects and complex spatial relationships in 3D world. To tackle this limitation, we introduce a generic AI system, namely MUSES, for 3D-controllable image generation from user queries. Specifically, our MUSES addresses this challenging task by developing a progressive workflow wi…

    Submitted 21 August, 2024; v1 submitted 20 August, 2024; originally announced August 2024.

  10. arXiv:2408.10443  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS

    Federated Learning of Large ASR Models in the Real World

    Authors: Yonghui Xiao, Yuxin Ding, Changwan Ryu, Petr Zadrazil, Francoise Beaufays

    Abstract: Federated learning (FL) has shown promising results on training machine learning models with privacy preservation. However, for large models with over 100 million parameters, the training resource requirement becomes an obstacle for FL because common devices do not have enough memory and computation power to finish the FL tasks. Although efficient training methods have been proposed, it is still a…

    Submitted 19 August, 2024; originally announced August 2024.

  11. arXiv:2408.09717  [pdf, other]

    cs.CL

    SEMDR: A Semantic-Aware Dual Encoder Model for Legal Judgment Prediction with Legal Clue Tracing

    Authors: Pengjie Liu, Wang Zhang, Yulong Ding, Xuefeng Zhang, Shuang-Hua Yang

    Abstract: Legal Judgment Prediction (LJP) aims to form legal judgments based on the criminal fact description. However, researchers struggle to classify confusing criminal cases, such as robbery and theft, which requires LJP models to distinguish the nuances between similar crimes. Existing methods usually design handcrafted features to pick up necessary semantic legal clues to make more accurate legal judg…

    Submitted 19 August, 2024; originally announced August 2024.

  12. arXiv:2408.09108  [pdf, other]

    cs.AI cs.CV

    Temporal Reversed Training for Spiking Neural Networks with Generalized Spatio-Temporal Representation

    Authors: Lin Zuo, Yongqi Ding, Wenwei Luo, Mengmeng Jing, Xianlong Tian, Kunshan Yang

    Abstract: Spiking neural networks (SNNs) have received widespread attention as an ultra-low energy computing paradigm. Recent studies have focused on improving the feature extraction capability of SNNs, but they suffer from inefficient inference and suboptimal performance. In this paper, we propose a simple yet effective temporal reversed training (TRT) method to optimize the spatio-temporal performance of…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: 15 pages, 8 figures

  13. arXiv:2408.06450  [pdf, other]

    cs.SE cs.CL cs.LG

    Evaluating Language Models for Efficient Code Generation

    Authors: Jiawei Liu, Songrun Xie, Junhao Wang, Yuxiang Wei, Yifeng Ding, Lingming Zhang

    Abstract: We introduce Differential Performance Evaluation (DPE), a framework designed to reliably evaluate Large Language Models (LLMs) for efficient code generation. Traditional coding benchmarks often fail to provide reliable insights into code efficiency, due to their reliance on simplistic test inputs and the absence of effective compound metrics. DPE addresses these issues by focusing on efficiency-de…

    Submitted 12 August, 2024; originally announced August 2024.

  14. arXiv:2408.06027  [pdf, other]

    eess.SP cs.LG

    A Comprehensive Survey on EEG-Based Emotion Recognition: A Graph-Based Perspective

    Authors: Chenyu Liu, Xinliang Zhou, Yihao Wu, Yi Ding, Liming Zhai, Kun Wang, Ziyu Jia, Yang Liu

    Abstract: Compared to other modalities, electroencephalogram (EEG) based emotion recognition can intuitively respond to emotional patterns in the human brain and, therefore, has become one of the most focused tasks in affective computing. The nature of emotions is a physiological and psychological state change in response to brain region connectivity, making emotion recognition focus more on the dependency…

    Submitted 13 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

  15. arXiv:2408.05406  [pdf, other]

    quant-ph cs.LG

    Efficient Quantum Gradient and Higher-order Derivative Estimation via Generalized Hadamard Test

    Authors: Dantong Li, Dikshant Dulal, Mykhailo Ohorodnikov, Hanrui Wang, Yongshan Ding

    Abstract: In the context of Noisy Intermediate-Scale Quantum (NISQ) computing, parameterized quantum circuits (PQCs) represent a promising paradigm for tackling challenges in quantum sensing, optimal control, optimization, and machine learning on near-term quantum hardware. Gradient-based methods are crucial for understanding the behavior of PQCs and have demonstrated substantial advantages in the convergen…

    Submitted 9 August, 2024; originally announced August 2024.

  16. arXiv:2408.04218  [pdf, other]

    cs.IT

    On many-to-one mappings over finite fields

    Authors: Yanbin Zheng, Yanjin Ding, Meiying Zhang, Pingzhi Yuan, Qiang Wang

    Abstract: The definition of many-to-one mapping, or $m$-to-$1$ mapping for short, between two finite sets is introduced in this paper, which unifies and generalizes the definitions of $2$-to-$1$ mappings and $n$-to-$1$ mappings. A generalized local criterion is given, which is an abstract criterion for a mapping to be $m$-to-$1$. By employing the generalized local criterion, three constructions of $m$-to-…

    Submitted 8 August, 2024; originally announced August 2024.

  17. arXiv:2408.02839  [pdf, other]

    stat.ML cs.LG

    Optimizing Cox Models with Stochastic Gradient Descent: Theoretical Foundations and Practical Guidances

    Authors: Lang Zeng, Weijing Tang, Zhao Ren, Ying Ding

    Abstract: Optimizing Cox regression and its neural network variants poses substantial computational challenges in large-scale studies. Stochastic gradient descent (SGD), known for its scalability in model optimization, has recently been adapted to optimize Cox models. Unlike its conventional application, which typically targets a sum of independent individual loss, SGD for Cox models updates parameters base…

    Submitted 5 August, 2024; originally announced August 2024.

  18. arXiv:2408.01691  [pdf, other]

    cs.LG cs.AI

    TreeCSS: An Efficient Framework for Vertical Federated Learning

    Authors: Qinbo Zhang, Xiao Yan, Yukai Ding, Quanqing Xu, Chuang Hu, Xiaokai Zhou, Jiawei Jiang

    Abstract: Vertical federated learning (VFL) considers the case that the features of data samples are partitioned over different participants. VFL consists of two main steps, i.e., identify the common data samples for all participants (alignment) and train model using the aligned data samples (training). However, when there are many participants and data samples, both alignment and training become slow. As s…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: 16 pages, 7 figures

  19. arXiv:2408.01435  [pdf, other]

    cs.CV cs.RO

    A New Clustering-based View Planning Method for Building Inspection with Drone

    Authors: Yongshuai Zheng, Guoliang Liu, Yan Ding, Guohui Tian

    Abstract: With the rapid development of drone technology, the application of drones equipped with visual sensors for building inspection and surveillance has attracted much attention. View planning aims to find a set of near-optimal viewpoints for vision-related tasks to achieve the vision coverage goal. This paper proposes a new clustering-based two-step computational method using spectral clustering, loca…

    Submitted 19 July, 2024; originally announced August 2024.

  20. arXiv:2408.01391  [pdf, other]

    cs.DC cs.LG

    FT K-means: A High-Performance K-means on GPU with Fault Tolerance

    Authors: Shixun Wu, Yitong Ding, Yujia Zhai, Jinyang Liu, Jiajun Huang, Zizhe Jian, Huangliang Dai, Sheng Di, Bryan M. Wong, Zizhong Chen, Franck Cappello

    Abstract: K-means is a widely used algorithm in clustering, however, its efficiency is primarily constrained by the computational cost of distance computing. Existing implementations suffer from suboptimal utilization of computational units and lack resilience against soft errors. To address these challenges, we introduce FT K-means, a high-performance GPU-accelerated implementation of K-means with online f…

    Submitted 7 August, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

  21. arXiv:2408.01287  [pdf, other]

    cs.CL cs.CV

    Deep Learning based Visually Rich Document Content Understanding: A Survey

    Authors: Yihao Ding, Jean Lee, Soyeon Caren Han

    Abstract: Visually Rich Documents (VRDs) are essential in academia, finance, medical fields, and marketing due to their multimodal information content. Traditional methods for extracting information from VRDs depend on expert knowledge and manual labor, making them costly and inefficient. The advent of deep learning has revolutionized this process, introducing models that leverage multimodal information vis…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: Work in Progress

  22. arXiv:2408.00491  [pdf, other]

    cs.CL cs.CV cs.MM

    GalleryGPT: Analyzing Paintings with Large Multimodal Models

    Authors: Yi Bin, Wenhao Shi, Yujuan Ding, Zhiqiang Hu, Zheng Wang, Yang Yang, See-Kiong Ng, Heng Tao Shen

    Abstract: Artwork analysis is important and fundamental skill for art appreciation, which could enrich personal aesthetic sensibility and facilitate the critical thinking ability. Understanding artworks is challenging due to its subjective nature, diverse interpretations, and complex visual elements, requiring expertise in art history, cultural background, and aesthetic theory. However, limited by the data…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: Accepted as Oral Presentation at ACM Multimedia 2024

  23. Leveraging Weak Cross-Modal Guidance for Coherence Modelling via Iterative Learning

    Authors: Yi Bin, Junrong Liao, Yujuan Ding, Haoxuan Li, Yang Yang, See-Kiong Ng, Heng Tao Shen

    Abstract: Cross-modal coherence modeling is essential for intelligent systems to help them organize and structure information, thereby understanding and creating content of the physical world coherently like human-beings. Previous work on cross-modal coherence modeling attempted to leverage the order information from another modality to assist the coherence recovering of the target modality. Despite of the…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: Accepted by ACM Multimedia 2024

  24. Air-to-Ground Cooperative OAM Communications

    Authors: Ruirui Chen, Yu Ding, Beibei Zhang, Song Li, Liping Liang

    Abstract: For users in hotspot region, orbital angular momentum (OAM) can realize multifold increase of spectrum efficiency (SE), and the flying base station (FBS) can rapidly support the real-time communication demand. However, the hollow divergence and alignment requirement impose crucial challenges for users to achieve air-to-ground OAM communications, where there exists the line-of-sight path. Therefore…

    Submitted 1 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

    Journal ref: IEEE WIRELESS COMMUNICATIONS LETTERS, VOL. 13, NO. 4, APRIL 2024

  25. Precoding Based Downlink OAM-MIMO Communications with Rate Splitting

    Authors: Ruirui Chen, Jinyang Lin, Beibei Zhang, Yu Ding, Keyue Xu

    Abstract: Orbital angular momentum (OAM) and rate splitting (RS) are the potential key techniques for the future wireless communications. As a new orthogonal resource, OAM can achieve the multifold increase of spectrum efficiency to relieve the scarcity of the spectrum resource, but how to enhance the privacy performance imposes crucial challenge for OAM communications. RS technique divides the information…

    Submitted 2 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

    Journal ref: IEEE TRANSACTIONS ON BROADCASTING, VOL. 69, NO. 4, DECEMBER 2023

  26. arXiv:2407.19832  [pdf, other]

    cs.CV cs.AI cs.CL

    ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2

    Authors: Wenjun Huang, Jiakai Pan, Jiahao Tang, Yanyu Ding, Yifei Xing, Yuhe Wang, Zhengzhuo Wang, Jianguo Hu

    Abstract: Multimodal Large Language Models (MLLMs) have attracted much attention for their multifunctionality. However, traditional Transformer architectures incur significant overhead due to their secondary computational complexity. To address this issue, we introduce ML-Mamba, a multimodal language model, which utilizes the latest and efficient Mamba-2 model for inference. Mamba-2 is known for its linear…

    Submitted 21 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

  27. arXiv:2407.19821  [pdf]

    eess.IV cs.CV q-bio.TO

    Distilling High Diagnostic Value Patches for Whole Slide Image Classification Using Attention Mechanism

    Authors: Tianhang Nan, Hao Quan, Yong Ding, Xingyu Li, Kai Yang, Xiaoyu Cui

    Abstract: Multiple Instance Learning (MIL) has garnered widespread attention in the field of Whole Slide Image (WSI) classification as it replaces pixel-level manual annotation with diagnostic reports as labels, significantly reducing labor costs. Recent research has shown that bag-level MIL methods often yield better results because they can consider all patches of the WSI as a whole. However, a drawback o…

    Submitted 16 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

  28. arXiv:2407.19201  [pdf, other]

    cs.LG

    Long Range Switching Time Series Prediction via State Space Model

    Authors: Jiaming Zhang, Yang Ding, Yunfeng Gao

    Abstract: In this study, we delve into the Structured State Space Model (S4), Change Point Detection methodologies, and the Switching Non-linear Dynamics System (SNLDS). Our central proposition is an enhanced inference technique and long-range dependency method for SNLDS. The cornerstone of our approach is the fusion of S4 and SNLDS, leveraging the strengths of both models to effectively address the intrica…

    Submitted 27 July, 2024; originally announced July 2024.

    Comments: 14 pages, 14 figures

  29. arXiv:2407.17349  [pdf, other]

    cs.CL

    Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching

    Authors: Yuyang Ding, Hanglei Hu, Jie Zhou, Qin Chen, Bo Jiang, Liang He

    Abstract: With the introduction of large language models (LLMs), automatic math reasoning has seen tremendous success. However, current methods primarily focus on providing solutions or using techniques like Chain-of-Thought to enhance problem-solving accuracy. In this paper, we focus on improving the capability of mathematics teaching via a Socratic teaching-based LLM (\texttt{SocraticLLM}), which guides l…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: Accepted By CIKM 2024

  30. arXiv:2407.17157  [pdf]

    cs.CV q-bio.TO

    Establishing Truly Causal Relationship Between Whole Slide Image Predictions and Diagnostic Evidence Subregions in Deep Learning

    Authors: Tianhang Nan, Yong Ding, Hao Quan, Deliang Li, Mingchen Zou, Xiaoyu Cui

    Abstract: In the field of deep learning-driven Whole Slide Image (WSI) classification, Multiple Instance Learning (MIL) has gained significant attention due to its ability to be trained using only slide-level diagnostic labels. Previous MIL researches have primarily focused on enhancing feature aggregators for globally analyzing WSIs, but overlook a causal relationship in diagnosis: model's prediction shoul…

    Submitted 24 July, 2024; originally announced July 2024.

  31. arXiv:2407.17126  [pdf]

    cs.CL cs.AI

    SDoH-GPT: Using Large Language Models to Extract Social Determinants of Health (SDoH)

    Authors: Bernardo Consoli, Xizhi Wu, Song Wang, Xinyu Zhao, Yanshan Wang, Justin Rousseau, Tom Hartvigsen, Li Shen, Huanmei Wu, Yifan Peng, Qi Long, Tianlong Chen, Ying Ding

    Abstract: Extracting social determinants of health (SDoH) from unstructured medical notes depends heavily on labor-intensive annotations, which are typically task-specific, hampering reusability and limiting sharing. In this study we introduced SDoH-GPT, a simple and effective few-shot Large Language Model (LLM) method leveraging contrastive examples and concise instructions to extract SDoH without relying…

    Submitted 24 July, 2024; originally announced July 2024.

  32. arXiv:2407.17030  [pdf, other]

    cs.NI

    Applications of Multi-Agent Deep Reinforcement Learning Communication in Network Management: A Survey

    Authors: Yue Pi, Wang Zhang, Yong Zhang, Hairong Huang, Baoquan Rao, Yulong Ding, Shuanghua Yang

    Abstract: With the advancement of artificial intelligence technology, the automation of network management, also known as Autonomous Driving Networks (ADN), is gaining widespread attention. The network management has shifted from traditional homogeneity and centralization to heterogeneity and decentralization. Multi-agent deep reinforcement learning (MADRL) allows agents to make decisions based on local obs…

    Submitted 24 July, 2024; originally announced July 2024.

  33. arXiv:2407.15617  [pdf, other]

    cs.CV cs.AI

    Norface: Improving Facial Expression Analysis by Identity Normalization

    Authors: Hanwei Liu, Rudong An, Zhimeng Zhang, Bowen Ma, Wei Zhang, Yan Song, Yujing Hu, Wei Chen, Yu Ding

    Abstract: Facial Expression Analysis remains a challenging task due to unexpected task-irrelevant noise, such as identity, head pose, and background. To address this issue, this paper proposes a novel framework, called Norface, that is unified for both Action Unit (AU) analysis and Facial Emotion Recognition (FER) tasks. Norface consists of a normalization network and a classification network. First, the ca…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  34. arXiv:2407.14996  [pdf, other]

    cs.LG

    All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks

    Authors: Ajay Jaiswal, Nurendra Choudhary, Ravinarayana Adkathimar, Muthu P. Alagappan, Gaurush Hiranandani, Ying Ding, Zhangyang Wang, Edward W Huang, Karthik Subbian

    Abstract: Graph Neural Networks (GNNs) have attracted immense attention in the past decade due to their numerous real-world applications built around graph-structured data. On the other hand, Large Language Models (LLMs) with extensive pretrained knowledge and powerful semantic comprehension abilities have recently shown a remarkable ability to benefit applications using vision and text data. In this paper,…

    Submitted 20 July, 2024; originally announced July 2024.

  35. arXiv:2407.14643  [pdf, other]

    cs.RO

    Double-Layer Soft Data Fusion for Indoor Robot WiFi-Visual Localization

    Authors: Yuehua Ding, Jean-Francois Dollinger, Vincent Vauchey, Mourad Zghal

    Abstract: This paper presents a novel WiFi-Visual data fusion method for indoor robot (TIAGO++) localization. This method can use 10 WiFi samples and 4 low-resolution images ($58 \times 58$ in pixels) to localize a indoor robot with an average error distance about 1.32 meters. The experiment test is 3 months after the data collection in a general teaching building, whose WiFi and visual environments are par…

    Submitted 19 July, 2024; originally announced July 2024.

  36. arXiv:2407.13126  [pdf, other]

    cs.DC

    Improving GPU Multi-Tenancy Through Dynamic Multi-Instance GPU Reconfiguration

    Authors: Tianyu Wang, Sheng Li, Bingyao Li, Yue Dai, Ao Li, Geng Yuan, Yufei Ding, Youtao Zhang, Xulong Tang

    Abstract: Continuous learning (CL) has emerged as one of the most popular deep learning paradigms deployed in modern cloud GPUs. Specifically, CL has the capability to continuously update the model parameters (through model retraining) and use the updated model (if available) to serve overtime arriving inference requests. It is generally beneficial to co-locate the retraining and inference together to enabl…

    Submitted 17 July, 2024; originally announced July 2024.

  37. arXiv:2407.08255  [pdf, other]

    cs.CV cs.LG

    GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification

    Authors: Aitao Yang, Min Li, Yao Ding, Leyuan Fang, Yaoming Cai, Yujie He

    Abstract: Efficient extraction of spectral sequences and geospatial information has always been a hot topic in hyperspectral image classification. In terms of spectral sequence feature capture, RNN and Transformer have become mainstream classification frameworks due to their long-range feature capture capabilities. In terms of spatial information aggregation, CNN enhances the receptive field to retain integ…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 13 pages, 10 figures

  38. arXiv:2407.06505  [pdf]

    cs.HC

    Not all explicit cues help communicate: Pedestrians' perceptions, fixations, and decisions toward automated vehicles with varied appearance

    Authors: Wei Lyu, Yaqin Cao, Yi Ding, Jingyu Li, Kai Tian, Hui Zhang

    Abstract: Given pedestrians' vulnerability in road traffic, it remains unclear how novel AV appearances will impact pedestrians crossing behaviour. To address this gap, this study pioneers an investigation into the influence of AVs' exterior design, correlated with their kinematics, on pedestrians' road-crossing perception and decision-making. A video-based eye-tracking experimental study was conducted with…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 37 pages, 13 figures, 4 tables

  39. CrowdTransfer: Enabling Crowd Knowledge Transfer in AIoT Community

    Authors: Yan Liu, Bin Guo, Nuo Li, Yasan Ding, Zhouyangzi Zhang, Zhiwen Yu

    Abstract: Artificial Intelligence of Things (AIoT) is an emerging frontier based on the deep fusion of Internet of Things (IoT) and Artificial Intelligence (AI) technologies. Although advanced deep learning techniques enhance the efficient data processing and intelligent analysis of complex IoT data, they still suffer from notable challenges when deployed to practical AIoT applications, such as constrained…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted for publication in IEEE Communications Surveys & Tutorials. Copyright will be transferred without notice, after this version may no longer be accessible

  40. arXiv:2407.05739  [pdf, other]

    cs.NE cs.AI

    Multi-Bit Mechanism: A Novel Information Transmission Paradigm for Spiking Neural Networks

    Authors: Yongjun Xiao, Xianlong Tian, Yongqi Ding, Pei He, Mengmeng Jing, Lin Zuo

    Abstract: Since proposed, spiking neural networks (SNNs) gain recognition for their high performance, low power consumption and enhanced biological interpretability. However, while bringing these advantages, the binary nature of spikes also leads to considerable information loss in SNNs, ultimately causing performance degradation. We claim that the limited expressiveness of current binary spikes, resulting…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Under review

  41. arXiv:2407.03474  [pdf]

    cs.CY

    How high-status women promote repeated collaboration among women in male-dominated contexts

    Authors: Huimin Xu, Jamie Strassman, Ying Ding, Steven Gray, Maytal Saar-Tsechansky

    Abstract: Male-dominated contexts pose a dilemma: they increase the benefits of repeated collaboration among women, yet at the same time, make such collaborations less likely. This paper seeks to understand the conditions that foster repeated collaboration among women versus men in male-dominated settings by examining the critical role of status hierarchies. Using collaboration data on 8,232,769 computer sc…

    Submitted 3 July, 2024; originally announced July 2024.

  42. arXiv:2407.03374  [pdf]

    cs.AI cs.SE eess.SP eess.SY

    An Outline of Prognostics and Health Management Large Model: Concepts, Paradigms, and Challenges

    Authors: Laifa Tao, Shangyu Li, Haifei Liu, Qixuan Huang, Liang Ma, Guoao Ning, Yiling Chen, Yunlong Wu, Bin Li, Weiwei Zhang, Zhengduo Zhao, Wenchao Zhan, Wenyan Cao, Chao Wang, Hongmei Liu, Jian Ma, Mingliang Suo, Yujie Cheng, Yu Ding, Dengwei Song, Chen Lu

    Abstract: Prognosis and Health Management (PHM), critical for ensuring task completion by complex systems and preventing unexpected failures, is widely adopted in aerospace, manufacturing, maritime, rail, energy, etc. However, PHM's development is constrained by bottlenecks like generalization, interpretation and verification abilities. Presently, generative artificial intelligence (AI), represented by Larg…

    Submitted 1 July, 2024; originally announced July 2024.

  43. arXiv:2407.02390  [pdf, other]

    cs.DC cs.LG

    Uncertainty-Aware Decarbonization for Datacenters

    Authors: Amy Li, Sihang Liu, Yi Ding

    Abstract: This paper represents the first effort to quantify uncertainty in carbon intensity forecasting for datacenter decarbonization. We identify and analyze two types of uncertainty -- temporal and spatial -- and discuss their system implications. To address the temporal dynamics in quantifying uncertainty for carbon intensity forecasting, we introduce a conformal prediction-based framework. Evaluation…

    Submitted 23 August, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  44. arXiv:2407.02159  [pdf, other]

    cs.CV eess.IV

    SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images

    Authors: Jintu Zheng, Yi Ding, Qizhe Liu, Yi Cao, Ying Hu, Zenan Wang

    Abstract: Traditional fluorescence staining is phototoxic to live cells, slow, and expensive; thus, the subcellular structure prediction (SSP) from transmitted light (TL) images is emerging as a label-free, faster, low-cost alternative. However, existing approaches utilize 3D networks for one-to-one voxel level dense prediction, which necessitates a frequent and time-consuming Z-axis imaging process. Moreov…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  45. arXiv:2407.00105  [pdf, other]

    cs.LG cs.AI

    Multiple Kronecker RLS fusion-based link propagation for drug-side effect prediction

    Authors: Yuqing Qian, Ziyu Zheng, Prayag Tiwari, Yijie Ding, Quan Zou

    Abstract: Drug-side effect prediction has become an essential area of research in the field of pharmacology. As the use of medications continues to rise, so does the importance of understanding and mitigating the potential risks associated with them. At present, researchers have turned to data-driven methods to predict drug-side effects. Drug-side effect prediction is a link prediction problem, and the rela…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Transactions on Machine Learning Research (TMLR 2024)

  46. arXiv:2406.18938  [pdf, other]

    cs.IR

    Towards Personalized Federated Multi-Scenario Multi-Task Recommendation

    Authors: Yue Ding, Yanbiao Ji, Xun Cai, Xin Xin, Yuxiang Lu, Suizhi Huang, Chang Liu, Xiaofeng Gao, Tsuyoshi Murata, Hongtao Lu

    Abstract: In modern recommender systems, especially in e-commerce, predicting multiple targets such as click-through rate (CTR) and post-view conversion rate (CTCVR) is common. Multi-task recommender systems are increasingly popular in both research and practice, as they leverage shared knowledge across diverse business scenarios to enhance performance. However, emerging real-world scenarios and data privac…

    Submitted 19 August, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

  47. arXiv:2406.18585  [pdf, other]

    cs.CV cs.AI

    Flexible ViG: Learning the Self-Saliency for Flexible Object Recognition

    Authors: Lin Zuo, Kunshan Yang, Xianlong Tian, Kunbin He, Yongqi Ding, Mengmeng Jing

    Abstract: Existing computer vision methods mainly focus on the recognition of rigid objects, whereas the recognition of flexible objects remains unexplored. Recognizing flexible objects poses significant challenges due to their inherently diverse shapes and sizes, translucent attributes, ambiguous boundaries, and subtle inter-class differences. In this paper, we claim that these problems primarily arise fro…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Under review

  48. arXiv:2406.18345  [pdf, other]

    cs.LG eess.SP

    EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition

    Authors: Yi Ding, Chengxuan Tong, Shuailei Zhang, Muyun Jiang, Yong Li, Kevin Lim Jun Liang, Cuntai Guan

    Abstract: Integrating prior knowledge of neurophysiology into neural network architecture enhances the performance of emotion decoding. While numerous techniques emphasize learning spatial and short-term temporal patterns, there has been limited emphasis on capturing the vital long-term contextual information associated with emotional cognitive processes. In order to address this discrepancy, we introduce a…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  49. arXiv:2406.17659  [pdf, other]

    cs.AI cs.RO

    DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning

    Authors: Xiaohan Zhang, Zainab Altaweel, Yohei Hayamizu, Yan Ding, Saeid Amiri, Hao Yang, Andy Kaminski, Chad Esselink, Shiqi Zhang

    Abstract: Vision-language models (VLMs) have been applied to robot task planning problems, where the robot receives a task in natural language and generates plans based on visual inputs. While current VLMs have demonstrated strong vision-language understanding capabilities, their performance is still far from being satisfactory in planning tasks. At the same time, although classical task planners, such as P…

    Submitted 25 June, 2024; originally announced June 2024.

  50. arXiv:2406.16034  [pdf, ps, other]

    math.LO cs.LO

    Some General Completeness Results for Propositionally Quantified Modal Logics

    Authors: Yifeng Ding, Yipu Li

    Abstract: We study the completeness problem for propositionally quantified modal logics on quantifiable general frames, where the admissible sets are the propositions the quantifiers can range over and expressible sets of worlds are admissible, and Kripke frames, where the quantifiers range over all sets of worlds. We show that any normal propositionally quantified modal logic containing all instances of th…

    Submitted 23 June, 2024; originally announced June 2024.