Showing 1–50 of 917 results for author: Feng, Y

Searching in archive cs.
  1. arXiv:2409.03215  [pdf, other]

    cs.CL cs.AI cs.LG

    xLAM: A Family of Large Action Models to Empower AI Agent Systems

    Authors: Jianguo Zhang, Tian Lan, Ming Zhu, Zuxin Liu, Thai Hoang, Shirley Kokane, Weiran Yao, Juntao Tan, Akshara Prabhakar, Haolin Chen, Zhiwei Liu, Yihao Feng, Tulika Awalgaonkar, Rithesh Murthy, Eric Hu, Zeyuan Chen, Ran Xu, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong

    Abstract: Autonomous agents powered by large language models (LLMs) have attracted significant research interest. However, the open-source community faces many challenges in developing specialized models for agent tasks, driven by the scarcity of high-quality agent datasets and the absence of standard protocols in this area. We introduce and publicly release xLAM, a series of large action models designed fo…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: Technical report for the Salesforce xLAM model series

  2. arXiv:2409.02492  [pdf]

    cs.CV cs.LG eess.IV

    Reliable Deep Diffusion Tensor Estimation: Rethinking the Power of Data-Driven Optimization Routine

    Authors: Jialong Li, Zhicheng Zhang, Yunwei Chen, Qiqi Lu, Ye Wu, Xiaoming Liu, QianJin Feng, Yanqiu Feng, Xinyuan Zhang

    Abstract: Diffusion tensor imaging (DTI) holds significant importance in clinical diagnosis and neuroscience research. However, conventional model-based fitting methods often suffer from sensitivity to noise, leading to decreased accuracy in estimating DTI parameters. While traditional data-driven deep learning methods have shown potential in terms of accuracy and efficiency, their limited generalization to…

    Submitted 4 September, 2024; originally announced September 2024.

  3. arXiv:2409.01976  [pdf, other]

    cs.CR

    Benchmarking ZK-Friendly Hash Functions and SNARK Proving Systems for EVM-compatible Blockchains

    Authors: Hanze Guo, Yebo Feng, Cong Wu, Zengpeng Li, Jiahua Xu

    Abstract: With the rapid development of Zero-Knowledge Proofs (ZKPs), particularly Succinct Non-Interactive Arguments of Knowledge (SNARKs), benchmarking various ZK tools has become a valuable task. ZK-friendly hash functions, as key algorithms in blockchain, have garnered significant attention. Therefore, comprehensive benchmarking and evaluations of these evolving algorithms in ZK circuits present both pr…

    Submitted 3 September, 2024; originally announced September 2024.

  4. arXiv:2409.00103  [pdf, other]

    cs.CL cs.AI

    Nuance Matters: Probing Epistemic Consistency in Causal Reasoning

    Authors: Shaobo Cui, Junyou Li, Luca Mouchel, Yiyang Feng, Boi Faltings

    Abstract: To address this gap, our study introduces the concept of causal epistemic consistency, which focuses on the self-consistency of Large Language Models (LLMs) in differentiating intermediates with nuanced differences in causal reasoning. We propose a suite of novel metrics -- intensity ranking concordance, cross-group position agreement, and intra-group clustering -- to evaluate LLMs on this front.…

    Submitted 27 August, 2024; originally announced September 2024.

    Comments: 20 pages

  5. arXiv:2408.14197  [pdf, other]

    cs.CV

    Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving

    Authors: Yu Yang, Jianbiao Mei, Yukai Ma, Siliang Du, Wenqing Chen, Yijie Qian, Yuxiang Feng, Yong Liu

    Abstract: World models envision potential future states based on various ego actions. They embed extensive knowledge about the driving environment, facilitating safe and scalable autonomous driving. Most existing methods primarily focus on either data generation or the pretraining paradigms of world models. Unlike the aforementioned prior works, we propose Drive-OccWorld, which adapts a vision-centric 4D fo…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: 18 pages, 10 figures

  6. arXiv:2408.12680  [pdf, other]

    cs.AI

    Can LLMs Understand Social Norms in Autonomous Driving Games?

    Authors: Boxuan Wang, Haonan Duan, Yanhao Feng, Xu Chen, Yongjie Fu, Zhaobin Mo, Xuan Di

    Abstract: Social norm is defined as a shared standard of acceptable behavior in a society. The emergence of social norms fosters coordination among agents without any hard-coded rules, which is crucial for the large-scale deployment of AVs in an intelligent transportation system. This paper explores the application of LLMs in understanding and modeling social norms in autonomous driving games. We introduce…

    Submitted 1 September, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

  7. arXiv:2408.12590  [pdf, other]

    cs.CV cs.AI

    xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

    Authors: Can Qin, Congying Xia, Krithika Ramakrishnan, Michael Ryoo, Lifu Tu, Yihao Feng, Manli Shu, Honglu Zhou, Anas Awadalla, Jun Wang, Senthil Purushwalkam, Le Xue, Yingbo Zhou, Huan Wang, Silvio Savarese, Juan Carlos Niebles, Zeyuan Chen, Ran Xu, Caiming Xiong

    Abstract: We present xGen-VideoSyn-1, a text-to-video (T2V) generation model capable of producing realistic scenes from textual descriptions. Building on recent advancements, such as OpenAI's Sora, we explore the latent diffusion model (LDM) architecture and introduce a video variational autoencoder (VidVAE). VidVAE compresses video data both spatially and temporally, significantly reducing the length of vi…

    Submitted 31 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

    Comments: Accepted by ECCV24 AI4VA

  8. arXiv:2408.09857  [pdf, other]

    cs.CL

    TaSL: Continual Dialog State Tracking via Task Skill Localization and Consolidation

    Authors: Yujie Feng, Xu Chu, Yongxin Xu, Guangyuan Shi, Bo Liu, Xiao-Ming Wu

    Abstract: A practical dialogue system requires the capacity for ongoing skill acquisition and adaptability to new tasks while preserving prior knowledge. However, current methods for Continual Dialogue State Tracking (DST), a crucial function of dialogue systems, struggle with the catastrophic forgetting issue and knowledge transfer between tasks. We present TaSL, a novel framework for task skill localizati…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: Accepted to ACL 2024 Main Conference. arXiv admin note: text overlap with arXiv:2408.05200

  9. arXiv:2408.09846  [pdf, other]

    cs.CL

    Continual Dialogue State Tracking via Reason-of-Select Distillation

    Authors: Yujie Feng, Bo Liu, Xiaoyu Dong, Zexin Lu, Li-Ming Zhan, Xiao-Ming Wu, Albert Y. S. Lam

    Abstract: An ideal dialogue system requires continuous skill acquisition and adaptation to new tasks while retaining prior knowledge. Dialogue State Tracking (DST), vital in these systems, often involves learning new services and confronting catastrophic forgetting, along with a critical capability loss termed the "Value Selection Quandary." To address these challenges, we introduce the Reason-of-Select (Ro…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: Accepted to ACL 2024 Findings

  10. arXiv:2408.08208  [pdf, other]

    cs.IR cs.AI

    LLM4DSR: Leveraing Large Language Model for Denoising Sequential Recommendation

    Authors: Bohao Wang, Feng Liu, Jiawei Chen, Yudi Wu, Xingyu Lou, Jun Wang, Yan Feng, Chun Chen, Can Wang

    Abstract: Sequential recommendation systems fundamentally rely on users' historical interaction sequences, which are often contaminated by noisy interactions. Identifying these noisy interactions accurately without additional information is particularly difficult due to the lack of explicit supervisory signals to denote noise. Large Language Models (LLMs), equipped with extensive open knowledge and semantic…

    Submitted 15 August, 2024; originally announced August 2024.

  11. arXiv:2408.07353  [pdf, other]

    cs.CL

    Only One Relation Possible? Modeling the Ambiguity in Event Temporal Relation Extraction

    Authors: Yutong Hu, Quzhe Huang, Yansong Feng

    Abstract: Event Temporal Relation Extraction (ETRE) aims to identify the temporal relationship between two events, which plays an important role in natural language understanding. Most previous works follow a single-label classification style, classifying an event pair into either a specific temporal relation (e.g., Before, After), or a special label Vague when there may be multip…

    Submitted 14 August, 2024; originally announced August 2024.

  12. arXiv:2408.07137  [pdf, other]

    cs.CL

    ELLA: Empowering LLMs for Interpretable, Accurate and Informative Legal Advice

    Authors: Yutong Hu, Kangcheng Luo, Yansong Feng

    Abstract: Despite remarkable performance in legal consultation exhibited by legal Large Language Models (LLMs) combined with legal article retrieval components, there are still cases when the advice given is incorrect or baseless. To alleviate these problems, we propose ELLA, a tool for Empowering LLMs for interpretable, accurate, and informative Legal Advice. ELLA visually pres…

    Submitted 13 August, 2024; originally announced August 2024.

  13. arXiv:2408.07060  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

    Authors: Kexun Zhang, Weiran Yao, Zuxin Liu, Yihao Feng, Zhiwei Liu, Rithesh Murthy, Tian Lan, Lei Li, Renze Lou, Jiacheng Xu, Bo Pang, Yingbo Zhou, Shelby Heinecke, Silvio Savarese, Huan Wang, Caiming Xiong

    Abstract: Large language model (LLM) agents have shown great potential in solving real-world software engineering (SWE) problems. The most advanced open-source SWE agent can resolve over 27% of real GitHub issues in SWE-Bench Lite. However, these sophisticated agent frameworks exhibit varying strengths, excelling in certain tasks while underperforming in others. To fully harness the diversity of these agent…

    Submitted 13 August, 2024; originally announced August 2024.

  14. arXiv:2408.06608  [pdf, other]

    cs.AR cs.GR

    Potamoi: Accelerating Neural Rendering via a Unified Streaming Architecture

    Authors: Yu Feng, Weikai Lin, Zihan Liu, Jingwen Leng, Minyi Guo, Han Zhao, Xiaofeng Hou, Jieru Zhao, Yuhao Zhu

    Abstract: Neural Radiance Field (NeRF) has emerged as a promising alternative for photorealistic rendering. Despite recent algorithmic advancements, achieving real-time performance on today's resource-constrained devices remains challenging. In this paper, we identify the primary bottlenecks in current NeRF algorithms and introduce a unified algorithm-architecture co-design, Potamoi, designed to accommodate…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2404.11852

  15. arXiv:2408.05442  [pdf, other]

    cs.CL

    Chain of Condition: Construct, Verify and Solve Conditions for Conditional Question Answering

    Authors: Jiuheng Lin, Yuxuan Lai, Yansong Feng

    Abstract: Conditional question answering (CQA) is an important task that aims to find probable answers and identify conditions that need to be satisfied to support the answer. Existing approaches struggle with CQA due to two main challenges: (1) precisely identifying conditions and their logical relationship, and (2) verifying and solving the conditions. To address these challenges, we propose Chain of Cond…

    Submitted 10 August, 2024; originally announced August 2024.

  16. arXiv:2408.05353  [pdf, other]

    cs.IR cs.LG

    IntentRec: Predicting User Session Intent with Hierarchical Multi-Task Learning

    Authors: Sejoon Oh, Moumita Bhattacharya, Yesu Feng, Sudarshan Lamkhede

    Abstract: Recommender systems have played a critical role in diverse digital services such as e-commerce, streaming media, social networks, etc. If we know what a user's intent is in a given session (e.g. do they want to watch short videos or a movie or play games; are they shopping for a camping trip), it becomes easier to provide high-quality recommendations. In this paper, we introduce IntentRec, a novel…

    Submitted 25 July, 2024; originally announced August 2024.

  17. arXiv:2408.05200  [pdf, other]

    cs.CL cs.AI

    TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning

    Authors: Yujie Feng, Xu Chu, Yongxin Xu, Zexin Lu, Bo Liu, Philip S. Yu, Xiao-Ming Wu

    Abstract: Language model continual learning (CL) has recently attracted significant interest for its ability to adapt large language models (LLMs) to dynamic real-world scenarios without retraining. A major challenge in this domain is catastrophic forgetting, where models lose previously acquired knowledge upon learning new tasks. Existing approaches commonly utilize multiple parameter-efficient fine-tuning…

    Submitted 30 August, 2024; v1 submitted 9 August, 2024; originally announced August 2024.

    Comments: Extension of ACL 2024 paper titled: Continual Dialog State Tracking via Task Skill Localization and Consolidation

  18. arXiv:2408.04804  [pdf, other]

    cs.CV

    Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation

    Authors: Yifan Feng, Jiangang Huang, Shaoyi Du, Shihui Ying, Jun-Hai Yong, Yipeng Li, Guiguang Ding, Rongrong Ji, Yue Gao

    Abstract: We introduce Hyper-YOLO, a new object detection method that integrates hypergraph computations to capture the complex high-order correlations among visual features. Traditional YOLO models, while powerful, have limitations in their neck designs that restrict the integration of cross-level features and the exploitation of high-order feature interrelationships. To address these challenges, we propos…

    Submitted 8 August, 2024; originally announced August 2024.

  19. arXiv:2408.04683  [pdf, other]

    cs.CR cs.AI cs.SE

    Eliminating Backdoors in Neural Code Models via Trigger Inversion

    Authors: Weisong Sun, Yuchen Chen, Chunrong Fang, Yebo Feng, Yuan Xiao, An Guo, Quanjun Zhang, Yang Liu, Baowen Xu, Zhenyu Chen

    Abstract: Neural code models (NCMs) have been widely used for addressing various code understanding tasks, such as defect detection and clone detection. However, numerous recent studies reveal that such models are vulnerable to backdoor attacks. Backdoored NCMs function normally on normal code snippets, but exhibit adversary-expected behavior on poisoned code snippets injected with the adversary-crafted tri…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Under review

    MSC Class: 68-04; ACM Class: D.2.3; I.2.2; I.2.7

  20. arXiv:2408.03617  [pdf, other]

    cs.CL cs.AI cs.LG

    Is Child-Directed Speech Effective Training Data for Language Models?

    Authors: Steven Y. Feng, Noah D. Goodman, Michael C. Frank

    Abstract: While high-performing language models are typically trained on hundreds of billions of words, human children become fluent language users with a much smaller amount of data. What are the features of the data they receive, and how do these features support language modeling objectives? To investigate this question, we train GPT-2 models on 29M words of English-language child-directed speech and a n…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: Preprint. Code and data will be released soon

  21. arXiv:2408.03262  [pdf, other]

    cs.SE

    Towards Fixing Panic Bugs for Real-world Rust Programs

    Authors: Yunbo Ni, Yang Feng, Zixi Liu, Runtao Chen, Baowen Xu

    Abstract: The Rust programming language has garnered significant attention due to its robust safety features and memory management capabilities. Despite its guaranteed memory safety, Rust programs still suffer from runtime errors that are unmanageable, i.e., panic errors. Notably, over half of the bugs in rustc, Rust's own compiler, are attributable to crash stemming from panic errors. However, understandin…

    Submitted 6 August, 2024; originally announced August 2024.

  22. arXiv:2408.01394  [pdf, other]

    cs.CL

    Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features

    Authors: Mengyu Bu, Shuhao Gu, Yang Feng

    Abstract: The many-to-many multilingual neural machine translation can be regarded as the process of integrating semantic features from the source sentences and linguistic features from the target sentences. To enhance zero-shot translation, models need to share knowledge across languages, which can be achieved through auxiliary tasks for learning a universal representation or cross-lingual mapping. To this…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: Accepted by ACL2024 Findings

  23. arXiv:2407.21043  [pdf, other]

    cs.CL cs.AI cs.LG

    CP-Prompt: Composition-Based Cross-modal Prompting for Domain-Incremental Continual Learning

    Authors: Yu Feng, Zhen Tian, Yifan Zhu, Zongfu Han, Haoran Luo, Guangwei Zhang, Meina Song

    Abstract: The key challenge of cross-modal domain-incremental learning (DIL) is to enable the learning model to continuously learn from novel data with different feature distributions under the same task without forgetting old ones. However, existing top-performing methods still cause high forgetting rates, by lacking intra-domain knowledge extraction and inter-domain common prompting strategy. In this pape…

    Submitted 2 August, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM MM 2024

  24. An Efficient Convex-Hull Relaxation Based Algorithm for Multi-User Discrete Passive Beamforming

    Authors: Wenhai Lai, Zheyu Wu, Yi Feng, Kaiming Shen, Ya-Feng Liu

    Abstract: Intelligent reflecting surface (IRS) is an emerging technology to enhance spatial multiplexing in wireless networks. This letter considers the discrete passive beamforming design for IRS in order to maximize the minimum signal-to-interference-plus-noise ratio (SINR) among multiple users in an IRS-assisted downlink network. The main design difficulty lies in the discrete phase-shift constraint. Dif…

    Submitted 28 August, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

    Comments: 5 pages

    Journal ref: IEEE Signal Processing Letters 2024

  25. arXiv:2407.18492  [pdf]

    cs.CV

    Neural Modulation Alteration to Positive and Negative Emotions in Depressed Patients: Insights from fMRI Using Positive/Negative Emotion Atlas

    Authors: Yu Feng, Weiming Zeng, Yifan Xie, Hongyu Chen, Lei Wang, Yingying Wang, Hongjie Yan, Kaile Zhang, Ran Tao, Wai Ting Siok, Nizhuan Wang

    Abstract: Background: Although it has been noticed that depressed patients show differences in processing emotions, the precise neural modulation mechanisms of positive and negative emotions remain elusive. FMRI is a cutting-edge medical imaging technology renowned for its high spatial resolution and dynamic temporal information, making it particularly suitable for the neural dynamics of depression research…

    Submitted 25 July, 2024; originally announced July 2024.

  26. arXiv:2407.18483  [pdf]

    cs.CL cs.AI

    A Role-specific Guided Large Language Model for Ophthalmic Consultation Based on Stylistic Differentiation

    Authors: Laiyi Fu, Binbin Fan, Hongkai Du, Yanxiang Feng, Chunhua Li, Huping Song

    Abstract: Ophthalmology consultations are crucial for diagnosing, treating, and preventing eye diseases. However, the growing demand for consultations exceeds the availability of ophthalmologists. By leveraging large pre-trained language models, we can design effective dialogues for specific scenarios, aiding in consultations. Traditional fine-tuning strategies for question-answering tasks are impractical d…

    Submitted 31 July, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

  27. arXiv:2407.18333  [pdf, other]

    cs.AR cs.AI

    AutoVCoder: A Systematic Framework for Automated Verilog Code Generation using LLMs

    Authors: Mingzhe Gao, Jieru Zhao, Zhe Lin, Wenchao Ding, Xiaofeng Hou, Yu Feng, Chao Li, Minyi Guo

    Abstract: Recently, the use of large language models (LLMs) for software code generation, e.g., C/C++ and Python, has proven a great success. However, LLMs still suffer from low syntactic and functional correctness when it comes to the generation of register-transfer level (RTL) code, such as Verilog. To address this issue, in this paper, we develop AutoVCoder, a systematic open-source framework that signif…

    Submitted 21 July, 2024; originally announced July 2024.

  28. arXiv:2407.16959  [pdf, other]

    cs.LG

    Dynamic Graph Transformer with Correlated Spatial-Temporal Positional Encoding

    Authors: Zhe Wang, Sheng Zhou, Jiawei Chen, Zhen Zhang, Binbin Hu, Yan Feng, Chun Chen, Can Wang

    Abstract: Learning effective representations for Continuous-Time Dynamic Graphs (CTDGs) has garnered significant research interest, largely due to its powerful capabilities in modeling complex interactions between nodes. A fundamental and crucial requirement for representation learning in CTDGs is the appropriate estimation and preservation of proximity. However, due to the sparse and evolving characteristi…

    Submitted 23 July, 2024; originally announced July 2024.

  29. arXiv:2407.16667  [pdf, other]

    cs.CR cs.AI cs.CL

    RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent

    Authors: Huiyu Xu, Wenhui Zhang, Zhibo Wang, Feng Xiao, Rui Zheng, Yunhe Feng, Zhongjie Ba, Kui Ren

    Abstract: Recently, advanced Large Language Models (LLMs) such as GPT-4 have been integrated into many real-world applications like Code Copilot. These applications have significantly expanded the attack surface of LLMs, exposing them to a variety of threats. Among them, jailbreak attacks that induce toxic responses through jailbreak prompts have raised critical safety concerns. To identify these threats, a…

    Submitted 23 July, 2024; originally announced July 2024.

  30. arXiv:2407.15376  [pdf, other]

    cs.MM

    Structure-Aware Residual-Center Representation for Self-Supervised Open-Set 3D Cross-Modal Retrieval

    Authors: Yang Xu, Yifan Feng, Yu Jiang

    Abstract: Existing methods of 3D cross-modal retrieval heavily lean on category distribution priors within the training set, which diminishes their efficacy when tasked with unseen categories under open-set environments. To tackle this problem, we propose the Structure-Aware Residual-Center Representation (SRCR) framework for self-supervised open-set 3D cross-modal retrieval. To address the center deviation…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: ICME 2024

  31. arXiv:2407.15309  [pdf, other]

    cs.DC cs.LG

    vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving

    Authors: Jiale Xu, Rui Zhang, Cong Guo, Weiming Hu, Zihan Liu, Feiyang Wu, Yu Feng, Shixuan Sun, Changxu Shao, Yuhong Guo, Junping Zhao, Ke Zhang, Minyi Guo, Jingwen Leng

    Abstract: Large Language Models (LLMs) are widely used across various domains, processing millions of daily requests. This surge in demand poses significant challenges in optimizing throughput and latency while keeping costs manageable. The Key-Value (KV) cache, a standard method for retaining previous computations, makes LLM inference highly bounded by memory. While batching strategies can enhance performa…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 16 pages, 12 figures

  32. arXiv:2407.14207  [pdf, other]

    cs.LG

    Longhorn: State Space Models are Amortized Online Learners

    Authors: Bo Liu, Rui Wang, Lemeng Wu, Yihao Feng, Peter Stone, Qiang Liu

    Abstract: The most fundamental capability of modern AI methods such as Large Language Models (LLMs) is the ability to predict the next token in a long sequence of tokens, known as "sequence modeling." Although the Transformers model is the current dominant approach to sequence modeling, its quadratic computational cost with respect to sequence length is a significant drawback. State-space models (SSMs) off…

    Submitted 31 July, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

  33. arXiv:2407.13328  [pdf, other]

    cs.CV

    Unsupervised Domain Adaptive Lane Detection via Contextual Contrast and Aggregation

    Authors: Kunyang Zhou, Yunjian Feng, Jun Li

    Abstract: This paper focuses on two crucial issues in domain-adaptive lane detection, i.e., how to effectively learn discriminative features and transfer knowledge across domains. Existing lane detection methods usually exploit a pixel-wise cross-entropy loss to train detection models. However, the loss ignores the difference in feature representation among lanes, which leads to inefficient feature learning…

    Submitted 18 July, 2024; originally announced July 2024.

  34. arXiv:2407.11578  [pdf, other]

    cs.CV eess.IV

    UP-Diff: Latent Diffusion Model for Remote Sensing Urban Prediction

    Authors: Zeyu Wang, Zecheng Hao, Jingyu Lin, Yuchao Feng, Yufei Guo

    Abstract: This study introduces a novel Remote Sensing (RS) Urban Prediction (UP) task focused on future urban planning, which aims to forecast urban layouts by utilizing information from existing urban layouts and planned change maps. To address the proposed RS UP task, we propose UP-Diff, which leverages a Latent Diffusion Model (LDM) to capture position-aware embeddings of pre-change urban layouts and pla…

    Submitted 16 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: 5 pages, 4 figures

  35. arXiv:2407.11550  [pdf, other]

    cs.CL cs.AI

    Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference

    Authors: Yuan Feng, Junlin Lv, Yukun Cao, Xike Xie, S. Kevin Zhou

    Abstract: Large Language Models have excelled in various fields but encounter challenges in memory and time efficiency due to the expanding Key-Value (KV) cache required for long-sequence inference. Recent efforts try to reduce KV cache size to a given memory budget by evicting vast non-critical cache elements during runtime, while preserving generation quality. Our revisiting of current eviction methods re…

    Submitted 16 August, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

  36. arXiv:2407.10290  [pdf, other]

    cs.AR

    Switch-Less Dragonfly on Wafers: A Scalable Interconnection Architecture based on Wafer-Scale Integration

    Authors: Yinxiao Feng, Kaisheng Ma

    Abstract: Existing high-performance computing (HPC) interconnection architectures are based on high-radix switches, which limits the injection/local performance and introduces latency/energy/cost overhead. The new wafer-scale packaging and high-speed wireline technologies provide high-density, low-latency, and high-bandwidth connectivity, thus promising to support direct-connected high-radix interconnection…

    Submitted 26 August, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

    Journal ref: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2024

  37. arXiv:2407.09920  [pdf, other]

    cs.CV

    MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection

    Authors: Ziyue Huang, Yongchao Feng, Qingjie Liu, Yunhong Wang

    Abstract: Detection pre-training methods for the DETR series detector have been extensively studied in natural scenes, e.g., DETReg. However, the detection pre-training remains unexplored in remote sensing scenes. In existing pre-training methods, alignment between object embeddings extracted from a pre-trained backbone and detector features is significant. However, due to differences in feature extraction…

    Submitted 24 July, 2024; v1 submitted 13 July, 2024; originally announced July 2024.

    Comments: 14 pages, 4 figures; Accepted to ECCV 2024

  38. Gap Completion in Point Cloud Scene occluded by Vehicles using SGC-Net

    Authors: Yu Feng, Yiming Xu, Yan Xia, Claus Brenner, Monika Sester

    Abstract: Recent advances in mobile mapping systems have greatly enhanced the efficiency and convenience of acquiring urban 3D data. These systems utilize LiDAR sensors mounted on vehicles to capture vast cityscapes. However, a significant challenge arises due to occlusions caused by roadside parked vehicles, leading to the loss of scene information, particularly on the roads, sidewalks, curbs, and the lowe…

    Submitted 11 July, 2024; originally announced July 2024.

    Journal ref: ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 215, Sep. 2024

  39. arXiv:2407.07400  [pdf]

    cond-mat.mtrl-sci cs.HC physics.bio-ph

    Invisible sweat sensor: ultrathin membrane mimics skin for stress monitoring

    Authors: Yuchen Feng, Andreas Kenny Oktavius, Reno Adley Prawoto, Hing Ni Ko, Qiao Gu, Ping Gao

    Abstract: Epidermal skin sensors have emerged as a promising approach for continuous and noninvasive monitoring of vital health signals, but to maximize their performance, these sensors must integrate seamlessly with the skin, minimizing impedance while maintaining the skin's natural protective and regulatory functions. In this study, we introduce an imperceptible sweat sensor that achieves this seamless ski…

    Submitted 10 July, 2024; originally announced July 2024.

  40. FORAY: Towards Effective Attack Synthesis against Deep Logical Vulnerabilities in DeFi Protocols

    Authors: Hongbo Wen, Hanzhi Liu, Jiaxin Song, Yanju Chen, Wenbo Guo, Yu Feng

    Abstract: Blockchain adoption has surged with the rise of Decentralized Finance (DeFi) applications. However, the significant value of digital assets managed by DeFi protocols makes them prime targets for attacks. Current smart contract vulnerability detection tools struggle with DeFi protocols due to deep logical bugs arising from complex financial interactions between multiple smart contracts. These tools…

    Submitted 30 August, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  41. arXiv:2407.05283  [pdf, other]

    cs.CV

    SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning

    Authors: Yi Feng, Zizhan Guo, Qijun Chen, Rui Fan

    Abstract: Unsupervised monocular depth estimation frameworks have shown promising performance in autonomous driving. However, existing solutions primarily rely on a simple convolutional neural network for ego-motion recovery, which struggles to estimate precise camera poses in dynamic, complicated real-world scenarios. These inaccurately estimated camera poses can inevitably deteriorate the photometric reco…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted by IEEE Transactions on Intelligent Vehicles. Code is available at https://mias.group/SCIPaD

  42. arXiv:2407.03994  [pdf, other]

    cs.CL cs.AI

    Unlocking the Potential of Model Merging for Low-Resource Languages

    Authors: Mingxu Tao, Chen Zhang, Quzhe Huang, Tianyao Ma, Songfang Huang, Dongyan Zhao, Yansong Feng

    Abstract: Adapting large language models (LLMs) to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT). However, this CT-then-SFT approach struggles with limited data in the context of low-resource languages, failing to balance language modeling and task-solving capabilities. We thus propose model merging as an alternative for low-resource languages, combini…

    Submitted 9 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  43. arXiv:2407.03993  [pdf, other]

    cs.CL

    A Survey on Natural Language Counterfactual Generation

    Authors: Yongjie Wang, Xiaoqi Qiu, Yu Yue, Xu Guo, Zhiwei Zeng, Yuhong Feng, Zhiqi Shen

    Abstract: Natural Language Counterfactual generation aims to minimally modify a given text such that the modified text will be classified into a different class. The generated counterfactuals provide insight into the reasoning behind a model's predictions by highlighting which words significantly influence the outcomes. Additionally, they can be used to detect model fairness issues or augment the training d…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: A survey paper

    MSC Class: 68T50; ACM Class: I.2.7

  44. arXiv:2407.02043  [pdf, other]

    cs.CL

    Concise and Precise Context Compression for Tool-Using Language Models

    Authors: Yang Xu, Yunlong Feng, Honglin Mu, Yutai Hou, Yitong Li, Xinghao Wang, Wanjun Zhong, Zhongyang Li, Dandan Tu, Qingfu Zhu, Min Zhang, Wanxiang Che

    Abstract: Through reading the documentation in the context, tool-using language models can dynamically extend their capability using external tools. The cost is that we have to input lengthy documentation every time the model needs to use the tool, occupying the input window as well as slowing down the decoding process. Given the progress in general-purpose compression, soft context compression is a suita…

    Submitted 2 July, 2024; originally announced July 2024.

  45. arXiv:2407.01029  [pdf, other]

    cs.CV

    EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting

    Authors: Chenxin Li, Brandon Y. Feng, Yifan Liu, Hengyu Liu, Cheng Wang, Weihao Yu, Yixuan Yuan

    Abstract: 3D reconstruction of biological tissues from a collection of endoscopic images is a key to unlock various important downstream surgical applications with 3D capabilities. Existing methods employ various advanced neural rendering techniques for photorealistic view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available, which is usually…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI 2024

  46. arXiv:2407.00435  [pdf, other]

    cs.GR

    RTGS: Enabling Real-Time Gaussian Splatting on Mobile Devices Using Efficiency-Guided Pruning and Foveated Rendering

    Authors: Weikai Lin, Yu Feng, Yuhao Zhu

    Abstract: Point-Based Neural Rendering (PBNR), i.e., the 3D Gaussian Splatting-family algorithms, emerges as a promising class of rendering techniques, which are permeating all aspects of society, driven by a growing demand for real-time, photorealistic rendering in AR/VR and digital twins. Achieving real-time PBNR on mobile devices is challenging. This paper proposes RTGS, a PBNR system that for the firs…

    Submitted 2 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

    Comments: 9 pages

    MSC Class: I.3; I.2

  47. arXiv:2406.18547  [pdf]

    eess.IV cs.CV

    Enhancing Medical Imaging with GANs Synthesizing Realistic Images from Limited Data

    Authors: Yinqiu Feng, Bo Zhang, Lingxi Xiao, Yutian Yang, Tana Gegen, Zexi Chen

    Abstract: In this research, we introduce an innovative method for synthesizing medical images using generative adversarial networks (GANs). Our proposed GANs method demonstrates the capability to produce realistic synthetic images even when trained on a limited quantity of real medical image data, showcasing commendable generalization prowess. To achieve this, we devised a generator and discriminator networ…

    Submitted 22 May, 2024; originally announced June 2024.

  48. arXiv:2406.18518  [pdf, other]

    cs.CL cs.AI cs.LG cs.SE

    APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

    Authors: Zuxin Liu, Thai Hoang, Jianguo Zhang, Ming Zhu, Tian Lan, Shirley Kokane, Juntao Tan, Weiran Yao, Zhiwei Liu, Yihao Feng, Rithesh Murthy, Liangwei Yang, Silvio Savarese, Juan Carlos Niebles, Huan Wang, Shelby Heinecke, Caiming Xiong

    Abstract: The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scal…

    Submitted 26 June, 2024; originally announced June 2024.

  49. arXiv:2406.17233  [pdf, other]

    cs.SE cs.CL

    Self-Constructed Context Decompilation with Fined-grained Alignment Enhancement

    Authors: Yunlong Feng, Yang Xu, Dechuan Teng, Honglin Mu, Xiao Xu, Libo Qin, Wanxiang Che, Qingfu Zhu

    Abstract: Decompilation transforms compiled code back into a high-level programming language for analysis when source code is unavailable. Previous work has primarily focused on enhancing decompilation performance by increasing the scale of model parameters or training data for pre-training. Based on the characteristics of the decompilation task, we propose two methods: (1) Without fine-tuning, the Self-Con…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Under Review

  50. arXiv:2406.16982  [pdf]

    cs.LG cs.AI

    Research on Disease Prediction Model Construction Based on Computer AI deep Learning Technology

    Authors: Yang Lin, Muqing Li, Ziyi Zhu, Yinqiu Feng, Lingxi Xiao, Zexi Chen

    Abstract: The prediction of disease risk factors can screen vulnerable groups for effective prevention and treatment, so as to reduce their morbidity and mortality. Machine learning has a great demand for high-quality labeling information, and labeling noise in medical big data poses a great challenge to efficient disease risk warning methods. Therefore, this project intends to study the robust learning alg…

    Submitted 23 June, 2024; originally announced June 2024.