Showing 1–50 of 667 results for author: Lin, S

Searching in archive cs.
  1. arXiv:2408.14354  [pdf, other]

    cs.SE cs.AI cs.CL

    SWE-bench-java: A GitHub Issue Resolving Benchmark for Java

    Authors: Daoguang Zan, Zhirong Huang, Ailun Yu, Shaoxin Lin, Yifan Shi, Wei Liu, Dong Chen, Zongshuai Qi, Hao Yu, Lei Yu, Dezhi Ran, Muhan Zeng, Bo Shen, Pan Bian, Guangtai Liang, Bei Guan, Pengjie Huang, Tao Xie, Yongji Wang, Qianxiang Wang

    Abstract: GitHub issue resolving is a critical task in software engineering, recently gaining significant attention in both industry and academia. Within this task, SWE-bench has been released to evaluate issue resolving capabilities of large language models (LLMs), but has so far focused only on Python. However, supporting more programming languages is also important, as there is a strong demand in…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: This work is in progress

  2. arXiv:2408.13890  [pdf, other]

    cs.CV

    Making Large Language Models Better Planners with Reasoning-Decision Alignment

    Authors: Zhijian Huang, Tao Tang, Shaoxiang Chen, Sihao Lin, Zequn Jie, Lin Ma, Guangrun Wang, Xiaodan Liang

    Abstract: Data-driven approaches for autonomous driving (AD) have been widely adopted in the past decade but are confronted with dataset bias and uninterpretability. Inspired by the knowledge-driven nature of human driving, recent approaches explore the potential of large language models (LLMs) to improve understanding and decision-making in traffic scenarios. They find that the pretrain-finetune paradigm o…

    Submitted 25 August, 2024; originally announced August 2024.

  3. arXiv:2408.13759  [pdf, other]

    cs.RO

    MASQ: Multi-Agent Reinforcement Learning for Single Quadruped Robot Locomotion

    Authors: Qi Liu, Jingxiang Guo, Sixu Lin, Shuaikang Ma, Jinxuan Zhu, Yanjie Li

    Abstract: This paper proposes a novel method to improve locomotion learning for a single quadruped robot using multi-agent deep reinforcement learning (MARL). Many existing methods use single-agent reinforcement learning for an individual robot or MARL for the cooperative task in multi-robot systems. Unlike existing methods, this paper proposes using MARL for the locomotion learning of a single quadruped ro…

    Submitted 25 August, 2024; originally announced August 2024.

  4. arXiv:2408.12454  [pdf, other]

    cs.CV cs.AI

    Relaxed Rotational Equivariance via $G$-Biases in Vision

    Authors: Zhiqiang Wu, Licheng Sun, Yingjie Liu, Jian Yang, Hanlin Dong, Shing-Ho J. Lin, Xuan Tang, Jinpeng Mi, Bo Jin, Xian Wei

    Abstract: Group Equivariant Convolution (GConv) can effectively handle rotational symmetry data. It assumes uniform and strict rotational symmetry across all features, as the transformations under the specific group. However, real-world data rarely conforms to strict rotational symmetry, a phenomenon commonly referred to as Rotational Symmetry-Breaking in the system or dataset, making GConv unable to adapt effectively t…

    Submitted 25 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

  5. arXiv:2408.12416  [pdf, other]

    cs.SE cs.LG

    Unlearning Trojans in Large Language Models: A Comparison Between Natural Language and Source Code

    Authors: Mahdi Kazemi, Aftab Hussain, Md Rafiqul Islam Rabin, Mohammad Amin Alipour, Sen Lin

    Abstract: This work investigates the application of Machine Unlearning (MU) for mitigating the impact of trojans embedded in conventional large language models of natural language (Text-LLMs) and large language models of code (Code-LLMs). We propose a novel unlearning approach, LYA, that leverages both gradient ascent and elastic weight consolidation, a Fisher Information Matrix (FIM) based regularization te…

    Submitted 22 August, 2024; originally announced August 2024.

  6. arXiv:2408.10555  [pdf, other]

    cs.LG cs.IR

    Target-Prompt Online Graph Collaborative Learning for Temporal QoS Prediction

    Authors: Shengxiang Hu, Guobing Zou, Song Yang, Shiyi Lin, Bofeng Zhang, Yixin Chen

    Abstract: In service-oriented architecture, accurately predicting the Quality of Service (QoS) is vital for maintaining reliability and enhancing user satisfaction. However, current methods often neglect high-order latent collaborative relationships and fail to dynamically adjust feature learning for specific user-service invocations, which are critical for precise feature extraction. Moreover, relying on R…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

    MSC Class: 68T99 ACM Class: H.4.0; I.2.0

  7. arXiv:2408.10536  [pdf, other]

    cs.IR cs.CL

    Synergistic Approach for Simultaneous Optimization of Monolingual, Cross-lingual, and Multilingual Information Retrieval

    Authors: Adel Elmahdy, Sheng-Chieh Lin, Amin Ahmad

    Abstract: Information retrieval across different languages is an increasingly important challenge in natural language processing. Recent approaches based on multilingual pre-trained language models have achieved remarkable success, yet they often optimize for either monolingual, cross-lingual, or multilingual retrieval performance at the expense of others. This paper proposes a novel hybrid batch training s…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 15 pages, 2 figures, 13 tables

  8. arXiv:2408.09798  [pdf, other]

    cs.LG

    Enhance Modality Robustness in Text-Centric Multimodal Alignment with Adversarial Prompting

    Authors: Yun-Da Tsai, Ting-Yu Yen, Keng-Te Liao, Shou-De Lin

    Abstract: Converting different modalities into generalized text, which then serves as input prompts for large language models (LLMs), is a common approach for aligning multimodal models, particularly when pairwise data is limited. The text-centric alignment method leverages the unique properties of text as a modality space, transforming diverse inputs into a unified textual representation, thereby enabling down…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: arXiv admin note: text overlap with arXiv:2407.05036

  9. arXiv:2408.07966  [pdf, other]

    cs.LG cs.DC

    Addressing Skewed Heterogeneity via Federated Prototype Rectification with Personalization

    Authors: Shunxin Guo, Hongsong Wang, Shuxia Lin, Zhiqiang Kou, Xin Geng

    Abstract: Federated learning is an efficient framework designed to facilitate collaborative model training across multiple distributed devices while preserving user data privacy. A significant challenge of federated learning is data-level heterogeneity, i.e., skewed or long-tailed distribution of private data. Although various methods have been proposed to address this challenge, most of them assume that th…

    Submitted 22 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

  10. arXiv:2408.05457  [pdf, other]

    cs.CL cs.AI

    Investigating Instruction Tuning Large Language Models on Graphs

    Authors: Kerui Zhu, Bo-Wei Huang, Bowen Jin, Yizhu Jiao, Ming Zhong, Kevin Chang, Shou-De Lin, Jiawei Han

    Abstract: Inspired by the recent advancements of Large Language Models (LLMs) in NLP tasks, there's growing interest in applying LLMs to graph-related tasks. This study delves into the capabilities of instruction-following LLMs for engaging with real-world graphs, aiming to offer empirical insights into how LLMs can effectively interact with graphs and generalize across graph tasks. We begin by constructing…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: COLM 2024

  11. arXiv:2408.01604  [pdf, other]

    cs.RO eess.SY

    Efficient Data-driven Joint-level Calibration of Cable-driven Surgical Robots

    Authors: Haonan Peng, Andrew Lewis, Yun-Hsuan Su, Shan Lin, Dun-Tin Chiang, Wenfan Jiang, Helen Lai, Blake Hannaford

    Abstract: Knowing accurate joint positions is crucial for safe and precise control of laparoscopic surgical robots, especially for the automation of surgical sub-tasks. These robots have often been designed with cable-driven arms and tools because cables allow for larger motors to be placed at the base of the robot, further from the operating area where space is at a premium. However, by connecting the join…

    Submitted 2 August, 2024; originally announced August 2024.

  12. arXiv:2407.21783  [pdf, other]

    cs.AI cs.CL cs.CV

    The Llama 3 Herd of Models

    Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang , et al. (510 additional authors not shown)

    Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical…

    Submitted 15 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  13. arXiv:2407.19826  [pdf, other]

    cs.RO

    Design and Control of a Novel Six-Degree-of-Freedom Hybrid Robotic Arm

    Authors: Yang Chen, Zhonghua Miao, Yuanyue Ge, Sen Lin, Liping Chen, Ya Xiong

    Abstract: Robotic arms are key components in fruit-harvesting robots. In agricultural settings, conventional serial or parallel robotic arms often fall short in meeting the demands for a large workspace, rapid movement, enhanced capability of obstacle avoidance and affordability. This study proposes a novel hybrid six-degree-of-freedom (DoF) robotic arm that combines the advantages of parallel and serial me…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Accepted by IROS 2024

  14. arXiv:2407.18992  [pdf, other]

    cs.AI

    Towards Automated Solution Recipe Generation for Industrial Asset Management with LLM

    Authors: Nianjun Zhou, Dhaval Patel, Shuxin Lin, Fearghal O'Donncha

    Abstract: This study introduces a novel approach to Industrial Asset Management (IAM) by incorporating Conditional-Based Management (CBM) principles with the latest advancements in Large Language Models (LLMs). Our research introduces an automated model-building process, traditionally reliant on intensive collaboration between data scientists and domain experts. We present two primary innovations: a taxonom…

    Submitted 25 July, 2024; originally announced July 2024.

  15. arXiv:2407.17875  [pdf, other]

    cs.NE

    Overcoming Binary Adversarial Optimisation with Competitive Coevolution

    Authors: Per Kristian Lehre, Shishen Lin

    Abstract: Co-evolutionary algorithms (CoEAs), which pair candidate designs with test cases, are frequently used in adversarial optimisation, particularly for binary test-based problems where designs and tests yield binary outcomes. The effectiveness of designs is determined by their performance against tests, and the value of tests is based on their ability to identify failing designs, often leading to more…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: 38 pages, Accepted at the 18th International Conference on Parallel Problem Solving From Nature (PPSN 2024)

  16. arXiv:2407.17654  [pdf, other]

    cs.LG stat.ML

    Generative Learning for Simulation of Vehicle Faults

    Authors: Patrick Kuiper, Sirui Lin, Jose Blanchet, Vahid Tarokh

    Abstract: We develop a novel generative model to simulate vehicle health and forecast faults, conditioned on practical operational considerations. The model, trained on data from the US Army's Predictive Logistics program, aims to support predictive maintenance. It forecasts faults far enough in advance to execute a maintenance intervention before a breakdown occurs. The model incorporates real-world factor…

    Submitted 30 July, 2024; v1 submitted 24 July, 2024; originally announced July 2024.

  17. arXiv:2407.16205  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    Figure it Out: Analyzing-based Jailbreak Attack on Large Language Models

    Authors: Shi Lin, Rongchang Li, Xun Wang, Changting Lin, Wenpeng Xing, Meng Han

    Abstract: The rapid development of Large Language Models (LLMs) has brought remarkable generative capabilities across diverse tasks. However, despite the impressive achievements, these LLMs still have numerous inherent vulnerabilities, particularly when faced with jailbreak attacks. By investigating jailbreak attacks, we can uncover hidden weaknesses in LLMs and inform the development of more robust defense…

    Submitted 13 August, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

  18. arXiv:2407.12568  [pdf, other]

    cs.CV

    LTRL: Boosting Long-tail Recognition via Reflective Learning

    Authors: Qihao Zhao, Yalun Dai, Shen Lin, Wei Hu, Fan Zhang, Jun Liu

    Abstract: In real-world scenarios, knowledge distributions exhibit a long tail. Humans manage to master knowledge uniformly across imbalanced distributions, a feat attributed to their diligent practices of reviewing, summarizing, and correcting errors. Motivated by this learning process, we propose a novel learning paradigm, called reflecting learning, in handling long-tail recognition. Our method integ…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: ECCV2024

  19. arXiv:2407.09911  [pdf, other]

    cs.HC cs.CV cs.LG eess.SP

    SensEmo: Enabling Affective Learning through Real-time Emotion Recognition with Smartwatches

    Authors: Kushan Choksi, Hongkai Chen, Karan Joshi, Sukrutha Jade, Shahriar Nirjon, Shan Lin

    Abstract: Recent research has demonstrated the capability of physiological signals to infer both user emotional and attention responses. This presents an opportunity for leveraging widely available physiological sensors in smartwatches, to detect real-time emotional cues in users, such as stress and excitement. In this paper, we introduce SensEmo, a smartwatch-based system designed for affective learning. S…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: 7 pages, 7 figures, 2 tables. IEEE MASS 2024

    ACM Class: C.3.3; J.3.2; J.4.2

  20. arXiv:2407.08561  [pdf, other]

    cs.CV

    MapLocNet: Coarse-to-Fine Feature Registration for Visual Re-Localization in Navigation Maps

    Authors: Hang Wu, Zhenghao Zhang, Siyuan Lin, Xiangru Mu, Qiang Zhao, Ming Yang, Tong Qin

    Abstract: Robust localization is the cornerstone of autonomous driving, especially in challenging urban environments where GPS signals suffer from multipath errors. Traditional localization approaches rely on high-definition (HD) maps, which consist of precisely annotated landmarks. However, building HD maps is expensive and challenging to scale up. Given these limitations, leveraging navigation maps has eme…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: IROS 2024 (Oral)

  21. arXiv:2407.08526  [pdf, other]

    cs.CV

    BLOS-BEV: Navigation Map Enhanced Lane Segmentation Network, Beyond Line of Sight

    Authors: Hang Wu, Zhenghao Zhang, Siyuan Lin, Tong Qin, Jin Pan, Qiang Zhao, Chunjing Xu, Ming Yang

    Abstract: Bird's-eye-view (BEV) representation is crucial for the perception function in autonomous driving tasks. It is difficult to balance the accuracy, efficiency and range of BEV representation. The existing works are restricted to a limited perception range within 50 meters. Extending the BEV representation range can greatly benefit downstream tasks such as topology reasoning, scene understanding, and…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: IEEE IV 2024

  22. arXiv:2407.05036  [pdf, other]

    cs.CL cs.LG

    Enhance the Robustness of Text-Centric Multimodal Alignments

    Authors: Ting-Yu Yen, Yun-Da Tsai, Keng-Te Liao, Shou-De Lin

    Abstract: Converting different modalities into general text, serving as input prompts for large language models (LLMs), is a common method to align multimodal models when there is limited pairwise data. This text-centric approach leverages the unique properties of text as a modality space, transforming diverse inputs into a unified textual representation. This enables downstream models to effectively interp…

    Submitted 6 July, 2024; originally announced July 2024.

  23. arXiv:2407.03566  [pdf, ps, other]

    cs.IT eess.SP

    Stacked Intelligent Metasurfaces for Wireless Sensing and Communication: Applications and Challenges

    Authors: Hao Liu, Jiancheng An, Xing Jia, Shining Lin, Xianghao Yao, Lu Gan, Bruno Clerckx, Chau Yuen, Mehdi Bennis, Mérouane Debbah

    Abstract: The rapid advancement of wireless communication technologies has precipitated an unprecedented demand for high data rates, extremely low latency, and ubiquitous connectivity. In order to achieve these goals, stacked intelligent metasurfaces (SIM) have been developed as a novel solution to perform advanced signal processing tasks directly in the electromagnetic wave domain, thus achieving ultra-fast…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 8 pages, 5 figures, 1 table

  24. arXiv:2407.00009  [pdf, other]

    cs.DC cs.NI

    An Open-Source Fast Parallel Routing Approach for Commercial FPGAs

    Authors: Xinshi Zang, Wenhao Lin, Shiju Lin, Jinwei Liu, Evangeline F. Y. Young

    Abstract: In the face of escalating complexity and size of contemporary FPGAs and circuits, routing emerges as a pivotal and time-intensive phase in FPGA compilation flows. In response to this challenge, we present an open-source parallel routing methodology designed to expedite routing procedures for commercial FPGAs. Our approach introduces a novel recursive partitioning ternary tree to augment the parall…

    Submitted 25 April, 2024; originally announced July 2024.

  25. arXiv:2406.19394  [pdf, other]

    cs.CV

    HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection

    Authors: Liujuan Cao, Jianghang Lin, Zebo Hong, Yunhang Shen, Shaohui Lin, Chao Chen, Rongrong Ji

    Abstract: Most WSOD methods rely on traditional object proposals to generate candidate regions and are confronted with unstable training, which easily gets stuck in a poor local optimum. In this paper, we introduce a unified, high-capacity weakly supervised object detection (WSOD) network called HUWSOD, which utilizes a comprehensive self-training framework without needing external modules or additional sup…

    Submitted 27 June, 2024; originally announced June 2024.

  26. arXiv:2406.16437  [pdf, other]

    cs.LG cs.AI

    Theory on Mixture-of-Experts in Continual Learning

    Authors: Hongbo Li, Sen Lin, Lingjie Duan, Yingbin Liang, Ness B. Shroff

    Abstract: Continual learning (CL) has garnered significant attention because of its ability to adapt to new tasks that arrive over time. Catastrophic forgetting (of old tasks) has been identified as a major issue in CL, as the model adapts to new tasks. The Mixture-of-Experts (MoE) model has recently been shown to effectively mitigate catastrophic forgetting in CL, by employing a gating network to sparsify…

    Submitted 24 June, 2024; originally announced June 2024.

  27. arXiv:2406.12433  [pdf, other]

    cs.IR

    LLM-enhanced Reranking in Recommender Systems

    Authors: Jingtong Gao, Bo Chen, Xiangyu Zhao, Weiwen Liu, Xiangyang Li, Yichao Wang, Zijian Zhang, Wanyu Wang, Yuyang Ye, Shanru Lin, Huifeng Guo, Ruiming Tang

    Abstract: Reranking is a critical component in recommender systems, playing an essential role in refining the output of recommendation algorithms. Traditional reranking models have focused predominantly on accuracy, but modern applications demand consideration of additional criteria such as diversity and fairness. Existing reranking approaches often fail to harmonize these diverse criteria effectively at th…

    Submitted 20 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  28. arXiv:2406.11251  [pdf, other]

    cs.IR

    Unifying Multimodal Retrieval via Document Screenshot Embedding

    Authors: Xueguang Ma, Sheng-Chieh Lin, Minghan Li, Wenhu Chen, Jimmy Lin

    Abstract: In the real world, documents are organized in different formats and varied modalities. Traditional retrieval pipelines require tailored document parsing techniques and content extraction modules to prepare input for indexing. This process is tedious, prone to errors, and has information loss. To this end, we propose Document Screenshot Embedding (DSE), a novel retrieval paradigm that regards docu…

    Submitted 17 June, 2024; originally announced June 2024.

  29. arXiv:2406.10961  [pdf, other]

    cs.CV cs.AI cs.CY

    Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP

    Authors: Shuyang Lin, Tong Jia, Hao Wang, Bowen Ma, Mingyuan Li, Dongyue Chen

    Abstract: X-ray prohibited item detection is an essential component of security checks, and the categories of prohibited items are continuously increasing in accordance with the latest laws. Previous works all focus on close-set scenarios, which can only recognize known categories used for training and often require time-consuming as well as labor-intensive annotations when learning novel categories, resulting in…

    Submitted 16 June, 2024; originally announced June 2024.

  30. arXiv:2406.10280  [pdf, other]

    cs.CR cs.CL cs.LG

    Transferable Embedding Inversion Attack: Uncovering Privacy Risks in Text Embeddings without Model Queries

    Authors: Yu-Hsiang Huang, Yuche Tsai, Hsiang Hsiao, Hong-Yi Lin, Shou-De Lin

    Abstract: This study investigates the privacy risks associated with text embeddings, focusing on the scenario where attackers cannot access the original embedding model. Contrary to previous research requiring direct model access, we explore a more realistic threat model by developing a transfer attack method. This approach uses a surrogate model to mimic the victim model's behavior, allowing the attacker t…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL 2024 Main Conference

  31. arXiv:2406.06253  [pdf, other]

    eess.SY cs.PL

    PretVM: Predictable, Efficient Virtual Machine for Real-Time Concurrency

    Authors: Shaokai Lin, Erling Jellum, Mirco Theile, Tassilo Tanneberger, Binqi Sun, Chadlia Jerad, Ruomu Xu, Guangyu Feng, Christian Menard, Marten Lohstroh, Jeronimo Castrillon, Sanjit Seshia, Edward Lee

    Abstract: This paper introduces the Precision-Timed Virtual Machine (PretVM), an intermediate platform facilitating the execution of quasi-static schedules compiled from a subset of programs written in the Lingua Franca (LF) coordination language. The subset consists of those programs that in principle should have statically verifiable and predictable timing behavior. The PretVM provides a schedule with wel…

    Submitted 25 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  32. arXiv:2406.02787  [pdf, other]

    cs.CL cs.AI cs.LG

    Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities

    Authors: Wenyue Hua, Kaijie Zhu, Lingyao Li, Lizhou Fan, Shuhang Lin, Mingyu Jin, Haochen Xue, Zelong Li, JinDong Wang, Yongfeng Zhang

    Abstract: This study intends to systematically disentangle pure logic reasoning and text understanding by investigating the contrast across abstract and contextualized logical problems from a comprehensive set of domains. We explore whether LLMs demonstrate genuine reasoning capabilities across various domains when the underlying logical structure remains constant. We focus on two main questions: (1) Can abs…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 22 pages, 9 figures

  33. arXiv:2406.01304  [pdf, other]

    cs.CL cs.AI cs.SE

    CodeR: Issue Resolving with Multi-Agent and Task Graphs

    Authors: Dong Chen, Shaoxin Lin, Muhan Zeng, Daoguang Zan, Jian-Gang Wang, Anton Cheshkov, Jun Sun, Hao Yu, Guoliang Dong, Artem Aliev, Jie Wang, Xiao Cheng, Guangtai Liang, Yuchi Ma, Pan Bian, Tao Xie, Qianxiang Wang

    Abstract: GitHub issue resolving recently has attracted significant attention from academia and industry. SWE-bench is proposed to measure the performance in resolving issues. In this paper, we propose CodeR, which adopts a multi-agent framework and pre-defined task graphs to Repair & Resolve reported bugs and add new features within code Repository. On SWE-bench lite, CodeR is able to solve 28.33% of issue…

    Submitted 10 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: https://github.com/NL2Code/CodeR

  34. arXiv:2406.00427  [pdf, other]

    cs.CV

    You Only Need Less Attention at Each Stage in Vision Transformers

    Authors: Shuoxi Zhang, Hanpeng Liu, Stephen Lin, Kun He

    Abstract: The advent of Vision Transformers (ViTs) marks a substantial paradigm shift in the realm of computer vision. ViTs capture the global information of images through self-attention modules, which perform dot product computations among patchified image tokens. While self-attention modules empower ViTs to capture long-range dependencies, the computational complexity grows quadratically with the number…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Camera-Ready; 10 pages, 3 figures

  35. arXiv:2405.21075  [pdf, other]

    cs.CV cs.CL

    Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

    Authors: Chaoyou Fu, Yuhan Dai, Yongdong Luo, Lei Li, Shuhuai Ren, Renrui Zhang, Zihan Wang, Chenyu Zhou, Yunhang Shen, Mengdan Zhang, Peixian Chen, Yanwei Li, Shaohui Lin, Sirui Zhao, Ke Li, Tong Xu, Xiawu Zheng, Enhong Chen, Rongrong Ji, Xing Sun

    Abstract: In the quest for artificial general intelligence, Multi-modal Large Language Models (MLLMs) have emerged as a focal point in recent advancements. However, the predominant focus remains on developing their capabilities in static image understanding. The potential of MLLMs in processing sequential visual data is still insufficiently explored, highlighting the absence of a comprehensive, high-quality…

    Submitted 16 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Comments: Project Page: https://video-mme.github.io

  36. arXiv:2405.18497  [pdf, other]

    cs.IT

    Capacity Results for Non-Ergodic Multi-Modal Broadcast Channels with Controllable Statistics

    Authors: Alireza Vahid, Shih-Chun Lin

    Abstract: Movable antennas and reconfigurable intelligent surfaces enable a new paradigm in which channel statistics can be controlled and altered. Further, the known trajectory and operation protocol of communication satellites result in networks with predictable statistics. The predictability of future changes results in a non-ergodic model for which the fundamentals are largely unknown. We consider the…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: under review

  37. arXiv:2405.17477  [pdf, other]

    cs.LG cs.AI

    OLLIE: Imitation Learning from Offline Pretraining to Online Finetuning

    Authors: Sheng Yue, Xingyuan Hua, Ju Ren, Sen Lin, Junshan Zhang, Yaoxue Zhang

    Abstract: In this paper, we study offline-to-online Imitation Learning (IL) that pretrains an imitation policy from static demonstration data, followed by fast finetuning with minimal environmental interaction. We find the naïve combination of existing offline IL and online IL methods tends to behave poorly in this context, because the initial discriminator (often used in online IL) operates randomly and di…

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: International Conference on Machine Learning (ICML)

  38. arXiv:2405.17476  [pdf, other]

    cs.LG cs.AI

    How to Leverage Diverse Demonstrations in Offline Imitation Learning

    Authors: Sheng Yue, Jiani Liu, Xingyuan Hua, Ju Ren, Sen Lin, Junshan Zhang, Yaoxue Zhang

    Abstract: Offline Imitation Learning (IL) with imperfect demonstrations has garnered increasing attention owing to the scarcity of expert data in many real-world domains. A fundamental problem in this scenario is how to extract positive behaviors from noisy data. In general, current approaches to the problem select data building on state-action similarity to given expert demonstrations, neglecting precious…

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: International Conference on Machine Learning (ICML)

  39. arXiv:2405.16634  [pdf, other]

    cs.GR

    Fast and Globally Consistent Normal Orientation based on the Winding Number Normal Consistency

    Authors: Siyou Lin, Zuoqiang Shi, Yebin Liu

    Abstract: Estimating a consistently oriented normal vector field for an unoriented point cloud enables a number of important downstream applications in computer graphics. While normal estimation for a small patch of points can be done with simple techniques like principal component analysis (PCA), orienting these normals to be globally consistent has been a notoriously difficult problem. Some recent methods…

    Submitted 26 May, 2024; originally announced May 2024.

  40. arXiv:2405.15334  [pdf, other]

    cs.CL

    Detection and Positive Reconstruction of Cognitive Distortion sentences: Mandarin Dataset and Evaluation

    Authors: Shuya Lin, Yuxiong Wang, Jonathan Dong, Shiguang Ni

    Abstract: This research introduces a Positive Reconstruction Framework based on positive psychology theory. Overcoming negative thoughts can be challenging; our objective is to address and reframe them through a positive reinterpretation. To tackle this challenge, a two-fold approach is necessary: identifying cognitive distortions and suggesting a positively reframed alternative while preserving the origina…

    Submitted 24 May, 2024; originally announced May 2024.

  41. arXiv:2405.14358  [pdf, other]

    cs.MA

    AI-Olympics: Exploring the Generalization of Agents through Open Competitions

    Authors: Chen Wang, Yan Song, Shuai Wu, Sa Wu, Ruizhi Zhang, Shu Lin, Haifeng Zhang

    Abstract: Between 2021 and 2023, AI-Olympics, a series of online AI competitions, was hosted by the online evaluation platform Jidi in collaboration with the IJCAI committee. In these competitions, an agent is required to accomplish diverse sports tasks in a two-dimensional continuous world, while competing against an opponent. This paper provides a brief overview of the competition series and highlights not…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: IJCAI 2024 Demo Track Paper

  42. arXiv:2405.12117  [pdf, other]

    cs.DC

    Strongly-Consistent Distributed Discrete-event Systems

    Authors: Peter Donovan, Erling Jellum, Byeonggil Jun, Hokeun Kim, Edward A. Lee, Shaokai Lin, Marten Lohstroh, Anirudh Rengarajan

    Abstract: Discrete-event (DE) systems are concurrent programs where components communicate via tagged events, where tags are drawn from a totally ordered set. Reactors are an emerging model of computation based on DE and realized in the open-source coordination language Lingua Franca. Distributed DE (DDE) systems are DE systems where the components (reactors) communicate over networks. The prior art has req…

    Submitted 20 May, 2024; originally announced May 2024.

  43. arXiv:2405.11734  [pdf, other]

    cs.IT

    Finite Field Multiple Access for Sourced Massive Random Access with Finite Blocklength

    Authors: Qi-yue Yu, Shi-wen Lin, Shu Lin

    Abstract: For binary source transmission, this paper proposes an element-pair (EP) coding scheme for supporting sourced massive random access, which is used to solve the finite blocklength (FBL) of multiuser reliability transmission problem. In this paper, we first give the definition of an EP, which is used as a virtual resource. If the Cartesian product of $J$ distinct EPs satisfies the unique sum-pattern…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.14086

  44. arXiv:2405.09487  [pdf, other]

    cs.CV

    Color Space Learning for Cross-Color Person Re-Identification

    Authors: Jiahao Nie, Shan Lin, Alex C. Kot

    Abstract: The primary color profile of the same identity is assumed to remain consistent in typical Person Re-identification (Person ReID) tasks. However, this assumption may be invalid in real-world situations and images hold variant color profiles, because of cross-modality cameras or identity with different clothing. To address this issue, we propose Color Space Learning (CSL) for those Cross-Color Perso…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted by ICME 2024 (Oral)

  45. arXiv:2405.08765  [pdf, other]

    cs.CV

    Image to Pseudo-Episode: Boosting Few-Shot Segmentation by Unlabeled Data

    Authors: Jie Zhang, Yuhan Li, Yude Wang, Stephen Lin, Shiguang Shan

    Abstract: Few-shot segmentation (FSS) aims to train a model which can segment the object from novel classes with a few labeled samples. The insufficient generalization ability of models leads to unsatisfactory performance when the models lack enough labeled data from the novel classes. Considering that there are abundant unlabeled data available, it is promising to improve the generalization ability by expl…

    Submitted 14 May, 2024; originally announced May 2024.

  46. arXiv:2405.08748  [pdf, other]

    cs.CV

    Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

    Authors: Zhimin Li, Jianwei Zhang, Qin Lin, Jiangfeng Xiong, Yanxin Long, Xinchi Deng, Yingfang Zhang, Xingchao Liu, Minbin Huang, Zedong Xiao, Dayou Chen, Jiajun He, Jiahao Li, Wenyue Li, Chen Zhang, Rongwei Quan, Jianxiang Lu, Jiabin Huang, Xiaoyan Yuan, Xiaoxiao Zheng, Yixuan Li, Jihong Zhang, Chao Zhang, Meng Chen, Jie Liu, et al. (20 additional authors not shown)

    Abstract: We present Hunyuan-DiT, a text-to-image diffusion transformer with fine-grained understanding of both English and Chinese. To construct Hunyuan-DiT, we carefully design the transformer structure, text encoder, and positional encoding. We also build from scratch a whole data pipeline to update and evaluate data for iterative model optimization. For fine-grained language understanding, we train a Mu…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Project Page: https://dit.hunyuan.tencent.com/

  47. arXiv:2405.07319  [pdf, other]

    cs.CV

    LayGA: Layered Gaussian Avatars for Animatable Clothing Transfer

    Authors: Siyou Lin, Zhe Li, Zhaoqi Su, Zerong Zheng, Hongwen Zhang, Yebin Liu

    Abstract: Animatable clothing transfer, aiming at dressing and animating garments across characters, is a challenging problem. Most human avatar works entangle the representations of the human body and clothing together, which leads to difficulties for virtual try-on across identities. What's worse, the entangled representations usually fail to exactly track the sliding motion of garments. To overcome these…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: SIGGRAPH 2024 conference track

  48. arXiv:2405.04480  [pdf, other]

    cs.NE cs.AI

    Concentration Tail-Bound Analysis of Coevolutionary and Bandit Learning Algorithms

    Authors: Per Kristian Lehre, Shishen Lin

    Abstract: Runtime analysis, as a branch of the theory of AI, studies how the number of iterations algorithms take before finding a solution (its runtime) depends on the design of the algorithm and the problem structure. Drift analysis is a state-of-the-art tool for estimating the runtime of randomised algorithms, such as evolutionary and bandit algorithms. Drift refers roughly to the expected progress towar…

    Submitted 10 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted at International Joint Conference on Artificial Intelligence (IJCAI) 2024

  49. arXiv:2405.01525  [pdf, other]

    cs.CL cs.AI

    FLAME: Factuality-Aware Alignment for Large Language Models

    Authors: Sheng-Chieh Lin, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Wen-tau Yih, Xilun Chen

    Abstract: Alignment is a standard procedure to fine-tune pre-trained large language models (LLMs) to follow natural language instructions and serve as helpful AI assistants. We have observed, however, that the conventional alignment process fails to enhance the factual accuracy of LLMs, and often leads to the generation of more false facts (i.e. hallucination). In this paper, we study how to make the LLM al…

    Submitted 2 May, 2024; originally announced May 2024.

  50. arXiv:2405.00946  [pdf, other]

    cs.LG

    SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters

    Authors: Shengsheng Lin, Weiwei Lin, Wentai Wu, Haojun Chen, Junjie Yang

    Abstract: This paper introduces SparseTSF, a novel, extremely lightweight model for Long-term Time Series Forecasting (LTSF), designed to address the challenges of modeling complex temporal dependencies over extended horizons with minimal computational resources. At the heart of SparseTSF lies the Cross-Period Sparse Forecasting technique, which simplifies the forecasting task by decoupling the periodicity…

    Submitted 3 June, 2024; v1 submitted 1 May, 2024; originally announced May 2024.