Showing 1–50 of 246 results for author: Zheng, T

Searching in archive cs.
  1. arXiv:2408.15585  [pdf, other]

    cs.SD eess.AS

    Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models

    Authors: Yiyang Zhao, Shuai Wang, Guangzhi Sun, Zehua Chen, Chao Zhang, Mingxing Xu, Thomas Fang Zheng

    Abstract: In this paper, Whisper, a large-scale pre-trained model for automatic speech recognition, is proposed to apply to speaker verification. A partial multi-scale feature aggregation (PMFA) approach is proposed based on a subset of Whisper encoder blocks to derive highly discriminative speaker embeddings. Experimental results demonstrate that using the middle to later blocks of the Whisper encoder keeps…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: Accepted by Interspeech 2024

  2. arXiv:2408.13996  [pdf]

    cs.NE

    Research Advances and New Paradigms for Biology-inspired Spiking Neural Networks

    Authors: Tianyu Zheng, Liyuan Han, Tielin Zhang

    Abstract: Spiking neural networks (SNNs) are gaining popularity in the computational simulation and artificial intelligence fields owing to their biological plausibility and computational efficiency. This paper explores the historical development of SNN and concludes that these two fields are intersecting and merging rapidly. Following the successful application of Dynamic Vision Sensors (DVS) and Dynamic A…

    Submitted 28 August, 2024; v1 submitted 25 August, 2024; originally announced August 2024.

  3. arXiv:2408.12817  [pdf, other]

    cs.LG physics.chem-ph

    Data-Driven Parametrization of Molecular Mechanics Force Fields for Expansive Chemical Space Coverage

    Authors: Tianze Zheng, Ailun Wang, Xu Han, Yu Xia, Xingyuan Xu, Jiawei Zhan, Yu Liu, Yang Chen, Zhi Wang, Xiaojie Wu, Sheng Gong, Wen Yan

    Abstract: A force field is a critical component in molecular dynamics simulations for computational drug discovery. It must achieve high accuracy within the constraints of molecular mechanics' (MM) limited functional forms, which offers high computational efficiency. With the rapid expansion of synthetically accessible chemical space, traditional look-up table approaches face significant challenges. In this…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: ByteFF, a machine learning parametrized MMFF

  4. arXiv:2408.11834  [pdf, other]

    cs.CV cs.AI

    SCREENER: A general framework for task-specific experiment design in quantitative MRI

    Authors: Tianshu Zheng, Zican Wang, Timothy Bray, Daniel C. Alexander, Dan Wu, Hui Zhang

    Abstract: Quantitative magnetic resonance imaging (qMRI) is increasingly investigated for use in a variety of clinical tasks from diagnosis, through staging, to treatment monitoring. However, experiment design in qMRI, the identification of the optimal acquisition protocols, has been focused on obtaining the most precise parameter estimations, with no regard for the specific requirements of downstream tasks…

    Submitted 6 August, 2024; originally announced August 2024.

  5. arXiv:2408.11562  [pdf, other]

    cs.SD eess.AS

    A Joint Noise Disentanglement and Adversarial Training Framework for Robust Speaker Verification

    Authors: Xujiang Xing, Mingxing Xu, Thomas Fang Zheng

    Abstract: Automatic Speaker Verification (ASV) suffers from performance degradation in noisy conditions. To address this issue, we propose a novel adversarial learning framework that incorporates noise-disentanglement to establish a noise-independent speaker invariant embedding space. Specifically, the disentanglement module includes two encoders for separating speaker related and irrelevant information, re…

    Submitted 22 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

    Comments: 5 pages, accepted by Interspeech2024

  6. arXiv:2408.09851  [pdf, other]

    cs.NI eess.SY

    ISAC-Fi: Enabling Full-fledged Monostatic Sensing over Wi-Fi Communication

    Authors: Zhe Chen, Chao Hu, Tianyue Zheng, Hangcheng Cao, Yanbing Yang, Yen Chu, Hongbo Jiang, Jun Luo

    Abstract: Whereas Wi-Fi communications have been exploited for sensing purposes for over a decade, the bistatic or multistatic nature of Wi-Fi still poses multiple challenges, hampering real-life deployment of integrated sensing and communication (ISAC) within the Wi-Fi framework. In this paper, we aim to re-design Wi-Fi so that monostatic sensing (mimicking radar) can be achieved over the multistatic communicati…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 14 pages, 22 figures

  7. arXiv:2408.08072  [pdf, other]

    cs.CL

    I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm

    Authors: Yiming Liang, Ge Zhang, Xingwei Qu, Tianyu Zheng, Jiawei Guo, Xinrun Du, Zhenzhu Yang, Jiaheng Liu, Chenghua Lin, Lei Ma, Wenhao Huang, Jiajun Zhang

    Abstract: Large Language Models (LLMs) have achieved significant advancements; however, the common learning paradigm treats LLMs as passive information repositories, neglecting their potential for active learning and alignment. Some approaches train LLMs using their own generated synthetic data, exploring the possibility of active alignment. However, there is still a huge gap between these one-time alignmen…

    Submitted 27 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

  8. arXiv:2408.06911  [pdf, other]

    eess.AS cs.AI

    Heterogeneous Space Fusion and Dual-Dimension Attention: A New Paradigm for Speech Enhancement

    Authors: Tao Zheng, Liejun Wang, Yinfeng Yu

    Abstract: Self-supervised learning has demonstrated impressive performance in speech tasks, yet there remains ample opportunity for advancement in the realm of speech enhancement research. In addressing speech tasks, confining the attention mechanism solely to the temporal dimension poses limitations in effectively focusing on critical speech features. Considering the aforementioned issues, our study introd…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: Accepted for publication by IEEE International Conference on Systems, Man, and Cybernetics 2024

  9. arXiv:2408.06185  [pdf, other]

    eess.SY cs.CY cs.GT cs.NI

    Hi-SAM: A high-scalable authentication model for satellite-ground Zero-Trust system using mean field game

    Authors: Xuesong Wu, Tianshuai Zheng, Runfang Wu, Jie Ren, Junyan Guo, Ye Du

    Abstract: As more and more Internet of Things (IoT) devices are connected to satellite networks, the Zero-Trust Architecture brings dynamic security to the satellite-ground system, while frequent authentication creates challenges for system availability. To make the system accommodate more IoT devices, this paper proposes a high-scalable authentication model (Hi-SAM). Hi-SAM introduces the Proof-of-Work id…

    Submitted 12 August, 2024; originally announced August 2024.

  10. arXiv:2408.05765  [pdf, other]

    cs.LG stat.ML

    Scalable and Adaptive Spectral Embedding for Attributed Graph Clustering

    Authors: Yunhui Liu, Tieke He, Qing Wu, Tao Zheng, Jianhua Zhao

    Abstract: Attributed graph clustering, which aims to group the nodes of an attributed graph into disjoint clusters, has made promising advancements in recent years. However, most existing methods face challenges when applied to large graphs due to the expensive computational cost and high memory usage. In this paper, we introduce Scalable and Adaptive Spectral Embedding (SASE), a simple attributed graph clu…

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: Accepted by CIKM 2024 (Short Paper)

  11. arXiv:2408.05087  [pdf, other]

    cs.LG

    Bootstrap Latents of Nodes and Neighbors for Graph Self-Supervised Learning

    Authors: Yunhui Liu, Huaisong Zhang, Tieke He, Tao Zheng, Jianhua Zhao

    Abstract: Contrastive learning is a significant paradigm in graph self-supervised learning. However, it requires negative samples to prevent model collapse and learn discriminative representations. These negative samples inevitably lead to heavy computation, memory overhead and class collision, compromising the representation learning. Recent studies show that methods obviating negative samples can attai…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: Accepted by ECML PKDD 2024

  12. arXiv:2408.03979  [pdf, ps, other]

    cs.SD eess.AS

    Speaker Adaptation for Quantised End-to-End ASR Models

    Authors: Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng

    Abstract: End-to-end models have shown superior performance for automatic speech recognition (ASR). However, such models are often very large in size and thus challenging to deploy on resource-constrained edge devices. While quantisation can reduce model sizes, it can lead to increased word error rates (WERs). Although improved quantisation methods were proposed to address the issue of performance degradati…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: submitted to ASRU 2023 Workshop

  13. arXiv:2408.03765  [pdf, other]

    cs.LG

    Reliable Node Similarity Matrix Guided Contrastive Graph Clustering

    Authors: Yunhui Liu, Xinyi Gao, Tieke He, Tao Zheng, Jianhua Zhao, Hongzhi Yin

    Abstract: Graph clustering, which involves the partitioning of nodes within a graph into disjoint clusters, holds significant importance for numerous subsequent applications. Recently, contrastive learning, known for utilizing supervisory information, has demonstrated encouraging results in deep graph clustering. This methodology facilitates the learning of favorable node representations for clustering by a…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE)

  14. arXiv:2408.02814  [pdf, other]

    cs.LG cs.CR

    Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services

    Authors: Shaopeng Fu, Xuexue Sun, Ke Qing, Tianhang Zheng, Di Wang

    Abstract: Though pre-trained encoders can be easily accessed online to build downstream machine learning (ML) services quickly, various attacks have been designed to compromise the security and privacy of these encoders. While most attacks target encoders on the upstream side, it remains unknown how an encoder could be threatened when deployed in a downstream ML service. This paper unveils a new vulnerabili…

    Submitted 5 August, 2024; originally announced August 2024.

  15. arXiv:2408.02559  [pdf, other]

    cs.CL cs.AI

    Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information

    Authors: Yauwai Yim, Chunkit Chan, Tianyu Shi, Zheye Deng, Wei Fan, Tianshi Zheng, Yangqiu Song

    Abstract: Large language models (LLMs) have shown success in handling simple games with imperfect information and enabling multi-agent coordination, but their ability to facilitate practical collaboration against other agents in complex, imperfect information environments, especially in a non-English environment, still needs to be explored. This study investigates the applicability of knowledge acquired by…

    Submitted 5 August, 2024; originally announced August 2024.

  16. arXiv:2407.20564  [pdf, other]

    cs.CL

    CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge

    Authors: Tianshi Zheng, Jiaxin Bai, Yicheng Wang, Tianqing Fang, Yue Guo, Yauwai Yim, Yangqiu Song

    Abstract: While large language models (LLMs) have demonstrated impressive capabilities across various natural language processing tasks by acquiring rich factual knowledge from their broad training data, their ability to synthesize and logically reason with this knowledge in complex ways remains underexplored. In this work, we present a systematic evaluation of state-of-the-art LLMs' complex logical reasoni…

    Submitted 30 July, 2024; originally announced July 2024.

    Comments: 9 pages

  17. arXiv:2407.18962  [pdf]

    cs.RO cs.LG

    Autonomous Navigation of Unmanned Vehicle Through Deep Reinforcement Learning

    Authors: Letian Xu, Jiabei Liu, Haopeng Zhao, Tianyao Zheng, Tongzhou Jiang, Lipeng Liu

    Abstract: This paper explores the method of achieving autonomous navigation of unmanned vehicles through Deep Reinforcement Learning (DRL). The focus is on using the Deep Deterministic Policy Gradient (DDPG) algorithm to address issues in high-dimensional continuous action spaces. The paper details the model of an Ackermann robot and the structure and application of the DDPG algorithm. Experiments were condu…

    Submitted 18 July, 2024; originally announced July 2024.

  18. arXiv:2407.17070  [pdf, other]

    cs.LG cs.AI

    Curriculum Negative Mining For Temporal Networks

    Authors: Ziyue Chen, Tongya Zheng, Mingli Song

    Abstract: Temporal networks are effective in capturing the evolving interactions of networks over time, such as social networks and e-commerce networks. In recent years, researchers have primarily concentrated on developing specific model architectures for Temporal Graph Neural Networks (TGNNs) in order to improve the representation quality of temporal nodes and edges. However, limited attention has been gi…

    Submitted 24 July, 2024; originally announced July 2024.

  19. arXiv:2407.15389  [pdf, other]

    cs.LG cs.CR cs.DC

    Poisoning with A Pill: Circumventing Detection in Federated Learning

    Authors: Hanxi Guo, Hao Wang, Tao Song, Tianhang Zheng, Yang Hua, Haibing Guan, Xiangyu Zhang

    Abstract: Without direct access to the client's data, federated learning (FL) is well-known for its unique strength in data privacy protection among existing distributed machine learning techniques. However, its distributive and iterative nature makes FL inherently vulnerable to various poisoning attacks. To counteract these threats, extensive defenses have been proposed to filter out malicious clients, usi…

    Submitted 22 July, 2024; originally announced July 2024.

  20. arXiv:2407.15325  [pdf, other]

    cs.AI

    Odyssey: Empowering Agents with Open-World Skills

    Authors: Shunyu Liu, Yaoru Li, Kongcheng Zhang, Zhenyu Cui, Wenkai Fang, Yuxuan Zheng, Tongya Zheng, Mingli Song

    Abstract: Recent studies have delved into constructing generalist agents for open-world embodied environments like Minecraft. Despite the encouraging results, existing efforts mainly focus on solving basic programmatic tasks, e.g., material collection and tool-crafting following the Minecraft tech-tree, treating the ObtainDiamond task as the ultimate goal. This limitation stems from the narrowly defined set…

    Submitted 21 July, 2024; originally announced July 2024.

  21. arXiv:2407.13778  [pdf, other]

    cs.CV cs.LG

    Assessing the Potential of PlanetScope Satellite Imagery to Estimate Particulate Matter Oxidative Potential

    Authors: Ian Hough, Loïc Argentier, Ziyang Jiang, Tongshu Zheng, Mike Bergin, David Carlson, Jean-Luc Jaffrezo, Jocelyn Chanussot, Gaëlle Uzu

    Abstract: Oxidative potential (OP), which measures particulate matter's (PM) capacity to induce oxidative stress in the lungs, is increasingly recognized as an indicator of PM toxicity. Since OP is not routinely monitored, it can be challenging to estimate exposure and health impacts. Remote sensing data are commonly used to estimate PM mass concentration, but have never been used to estimate OP. In this st…

    Submitted 1 July, 2024; originally announced July 2024.

  22. arXiv:2407.09904  [pdf, other]

    cs.LG

    Learning a Mini-batch Graph Transformer via Two-stage Interaction Augmentation

    Authors: Wenda Li, Kaixuan Chen, Shunyu Liu, Tongya Zheng, Wenjie Huang, Mingli Song

    Abstract: Mini-batch Graph Transformer (MGT), as an emerging graph learning model, has demonstrated significant advantages in semi-supervised node prediction tasks with improved computational efficiency and enhanced model robustness. However, existing methods for processing local information either rely on sampling or simple aggregation, which respectively result in the loss and squashing of critical neighb…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: 8 pages, 4 figures, accepted by ECAI 2024

  23. arXiv:2407.08942  [pdf]

    cs.IR cs.AI

    A Neural Matrix Decomposition Recommender System Model based on the Multimodal Large Language Model

    Authors: Ao Xiang, Bingjie Huang, Xinyu Guo, Haowei Yang, Tianyao Zheng

    Abstract: Recommendation systems have become an important solution to information search problems. This article proposes a neural matrix factorization recommendation system model based on the multimodal large language model called BoNMF. This model combines BoBERTa's powerful capabilities in natural language processing, ViT in computer vision, and neural matrix decomposition technology. By capturing the…

    Submitted 11 July, 2024; originally announced July 2024.

  24. arXiv:2407.06514  [pdf, other]

    eess.IV cs.CV

    Asymmetric Mask Scheme for Self-Supervised Real Image Denoising

    Authors: Xiangyu Liao, Tianheng Zheng, Jiayu Zhong, Pingping Zhang, Chao Ren

    Abstract: In recent years, self-supervised denoising methods have gained significant success and become critically important in the field of image restoration. Among them, the blind spot network based methods are the most typical type and have attracted the attention of a large number of researchers. Although the introduction of blind spot operations can prevent identity mapping from noise to noise, it imp…

    Submitted 14 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  25. arXiv:2407.05112  [pdf, other]

    cs.CR cs.AI

    Releasing Malevolence from Benevolence: The Menace of Benign Data on Machine Unlearning

    Authors: Binhao Ma, Tianhang Zheng, Hongsheng Hu, Di Wang, Shuo Wang, Zhongjie Ba, Zhan Qin, Kui Ren

    Abstract: Machine learning models trained on vast amounts of real or synthetic data often achieve outstanding predictive performance across various domains. However, this utility comes with increasing concerns about privacy, as the training data may include sensitive information. To address these concerns, machine unlearning has been proposed to erase specific data samples from models. While some unlearning…

    Submitted 6 July, 2024; originally announced July 2024.

  26. Unveiling Global Interactive Patterns across Graphs: Towards Interpretable Graph Neural Networks

    Authors: Yuwen Wang, Shunyu Liu, Tongya Zheng, Kaixuan Chen, Mingli Song

    Abstract: Graph Neural Networks (GNNs) have emerged as a prominent framework for graph mining, leading to significant advances across various domains. Stemming from the node-wise representations of GNNs, existing explanation studies have embraced the subgraph-specific viewpoint that attributes the decision results to the salient features and local structures of nodes. However, graph-level tasks necessitate l…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted in KDD2024

  27. arXiv:2406.19706  [pdf, other]

    cs.SD eess.AS

    SAML: Speaker Adaptive Mixture of LoRA Experts for End-to-End ASR

    Authors: Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng

    Abstract: Mixture-of-experts (MoE) models have achieved excellent results in many tasks. However, conventional MoE models are often very large, making them challenging to deploy on resource-constrained edge devices. In this paper, we propose a novel speaker adaptive mixture of LoRA experts (SAML) approach, which uses low-rank adaptation (LoRA) modules as experts to reduce the number of trainable parameters…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 5 pages, accepted by Interspeech 2024. arXiv admin note: substantial text overlap with arXiv:2309.09136

  28. arXiv:2406.17588  [pdf, other]

    cs.CL

    LongIns: A Challenging Long-context Instruction-based Exam for LLMs

    Authors: Shawn Gavin, Tuney Zheng, Jiaheng Liu, Quehry Que, Noah Wang, Jian Yang, Chenchen Zhang, Wenhao Huang, Wenhu Chen, Ge Zhang

    Abstract: The long-context capabilities of large language models (LLMs) have been a hot topic in recent years. To evaluate the performance of LLMs in different scenarios, various assessment benchmarks have emerged. However, as most of these benchmarks focus on identifying key information to answer questions, which mainly requires the retrieval ability of LLMs, these benchmarks can partially represent the re…

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  29. arXiv:2406.17286  [pdf]

    cs.RO eess.SY

    Prioritized experience replay-based DDQN for Unmanned Vehicle Path Planning

    Authors: Liu Lipeng, Letian Xu, Jiabei Liu, Haopeng Zhao, Tongzhou Jiang, Tianyao Zheng

    Abstract: The path planning module is a key module for autonomous vehicle navigation, which directly affects its operating efficiency and safety. In complex environments with many obstacles, traditional planning algorithms often cannot meet the needs of intelligence, which may lead to problems such as dead zones in unmanned vehicles. This paper proposes a path planning algorithm based on DDQN and combines it wi…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 4 pages, 6 figures, 2024 5th International Conference on Information Science, Parallel and Distributed Systems

  30. arXiv:2406.14903  [pdf, other]

    cs.AI

    GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models

    Authors: Leyan Wang, Yonggang Jin, Tianhao Shen, Tianyu Zheng, Xinrun Du, Chenchen Zhang, Wenhao Huang, Jiaheng Liu, Shi Wang, Ge Zhang, Liuyu Xiang, Zhaofeng He

    Abstract: As large language models (LLMs) continue to develop and gain widespread application, the ability of LLMs to exhibit empathy towards diverse group identities and understand their perspectives is increasingly recognized as critical. Most existing benchmarks for empathy evaluation of LLMs focus primarily on universal human emotions, such as sadness and pain, often overlooking the context of individua…

    Submitted 24 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

  31. arXiv:2406.14773  [pdf, other]

    cs.CR

    Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data

    Authors: Shenglai Zeng, Jiankun Zhang, Pengfei He, Jie Ren, Tianqi Zheng, Hanqing Lu, Han Xu, Hui Liu, Yue Xing, Jiliang Tang

    Abstract: Retrieval-augmented generation (RAG) enhances the outputs of language models by integrating relevant information retrieved from external knowledge sources. However, when the retrieval process involves private data, RAG systems may face severe privacy risks, potentially leading to the leakage of sensitive information. To address this issue, we propose using synthetic data as a privacy-preserving al…

    Submitted 20 June, 2024; originally announced June 2024.

  32. arXiv:2406.09442  [pdf]

    physics.med-ph cs.LG physics.app-ph physics.bio-ph

    An insertable glucose sensor using a compact and cost-effective phosphorescence lifetime imager and machine learning

    Authors: Artem Goncharov, Zoltan Gorocs, Ridhi Pradhan, Brian Ko, Ajmal Ajmal, Andres Rodriguez, David Baum, Marcell Veszpremi, Xilin Yang, Maxime Pindrys, Tianle Zheng, Oliver Wang, Jessica C. Ramella-Roman, Michael J. McShane, Aydogan Ozcan

    Abstract: Optical continuous glucose monitoring (CGM) systems are emerging for personalized glucose management owing to their lower cost and prolonged durability compared to conventional electrochemical CGMs. Here, we report a computational CGM system, which integrates a biocompatible phosphorescence-based insertable biosensor and a custom-designed phosphorescence lifetime imager (PLI). This compact and cos…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 24 Pages, 4 Figures

    Journal ref: ACS Nano (2024)

  33. arXiv:2406.08068  [pdf, other]

    cs.CL

    Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey

    Authors: Hao Yang, Yanyan Zhao, Yang Wu, Shilong Wang, Tian Zheng, Hongbo Zhang, Zongyang Ma, Wanxiang Che, Bing Qin

    Abstract: Compared to traditional sentiment analysis, which only considers text, multimodal sentiment analysis needs to consider emotional signals from multimodal sources simultaneously and is therefore more consistent with the way humans process sentiment in real-world scenarios. It involves processing emotional information from various sources such as natural language, images, videos, audio, physiolog…

    Submitted 16 August, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2210.14556 by other authors

  34. arXiv:2406.07989  [pdf, other]

    cs.IT eess.SP

    Near-Field Wideband Beam Training Based on Distance-Dependent Beam Split

    Authors: Tianyue Zheng, Mingyao Cui, Zidong Wu, Linglong Dai

    Abstract: Near-field beam training is essential for acquiring channel state information in 6G extremely large-scale multiple input multiple output (XL-MIMO) systems. To achieve low-overhead beam training, an existing method has been proposed to leverage the near-field beam split effect, which deploys true-time-delay arrays to simultaneously search multiple angles of the entire angular range in a distance ring…

    Submitted 12 June, 2024; originally announced June 2024.

  35. arXiv:2405.19708  [pdf, other]

    cs.CV cs.AI

    Text Guided Image Editing with Automatic Concept Locating and Forgetting

    Authors: Jia Li, Lijie Hu, Zhixian He, Jingfeng Zhang, Tianhang Zheng, Di Wang

    Abstract: With the advancement of image-to-image diffusion models guided by text, significant progress has been made in image editing. However, a persistent challenge remains in seamlessly incorporating objects into images based on textual instructions, without relying on extra user-provided guidance. Text and images are inherently distinct modalities, bringing out difficulties in fully capturing the semant…

    Submitted 30 May, 2024; originally announced May 2024.

  36. arXiv:2405.19327  [pdf, other]

    cs.CL cs.AI cs.LG

    MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

    Authors: Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, Raven Yuan, Tuney Zheng, Wei Pang, Xinrun Du, Yiming Liang, Yinghao Ma, Yizhi Li, Ziyang Ma, Bill Lin, Emmanouil Benetos, Huan Yang, Junting Zhou, Kaijing Ma, Minghao Liu, Morry Niu, et al. (20 additional authors not shown)

    Abstract: Large Language Models (LLMs) have made great strides in recent years to achieve unprecedented performance across different tasks. However, due to commercial interest, the most competitive models like GPT, Gemini, and Claude have been gated behind proprietary interfaces without disclosing the training details. Recently, many institutions have open-sourced several strong LLMs like LLaMA-3, comparabl…

    Submitted 10 July, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: https://map-neo.github.io/

  37. arXiv:2405.03548  [pdf, other]

    cs.CL

    MAmmoTH2: Scaling Instructions from the Web

    Authors: Xiang Yue, Tuney Zheng, Ge Zhang, Wenhu Chen

    Abstract: Instruction tuning improves the reasoning abilities of large language models (LLMs), with data quality and scalability being the crucial factors. Most instruction tuning data come from human crowd-sourcing or GPT-4 distillation. We propose a paradigm to efficiently harvest 10 million naturally existing instruction data from the pre-training web corpus to enhance LLM reasoning. Our approach involve…

    Submitted 23 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  38. arXiv:2404.14215  [pdf, other]

    cs.CL

    Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction

    Authors: Zheye Deng, Chunkit Chan, Weiqi Wang, Yuxi Sun, Wei Fan, Tianshi Zheng, Yauwai Yim, Yangqiu Song

    Abstract: The task of condensing large chunks of textual information into concise and structured tables has gained attention recently due to the emergence of Large Language Models (LLMs) and their potential benefit for downstream tasks, such as text summarization and text mining. Previous approaches often generate tables that directly replicate information from the text, limiting their applicability in broa…

    Submitted 22 April, 2024; originally announced April 2024.

  39. arXiv:2404.12135  [pdf, other]

    cs.MA cs.CR cs.DC

    mABC: multi-Agent Blockchain-Inspired Collaboration for root cause analysis in micro-services architecture

    Authors: Wei Zhang, Hongcheng Guo, Jian Yang, Yi Zhang, Chaoran Yan, Zhoujin Tian, Hangyuan Ji, Zhoujun Li, Tongliang Li, Tieqiao Zheng, Chao Chen, Yi Liang, Xu Shi, Liangfan Zheng, Bo Zhang

    Abstract: The escalating complexity of micro-services architecture in cloud-native technologies poses significant challenges for maintaining system stability and efficiency. To conduct root cause analysis (RCA) and resolution of alert events, we propose a pioneering framework, multi-Agent Blockchain-inspired Collaboration for root cause analysis in micro-services architecture (mABC), to revolutionize the AI…

    Submitted 3 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  40. arXiv:2404.10237  [pdf, other]

    cs.CV cs.CL

    Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models

    Authors: Songtao Jiang, Tuo Zheng, Yan Zhang, Yeying Jin, Li Yuan, Zuozhu Liu

    Abstract: Recent advancements in general-purpose or domain-specific multimodal large language models (LLMs) have witnessed remarkable progress for medical decision-making. However, they are designated for specific classification or generative tasks, and require model training or finetuning on large-scale datasets with sizeable parameters and tremendous computing, hindering their clinical utility across dive…

    Submitted 26 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  41. arXiv:2404.07181  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.comp-ph

    BAMBOO: a predictive and transferable machine learning force field framework for liquid electrolyte development

    Authors: Sheng Gong, Yumin Zhang, Zhenliang Mu, Zhichen Pu, Hongyi Wang, Zhiao Yu, Mengyi Chen, Tianze Zheng, Zhi Wang, Lifei Chen, Xiaojie Wu, Shaochen Shi, Weihao Gao, Wen Yan, Liang Xiang

    Abstract: Despite the widespread applications of machine learning force field (MLFF) on solids and small molecules, there is a notable gap in applying MLFF to complex liquid electrolytes. In this work, we introduce BAMBOO (ByteDance AI Molecular Simulation Booster), a novel framework for molecular dynamics (MD) simulations, with a demonstration of its capabilities in the context of liquid electrolytes for l…

    Submitted 22 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  42. arXiv:2404.07084  [pdf, other]

    cs.CL cs.AI

    Dynamic Generation of Personalities with Large Language Models

    Authors: Jianzhi Liu, Hexiang Gu, Tianyu Zheng, Liuyu Xiang, Huijia Wu, Jie Fu, Zhaofeng He

    Abstract: In the realm of mimicking human deliberation, large language models (LLMs) show promising performance, thereby amplifying the importance of this research area. Deliberation is influenced by both logic and personality. However, previous studies predominantly focused on the logic of LLMs, neglecting the exploration of personality aspects. In this work, we introduce Dynamic Personality Generation (DP…

    Submitted 10 April, 2024; originally announced April 2024.

  43. arXiv:2404.06393  [pdf, other]

    cs.SD cs.AI eess.AS

    MuPT: A Generative Symbolic Music Pretrained Transformer

    Authors: Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, et al. (4 additional authors not shown)

    Abstract: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths, thereby enhancing the model's performance in musical composition. To address the chal…

    Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  44. arXiv:2404.04167  [pdf, other]

    cs.CL cs.AI

    Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model

    Authors: Xinrun Du, Zhouliang Yu, Songyang Gao, Ding Pan, Yuyang Cheng, Ziyang Ma, Ruibin Yuan, Xingwei Qu, Jiaheng Liu, Tianyu Zheng, Xinchen Luo, Guorui Zhou, Wenhu Chen, Ge Zhang

    Abstract: In this study, we introduce CT-LLM, a 2B large language model (LLM) that illustrates a pivotal shift towards prioritizing the Chinese language in developing LLMs. Uniquely initiated from scratch, CT-LLM diverges from the conventional methodology by primarily incorporating Chinese textual data, utilizing an extensive corpus of 1,200 billion tokens, including 800 billion Chinese tokens, 300 billion…

    Submitted 10 July, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

  45. arXiv:2404.03543  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

    Authors: Jiawei Guo, Ziming Li, Xueling Liu, Kaijing Ma, Tianyu Zheng, Zhouliang Yu, Ding Pan, Yizhi LI, Ruibo Liu, Yue Wang, Shuyue Guo, Xingwei Qu, Xiang Yue, Ge Zhang, Wenhu Chen, Jie Fu

    Abstract: Large Language Models (LLMs) for code are rapidly evolving, with code editing emerging as a critical capability. We introduce CodeEditorBench, an evaluation framework designed to rigorously assess the performance of LLMs in code editing tasks, including debugging, translating, polishing, and requirement switching. Unlike existing benchmarks focusing solely on code generation, CodeEditorBench empha…

    Submitted 6 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  46. arXiv:2403.18058  [pdf, other]

    cs.CL cs.AI

    COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning

    Authors: Yuelin Bai, Xinrun Du, Yiming Liang, Yonggang Jin, Ziqiang Liu, Junting Zhou, Tianyu Zheng, Xincheng Zhang, Nuo Ma, Zekun Wang, Ruibin Yuan, Haihong Wu, Hongquan Lin, Wenhao Huang, Jiajun Zhang, Wenhu Chen, Chenghua Lin, Jie Fu, Min Yang, Shiwen Ni, Ge Zhang

    Abstract: Recently, there have been significant advancements in large language models (LLMs), particularly focused on the English language. These advancements have enabled these LLMs to understand and execute complex instructions with unprecedented accuracy and fluency. However, despite these advancements, there remains a noticeable gap in the development of Chinese instruction tuning. The unique linguistic…

    Submitted 26 March, 2024; originally announced March 2024.

  47. arXiv:2403.14951  [pdf, other]

    cs.LG cs.AI cs.SI

    Simple Graph Condensation

    Authors: Zhenbang Xiao, Yu Wang, Shunyu Liu, Huiqiong Wang, Mingli Song, Tongya Zheng

    Abstract: The burdensome training costs on large-scale graphs have aroused significant interest in graph condensation, which involves tuning Graph Neural Networks (GNNs) on a small condensed graph for use on the large-scale original graph. Existing methods primarily focus on aligning key metrics between the condensed and original graphs, such as gradients, output distribution and trajectories of GNNs, yield…

    Submitted 17 July, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: ECML-PKDD 2024

  48. arXiv:2403.09090  [pdf, other]

    math.OC cs.LG

    Dissipative Gradient Descent Ascent Method: A Control Theory Inspired Algorithm for Min-max Optimization

    Authors: Tianqi Zheng, Nicolas Loizou, Pengcheng You, Enrique Mallada

    Abstract: Gradient Descent Ascent (GDA) methods for min-max optimization problems typically produce oscillatory behavior that can lead to instability, e.g., in bilinear settings. To address this problem, we introduce a dissipation term into the GDA updates to dampen these oscillations. The proposed Dissipative GDA (DGDA) method can be seen as performing standard GDA on a state-augmented and regularized sadd…

    Submitted 14 March, 2024; originally announced March 2024.

  49. arXiv:2403.07939  [pdf, other]

    cs.CV

    Dynamic Policy-Driven Adaptive Multi-Instance Learning for Whole Slide Image Classification

    Authors: Tingting Zheng, Kui Jiang, Hongxun Yao

    Abstract: Multi-Instance Learning (MIL) has shown impressive performance for histopathology whole slide image (WSI) analysis using bags or pseudo-bags. It involves instance sampling, feature representation, and decision-making. However, existing MIL-based technologies at least suffer from one or more of the following problems: 1) requiring high storage and intensive pre-processing for numerous instances (sa…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024; Project page: https://vilab.hit.edu.cn/projects/pamil

  50. arXiv:2403.04233  [pdf, other]

    cs.CL cs.AI

    DEEP-ICL: Definition-Enriched Experts for Language Model In-Context Learning

    Authors: Xingwei Qu, Yiming Liang, Yucheng Wang, Tianyu Zheng, Tommy Yue, Lei Ma, Stephen W. Huang, Jiajun Zhang, Yinan Shi, Chenghua Lin, Jie Fu, Ge Zhang

    Abstract: It has long been assumed that the sheer number of parameters in large language models (LLMs) drives in-context learning (ICL) capabilities, enabling remarkable performance improvements by leveraging task-specific demonstrations. Challenging this hypothesis, we introduce DEEP-ICL, a novel task Definition Enriched ExPert Ensembling methodology for ICL. DEEP-ICL explicitly extracts task definitions f…

    Submitted 16 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.