Showing 1–50 of 152 results for author: Song, F

  1. arXiv:2409.02795  [pdf, other]

    cs.CL

    Towards a Unified View of Preference Learning for Large Language Models: A Survey

    Authors: Bofei Gao, Feifan Song, Yibo Miao, Zefan Cai, Zhe Yang, Liang Chen, Helan Hu, Runxin Xu, Qingxiu Dong, Ce Zheng, Wen Xiao, Ge Zhang, Daoguang Zan, Keming Lu, Bowen Yu, Dayiheng Liu, Zeyu Cui, Jian Yang, Lei Sha, Houfeng Wang, Zhifang Sui, Peiyi Wang, Tianyu Liu, Baobao Chang

    Abstract: Large Language Models (LLMs) exhibit remarkably powerful capabilities. One of the crucial factors to achieve success is aligning the LLM's output with human preferences. This alignment process often requires only a small amount of data to efficiently enhance the LLM's performance. While effective, research in this area spans multiple domains, and the methods involved are relatively complex to unde…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: Initial Commit, 21 pages

  2. arXiv:2408.15503  [pdf, other]

    cs.CV cs.AI

    RoboSense: Large-scale Dataset and Benchmark for Multi-sensor Low-speed Autonomous Driving

    Authors: Haisheng Su, Feixiang Song, Cong Ma, Panpan Cai, Wei Wu, Cewu Lu

    Abstract: Robust object detection and tracking under arbitrary sight of view is challenging yet essential for the development of Autonomous Vehicle technology. With the growing demand of unmanned function vehicles, near-field scene understanding becomes an important research topic in the areas of low-speed autonomous driving. Due to the complexity of driving conditions and diversity of near obstacles such a…

    Submitted 27 August, 2024; originally announced August 2024.

  3. arXiv:2408.04194  [pdf, other]

    cs.SE cs.CR

    FDI: Attack Neural Code Generation Systems through User Feedback Channel

    Authors: Zhensu Sun, Xiaoning Du, Xiapu Luo, Fu Song, David Lo, Li Li

    Abstract: Neural code generation systems have recently attracted increasing attention to improve developer productivity and speed up software development. Typically, these systems maintain a pre-trained neural model and make it available to general users as a service (e.g., through remote APIs) and incorporate a feedback mechanism to extensively collect and utilize the users' reaction to the generated code,…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: Accepted by ISSTA'24

  4. arXiv:2407.18035  [pdf, other]

    cs.CV cs.AI cs.CL

    RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models

    Authors: Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Sixiang Chen, Tian Ye, Renjing Pei, Kaiwen Zhou, Fenglong Song, Lei Zhu

    Abstract: Natural images captured by mobile devices often suffer from multiple types of degradation, such as noise, blur, and low light. Traditional image restoration methods require manual selection of specific tasks, algorithms, and execution sequences, which is time-consuming and may yield suboptimal results. All-in-one models, though capable of handling multiple tasks, typically support only a limited r…

    Submitted 25 July, 2024; originally announced July 2024.

  5. arXiv:2407.13292  [pdf, other]

    cs.SD cs.CL eess.AS

    Low-Resourced Speech Recognition for Iu Mien Language via Weakly-Supervised Phoneme-based Multilingual Pre-training

    Authors: Lukuan Dong, Donghong Qin, Fengbo Bai, Fanhua Song, Yan Liu, Chen Xu, Zhijian Ou

    Abstract: The mainstream automatic speech recognition (ASR) technology usually requires hundreds to thousands of hours of annotated speech data. Three approaches to low-resourced ASR are phoneme or subword based supervised pre-training, and self-supervised pre-training over multilingual data. The Iu Mien language is the main ethnic language of the Yao ethnic group in China and is low-resourced in the sense…

    Submitted 18 July, 2024; originally announced July 2024.

  6. arXiv:2407.09935  [pdf, other]

    cs.CV cs.MM eess.IV

    LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation

    Authors: Jiacheng Li, Chang Chen, Fenglong Song, Youliang Yan, Zhiwei Xiong

    Abstract: Image resampling is a basic technique that is widely employed in daily applications, such as camera photo editing. Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors. Still, these methods are not the perfect substitute for interpolation, due to the drawbacks in efficiency and versatility. In this work, we propose a novel method of Lea…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Code: https://github.com/ddlee-cn/LeRF-PyTorch

  7. arXiv:2407.08109  [pdf, other]

    cs.CV cs.AI cs.LG

    Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter

    Authors: Suqi Song, Chenxu Zhang, Peng Zhang, Pengkun Li, Fenglong Song, Lei Zhang

    Abstract: Urban waterlogging poses a major risk to public safety and infrastructure. Conventional methods using water-level sensors need high-maintenance to hardly achieve full coverage. Recent advances employ surveillance camera imagery and deep learning for detection, yet these struggle amidst scarce data and adverse environmental conditions. In this paper, we establish a challenging Urban Waterlogging Be…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  8. arXiv:2407.02158  [pdf, other]

    cs.CV

    UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks

    Authors: Jingjing Ren, Wenbo Li, Haoyu Chen, Renjing Pei, Bin Shao, Yong Guo, Long Peng, Fenglong Song, Lei Zhu

    Abstract: Ultra-high-resolution image generation poses great challenges, such as increased semantic planning complexity and detail synthesis difficulties, alongside substantial training resource demands. We present UltraPixel, a novel architecture utilizing cascade diffusion models to generate high-quality images at multiple resolutions (e.g., 1K to 6K) within a single model, while maintaining comp…

    Submitted 4 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Project page https://jingjingrenabc.github.io/ultrapixel

  9. arXiv:2406.15873  [pdf, other]

    physics.comp-ph cs.LG physics.chem-ph

    NeuralSCF: Neural network self-consistent fields for density functional theory

    Authors: Feitong Song, Ji Feng

    Abstract: Kohn-Sham density functional theory (KS-DFT) has found widespread application in accurate electronic structure calculations. However, it can be computationally demanding especially for large-scale simulations, motivating recent efforts toward its machine-learning (ML) acceleration. We propose a neural network self-consistent fields (NeuralSCF) framework that establishes the Kohn-Sham density map a…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 14 pages, 4 figures

  10. arXiv:2406.11490  [pdf, other]

    cs.LG stat.ME

    Interventional Imbalanced Multi-Modal Representation Learning via $β$-Generalization Front-Door Criterion

    Authors: Yi Li, Jiangmeng Li, Fei Song, Qingmeng Zhu, Changwen Zheng, Wenwen Qiang

    Abstract: Multi-modal methods establish comprehensive superiority over uni-modal methods. However, the imbalanced contributions of different modalities to task-dependent predictions constantly degrade the discriminative performance of canonical multi-modal methods. Based on the contribution to task-dependent predictions, modalities can be identified as predominant and auxiliary modalities. Benchmark methods…

    Submitted 17 June, 2024; originally announced June 2024.

  11. arXiv:2405.11770  [pdf, other]

    cs.CV

    Learning Spatial Similarity Distribution for Few-shot Object Counting

    Authors: Yuanwu Xu, Feifan Song, Haofeng Zhang

    Abstract: Few-shot object counting aims to count the number of objects in a query image that belong to the same class as the given exemplar images. Existing methods compute the similarity between the query image and exemplars in the 2D spatial domain and perform regression to obtain the counting number. However, these methods overlook the rich information about the spatial distribution of similarity on the…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Accepted to IJCAI2024

  12. arXiv:2404.04281  [pdf]

    cs.CL cs.AI

    Similar Data Points Identification with LLM: A Human-in-the-loop Strategy Using Summarization and Hidden State Insights

    Authors: Xianlong Zeng, Fanghao Song, Ang Liu

    Abstract: This study introduces a simple yet effective method for identifying similar data points across non-free text domains, such as tabular and image data, using Large Language Models (LLMs). Our two-step approach involves data point summarization and hidden state extraction. Initially, data is condensed via summarization using an LLM, reducing complexity and highlighting essential information in senten…

    Submitted 2 April, 2024; originally announced April 2024.

  13. arXiv:2404.03327  [pdf, other]

    cs.CV eess.IV

    DI-Retinex: Digital-Imaging Retinex Theory for Low-Light Image Enhancement

    Authors: Shangquan Sun, Wenqi Ren, Jingyang Peng, Fenglong Song, Xiaochun Cao

    Abstract: Many existing methods for low-light image enhancement (LLIE) based on Retinex theory ignore important factors that affect the validity of this theory in digital imaging, such as noise, quantization error, non-linearity, and dynamic range overflow. In this paper, we propose a new expression called Digital-Imaging Retinex theory (DI-Retinex) through theoretical and experimental analysis of Retinex t…

    Submitted 4 April, 2024; originally announced April 2024.

  14. arXiv:2403.20204  [pdf, other]

    cs.AI

    The Future of Combating Rumors? Retrieval, Discrimination, and Generation

    Authors: Junhao Xu, Longdi Xian, Zening Liu, Mingliang Chen, Qiuyang Yin, Fenghua Song

    Abstract: Artificial Intelligence Generated Content (AIGC) technology development has facilitated the creation of rumors with misinformation, impacting societal, economic, and political ecosystems, challenging democracy. Current rumor detection efforts fall short by merely labeling potentially misinformation (classification task), inadequately addressing the issue, and it is unrealistic to have authoritativ…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 8 pages

    MSC Class: 68T99

  15. arXiv:2403.11124  [pdf, other]

    cs.CL cs.AI

    Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment

    Authors: Feifan Song, Bowen Yu, Hao Lang, Haiyang Yu, Fei Huang, Houfeng Wang, Yongbin Li

    Abstract: Alignment with human preference prevents large language models (LLMs) from generating misleading or toxic content while requiring high-cost human feedback. Assuming resources of human annotation are limited, there are two different ways of allocating considered: more diverse PROMPTS or more diverse RESPONSES to be labeled. Nonetheless, a straightforward comparison between their impact is absent. I…

    Submitted 30 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by LREC-COLING 2024

  16. arXiv:2402.13506  [pdf, other]

    cs.CR cs.SE

    Towards Efficient Verification of Constant-Time Cryptographic Implementations

    Authors: Luwei Cai, Fu Song, Taolue Chen

    Abstract: Timing side-channel attacks exploit secret-dependent execution time to fully or partially recover secrets of cryptographic implementations, posing a severe threat to software security. Constant-time programming discipline is an effective software-based countermeasure against timing side-channel attacks, but developing constant-time implementations turns out to be challenging and error-prone. Curre…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted by ACM FSE 2024

  17. arXiv:2402.09320  [pdf, other]

    cs.CL cs.AI

    ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization

    Authors: Feifan Song, Yuxuan Fan, Xin Zhang, Peiyi Wang, Houfeng Wang

    Abstract: Large Language Models (LLMs) rely on Human Preference Alignment (HPA) to ensure the generation of safe content. Due to the heavy cost associated with fine-tuning, fine-tuning-free methods have emerged, typically modifying LLM decoding with external auxiliary methods. However, these methods do not essentially enhance the LLM itself. In this paper, we rethink the derivation procedures of DPO, based…

    Submitted 14 February, 2024; originally announced February 2024.

  18. arXiv:2401.17133  [pdf, other]

    cs.SD cs.AI cs.CR cs.LG cs.MM eess.AS

    A Proactive and Dual Prevention Mechanism against Illegal Song Covers empowered by Singing Voice Conversion

    Authors: Guangke Chen, Yedi Zhang, Fu Song, Ting Wang, Xiaoning Du, Yang Liu

    Abstract: Singing voice conversion (SVC) automates song covers by converting one singer's singing voice into another target singer's singing voice with the original lyrics and melody. However, it raises serious concerns about copyright and civil right infringements to multiple entities. This work proposes SongBsAb, the first proactive approach to mitigate unauthorized SVC-based illegal song covers. SongBsAb…

    Submitted 30 January, 2024; originally announced January 2024.

  19. arXiv:2401.14166  [pdf, other]

    cs.CL cs.AI

    BayesPrompt: Prompting Large-Scale Pre-Trained Language Models on Few-shot Inference via Debiased Domain Abstraction

    Authors: Jiangmeng Li, Fei Song, Yifan Jin, Wenwen Qiang, Changwen Zheng, Fuchun Sun, Hui Xiong

    Abstract: As a novel and effective fine-tuning paradigm based on large-scale pre-trained language models (PLMs), prompt-tuning aims to reduce the gap between downstream tasks and pre-training objectives. While prompt-tuning has yielded continuous advancements in various tasks, such an approach still remains a persistent defect: prompt-tuning methods fail to generalize to specific few-shot patterns. From the…

    Submitted 20 March, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted by ICLR2024

  20. arXiv:2401.09964  [pdf, other]

    cs.SE cs.AI

    When Neural Code Completion Models Size up the Situation: Attaining Cheaper and Faster Completion through Dynamic Model Inference

    Authors: Zhensu Sun, Xiaoning Du, Fu Song, Shangwen Wang, Li Li

    Abstract: Leveraging recent advancements in large language models, modern neural code completion models have demonstrated the capability to generate highly accurate code suggestions. However, their massive size poses challenges in terms of computational costs and environmental impact, hindering their widespread adoption in practical scenarios. Dynamic inference emerges as a promising solution, as it allocat…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted to ICSE24

  21. arXiv:2312.13310  [pdf, other]

    eess.IV cs.CV

    Computational Spectral Imaging with Unified Encoding Model: A Comparative Study and Beyond

    Authors: Xinyuan Liu, Lizhi Wang, Lingen Li, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan

    Abstract: Computational spectral imaging is drawing increasing attention owing to the snapshot advantage, and amplitude, phase, and wavelength encoding systems are three types of representative implementations. Fairly comparing and understanding the performance of these systems is essential, but challenging due to the heterogeneity in encoding design. To overcome this limitation, we propose the unified enco…

    Submitted 20 December, 2023; originally announced December 2023.

  22. arXiv:2312.12833  [pdf, other]

    eess.IV cs.CV

    Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence

    Authors: Hongyuan Wang, Lizhi Wang, Jiang Xu, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan

    Abstract: Spectral super-resolution that aims to recover hyperspectral image (HSI) from easily obtainable RGB image has drawn increasing interest in the field of computational photography. The crucial aspect of spectral super-resolution lies in exploiting the correlation within HSIs. However, two types of bottlenecks in existing Transformers limit performance improvement and practical applications. First, e…

    Submitted 18 March, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  23. arXiv:2312.04763  [pdf, other]

    cs.IR

    Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective

    Authors: Fangzhou Song, Bin Zhu, Yanbin Hao, Shuo Wang

    Abstract: Learning recipe and food image representation in common embedding space is non-trivial but crucial for cross-modal recipe retrieval. In this paper, we propose a new perspective for this problem by utilizing foundation models for data augmentation. Leveraging on the remarkable capabilities of foundation models (i.e., Llama2 and SAM), we propose to augment recipe and food image by extracting alignab…

    Submitted 17 July, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: ECCV2024

  24. arXiv:2311.03723  [pdf, ps, other]

    quant-ph cs.CR

    Generalized Hybrid Search and Applications to Blockchain and Hash Function Security

    Authors: Alexandru Cojocaru, Juan Garay, Fang Song

    Abstract: In this work we first examine the hardness of solving various search problems by hybrid quantum-classical strategies, namely, by algorithms that have both quantum and classical capabilities. We then construct a hybrid quantum-classical search algorithm and analyze its success probability. Regarding the former, for search problems that are allowed to have multiple solutions and in which the input i…

    Submitted 6 November, 2023; originally announced November 2023.

  25. arXiv:2310.14464  [pdf, ps, other]

    quant-ph cs.CR

    A Cryptographic Perspective on the Verifiability of Quantum Advantage

    Authors: Nai-Hui Chia, Honghao Fu, Fang Song, Penghui Yao

    Abstract: In recent years, achieving verifiable quantum advantage on a NISQ device has emerged as an important open problem in quantum information. The sampling-based quantum advantages are not known to have efficient verification methods. This paper investigates the verification of quantum advantage from a cryptographic perspective. We establish a strong connection between the verifiability of quantum adva…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: 21 pages, 2 figures

  26. arXiv:2310.08861  [pdf, other]

    cs.CV

    Re-initialization-free Level Set Method via Molecular Beam Epitaxy Equation Regularization for Image Segmentation

    Authors: Fanghui Song, Jiebao Sun, Shengzhu Shi, Zhichang Guo, Dazhi Zhang

    Abstract: Variational level set method has become a powerful tool in image segmentation due to its ability to handle complex topological changes and maintain continuity and smoothness in the process of evolution. However its evolution process can be unstable, which results in over flatted or over sharpened contours and segmentation failure. To improve the accuracy and stability of evolution, we propose a hi…

    Submitted 26 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

  27. arXiv:2310.01844  [pdf, other]

    cs.RO

    Semi-Aerodynamic Model Aided Invariant Kalman Filtering for UAV Full-State Estimation

    Authors: Xiaoyu Ye, Fujun Song, Zongyu Zhang, Rui Zhang, Qinghua Zeng

    Abstract: Due to the state trajectory-independent features of invariant Kalman filtering (InEKF), it has attracted widespread attention in the research community for its significantly improved state estimation accuracy and convergence under disturbance. In this paper, we formulate the full-source data fusion navigation problem for fixed-wing unmanned aerial vehicle (UAV) within a framework based on error st…

    Submitted 3 October, 2023; originally announced October 2023.

  28. arXiv:2309.11002  [pdf, other]

    cs.CV

    PPD: A New Valet Parking Pedestrian Fisheye Dataset for Autonomous Driving

    Authors: Zizhang Wu, Xinyuan Chen, Fan Song, Yuanzhu Gan, Tianhao Xu, Jian Pu, Rui Tang

    Abstract: Pedestrian detection under valet parking scenarios is fundamental for autonomous driving. However, the presence of pedestrians can be manifested in a variety of ways and postures under imperfect ambient conditions, which can adversely affect detection performance. Furthermore, models trained on public datasets that include pedestrians generally provide suboptimal outcomes for these valet parking sc…

    Submitted 24 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: 9 pages, 6 figures

  29. arXiv:2309.08941  [pdf, ps, other]

    quant-ph cs.CC cs.CR

    Quantum Pseudorandom Scramblers

    Authors: Chuhan Lu, Minglong Qin, Fang Song, Penghui Yao, Mingnan Zhao

    Abstract: Quantum pseudorandom state generators (PRSGs) have stimulated exciting developments in recent years. A PRSG, on a fixed initial (e.g., all-zero) state, produces an output state that is computationally indistinguishable from a Haar random state. However, pseudorandomness of the output state is not guaranteed on other initial states. In fact, known PRSG constructions provably fail on some initial st…

    Submitted 16 September, 2023; originally announced September 2023.

  30. arXiv:2309.07983  [pdf, other]

    cs.CR cs.LG cs.MM cs.SD eess.AS

    SLMIA-SR: Speaker-Level Membership Inference Attacks against Speaker Recognition Systems

    Authors: Guangke Chen, Yedi Zhang, Fu Song

    Abstract: Membership inference attacks allow adversaries to determine whether a particular example was contained in the model's training dataset. While previous works have confirmed the feasibility of such attacks in various applications, none has focused on speaker recognition (SR), a promising voice-based biometric recognition technique. In this work, we propose SLMIA-SR, the first membership inference at…

    Submitted 27 November, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: In Proceedings of the 31st Network and Distributed System Security (NDSS) Symposium, 2024

  31. arXiv:2309.02144  [pdf, other]

    cs.CL cs.AI cs.LG

    Making Large Language Models Better Reasoners with Alignment

    Authors: Peiyi Wang, Lei Li, Liang Chen, Feifan Song, Binghuai Lin, Yunbo Cao, Tianyu Liu, Zhifang Sui

    Abstract: Reasoning is a cognitive process of using evidence to reach a sound conclusion. The reasoning capability is essential for large language models (LLMs) to serve as the brain of the artificial general intelligence agent. Recent studies reveal that fine-tuning LLMs on data with the chain of thought (COT) reasoning process can significantly enhance their reasoning capabilities. However, we find that t…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Large Language Models; Reasoning; Alignment

  32. CodeMark: Imperceptible Watermarking for Code Datasets against Neural Code Completion Models

    Authors: Zhensu Sun, Xiaoning Du, Fu Song, Li Li

    Abstract: Code datasets are of immense value for training neural-network-based code completion models, where companies or organizations have made substantial investments to establish and process these datasets. Unluckily, these datasets, either built for proprietary or public usage, face the high risk of unauthorized exploits, resulting from data leakages, license violations, etc. Even worse, the ``black-bo…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted to FSE 2023

  33. arXiv:2308.07590  [pdf, other]

    cs.CV

    ADD: An Automatic Desensitization Fisheye Dataset for Autonomous Driving

    Authors: Zizhang Wu, Chenxin Yuan, Hongyang Wei, Fan Song, Tianhao Xu

    Abstract: Autonomous driving systems require many images for analyzing the surrounding environment. However, there is fewer data protection for private information among these captured images, such as pedestrian faces or vehicle license plates, which has become a significant issue. In this paper, in response to the call for data security laws and regulations and based on the advantages of large Field of Vie…

    Submitted 15 August, 2023; originally announced August 2023.

    Journal ref: Engineering Applications of Artificial Intelligence 2023

  34. AutoAssign+: Automatic Shared Embedding Assignment in Streaming Recommendation

    Authors: Ziru Liu, Kecheng Chen, Fengyi Song, Bo Chen, Xiangyu Zhao, Huifeng Guo, Ruiming Tang

    Abstract: In the domain of streaming recommender systems, conventional methods for addressing new user IDs or item IDs typically involve assigning initial ID embeddings randomly. However, this practice results in two practical challenges: (i) Items or users with limited interactive data may yield suboptimal prediction performance. (ii) Embedding new IDs or low-frequency IDs necessitates consistently expandi…

    Submitted 14 August, 2023; originally announced August 2023.

    Journal ref: Knowledge and Information Systems 2023

  35. arXiv:2308.05014  [pdf, other]

    cs.SE cs.LG

    A Comprehensive Empirical Study of Bugs in Open-Source Federated Learning Frameworks

    Authors: Weijie Shao, Yuyang Gao, Fu Song, Sen Chen, Lingling Fan, JingZhu He

    Abstract: Federated learning (FL) is a distributed machine learning (ML) paradigm, allowing multiple clients to collaboratively train shared machine learning (ML) models without exposing clients' data privacy. It has gained substantial popularity in recent years, especially since the enforcement of data protection laws and regulations in many countries. To foster the application of FL, a variety of FL frame…

    Submitted 6 October, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

  36. arXiv:2307.15907  [pdf, other]

    cs.LG cs.FL

    An Automata-Theoretic Approach to Synthesizing Binarized Neural Networks

    Authors: Ye Tao, Wanwei Liu, Fu Song, Zhen Liang, Ji Wang, Hongxu Zhu

    Abstract: Deep neural networks, (DNNs, a.k.a. NNs), have been widely used in various tasks and have been proven to be successful. However, the accompanied expensive computing and storage costs make the deployments in resource-constrained devices a significant concern. To solve this issue, quantization has emerged as an effective way to reduce the costs of DNNs with little accuracy degradation by quantizing…

    Submitted 29 July, 2023; originally announced July 2023.

  37. ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading

    Authors: Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee

    Abstract: While state-of-the-art Text-to-Speech systems can generate natural speech of very high quality at sentence level, they still meet great challenges in speech generation for paragraph / long-form reading. Such deficiencies are due to i) ignorance of cross-sentence contextual information, and ii) high computation and memory cost for long-form synthesis. To address these issues, this work develops a l…

    Submitted 7 October, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 5 pages, 4 figures, Proceedings of Interspeech 2023

  38. arXiv:2307.00561  [pdf, other]

    cs.CR cs.AR cs.SE

    SAT-based Formal Fault-Resistance Verification of Cryptographic Circuits

    Authors: Huiyu Tan, Pengfei Gao, Taolue Chen, Fu Song, Zhilin Wu

    Abstract: Fault injection attacks represent a type of active, physical attack against cryptographic circuits. Various countermeasures have been proposed to thwart such attacks, the design and implementation of which are, however, intricate, error-prone, and laborious. The current formal fault-resistance verification approaches are limited in efficiency and scalability. In this paper, we formalize the fault-…

    Submitted 2 July, 2023; originally announced July 2023.

  39. arXiv:2306.17492  [pdf, other]

    cs.CL cs.AI

    Preference Ranking Optimization for Human Alignment

    Authors: Feifan Song, Bowen Yu, Minghao Li, Haiyang Yu, Fei Huang, Yongbin Li, Houfeng Wang

    Abstract: Large language models (LLMs) often contain misleading content, emphasizing the need to align them with human values to ensure secure AI systems. Reinforcement learning from human feedback (RLHF) has been employed to achieve this alignment. However, it encompasses two main drawbacks: (1) RLHF exhibits complexity, instability, and sensitivity to hyperparameters in contrast to SFT. (2) Despite massiv…

    Submitted 27 February, 2024; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: Accepted by AAAI 2024

  40. arXiv:2305.16596  [pdf, other]

    cs.CR cs.PL cs.SE

    Automated Verification of Correctness for Masked Arithmetic Programs

    Authors: Mingyang Liu, Fu Song, Taolue Chen

    Abstract: Masking is a widely-used effective countermeasure against power side-channel attacks for implementing cryptographic algorithms. Surprisingly, few formal verification techniques have addressed a fundamental question, i.e., whether the masked program and the original (unmasked) cryptographic algorithm are functional equivalent. In this paper, we study this problem for masked arithmetic programs over…

    Submitted 25 May, 2023; originally announced May 2023.

  41. arXiv:2305.14097  [pdf, other]

    cs.CR cs.LG cs.MM cs.SD eess.AS

    QFA2SR: Query-Free Adversarial Transfer Attacks to Speaker Recognition Systems

    Authors: Guangke Chen, Yedi Zhang, Zhe Zhao, Fu Song

    Abstract: Current adversarial attacks against speaker recognition systems (SRSs) require either white-box access or heavy black-box queries to the target SRS, thus still falling behind practical attacks against proprietary commercial APIs and voice-controlled devices. To fill this gap, we propose QFA2SR, an effective and imperceptible query-free black-box attack, by leveraging the transferability of adversa…

    Submitted 23 September, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted by the 32nd USENIX Security Symposium (2023 USENIX Security); Full Version

  42. arXiv:2304.08244  [pdf, other]

    cs.CL cs.AI

    API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs

    Authors: Minghao Li, Yingxiu Zhao, Bowen Yu, Feifan Song, Hangyu Li, Haiyang Yu, Zhoujun Li, Fei Huang, Yongbin Li

    Abstract: Recent research has demonstrated that Large Language Models (LLMs) can enhance their capabilities by utilizing external tools. However, three pivotal questions remain unanswered: (1) How effective are current LLMs in utilizing tools? (2) How can we enhance LLMs' ability to utilize tools? (3) What obstacles need to be overcome to leverage tools? To address these questions, we introduce API-Bank, a…

    Submitted 25 October, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

    Comments: EMNLP 2023

  43. arXiv:2212.11138  [pdf, other]

    cs.CR cs.AI cs.SE

    QVIP: An ILP-based Formal Verification Approach for Quantized Neural Networks

    Authors: Yedi Zhang, Zhe Zhao, Fu Song, Min Zhang, Taolue Chen, Jun Sun

    Abstract: Deep learning has become a promising programming paradigm in software development, owing to its surprising performance in solving many challenging tasks. Deep neural networks (DNNs) are increasingly being deployed in practice, but are limited on resource-constrained devices owing to their demand for computational power. Quantization has emerged as a promising technique to reduce the size of DNNs w…

    Submitted 9 December, 2022; originally announced December 2022.

    Comments: Accepted in ASE 2022

  44. arXiv:2212.02781  [pdf, other]

    cs.LG cs.AI

    QEBVerif: Quantization Error Bound Verification of Neural Networks

    Authors: Yedi Zhang, Fu Song, Jun Sun

    Abstract: To alleviate the practical constraints for deploying deep neural networks (DNNs) on edge devices, quantization is widely regarded as one promising technique. It reduces the resource requirements for computational power and storage space by quantizing the weights and/or activation tensors of a DNN into lower bit-width fixed-point numbers, resulting in quantized neural networks (QNNs). While it has…

    Submitted 23 May, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

  45. arXiv:2211.14275  [pdf, other]

    cs.LG cs.AI cs.CL

    Solving math word problems with process- and outcome-based feedback

    Authors: Jonathan Uesato, Nate Kushman, Ramana Kumar, Francis Song, Noah Siegel, Lisa Wang, Antonia Creswell, Geoffrey Irving, Irina Higgins

    Abstract: Recent work has shown that asking language models to generate reasoning steps improves performance on many reasoning tasks. When moving beyond prompting, this raises the question of how we should supervise such models: outcome-based approaches which supervise the final result, or process-based approaches which supervise the reasoning process itself? Differences between these approaches might natur…

    Submitted 25 November, 2022; originally announced November 2022.

  46. arXiv:2211.03058  [pdf, other]

    cs.CV eess.IV

    Towards Real World HDRTV Reconstruction: A Data Synthesis-based Approach

    Authors: Zhen Cheng, Tao Wang, Yong Li, Fenglong Song, Chang Chen, Zhiwei Xiong

    Abstract: Existing deep learning based HDRTV reconstruction methods assume one kind of tone mapping operators (TMOs) as the degradation procedure to synthesize SDRTV-HDRTV pairs for supervised training. In this paper, we argue that, although traditional TMOs exploit efficient dynamic range compression priors, they have several drawbacks on modeling the realistic degradation: information over-preservation, c…

    Submitted 6 November, 2022; originally announced November 2022.

  47. arXiv:2210.11153  [pdf, other]

    eess.IV cs.CV

    Reversed Image Signal Processing and RAW Reconstruction. AIM 2022 Challenge Report

    Authors: Marcos V. Conde, Radu Timofte, Yibin Huang, Jingyang Peng, Chang Chen, Cheng Li, Eduardo Pérez-Pellitero, Fenglong Song, Furui Bai, Shuai Liu, Chaoyu Feng, Xiaotao Wang, Lei Lei, Yu Zhu, Chenghua Li, Yingying Jiang, Yong A, Peisong Wang, Cong Leng, Jian Cheng, Xiaoyu Liu, Zhicun Yin, Zhilu Zhang, Junyi Li, Ming Liu, et al. (18 additional authors not shown)

    Abstract: Cameras capture sensor RAW images and transform them into pleasant RGB images, suitable for the human eyes, using their integrated Image Signal Processor (ISP). Numerous low-level vision tasks operate in the RAW domain (e.g. image denoising, white balance) due to its linear relationship with the scene irradiance, wide range of information at 12 bits, and sensor designs. Despite this, RAW image data…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: ECCV 2022 Advances in Image Manipulation (AIM) workshop

  48. arXiv:2210.03337  [pdf, other]

    cs.CL cs.AI

    A Unified Framework for Multi-intent Spoken Language Understanding with prompting

    Authors: Feifan Song, Lianzhe Huang, Houfeng Wang

    Abstract: Multi-intent Spoken Language Understanding has great potential for widespread implementation. Jointly modeling Intent Detection and Slot Filling in it provides a channel to exploit the correlation between intents and slots. However, current approaches are apt to formulate these two sub-tasks differently, which leads to two issues: 1) It hinders models from effective extraction of shared features.…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Work in progress

  49. arXiv:2209.10887  [pdf, other]

    cs.SD cs.CL eess.AS

    A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS

    Authors: Haohan Guo, Fenglong Xie, Frank K. Soong, Xixin Wu, Helen Meng

    Abstract: We propose a Multi-Stage, Multi-Codebook (MSMC) approach to high-performance neural TTS synthesis. A vector-quantized, variational autoencoder (VQ-VAE) based feature analyzer is used to encode Mel spectrograms of speech training data by down-sampling progressively in multiple stages into MSMC Representations (MSMCRs) with different time resolutions, and quantizing them with multiple VQ codebooks,…

    Submitted 22 September, 2022; originally announced September 2022.

  50. arXiv:2209.06484  [pdf, other]

    cs.SD cs.CL eess.AS

    ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS

    Authors: Liumeng Xue, Frank K. Soong, Shaofei Zhang, Lei Xie

    Abstract: Recent advancements in neural end-to-end TTS models have shown high-quality, natural synthesized speech in a conventional sentence-based TTS. However, it is still challenging to reproduce similar high quality when a whole paragraph is considered in TTS, where a large amount of contextual information needs to be considered in building a paragraph-based TTS model. To alleviate the difficulty in trai…

    Submitted 14 September, 2022; originally announced September 2022.

    Comments: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing