Showing 1–50 of 123 results for author: Liao, C

Searching in archive cs.
  1. arXiv:2409.00918  [pdf, other]

    cs.DC

    LuWu: An End-to-End In-Network Out-of-Core Optimizer for 100B-Scale Model-in-Network Data-Parallel Training on Distributed GPUs

    Authors: Mo Sun, Zihan Yang, Changyue Liao, Yingtao Li, Fei Wu, Zeke Wang

    Abstract: The recent progress made in large language models (LLMs) has brought tremendous application prospects to the world. The growing model size demands LLM training on multiple GPUs, while data parallelism is the most popular distributed training strategy due to its simplicity, efficiency, and scalability. Current systems adopt the model-sharded data parallelism to enable memory-efficient training, how…

    Submitted 1 September, 2024; originally announced September 2024.

  2. arXiv:2409.00638  [pdf, other]

    cs.CV

    IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching

    Authors: Gangwei Xu, Xianqi Wang, Zhaoxing Zhang, Junda Cheng, Chunyuan Liao, Xin Yang

    Abstract: Stereo matching is a core component in many computer vision and robotics systems. Despite significant advances over the last decade, handling matching ambiguities in ill-posed regions and large disparities remains an open challenge. In this paper, we propose a new deep network architecture, called IGEV++, for stereo matching. The proposed IGEV++ builds Multi-range Geometry Encoding Volumes (MGEV)…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: 12 pages, 10 figures

  3. arXiv:2408.10995  [pdf, other]

    cs.CL

    CTP-LLM: Clinical Trial Phase Transition Prediction Using Large Language Models

    Authors: Michael Reinisch, Jianfeng He, Chenxi Liao, Sauleh Ahmad Siddiqui, Bei Xiao

    Abstract: New medical treatment development requires multiple phases of clinical trials. Despite the significant human and financial costs of bringing a drug to market, less than 20% of drugs in testing will make it from the first phase to final approval. Recent literature indicates that the design of the trial protocols significantly contributes to trial performance. We investigated Clinical Trial Outcome…

    Submitted 20 August, 2024; originally announced August 2024.

  4. arXiv:2408.09394  [pdf, other]

    cs.NI cs.IT cs.LG

    GRLinQ: An Intelligent Spectrum Sharing Mechanism for Device-to-Device Communications with Graph Reinforcement Learning

    Authors: Zhiwei Shan, Xinping Yi, Le Liang, Chung-Shou Liao, Shi Jin

    Abstract: Device-to-device (D2D) spectrum sharing in wireless communications is a challenging non-convex combinatorial optimization problem, involving entangled link scheduling and power control in a large-scale network. The state-of-the-art methods, either from a model-based or a data-driven perspective, exhibit certain limitations such as the critical need for channel state information (CSI) and/or a larg…

    Submitted 18 August, 2024; originally announced August 2024.

  5. Relevance Filtering for Embedding-based Retrieval

    Authors: Nicholas Rossi, Juexin Lin, Feng Liu, Zhen Yang, Tony Lee, Alessandro Magnani, Ciya Liao

    Abstract: In embedding-based retrieval, Approximate Nearest Neighbor (ANN) search enables efficient retrieval of similar items from large-scale datasets. While maximizing recall of relevant items is usually the goal of retrieval systems, a low precision may lead to a poor search experience. Unlike lexical retrieval, which inherently limits the size of the retrieved set through keyword matching, dense retrie…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: 8 pages, 3 figures, CIKM 2024

    ACM Class: H.3.3

  6. Enhancing Relevance of Embedding-based Retrieval at Walmart

    Authors: Juexin Lin, Sachin Yadav, Feng Liu, Nicholas Rossi, Praveen R. Suram, Satya Chembolu, Prijith Chandran, Hrushikesh Mohapatra, Tony Lee, Alessandro Magnani, Ciya Liao

    Abstract: Embedding-based neural retrieval (EBR) is an effective search retrieval method in product search for tackling the vocabulary gap between customer search queries and products. The initial launch of our EBR system at Walmart yielded significant gains in relevance and add-to-cart rates [1]. However, despite EBR generally retrieving more relevant products for reranking, we have observed numerous insta…

    Submitted 14 August, 2024; v1 submitted 9 August, 2024; originally announced August 2024.

    Comments: 8 pages, 3 figures, CIKM 2024

    ACM Class: H.3.3

  7. arXiv:2408.00753  [pdf]

    eess.SP cs.AI

    A deep learning-enabled smart garment for versatile sleep behaviour monitoring

    Authors: Chenyu Tang, Wentian Yi, Muzi Xu, Yuxuan Jin, Zibo Zhang, Xuhang Chen, Caizhi Liao, Peter Smielewski, Luigi G. Occhipinti

    Abstract: Continuous monitoring and accurate detection of complex sleep patterns associated to different sleep-related conditions is essential, not only for enhancing sleep quality but also for preventing the risk of developing chronic illnesses associated to unhealthy sleep. Despite significant advances in research, achieving versatile recognition of various unhealthy and sub-healthy sleep patterns with si…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: 18 pages, 5 figures, 1 table

  8. arXiv:2407.14568  [pdf, other]

    cs.CL cs.AI cs.DB

    SQLfuse: Enhancing Text-to-SQL Performance through Comprehensive LLM Synergy

    Authors: Tingkai Zhang, Chaoyu Chen, Cong Liao, Jun Wang, Xudong Zhao, Hang Yu, Jianchao Wang, Jianguo Li, Wenhui Shi

    Abstract: Text-to-SQL conversion is a critical innovation, simplifying the transition from complex SQL to intuitive natural language queries, especially significant given SQL's prevalence in the job market across various roles. The rise of Large Language Models (LLMs) like GPT-3.5 and GPT-4 has greatly advanced this field, offering improved natural language understanding and the ability to generate nuanced…

    Submitted 19 July, 2024; originally announced July 2024.

  9. arXiv:2407.12851  [pdf]

    cs.CL

    ISPO: An Integrated Ontology of Symptom Phenotypes for Semantic Integration of Traditional Chinese Medical Data

    Authors: Zixin Shu, Rui Hua, Dengying Yan, Chenxia Lu, Ning Xu, Jun Li, Hui Zhu, Jia Zhang, Dan Zhao, Chenyang Hui, Junqiu Ye, Chu Liao, Qi Hao, Wen Ye, Cheng Luo, Xinyan Wang, Chuang Cheng, Xiaodong Li, Baoyan Liu, Xiaji Zhou, Runshun Zhang, Min Xu, Xuezhong Zhou

    Abstract: Symptom phenotypes are one of the key types of manifestations for diagnosis and treatment of various disease conditions. However, the diversity of symptom terminologies is one of the major obstacles hindering the analysis and knowledge sharing of various types of symptom-related medical data particularly in the fields of Traditional Chinese Medicine (TCM). Objective: This study aimed to construct…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 39 pages, 6 figures, 6 tables

  10. arXiv:2406.18087  [pdf, other]

    cs.SE cs.AI cs.CL

    EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models

    Authors: Chun-Chieh Liao, Wei-Ting Kuo, I-Hsuan Hu, Yen-Chen Shih, Jun-En Ding, Feng Liu, Fang-Ming Hung

    Abstract: Traditional diagnosis of chronic diseases involves in-person consultations with physicians to identify the disease. However, there is a lack of research focused on predicting and developing application systems using clinical notes and blood test values. We collected five years of Electronic Health Records (EHRs) from Taiwan's hospital database between 2017 and 2021 as an AI database. Furthermore,…

    Submitted 26 June, 2024; originally announced June 2024.

  11. arXiv:2406.05982  [pdf]

    eess.IV cs.LG physics.med-ph

    Artificial Intelligence for Neuro MRI Acquisition: A Review

    Authors: Hongjia Yang, Guanhua Wang, Ziyu Li, Haoxiang Li, Jialan Zheng, Yuxin Hu, Xiaozhi Cao, Congyu Liao, Huihui Ye, Qiyuan Tian

    Abstract: Magnetic resonance imaging (MRI) has significantly benefited from the resurgence of artificial intelligence (AI). By leveraging AI's capabilities in large-scale optimization and pattern recognition, innovative methods are transforming the MRI acquisition workflow, including planning, sequence design, and correction of acquisition artifacts. These emerging algorithms demonstrate substantial potenti…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Magn Reson Mater Phy (2024)

  12. arXiv:2406.04151  [pdf, other]

    cs.AI cs.CL

    AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

    Authors: Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

    Abstract: Building generalist agents that can handle diverse tasks and evolve themselves across different environments is a long-term goal in the AI community. Large language models (LLMs) are considered a promising foundation to build such agents due to their generalized capabilities. Current approaches either have LLM-based agents imitate expert-provided trajectories step-by-step, requiring human supervis…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Project site: https://agentgym.github.io

  13. arXiv:2406.01436  [pdf, other]

    cs.CL

    Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models

    Authors: Cheng-Hsun Hsueh, Paul Kuo-Ming Huang, Tzu-Han Lin, Che-Wei Liao, Hung-Chieh Fang, Chao-Wei Huang, Yun-Nung Chen

    Abstract: Knowledge editing is a rising technique for efficiently updating factual knowledge in Large Language Models (LLMs) with minimal alteration of parameters. However, recent studies have identified concerning side effects, such as knowledge distortion and the deterioration of general abilities, that have emerged after editing. This survey presents a comprehensive study of these side effects, providing…

    Submitted 3 June, 2024; originally announced June 2024.

  14. arXiv:2406.00247  [pdf, other]

    cs.IR cs.AI

    Large Language Models for Relevance Judgment in Product Search

    Authors: Navid Mehrdad, Hrushikesh Mohapatra, Mossaab Bagdouri, Prijith Chandran, Alessandro Magnani, Xunfan Cai, Ajit Puthenputhussery, Sachin Yadav, Tony Lee, ChengXiang Zhai, Ciya Liao

    Abstract: High relevance of retrieved and re-ranked items to the search query is the cornerstone of successful product search, yet measuring relevance of items to queries is one of the most challenging tasks in product information retrieval, and quality of product search is highly influenced by the precision and scale of available relevance-labelled data. In this paper, we present an array of techniques for…

    Submitted 16 July, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

    Comments: 10 pages, 1 figure, 11 tables - SIGIR 2024, LLM4Eval

    ACM Class: H.3.3; I.2.7

  15. arXiv:2403.06504  [pdf, other]

    cs.DC

    Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU

    Authors: Changyue Liao, Mo Sun, Zihan Yang, Kaiqi Chen, Binhang Yuan, Fei Wu, Zeke Wang

    Abstract: Recent advances in large language models have brought immense value to the world, with their superior capabilities stemming from the massive number of parameters they utilize. However, even the GPUs with the highest memory capacities, currently peaking at 80GB, are far from sufficient to accommodate these vast parameters and their associated optimizer states when conducting stochastic gradient des…

    Submitted 11 March, 2024; originally announced March 2024.

  16. arXiv:2402.04416  [pdf, other]

    cs.CV cs.LG

    Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap

    Authors: Christopher Liao, Christian So, Theodoros Tsiligkaridis, Brian Kulis

    Abstract: Domain generalization (DG) is an important problem that learns a model which generalizes to unseen test domains leveraging one or more source domains, under the assumption of shared label spaces. However, most DG methods assume access to abundant source data in the target label space, a requirement that proves overly stringent for numerous real-world applications, where acquiring the same label sp…

    Submitted 29 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  17. arXiv:2402.02662  [pdf, other]

    cs.CV cs.CL cs.LG

    Image-Caption Encoding for Improving Zero-Shot Generalization

    Authors: Eric Yang Yu, Christopher Liao, Sathvik Ravi, Theodoros Tsiligkaridis, Brian Kulis

    Abstract: Recent advances in vision-language models have combined contrastive approaches with generative methods to achieve state-of-the-art (SOTA) on downstream inference tasks like zero-shot image classification. However, a persistent issue of these models for image classification is their out-of-distribution (OOD) generalization capabilities. We first show that when an OOD data point is misclassified, th…

    Submitted 4 February, 2024; originally announced February 2024.

  18. arXiv:2402.01439  [pdf, other]

    cs.LG cs.AI q-bio.BM q-bio.QM

    From Words to Molecules: A Survey of Large Language Models in Chemistry

    Authors: Chang Liao, Yemin Yu, Yu Mei, Ying Wei

    Abstract: In recent years, Large Language Models (LLMs) have achieved significant success in natural language processing (NLP) and various interdisciplinary areas. However, applying LLMs to chemistry is a complex task that requires specialized domain knowledge. This paper provides a thorough exploration of the nuanced methodologies employed in integrating LLMs into the field of chemistry, delving into the c…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Submitted to IJCAI 2024 survey track

  19. arXiv:2401.02143  [pdf, other]

    cs.LG cs.AI cs.IR cs.SI

    Graph Neural Networks for Tabular Data Learning: A Survey with Taxonomy and Directions

    Authors: Cheng-Te Li, Yu-Che Tsai, Chih-Yao Chen, Jay Chiehen Liao

    Abstract: In this survey, we dive into Tabular Data Learning (TDL) using Graph Neural Networks (GNNs), a domain where deep learning-based approaches have increasingly shown superior performance in both classification and regression tasks compared to traditional methods. The survey highlights a critical gap in deep neural TDL methods: the underrepresentation of latent correlations among data instances and fe…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: Under review, ongoing work, Github page: https://github.com/Roytsai27/awesome-GNN4TDL

  20. arXiv:2311.15480  [pdf]

    cs.LG cs.AI cs.CL cs.MM cs.SD

    Automatic Time Signature Determination for New Scores Using Lyrics for Latent Rhythmic Structure

    Authors: Callie C. Liao, Duoduo Liao, Jesse Guessford

    Abstract: There has recently been a sharp increase in interest in Artificial Intelligence-Generated Content (AIGC). Despite this, musical components such as time signatures have not been studied sufficiently to form an algorithmic determination approach for new compositions, especially lyrical songs. This is likely because of the neglect of musical details, which is critical for constructing a robust framew…

    Submitted 28 January, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: Accepted by 2023 IEEE International Conference on Big Data (IEEE BigData 2023)

  21. arXiv:2311.13612  [pdf, other]

    cs.CV

    Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning

    Authors: Christopher Liao, Theodoros Tsiligkaridis, Brian Kulis

    Abstract: Over the past year, a large body of multimodal research has emerged around zero-shot evaluation using GPT descriptors. These studies boost the zero-shot accuracy of pretrained VL models with an ensemble of label-specific text generated by GPT. A recent study, WaffleCLIP, demonstrated that similar zero-shot accuracy can be achieved with an ensemble of random descriptors. However, both zero-shot met…

    Submitted 29 March, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

  22. arXiv:2311.12833  [pdf, other]

    cs.DC cs.AI cs.CL

    HPC-GPT: Integrating Large Language Model for High-Performance Computing

    Authors: Xianzhong Ding, Le Chen, Murali Emani, Chunhua Liao, Pei-Hung Lin, Tristan Vanderbruggen, Zhen Xie, Alberto E. Cerpa, Wan Du

    Abstract: Large Language Models (LLMs), including the LLaMA model, have exhibited their efficacy across various general-domain natural language processing (NLP) tasks. However, their performance in high-performance computing (HPC) domain tasks has been less than optimal due to the specialized expertise required to interpret the model responses. In response to this challenge, we propose HPC-GPT, a novel LLaM…

    Submitted 2 October, 2023; originally announced November 2023.

    Comments: 9 pages

  23. arXiv:2311.07989  [pdf, other]

    cs.CL cs.AI cs.SE

    Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code

    Authors: Ziyin Zhang, Chaoyu Chen, Bingchang Liu, Cong Liao, Zi Gong, Hang Yu, Jianguo Li, Rui Wang

    Abstract: In this work we systematically review the recent advancements in software engineering with language models, covering 70+ models, 40+ evaluation tasks, 180+ datasets, and 900 related works. Unlike previous works, we integrate software engineering (SE) with natural language processing (NLP) by discussing the perspectives of both sides: SE applies language models for development automation, while NLP…

    Submitted 26 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Repo: https://github.com/codefuse-ai/Awesome-Code-LLM. 9 figures, 18 tables, and 902 references. Under review

  24. arXiv:2311.02303  [pdf, other]

    cs.LG cs.AI

    MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning

    Authors: Bingchang Liu, Chaoyu Chen, Cong Liao, Zi Gong, Huan Wang, Zhichao Lei, Ming Liang, Dajun Chen, Min Shen, Hailian Zhou, Hang Yu, Jianguo Li

    Abstract: Code LLMs have emerged as a specialized research field, with remarkable studies dedicated to enhancing model's coding capabilities through fine-tuning on pre-trained models. Previous fine-tuning approaches were typically tailored to specific downstream tasks or scenarios, which meant separate fine-tuning for each task, requiring extensive training resources and posing challenges in terms of deploy…

    Submitted 3 November, 2023; originally announced November 2023.

  25. arXiv:2310.11478  [pdf, other]

    cs.LG cs.AI cs.CV

    ASP: Automatic Selection of Proxy dataset for efficient AutoML

    Authors: Peng Yao, Chao Liao, Jiyuan Jia, Jianchao Tan, Bin Chen, Chengru Song, Di Zhang

    Abstract: Deep neural networks have gained great success due to the increasing amounts of data, and diverse effective neural network designs. However, it also brings a heavy computing burden as the amount of training data is proportional to the training time. In addition, a well-behaved model requires repeated trials of different structure designs and hyper-parameters, which may take a large amount of time…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: This paper was actually finished in 2021

  26. arXiv:2310.11117  [pdf, other]

    cs.CV cs.AI

    USDC: Unified Static and Dynamic Compression for Visual Transformer

    Authors: Huan Yuan, Chao Liao, Jianchao Tan, Peng Yao, Jiyuan Jia, Bin Chen, Chengru Song, Di Zhang

    Abstract: Visual Transformers have achieved great success in almost all vision tasks, such as classification, detection, and so on. However, the model complexity and the inference speed of the visual transformers hinder their deployments in industrial products. Various model compression techniques focus on directly compressing the visual transformers into a smaller one while maintaining the model performanc…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: This paper was actually finished in 2021

  27. arXiv:2310.07488  [pdf, other]

    cs.CL cs.AI cs.LG

    KwaiYiiMath: Technical Report

    Authors: Jiayi Fu, Lei Lin, Xiaoyang Gao, Pengli Liu, Zhengzong Chen, Zhirui Yang, Shengnan Zhang, Xue Zheng, Yan Li, Yuliang Liu, Xucheng Ye, Yiqiao Liao, Chao Liao, Bin Chen, Chengru Song, Junchen Wan, Zijia Lin, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

    Abstract: Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even on mathematical tasks requiring multi-step reasoning. In this report, we introduce the KwaiYiiMath which enhances the mathematical reasoning abilities of KwaiYiiBase1, by applying Supervised Fine-Tuning (SFT) and Reinforced Lea…

    Submitted 19 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: technical report. arXiv admin note: text overlap with arXiv:2306.16636 by other authors

  28. arXiv:2310.06266  [pdf, other]

    cs.SE cs.AI cs.LG

    CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model

    Authors: Peng Di, Jianguo Li, Hang Yu, Wei Jiang, Wenting Cai, Yang Cao, Chaoyu Chen, Dajun Chen, Hongwei Chen, Liang Chen, Gang Fan, Jie Gong, Zi Gong, Wen Hu, Tingting Guo, Zhichao Lei, Ting Li, Zheng Li, Ming Liang, Cong Liao, Bingchang Liu, Jiachen Liu, Zhiwei Liu, Shaojun Lu, Min Shen , et al. (13 additional authors not shown)

    Abstract: Code Large Language Models (Code LLMs) have gained significant attention in the industry due to their wide applications in the full lifecycle of software engineering. However, the effectiveness of existing models in understanding non-English inputs for multi-lingual code-related tasks is still far from well studied. This paper introduces CodeFuse-13B, an open-sourced pre-trained code LLM. It is sp…

    Submitted 10 January, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted by ICSE-SEIP 2024

  29. arXiv:2310.05193  [pdf, other]

    cs.CV

    Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models

    Authors: Chenzhuang Du, Yue Zhao, Chonghua Liao, Jiacheng You, Jie Fu, Hang Zhao

    Abstract: This paper investigates how to better leverage large-scale pre-trained uni-modal models to further enhance discriminative multi-modal learning. Even when fine-tuned with only uni-modal data, these models can outperform previous multi-modal models in certain tasks. It's clear that their incorporation into multi-modal learning would significantly improve performance. However, multi-modal learning wi…

    Submitted 8 October, 2023; originally announced October 2023.

  30. arXiv:2309.04669  [pdf, other]

    cs.CV

    Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization

    Authors: Yang Jin, Kun Xu, Kun Xu, Liwei Chen, Chao Liao, Jianchao Tan, Quzhe Huang, Bin Chen, Chenyi Lei, An Liu, Chengru Song, Xiaoqiang Lei, Di Zhang, Wenwu Ou, Kun Gai, Yadong Mu

    Abstract: Recently, the remarkable advance of the Large Language Model (LLM) has inspired researchers to transfer its extraordinary reasoning capability to both vision and language data. However, the prevailing approaches primarily regard the visual input as a prompt and focus exclusively on optimizing the text generation process conditioned upon vision content by a frozen LLM. Such an inequitable treatment…

    Submitted 22 March, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: ICLR 2024

  31. arXiv:2308.08649  [pdf, other]

    cs.NE cs.AI

    Towards Zero Memory Footprint Spiking Neural Network Training

    Authors: Bin Lei, Sheng Lin, Pei-Hung Lin, Chunhua Liao, Caiwen Ding

    Abstract: Biologically-inspired Spiking Neural Networks (SNNs), processing information using discrete-time events known as spikes rather than continuous values, have garnered significant attention due to their hardware-friendly and energy-efficient characteristics. However, the training of SNNs necessitates a considerably large memory footprint, given the additional storage requirements for spikes or events…

    Submitted 16 August, 2023; originally announced August 2023.

  32. arXiv:2308.08614  [pdf, other]

    cs.LG cs.AI cs.CL

    Boosting Logical Reasoning in Large Language Models through a New Framework: The Graph of Thought

    Authors: Bin Lei, Pei-Hung Lin, Chunhua Liao, Caiwen Ding

    Abstract: Recent advancements in large-scale models, such as GPT-4, have showcased remarkable capabilities in addressing standard queries. However, when facing complex problems that require multi-step logical reasoning, their accuracy dramatically decreases. Current research has explored the realm of prompting engineering to bolster the inferential capacities of these models. Our paper unveils a pi…

    Submitted 16 August, 2023; originally announced August 2023.

  33. arXiv:2308.08473  [pdf, other]

    cs.SE

    DataRaceBench V1.4.1 and DataRaceBench-ML V0.1: Benchmark Suites for Data Race Detection

    Authors: Le Chen, Wenhao Wu, Stephen F. Siegel, Pei-Hung Lin, Chunhua Liao

    Abstract: Data races pose a significant threat in multi-threaded parallel applications due to their negative impact on program correctness. DataRaceBench, an open-source benchmark suite, is specifically crafted to assess these data race detection tools in a systematic and measurable manner. Machine learning techniques have recently demonstrated considerable potential in high-performance computing (HPC) prog…

    Submitted 16 August, 2023; originally announced August 2023.

  34. Data Race Detection Using Large Language Models

    Authors: Le Chen, Xianzhong Ding, Murali Emani, Tristan Vanderbruggen, Pei-hung Lin, Chuanhua Liao

    Abstract: Large language models (LLMs) are demonstrating significant promise as an alternate strategy to facilitate analyses and optimizations of high-performance computing programs, circumventing the need for resource-intensive manual tool creation. In this paper, we explore a novel LLM-based data race detection approach combining prompting engineering and fine-tuning techniques. We create a dedicated data…

    Submitted 3 October, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

  35. arXiv:2307.07686  [pdf, other]

    cs.SE cs.AI cs.LG

    Creating a Dataset for High-Performance Computing Code Translation using LLMs: A Bridge Between OpenMP Fortran and C++

    Authors: Bin Lei, Caiwen Ding, Le Chen, Pei-Hung Lin, Chunhua Liao

    Abstract: In this study, we present a novel dataset for training machine learning models translating between OpenMP Fortran and C++ code. To ensure reliability and applicability, the dataset is created from a range of representative open-source OpenMP benchmarks. It is also refined using a meticulous code similarity test. The effectiveness of our dataset is assessed using both quantitative (CodeBLEU) and qu…

    Submitted 18 September, 2023; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: This paper was accepted by the HPEC 2023 conference and received the Outstanding Student Paper Award

  36. arXiv:2306.16036  [pdf, other]

    eess.IV cs.CV

    A Cascaded Approach for ultraly High Performance Lesion Detection and False Positive Removal in Liver CT Scans

    Authors: Fakai Wang, Chi-Tung Cheng, Chien-Wei Peng, Ke Yan, Min Wu, Le Lu, Chien-Hung Liao, Ling Zhang

    Abstract: Liver cancer has high morbidity and mortality rates in the world. Multi-phase CT is a main medical imaging modality for detecting/identifying and diagnosing liver tumors. Automatically detecting and classifying liver lesions in CT images have the potential to improve the clinical workflow. This task remains challenging due to liver lesions' large variations in size, appearance, image contrast, and…

    Submitted 28 June, 2023; originally announced June 2023.

  37. LM4HPC: Towards Effective Language Model Application in High-Performance Computing

    Authors: Le Chen, Pei-Hung Lin, Tristan Vanderbruggen, Chunhua Liao, Murali Emani, Bronis de Supinski

    Abstract: In recent years, language models (LMs), such as GPT-4, have been widely used in multiple domains, including natural language processing, visualization, and so on. However, applying them for analyzing and optimizing high-performance computing (HPC) software is still challenging due to the lack of HPC-specific support. In this paper, we design the LM4HPC framework to facilitate the research and deve…

    Submitted 26 June, 2023; originally announced June 2023.

  38. arXiv:2306.10196  [pdf, other]

    cs.CL cs.AI cs.FL cs.LG

    Structured Thoughts Automaton: First Formalized Execution Model for Auto-Regressive Language Models

    Authors: Tristan Vanderbruggen, Chunhua Liao, Peter Pirkelbauer, Pei-Hung Lin

    Abstract: In recent months, Language Models (LMs) have become a part of daily discourse, with focus on OpenAI and the potential of Artificial General Intelligence (AGI). Furthermore, the leaking of LLama's weights to the public has led to an influx of innovations demonstrating the impressive capabilities of generative LMs. While we believe that AGI is still a distant goal, we recognize the potential of LMs…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Submitted to CGO-24

  39. arXiv:2306.08861  [pdf, other]

    cs.CV cs.AI

    Motion Capture Dataset for Practical Use of AI-based Motion Editing and Stylization

    Authors: Makito Kobayashi, Chen-Chieh Liao, Keito Inoue, Sentaro Yojima, Masafumi Takahashi

    Abstract: In this work, we proposed a new style-diverse dataset for the domain of motion style transfer. The motion dataset uses an industrial-standard human bone structure and thus is industry-ready to be plugged into 3D characters for many projects. We claim the challenges in motion style transfer and encourage future work in this domain by releasing the proposed motion dataset both to the public and the…

    Submitted 9 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

  40. arXiv:2305.15843  [pdf, other]

    cs.LG cs.SI

    TabGSL: Graph Structure Learning for Tabular Data Prediction

    Authors: Jay Chiehen Liao, Cheng-Te Li

    Abstract: This work presents a novel approach to tabular data prediction leveraging graph structure learning and graph neural networks. Despite the prevalence of tabular data in real-world applications, traditional deep learning methods often overlook the potentially valuable associations between data instances. Such associations can offer beneficial insights for classification tasks, as instances may exhib…

    Submitted 25 May, 2023; originally announced May 2023.

  41. arXiv:2305.07186  [pdf, other]

    cs.IT

    Learning to Code on Graphs for Topological Interference Management

    Authors: Zhiwei Shan, Xinping Yi, Han Yu, Chung-Shou Liao, Shi Jin

    Abstract: The state-of-the-art coding schemes for topological interference management (TIM) problems are usually handcrafted for specific families of network topologies, relying critically on experts' domain knowledge. This inevitably restricts the potential wider applications to wireless communication systems, due to the limited generalizability. This work makes the first attempt to advocate a novel intell…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: An extended version of a paper accepted by International Symposium on Information Theory (ISIT) 2023

  42. arXiv:2304.07076  [pdf, other]

    cs.CV cs.AI

    BCE-Net: Reliable Building Footprints Change Extraction based on Historical Map and Up-to-Date Images using Contrastive Learning

    Authors: Cheng Liao, Han Hu, Xuekun Yuan, Haifeng Li, Chao Liu, Chunyang Liu, Gui Fu, Yulin Ding, Qing Zhu

    Abstract: Automatic and periodic recompiling of building databases with up-to-date high-resolution images has become a critical requirement for rapidly developing urban environments. However, the architecture of most existing approaches for change extraction attempts to learn features related to changes but ignores objectives related to buildings. This inevitably leads to the generation of significant pseud…

    Submitted 14 April, 2023; originally announced April 2023.

  43. arXiv:2304.00474  [pdf, ps, other]

    cs.LG eess.SP

    On the Optimal Recovery of Graph Signals

    Authors: Simon Foucart, Chunyang Liao, Nate Veldt

    Abstract: Learning a smooth graph signal from partially observed data is a well-studied task in graph-based machine learning. We consider this task from the perspective of optimal recovery, a mathematical framework for learning a function from observational data that adopts a worst-case perspective tied to model assumptions on the function to be learned. Earlier work in the optimal recovery literature has s…

    Submitted 29 May, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

    Comments: This paper has been accepted by 14th International conference on Sampling Theory and Applications (SampTA 2023)

  44. arXiv:2303.10674  [pdf]

    cs.LG cs.AI

    URM4DMU: an user represention model for darknet markets users

    Authors: Hongmeng Liu, Jiapeng Zhao, Yixuan Huo, Yuyan Wang, Chun Liao, Liyan Shen, Shiyao Cui, Jinqiao Shi

    Abstract: Darknet markets provide a large platform for trading illicit goods and services due to their anonymity. Learning an invariant representation of each user based on their posts on different markets makes it easy to aggregate user information across different platforms, which helps identify anonymous users. Traditional user representation methods mainly rely on modeling the text information of posts…

    Submitted 19 March, 2023; originally announced March 2023.

    Comments: 9 pages

    MSC Class: 62 (Primary); 54 (Secondary) ACM Class: I.2.7

  45. arXiv:2303.08873  [pdf, other]

    cs.PL cs.DC cs.LG

    Machine Learning-Driven Adaptive OpenMP For Portable Performance on Heterogeneous Systems

    Authors: Giorgis Georgakoudis, Konstantinos Parasyris, Chunhua Liao, David Beckingsale, Todd Gamblin, Bronis de Supinski

    Abstract: Heterogeneity has become a mainstream architecture design choice for building High Performance Computing systems. However, heterogeneity poses significant challenges for achieving performance portability of execution. Adapting a program to a new heterogeneous platform is laborious and requires developers to manually explore a vast space of execution parameters. To address those challenges, this pa…

    Submitted 15 March, 2023; originally announced March 2023.

    Report number: LLNL-CONF-833682

  46. arXiv:2303.06292  [pdf, other]

    cs.CY

    Multi-view shaker detection: Insights from a noise-immune influence analysis Perspective

    Authors: Chang Liao

    Abstract: Entities whose changes will significantly affect others in a networked system are called shakers. In recent years, some models have been proposed to detect such shaker from evolving entities. However, limited work has focused on shaker detection in very short term, which has many real-world applications. For example, in financial market, it can enable both investors and governors to quickly respon…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: 14 pages, 4 figures

    ACM Class: J.4

  47. arXiv:2303.06284  [pdf, other]

    cs.SI cs.CY

    Prospecting Community Development Strength based on Economic Graph: From Categorization to Scoring

    Authors: Chang Liao

    Abstract: Recent years have witnessed a growing number of researches on community characterization. In contrast to the large body of researches on the categorical measures (rise or decline) for evaluating the community development, we propose to estimate the community development strength (to which degree the rise or decline is). More specifically, given already known categorical information of community de…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: 12 pages, 3 figures

    ACM Class: J.4

  48. Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation

    Authors: Yasheng Sun, Qianyi Wu, Hang Zhou, Kaisiyuan Wang, Tianshu Hu, Chen-Chieh Liao, Shio Miyafuji, Ziwei Liu, Hideki Koike

    Abstract: Creating the photo-realistic version of people sketched portraits is useful to various entertainment purposes. Existing studies only generate portraits in the 2D plane with fixed views, making the results less vivid. In this paper, we present Stereoscopic Simplified Sketch-to-Portrait (SSSP), which explores the possibility of creating Stereoscopic 3D-aware portraits from simple contour sketches by…

    Submitted 1 October, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: Project Page on https://hangz-nju-cuhk.github.io/projects/SSSP, Video Url: https://youtu.be/GiOKbvr2U_E

  49. arXiv:2301.05991  [pdf]

    cs.HC cs.AI cs.DL q-bio.TO

    Conceptual Framework and Documentation Standards of Cystoscopic Media Content for Artificial Intelligence

    Authors: Okyaz Eminaga, Timothy Jiyong Lee, Jessie Ge, Eugene Shkolyar, Mark Laurie, Jin Long, Lukas Graham Hockman, Joseph C. Liao

    Abstract: Background: The clinical documentation of cystoscopy includes visual and textual materials. However, the secondary use of visual cystoscopic data for educational and research purposes remains limited due to inefficient data management in routine clinical practice. Methods: A conceptual framework was designed to document cystoscopy in a standardized manner with three major sections: data management…

    Submitted 18 January, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

    Comments: Under Review

  50. Model-based Transfer Learning for Automatic Optical Inspection based on domain discrepancy

    Authors: Erik Isai Valle Salgado, Haoxin Yan, Yue Hong, Peiyuan Zhu, Shidong Zhu, Chengwei Liao, Yanxiang Wen, Xiu Li, Xiang Qian, Xiaohao Wang, Xinghui Li

    Abstract: Transfer learning is a promising method for AOI applications since it can significantly shorten sample collection time and improve efficiency in today's smart manufacturing. However, related research enhanced the network models by applying TL without considering the domain similarity among datasets, the data long-tailedness of a source dataset, and mainly used linear transformations to mitigate th…

    Submitted 14 January, 2023; originally announced January 2023.

    Comments: This is a fix of the published paper "Relational-based transfer learning for automatic optical inspection based on domain discrepancy"

    Journal ref: Proc. SPIE 12317, Optoelectronic Imaging and Multimedia Technology IX, 2023