Showing 1–50 of 88 results for author: Pan, B

Searching in archive cs.
  1. arXiv:2408.13987  [pdf, other]

    cs.CL cs.AI

    Focused Large Language Models are Stable Many-Shot Learners

    Authors: Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

    Abstract: In-Context Learning (ICL) enables large language models (LLMs) to achieve rapid task adaptation by learning from demonstrations. With the increase in available context length of LLMs, recent experiments have shown that the performance of ICL does not necessarily scale well in many-shot (demonstration) settings. We theoretically and experimentally confirm that the reason lies in more demonstrations…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: 15 pages

  2. arXiv:2408.13738  [pdf, other]

    cs.CL

    Poor-Supervised Evaluation for SuperLLM via Mutual Consistency

    Authors: Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

    Abstract: The guidance from capability evaluations has greatly propelled the progress of both human society and Artificial Intelligence. However, as LLMs evolve, it becomes challenging to construct evaluation benchmarks for them with accurate labels on hard tasks that approach the boundaries of human capabilities. To credibly conduct evaluation without accurate labels (denoted as poor-supervised evaluation)…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: ACL findings

  3. arXiv:2408.13457  [pdf, other]

    cs.CL cs.AI

    Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning

    Authors: Xinglin Wang, Shaoxiong Feng, Yiwei Li, Peiwen Yuan, Yueqi Zhang, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

    Abstract: Self-consistency (SC), a widely used decoding strategy for chain-of-thought reasoning, shows significant gains across various multi-step reasoning tasks but comes with a high cost due to multiple sampling with the preset size. Its variants, Adaptive self-consistency (ASC) and Early-stopping self-consistency (ESC), dynamically adjust the number of samples based on the posterior distribution of a se…

    Submitted 24 August, 2024; originally announced August 2024.

    Comments: Preprint

  4. arXiv:2408.09150  [pdf, other]

    cs.CL cs.AI

    CogLM: Tracking Cognitive Development of Large Language Models

    Authors: Xinglin Wang, Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

    Abstract: Piaget's Theory of Cognitive Development (PTC) posits that the development of cognitive levels forms the foundation for human learning across various abilities. As Large Language Models (LLMs) have recently shown remarkable abilities across a wide variety of tasks, we are curious about the cognitive levels of current LLMs: to what extent they have developed and how this development has been achiev…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: under review

  5. arXiv:2408.07444  [pdf, other]

    eess.IV cs.CV

    Costal Cartilage Segmentation with Topology Guided Deformable Mamba: Method and Benchmark

    Authors: Senmao Wang, Haifan Gong, Runmeng Cui, Boyao Wan, Yicheng Liu, Zhonglin Hu, Haiqing Yang, Jingyang Zhou, Bo Pan, Lin Lin, Haiyue Jiang

    Abstract: Costal cartilage segmentation is crucial to various medical applications, necessitating precise and reliable techniques due to its complex anatomy and the importance of accurate diagnosis and surgical planning. We propose a novel deep learning-based approach called topology-guided deformable Mamba (TGDM) for costal cartilage segmentation. The TGDM is tailored to capture the intricate long-range co…

    Submitted 14 August, 2024; originally announced August 2024.

  6. arXiv:2407.04100  [pdf, other]

    cs.CV

    C$^3$DG: Conditional Domain Generalization for Hyperspectral Imagery Classification with Convergence and Constrained-risk Theories

    Authors: Zhe Gao, Bin Pan, Zhenwei Shi

    Abstract: Hyperspectral imagery (HSI) classification may suffer the challenge of hyperspectral-monospectra, where different classes present similar spectra. Joint spatial-spectral feature extraction is a popular solution for the problem, but this strategy tends to inflate accuracy since test pixels may exist in training patches. Domain generalization methods show promising potential, but they still fail to…

    Submitted 4 July, 2024; originally announced July 2024.

  7. arXiv:2407.02056  [pdf, other]

    cs.CL cs.AI

    Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation

    Authors: Xinglin Wang, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

    Abstract: Self-consistency (SC), leveraging multiple samples from LLMs, shows significant gains on various reasoning tasks but struggles with free-form generation due to the difficulty of aggregating answers. Its variants, UCS and USC, rely on sample selection or voting mechanisms to improve output quality. These methods, however, face limitations due to their inability to fully utilize the nuanced consensu…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to ACL2024 Main Conference

  8. arXiv:2406.05628  [pdf, other]

    cs.LG

    Domain Generalization Guided by Large-Scale Pre-Trained Priors

    Authors: Zongbin Wang, Bin Pan, Shiyu Shen, Tianyang Shi, Zhenwei Shi

    Abstract: Domain generalization (DG) aims to train a model from limited source domains, allowing it to generalize to unknown target domains. Typically, DG models only employ large-scale pre-trained models during the initialization of fine-tuning. However, large-scale pre-trained models already possess the ability to resist domain shift. If we reference pre-trained models continuously during fine-tuning to m…

    Submitted 8 June, 2024; originally announced June 2024.

  9. arXiv:2406.05616  [pdf, other]

    cs.LG

    Domain Agnostic Conditional Invariant Predictions for Domain Generalization

    Authors: Zongbin Wang, Bin Pan, Zhenwei Shi

    Abstract: Domain generalization aims to develop a model that can perform well on unseen target domains by learning from multiple source domains. However, recently proposed domain generalization models usually rely on domain labels, which may not be available in many real-world scenarios. To address this challenge, we propose a Discriminant Risk Minimization (DRM) theory and the corresponding algorithm to capt…

    Submitted 8 June, 2024; originally announced June 2024.

  10. arXiv:2405.16800  [pdf, other]

    cs.LG cs.AI

    TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations

    Authors: Zheng Zhang, Yuntong Hu, Bo Pan, Chen Ling, Liang Zhao

    Abstract: Text-Attributed Graphs (TAGs) enhance graph structures with natural language descriptions, enabling detailed representation of data and their relationships across a broad spectrum of real-world scenarios. Despite the potential for deeper insights, existing TAG representation learning primarily relies on supervised methods, necessitating extensive labeled data and limiting applicability across dive…

    Submitted 26 May, 2024; originally announced May 2024.

  11. arXiv:2405.16506  [pdf, other]

    cs.LG

    GRAG: Graph Retrieval-Augmented Generation

    Authors: Yuntong Hu, Zhihan Lei, Zheng Zhang, Bo Pan, Chen Ling, Liang Zhao

    Abstract: While Retrieval-Augmented Generation (RAG) enhances the accuracy and relevance of responses by generative language models, it falls short in graph-based contexts where both textual and topological information are important. Naive RAG approaches inherently neglect the structural intricacies of textual graphs, resulting in a critical gap in the generation process. To address this challenge, we intro…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 14 pages, 4 figures

  12. arXiv:2405.16219  [pdf, other]

    cs.LG stat.ML

    Deep Causal Generative Models with Property Control

    Authors: Qilong Zhao, Shiyu Wang, Guangji Bai, Bo Pan, Zhaohui Qin, Liang Zhao

    Abstract: Generating data with properties of interest by external users while following the right causation among its intrinsic factors is important yet has not been well addressed jointly. This is due to the long-lasting challenge of jointly identifying key latent variables, their causal relations, and their correlation with properties of interest, as well as how to leverage their discoveries toward causal…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 13 pages, 6 figures

  13. arXiv:2405.11809  [pdf, other]

    cs.CV cs.AI

    Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices

    Authors: Baiyu Pan, Jichao Jiao, Jianxing Pang, Jun Cheng

    Abstract: In recent years, numerous real-time stereo matching methods have been introduced, but they often lack accuracy. These methods attempt to improve accuracy by introducing new modules or integrating traditional methods. However, the improvements are only modest. In this paper, we propose a novel strategy by incorporating knowledge distillation and model pruning to overcome the inherent trade-off betw…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: International Conference on Robotics and Automation (ICRA) 2024

  14. arXiv:2404.14416  [pdf, other]

    physics.geo-ph cs.AI cs.LG physics.ao-ph

    Conditional diffusion models for downscaling & bias correction of Earth system model precipitation

    Authors: Michael Aich, Philipp Hess, Baoxiang Pan, Sebastian Bathiany, Yu Huang, Niklas Boers

    Abstract: Climate change exacerbates extreme weather events like heavy rainfall and flooding. As these events cause severe losses of property and lives, accurate high-resolution simulation of precipitation is imperative. However, existing Earth System Models (ESMs) struggle with resolving small-scale dynamics and suffer from biases, especially for extreme events. Traditional statistical bias correction and…

    Submitted 5 April, 2024; originally announced April 2024.

  15. arXiv:2404.11987  [pdf, other]

    cs.CV

    MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

    Authors: Nicolas Ugrinovic, Boxiao Pan, Georgios Pavlakos, Despoina Paschalidou, Bokui Shen, Jordi Sanchez-Riera, Francesc Moreno-Noguer, Leonidas Guibas

    Abstract: We introduce MultiPhys, a method designed for recovering multi-person motion from monocular videos. Our focus lies in capturing coherent spatial placement between pairs of individuals across varying degrees of engagement. MultiPhys, being physically aware, exhibits robustness to jittering and occlusions, and effectively eliminates penetration issues between the two individuals. We devise a pipelin…

    Submitted 18 April, 2024; originally announced April 2024.

  16. arXiv:2404.11943  [pdf, other]

    cs.HC

    AgentCoord: Visually Exploring Coordination Strategy for LLM-based Multi-Agent Collaboration

    Authors: Bo Pan, Jiaying Lu, Ke Wang, Li Zheng, Zhen Wen, Yingchaojie Feng, Minfeng Zhu, Wei Chen

    Abstract: The potential of automatic task-solving through Large Language Model (LLM)-based multi-agent collaboration has recently garnered widespread attention from both the research community and industry. While utilizing natural language to coordinate multiple agents presents a promising avenue for democratizing agent technology for general users, designing coordination strategies remains challenging with…

    Submitted 18 April, 2024; originally announced April 2024.

  17. arXiv:2404.11593  [pdf, other]

    cs.CV

    IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

    Authors: Xi Chen, Sida Peng, Dongchen Yang, Yuan Liu, Bowen Pan, Chengfei Lv, Xiaowei Zhou

    Abstract: This paper aims to recover object materials from posed images captured under an unknown static lighting condition. Recent methods solve this task by optimizing material parameters through differentiable physically based rendering. However, due to the coupling between object geometry, materials, and environment lighting, there is inherent ambiguity during the inverse rendering process, preventing p…

    Submitted 22 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Project page: https://zju3dv.github.io/IntrinsicAnything

  18. arXiv:2404.05567  [pdf, other]

    cs.LG cs.AI cs.CL

    Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models

    Authors: Bowen Pan, Yikang Shen, Haokun Liu, Mayank Mishra, Gaoyuan Zhang, Aude Oliva, Colin Raffel, Rameswar Panda

    Abstract: Mixture-of-Experts (MoE) language models can reduce computational costs by 2-4$\times$ compared to dense models without sacrificing performance, making them more efficient in computation-bounded scenarios. However, MoE models generally require 2-4$\times$ more parameters to achieve comparable performance to a dense model, which incurs larger GPU memory requirements and makes MoE models less…

    Submitted 8 April, 2024; originally announced April 2024.

  19. arXiv:2403.12766  [pdf, other]

    cs.CL

    NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens

    Authors: Cunxiang Wang, Ruoxi Ning, Boqi Pan, Tonghui Wu, Qipeng Guo, Cheng Deng, Guangsheng Bao, Xiangkun Hu, Zheng Zhang, Qian Wang, Yue Zhang

    Abstract: The rapid advancement of Large Language Models (LLMs) has introduced a new frontier in natural language processing, particularly in understanding and processing long-context information. However, the evaluation of these models' long-context abilities remains a challenge due to the limitations of current benchmarks. To address this gap, we introduce NovelQA, a benchmark specifically designed to tes…

    Submitted 17 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

  20. arXiv:2403.05304  [pdf]

    cs.RO

    Spatiotemporal Predictive Pre-training for Robotic Motor Control

    Authors: Jiange Yang, Bei Liu, Jianlong Fu, Bocheng Pan, Gangshan Wu, Limin Wang

    Abstract: Robotic motor control necessitates the ability to predict the dynamics of environments and interaction objects. However, advanced self-supervised pre-trained visual representations (PVRs) in robotic motor control, leveraging large-scale egocentric videos, often focus solely on learning the static content features of sampled image frames. This neglects the crucial temporal motion clues in human vid…

    Submitted 27 May, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: 21 pages, 9 figures, 8 tables

  21. arXiv:2403.02774  [pdf, other]

    physics.ao-ph cs.CV cs.LG physics.geo-ph

    Fast, Scale-Adaptive, and Uncertainty-Aware Downscaling of Earth System Model Fields with Generative Foundation Models

    Authors: Philipp Hess, Michael Aich, Baoxiang Pan, Niklas Boers

    Abstract: Accurate and high-resolution Earth system model (ESM) simulations are essential to assess the ecological and socio-economic impacts of anthropogenic climate change, but are computationally too expensive. Recent machine learning approaches have shown promising results in downscaling ESM simulations, outperforming state-of-the-art statistical approaches. However, existing methods require computation…

    Submitted 5 March, 2024; originally announced March 2024.

  22. arXiv:2403.01423  [pdf, other]

    cs.CR cs.LG

    Collective Certified Robustness against Graph Injection Attacks

    Authors: Yuni Lai, Bailin Pan, Kaihuang Chen, Yancheng Yuan, Kai Zhou

    Abstract: We investigate certified robustness for GNNs under graph injection attacks. Existing research only provides sample-wise certificates by verifying each node independently, leading to very limited certifying performance. In this paper, we present the first collective certificate, which certifies a set of target nodes simultaneously. To achieve it, we formulate the problem as a binary integer quadrat…

    Submitted 3 March, 2024; originally announced March 2024.

  23. arXiv:2402.13098  [pdf, other]

    cs.CL cs.AI

    ELAD: Explanation-Guided Large Language Models Active Distillation

    Authors: Yifei Zhang, Bo Pan, Chen Ling, Yuntong Hu, Liang Zhao

    Abstract: The deployment and application of Large Language Models (LLMs) is hindered by their memory inefficiency, computational demands, and the high costs of API inferences. Traditional distillation methods, which transfer the capabilities of LLMs to smaller models, often fail to determine whether the knowledge has been sufficiently transferred, potentially resulting in high costs or incomplete distillati…

    Submitted 20 February, 2024; originally announced February 2024.

  24. arXiv:2402.12022  [pdf, other]

    cs.CL cs.LG

    Distilling Large Language Models for Text-Attributed Graph Learning

    Authors: Bo Pan, Zheng Zhang, Yifei Zhang, Yuntong Hu, Liang Zhao

    Abstract: Text-Attributed Graphs (TAGs) are graphs of connected textual documents. Graph models can efficiently learn TAGs, but their training heavily relies on human-annotated labels, which are scarce or even unavailable in many applications. Large language models (LLMs) have recently demonstrated remarkable capabilities in few-shot and zero-shot TAG learning, but they suffer from scalability, cost, and pr…

    Submitted 5 August, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: CIKM 2024

  25. arXiv:2402.10930  [pdf, other]

    cs.AR cs.AI cs.LG

    ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters

    Authors: Shiwei Liu, Guanchen Tao, Yifei Zou, Derek Chow, Zichen Fan, Kauna Lei, Bangfei Pan, Dennis Sylvester, Gregory Kielian, Mehdi Saligane

    Abstract: The self-attention mechanism sets transformer-based large language model (LLM) apart from the convolutional and recurrent neural networks. Despite the performance improvement, achieving real-time LLM inference on silicon is challenging due to the extensively used Softmax in self-attention. Apart from the non-linearity, the low arithmetic intensity greatly reduces the processing parallelism, which…

    Submitted 20 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

  26. arXiv:2402.08995  [pdf, other]

    cs.HC cs.AI

    AgentLens: Visual Analysis for Agent Behaviors in LLM-based Autonomous Systems

    Authors: Jiaying Lu, Bo Pan, Jieyi Chen, Yingchaojie Feng, Jingyuan Hu, Yuchen Peng, Wei Chen

    Abstract: Recently, Large Language Model based Autonomous system (LLMAS) has gained great popularity for its potential to simulate complicated behaviors of human societies. One of its main challenges is to present and analyze the dynamic events evolution of LLMAS. In this work, we present a visualization approach to explore detailed statuses and agents' behavior within LLMAS. We propose a general pipeline th…

    Submitted 14 February, 2024; originally announced February 2024.

  27. arXiv:2402.06646  [pdf]

    physics.ao-ph cs.LG physics.geo-ph

    Diffusion Model-based Probabilistic Downscaling for 180-year East Asian Climate Reconstruction

    Authors: Fenghua Ling, Zeyu Lu, Jing-Jia Luo, Lei Bai, Swadhin K. Behera, Dachao Jin, Baoxiang Pan, Huidong Jiang, Toshio Yamagata

    Abstract: As our planet is entering into the "global boiling" era, understanding regional climate change becomes imperative. Effective downscaling methods that provide localized insights are crucial for this target. Traditional approaches, including computationally-demanding regional dynamical models or statistical downscaling frameworks, are often susceptible to the influence of downscaling uncertainty. He…

    Submitted 5 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  28. arXiv:2402.04663  [pdf, other]

    cs.NE

    CLIF: Complementary Leaky Integrate-and-Fire Neuron for Spiking Neural Networks

    Authors: Yulong Huang, Xiaopeng Lin, Hongwei Ren, Haotian Fu, Yue Zhou, Zunchang Liu, Biao Pan, Bojun Cheng

    Abstract: Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models. Compared to conventional deep Artificial Neural Networks (ANNs), SNNs exhibit superior efficiency and capability to process temporal information. However, it remains a challenge to train SNNs due to their undifferentiable spiking mechanism. The surrogate gradients method is commonly used to train SNNs, but often c…

    Submitted 14 July, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  29. arXiv:2402.01858  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Explaining latent representations of generative models with large multimodal models

    Authors: Mengdan Zhu, Zhenke Liu, Bo Pan, Abhinav Angirekula, Liang Zhao

    Abstract: Learning interpretable representations of data generative latent factors is an important topic for the development of artificial intelligence. With the rise of the large multimodal model, it can align images with text to generate answers. In this work, we propose a framework to comprehensively explain each latent variable in the generative models using a large multimodal model. We further measure…

    Submitted 17 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: ICLR 2024 Workshop on Reliable and Responsible Foundation Models

  30. arXiv:2401.10822  [pdf, other]

    cs.CV

    ActAnywhere: Subject-Aware Video Background Generation

    Authors: Boxiao Pan, Zhan Xu, Chun-Hao Paul Huang, Krishna Kumar Singh, Yang Zhou, Leonidas J. Guibas, Jimei Yang

    Abstract: Generating video background that tailors to foreground subject motion is an important problem for the movie industry and visual effects community. This task involves synthesizing background that aligns with the motion and appearance of the foreground subject, while also complies with the artist's creative intention. We introduce ActAnywhere, a generative model that automates this process which tra…

    Submitted 19 January, 2024; originally announced January 2024.

  31. arXiv:2401.10487  [pdf, other]

    cs.IR cs.CL

    Generative Dense Retrieval: Memory Can Be a Burden

    Authors: Peiwen Yuan, Xinglin Wang, Shaoxiong Feng, Boyuan Pan, Yiwei Li, Heda Wang, Xupeng Miao, Kan Li

    Abstract: Generative Retrieval (GR), autoregressively decoding relevant document identifiers given a query, has been shown to perform well under the setting of small-scale corpora. By memorizing the document corpus with model parameters, GR implicitly achieves deep interaction between query and document. However, such a memorizing mechanism faces three drawbacks: (1) Poor memory accuracy for fine-grained fe…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: EACL 2024 main

    Journal ref: EACL 2024 main

  32. arXiv:2401.10480  [pdf, other]

    cs.CL cs.AI

    Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning

    Authors: Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Xinglin Wang, Bin Sun, Heda Wang, Kan Li

    Abstract: Self-consistency (SC) has been a widely used decoding strategy for chain-of-thought reasoning. Despite bringing significant performance improvements across a variety of multi-step reasoning tasks, it is a high-cost method that requires multiple sampling with the preset size. In this paper, we propose a simple and scalable sampling process, Early-Stopping Self-Consistency…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: ICLR 2024

  33. arXiv:2401.00437  [pdf, other]

    cs.CL

    BatchEval: Towards Human-like Text Evaluation

    Authors: Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, Heda Wang, Kan Li

    Abstract: Significant progress has been made in automatic text evaluation with the introduction of large language models (LLMs) as evaluators. However, current sample-wise evaluation paradigm suffers from the following issues: (1) Sensitive to prompt design; (2) Poor resistance to noise; (3) Inferior ensemble performance with static reference. Inspired by the fact that humans treat both criterion definition…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: 19 pages, 9 figures

  34. arXiv:2312.12832  [pdf, other]

    cs.CL cs.AI

    Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

    Authors: Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Bin Sun, Xinglin Wang, Heda Wang, Kan Li

    Abstract: Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder wide application in practice. One promising way is distilling the reasoning ability from LLMs to small models by the generated chain-of-thought reasoning paths. In some cases, however, LLMs may produce incorrect reasoning chains, especially when facing complex mathe…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  35. arXiv:2312.07839  [pdf, ps, other]

    math.ST cs.LG math.PR stat.ML

    Minimax-optimal estimation for sparse multi-reference alignment with collision-free signals

    Authors: Subhro Ghosh, Soumendu Sundar Mukherjee, Jing Bin Pan

    Abstract: The Multi-Reference Alignment (MRA) problem aims at the recovery of an unknown signal from repeated observations under the latent action of a group of cyclic isometries, in the presence of additive noise of high intensity $σ$. It is a more tractable version of the celebrated cryo EM model. In the crucial high noise regime, it is known that its sample complexity scales as $σ^6$. Recent investigatio…

    Submitted 12 December, 2023; originally announced December 2023.

  36. arXiv:2312.03979  [pdf, other]

    cs.LG cs.CR

    Node-aware Bi-smoothing: Certified Robustness against Graph Injection Attacks

    Authors: Yuni Lai, Yulin Zhu, Bailin Pan, Kai Zhou

    Abstract: Deep Graph Learning (DGL) has emerged as a crucial technique across various domains. However, recent studies have exposed vulnerabilities in DGL models, such as susceptibility to evasion and poisoning attacks. While empirical and provable robustness techniques have been developed to defend against graph modification attacks (GMAs), the problem of certified robustness against graph injection attack…

    Submitted 6 December, 2023; originally announced December 2023.

  37. arXiv:2311.14900  [pdf, other]

    cs.CV cs.AI

    Resfusion: Denoising Diffusion Probabilistic Models for Image Restoration Based on Prior Residual Noise

    Authors: Zhenning Shi, Haoshuai Zheng, Chen Xu, Changsheng Dong, Bin Pan, Xueshuo Xie, Along He, Tao Li, Huazhu Fu

    Abstract: Recently, research on denoising diffusion models has expanded its application to the field of image restoration. Traditional diffusion-based image restoration methods utilize degraded images as conditional input to effectively guide the reverse generation process, without modifying the original denoising diffusion process. However, since the degraded images already include low-frequency informatio…

    Submitted 20 May, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

  38. arXiv:2311.09806  [pdf, other]

    cs.CV

    EvaSurf: Efficient View-Aware Implicit Textured Surface Reconstruction on Mobile Devices

    Authors: Jingnan Gao, Zhuo Chen, Yichao Yan, Bowen Pan, Zhe Wang, Jiangjing Lyu, Xiaokang Yang

    Abstract: Reconstructing real-world 3D objects has numerous applications in computer vision, such as virtual reality, video games, and animations. Ideally, 3D reconstruction methods should generate high-fidelity results with 3D consistency in real-time. Traditional methods match pixels between images using photo-consistency constraints or learned features, while differentiable rendering methods like Neural…

    Submitted 19 July, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Project Page: http://g-1nonly.github.io/EvaSurf-Website/

  39. arXiv:2310.16277  [pdf, other]

    cs.LG cs.AI

    Bayesian Domain Invariant Learning via Posterior Generalization of Parameter Distributions

    Authors: Shiyu Shen, Bin Pan, Tianyang Shi, Tao Li, Zhenwei Shi

    Abstract: Domain invariant learning aims to learn models that extract invariant features over various training domains, resulting in better generalization to unseen target domains. Recently, Bayesian Neural Networks have achieved promising results in domain invariant learning, but most works concentrate on aligning features distributions rather than parameter distributions. Inspired by the principle of Baye…

    Submitted 24 October, 2023; originally announced October 2023.

  40. arXiv:2310.13027  [pdf, other]

    cs.LG cs.AI

    Be Bayesian by Attachments to Catch More Uncertainty

    Authors: Shiyu Shen, Bin Pan, Tianyang Shi, Tao Li, Zhenwei Shi

    Abstract: Bayesian Neural Networks (BNNs) have become one of the promising approaches for uncertainty estimation due to the solid theoretical foundations. However, the performance of BNNs is affected by the ability of catching uncertainty. Instead of only seeking the distribution of neural network weights by in-distribution (ID) data, in this paper, we propose a new Bayesian Neural Network with an Attached st…

    Submitted 12 April, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

  41. arXiv:2310.08537  [pdf, other]

    cs.CV

    XAI Benchmark for Visual Explanation

    Authors: Yifei Zhang, Siyi Gu, James Song, Bo Pan, Guangji Bai, Liang Zhao

    Abstract: The rise of deep learning has ushered in significant progress in computer vision (CV) tasks, yet the "black box" nature of these models often precludes interpretability. This challenge has spurred the development of Explainable Artificial Intelligence (XAI) by generating explanations to AI's decision-making process. An explanation is aimed to not only faithfully reflect the true reasoning process…

    Submitted 21 November, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

  42. arXiv:2310.08420  [pdf, other]

    cs.CV

    Visual Attention Prompted Prediction and Learning

    Authors: Yifei Zhang, Siyi Gu, Bo Pan, Guangji Bai, Meikang Qiu, Xiaofeng Yang, Liang Zhao

    Abstract: Visual explanation (attention)-guided learning uses not only labels but also explanations to guide model reasoning process. While visual attention-guided learning has shown promising results, it requires a large number of explanation annotations that are time-consuming to prepare. However, in many real-world situations, it is usually desired to prompt the model with visual attention without model…

    Submitted 23 April, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

  43. arXiv:2310.07889  [pdf, other]

    cs.CV cs.AI cs.CL cs.RO

    LangNav: Language as a Perceptual Representation for Navigation

    Authors: Bowen Pan, Rameswar Panda, SouYoung Jin, Rogerio Feris, Aude Oliva, Phillip Isola, Yoon Kim

    Abstract: We explore the use of language as a perceptual representation for vision-and-language navigation (VLN), with a focus on low-data settings. Our approach uses off-the-shelf vision systems for image captioning and object detection to convert an agent's egocentric panoramic view at each time step into natural language descriptions. We then finetune a pretrained language model to select an action, base…

    Submitted 30 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  44. arXiv:2310.07698  [pdf, other]

    cs.AI cs.LG

    SurroCBM: Concept Bottleneck Surrogate Models for Generative Post-hoc Explanation

    Authors: Bo Pan, Zhenke Liu, Yifei Zhang, Liang Zhao

    Abstract: Explainable AI seeks to bring light to the decision-making processes of black-box models. Traditional saliency-based methods, while highlighting influential data segments, often lack semantic understanding. Recent advancements, such as Concept Activation Vectors (CAVs) and Concept Bottleneck Models (CBMs), offer concept-based explanations but necessitate human-defined concepts. However, human-anno…

    Submitted 11 October, 2023; originally announced October 2023.

  45. arXiv:2310.07683  [pdf, other]

    cs.LG cs.AI

    Controllable Data Generation Via Iterative Data-Property Mutual Mappings

    Authors: Bo Pan, Muran Qin, Shiyu Wang, Yifei Zhang, Liang Zhao

    Abstract: Deep generative models have been widely used for their ability to generate realistic data samples in various areas, such as images, molecules, text, and speech. One major goal of data generation is controllability, namely to generate new data with desired properties. Despite growing interest in the area of controllable generation, significant challenges still remain, including 1) disentangling des…

    Submitted 11 October, 2023; originally announced October 2023.

  46. arXiv:2309.17024  [pdf, other]

    cs.CV

    HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World

    Authors: Xin Wang, Taein Kwon, Mahdi Rad, Bowen Pan, Ishani Chakraborty, Sean Andrist, Dan Bohus, Ashley Feniello, Bugra Tekin, Felipe Vieira Frujeri, Neel Joshi, Marc Pollefeys

    Abstract: Building an interactive AI assistant that can perceive, reason, and collaborate with humans in the real world has been a long-standing pursuit in the AI community. This work is part of a broader research effort to develop intelligent agents that can interactively guide humans through performing tasks in the physical world. As a first step in this direction, we introduce HoloAssist, a large-scale e…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: ICCV 2023

  47. arXiv:2304.00341  [pdf, other]

    cs.CV

    JacobiNeRF: NeRF Shaping with Mutual Information Gradients

    Authors: Xiaomeng Xu, Yanchao Yang, Kaichun Mo, Boxiao Pan, Li Yi, Leonidas Guibas

    Abstract: We propose a method that trains a neural radiance field (NeRF) to encode not only the appearance of the scene but also semantic correlations between scene points, regions, or entities -- aiming to capture their mutual co-variation patterns. In contrast to the traditional first-order photometric reconstruction objective, our method explicitly regularizes the learning dynamics to align the Jacobians…

    Submitted 1 April, 2023; originally announced April 2023.

  48. arXiv:2303.15892  [pdf, other]

    cs.CV

    Head3D: Complete 3D Head Generation via Tri-plane Feature Distillation

    Authors: Yuhao Cheng, Yichao Yan, Wenhan Zhu, Ye Pan, Bowen Pan, Xiaokang Yang

    Abstract: Head generation with diverse identities is an important task in computer vision and computer graphics, widely used in multimedia applications. However, current full head generation methods require a large number of 3D scans or multi-view images to train the model, resulting in expensive data acquisition cost. To address this issue, we propose Head3D, a method to generate full 3D heads with limited…

    Submitted 28 March, 2023; originally announced March 2023.

  49. arXiv:2303.09554  [pdf, other]

    cs.CV

    PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision

    Authors: Konstantinos Tertikas, Despoina Paschalidou, Boxiao Pan, Jeong Joon Park, Mikaela Angelina Uy, Ioannis Emiris, Yannis Avrithis, Leonidas Guibas

    Abstract: Impressive progress in generative models and implicit representations gave rise to methods that can generate 3D shapes of high quality. However, being able to locally control and edit shapes is another essential property that can unlock several content creation applications. Local control can be achieved with part-aware models, but existing methods require 3D supervision and cannot produce texture…

    Submitted 21 March, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: To appear in CVPR 2023, Project Page: https://ktertikas.github.io/part_nerf

  50. arXiv:2302.06184  [pdf, other]

    cs.SE

    A Reference Architecture for Blockchain-based Traceability Systems Using Domain-Driven Design and Microservices

    Authors: Yanze Wang, Shanshan Li, Huikun Liu, He Zhang, Bo Pan

    Abstract: Traceability systems are important for solving problems due to the increasing scale of the global supply chain, such as food safety crises and market disorder. Blockchain, as an immutable and decentralized ledger, is able to optimize the traditional traceability system by ensuring the transparency and reliability of the system data. However, the use of blockchain technology may lead to a rapid inc…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: 10 pages, 6 figures, Asia-Pacific Software Engineering Conference (APSEC) 2022