Proceeding Downloads
Large Language Models on Mobile Devices: Measurements, Analysis, and Insights
Deploying large language model (LLM) inference on mobile devices is cost-efficient for companies and addresses users' privacy concerns. However, the limited computation capacity and memory constraints of mobile devices hinder their ...
WiP: An On-device LLM-based Approach to Query Privacy Protection
Privacy leakage from user queries is a widespread concern in search engines and chatbot services. Existing solutions based on removing, obfuscating, or encrypting private information may inevitably hurt service quality or require full trust of the ...
Towards a Task-agnostic Distillation Methodology for Creating Edge Foundation Models
In recent years, AI has undergone significant changes. First, there is a growing recognition of the need to deploy inference models based on Deep Neural Networks (DNNs) on edge devices. Second, there is an increasing demand for low-energy inference ...
WiP: A Solution for Reducing MLLM-Based Agent Interaction Overhead
Current multi-modal LLM-based mobile agents raise concerns over high inference time and cost. We propose to tackle these issues by developing a lightweight UI Transition Graph (UTG) and locally executing automated tasks. Specifically, we ...
ChainStream: A Stream-based LLM Agent Framework for Continuous Context Sensing and Sharing
This paper introduces ChainStream, an LLM-based framework for building and serving context-aware AI agents. Driven by the goal of enabling context awareness in LLM agents and flexible information sharing between them, we adopt a stream-based design, in ...
Are Large Language Models Capable of Causal Reasoning for Sensing Data Analysis?
Correlation analysis between socioeconomic factors and environmental impact is essential for policy making that ensures sustainability and economic development simultaneously. With the development of the Internet of Things (IoT), citizen science IoT ...
WiP: Towards Light Adaptation of Large Language Models For Personal Hardware
The large language models (LLMs) that most people use are not deployed locally, so users must send relatively private and important data to the LLM service. Handing over such data causes people to worry, especially now ...
WiP: Efficient LLM Prefilling with Mobile NPU
Large language models (LLMs) play a crucial role in various Natural Language Processing (NLP) tasks, prompting their deployment on mobile devices for inference. However, a significant challenge arises due to high waiting latency, especially for long ...
Hybrid SLM and LLM for Edge-Cloud Collaborative Inference
Edge-Cloud collaboration for deep learning inference has been actively studied to enhance inference performance by leveraging both Edge and Cloud resources. However, traditional Edge-Cloud collaboration based on model partitioning or confidence ...