research-article

Open access

Creating Edge AI from Cloud-based LLMs

Authors:

Xiangliang Chen,

Mahadev Satyanarayanan

HOTMOBILE '24: Proceedings of the 25th International Workshop on Mobile Computing Systems and Applications

Pages 8 - 13

https://doi.org/10.1145/3638550.3641126

Published: 28 February 2024

Abstract

Cyber-human and cyber-physical systems have tight end-to-end latency bounds, typically on the order of a few tens of milliseconds. In contrast, cloud-based large language models (LLMs) have end-to-end latencies that are two to three orders of magnitude larger. This paper shows how to bridge this large gap by using LLMs as offline compilers that create task-specific code, which then runs without any LLM accesses. We provide three case studies as proofs of concept, and discuss the challenges in generalizing this technique to broader uses.
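
To make the size of the gap concrete, the arithmetic can be sketched as follows. This is only an illustration of the figures quoted in the abstract: the 30 ms bound is an assumed example of "a few tens of milliseconds", and the function name is hypothetical rather than taken from the paper.

```python
# Illustrative arithmetic only. EDGE_BOUND_MS is an assumed example value
# for "a few tens of milliseconds"; it is not a measurement from the paper.
EDGE_BOUND_MS = 30.0


def scaled_latency_ms(bound_ms: float, orders_of_magnitude: int) -> float:
    """Return a latency that is the given number of orders of magnitude larger."""
    return bound_ms * 10 ** orders_of_magnitude


# "Two to three orders of magnitude larger" than the assumed bound.
low_ms = scaled_latency_ms(EDGE_BOUND_MS, 2)
high_ms = scaled_latency_ms(EDGE_BOUND_MS, 3)

print(f"Assumed latency bound: {EDGE_BOUND_MS:.0f} ms")
print(f"Implied LLM latency: {low_ms / 1000:.0f} s to {high_ms / 1000:.0f} s")
```

Under this assumed bound, the implied cloud LLM latency is in the range of seconds to tens of seconds, which is why the abstract proposes keeping LLM accesses out of the latency-critical path.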



Published In

HOTMOBILE '24: Proceedings of the 25th International Workshop on Mobile Computing Systems and Applications

February 2024

167 pages

ISBN:9798400704970

DOI:10.1145/3638550

Chair:
Nigel Davies,
Program Chair:
Chenren Xu

Copyright © 2024 Owner/Author(s).

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGMOBILE: ACM Special Interest Group on Mobility of Systems, Users, Data and Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

U.S. Army Research Office
National Science Foundation

Conference

HOTMOBILE '24: 25th International Workshop on Mobile Computing Systems and Applications

February 28 - 29, 2024

San Diego, CA, USA

Acceptance Rates

Overall Acceptance Rate 96 of 345 submissions, 28%
