DOI: 10.1145/3611380.3630165

MAAIG: Motion Analysis And Instruction Generation

Published: 30 December 2023

Abstract

Many people engage in self-directed sports training at home but lack the real-time guidance of a professional coach, leaving them prone to injury or to developing incorrect habits. In this paper, we propose a novel application framework called MAAIG (Motion Analysis And Instruction Generation). Given a user-provided video of a sports action, MAAIG generates an embedding vector for each frame; these embeddings are associated with each frame's 3D skeleton and fed into a pretrained T5 model. The model then uses this information to generate specific sports instructions. It can identify potential issues and provide real-time guidance, much as a professional coach would, helping users improve their sports skills and avoid injury.
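The pipeline described in the abstract (video frames → 3D skeletons → per-frame embeddings → pretrained T5 → instruction text) can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: every name, shape, and constant here is an assumption, the pose-estimation and T5 stages are replaced by stand-ins, and the data is random.

```python
import numpy as np

# Hypothetical constants -- the paper does not specify these values.
N_JOINTS = 22   # assumed SMPL-style skeleton joint count
EMBED_DIM = 64  # assumed per-frame embedding size

def frames_to_skeletons(n_frames):
    """Stand-in for a 3D pose-estimation stage: returns one
    (N_JOINTS, 3) skeleton per video frame (random data here)."""
    rng = np.random.default_rng(0)
    return rng.standard_normal((n_frames, N_JOINTS, 3))

def embed_frames(skeletons, proj):
    """Flatten each per-frame skeleton and project it to a
    fixed-size embedding vector, one vector per frame."""
    flat = skeletons.reshape(len(skeletons), -1)  # (T, N_JOINTS * 3)
    return flat @ proj                            # (T, EMBED_DIM)

def generate_instruction(embeddings):
    """Placeholder for the pretrained T5 stage, which would map the
    sequence of frame embeddings to a natural-language instruction."""
    return f"instruction conditioned on {len(embeddings)} frame embeddings"

rng = np.random.default_rng(1)
proj = rng.standard_normal((N_JOINTS * 3, EMBED_DIM))  # assumed linear projection
skeletons = frames_to_skeletons(30)                    # a 30-frame action clip
embeddings = embed_frames(skeletons, proj)
print(embeddings.shape)                                # (30, 64)
print(generate_instruction(embeddings))
```

In a real system the projection would be a learned encoder and the final stage a seq2seq language model; the sketch only shows the shape of the data flow from per-frame skeletons to a conditioning sequence for text generation.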



Published In

MMAsia '23 Workshops: Proceedings of the 5th ACM International Conference on Multimedia in Asia Workshops
December 2023
97 pages
ISBN: 9798400703263
DOI: 10.1145/3611380
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. computer vision
  2. deep learning
  3. instruction generation
  4. natural language generation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • National Science and Technology Council of Taiwan

Conference

MMAsia '23: ACM Multimedia Asia
December 6–8, 2023
Tainan, Taiwan

Acceptance Rates

Overall Acceptance Rate 59 of 204 submissions, 29%

