DOI: 10.1145/3313831.3376147
research-article

PenSight: Enhanced Interaction with a Pen-Top Camera

Published: 23 April 2020

Abstract

We propose mounting a downward-facing camera above the top end of a digital tablet pen. This creates a unique and practical viewing angle for capturing the pen-holding hand and the immediate surroundings which can include the other hand. The fabrication of a prototype device is described and the enabled interaction design space is explored, including dominant and non-dominant hand pose recognition, tablet grip detection, hand gestures, capturing physical content in the environment, and detecting users and pens. A deep learning computer vision pipeline is developed for classification, regression, and keypoint detection to enable these interactions. Example applications demonstrate usage scenarios and a qualitative user evaluation confirms the potential of the approach.

Supplementary Material

MP4 File (paper020vf.mp4)
Supplemental video
MP4 File (paper020pv.mp4)
Preview video

Information

Published In

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
April 2020
10688 pages
ISBN: 9781450367080
DOI: 10.1145/3313831

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery
New York, NY, United States

Badges

• Best Paper

Author Tags

1. hand pose estimation
2. pen input
3. tablet input

Conference

CHI '20

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%