DOI: 10.1145/3121257.3121260
research-article

Find, understand, and extend development screencasts on YouTube

Published: 04 September 2017

Abstract

A software development screencast is a video that captures the screen of a developer working on a particular task and explaining implementation details. Because development screencasts have become increasingly popular, for example on YouTube, we study how and to what extent they can serve as an additional source of knowledge to answer developers’ questions, for example about the use of a specific API. We first study the difference between development screencasts and other types of screencasts using video frame analysis. When frames are compared with the cosine algorithm, developers can expect ten development screencasts in the top 20 out of 100 different YouTube videos. We then extracted popular development topics: database operations, system set-up, plug-in development, game development, and testing. We also identified six recurring tasks performed in development screencasts, such as object usage and UI operations. Finally, we conducted a similarity analysis between the screencast transcripts and the Javadoc corresponding to those screencasts.
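The cosine comparison mentioned in the abstract can be illustrated on text, for instance when matching a transcript against Javadoc. What follows is only a minimal sketch under stated assumptions: it uses raw term counts and a naive tokenizer, whereas the preprocessing and weighting actually used in the study are not described in this abstract. The function names `term_counts` and `cosine_similarity` and the three sample texts are hypothetical and were invented for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count alphabetic tokens (a deliberately naive tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an empty document is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical inputs, invented for illustration only.
transcript = "Now we open a connection and execute the query on the database"
javadoc = "Executes the given SQL query and returns a ResultSet from the database connection"
unrelated = "Shuffle the deck and deal five cards to each player"

print(cosine_similarity(term_counts(transcript), term_counts(javadoc)))
print(cosine_similarity(term_counts(transcript), term_counts(unrelated)))
```

With these sample texts, the transcript scores higher against the Javadoc-style sentence than against the unrelated one, because they share more terms. The same function would apply to video frames if each frame were first turned into a numeric feature vector, but how the study represents frames is not stated here.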

SWAN’17, September 4, 2017, Paderborn, Germany

Published In

SWAN 2017: Proceedings of the 3rd ACM SIGSOFT International Workshop on Software Analytics
September 2017
26 pages
ISBN: 9781450351577
DOI: 10.1145/3121257
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. API documentation
  2. Development Screencasts
  3. Similarity Analytics

Qualifiers

  • Research-article

Conference

ESEC/FSE'17
