
Towards a corpus for credibility assessment in software practitioner blog articles

Published: 21 June 2021

Abstract

Background: Blogs are a source of grey literature widely adopted by software practitioners for disseminating opinion and experience. Analysing such articles can provide useful insights into the state of practice for software engineering research. However, there are challenges in identifying higher-quality content among the large quantity of articles available. Credibility assessment can help in identifying quality content, though there is a lack of existing corpora. Credibility is typically measured through a series of conceptual criteria, with 'argumentation' and 'evidence' being two important criteria.
Objective: We create a corpus labelled for argumentation and evidence that can aid the credibility community. The corpus consists of articles from the blog of a single software practitioner and is publicly available.
Method: Three annotators label the corpus with a series of conceptual credibility criteria, reaching an agreement of 0.82 (Fleiss’ Kappa). We present preliminary analysis of the corpus by using it to investigate the identification of claim sentences (one of our ten labels).
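The inter-annotator agreement reported above (0.82) is Fleiss' Kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch of the computation, with invented labels and annotations (the label names here are illustrative, not the paper's actual ten-label set):

```python
# Sketch: Fleiss' Kappa for three annotators over a shared label set.
# The annotations below are invented for illustration only.
from collections import Counter

def fleiss_kappa(ratings, categories):
    """ratings: list of per-item label lists (one label per annotator)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # Item-by-category count matrix.
    counts = [[Counter(item)[c] for c in categories] for item in ratings]
    # Per-item observed agreement P_i, then the mean P-bar.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_items
    # Chance agreement P_e from the marginal category proportions.
    p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

labels = ["claim", "evidence", "other"]
annotations = [
    ["claim", "claim", "claim"],
    ["evidence", "evidence", "other"],
    ["other", "other", "other"],
    ["claim", "evidence", "claim"],
]
print(fleiss_kappa(annotations, labels))
```

Values above roughly 0.8 are conventionally read as almost perfect agreement, which is why the paper's 0.82 supports the reliability of the annotations.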
Results: We train four systems (BERT, KNN, Decision Tree, and SVM) using three feature sets (bag of words, topic modelling, and InferSent), achieving an F1 score of 0.64 using InferSent with a linear SVM.
Conclusions: Our preliminary results are promising, indicating that the corpus can help future studies in detecting the credibility of grey literature. Future research will investigate the degree to which the sentence level annotations can infer the credibility of the overall document.
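The claim-detection setup described in the Results can be sketched as a standard sentence-classification pipeline. The paper's best configuration pairs InferSent embeddings with a linear SVM; since InferSent requires a pretrained model, this self-contained sketch substitutes a bag-of-words representation (one of the paper's three feature sets) and uses invented sentences, so the numbers are not the paper's:

```python
# Sketch: claim-sentence classification with a linear SVM, using
# bag-of-words features as a stand-in for InferSent embeddings.
# Training/test sentences and labels are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_sentences = [
    "Microservices always improve deployment speed.",  # claim
    "We migrated our billing system last March.",      # not a claim
    "Monoliths are impossible to scale.",              # claim
    "The team used Docker for local development.",     # not a claim
]
train_labels = [1, 0, 1, 0]  # 1 = claim sentence

clf = make_pipeline(CountVectorizer(), LinearSVC())
clf.fit(train_sentences, train_labels)

test_sentences = [
    "Kubernetes always reduces operational cost.",     # claim
    "The service was deployed on Friday.",             # not a claim
]
test_labels = [1, 0]
pred = clf.predict(test_sentences)
print(f1_score(test_labels, pred))
```

Evaluating with F1 rather than accuracy matters here because claim sentences are a minority class in most documents.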


Cited By

  • (2023) Where and What do Software Architects Blog? An Exploratory Study on Architectural Knowledge in Blogs, and their Relevance to Design Steps. In 2023 IEEE 20th International Conference on Software Architecture (ICSA), 129–140. DOI: 10.1109/ICSA56044.2023.00020. Online publication date: March 2023.


Published In

EASE '21: Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering
June 2021
417 pages
ISBN:9781450390538
DOI:10.1145/3463274
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. argumentation mining
  2. credibility assessment
  3. experience mining
  4. text mining

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EASE 2021

Acceptance Rates

Overall Acceptance Rate 71 of 232 submissions, 31%

