DOI: 10.1145/3442442.3452342
research-article

Tracing the Factoids: the Anatomy of Information Re-organization in Wikipedia Articles

Published: 03 June 2021

Abstract

Wikipedia articles are known for their exhaustive knowledge and extensive collaboration. Users perform various tasks on them: editing to add new facts or rectify mistakes, looking up new topics, or simply browsing. In this paper, we investigate the impact of gradual edits on the re-positioning and organization of factual information in Wikipedia articles. The literature shows that in a collaborative system, a set of contributors is responsible for seeking, perceiving, and organizing information. However, very little is known about how information organization evolves in Wikipedia articles. Based on our analysis, we show that in a Wikipedia article, the crowd is capable of placing factual information in its correct position, eventually reducing knowledge gaps. We also show that the majority of information re-arrangement occurs in the initial stages of article development and gradually decreases in later stages.
Our findings advance the understanding of the fundamentals of information organization in Wikipedia articles and can have implications for developers aiming to improve the content quality and completeness of Wikipedia articles.

Information

Published In

WWW '21: Companion Proceedings of the Web Conference 2021
April 2021
726 pages
ISBN: 9781450383134
DOI: 10.1145/3442442
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Wikipedia
  2. factoids
  3. information organization
  4. information seeking
  5. knowledge building
  6. semantic similarity
  7. sentence embedding

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '21: The Web Conference 2021
April 19 - 23, 2021
Ljubljana, Slovenia

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (last 12 months): 11
  • Downloads (last 6 weeks): 1
Reflects downloads up to 14 Sep 2024.
