DOI: 10.1145/3078714.3078720
research-article

Estimating Relative User Expertise for Content Quality Prediction on Reddit

Published: 04 July 2017

Abstract

Reddit, as a social curation site, relies on its users to curate content from the World Wide Web (WWW) for consumption by other users. Content on the site is enriched through user comments, discussions and extensions. This additional content varies in quality, however, ranging from meaningful information to misleading content, depending on the reliability, expertise and intention of its authors. Reddit relies on the Wisdom of the Crowd (WotC) of its community, as well as selected moderators, to manage its content. We argue that this approach suffers from a cold-start problem in collecting user votes and is at risk of user bias, particularly a group-think mentality. In addition, managing the large collection of content on Reddit is expensive. In our study, we explore the estimation of relative user expertise through various content-agnostic approaches. We show that it is possible to infer information quality on Reddit from the expertise of the authors. This prediction of content quality could lead to an improved organisation (re-ranking) of Reddit content for user consumption and future information retrieval.
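To make the idea of content-agnostic, relative expertise concrete, the following is a minimal sketch and not the method used in the paper, which the abstract does not specify. It assumes that comments in the same thread can be treated as pairwise contests decided by their vote scores, and it swaps in a plain Elo-style rating update as the estimator. The function names, the constants `k` and `base`, and the toy data are all hypothetical. The text of a comment is never inspected; only who wrote it and how it fared against its siblings.

```python
from collections import defaultdict
from itertools import combinations


def expected_win(r_a, r_b):
    """Elo-style probability that the author rated r_a beats the author rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def estimate_expertise(threads, k=32.0, base=1500.0):
    """Estimate relative author ratings from past threads.

    threads: list of threads, each a list of (author, vote_score) pairs.
    Assumption: the higher-scored comment "wins" each pairwise contest.
    """
    ratings = defaultdict(lambda: base)
    for comments in threads:
        for (a, score_a), (b, score_b) in combinations(comments, 2):
            if a == b or score_a == score_b:
                continue  # skip self-comparisons and ties
            outcome = 1.0 if score_a > score_b else 0.0
            delta = k * (outcome - expected_win(ratings[a], ratings[b]))
            ratings[a] += delta
            ratings[b] -= delta
    return dict(ratings)


def rank_by_author_expertise(comments, ratings, base=1500.0):
    """Order new (author, text) comments by the author's rating, before any votes exist."""
    return sorted(comments, key=lambda c: ratings.get(c[0], base), reverse=True)


# Toy voting history (hypothetical data).
history = [
    [("alice", 40), ("bob", 3), ("carol", 12)],
    [("alice", 15), ("carol", 2)],
    [("bob", 1), ("carol", 9)],
]
ratings = estimate_expertise(history)

# Re-rank a fresh thread that has not collected any votes yet.
new_thread = [("bob", "..."), ("alice", "..."), ("carol", "...")]
ranked = rank_by_author_expertise(new_thread, ratings)
```

Because the ranking of the new thread depends only on ratings learned from earlier threads, a sketch like this can order comments before any votes arrive, which is the cold-start situation the abstract describes. Authors with no history fall back to the default rating.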


Published In

HT '17: Proceedings of the 28th ACM Conference on Hypertext and Social Media
July 2017
336 pages
ISBN: 9781450347082
DOI: 10.1145/3078714
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. information quality
  2. information retrieval
  3. knowledge management
  4. reddit
  5. user expertise

Qualifiers

  • Research-article

Conference

HT'17
HT'17: 28th Conference on Hypertext and Social Media
July 4 - 7, 2017
Prague, Czech Republic

Acceptance Rates

HT '17 Paper Acceptance Rate 19 of 69 submissions, 28%;
Overall Acceptance Rate 378 of 1,158 submissions, 33%

