DOI:10.1145/3308558.3313712
research-article

A Semi-Supervised Active-learning Truth Estimator for Social Networks

Published: 13 May 2019

Abstract

This paper introduces an active-learning-based truth estimator for social networks, such as Twitter, that significantly enhances estimation accuracy by requesting that a small, well-selected fraction of the data be labeled. Data assessment and truth discovery from arbitrary open online sources is a hard problem because source reliability is uncertain. Multiple truth-finding systems have been developed to solve this problem, but their accuracy is limited by the noisy nature of the data, in which distortions, fabrications, omissions, and duplications are introduced. This paper presents a semi-supervised truth estimator for social networks in which a portion of the inputs is carefully selected to be reliably verified. The challenge is to find the subset of observations whose verification would maximally enhance the overall fact-finding accuracy. This work extends previous passive approaches to recursive truth estimation, as well as semi-supervised approaches in which the estimator has no control over the choice of data to be labeled. Results show that, by optimally selecting the claims to be verified, we improve estimation accuracy by 12% over the unsupervised baseline and by 5% over previous semi-supervised approaches.
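
The abstract does not spell out the estimator or the rule used to select claims for verification, so the following is only an illustrative sketch of the general idea and not the paper's method. It assumes a toy reliability-weighted voting estimator and substitutes uncertainty sampling (a generic active-learning heuristic) as the selection rule. All names, including `estimate_truth` and `pick_claim_to_verify`, and the initial reliability of 0.8 are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (source, claim, asserted value); a toy data model assumed for this sketch.
Observation = Tuple[str, str, bool]


def estimate_truth(
    observations: List[Observation],
    verified: Optional[Dict[str, bool]] = None,
    iterations: int = 20,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Alternate between claim beliefs and source reliabilities.

    Claims present in `verified` are clamped to their verified label,
    which is how the (assumed) semi-supervision enters the estimate.
    """
    verified = verified or {}
    sources = {s for s, _, _ in observations}
    claims = {c for _, c, _ in observations}
    reliability = {s: 0.8 for s in sources}  # assumed initial value
    belief = {c: 0.5 for c in claims}
    for _ in range(iterations):
        # Belief in a claim: reliability-weighted share of supporting votes.
        for c in claims:
            if c in verified:
                belief[c] = 1.0 if verified[c] else 0.0
                continue
            support = sum(reliability[s] for s, cc, v in observations if cc == c and v)
            oppose = sum(reliability[s] for s, cc, v in observations if cc == c and not v)
            total = support + oppose
            belief[c] = support / total if total > 0 else 0.5
        # Reliability of a source: mean agreement with the current beliefs.
        for s in sources:
            scores = [belief[c] if v else 1.0 - belief[c]
                      for ss, c, v in observations if ss == s]
            reliability[s] = sum(scores) / len(scores)
    return belief, reliability


def pick_claim_to_verify(belief: Dict[str, float], verified: Dict[str, bool]) -> str:
    """Uncertainty sampling: choose the unverified claim closest to 0.5."""
    candidates = [c for c in belief if c not in verified]
    return min(candidates, key=lambda c: (abs(belief[c] - 0.5), c))


# Toy usage: three sources reporting on three claims.
observations = [
    ("a", "c1", True), ("b", "c1", True), ("c", "c1", False),
    ("a", "c2", True), ("c", "c2", False),
    ("b", "c3", True), ("c", "c3", True),
]
verified: Dict[str, bool] = {}
belief, reliability = estimate_truth(observations, verified)
first_query = pick_claim_to_verify(belief, verified)

# Pretend a human verifier reports that the queried claim is false.
verified[first_query] = False
belief_after, reliability_after = estimate_truth(observations, verified)
```

In this toy run the verified label is clamped in the belief update, so it also changes the reliabilities of the sources that reported on that claim, and through them the beliefs in the remaining unverified claims. A passive semi-supervised variant would use the same estimator but would have no say in which claim receives a label.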

Published In

WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN:9781450366748
DOI:10.1145/3308558
In-Cooperation

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Active Learning
  2. Maximum Likelihood Estimation
  3. Semi Supervision
  4. Social Sensing
  5. Truth Discovery

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '19: The Web Conference
May 13 - 17, 2019
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
