A Semi-Supervised Active-learning Truth Estimator for Social Networks

Authors:

Tarek Abdelzaher,

Lance Kaplan

WWW '19: The World Wide Web Conference

Pages 296 - 306

https://doi.org/10.1145/3308558.3313712

Published: 13 May 2019

Abstract

This paper introduces an active-learning-based truth estimator for social networks, such as Twitter, that significantly improves estimation accuracy by requesting labels for a small, well-selected fraction of the data. Data assessment and truth discovery from arbitrary open online sources is a hard problem because source reliability is uncertain. Multiple truth-finding systems have been developed to solve this problem, but their accuracy is limited by the noisy nature of the data, which is subject to distortions, fabrications, omissions, and duplication. This paper presents a semi-supervised truth estimator for social networks in which a portion of the inputs is carefully selected to be reliably verified. The challenge is to find the subset of observations whose verification would maximally enhance overall fact-finding accuracy. This work extends previous passive approaches to recursive truth estimation, as well as semi-supervised approaches in which the estimator has no control over the choice of data to be labeled. Results show that, by optimally selecting claims to be verified, we improve estimated accuracy by 12% over the unsupervised baseline and by 5% over previous semi-supervised approaches.
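
To make the selection problem concrete, the sketch below shows plain uncertainty sampling, a generic active-learning heuristic, applied to claim verification. It is not the selection criterion used in the paper, which the abstract does not specify. The function names, the belief scores, and the labeling budget are all hypothetical: the sketch assumes some unsupervised estimator has already assigned each claim a probability of being true, picks the claims closest to 0.5 for verification, and then clamps the verified claims to their labeled values.

```python
from typing import Dict, List


def select_claims_to_verify(beliefs: Dict[str, float], budget: int) -> List[str]:
    """Return up to `budget` claim ids whose belief is closest to 0.5.

    `beliefs` maps a claim id to an estimated probability that the claim
    is true (hypothetical output of an unsupervised truth estimator).
    """
    if budget <= 0:
        return []
    # Most uncertain first; ties are broken by claim id for determinism.
    ranked = sorted(beliefs, key=lambda c: (abs(beliefs[c] - 0.5), c))
    return ranked[:budget]


def apply_labels(beliefs: Dict[str, float], labels: Dict[str, bool]) -> Dict[str, float]:
    """Return a copy of `beliefs` with verified claims clamped to 0.0 or 1.0."""
    updated = dict(beliefs)
    for claim, is_true in labels.items():
        updated[claim] = 1.0 if is_true else 0.0
    return updated


if __name__ == "__main__":
    # Illustrative belief scores only; not data from the paper.
    beliefs = {"c1": 0.95, "c2": 0.52, "c3": 0.10, "c4": 0.45}
    to_verify = select_claims_to_verify(beliefs, budget=2)
    print(to_verify)
    # Pretend a human verifier labeled the first selected claim as true.
    print(apply_labels(beliefs, {to_verify[0]: True}))
```

In a full semi-supervised loop, the clamped labels would be fed back into the estimator so that source-reliability estimates, and hence the beliefs of unverified claims, are recomputed before the next selection round.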

Published In

WWW '19: The World Wide Web Conference

May 2019

3620 pages

ISBN:9781450366748

DOI:10.1145/3308558

Editors: Ling Liu (Georgia Tech, USA), Ryen White (Microsoft Research, USA)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '19

WWW '19: The Web Conference

May 13 - 17, 2019

San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
