DOI: 10.1145/3038912.3052718
research-article

Detecting Large Reshare Cascades in Social Networks

Published: 03 April 2017

Abstract

Detecting large reshare cascades is an important problem in online social networks. There have been a variety of attempts to model this problem, ranging from time series analysis methods to stochastic processes. Most of these approaches depend heavily on the underlying network features and use network information to detect the virality of cascades. In most cases, however, obtaining such detailed network information is hard or even impossible.
In contrast, we propose SANSNET, a network-agnostic approach. Our method answers two important questions: (1) Will a cascade go viral? and (2) How early can we predict it? We use techniques from survival analysis to build a supervised classifier in the space of survival probabilities and show that the optimal decision boundary is a survival function. A notable feature of our approach is that it uses no network-based features for the prediction tasks, which makes it very cheap to implement. Finally, we evaluate our approach on several real-life data sets, including popular social networks such as Facebook and Twitter, on metrics such as recall, F-measure, and breakout coverage. We find that the network-agnostic SANSNET classifier outperforms several non-trivial competitors and baselines that utilize network information.
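To make the notion of a survival probability concrete, the following is a minimal sketch of the standard Kaplan-Meier estimator. It is not the SANSNET classifier, whose details the abstract does not give. As an illustrative assumption, the "event" is a cascade reaching some fixed reshare count, and cascades that had not reached it by the end of observation are treated as right-censored. The function names and the toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function S(t).

    durations: time at which each cascade either hit the (assumed) size
               threshold or stopped being observed.
    observed:  True if the threshold was reached (event), False if the
               cascade was right-censored.
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    events = Counter(t for t, o in zip(durations, observed) if o)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


def survival_at(curve, t):
    """Evaluate the right-continuous step function S at time t."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


# Toy data (made up): six cascades, three of which reached the threshold.
durations = [2, 3, 3, 5, 8, 8]
observed = [True, True, False, True, False, False]
curve = kaplan_meier(durations, observed)
```

Here `survival_at(curve, t)` estimates the probability that a cascade has not yet reached the threshold by time `t`. A classifier working in the space of survival probabilities would compare such quantities across cascades; how SANSNET does so is described in the paper itself, not in this sketch.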

Published In

WWW '17: Proceedings of the 26th International Conference on World Wide Web
April 2017
1678 pages
ISBN: 9781450349130

Sponsors

  • IW3C2: International World Wide Web Conference Committee

Publisher

International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland

Author Tags

  1. detecting cascades
  2. reshare cascades
  3. social networks
  4. survival model analysis

Qualifiers

  • Research-article

Conference

WWW '17
Sponsor:
  • IW3C2

Acceptance Rates

WWW '17 Paper Acceptance Rate: 164 of 966 submissions, 17%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
