DOI: 10.1145/3394486.3403136

Mining Persistent Activity in Continually Evolving Networks

Published: 20 August 2020

Abstract

Frequent pattern mining is a key area of study that gives insights into the structure and dynamics of evolving networks, such as social or road networks. However, not only does a network evolve, but often the way that it evolves itself evolves. Thus, knowing, in addition to patterns' frequencies, for how long and how regularly they have occurred (i.e., their persistence) can add to our understanding of evolving networks. In this work, we propose the problem of mining activity that persists through time in continually evolving networks, i.e., activity that repeatedly and consistently occurs. We extend the notion of temporal motifs to capture activity among specific nodes, in what we call activity snippets, which are small sequences of edge-updates that reoccur. We propose axioms and properties that a measure of persistence should satisfy, and develop such a persistence measure. We also propose PENminer, an efficient framework for mining activity snippets' Persistence in Evolving Networks, and design both offline and streaming algorithms. We apply PENminer to numerous real, large-scale evolving networks and edge streams, and find activity that is surprisingly regular over a long period of time but too infrequent to be discovered by aggregate count alone, as well as bursts of activity exposed by their lack of persistence. Our findings with PENminer include neighborhoods in NYC where taxi traffic persisted through Hurricane Sandy, the opening of new bike-stations, characteristics of social network users, and more. Moreover, we use PENminer to identify anomalies in multiple networks, outperforming baselines at identifying subtle anomalies by 9.8-48% in AUC.
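
To make the distinction between frequency and persistence concrete, the following is a minimal sketch and not the persistence measure developed in the paper. It assumes a hypothetical stand-in score, `gap_entropy`, which is the entropy of the gaps that a snippet's occurrences cut an observation window into. The function name, the window bounds, and the two example occurrence lists are illustrative assumptions only.

```python
import math


def gap_entropy(times, t_start, t_end):
    """Toy regularity score (NOT the paper's measure): entropy, in nats,
    of the gaps that the occurrence times cut [t_start, t_end] into.

    More occurrences that are spread more evenly give a higher score.
    """
    if t_end <= t_start:
        raise ValueError("t_end must be greater than t_start")
    if any(t < t_start or t > t_end for t in times):
        raise ValueError("all occurrence times must lie inside the window")

    points = [t_start] + sorted(times) + [t_end]
    total = t_end - t_start
    entropy = 0.0
    for left, right in zip(points, points[1:]):
        p = (right - left) / total  # fraction of the window this gap covers
        if p > 0:
            entropy -= p * math.log(p)
    return entropy


# Hypothetical occurrence times of two snippets in the window [0, 100].
regular = [20, 40, 60, 80]                  # few occurrences, evenly spaced
burst = [50 + 0.05 * i for i in range(20)]  # many occurrences, one short burst

regular_score = gap_entropy(regular, 0, 100)
burst_score = gap_entropy(burst, 0, 100)
print(len(regular), regular_score)
print(len(burst), burst_score)
```

Under this toy score, the evenly spaced snippet scores higher than the burst even though the burst has five times as many occurrences, which is the kind of activity that an aggregate count alone would rank the other way around.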

Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN: 9781450379984
DOI: 10.1145/3394486
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. edge streams
  2. evolving networks
  3. persistence

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
