A density-based algorithm for discovering clusters in large spatial databases with noise

Authors: Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu

KDD'96: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining

Pages 226 - 231

Published: 02 August 1996

Abstract

Clustering algorithms are attractive for the task of class identification in spatial databases. However, applying them to large spatial databases raises the following requirements: minimal domain knowledge needed to determine the input parameters, discovery of clusters with arbitrary shape, and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, we present the new clustering algorithm DBSCAN, which relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape. DBSCAN requires only one input parameter and supports the user in determining an appropriate value for it. We performed an experimental evaluation of the effectiveness and efficiency of DBSCAN using synthetic data and real data of the SEQUOIA 2000 benchmark. The results of our experiments demonstrate that (1) DBSCAN is significantly more effective in discovering clusters of arbitrary shape than the well-known algorithm CLARANS, and that (2) DBSCAN outperforms CLARANS by a factor of more than 100 in terms of efficiency.
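
The abstract does not spell out the procedure itself, so the following is only a minimal sketch of density-based clustering as it is commonly described, not the authors' implementation. The names `dbscan`, `region_query`, `eps` and `min_pts` are assumptions made for illustration. The default `min_pts=4` is also an assumption, chosen so that the radius `eps` is the single parameter the caller must supply. The brute-force neighbor search is a stand-in; a large spatial database would presumably use a spatial index instead.

```python
from math import dist

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet examined


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (includes i).

    Brute force, O(n) per query; an assumption made for brevity.
    """
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts=4):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not dense enough to start a cluster; may still be
            # relabeled later as a border point of some cluster.
            labels[i] = NOISE
            continue
        # points[i] is a core point: grow a new cluster from it.
        labels[i] = cluster_id
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # points[j] is itself a core point, so keep expanding.
                seeds.extend(j_neighbors)
        cluster_id += 1
    return labels


if __name__ == "__main__":
    # Two dense 2x2 groups and one isolated point (hypothetical data).
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (50, 50)]
    print(dbscan(pts, eps=1.5))
```

Because clusters grow by following chains of dense neighborhoods rather than by distance to a center, a sketch like this can trace clusters of arbitrary shape and leaves isolated points labeled as noise.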

Index Terms
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
  2. Information systems applications
    1. Data mining

Published in: KDD'96: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, August 1996, 387 pages

Publisher: AAAI Press
