An Asymptotically Tighter Bound on Sampling for Frequent Itemsets Mining

Ji, Shiyu; Wan, Kun

arXiv:1703.08273 (cs)

[Submitted on 24 Mar 2017]

Abstract: We present a new error bound for sampling algorithms for frequent itemset mining. The new bound is asymptotically tighter than the state-of-the-art bounds: given the chosen samples, for a small enough error probability, it is roughly half of the existing bounds. Based on the new bound, we give a new approximation algorithm that is much simpler than the existing approximation algorithms yet still guarantees the worst-case approximation error with a precomputed sample size. We also give an algorithm that approximates the top-$k$ frequent itemsets with high accuracy and efficiency.

Comments:	13 pages, 2 figures, 2 tables
Subjects:	Data Structures and Algorithms (cs.DS); Databases (cs.DB)
Cite as:	arXiv:1703.08273 [cs.DS]
	(or arXiv:1703.08273v1 [cs.DS] for this version)
	https://doi.org/10.48550/arXiv.1703.08273

Submission history

From: Shiyu Ji
[v1] Fri, 24 Mar 2017 02:59:51 UTC (19 KB)
