On Approximating the Lp Distances for p>2

Li, Ping

arXiv:0806.4423 (cs)

[Submitted on 27 Jun 2008]

Abstract: Applications in machine learning and data mining require computing pairwise Lp distances in a data matrix A. For massive high-dimensional data, computing all pairwise distances of A can be infeasible; even storing A, or all of its pairwise distances, in memory may be infeasible. This paper proposes a simple method for p = 2, 4, 6, ... We first decompose the Lp distance (where p is even) into a sum of 2 marginal norms and p-1 "inner products" of different orders. We then apply normal or sub-Gaussian random projections to approximate the resulting "inner products," assuming that the marginal norms can be computed exactly by a linear scan. We propose two strategies for applying random projections. The basic projection strategy requires only one projection matrix but is more difficult to analyze, while the alternative projection strategy requires p-1 projection matrices but is much easier to analyze theoretically. In terms of accuracy, at least for p = 4, the basic strategy is always more accurate than the alternative strategy if the data are non-negative, which is common in practice.
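As an illustration of the decomposition described in the abstract, consider p = 4. The binomial expansion gives sum (x_i - y_i)^4 = sum x_i^4 + sum y_i^4 - 4 sum x_i^3 y_i + 6 sum x_i^2 y_i^2 - 4 sum x_i y_i^3, that is, 2 marginal norms and p-1 = 3 "inner products". The Python sketch below is not taken from the paper: the function names, the Gaussian projection matrix with 1/k scaling, the `shared_matrix` flag, and the convention that the "distance" is the sum of fourth powers (without the 1/p root) are all assumptions made for illustration. With `shared_matrix=True` one matrix is reused for every term, which is one reading of the basic strategy; with `shared_matrix=False` each of the p-1 terms gets its own independent matrix, which is one reading of the alternative strategy.

```python
import numpy as np


def l4_distance_exact(x, y):
    """Sum of fourth powers of the coordinate differences (no 1/p root)."""
    return np.sum((x - y) ** 4)


def l4_distance_decomposed(x, y):
    """Same quantity, written as 2 marginal norms plus 3 'inner products'."""
    margins = np.sum(x ** 4) + np.sum(y ** 4)
    return (margins
            - 4.0 * np.dot(x ** 3, y)
            + 6.0 * np.dot(x ** 2, y ** 2)
            - 4.0 * np.dot(x, y ** 3))


def l4_distance_projected(x, y, k, rng, shared_matrix=True):
    """Hypothetical estimator: exact margins, projected 'inner products'.

    Each inner product sum(x^a * y^b) is estimated by (u . v) / k, where
    u = (x^a) R and v = (y^b) R for a D x k matrix R of i.i.d. N(0, 1)
    entries. This estimate is unbiased because E[R R^T] = k * I.
    """
    dim = x.shape[0]
    margins = np.sum(x ** 4) + np.sum(y ** 4)  # assumed computed by a linear scan
    # (coefficient, power of x, power of y) for the three cross terms
    terms = [(-4.0, 3, 1), (6.0, 2, 2), (-4.0, 1, 3)]
    shared = rng.standard_normal((dim, k)) if shared_matrix else None
    estimate = margins
    for coef, a, b in terms:
        # Reuse one matrix for all terms, or draw a fresh one per term.
        R = shared if shared_matrix else rng.standard_normal((dim, k))
        u = (x ** a) @ R
        v = (y ** b) @ R
        estimate += coef * np.dot(u, v) / k
    return estimate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = rng.random(20), rng.random(20)  # non-negative toy data
    print("exact      :", l4_distance_exact(x, y))
    print("decomposed :", l4_distance_decomposed(x, y))
    print("one matrix :", l4_distance_projected(x, y, 2000, rng, True))
    print("p-1 matrices:", l4_distance_projected(x, y, 2000, rng, False))
```

In a real setting only the k-dimensional projections of the powers of each row would be stored, rather than the rows themselves; the sketch keeps x and y in memory purely to compare against the exact value.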

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:0806.4423 [cs.LG]
	(or arXiv:0806.4423v1 [cs.LG] for this version)
 	https://doi.org/10.48550/arXiv.0806.4423

Submission history

From: Ping Li
[v1] Fri, 27 Jun 2008 05:36:09 UTC (39 KB)