- research-article, January 2024
Better Graph Embeddings for Enterprise Graphs
CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), Pages 368–374. https://doi.org/10.1145/3632410.3632412
Graph embeddings are scalable and performant node representations in a graph. Fast Random Projections (FastRP) is claimed to be thousands of times faster to generate embeddings compared to random walk-based algorithms like DeepWalk and Node2Vec, while ...
- research-article, June 2023
Depth-𝑑 Threshold Circuits vs. Depth-(𝑑+1) AND-OR Trees
STOC 2023: Proceedings of the 55th Annual ACM Symposium on Theory of Computing, Pages 895–904. https://doi.org/10.1145/3564246.3585216
For any n ∈ ℕ and d = o(log log n), we prove that there is a Boolean function F on n bits and a value γ = 2^{−Θ(d)} such that F can be computed by a uniform depth-(d + 1) AC^0 circuit with O(n) wires, but F cannot be computed by any depth-d TC^0 circuit ...
- research-article, January 2021
Estimating Leverage Scores via Rank Revealing Methods and Randomization
SIAM Journal on Matrix Analysis and Applications (SIMAX), Volume 42, Issue 3, Pages 1199–1228. https://doi.org/10.1137/20M1314471
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank. Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction ...
- research-article, December 2020
Randomized tests for high-dimensional regression: a more efficient and powerful solution
NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems, Article No.: 396, Pages 4721–4732.
We investigate the problem of testing the global null in the high-dimensional regression models when the feature dimension p grows proportionally to the number of observations n. Despite a number of prior work studying this problem, whether there exists ...
- research-article, January 2020
Sparse projection oblique randomer forests
- Tyler M. Tomita,
- James Browne,
- Cencheng Shen,
- Jaewon Chung,
- Jesse L. Patsolic,
- Benjamin Falk,
- Carey E. Priebe,
- Jason Yim,
- Randal Burns,
- Mauro Maggioni,
- Joshua T. Vogelstein
The Journal of Machine Learning Research (JMLR), Volume 21, Issue 1, Article No.: 104, Pages 4193–4231.
Decision forests, including Random Forests and Gradient Boosting Trees, have recently demonstrated state-of-the-art performance in a variety of machine learning settings. Decision forests are typically ensembles of axis-aligned decision trees; that is, ...
- research-article, June 2019
Oblivious dimension reduction for k-means: beyond subspaces and the Johnson-Lindenstrauss lemma
STOC 2019: Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, Pages 1039–1050. https://doi.org/10.1145/3313276.3316318
We show that for n points in d-dimensional Euclidean space, a data oblivious random projection of the columns onto m ∈ O((log k + log log n) ε^{−6} log(1/ε)) dimensions is sufficient to approximate the cost of all k-means clusterings up to a multiplicative (1±ε) ...
- research-article, June 2019
Optimal terminal dimensionality reduction in Euclidean space
STOC 2019: Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, Pages 1064–1069. https://doi.org/10.1145/3313276.3316307
Let ε ∈ (0,1) and X ⊂ ℝ^d be arbitrary with |X| having size n > 1. The Johnson-Lindenstrauss lemma states there exists f: X → ℝ^m with m = O(ε^{−2} log n) such that ∀x ∈ X ∀y ∈ X, ‖x−y‖_2 ≤ ‖f(x)−f(y)‖_2 ≤ (1+ε)‖x−y‖_2. We show that ...
- research-article, August 2017
An Average-Case Depth Hierarchy Theorem for Boolean Circuits
Journal of the ACM (JACM), Volume 64, Issue 5, Article No.: 35, Pages 1–27. https://doi.org/10.1145/3095799
We prove an average-case depth hierarchy theorem for Boolean circuits over the standard basis of AND, OR, and NOT gates. Our hierarchy theorem says that for every d ≥ 2, there is an explicit n-variable Boolean function f, computed by a linear-size depth-...
- article, January 2017
Adaptive randomized dimension reduction on massive data
The scalability of statistical estimators is of increasing importance in modern applications. One approach to implementing scalable algorithms is to compress data into a low dimensional latent space using dimension reduction methods. In this paper, we ...
- research-article, June 2016
Poly-logarithmic Frege depth lower bounds via an expander switching lemma
STOC '16: Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, Pages 644–657. https://doi.org/10.1145/2897518.2897637
We show that any polynomial-size Frege refutation of a certain linear-size unsatisfiable 3-CNF formula over n variables must have depth Ω(√(log n)). This is an exponential improvement over the previous best results (Pitassi et al. 1993, Krajíček et al. ...
- research-article, June 2016
Near-optimal small-depth lower bounds for small distance connectivity
STOC '16: Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, Pages 612–625. https://doi.org/10.1145/2897518.2897534
We show that any depth-d circuit for determining whether an n-node graph has an s-to-t path of length at most k must have size n^{Ω(k^{1/d}/d)} when k(n) ≤ n^{1/5}, and n^{Ω(k^{1/(5d)}/d)} when k(n) ≤ n. The previous best circuit size lower bounds were n^{k^{exp(−O(d))}} (by ...
- article, June 2016
Toward large-scale continuous eda: A random matrix theory perspective
Evolutionary Computation (EVOL), Volume 24, Issue 2, Pages 255–291. https://doi.org/10.1162/EVCO_a_00150
Estimations of distribution algorithms (EDAs) are a major branch of evolutionary algorithms (EA) with some unique advantages in principle. They are able to take advantage of correlation structure to drive the search more efficiently, and they are able to ...
- research-article, October 2015
Soft Content Fingerprinting With Bit Polarization Based on Sign-Magnitude Decomposition
IEEE Transactions on Information Forensics and Security (TIFS), Volume 10, Issue 10, Pages 2033–2047. https://doi.org/10.1109/TIFS.2015.2432744
Content identification based on digital content fingerprinting attracts significant attention in different emerging applications. In this paper, we consider content identification based on the sign-magnitude decomposition of fingerprint codewords and ...
- research-article, September 2015
A Quantized Johnson–Lindenstrauss Lemma: The Finding of Buffon’s Needle
IEEE Transactions on Information Theory (ITHR), Volume 61, Issue 9, Pages 5012–5027. https://doi.org/10.1109/TIT.2015.2453355
In 1733, Georges-Louis Leclerc, Comte de Buffon in France, set the ground of geometric probability theory by defining an enlightening problem: what is the probability that a needle thrown randomly on a ground made of equispaced parallel strips lies on two ...
- research-article, June 2015
SATTVA: SpArsiTy inspired classificaTion of malware VAriants
IH&MMSec '15: Proceedings of the 3rd ACM Workshop on Information Hiding and Multimedia Security, Pages 135–140. https://doi.org/10.1145/2756601.2756616
There is an alarming increase in the amount of malware that is generated today. However, several studies have shown that most of these new malware are just variants of existing ones. Fast detection of these variants plays an effective role in thwarting ...
- research-article, May 2015
Parallel Streaming Signature EM-tree: A Clustering Algorithm for Web Scale Applications
WWW '15: Proceedings of the 24th International Conference on World Wide Web, Pages 216–226. https://doi.org/10.1145/2736277.2741111
The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does ...
- research-article, November 2014
Solving Linear SVMs with Multiple 1D Projections
CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, Pages 221–230. https://doi.org/10.1145/2661829.2661994
We present a new methodology for solving linear Support Vector Machines (SVMs) that capitalizes on multiple 1D projections. We show that the approach approximates the optimal solution with high accuracy and comes with analytical guarantees. Our solution ...
- Article, October 2014
Distributed Compressive Detection with Perfect Secrecy
MASS '14: Proceedings of the 2014 IEEE 11th International Conference on Mobile Ad Hoc and Sensor Systems, Pages 674–679. https://doi.org/10.1109/MASS.2014.40
This paper considers the problem of distributed compressive detection under a perfect secrecy constraint. More specifically, we consider the problem where the distributed inference network operates in the presence of an eavesdropper who wants to ...
- Article, June 2014
Using Projection Kurtosis Concentration of Natural Images for Blind Noise Covariance Matrix Estimation
CVPR '14: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Pages 2870–2876. https://doi.org/10.1109/CVPR.2014.367
Kurtosis of 1D projections provides important statistical characteristics of natural images. In this work, we first provide a theoretical underpinning to a recently observed phenomenon known as projection kurtosis concentration that the kurtosis of ...
- article, January 2014
Efficient learning and planning with compressed predictive states
Predictive state representations (PSRs) offer an expressive framework for modelling partially observable systems. By compactly representing systems as functions of observable quantities, the PSR learning approach avoids using local-minima prone ...