- research-article, July 2008
Efficient multiclass maximum margin clustering
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1248–1255. https://doi.org/10.1145/1390156.1390313
This paper presents a cutting plane algorithm for multiclass maximum margin clustering (MMC). The proposed algorithm constructs a nested sequence of successively tighter relaxations of the original MMC problem, and each optimization problem in this ...
- research-article, July 2008
Estimating local optimums in EM algorithm over Gaussian mixture model
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1240–1247. https://doi.org/10.1145/1390156.1390312
The EM algorithm is a popular iterative method for estimating the parameters of a Gaussian mixture model from a large observation set. However, in most cases the EM algorithm is not guaranteed to converge to the global optimum. Instead, it stops at some ...
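The EM iteration this abstract refers to can be sketched on synthetic 1-D data (the data, initialisation, and iteration count below are illustrative choices, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic draw from a two-component 1-D Gaussian mixture (illustrative only)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

weights = np.array([0.5, 0.5])   # mixing weights
mu = np.array([-1.0, 1.0])       # component means
var = np.array([1.0, 1.0])       # component variances

for _ in range(50):
    # E-step: posterior responsibility of each component for each point
    dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    r = weights * dens
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances in closed form
    nk = r.sum(axis=0)
    weights = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
```

Each iteration raises the likelihood, but, as the abstract notes, the fixed point reached depends on the initialisation — a different starting `mu` can land in a different local optimum.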
- research-article, July 2008
Improved Nyström low-rank approximation and error analysis
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1232–1239. https://doi.org/10.1145/1390156.1390311
Low-rank matrix approximation is an effective tool for alleviating the memory and computational burdens of kernel methods, and sampling, as the mainstream approach among such algorithms, has drawn considerable attention in both theory and practice. This paper ...
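The standard Nyström approximation this paper improves on can be sketched as follows; the kernel, landmark count, and data are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))

def rbf(A, B, gamma=0.2):
    # Squared Euclidean distances, then Gaussian kernel
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf(X, X)                                   # full 300 x 300 kernel matrix
idx = rng.choice(300, size=50, replace=False)   # uniformly sampled landmarks
C = K[:, idx]                                   # cross-kernel block (n x m)
W = K[np.ix_(idx, idx)]                         # landmark block (m x m)
K_approx = C @ np.linalg.pinv(W) @ C.T          # rank-m Nystrom approximation

rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
```

Only the `n x m` block `C` and the small block `W` ever need to be formed, which is the source of the memory savings the abstract mentions.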
- research-article, July 2008
A quasi-Newton approach to non-smooth convex optimization
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1216–1223. https://doi.org/10.1145/1390156.1390309
We extend the well-known BFGS quasi-Newton method and its limited-memory variant LBFGS to the optimization of non-smooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: the local ...
- research-article, July 2008
Preconditioned temporal difference learning
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1208–1215. https://doi.org/10.1145/1390156.1390308
This paper extends many of the recent popular policy evaluation algorithms to a generalized framework that includes least-squares temporal difference (LSTD) learning, least-squares policy evaluation (LSPE) and a variant of incremental LSTD (iLSTD). The ...
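The LSTD estimator that this framework generalises can be sketched on a tiny chain MDP (the chain, rewards, and discount below are illustrative, not from the paper):

```python
import numpy as np

gamma = 0.9
# One pass over a deterministic chain 0 -> 1 -> 2, with state 2 absorbing;
# tuples are (state, reward, next_state)
transitions = [(0, 1.0, 1), (1, 1.0, 2), (2, 0.0, 2)]
phi = np.eye(3)   # tabular (one-hot) features

# LSTD solves A w = b with A = sum phi(s)(phi(s) - gamma phi(s'))^T, b = sum r phi(s)
A = np.zeros((3, 3))
b = np.zeros(3)
for s, r, s_next in transitions:
    A += np.outer(phi[s], phi[s] - gamma * phi[s_next])
    b += r * phi[s]

w = np.linalg.solve(A, b)   # value estimates: V(0)=1.9, V(1)=1.0, V(2)=0.0
```

With tabular features the solve recovers the exact discounted values; the preconditioning view in the paper concerns how this linear system is solved incrementally.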
- research-article, July 2008
Democratic approximation of lexicographic preference models
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1200–1207. https://doi.org/10.1145/1390156.1390307
Previous algorithms for learning lexicographic preference models (LPMs) produce a "best guess" LPM that is consistent with the observations. Our approach is more democratic: we do not commit to a single LPM. Instead, we approximate the target using the ...
- research-article, July 2008
Efficiently learning linear-linear exponential family predictive representations of state
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1176–1183. https://doi.org/10.1145/1390156.1390304
Exponential Family PSR (EFPSR) models capture stochastic dynamical systems by representing state as the parameters of an exponential family distribution over a short-term window of future observations. They are appealing from a learning perspective ...
- research-article, July 2008
Deep learning via semi-supervised embedding
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1168–1175. https://doi.org/10.1145/1390156.1390303
We show how nonlinear embedding algorithms popular for use with shallow semi-supervised learning techniques such as kernel methods can be applied to deep multilayer architectures, either as a regularizer at the output layer, or on each layer of the ...
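The idea of adding an embedding regularizer to a supervised loss can be sketched with a single linear layer standing in for the deep network; the graph, labels, and weighting below are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))            # 6 points, 3 features
y = np.array([0, 1, -1, -1, 0, 1])     # -1 marks unlabeled points

# Toy neighbor graph from L1 distances (illustrative threshold)
W = (np.abs(X[:, None] - X[None]).sum(-1) < 3.5).astype(float)
np.fill_diagonal(W, 0)

theta = rng.normal(size=(3, 2))        # one linear layer stands in for the network
F = X @ theta                          # outputs f(x_i)

# Supervised squared-error loss on the labeled points only
labeled = y >= 0
sup = ((F[labeled] - np.eye(2)[y[labeled]]) ** 2).sum()

# Embedding regularizer: neighboring points should map to nearby outputs
emb = sum(W[i, j] * ((F[i] - F[j]) ** 2).sum()
          for i in range(6) for j in range(6))

loss = sup + 0.1 * emb   # the same penalty can be applied at any layer
```

The regularizer equals `2 tr(F^T L F)` for the graph Laplacian `L`, which is the standard Laplacian-smoothness form used by the shallow embedding methods the abstract mentions.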
- research-article, July 2008
Graph transduction via alternating minimization
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1144–1151. https://doi.org/10.1145/1390156.1390300
Graph transduction methods label input data by learning a classification function that is regularized to exhibit smoothness along a graph over labeled and unlabeled samples. In practice, these algorithms are sensitive to the initial set of labels ...
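A standard propagation scheme of the kind these methods build on can be sketched on a four-node graph (the graph, labels, and α are illustrative; the paper's contribution is an alternating-minimization variant, not this baseline):

```python
import numpy as np

# Two labeled nodes (0 and 2) and two unlabeled ones on a tiny graph
W = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y = np.array([[1, 0],    # node 0: class 0
              [0, 0],    # unlabeled
              [0, 1],    # node 2: class 1
              [0, 0]], dtype=float)

D_inv_sqrt = np.diag(W.sum(1) ** -0.5)
S = D_inv_sqrt @ W @ D_inv_sqrt   # symmetrically normalized affinity
alpha = 0.5

F = Y.copy()
for _ in range(100):
    # Smooth along the graph while staying anchored to the initial labels
    F = alpha * S @ F + (1 - alpha) * Y

pred = F.argmax(1)   # nodes 1 and 3 inherit their neighbors' labels
```

The sensitivity the abstract describes is visible here: the fixed point of the iteration is a linear function of `Y`, so a wrong initial label propagates directly into the predictions.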
- research-article, July 2008
Sparse multiscale Gaussian process regression
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1112–1119. https://doi.org/10.1145/1390156.1390296
Most existing sparse Gaussian process (GP) models seek computational advantages by basing their computations on a set of m basis functions that are the covariance function of the GP with one of its two inputs fixed. We generalise this for the case ...
- research-article, July 2008
A semiparametric statistical approach to model-free policy evaluation
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1072–1079. https://doi.org/10.1145/1390156.1390291
Reinforcement learning (RL) methods based on least-squares temporal difference (LSTD) have been developed recently and have shown good practical performance. However, the quality of their estimation has not been well elucidated. In this article, we ...
- research-article, July 2008
Training restricted Boltzmann machines using approximations to the likelihood gradient
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1064–1071. https://doi.org/10.1145/1390156.1390290
A new algorithm for training Restricted Boltzmann Machines is introduced. The algorithm, named Persistent Contrastive Divergence, is different from the standard Contrastive Divergence algorithms in that it aims to draw samples from almost exactly the ...
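The Persistent Contrastive Divergence update can be sketched for a tiny binary RBM in plain NumPy (sizes, data, and learning rate are illustrative): the distinguishing feature is that the negative-phase fantasy particles persist across parameter updates instead of being re-initialized from the data each step.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy binary data with two repeating patterns (illustrative)
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 50, dtype=float)

n_vis, n_hid, batch, lr = 4, 3, 10, 0.05
W = 0.01 * rng.normal(size=(n_vis, n_hid))
a = np.zeros(n_vis)   # visible biases
b = np.zeros(n_hid)   # hidden biases

# Persistent fantasy particles: kept across updates (the "persistent" part)
v_chain = rng.integers(0, 2, size=(batch, n_vis)).astype(float)

for _ in range(200):
    v0 = data[rng.choice(len(data), batch)]
    h0 = sigmoid(v0 @ W + b)   # positive phase from the data
    # Negative phase: one Gibbs step on the persistent chain
    h_s = (sigmoid(v_chain @ W + b) > rng.random((batch, n_hid))).astype(float)
    v_chain = (sigmoid(h_s @ W.T + a) > rng.random((batch, n_vis))).astype(float)
    h_neg = sigmoid(v_chain @ W + b)
    # Approximate likelihood gradient: data statistics minus model statistics
    W += lr * (v0.T @ h0 - v_chain.T @ h_neg) / batch
    a += lr * (v0 - v_chain).mean(0)
    b += lr * (h0 - h_neg).mean(0)
```

Because the chain is never reset, its samples track the model distribution more closely than CD-1's one-step reconstructions, which is the sense in which the abstract says it draws samples from "almost exactly" the model.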
- research-article, July 2008
ν-support vector machine as conditional value-at-risk minimization
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1056–1063. https://doi.org/10.1145/1390156.1390289
The ν-support vector classification (ν-SVC) algorithm was shown to work well and provide intuitive interpretations, e.g., the parameter ν roughly specifies the fraction of support vectors. Although ν corresponds to a fraction, it cannot take the entire ...
- research-article, July 2008
The many faces of optimism: a unifying approach
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1048–1055. https://doi.org/10.1145/1390156.1390288
The exploration-exploitation dilemma has been an intriguing and unsolved problem within the framework of reinforcement learning. "Optimism in the face of uncertainty" and model building play central roles in advanced exploration methods. Here, we ...
- research-article, July 2008
Apprenticeship learning using linear programming
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1032–1039. https://doi.org/10.1145/1390156.1390286
In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in that the MDP's true reward function is assumed to be unknown. We show how to ...
- research-article, July 2008
A least squares formulation for canonical correlation analysis
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 1024–1031. https://doi.org/10.1145/1390156.1390285
Canonical Correlation Analysis (CCA) is a well-known technique for finding the correlations between two sets of multi-dimensional variables. It projects both sets of variables into a lower-dimensional space in which they are maximally correlated. CCA is ...
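For reference, classical CCA can be computed via an SVD of the whitened cross-covariance; the paper's point is that this can instead be cast as a least-squares problem under mild conditions. The sketch below is the classical computation on synthetic two-view data, not the paper's formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 1))   # shared latent signal linking the two views
X = np.hstack([z + 0.1 * rng.normal(size=(n, 1)), rng.normal(size=(n, 2))])
Y = np.hstack([z + 0.1 * rng.normal(size=(n, 1)), rng.normal(size=(n, 2))])

Xc, Yc = X - X.mean(0), Y - Y.mean(0)
Cxx = Xc.T @ Xc / n + 1e-8 * np.eye(3)   # small ridge for stability
Cyy = Yc.T @ Yc / n + 1e-8 * np.eye(3)
Cxy = Xc.T @ Yc / n

def inv_sqrt(M):
    w, V = np.linalg.eigh(M)
    return V @ np.diag(w ** -0.5) @ V.T

U, s, Vt = np.linalg.svd(inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy))
wx = inv_sqrt(Cxx) @ U[:, 0]   # first canonical direction for X
wy = inv_sqrt(Cyy) @ Vt[0]     # first canonical direction for Y
rho = s[0]                     # first canonical correlation (near 1 here)
```

The projections `Xc @ wx` and `Yc @ wy` both recover the shared signal `z`, so their sample correlation matches `rho`.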
- research-article, July 2008
The asymptotics of semi-supervised learning in discriminative probabilistic models
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 984–991. https://doi.org/10.1145/1390156.1390280
Semi-supervised learning aims at taking advantage of unlabeled data to improve the efficiency of supervised learning procedures. For discriminative models, however, this is a challenging task. In this contribution, we introduce an original methodology ...
- research-article, July 2008
Sample-based learning and search with permanent and transient memories
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 968–975. https://doi.org/10.1145/1390156.1390278
We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learning and search. We apply Dyna-2 to high performance Computer Go. In this ...
- research-article, July 2008
Data spectroscopy: learning mixture models using eigenspaces of convolution operators
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 936–943. https://doi.org/10.1145/1390156.1390274
In this paper we develop a spectral framework for estimating mixture distributions, specifically Gaussian mixture models. In physics, spectroscopy is often used for the identification of substances through their spectrum. Treating a kernel function K(x, ...
- research-article, July 2008
SVM optimization: inverse dependence on training set size
ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 928–935. https://doi.org/10.1145/1390156.1390273
We discuss how the runtime of SVM optimization should decrease as the size of the training data increases. We present theoretical and empirical results demonstrating how a simple subgradient descent approach indeed displays such behavior, at least for ...
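The subgradient-descent behavior the abstract refers to can be sketched with a Pegasos-style solver for the regularized linear SVM objective λ/2‖w‖² + hinge loss (the data, λ, and iteration count below are illustrative, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 2
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=n))   # noisy linear labels
lam = 0.01

w = np.zeros(d)
for t in range(1, 2001):
    i = rng.integers(n)            # pick one training example at random
    eta = 1.0 / (lam * t)          # decaying step size
    margin = y[i] * (w @ X[i])
    # Subgradient of lam/2 ||w||^2 + max(0, 1 - y w.x)
    grad = lam * w - (y[i] * X[i] if margin < 1 else 0.0)
    w -= eta * grad

acc = np.mean(np.sign(X @ w) == y)   # training accuracy
```

Each update touches a single example, so the cost per iteration is independent of `n`; this is the mechanism behind the claimed inverse dependence — with more data, fewer passes are needed to reach a fixed generalization accuracy.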