- research-article, March 2024
Least squares model averaging for distributed data
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 215, Pages 10235–10293
Divide and conquer algorithm is a common strategy applied in big data. Model averaging has the natural divide-and-conquer feature, but its theory has not been developed in big data scenarios. The goal of this paper is to fill this gap. We propose two ...
- research-article, January 2022
Model averaging is asymptotically better than model selection for prediction
The Journal of Machine Learning Research (JMLR), Volume 23, Issue 1, Article No.: 33, Pages 1463–1516
We compare the performance of six model average predictors (Mallows' model averaging, stacking, Bayes model averaging, bagging, random forests, and boosting) to the components used to form them. In all six cases we identify conditions under which the ...
- research-article, January 2020
Empirical priors for prediction in sparse high-dimensional linear regression
The Journal of Machine Learning Research (JMLR), Volume 21, Issue 1, Article No.: 144, Pages 5709–5738
In this paper we adopt the familiar sparse, high-dimensional linear regression model and focus on the important but often overlooked task of prediction. In particular, we consider a new empirical Bayes framework that incorporates data in the prior in two ...
- article, January 2018
Correlated model fusion
Applied Stochastic Models in Business and Industry (ASMBI), Volume 34, Issue 1, Pages 31–43
https://doi.org/10.1002/asmb.2261
Model fusion methods, or more generally ensemble methods, are a useful tool for prediction. Combining predictions from a set of models smooths out biases and reduces variances of predictions from individual models, and hence, the combined predictions ...
- article, January 2017
Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification
This work characterizes the benefits of averaging techniques widely used in conjunction with stochastic gradient descent (SGD). In particular, this work presents a sharp analysis of: (1) mini-batching, a method of averaging many samples of a stochastic ...
- research-article, July 2016
Training deep bidirectional LSTM acoustic model for LVCSR by a context-sensitive-chunk BPTT approach
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 24, Issue 7, Pages 1185–1193
This paper presents a study of using deep bidirectional long short-term memory (DBLSTM) recurrent neural network as acoustic model for DBLSTM-HMM based large vocabulary continuous speech recognition (LVCSR), where a context-sensitive-chunk (CSC) back-...
- research-article, July 2016
Fusion methods for speech enhancement and audio source separation
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 24, Issue 7, Pages 1266–1279
https://doi.org/10.1109/TASLP.2016.2553441
A wide variety of audio source separation techniques exist and can already tackle many challenging industrial issues. However, in contrast with other application domains, fusion principles were rarely investigated in audio source separation despite ...
- article, January 2016
Corporate Default Prediction Model Averaging: A Normative Linear Pooling Approach
Focusing on credit risk modelling, this paper introduces a novel approach for ensemble modelling based on a normative linear pooling. Models are first classified as dominant and competitive, and the pooling is run using the competitive models only. ...
- article, March 2014
Hierarchical probabilistic interaction modeling for multiple gene expression replicates
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), Volume 11, Issue 2, Pages 336–346
https://doi.org/10.1109/TCBB.2014.2299804
Microarray technology allows for the collection of multiple replicates of gene expression time course data for hundreds of genes at a handful of time points. Developing hypotheses about a gene transcriptional network, based on time course gene ...
- Article, September 2012
Factor model averaging quantile regression and simulation study
ICICA'12: Proceedings of the Third international conference on Information Computing and Applications, Pages 291–298
https://doi.org/10.1007/978-3-642-34062-8_38
In this paper, a model averaging approach is developed for the linear factor regression model in light of smoothed focused information criterion. With respect to factors, a frequentist model averaging estimation of the regression parameter is proposed ...
- Article, November 2009
Averaged Naive Bayes Trees: A New Extension of AODE
ACML '09: Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning, Pages 191–205
https://doi.org/10.1007/978-3-642-05224-8_16
Naive Bayes (NB) is a simple Bayesian classifier that assumes the conditional independence and augmented NB (ANB) models are extensions of NB by relaxing the independence assumption. The averaged one-dependence estimators (AODE) is a classifier that ...
- research-article, November 2008
Classifying Data Streams with Skewed Class Distributions and Concept Drifts
IEEE Internet Computing (IEEECS_INTERNET), Volume 12, Issue 6, Pages 37–49
https://doi.org/10.1109/MIC.2008.119
Classification is an important data analysis tool that uses a model built from historical data to predict class labels for new observations. More and more applications are featuring data streams, rather than finite stored data sets, which are a ...
- Article, July 2007
Model-averaged latent semantic indexing
SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, Pages 755–756
https://doi.org/10.1145/1277741.1277893
This poster introduces a novel approach to information retrieval that uses statistical model averaging to improve latent semantic indexing (LSI). Instead of choosing a single dimensionality $k$ for LSI, we propose using several models of differing ...
- research-article, February 2007
Robust Speaker Adaptation by Weighted Model Averaging Based on the Minimum Description Length Criterion
IEEE Transactions on Audio, Speech, and Language Processing (TASLP-II), Volume 15, Issue 2, Pages 652–660
https://doi.org/10.1109/TASL.2006.876773
The maximum likelihood linear regression (MLLR) technique is widely used in speaker adaptation due to its effectiveness and computational advantages. When the adaptation data are sparse, MLLR performance degrades because of unreliable parameter ...
- research-article, August 2006
Minimax Adaptive Spectral Estimation From an Ensemble of Signals
IEEE Transactions on Signal Processing (TSP), Volume 54, Issue 8, Pages 2865–2873
https://doi.org/10.1109/TSP.2006.877639
We develop a statistical method for estimating the spectrum from a data set that consists of several signals, all of which are realizations of a common random process. We first find estimates of the common spectrum using each signal; then we construct M ...
- article, April 2003
PAC-Bayesian Stochastic Model Selection
Machine Learning (MALE), Volume 51, Issue 1, Pages 5–21
https://doi.org/10.1023/A:1021840411064
PAC-Bayesian learning methods combine the informative priors of Bayesian methods with distribution-free PAC guarantees. Stochastic model selection predicts a class label by stochastically sampling a classifier according to a “posterior distribution” on ...
- article, January 2000
A comparison of scientific and engineering criteria for Bayesian model selection
Statistics and Computing (KLU-STCO), Volume 10, Issue 1, Pages 55–62
https://doi.org/10.1023/A:1008936501289
Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian “estimate” of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An ...
- Article, August 1997
Models and selection criteria for regression and classification
UAI'97: Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence, Pages 223–228
When performing regression or classification, we are interested in the conditional probability distribution for an outcome or class variable Y given a set of explanatory or input variables X. We consider Bayesian models for this task. In particular, we ...