Learning to Aggregate Ordinal Labels by Maximizing Separating Width

Guangyong Chen, Shengyu Zhang, Di Lin, Hui Huang, Pheng Ann Heng
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:787-796, 2017.

Abstract

While crowdsourcing has been a cost- and time-efficient method to label massive samples, one critical issue is quality control, for which the key challenge is to infer the ground truth from noisy or even adversarial data from various users. A large class of crowdsourcing problems, such as those involving age, grade, level, or stage, have an ordinal structure in their labels. Based on a technique of sampling estimated labels from the posterior distribution, we define a novel separating width among the labeled observations to characterize the quality of sampled labels, and develop an efficient algorithm to optimize it through solving multiple linear decision boundaries and adjusting prior distributions. Our algorithm is empirically evaluated on several real-world datasets, and demonstrates its superiority over state-of-the-art methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-chen17i,
  title     = {Learning to Aggregate Ordinal Labels by Maximizing Separating Width},
  author    = {Guangyong Chen and Shengyu Zhang and Di Lin and Hui Huang and Pheng Ann Heng},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {787--796},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/chen17i/chen17i.pdf},
  url       = {https://proceedings.mlr.press/v70/chen17i.html},
  abstract  = {While crowdsourcing has been a cost and time efficient method to label massive samples, one critical issue is quality control, for which the key challenge is to infer the ground truth from noisy or even adversarial data by various users. A large class of crowdsourcing problems, such as those involving age, grade, level, or stage, have an ordinal structure in their labels. Based on a technique of sampling estimated label from the posterior distribution, we define a novel separating width among the labeled observations to characterize the quality of sampled labels, and develop an efficient algorithm to optimize it through solving multiple linear decision boundaries and adjusting prior distributions. Our algorithm is empirically evaluated on several real world datasets, and demonstrates its supremacy over state-of-the-art methods.}
}
Endnote
%0 Conference Paper
%T Learning to Aggregate Ordinal Labels by Maximizing Separating Width
%A Guangyong Chen
%A Shengyu Zhang
%A Di Lin
%A Hui Huang
%A Pheng Ann Heng
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-chen17i
%I PMLR
%P 787--796
%U https://proceedings.mlr.press/v70/chen17i.html
%V 70
%X While crowdsourcing has been a cost and time efficient method to label massive samples, one critical issue is quality control, for which the key challenge is to infer the ground truth from noisy or even adversarial data by various users. A large class of crowdsourcing problems, such as those involving age, grade, level, or stage, have an ordinal structure in their labels. Based on a technique of sampling estimated label from the posterior distribution, we define a novel separating width among the labeled observations to characterize the quality of sampled labels, and develop an efficient algorithm to optimize it through solving multiple linear decision boundaries and adjusting prior distributions. Our algorithm is empirically evaluated on several real world datasets, and demonstrates its supremacy over state-of-the-art methods.
APA
Chen, G., Zhang, S., Lin, D., Huang, H. & Heng, P.A. (2017). Learning to Aggregate Ordinal Labels by Maximizing Separating Width. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:787-796. Available from https://proceedings.mlr.press/v70/chen17i.html.
