The Yahoo! Music Dataset and KDD-Cup’11

Gideon Dror, Noam Koenigstein, Yehuda Koren, Markus Weimer
Proceedings of KDD Cup 2011, PMLR 18:3-18, 2012.

Abstract

KDD-Cup 2011 challenged the community to identify user tastes in music by leveraging Yahoo! Music user ratings. The competition hosted two tracks, which were based on two datasets sampled from the raw data, including hundreds of millions of ratings. The underlying ratings were given to four types of musical items: tracks, albums, artists, and genres, forming a four-level hierarchical taxonomy. The challenge started on March 15, 2011 and ended on June 30, 2011, attracting 2389 participants, 2100 of whom were active by the end of the competition. The popularity of the challenge is related to the fact that learning large-scale recommender systems is a generic problem, highly relevant to the industry. In addition, the contest drew interest by introducing a number of scientific and technical challenges, including dataset size, hierarchical structure of items, high-resolution timestamps of ratings, and a non-conventional ranking-based task. This paper provides the organizers’ account of the contest, including a detailed analysis of the datasets, a discussion of the contest goals and actual conduct, and lessons learned throughout the contest.

Cite this Paper


BibTeX
@InProceedings{pmlr-v18-dror12a,
  title = {The Yahoo! Music Dataset and KDD-Cup’11},
  author = {Dror, Gideon and Koenigstein, Noam and Koren, Yehuda and Weimer, Markus},
  booktitle = {Proceedings of KDD Cup 2011},
  pages = {3--18},
  year = {2012},
  editor = {Dror, Gideon and Koren, Yehuda and Weimer, Markus},
  volume = {18},
  series = {Proceedings of Machine Learning Research},
  month = {21 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v18/dror12a/dror12a.pdf},
  url = {https://proceedings.mlr.press/v18/dror12a.html},
  abstract = {KDD-Cup 2011 challenged the community to identify user tastes in music by leveraging Yahoo! Music user ratings. The competition hosted two tracks, which were based on two datasets sampled from the raw data, including hundreds of millions of ratings. The underlying ratings were given to four types of musical items: tracks, albums, artists, and genres, forming a four level hierarchical taxonomy. The challenge started on March 15, 2011 and ended on June 30, 2011 attracting 2389 participants, 2100 of which were active by the end of the competition. The popularity of the challenge is related to the fact that learning a large scale recommender systems is a generic problem, highly relevant to the industry. In addition, the contest drew interest by introducing a number of scientific and technical challenges including dataset size, hierarchical structure of items, high resolution timestamps of ratings, and a non-conventional ranking-based task. This paper provides the organizers’ account of the contest, including: a detailed analysis of the datasets, discussion of the contest goals and actual conduct, and lessons learned throughout the contest.}
}
Endnote
%0 Conference Paper
%T The Yahoo! Music Dataset and KDD-Cup’11
%A Gideon Dror
%A Noam Koenigstein
%A Yehuda Koren
%A Markus Weimer
%B Proceedings of KDD Cup 2011
%C Proceedings of Machine Learning Research
%D 2012
%E Gideon Dror
%E Yehuda Koren
%E Markus Weimer
%F pmlr-v18-dror12a
%I PMLR
%P 3--18
%U https://proceedings.mlr.press/v18/dror12a.html
%V 18
%X KDD-Cup 2011 challenged the community to identify user tastes in music by leveraging Yahoo! Music user ratings. The competition hosted two tracks, which were based on two datasets sampled from the raw data, including hundreds of millions of ratings. The underlying ratings were given to four types of musical items: tracks, albums, artists, and genres, forming a four level hierarchical taxonomy. The challenge started on March 15, 2011 and ended on June 30, 2011 attracting 2389 participants, 2100 of which were active by the end of the competition. The popularity of the challenge is related to the fact that learning a large scale recommender systems is a generic problem, highly relevant to the industry. In addition, the contest drew interest by introducing a number of scientific and technical challenges including dataset size, hierarchical structure of items, high resolution timestamps of ratings, and a non-conventional ranking-based task. This paper provides the organizers’ account of the contest, including: a detailed analysis of the datasets, discussion of the contest goals and actual conduct, and lessons learned throughout the contest.
RIS
TY - CPAPER
TI - The Yahoo! Music Dataset and KDD-Cup’11
AU - Gideon Dror
AU - Noam Koenigstein
AU - Yehuda Koren
AU - Markus Weimer
BT - Proceedings of KDD Cup 2011
DA - 2012/06/01
ED - Gideon Dror
ED - Yehuda Koren
ED - Markus Weimer
ID - pmlr-v18-dror12a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 18
SP - 3
EP - 18
L1 - http://proceedings.mlr.press/v18/dror12a/dror12a.pdf
UR - https://proceedings.mlr.press/v18/dror12a.html
AB - KDD-Cup 2011 challenged the community to identify user tastes in music by leveraging Yahoo! Music user ratings. The competition hosted two tracks, which were based on two datasets sampled from the raw data, including hundreds of millions of ratings. The underlying ratings were given to four types of musical items: tracks, albums, artists, and genres, forming a four level hierarchical taxonomy. The challenge started on March 15, 2011 and ended on June 30, 2011 attracting 2389 participants, 2100 of which were active by the end of the competition. The popularity of the challenge is related to the fact that learning a large scale recommender systems is a generic problem, highly relevant to the industry. In addition, the contest drew interest by introducing a number of scientific and technical challenges including dataset size, hierarchical structure of items, high resolution timestamps of ratings, and a non-conventional ranking-based task. This paper provides the organizers’ account of the contest, including: a detailed analysis of the datasets, discussion of the contest goals and actual conduct, and lessons learned throughout the contest.
ER -
APA
Dror, G., Koenigstein, N., Koren, Y. & Weimer, M. (2012). The Yahoo! Music Dataset and KDD-Cup’11. Proceedings of KDD Cup 2011, in Proceedings of Machine Learning Research 18:3-18. Available from https://proceedings.mlr.press/v18/dror12a.html.
