
Automatically Learning Topics and Difficulty Levels of Problems in Online Judge Systems

Published: 07 March 2018

Abstract

Online Judge (OJ) systems have been widely used in many areas, including programming, mathematical problem solving, and job interviews. Unlike other online learning systems, such as Massive Open Online Courses, most OJ systems are designed for self-directed learning without the intervention of teachers. Moreover, in most OJ systems, problems are simply listed in volumes, with no clear organization by topic or difficulty level. As a result, problems in the same volume are mixed in terms of both topic and difficulty. By analyzing large-scale learning traces of users, we observe two major learning modes (or patterns): users either practice problems sequentially within the same volume, regardless of topic, or attempt problems on the same topic, which may be spread across multiple volumes. This observation is consistent with findings in classic educational psychology. Based on it, we propose a novel two-mode Markov topic model that automatically detects the topics of online problems by jointly characterizing the two learning modes. To further predict the difficulty level of online problems, we propose a competition-based expertise model that uses the learned topic information. Extensive experiments on three large OJ datasets demonstrate the effectiveness of our approach on three tasks: skill topic extraction, expertise competition prediction, and problem recommendation.
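The two learning modes described above can be illustrated with a toy next-problem distribution. This is only a minimal sketch under strong assumptions and is not the two-mode Markov topic model itself: the abstract treats topics as something to be learned from traces, whereas the sketch assumes topic labels are already known. The names `Problem`, `next_problem_probs`, and `p_volume_mode` are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    pid: int     # position in the overall problem listing (assumed ordering)
    volume: int  # volume in which the problem is listed
    topic: str   # hypothetical topic label; latent in the real setting


def next_problem_probs(current, problems, p_volume_mode=0.5):
    """Mix two toy transition distributions over the next problem.

    Volume mode: move to the next problem listed in the same volume.
    Topic mode: pick uniformly among other problems with the same topic,
    possibly in a different volume.
    """
    others = [p for p in problems if p.pid != current.pid]

    # Volume mode: all mass on the immediately following problem, if any.
    later = [p for p in others
             if p.volume == current.volume and p.pid > current.pid]
    seq = {min(later, key=lambda p: p.pid).pid: 1.0} if later else {}

    # Topic mode: uniform over same-topic problems across all volumes.
    same = [p for p in others if p.topic == current.topic]
    top = {p.pid: 1.0 / len(same) for p in same}

    # Weighted mixture of the two modes.
    probs = {}
    for dist, weight in ((seq, p_volume_mode), (top, 1.0 - p_volume_mode)):
        for pid, pr in dist.items():
            probs[pid] = probs.get(pid, 0.0) + weight * pr

    # Renormalize in case one mode had no candidate problems.
    total = sum(probs.values())
    return {pid: pr / total for pid, pr in probs.items()} if total else {}


problems = [
    Problem(1, volume=1, topic="dp"),
    Problem(2, volume=1, topic="graph"),
    Problem(3, volume=2, topic="dp"),
    Problem(4, volume=2, topic="dp"),
]
probs = next_problem_probs(problems[0], problems, p_volume_mode=0.6)
```

In this toy setting, a user who has just attempted problem 1 either continues to problem 2 in the same volume or jumps to one of the other "dp" problems in volume 2. A model that learns topics from traces would have to infer both the mode and the topic assignments rather than read them from labels.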



Published In

ACM Transactions on Information Systems, Volume 36, Issue 3
July 2018
402 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/3146384
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 March 2018
Accepted: 01 November 2017
Revised: 01 August 2017
Received: 01 May 2017
Published in TOIS Volume 36, Issue 3

Author Tags

  1. Topic models
  2. expertise learning
  3. online judge systems

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Beijing Natural Science Foundation
  • National Natural Science Foundation of China
  • National Key Basic Research Program (973 Program) of China

