research-article

Personalized reading support for second-language web documents

Published: 03 April 2013

Abstract

A novel intelligent interface eases the browsing of Web documents written in a user's second language. It automatically predicts which words are unfamiliar to the user by a collective intelligence method and glosses them with their meanings in advance. If the prediction succeeds, the user does not need to consult a dictionary; if it fails, the user can correct the prediction. The correction data are collected and used to improve the accuracy of further predictions. The prediction is personalized in that every user's language ability is estimated by a state-of-the-art language testing model, which is trained in a practical response time with only a small sacrifice of prediction accuracy. The system was evaluated in terms of prediction accuracy and reading simulation. The reading simulation results show that the system can reduce the number of clicks for most readers whose vocabulary is insufficient to read the documents, and that for all users it can significantly reduce the number of unfamiliar words that remain after prediction and glossing.
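
The personalized prediction step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model beyond "a state-of-the-art language testing model", so this sketch assumes the simplest item response theory model, the one-parameter (Rasch) model, in which the probability that a reader knows a word is a logistic function of the reader's ability minus the word's difficulty. All function names, the 0.5 glossing threshold, and the difficulty values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def p_known(ability, difficulty):
    """Rasch (1PL) probability that a reader knows a word (assumed model)."""
    return sigmoid(ability - difficulty)


def words_to_gloss(ability, difficulties, threshold=0.5):
    """Return, in sorted order, the words predicted unfamiliar to the reader."""
    return sorted(w for w, b in difficulties.items()
                  if p_known(ability, b) < threshold)


def update_ability(ability, responses, difficulties, lr=0.5, steps=200):
    """Re-estimate one reader's ability from corrections.

    responses maps word -> True (reader knows it) / False (reader does not).
    Uses plain gradient ascent on the Rasch log-likelihood, with word
    difficulties held fixed.
    """
    for _ in range(steps):
        grad = sum((1.0 if known else 0.0) - p_known(ability, difficulties[w])
                   for w, known in responses.items())
        ability += lr * grad / len(responses)
    return ability


# Hypothetical word difficulties on the same scale as reader ability.
difficulties = {"the": -3.0, "browse": 0.0, "gloss": 1.5, "ubiquitous": 2.5}

glossed = words_to_gloss(1.0, difficulties)

# The reader corrects the prediction; the corrections refine the ability estimate.
corrections = {"the": True, "browse": True, "gloss": False, "ubiquitous": False}
new_ability = update_ability(0.0, corrections, difficulties)
```

In this sketch a reader with ability 1.0 would see glosses for "gloss" and "ubiquitous" only. A full system would also re-estimate word difficulties from the responses collected across all users, which is omitted here.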


Published In

ACM Transactions on Intelligent Systems and Technology, Volume 4, Issue 2
Special section on agent communication, trust in multiagent systems, intelligent tutoring and coaching systems
March 2013
339 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/2438653
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 April 2013
Accepted: 01 April 2011
Revised: 01 January 2011
Received: 01 October 2010
Published in TIST Volume 4, Issue 2


Author Tags

  1. Reading support
  2. Web pages
  3. glossing systems
  4. item response theory
  5. logistic regression

Qualifiers

  • Research-article
  • Research
  • Refereed
