Generating Diverse Translation with Perturbed kNN-MT

Yuto Nishida, Makoto Morishita, Hidetaka Kamigaito, Taro Watanabe


Abstract
Generating multiple translation candidates would enable users to choose the one that satisfies their needs. Although there has been work on diversified generation, there exists room for improving the diversity mainly because the previous methods do not address the overcorrection problem: the model underestimates a prediction that is largely different from the training data, even if that prediction is likely. This paper proposes methods that generate more diverse translations by introducing perturbed k-nearest neighbor machine translation (kNN-MT). Our methods expand the search space of kNN-MT and help incorporate diverse words into candidates by addressing the overcorrection problem. Our experiments show that the proposed methods drastically improve candidate diversity and control the degree of diversity by tuning the perturbation’s magnitude.
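To make the idea in the abstract concrete, here is a minimal sketch of one decoding step of kNN-MT with a perturbed query. It assumes the standard kNN-MT formulation (a softmax over negative distances to retrieved datastore entries, interpolated with the NMT distribution) and assumes the perturbation is zero-mean Gaussian noise added to the query vector. The function names, the toy datastore, and the parameters `noise_std`, `temperature`, and `lam` are hypothetical illustrations, not the authors' implementation, and the paper's actual perturbation schemes may differ.

```python
import math
import random
from collections import defaultdict


def knn_distribution(query, datastore, k, temperature, noise_std=0.0, rng=None):
    """Return a kNN token distribution, optionally perturbing the query first.

    datastore is a list of (key_vector, target_token) pairs (toy stand-in
    for a real kNN-MT datastore).
    """
    rng = rng or random.Random()
    # Assumed perturbation: zero-mean Gaussian noise on each query dimension.
    noisy_query = [q + rng.gauss(0.0, noise_std) for q in query]

    # Squared L2 distance from the (perturbed) query to every key; keep the k nearest.
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(key, noisy_query)), token)
        for key, token in datastore
    )[:k]

    # Softmax over negative distances, aggregated per target token.
    # Subtracting the smallest distance keeps exp() numerically stable.
    weights = defaultdict(float)
    min_dist = scored[0][0]
    for dist, token in scored:
        weights[token] += math.exp(-(dist - min_dist) / temperature)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}


def interpolate(p_knn, p_mt, lam):
    """Mix the two distributions: lam * p_kNN + (1 - lam) * p_MT."""
    vocab = set(p_knn) | set(p_mt)
    return {t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_mt.get(t, 0.0) for t in vocab}
```

In this sketch, a larger `noise_std` moves the query further from its original position, so different neighbors (and therefore different target tokens) can be retrieved on each run; this is one way to picture how the perturbation's magnitude could control candidate diversity.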
Anthology ID:
2024.eacl-srw.2
Volume:
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Neele Falk, Sara Papi, Mike Zhang
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
9–31
URL:
https://aclanthology.org/2024.eacl-srw.2
Cite (ACL):
Yuto Nishida, Makoto Morishita, Hidetaka Kamigaito, and Taro Watanabe. 2024. Generating Diverse Translation with Perturbed kNN-MT. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 9–31, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Generating Diverse Translation with Perturbed kNN-MT (Nishida et al., EACL 2024)
PDF:
https://aclanthology.org/2024.eacl-srw.2.pdf
Video:
https://aclanthology.org/2024.eacl-srw.2.mp4