SUBWORD FOR VIETNAMESE-ENGLISH STATISTICAL MACHINE TRANSLATION

Authors

  • Dang Thanh Quyen (Corresponding Author) Military Information Technology Institute, Academy of Military Science and Technology

Keywords:

Subword; Word alignment; Statistical machine translation.

Abstract

In this paper, we propose an approach that applies subword methods to improve word alignment in Vietnamese-English statistical machine translation (SMT) systems. In addition to applying subword segmentation as a preprocessing step, we propose a new algorithm for decoding the alignment table of the translation model. The proposed method has been implemented and evaluated with four subword methods: BPE, WordPiece, unigram, and Morfessor. Experimental results show that the proposed method improves results with every subword method; the largest improvement, 0.81 BLEU, is obtained with the model that uses the BPE subword method.
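
As a rough illustration of why subword preprocessing interacts with word alignment, the sketch below projects alignment links computed over subword tokens back onto word positions. It is not the algorithm proposed in the paper, whose details are not given in this abstract. It assumes the `@@` continuation marker used by the subword-nmt BPE tool, and the function names `subword_to_word_index` and `project_alignment` are hypothetical.

```python
from typing import List, Set, Tuple


def subword_to_word_index(subwords: List[str], cont: str = "@@") -> List[int]:
    """Map each subword position to the index of the word it belongs to.

    Assumption: a token ending in `cont` continues into the next token;
    any other token closes the current word.
    """
    mapping = []
    word_idx = 0
    for tok in subwords:
        mapping.append(word_idx)
        if not tok.endswith(cont):
            word_idx += 1
    return mapping


def project_alignment(
    src_sub: List[str],
    tgt_sub: List[str],
    links: Set[Tuple[int, int]],
) -> Set[Tuple[int, int]]:
    """Collapse subword-level links (i, j) into word-level links.

    Two words are linked if any of their subwords are linked; duplicate
    links produced by multi-piece words merge automatically in the set.
    """
    src_map = subword_to_word_index(src_sub)
    tgt_map = subword_to_word_index(tgt_sub)
    return {(src_map[i], tgt_map[j]) for i, j in links}


# Hypothetical example: "dịch máy" aligned to "machine translation",
# where "translation" was split into two pieces.
src = ["dịch", "máy"]
tgt = ["machine", "trans@@", "lation"]
subword_links = {(0, 1), (0, 2), (1, 0)}
word_links = project_alignment(src, tgt, subword_links)
```

Here both pieces of "translation" link to "dịch", so the two subword links collapse into a single word-level link.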

Published

26-08-2021

How to Cite

Dang Thanh, Q. “SUBWORD FOR VIETNAMESE-ENGLISH STATISTICAL MACHINE TRANSLATION”. Journal of Military Science and Technology, no. 74, Aug. 2021, pp. 121-8, https://en.jmst.info/index.php/jmst/article/view/19.

Section

Research Articles