SUBWORD FOR VIETNAMESE-ENGLISH STATISTICAL MACHINE TRANSLATION
Keywords: Subword; Word alignment; Statistical machine translation.
In this paper, we propose an approach that applies subword methods to improve word alignment in Vietnamese-English statistical machine translation (SMT) systems. In addition to applying subword segmentation as a preprocessing step, we propose a new algorithm for decoding the alignment table of the translation model. The proposed method has been implemented and evaluated with four subword methods: BPE, WordPiece, unigram, and Morfessor. Experimental results show that the proposed method yields better results with every subword method; the largest improvement, 0.81 BLEU, is obtained by the model using the BPE subword method.
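As a rough illustration of what a subword preprocessing step can look like, the following minimal Python sketch learns byte-pair-encoding (BPE) merges from a toy word-frequency table and uses them to segment words. It is a generic textbook-style BPE illustration, not the alignment-table decoding algorithm proposed in this paper. The function names (`learn_bpe`, `merge_pair`, `apply_bpe`), the `</w>` end-of-word marker, the tie-breaking rule, and the toy English corpus are all assumptions made for the example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` BPE merges from a {word: count} table (toy illustration)."""
    # Start from characters, with an end-of-word marker appended to each word.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken by the pair itself
        # (an arbitrary but deterministic choice made for this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def apply_bpe(word, merges):
    """Segment a single word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical toy corpus, used only to exercise the functions above.
toy_freqs = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
toy_merges = learn_bpe(toy_freqs, num_merges=10)
toy_segments = apply_bpe("lowest", toy_merges)
```

In an SMT pipeline, a segmentation of this kind would be applied to both sides of the parallel corpus before word alignment, so that the aligner operates on subword units rather than full words.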