id: work_z4osyr4lbfg65mntn73iqciy6a
author: Xiao Pu
title: Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation
date: 2018
pages: 15
extension: .pdf
mime: application/pdf
words: 9848
sentences: 1002
flesch: 71
summary: Therefore, the NMT decoders cannot clearly identify the contexts in which one word sense should modeling of word senses can be helpful to NMT • Weakly supervised word sense disambiguation (WSD) approaches integrated into NMT, • Three sense selection mechanisms for integrating WSD into NMT, respectively based occurrence of such nouns or verbs in the training data, we use word2vec to build word vectors all ambiguous words before clustering their occurrences, and do not adapt to what is actually observed in the data; as a result, the senses are often To model word senses for NMT, we concatenate the embedding of each token with a vector our sense-aware NMT models on large data sets Table 5: BLEU scores of our sense-aware NMT systems over five language pairs: ATTini is the best one among Word sense disambiguation improves statistical machine translation. Improving word sense disambiguation in neural machine translation
cache: ./cache/work_z4osyr4lbfg65mntn73iqciy6a.pdf
txt: ./txt/work_z4osyr4lbfg65mntn73iqciy6a.txt