Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation



Code-switching point (CSP) prediction


SEAME

(not shown in the article)
Ground truth Huh 你們 要 去 哦 你們 要 去 gym 啊
Input 哼 你們 要 去 哦 你們 要 去 體育館 啊
(Huh? Will you go? Will you go to the gym?)
Random 哼 you guys 要 go 哦 你們 要 去 體育館
Noun 哼 你們 要 去 哦 你們 要 去 gym 啊
Proposed 哼 you guys 要 去 哦 你們 要 去 體育館 啊
Proposed (+pos) Huh 你們 要 去 哦 你們 要 去 gym 啊
Description:

In this example, the code-switching points (CSPs) fall on a noun (gym) and an interjection (huh). The rule-based "noun" approach is accurate, since the one switch it makes (gym) matches the ground truth, but it misses the interjection and so does not find every CSP. The proposed method with POS tagging finds both CSPs.
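The comparison described above can be checked mechanically. The following is a minimal sketch, not the evaluation used by the authors: it assumes a CSP is simply a token position where an output differs from the monolingual input, and it assumes "you guys" is one token aligned with 你們. The function names `switched_positions` and `precision_recall` are hypothetical. The "Random" row is left out because its line in the example has one token fewer than the input.

```python
# Token lists copied from the example above.
# Assumption: "you guys" is treated as a single token aligned with 你們.
source = ["哼", "你們", "要", "去", "哦", "你們", "要", "去", "體育館", "啊"]
ground_truth = ["Huh", "你們", "要", "去", "哦", "你們", "要", "去", "gym", "啊"]
outputs = {
    "Noun": ["哼", "你們", "要", "去", "哦", "你們", "要", "去", "gym", "啊"],
    "Proposed": ["哼", "you guys", "要", "去", "哦", "你們", "要", "去", "體育館", "啊"],
    "Proposed (+pos)": ["Huh", "你們", "要", "去", "哦", "你們", "要", "去", "gym", "啊"],
}


def switched_positions(src, out):
    """Return the indices where the output token differs from the source token."""
    return {i for i, (s, o) in enumerate(zip(src, out)) if s != o}


def precision_recall(predicted, reference):
    """Precision and recall of predicted CSP positions against reference positions."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


reference_csps = switched_positions(source, ground_truth)

for name, tokens in outputs.items():
    predicted_csps = switched_positions(source, tokens)
    precision, recall = precision_recall(predicted_csps, reference_csps)
    print(f"{name}: CSPs={sorted(predicted_csps)} "
          f"precision={precision:.2f} recall={recall:.2f}")
```

Under these assumptions, the "noun" rule scores precision 1.0 but recall 0.5 (it finds gym and misses huh), while the proposed method with POS tagging scores 1.0 on both, which matches the description.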



