Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation



Code-switching (CS) generation


Bad SEAME examples generated by our proposed model

Generation: 我 拿 了 一 THREE NINE 啊
Origin: 我 拿 了 一 三 九 啊
    (I took one three nine)
Description: All three numbers, "一 (one)," "三 (three)," and "九 (nine)," are better code-switched together.

Generation: 他 又 FIND 你 吵架 了
Origin: 他 又 找 你 吵架 了
    (He argued with you again)
Description: "找" does not mean "find" in this sentence; here it carries no essential meaning of its own.

Generation: CALL 我 SHOOT 一下 他 就 SAUCE 哦
Origin: 叫 我 拍 一下 他 就 醬 哦
    (Call me to take a picture of him)
Description: "拍" in this sentence is short for "拍照 (take a photo)," and "就 醬" is a contraction of "就 這樣 (that's it)," so the word-level translations "SHOOT" and "SAUCE" lose the intended meaning.

Generation: 你 NOON 去 看戲 啊
Origin: 你 中午 去 看戲 啊
    (You go to the theater at noon)
Description: "中午" in this sentence is short for "在 中午 (at noon)," so switching it to "NOON" alone drops the preposition.

Generation: 因為 她 真的 是 開不到 了 BE 換掉 了
Origin: 因為 她 真的 是 開不到 了 被 換掉 了
    (Because she really couldn't get it, so she was replaced)
Description: We usually code-switch the auxiliary verb "被 (be)" together with the verb "換掉 (replaced)."

Generation: 我 也 會 怕 不要 NEAR JUST FINE 了 咯
Origin: 我 也 會 怕 不要 靠近 就好 了 咯
    (I am also afraid. It's fine not to be near it)
Description: Code-switching "就好" to "just fine" is unnatural; in English we would say "It's fine ..." instead.

Generation: 對 啊 就是 不要 FIRST 進 那麼 多
Origin: 對 啊 就是 不要 先 進 那麼 多
    (Yes, don't input so much in advance)
Description: "先" in this sentence means "事先 (in advance)," not "first."

Generation: 我 有 想過 要 DO 這個 因為 你 看 他們 網拍 的 價錢 是 有 夠 便宜的 咯
Origin: 我 有 想過 要 弄 這個 因為 你 看 他們 網拍 的 價錢 是 有 夠 便宜的 咯
    (I have thought about doing this because you can see that the prices in their online auctions are so cheap)
Description: Code-switching only "do" is unnatural; we may code-switch the whole phrase "弄 這個 (do this)" instead.
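
The first example in this list points out that consecutive numbers are better code-switched together. The following is a minimal sketch of how such partially switched number runs could be flagged automatically; it is not part of the proposed model. The function name, the numeral set, and the requirement that the two sentences align token-for-token are all assumptions made for illustration. The alignment assumption holds for that example but not for every pair on this page (for instance, "就好" becomes the two tokens "JUST FINE").

```python
# Hypothetical helper: flags runs of Chinese numerals that were only partly switched.
# Assumption: a token is a numeral if it is exactly one of these characters.
CHINESE_NUMERALS = set("零一二三四五六七八九十百千萬")


def partially_switched_number_runs(origin: str, generation: str):
    """Return (start, end) token index pairs of numeral runs switched only in part.

    Assumes both sentences are space-tokenized and align token-for-token.
    """
    src, gen = origin.split(), generation.split()
    if len(src) != len(gen):
        raise ValueError("sentences must align token-for-token")

    flagged = []
    i = 0
    while i < len(src):
        if src[i] in CHINESE_NUMERALS:
            # Extend the run over all consecutive numeral tokens.
            j = i
            while j < len(src) and src[j] in CHINESE_NUMERALS:
                j += 1
            # A token counts as switched if it differs from the origin token.
            switched = [src[k] != gen[k] for k in range(i, j)]
            if any(switched) and not all(switched):
                flagged.append((i, j))
            i = j
        else:
            i += 1
    return flagged


print(partially_switched_number_runs("我 拿 了 一 三 九 啊", "我 拿 了 一 THREE NINE 啊"))
# [(3, 6)]
```

The flagged span covers "一 三 九", of which only "三" and "九" were switched. A single-character match is a rough heuristic: "一" is not always a number in context, so any flagged run would still need manual review.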




Code-switching (CS) generation

Bad SEAME examples generated by our proposed (+pos) model

Generation: 沒有 啦 打 ONE 比如 啦
Origin: 沒有 啦 打 個 比如 啦
    (No, take an example...)
Description: Code-switching "個 (one)" is unnatural because it is not an important word in this sentence.

Generation: COMPARISON 常 唱 男孩子 的 歌手
Origin: 比較 常 唱 男孩子 的 歌手
    (I more often sing songs that are sung by male singers)
Description: Here "比較" means "more" and modifies "常 (often)"; it does not mean "comparison."

Generation: MEETING SOMEONE 衝上去 THEN FROM WINDOW 進去 的
Origin: 會 有人 衝上去 然後 從 窗口 進去 的
    (Someone will rush and go in from the window)
Description: In Chinese, we may say "會 有人... (there will be someone...)," whereas in English we say "Someone will ..." Besides, "會" here means "將會 (will)," not "會議 (meeting)."

Generation: 沒有 啦 我們 大概 是 四分 之 ONE 這樣
Origin: 沒有 啦 我們 大概 是 四分 之 一 這樣
    (No, we are probably a quarter)
Description: "四分 之 一 (a quarter)" should be code-switched as a whole.

Generation: VERY 很多 的 東西 啊 的 男 跟 女 都是 VERY INDEPENDENT 的
Origin: 很 很多 的 東西 啊 的 男 跟 女 都是 很 獨立 的
    (Many things. Men and women are very independent)
Description: "很 很多" is better code-switched to "very many."

Generation: IS NOT IT FEEL I FEEL VERY STRANGE
Origin: 不是 覺得 我 覺得 很奇怪
    (I don't think that. I feel very weird)
Description: A sentence is not code-switched if all of its words are English. Besides, the word-level translation yields a poor English sentence.
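
The last example fails because the output contains no Chinese at all. A check of this kind could be run over generated sentences as a simple filter. The sketch below is an illustration under stated assumptions, not the authors' procedure: the function name is hypothetical, a token is treated as Chinese if it contains a character from the CJK Unified Ideographs block, and as English if it consists only of ASCII letters and apostrophes.

```python
import re

# Assumption: Chinese tokens contain at least one CJK Unified Ideograph.
CJK = re.compile(r"[\u4e00-\u9fff]")
# Assumption: English tokens are made of ASCII letters (apostrophes allowed).
LATIN = re.compile(r"^[A-Za-z']+$")


def is_code_switched(sentence: str) -> bool:
    """Return True only if the space-tokenized sentence mixes both languages."""
    tokens = sentence.split()
    has_chinese = any(CJK.search(t) for t in tokens)
    has_english = any(LATIN.match(t) for t in tokens)
    return has_chinese and has_english


print(is_code_switched("IS NOT IT FEEL I FEEL VERY STRANGE"))  # False
print(is_code_switched("沒有 啦 打 ONE 比如 啦"))  # True
```

This filter only detects whether both languages are present. It says nothing about whether the switch points are natural, which is what the other examples in this list illustrate.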



