2632-6779 (Print)
2633-6898 (Online)
Zhenyan Ye
University of Hong Kong, China
Abstract
Machine translation (MT) systems such as Google Translate, Bing and Youdao are increasingly present in everyday life. Anecdotal evidence suggests that language students may use them to produce written work in the target language (TL) and thereby sidestep a potentially difficult writing task. The crucial question is whether MT output can be distinguished from learner language. This paper addresses that question by comparing the lexical features of these two types of discourse in the Chinese context. In particular, it examines the use of English translation equivalents of polysemous Chinese words in two parallel corpora: a Chinese webpage corpus translated into English using Bing and Youdao on the one hand, and a Chinese learner writing corpus on the other. While the comparison yields similar error rates, it also establishes that human learners and translation engines have difficulties with different sets of words. Word frequency also plays a significant role in differentiating between the two sets of output. The paper concludes that, in terms of lexis, MT output is sufficiently different from learner language to be distinguished from it. The findings could be used to create an algorithm for detecting ethics code violations arising from the use of MT engines in written assignments.
Keywords
Lexical transfer, polysemy, machine translation, writing