When I was a linguistics fresher back in 1990, we were told a well-known anecdote about the early days of machine translation: when the sentence “The spirit is willing, but the flesh is weak” (an allusion to Mark 14:38) was translated into Russian and then back into English, the result was “The vodka is good, but the meat is rotten”.
I vaguely remember trying out this sentence in the early days of Google Translate, with amusing results.
However, I recently decided to try it again, and imagine my surprise when I realised that Google Translate can translate this exact phrase into any of the available languages and back into English without making a single error.
The obvious explanation is that Google must have added Mark 14:38 to the training corpus to ensure that nobody mocks them for getting it wrong.
However, it’s only this specific sentence that Google Translate handles so well. As soon as you start moving the words around or adding extra words, the quality of the translation decreases. For instance, “The spirit is willing, but the flesh is weak” becomes “Ånden er rede, men kødet er skrøbeligt” when translated into Danish, but “The spirit in the bottle is willing, but the flesh in the box is weak” becomes “Ånden i flasken er villig, men kødet i boksen er svag”. I’m not saying this translation is bad, but I find it interesting that Google Translate suddenly fails to add the neuter -t to svag (which should agree with the neuter noun kødet), although it managed perfectly well to add it to skrøbelig.
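For anyone who wants to repeat this experiment a bit more systematically, here is a minimal Python sketch of the round-trip check. It deliberately does not call any real translation service: the `translate` argument is a hypothetical function that you would have to supply yourself (taking the text, a source language code and a target language code), and the names `round_trip` and `failing_pivots` are my own inventions for illustration.

```python
from typing import Callable, Iterable, List

# Hypothetical signature, not a real API:
# translate(text, source_lang, target_lang) -> translated text
Translator = Callable[[str, str, str], str]


def round_trip(text: str, pivot: str, translate: Translator,
               source: str = "en") -> str:
    """Translate text into the pivot language and back into the source language."""
    forward = translate(text, source, pivot)
    return translate(forward, pivot, source)


def failing_pivots(text: str, pivots: Iterable[str], translate: Translator,
                   source: str = "en") -> List[str]:
    """Return the pivot languages whose round trip does not reproduce text exactly."""
    return [
        pivot for pivot in pivots
        if round_trip(text, pivot, translate, source) != text
    ]
```

Note that the comparison is an exact string match, so a perfectly acceptable paraphrase would still count as a failure. That is strict, but it is the same standard I applied above when I said the famous sentence came back without a single error.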
It’s quite interesting to investigate how Google Translate handles the individual words in this sentence. For instance, in the case of translating “spirit”, it appears the singular normally triggers the soul sense, whereas the plural triggers the alcohol sense. The result is that “The house of the spirits” gets translated into Danish as “Huset af spiritus” (“The house of alcohol”) rather than the expected “Åndernes hus”.