16 August 2017

Word length in Malay and English

Words tend to be longer in Malay than in English. This is partly because English has lots of short words, like 'a', 'of', 'to', and 'by', while Malay has fewer. The only common two-letter words in Malay are ke ('to') and di ('at'). In addition, Malay uses more prefixes and suffixes than English, so its words can become quite long.

The claim that Malay has longer words than English can easily be checked by comparing two similar texts. Here, I compare the words in the North Wind and the Sun text with its Malay equivalent, Angin Utara dan Matahari (see here). The average word length in the Malay text is 6.64 letters, while the average word length in the English text is 4.22 letters, and this difference is highly significant (t=7.48, df=189, p<0.0001). The distribution of the word lengths is shown in this chart:

From this, we can see that while two- and three-letter words are common in English, they are rare in Malay. In contrast, there are lots of Malay words with 6, 7 and 8 letters, but few in English.
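
For anyone who wants to try this kind of comparison on their own texts, here is a minimal sketch in Python. It is not the script used for the figures above. It assumes that a word is any unbroken run of letters (so a hyphenated word counts as two), and that the test is a standard two-sample t-test with pooled variance, which is what df=189 suggests. The function names word_lengths and pooled_t are made up for this illustration, and the two short sentences are placeholders, not the actual passages.

```python
import re
from math import sqrt
from statistics import mean, variance


def word_lengths(text):
    """Return the number of letters in each word of text."""
    # Assumption: a word is any run of letters; digits, hyphens and
    # punctuation act as separators.
    return [len(w) for w in re.findall(r"[^\W\d_]+", text)]


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance of the two samples.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Placeholder sentences only; substitute the full English and Malay texts.
english_text = "I went to the market to buy some vegetables"
malay_text = "Saya pergi ke pasar untuk membeli sayur"

english = word_lengths(english_text)
malay = word_lengths(malay_text)
t, df = pooled_t(malay, english)

print(f"English mean: {mean(english):.2f} letters")
print(f"Malay mean:   {mean(malay):.2f} letters")
print(f"t = {t:.2f}, df = {df}")
```

The p-value is left out because the Python standard library has no t distribution. If SciPy is available, scipy.stats.ttest_ind(malay, english) performs the same pooled-variance test by default and returns the p-value as well.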