Towards Machine Translation-Friendly Sites

Written by Yousef Elbes   
Tuesday, 22 July 2008 19:25

Read this first: Who is reading your text?

Unless we spread awareness that Machine Translation (MT) exists, no progress can be made in this domain. Collaboration between writers and MT systems is vital for MT to succeed.

From here, let's encourage web writers and authors to tailor their texts to the basic needs of MT. Once a text is polished and prepared for MT, its author can tell readers that the page is "Machine Translation Friendly" or "Machine Translation Ready". For this purpose, a small label will distinguish your friendly site from unfriendly sites, and your prepared content from the rest.

It is useful to place the label on your front page if all your content is "MT-Friendly", or otherwise on the individual pages that you think meet the following minimum requirements:

1. Short paragraphs
If the Machine Translation system uses the Google AJAX Language API, each paragraph should be shorter than 500 characters (including spaces); the rest will be dropped!

2. Short phrases
Avoid long phrases as much as you can.

3. Clear phrases
Avoid ambiguous use of linguistic components (subject, verb, object, etc.). Unclear phrases will produce erroneous translations. Use your knowledge of a second language to imagine translation scenarios: how would this phrase translate into that language?

4. Correct spelling, no typos
Wrongly spelt words will not be recognized by MT systems. These are the "white socks" of your elegant site! A little extra effort will give your site suitable socks.

5. No slang or invented words
Words that do not exist in known dictionaries will be ignored by MT systems. No MT database will contain all your vocabulary.

6. No highly elaborated terminology
Keep specialized terminology as simple as you can; use common synonyms when possible.

7. No acronyms or abbreviations at all
Acronyms and abbreviations can be translated into anything; they should be avoided completely.

8. Don't mix languages
Mixing languages in your text is confusing for humans; for the machine it can be a nightmare. If you have to include text from other languages, declare the language of that text in your HTML code.

9. Image position
If your images have "titles", try to place the images at the beginning of your text or at the end; otherwise the MT system will think that an image title is the title of your article.

10. Clean pages
Clean HTML code makes life much easier for MT systems; messy code can cause your translated text to break.
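
Two of the requirements above lend themselves to a quick automated check before you publish. The sketch below is only an illustration, not part of any MT system: it flags paragraphs that reach the 500-character limit from requirement 1, and it wraps a foreign-language fragment in an HTML element that declares its language, as requirement 8 asks. The function names find_long_paragraphs and mark_language are hypothetical, and the sketch assumes that paragraphs in your source text are separated by blank lines.

```python
import html
import re

# Limit quoted in requirement 1 for the Google AJAX Language API
# (characters, spaces included).
MAX_PARAGRAPH_CHARS = 500


def find_long_paragraphs(text, limit=MAX_PARAGRAPH_CHARS):
    """Return (index, length) for each paragraph that is not shorter than limit.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [(i, len(p)) for i, p in enumerate(paragraphs) if len(p) >= limit]


def mark_language(fragment, lang_code):
    """Wrap a fragment in a span whose lang attribute declares its language."""
    return '<span lang="{}">{}</span>'.format(
        html.escape(lang_code, quote=True), html.escape(fragment)
    )


if __name__ == "__main__":
    sample = "A short paragraph.\n\n" + "x" * 600
    print(find_long_paragraphs(sample))    # [(1, 600)]
    print(mark_language("bonjour", "fr"))  # <span lang="fr">bonjour</span>
```

The lang attribute is standard HTML; the language code ("fr" here) should be the code of the language you are actually quoting.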
