Tag Archives: Corsican language

Interesting case of first name disambiguation

Here is an interesting case of first name disambiguation for machine translation. Consider the first name ‘Camille’, which can apply to both genders. In Corsican (taravese or sartinese variants) it translates into either Cameddu (masculine) or Camedda (feminine). In …
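To make the ambiguity concrete, here is a minimal sketch of a gender-aware lookup. The two Corsican forms come from the post; the table `CORSICAN_FIRST_NAMES` and the function `translate_first_name` are hypothetical names used only for illustration, not the translator's actual code.

```python
# Forms taken from the post; the table layout and names are illustrative only.
CORSICAN_FIRST_NAMES = {
    "Camille": {"masculine": "Cameddu", "feminine": "Camedda"},
}


def translate_first_name(name, gender=None):
    """Return the candidate Corsican forms for a French first name."""
    forms = CORSICAN_FIRST_NAMES.get(name)
    if forms is None:
        # Unknown name: leave it untranslated.
        return [name]
    if gender in forms:
        # Gender is known from context: a single form remains.
        return [forms[gender]]
    # Gender unknown: the name stays ambiguous, so return every candidate.
    return sorted(forms.values())
```

When the gender cannot be determined, the sketch returns both candidates instead of guessing, so a later stage (or a human) can decide.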


Writing differences between Corsican and Gallurese

Here are some writing differences between Corsican and Sardinian Gallurese that result from historical writing habits. These differences prevail even when the words are the same: ghj is replaced by gghj, as in acciaghju (corsu), acciagghju (gallurese), ‘steel’; chj is …
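The ghj/gghj correspondence can be sketched as a single spelling substitution. This is only an illustration: the function name `corsican_to_gallurese_spelling` is hypothetical, it covers only the one rule quoted in full above, and it assumes the rule applies wherever ghj occurs, which the excerpt does not state.

```python
import re

# Match "ghj" only when it is not already preceded by "g",
# so that words already spelled with "gghj" are left unchanged.
GHJ = re.compile(r"(?<!g)ghj")


def corsican_to_gallurese_spelling(word):
    """Apply the ghj -> gghj writing convention to a single word."""
    return GHJ.sub("gghj", word)
```

The lookbehind keeps the substitution idempotent: applying it twice gives the same result as applying it once.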


How rule-based and statistical machine translation can help each other

Here are a few suggestions on how rule-based and statistical machine translation can help each other (this is a follow-up to the previous post). To begin with, rule-based and statistical machine translation are often contrasted and compared: it would be …


A first 100%!

Now scoring 1 – 0/124 = 100%. Translated into Corsican ‘sartinesu’. Another Feigenbaum hit.


French ‘fin’ followed by a year number: fixed

Tagger improvement: this issue is now fixed. French ‘l’Empire allemand’ now translates properly into ‘l’Imperu alimanu’ (the German Empire). The French word ‘fin’ is now identified as a preposition when followed by a year number. The above excerpt is translated into the ‘sartinesu’ …
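A contextual rule of this kind might look like the sketch below. It is not the tagger's actual code: the function name `tag_fin`, the tag label `"PREP"`, and the assumption that a year number is a four-digit token are all hypothetical choices made for illustration.

```python
import re

# Assumption: a "year number" is a token made of exactly four digits.
YEAR = re.compile(r"\d{4}")


def tag_fin(tokens):
    """Tag 'fin' as a preposition when the next token is a year number.

    Returns one entry per token: "PREP" for a matching 'fin', otherwise
    None, meaning the decision is left to the rest of the tagger.
    """
    tags = []
    for i, token in enumerate(tokens):
        next_token = tokens[i + 1] if i + 1 < len(tokens) else ""
        if token.lower() == "fin" and YEAR.fullmatch(next_token):
            tags.append("PREP")
        else:
            tags.append(None)
    return tags
```

Returning None for every other case keeps the rule narrow: it only overrides the default analysis of ‘fin’ in the one context described above.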
