Tag Archives: Corsican language

Interesting case of first name disambiguation

Here is an interesting case of first name disambiguation for machine translation. Consider the first name ‘Camille’, which can be either masculine or feminine. In Corsican (taravese or sartinese variants) it translates as either Cameddu (masculine) or Camedda (feminine). In …
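The choice described above can be sketched as a small lookup keyed on gender. This is only an illustration, not the translator's actual code: the `FIRST_NAMES` table, the function name and the `"m"`/`"f"` gender codes are all assumptions, and only the ‘Camille’ entry comes from the post. How the gender is inferred from context is left open here.

```python
from typing import List, Optional

# Hypothetical lexicon: only the 'Camille' entry is taken from the post.
FIRST_NAMES = {
    "Camille": {"m": "Cameddu", "f": "Camedda"},
}


def translate_first_name(name: str, gender: Optional[str] = None) -> List[str]:
    """Return the candidate Corsican forms of a first name.

    gender is "m", "f", or None when the context gives no cue.
    """
    forms = FIRST_NAMES.get(name)
    if forms is None:
        # Unknown name: leave it untranslated.
        return [name]
    if gender in forms:
        # Gender is known: the ambiguity is resolved.
        return [forms[gender]]
    # Gender unknown: keep both candidates for a later step to decide.
    return list(forms.values())


print(translate_first_name("Camille", "f"))
print(translate_first_name("Camille"))
```

Returning a list rather than a single string keeps the ambiguity visible when no gender cue is available, instead of silently picking one form.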


Writing differences between Corsican and Gallurese

Here are some writing differences between Corsican and Sardinian Gallurese that result from historical writing habits. These differences hold even when the words are the same: ghj is replaced by gghj: acciaghju (corsu), acciagghju (gallurese), ‘steel’; chj is …
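As a rough illustration of the ghj → gghj correspondence, the substitution can be written as a single regular expression. This sketch covers only the one rule that is fully stated above, and it assumes the rule applies wherever ghj occurs in a word, which may not hold in every position. The function name is hypothetical.

```python
import re

# Match "ghj" only when it is not already preceded by "g",
# so that words already spelled with "gghj" are left unchanged.
_GHJ = re.compile(r"(?<!g)ghj")


def corsican_to_gallurese_spelling(word: str) -> str:
    """Rewrite Corsican 'ghj' as Gallurese 'gghj' (assumed to apply everywhere)."""
    return _GHJ.sub("gghj", word)


print(corsican_to_gallurese_spelling("acciaghju"))  # acciagghju
```

The negative lookbehind makes the conversion idempotent: applying it twice gives the same result as applying it once.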


How rule-based and statistical machine translation can help each other

Here are a few suggestions on how rule-based and statistical machine translation can help each other. (This is a follow-up to the previous post.) To begin with, rule-based and statistical machine translation are often contrasted and compared: it would be …


Rough typology of remaining errors

French to Corsican: performance on the French Wikipedia test sample currently averages 93%. Below is a rough typology of the remaining errors (an average of 95% should presumably be attainable by correcting the errors tagged ‘easy’): …


A first 100%!

Now scoring 1 – 0/124 = 100%. Translated into Corsican ‘sartinesu’. Another Feigenbaum hit.
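The figure above has the form 1 – errors/total. A minimal sketch of that arithmetic follows, assuming that 0 is the number of errors and 124 is the number of units in the test text (the post does not say whether these are words or some other unit). The function name is hypothetical.

```python
def score(errors: int, total: int) -> float:
    """Return 1 - errors/total, the accuracy on a test text."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 1 - errors / total


print(f"{score(0, 124):.0%}")  # 100%
```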
