Rough typology of remaining errors

French to Corsican: on a test sample drawn from French Wikipedia, performance currently averages 93%. Below is a rough typology of the remaining errors (an average of 95% should presumably be attainable by correcting the errors tagged ‘easy’):



  • unknown vocabulary: 50% (easy)
  • basic disambiguation: 15% (easy)
  • erroneous gender agreement (relates to (i) words that are masculine in French and feminine in Corsican; and (ii) words that are feminine in French and masculine in Corsican): 5% (medium difficulty)
  • inadequate locution: 10% (medium difficulty or hard)
  • false positives: 5% (medium difficulty or hard)
  • semantic disambiguation: 5% (hard). For example, disambiguating French ‘échecs’ = fiaschi/scacchi (failures/chess)
  • specific grammatical case: 2% (hard)
  • word reference error: 2% (hard)
  • unknown, unclassified: 6% (hard)
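
As a quick check on the 95% target, the arithmetic can be sketched as follows. This is only an illustration under two assumptions that the post does not state explicitly: that the percentages in the list are shares of the remaining errors (they do sum to 100), and that fixing an error never introduces a new one. The variable names are made up for the example.

```python
# Figures come from the typology above; how they combine is an assumption.
accuracy = 93.0  # current average score (%), French to Corsican

# Shares (% of remaining errors) of the categories tagged 'easy'.
easy_shares = {"unknown vocabulary": 50.0, "basic disambiguation": 15.0}

error_rate = 100.0 - accuracy                  # points of accuracy currently lost
easy_share = sum(easy_shares.values()) / 100   # fraction of errors tagged 'easy'
max_gain = error_rate * easy_share             # points recovered if every 'easy' error were fixed
ceiling = accuracy + max_gain                  # best case from 'easy' fixes alone

# Fraction of the 'easy' errors that must be fixed to reach the 95% target.
needed_fraction = (95.0 - accuracy) / max_gain

print(f"ceiling after fixing all easy errors: {ceiling:.2f}%")
print(f"fraction of easy errors needed for 95%: {needed_fraction:.0%}")
```

Under these assumptions the ‘easy’ categories alone would allow a score somewhat above 97%, and fixing a bit under half of them would already be enough for 95%, so the 95% estimate looks conservative.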