Percentage of ambiguous words in a French sentence (from a French to Corsican translation perspective)

What is the average percentage of ambiguous words in a French sentence, from a French to Corsican translation perspective? In the above example, 20 of the 99 words are ambiguous, i.e. approximately 20%. Not all semantic ambiguities are taken into account here, so the real average should be at least 25%.

  • le = u/lu: definite article or pronoun (the/it)
  • est = livanti/hè: masculine noun or verb (east/is)
  • culminant = culminanti/culminendu: adjective or gerund
  • émerge = emerghju/emerghji: first person or third person verb
  • commence = principiu/principia: first person or third person verb (begin/begins)
  • cesse = cessu/cessa: first person or third person verb (cease/ceases)
  • volcanique = vulcanicu/vulcanica: adjective, masculine or feminine (volcanic, unambiguous from a French to English translation perspective)
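The percentage above is a simple ratio, and it can be reproduced with a few lines of code. The sketch below is only an illustration: the set of ambiguous forms is limited to the examples listed in this post, and the function name and tokenisation are assumptions, not the translator's actual code.

```python
# Hypothetical set of ambiguous French forms, taken only from the examples in this post.
AMBIGUOUS_FORMS = {"le", "est", "culminant", "émerge", "commence", "cesse", "volcanique"}


def ambiguity_rate(sentence, ambiguous_forms=AMBIGUOUS_FORMS):
    """Return (ambiguous_count, total_words, percentage) for a sentence."""
    # Naive tokenisation: split on whitespace and strip surrounding punctuation.
    words = [w.strip(".,;:!?()«»'’\"").lower() for w in sentence.split()]
    words = [w for w in words if w]
    hits = sum(1 for w in words if w in ambiguous_forms)
    pct = 100.0 * hits / len(words) if words else 0.0
    return hits, len(words), pct


# The figure quoted in the post: 20 ambiguous words out of 99.
quoted_pct = round(100 * 20 / 99, 1)  # 20.2
```

A real count would need a full lexicon of ambiguous forms and proper handling of elided words such as ‘l’est’, which this naive tokenisation does not split.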

Disambiguation of two consecutive ambiguous words: ‘plusieurs mois’

 

Testing the improved disambiguation engine. This is a special case: the disambiguation of two consecutive ambiguous words. French ‘au terme de plusieurs mois’ translates into à u capu di parechji mesa (at the end of several months) in Corsican (taravese variant). In this case, both ‘plusieurs’ and ‘mois’ are ambiguous:

  • ‘plusieurs’ (several) as an indefinite plural pronoun can be either masculine or feminine.
  • ‘mois’ as a noun can be either singular (month, mesi) or plural (months, mesa: the plural with a final –a is reminiscent of the Latin neuter).

There is only one error in the above translation: da latu should be replaced by da cantu.
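One way to picture how two adjacent ambiguous words can resolve each other is agreement filtering: ‘plusieurs’ forces the plural reading of ‘mois’, and ‘mois’ forces the masculine reading of ‘plusieurs’. The sketch below is a toy illustration under that assumption; the candidate table and function name are hypothetical and do not describe the engine's real implementation.

```python
from itertools import product

# Toy candidate analyses (hypothetical structure). Corsican forms are those given in the post.
CANDIDATES = {
    "plusieurs": [
        {"gender": "m", "number": "pl", "corsican": "parechji"},
        {"gender": "f", "number": "pl", "corsican": None},  # feminine form not given in the post
    ],
    "mois": [
        {"gender": "m", "number": "sg", "corsican": "mesi"},
        {"gender": "m", "number": "pl", "corsican": "mesa"},
    ],
}


def resolve_pair(first, second, candidates=CANDIDATES):
    """Return the pairs of analyses of two adjacent words that agree in gender and number."""
    return [
        (a, b)
        for a, b in product(candidates[first], candidates[second])
        if a["gender"] == b["gender"] and a["number"] == b["number"]
    ]


pairs = resolve_pair("plusieurs", "mois")
```

Of the four possible combinations, only the masculine plural pair survives, which gives parechji mesa.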

 


Disambiguation of ‘vie’

We face here a special case of disambiguation: ‘un général byzantin du vie siècle’ (a Byzantine general of the sixth century) should translate as un generali bizantinu di u 6esimu seculu. French ‘vie’ is ambiguous between vita and 6esimu or VIesimu (life/sixth), because lower-case ‘vi’ is sometimes used for the Roman numeral ‘VI’. Written in capitals, ‘VIe’ is unambiguous.
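A contextual rule for this case could look at the following word. The sketch below assumes that ‘vie’ immediately followed by ‘siècle’ is the ordinal and that it means ‘life’ otherwise; the cue set and function name are hypothetical and only illustrate the idea.

```python
# Assumption: 'siècle' right after 'vie' signals the ordinal reading (sixth century).
CENTURY_CUES = {"siècle"}


def translate_vie(token, next_token):
    """Return the Corsican rendering of 'vie' / 'VIe' given the word that follows it."""
    if token == "VIe":
        # The capitalised Roman numeral is unambiguous.
        return "6esimu"
    if token.lower() == "vie":
        return "6esimu" if next_token.lower() in CENTURY_CUES else "vita"
    # Any other token is outside the scope of this rule and is returned unchanged.
    return token
```

For example, `translate_vie("vie", "siècle")` yields 6esimu, while `translate_vie("vie", "quotidienne")` yields vita.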

This also raises an interesting and more general question: are ambiguities a weakness in a language? Is it better for a language to have few ambiguities?


Version 1.1 is available

Okchakko Traduttori: version 1.1 is available. Here is what is new:

  • translates from French into the three main varieties of the Corsican language: cismuntincu, sartinesu, taravesu
  • improvements to the help file
  • improvements to elision
  • expanded vocabulary


Light version 1.1 is available

Light version 1.1 is available. New features:

  • translates from French to one of the three main variants of the Corsican language: cismuntincu, sartinesu, taravesu
  • some improvements made to the help file
  • improvements to elision handling
  • additional vocabulary

 

 


Gender reversal: masculine to feminine

Here is a series of words that are masculine in French and feminine in Corsican (taravese variant).


Gender reversal: feminine to masculine

French to Corsican (taravese variant): here is a list of words that change from feminine to masculine.


French to English: handling adjective order

Now beginning to handle adjective order in French to English translation:

‘un peintre russe juif’: un pittori russiu ghjudeiu (a Russian Jewish painter)


Word-sense disambiguation: first test of new engine

Now testing the new engine with the semantically ambiguous French ‘échecs’ = fiaschi/scacchi (failures/chess).

What is interesting here is that semantic disambiguation transfers successfully into English, although the French/English engine is still in its infancy and still makes a lot of grammatical errors.

Now further tests are needed with some other semantically ambiguous words:

  • ‘défense’: defense/tusk; Corsican: difesa/sanna
  • ‘fils’: sons/wires; Corsican: figlioli/fili
  • ‘comprendre’: understand/comprise; Corsican: capisce/cumprende
  • ‘vol’: flight/theft; Corsican: bulu/arrubecciu
  • ‘voler’: fly/steal; Corsican: bulà/arrubà
  • ‘échecs’: chess/failures; Corsican: scacchi/fiaschi
  • ‘palais’: palace/palaces/palate/palates; Corsican: palazzu/palazzi/palate/palates

In the background, the unresolved threefold ambiguity of French ‘partie’ = parti/partita/partita (part/game/gone) is lurking…
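A very simple way to approach words like those listed above is to score each sense by the context words that typically accompany it. The sketch below is a minimal illustration of that idea and not the engine being tested: the French cue words are invented for the example, and the function name is hypothetical.

```python
# Hypothetical cue words for each Corsican rendering of French 'échecs'.
SENSES = {
    "échecs": {
        "scacchi": {"joueur", "tournoi", "roi", "échiquier"},  # chess
        "fiaschi": {"série", "subir", "essuyer", "malgré"},    # failures
    },
}


def disambiguate(word, context_words, senses=SENSES):
    """Pick the sense whose cue words overlap most with the context.

    Returns None when the word is unknown, no cue matches, or two senses tie.
    """
    options = senses.get(word)
    if not options:
        return None
    context = {w.lower() for w in context_words}
    scores = {sense: len(cues & context) for sense, cues in options.items()}
    best = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == best]
    return winners[0] if best > 0 and len(winners) == 1 else None
```

With these cue sets, `disambiguate("échecs", ["un", "tournoi", "d'", "échecs"])` selects scacchi and `disambiguate("échecs", ["une", "série", "d'", "échecs"])` selects fiaschi. Returning None on a tie leaves the decision to another rule instead of guessing.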

 


Feigenbaum test and semantic disambiguation

It is now clear that there can be no successful Feigenbaum test (i.e. not just occasional Feigenbaum hits, but regular, average performance) without adequate treatment of semantic disambiguation. Arguably, this is one of the hard problems of machine translation. Here are some typical instances:

 

  • ‘défense’: defense/tusk; Corsican: difesa/sanna
  • ‘fils’: sons/wires; Corsican: figlioli/fili
  • ‘comprendre’: understand/comprise; Corsican: capisce/cumprende
  • ‘vol’: flight/theft; Corsican: bulu/arrubecciu
  • ‘voler’: fly/steal; Corsican: bulà/arrubà
  • ‘échecs’: chess/failures; Corsican: scacchi/fiaschi
  • and the fourfold ambiguous ‘palais’: palace/palaces/palate/palates; Corsican: palazzu/palazzi/palate/palates

In short: no successful semantic disambiguation = no genuinely successful Feigenbaum test. The semantic disambiguation engine needs to be rewritten.
