A jeweler examines an emerald. "Aha," he says, "another green emerald. In all my years in this business, I must have seen thousands of emeralds, and every one has been green." We think the jeweler reasonable to hypothesize that all emeralds are green. Next door is another jeweler having equally comprehensive experience with emeralds. He speaks only the Choctaw Indian language. Color distinctions are not as universal as might be thought. The Choctaw Indians made no distinction between green and blue—the same words applied to both. The Choctaws did make a linguistic distinction between okchamali, a vivid green or blue, and okchakko, a pale green or blue. The Choctaw-speaking jeweler says: All emeralds are okchamali. He maintains that all his years in the jewelry business confirm this hypothesis. (William Poundstone, Labyrinths of reason)
The Corsican language is currently classified by UNESCO as a "definitely endangered language". This site aims to help revive the Corsican language by providing translation into Corsican. It translates French and Italian into one of the three main Corsican variants: 'cismuntincu', 'sartinesu' or 'taravesu'.
Most illustrations are from Wikimedia Commons.
Monthly Archives: January 2017
Semantic disambiguation is lurking: défense = difesa/sanna = defense/tusk. It should read: L’avvucatu priparava a so difesa. A sanna di u cignale era tronca. A tazzina era sculpita in una sanna d’elefante.
False positives sometimes occur: some words, notably proper names, should remain untranslated. Interestingly, the false positive here arises because the English word ‘transport’ is spelled the same in French: transport (fr) = transport (en) = trasportu (co).
Testing #machine translation, now facing a new elision problem: Riventosa (fr) = A Riventosa (co), that is, proper noun (fr) = definite article + proper noun (co). It should read: in u paese di A Riventosa (without elision). Elision rules are not trivial: le …
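The Riventosa case can be sketched as a guard on the elision of the preposition 'di'. This is only an illustration under assumptions: the function name `attach_di`, the vowel set and the `starts_with_proper_noun_article` flag are hypothetical, and the sketch covers this single case rather than the full set of elision rules the post alludes to.

```python
# Vowels that trigger elision of "di" in this sketch (an assumption).
VOWELS = set("aeiouàèìòù")


def attach_di(phrase: str, starts_with_proper_noun_article: bool = False) -> str:
    """Prefix `phrase` with the preposition 'di', eliding it to d' when allowed.

    Elision is blocked when the phrase begins with a definite article that
    belongs to a proper noun, as in 'A Riventosa'.
    """
    begins_with_vowel = phrase[:1].lower() in VOWELS
    if begins_with_vowel and not starts_with_proper_noun_article:
        return "d'" + phrase
    return "di " + phrase


print(attach_di("elefante"))  # d'elefante
print(attach_di("A Riventosa", starts_with_proper_noun_article=True))  # di A Riventosa
```

The flag would have to come from the lexicon entry for the place name, which is why the proper noun must be recognised before elision is applied.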
Rule-based translation, adjective agreement. Interesting stuff: sur les réseaux japonais et américain (fr) = annantu à e rete sgiappunesa è americana (co) = on the Japanese and American networks (en). The noun (networks) is plural but the adjectives (Japanese and American) are …
Now handling gender reversal:
- mer (FR, feminine) = sea (EN) = mare (CO, masculine)
- saveur (FR, feminine) = flavor (EN) = sapore (CO, masculine)
- liqueur (FR, feminine) = liquor (EN) = licore (CO, masculine)
‘c’est une bonne liqueur’ …
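Gender reversal implies that determiners and adjectives must agree with the gender of the Corsican noun, not with that of the French source. A minimal sketch, assuming a small lexicon that stores both genders (the `LEXICON` layout and the function name `target_agreement` are hypothetical; the three entries are the ones listed above):

```python
from typing import Tuple

# French noun -> (French gender, Corsican noun, Corsican gender).
LEXICON = {
    "mer": ("f", "mare", "m"),
    "saveur": ("f", "sapore", "m"),
    "liqueur": ("f", "licore", "m"),
}


def target_agreement(french_noun: str) -> Tuple[str, str, bool]:
    """Return (Corsican noun, gender to agree with, whether the gender flipped)."""
    fr_gender, co_noun, co_gender = LEXICON[french_noun]
    return co_noun, co_gender, fr_gender != co_gender


print(target_agreement("liqueur"))  # ('licore', 'm', True)
```

When the third value is true, any article or adjective attached to the noun has to be regenerated in the target gender instead of being carried over from the French.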
Introducing a new feature for #MachineTranslation, some verbal locutions:
- prendre d’assaut = assaltà
- mettre à sac = sacchighjà
- prendre au collet = incappià
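One way to handle such locutions is to match them as whole units before any word-by-word translation takes place. The sketch below is an assumption about how this could be done, not a description of the site's engine: the function name `translate_locutions` is hypothetical, and only the infinitive forms quoted above are covered (conjugated forms are ignored).

```python
import re

# Locutions quoted in the post (infinitive forms only).
LOCUTIONS = {
    "prendre d'assaut": "assaltà",
    "mettre à sac": "sacchighjà",
    "prendre au collet": "incappià",
}


def translate_locutions(text: str) -> str:
    """Replace known French locutions with their Corsican equivalent."""
    # Normalise the typographic apostrophe so that both spellings match.
    text = text.replace("’", "'")
    # Try longer locutions first so a shorter one never pre-empts a longer one.
    for source in sorted(LOCUTIONS, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(source) + r"(?!\w)"
        target = LOCUTIONS[source]
        text = re.sub(pattern, lambda match, t=target: t, text)
    return text


print(translate_locutions("prendre d’assaut"))  # assaltà
```

Words outside the locution are left untouched, so the remaining text can still go through the ordinary word-level rules afterwards.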
Now considering the issue of semantic disambiguation. Some instances for French to Corsican are:
- ‘défense’ = sanna/difesa = tusk/defense
- ‘vol’ = bulu/furtu = flight/theft
- ‘comprend’ = capisce/cumprende = understands/comprises
- ‘palais’ = palatu/palazzu = palate/palace
- ‘expérience’ = …
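To make the problem concrete, here is a deliberately naive cue-word heuristic for two of the pairs above. It is a sketch under stated assumptions, not the method used by the translator: the cue-word sets, the `DEFAULT` fallbacks and the function name `disambiguate` are all hypothetical.

```python
import re

# Candidate Corsican senses with French context words that suggest each sense.
# The cue words are illustrative guesses, not taken from the translator.
SENSES = {
    "défense": {
        "sanna": {"sanglier", "éléphant", "ivoire"},
        "difesa": {"avocat", "tribunal", "procès"},
    },
    "vol": {
        "bulu": {"avion", "oiseau", "pilote"},
        "furtu": {"police", "voleur", "bijoux"},
    },
}

# Fallback sense when no cue word is found (also an assumption).
DEFAULT = {"défense": "difesa", "vol": "bulu"}


def disambiguate(word: str, sentence: str) -> str:
    """Pick the Corsican sense whose cue words overlap most with the sentence."""
    tokens = set(re.findall(r"\w+", sentence.lower()))
    senses = SENSES[word]
    best = max(senses, key=lambda sense: len(senses[sense] & tokens))
    if not senses[best] & tokens:
        return DEFAULT[word]
    return best


print(disambiguate("défense", "La défense du sanglier était cassée."))  # sanna
print(disambiguate("défense", "L'avocat préparait sa défense."))  # difesa
```

Such a heuristic fails as soon as the context contains no listed cue word, which illustrates why semantic disambiguation is the hard part.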
Now scoring 1 – 6/134 = 95.52%. Missing vocabulary: ‘passacaile’. It should read: ‘versu u 1678‘, ‘di a so epica‘, ‘di u so tempu‘.
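The score above is simple arithmetic and can be reproduced as follows, assuming 6 is the number of errors and 134 the number of words in the test text (the function name `translation_score` is hypothetical):

```python
def translation_score(errors: int, words: int) -> float:
    """Return the share of correctly translated words, as a percentage."""
    if words <= 0:
        raise ValueError("words must be positive")
    return round((1 - errors / words) * 100, 2)


print(translation_score(6, 134))  # 95.52
```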
Let us mention the issue of threefold ambiguity: French ‘nouvelle’ can translate into ‘nutizia‘ (‘piece of news’), ‘nuvella‘ (‘short story’) or ‘nova‘ (‘new’). The disambiguation between ‘nutizia‘ (‘piece of news’) and ‘nuvella‘ (‘short story’) is semantic (hard), while the …
Some further reflections on the definition of ‘above human level’ translation:
- the answer may not be based solely on the quantitative side, being of the type ‘above 96%’, ‘above 97%’, ‘above 98%’, etc.
- it seems the answer should also …