Author Archives: pilinu
How to translate ‘Cette phrase est en français’ ? (This sentence is in French) – updated
Let us consider the following French sentence: Le comté de Kronoberg est un comté suédois dont le nom signifie en français ‘Couronne de montagne’. It translates into Corsican: A cuntea di Kronoberg hè una cuntea svedese chì u so nome significheghja in … Continue reading
Posted in blog
Tagged machine translation, self-reference, superintelligence, superintelligent machine translation
Superintelligent machine translation (updated)
Let us consider superintelligence with regard to machine translation. To fix ideas, we can propose a rough definition: it consists of a machine with the ability to translate with 99% (or above) accuracy from one of the 8000 languages to … Continue reading
Tagged AGI, AI, artificial general intelligence, lexical disambiguation, machine self-improvement, machine translation, MT, rule-based machine translation, self-improvement, superintelligence, superintelligent machine translation, word-sense disambiguation
Brain Emulation
What is it to make a rule-based translation software for a given language pair? It amounts to making part of Brain Emulation, dedicated to translating one language into another i.e. emulating the brain of a bilingual individual. Arguably, ‘human cognition emulation’ is … Continue reading
Tagged human reasoning emulation, human translation reasoning emulation
Rough typology of remaining errors (updated March 2018)
French to Corsican: performance on the French Wikipedia sample test currently averages 94%. Below is a rough typology of remaining errors (presumably an average score of 95% on the open test should be attainable on the basis of correction … Continue reading
What are the conditions for a given endangered language to be a candidate for rule-based machine translation?
What are the conditions for a given endangered language to be a candidate for rule-based machine translation? For a given endangered language to be a candidate for rule-based machine translation, some requirements are in order. There is notably a need for: a … Continue reading
Tagged dictionary, diglossia, elision, euphony, gaddhuresu, gallurese language, grammar, lexicon, machine translation, rule-based machine translation
Interesting case of first name disambiguation
Here is an interesting case of first name disambiguation for machine translation. Consider the following first name ‘Camille’. It can apply to both genders. In Corsican (taravese or sartinese variants) it translates either into Cameddu (masculine) or Camedda (feminine). In … Continue reading
Writing differences between Corsican and Gallurese
Here are some writing differences between Corsican and Sardinian gallurese that result from historical writing habits. These writing differences prevail even when the words are the same: ghj is replaced by gghj: acciaghju (corsu), acciagghju (gallurese), steel; chj is … Continue reading
Tagged Corsica, Corsican 'sartinesu', Corsican language, gaddhuresu, Gallura, gallurese, gallurese language, machine translation
Quandu da la forza à la raghjoni cuntrasta Tandu vinci la forza è la raghjoni ùn basta
Quandu da la forza à la raghjoni cuntrasta Tandu vinci la forza è la raghjoni ùn basta. This is a rare Corsican proverb. In French, literally: “Lorsque la force et la raison s’opposent, alors la force gagne car la raison … Continue reading
How rule-based and statistical machine translation can help each other
Here are a few suggestions on how rule-based and statistical machine translation can help each other: (This is a follow-up to the previous post) to begin with, rule-based and statistical machine translation are often contrasted and compared: it would be … Continue reading
Why rule-based translation is (presently) best suited to endangered languages
Here are some arguments in favor of choosing rule-based translation for machine translation of endangered languages (this relates to the philosophy of language policy): there does not exist at the present time a reliable corpus between the given endangered … Continue reading
Enhancing French to Italian translation
Some improvements made to French to Italian translation: several contractions (della, dello, …) have been fixed. The nice thing is that semantic disambiguation is working: ‘échecs’ = fallimenti/scacchi (failures/chess) now translates properly into scacchi.
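To make the ‘échecs’ = fallimenti/scacchi choice concrete, here is a minimal sketch of one generic way to pick a sense: look for cue words in the same sentence. This is not the engine described in the post. The cue list `CHESS_CUES` and the function name `translate_echecs` are hypothetical, chosen only for illustration.

```python
# Hypothetical cue words suggesting the "chess" sense (an assumption,
# not taken from the original engine).
CHESS_CUES = {"joueur", "partie", "tournoi", "champion"}


def translate_echecs(sentence: str) -> str:
    """Pick an Italian rendering for French 'échecs' from sentence context.

    Returns 'scacchi' (chess) if any cue word occurs in the sentence,
    otherwise 'fallimenti' (failures).
    """
    # Lowercase each word and strip surrounding punctuation.
    words = {w.strip(".,;:!?'’\"").lower() for w in sentence.split()}
    return "scacchi" if words & CHESS_CUES else "fallimenti"


print(translate_echecs("Il est champion du monde d'échecs."))
print(translate_echecs("Après plusieurs échecs, il abandonne."))
```

A real rule-based system would presumably rely on richer context than a flat cue list, but the sketch shows the basic shape of the decision.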
Tagged French_Italian translator, French-Italian, Italian, Italian language
Very first draft on French to Italian
Now testing French to Italian translation: it is the very first draft. A rough 80%. A lot of things to fix.
Improvement in grammatical structures: another 100% hit
Progress on grammatical structures: some improvements to be included in the future 1.2 version yield another Feigenbaum hit: 100%. In the present case, the Corsican language variety is taravese.
Percentage of ambiguous words in French sentence (from French to Corsican translation perspective)
What is the average percentage of ambiguous words in a French sentence (from a French to Corsican translation perspective)? In the above example, this percentage amounts to 20/99 words = approximately 20%. Not all semantic ambiguities are taken into account here, so … Continue reading
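The ratio quoted in this post can be reproduced with a few lines. This is only a sketch of the arithmetic: the function name `ambiguity_rate` is hypothetical, and the counts (20 ambiguous words out of 99) are the ones given in the excerpt.

```python
def ambiguity_rate(ambiguous_count: int, total_words: int) -> float:
    """Share of ambiguous words in a sentence or sample, as a fraction."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return ambiguous_count / total_words


# Counts taken from the post: 20 ambiguous words out of 99.
rate = ambiguity_rate(20, 99)
print(f"{rate:.1%}")
```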
Disambiguation of two consecutive ambiguous words: ‘plusieurs mois’
Testing improved disambiguation engine. This is a special case of disambiguation of two consecutive ambiguous words. French ‘au terme de plusieurs mois’ translates into à u capu di parechji mesa (at the end of several months) in Corsican (taravese variant). … Continue reading
Disambiguation of ‘vie’
We face here a special case of disambiguation: ‘un général byzantin du vie siècle’ (a Byzantine general of the sixth century) should translate: un generali bizantinu di u 6esimu seculu. French ‘vie’ is ambiguous between vita and 6esimu or VIesimu (life/sixth). In … Continue reading
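One plausible way to separate the two readings of ‘vie’ is to look at the following token. The sketch below assumes, as a simplification not stated in the post, that ‘vie’ directly followed by ‘siècle’ is the Roman numeral VI used as an ordinal. The function name and tag labels are hypothetical. The target forms vita and 6esimu come from the excerpt.

```python
def disambiguate_vie(tokens: list[str], i: int) -> str:
    """Classify tokens[i] (expected to be 'vie') as 'ordinal' or 'noun'.

    Assumed rule: 'vie' immediately followed by 'siècle(s)' is the
    Roman numeral VI (sixth); otherwise it is the noun 'life'.
    """
    next_token = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
    return "ordinal" if next_token in {"siècle", "siècles"} else "noun"


# Target forms taken from the post.
TARGET = {"ordinal": "6esimu", "noun": "vita"}

tokens = "un général byzantin du vie siècle".split()
print(TARGET[disambiguate_vie(tokens, tokens.index("vie"))])
```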
Gender reversal: masculine to feminine
Here is a series of words that are masculine in French and feminine in Corsican language (taravese variant).
Gender reversal: feminine to masculine
French to Corsican (taravese variant): here is a list of words that are changing from feminine to masculine.
French to English: handling adjective order
Now beginning to handle adjective order in French to English translation: ‘un peintre russe juif’: un pittori russiu ghjudeiu (a Russian Jewish painter)
Word-sense disambiguation: first test of new engine
Now testing the new engine with the semantically ambiguous French ‘échecs’ = fiaschi/scacchi (failures/chess). What is interesting here is that semantic disambiguation transfers successfully into English (although the French/English engine is still in its infancy as there are still a … Continue reading
Tagged disambiguation, French to Corsican, French to English, machine translation, semantic disambiguation, word-sense dismbiguation
Feigenbaum test and semantic disambiguation
It is now patent that there cannot be a successful Feigenbaum test (i.e. not only occasional Feigenbaum hits, but regular and average performance) without an adequate treatment of semantic disambiguation. Arguably, it is one hard problem of machine translation. Here are some … Continue reading
Tagged disambiguation, Feigenbaum test, semantic disambiguation
French to English: superlative
Now testing French to English translation. Still a lot of grammatical errors. The scoring is a rough 80%. Adjective-noun order is now handled properly. But some progress in superlative is expected: ‘le plus important’ should translate: the most important ‘le plus … Continue reading
Adjective-noun reversal in English
Now beginning to incrementally improve the translation from French to English, starting with adjective order in noun + adjective structures: ‘présence significative’ = significant presence; ‘soldats coloniaux’ = colonial soldiers; ‘colonies françaises’ = French colonies.
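The reversal shown in these three examples can be sketched as a small transfer rule: translate word by word, and when a noun is directly followed by an adjective, emit the adjective first. The toy lexicon, the tag names and the function `translate_noun_adj` are assumptions made for illustration. Only the three word pairs come from the post.

```python
# Toy French-to-English lexicon built from the post's three examples.
# Each entry maps a French word to (English word, part-of-speech tag).
LEXICON = {
    "présence": ("presence", "NOUN"),
    "significative": ("significant", "ADJ"),
    "soldats": ("soldiers", "NOUN"),
    "coloniaux": ("colonial", "ADJ"),
    "colonies": ("colonies", "NOUN"),
    "françaises": ("French", "ADJ"),
}


def translate_noun_adj(phrase: str) -> str:
    """Translate word by word, swapping each NOUN + ADJ pair to ADJ + NOUN."""
    words = phrase.split()
    output = []
    i = 0
    while i < len(words):
        english, tag = LEXICON[words[i]]
        has_next = i + 1 < len(words)
        if tag == "NOUN" and has_next and LEXICON[words[i + 1]][1] == "ADJ":
            # French noun + adjective becomes English adjective + noun.
            output.extend([LEXICON[words[i + 1]][0], english])
            i += 2
        else:
            output.append(english)
            i += 1
    return " ".join(output)


for phrase in ("présence significative", "soldats coloniaux", "colonies françaises"):
    print(phrase, "=", translate_noun_adj(phrase))
```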
A first 100%!
Now scoring 1 – 0/124 = 100%. Translated into Corsican ‘sartinesu’. Another Feigenbaum hit.
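The scores quoted in these posts follow the pattern 1 – (errors / words). A minimal sketch of that computation follows. The function name `translation_score` is hypothetical, and the inputs are the figures given in the posts.

```python
def translation_score(errors: int, words: int) -> float:
    """Score a translated sample as 1 - errors/words, as a fraction."""
    if words <= 0:
        raise ValueError("words must be positive")
    if not 0 <= errors <= words:
        raise ValueError("errors must be between 0 and words")
    return 1 - errors / words


# Figures taken from the posts: 0 errors in 124 words, 1 error in 110 words.
print(f"{translation_score(0, 124):.2%}")
print(f"{translation_score(1, 110):.2%}")
```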
Disambiguation of fourfold ambiguous French ‘pygmée’
Translation in the ‘sartinesa’ variant of the Corsican language. Scoring 1 – (1/110) = 99.09%. Let us focus on the disambiguation of fourfold ambiguous French ‘pygmée’. A rare case of ambiguity between masculine/feminine singular. It can consist of: masculine singular noun: translates … Continue reading
French to English: first experimental test
Now testing the translation from French to English. A lot of grammatical errors (a rough 75%). To mention but a few of them: adjective + noun inversion: localities Alsatian should read: Alsatian localities; date format inversion: 4 March should read: March … Continue reading
Translating French into the ‘sartinesu’ variant of Corsican language
Translating French into the ‘sartinesu’ variant of Corsican language: scoring 1 – (1/105) = 99.04%. Feigenbaum hit? Yes: ‘médailleur’ is not a common word.
Tagged Corsican 'sartinesu', Feigenbaum hit, Feigenbaum test, French into Corsican, sartinesu
French ‘fin’ followed by a year number: fixed
Tagger improvement: fixed this issue. French ‘l’Empire allemand’ now translates properly into l’Imperu alimanu (the German Empire). French word ‘fin’ is now identified as a preposition when followed by a year number. The above excerpt is translated into the ‘sartinesu’ … Continue reading
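The rule stated here (‘fin’ is a preposition when followed by a year number) can be sketched as a one-token lookahead in a tagger. The tag names, the four-digit year pattern and the function `tag_fin` are assumptions for illustration, not the author's implementation.

```python
import re

# Assumption: a "year number" is approximated as a four-digit token.
YEAR = re.compile(r"\d{4}")


def tag_fin(tokens: list[str], i: int) -> str:
    """Tag tokens[i] (expected to be 'fin') using the following token.

    Returns 'PREP' when the next token looks like a year, otherwise
    'AMBIGUOUS' so that other rules can decide (noun, adjective, ...).
    """
    next_token = tokens[i + 1] if i + 1 < len(tokens) else ""
    return "PREP" if YEAR.fullmatch(next_token) else "AMBIGUOUS"


tokens = "a agité l’Empire allemand fin 1913".split()
print(tag_fin(tokens, tokens.index("fin")))
```

Once ‘fin’ is tagged as a preposition, it no longer competes as a noun for agreement with the preceding adjective, which is the kind of error the post reports as fixed.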
Tagged adjective accordance, Corsican language, disambiguation, machine translation, numbers grammatical type
French ‘fin’ followed by a year number
There is one informative error here: ‘a agité l’Empire allemand fin 1913’ (agitated the German Empire at the end of 1913) should translate into chì hà agitatu l’Imperu alimanu à a fini di u 1913. The translation error (l’Imperu alimana … Continue reading
Object of the verb ‘to exist’
French ‘il existe 29 parcs nationaux’ (there are 29 national parks) translates into Corsican: esistenu 29 parchi naziunali. When the verb ‘to exist’ is used and its object is plural, a plural form of the verb is required in Corsican … Continue reading
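The agreement rule can be sketched as a choice of verb form driven by the number of the object. The plural form esistenu comes from the post. The singular form esiste and the function name `exist_verb_form` are assumptions made for this illustration.

```python
def exist_verb_form(object_count: int) -> str:
    """Choose the Corsican form of 'to exist' for French 'il existe N ...'.

    French keeps the impersonal singular 'il existe' whatever follows,
    whereas the Corsican verb agrees with a plural object.
    """
    # 'esistenu' is taken from the post; 'esiste' is an assumed singular form.
    return "esistenu" if object_count > 1 else "esiste"


# 'il existe 29 parcs nationaux' -> 'esistenu 29 parchi naziunali'
print(exist_verb_form(29), "29 parchi naziunali")
```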