Characteristics of an AGI (artificial general intelligence)

What characteristics do we want in an AGI (artificial general intelligence)? An AGI should have very advanced capabilities in NLP and language comprehension. One quality we expect from an AGI is respect for multilingualism. Ideally, its NLP capabilities would extend to a large number of languages, and even to all of the planet's roughly 7,000 languages, including the large share that are endangered. The AGI could thus help to address language extinction, which threatens human cultural diversity. It can be assumed that some languages will already be extinct by the time AGI arrives, but the AGI could help to revitalize them.

Posted in blog | Tagged , , , | Leave a comment

The two-language matching problem

Here is a problem for a human intelligence (or an AGI): we have a dictionary (with words, lemmas and grammatical types) in a language A and a second dictionary in a language B. If we also have an extensive corpus for each of the two languages, is it possible to create a translation dictionary from A to B, and if so, how? To take an example: if the two languages were French and English, the final translation dictionary would have to associate ‘cheval’ with ‘horse’, and so on for all the words of language A.

A closely related paper appears to be: Deciphering Undersegmented Ancient Scripts Using Phonetic Prior.
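To make the problem statement concrete, here is a deliberately naive baseline in Python. It is not a proposed solution and not taken from the paper mentioned above: it simply pairs words of the same grammatical type by frequency rank in each corpus. The function names, the simplified dictionaries (word to grammatical type, with lemmas omitted) and the toy corpora are all assumptions made for illustration.

```python
from collections import Counter

def rank_by_frequency(corpus, dictionary, pos):
    """Words of grammatical type `pos`, most frequent in the corpus first."""
    counts = Counter(w for w in corpus if dictionary.get(w) == pos)
    return [w for w, _ in counts.most_common()]

def naive_match(corpus_a, dict_a, corpus_b, dict_b):
    """Pair words of A and B that share a grammatical type and a frequency rank."""
    shared_types = sorted(set(dict_a.values()) & set(dict_b.values()))
    pairs = {}
    for pos in shared_types:
        ranked_a = rank_by_frequency(corpus_a, dict_a, pos)
        ranked_b = rank_by_frequency(corpus_b, dict_b, pos)
        pairs.update(zip(ranked_a, ranked_b))  # zip stops at the shorter list
    return pairs

# Toy data (invented): each dictionary maps a word to its grammatical type.
DICT_A = {"cheval": "noun", "chien": "noun", "court": "verb"}
DICT_B = {"horse": "noun", "dog": "noun", "runs": "verb"}
CORPUS_A = "cheval court cheval chien cheval".split()
CORPUS_B = "horse runs dog horse".split()

TRANSLATION = naive_match(CORPUS_A, DICT_A, CORPUS_B, DICT_B)
```

On real corpora such a baseline would be far too crude, since frequency ranks rarely line up across languages; it only shows the shape of the inputs and of the expected output.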


Prototype of text search with optional grammatical type

Unconditional search

Let us expand the idea of text analysis derived from rule-based translation. Above is an example of a classic word-based search, in this case for the French word ‘été’. This word is ambiguous because it can be a common noun (‘summer’) or a past participle (‘been’). Below is an example of a search for the same word restricted to the grammatical type ‘common noun’, that is, to the sense ‘summer’.

Conditional search based on ‘noun’ grammatical type

Finally, below is an example of a search for the word ‘été’ restricted to the grammatical type ‘past participle’, that is, to the sense ‘been’.

Conditional search based on ‘past participle’ grammatical type
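
The three searches shown in this post can be sketched in a few lines of Python. This is only a minimal illustration, not the prototype's actual code: the `Token` type, the `search` function, the tag names and the hand-tagged sample sentence are all assumptions made for the example.

```python
from typing import List, NamedTuple, Optional

class Token(NamedTuple):
    word: str  # surface form as it appears in the text
    pos: str   # grammatical type (hypothetical tag names)

# Hand-tagged toy sentence: "l'été a été chaud" (the summer was hot).
# In practice the tags would come from a part-of-speech tagger.
TEXT = [
    Token("l'", "article"),
    Token("été", "common noun"),
    Token("a", "auxiliary verb"),
    Token("été", "past participle"),
    Token("chaud", "adjective"),
]

def search(tokens: List[Token], word: str, pos: Optional[str] = None) -> List[int]:
    """Return the positions of `word`; if `pos` is given, keep only that grammatical type."""
    return [
        i for i, t in enumerate(tokens)
        if t.word == word and (pos is None or t.pos == pos)
    ]
```

Calling `search(TEXT, "été")` finds both occurrences, while passing `pos="common noun"` or `pos="past participle"` keeps only one of them.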

Why it’s worth it to engage in rule-based translation

Rule-based translation is difficult to implement. The main difficulty is handling groups of words well enough to be on a par with statistics-based translation. The main problems in this regard are (i) polymorphic disambiguation; and (ii) building an adequate typology of grammatical types. But once these steps begin to be mastered, there are many advantages. What seems essential here is that the same piece of software can carry out both machine translation and text analysis. Among the modules that are easy to implement are the following:

  • lemmatizer
  • part-of-speech tagger
  • singularizer
  • pluralizer
  • grammar checker
  • type extractor: a module that allows you to extract words from a text according to their grammatical category

This is because implementing rule-based translation gives the machine some inherent understanding of the text, in the same way that a human being has one. In a nutshell, it is better artificial intelligence.

Finally, other, more advanced modules also seem possible, though this remains to be confirmed.
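
As a rough illustration of how several of the modules listed above could share a single lexicon, here is a minimal Python sketch. The `Entry` type, the toy `LEXICON` and all function names are assumptions made for the example, and ambiguity is left unresolved: every possible analysis of a word is returned.

```python
from typing import Dict, List, NamedTuple

class Entry(NamedTuple):
    lemma: str
    pos: str     # grammatical type (hypothetical tag names)
    number: str  # "sing", "plur", or "" when not applicable

# Toy lexicon (invented): surface form -> all of its possible analyses.
LEXICON: Dict[str, List[Entry]] = {
    "cheval": [Entry("cheval", "common noun", "sing")],
    "chevaux": [Entry("cheval", "common noun", "plur")],
    "été": [Entry("été", "common noun", "sing"),
            Entry("être", "past participle", "")],
    "beau": [Entry("beau", "adjective", "sing")],
}

def lemmatize(word: str) -> List[str]:
    """Lemmatizer: every lemma the word can belong to."""
    return sorted({e.lemma for e in LEXICON.get(word, [])})

def tag(word: str) -> List[str]:
    """Part-of-speech tagger without disambiguation: all candidate types."""
    return [e.pos for e in LEXICON.get(word, [])]

def inflect(word: str, number: str) -> List[str]:
    """Singularizer / pluralizer: forms with the same lemma and type in `number`."""
    targets = {(e.lemma, e.pos) for e in LEXICON.get(word, []) if e.number}
    return sorted(
        form for form, entries in LEXICON.items()
        if any((e.lemma, e.pos) in targets and e.number == number for e in entries)
    )

def extract(words: List[str], pos: str) -> List[str]:
    """Type extractor: words of a text that can have the given grammatical type."""
    return [w for w in words if pos in tag(w)]
```

For instance, `inflect("cheval", "plur")` acts as a pluralizer and `inflect("chevaux", "sing")` as a singularizer, both reading the same entries that `lemmatize` and `extract` use.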


A two-sided analysis of postpositions

#preposition #postposition Consider the following adverbs: après (after, dopu), as in ‘he would eat after’, and avant (before, nanzi), as in ‘they had seen them before’. They can also be considered prepositions:

  • après la fête: after the feast, dopu à a festa
  • avant le mois de juin: before the month of June, nanzi u mesi di ghjunghju

Likewise, durant (during) is also a preposition:

  • durant la procession: during the procession, mentri a prucissioni

But après, avant and durant can also be used differently:

  • deux jours après: two days after, dui ghjorni dopu
  • une semaine avant: one week before, una sittimana innanzi
  • deux mois durant: for two months, mentri dui mesi

From our point of view, these are postpositions, because they are then generally followed by punctuation and preceded by a common noun.

If we now extend this analysis to locutions, the following locutions are also postpositions:

  • plus tard: later, dopu; deux jours plus tard: two days later, dui ghjorni dopu
  • plus loin: further, più luntanu; trois mètres plus loin: three meters further
  • plus près: closer, più vicinu; dix centimètres plus près: ten centimeters closer
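
The criterion given above (preceded by a common noun, generally followed by punctuation) can be sketched as a small Python rule. This is a simplified illustration under stated assumptions: the tag names, the `classify` function and the fallback to ‘adverb’ are hypothetical, the end of the text is treated like punctuation, and locutions such as ‘plus tard’ are not handled.

```python
from typing import List, Optional, Tuple

PUNCTUATION = {".", ",", ";", ":", "!", "?"}
CANDIDATES = {"après", "avant", "durant"}

def classify(tokens: List[Tuple[str, str]], i: int) -> Optional[str]:
    """Classify the candidate word at position i of a list of (word, tag) pairs."""
    word, _ = tokens[i]
    if word not in CANDIDATES:
        return None
    prev_tag = tokens[i - 1][1] if i > 0 else None
    # Assumption: the end of the text counts as punctuation.
    next_word = tokens[i + 1][0] if i + 1 < len(tokens) else "."
    if prev_tag == "common noun" and next_word in PUNCTUATION:
        return "postposition"  # deux jours après.
    if next_word not in PUNCTUATION:
        return "preposition"   # après la fête
    return "adverb"            # il mangerait après.

# The candidate's own tag is left as "?" since it is what we want to determine.
POSTPOSED = [("deux", "numeral"), ("jours", "common noun"), ("après", "?"), (".", "punctuation")]
PREPOSED = [("après", "?"), ("la", "article"), ("fête", "common noun")]
ADVERBIAL = [("il", "pronoun"), ("mangerait", "verb"), ("après", "?")]
```

Here `classify(POSTPOSED, 2)` yields the postposition reading, `classify(PREPOSED, 0)` the preposition reading, and `classify(ADVERBIAL, 2)` the adverb reading.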


More on two-sided grammar


Let’s focus on analyzing the following phrases:

  • à force de courage (bravely)
  • à force de courage et de persévérance (by dint of courage and perseverance)
  • avec beaucoup d’abnégation (selflessly)
  • d’une manière ou d’une autre (one way or another)
  • d’une façon vraiment admirable (in a very admirable way)
  • au moment le plus opportun (when most appropriate)

What is their grammatical nature? From the point of view of two-sided grammar, what are they?

From a synthetic standpoint, first of all, they are adverbs. Let us turn now to their nature from an analytical point of view.

  • à force de courage (bravely): analytically, it is a preposition, followed by a common noun, then another preposition, then another common noun: PS-NC-PS-NC.
  • à force de courage et de persévérance (by dint of courage and perseverance): analytically, it is a preposition, followed by a common noun, then another preposition, then another common noun, then a conjunction, then another preposition and then another common noun: PS-NC-PS-NC-CONJ-PS-NC.
  • and so on
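
The two views of a phrase can be held side by side in a small data structure. The following Python sketch is only an illustration: the `TwoSided` type, the `analyze` function and the tiny `TAGS` table are assumptions, while the tag codes (PS, NC, CONJ) follow the notation used above.

```python
from typing import NamedTuple

# Tiny hand-written tag table (invented), using the codes from the analysis above:
# PS = preposition, NC = common noun, CONJ = conjunction.
TAGS = {
    "à": "PS", "de": "PS", "et": "CONJ",
    "force": "NC", "courage": "NC", "persévérance": "NC",
}

class TwoSided(NamedTuple):
    phrase: str
    synthetic: str  # the type of the phrase taken as a whole
    analytic: str   # the sequence of types of its words

def analyze(phrase: str, synthetic: str) -> TwoSided:
    """Attach both the synthetic type and the analytic pattern to a phrase."""
    pattern = "-".join(TAGS[word] for word in phrase.split())
    return TwoSided(phrase, synthetic, pattern)

SHORT = analyze("à force de courage", "adverb")
LONG = analyze("à force de courage et de persévérance", "adverb")
```

With this table, `SHORT.analytic` is `PS-NC-PS-NC` and `LONG.analytic` is `PS-NC-PS-NC-CONJ-PS-NC`, while both phrases keep the synthetic type `adverb`.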

Lemmatizer for French language updated

I have just updated the lemmatizer for the French language. Many new options are available.

The API can be tested here: https://rapidapi.com/okchakkotranslator/api/lemmatizer-for-french-language


Reflections on grammatical typologies

It is useful to point out the differences that may exist between grammatical typologies. The classical grammatical taxonomy is aimed mainly at teaching and comprehension, so it has a pedagogical purpose. The taxonomy that is useful for rule-based machine translation has a different purpose: it aims above all at allowing disambiguation, both grammatical and semantic, because ambiguity is a fundamental and very common problem in this context. Such a typology focuses on where word types occur and on the structures encountered in the sentence. This explains why the two typologies can differ: they serve different goals.
