Tag Archives: rule-based

Artificial Intelligence (Overview)

Is AI dangerous or useful?

This question is currently the subject of extensive debate. The aim here is not to repeat well-known opinions, but to shed light on the basics of the technology that you are almost certainly unaware of. Or do you know where AI gets its intelligence from?

For a quarter of a century, I have been developing ‘intelligent’ IT systems, and I am astonished that we ascribe real intelligence to artificial intelligence at all. That is exactly what it does not have. Its intelligence always comes from humans, who not only provide the data but also have to evaluate its meaning before the AI can use it. Only then can AI surprise us with its impressive performance and countless useful applications in a wide variety of areas. How does it achieve this?

In 2019, I started a blog series on this topic; an overview follows below. In 2021, I summarised the articles in a book entitled “Wie die künstliche Intelligenz zur Intelligenz kommt” (in German). The blog posts that form the basis of the book are listed below.

While the book is in German, the blog series is available in both German and English.

Latest Posts about AI

English Posts

German Posts

Sketch of the history of AI since Aristotle (only in German)
AI and Music (only in German)
How dangerous is AI? (only in German)
Weaknesses of AI (only in German)
The 21st travel and AI (German original of “Ijon Tichy meets AI”)
Does the Chatbot LaMDA Have a Consciousness of Its Own? (only in German)
Three Observations on AI / 1 (only in German)
Three Observations on AI / 2 (only in German)
Three Observations on AI / 3 (only in German)

Earlier Posts on AI (basis of the KI-book)

Rule-based or corpus-based?

These are the two fundamentally different methods of computer intelligence. They can either be based on rules or a collection of data (corpus). In the introductory post, I present the two with the help of two characteristic anecdotes:

AI: vodka and tanks

With regard to success, the corpus-based systems have obviously outstripped the rule-based ones:

Corpus-based AI overcomes its weaknesses

The rule-based systems had a more difficult time of it. What are their challenges? How can they overcome their weaknesses? And where is their intelligence situated inside them?

How are corpus-based systems set up? How is their corpus compiled and assessed? What are neural networks all about? And what are the natural limits of corpus-based systems?

Next, we’ll have a look at search engines, which are also corpus-based systems. How do they arrive at their proposals? Where are their limits and dangers? Why, for instance, is it inevitable that bubbles are formed?

Is a program capable of learning without human beings providing it with useful pieces of advice? It appears to work with deep learning. To understand this, we first compare a simple card game with chess: what requires more intelligence? Surprisingly, it becomes clear that for a computer, chess is the simpler game.

With the help of the general conditions of the board games Go and chess, we recognise under what conditions deep learning works.

In the following blog post, I’ll provide an overview of the AI types known to me. I’ll draw a brief outline of their individual structures and of the differences in the way they work.

Overview of the AI systems

So where is the intelligence?

Now where in artificial intelligence is the intelligence located?

The considerations reveal what distinguishes natural intelligence from artificial intelligence:

Artificial and natural intelligence: the difference

AI systems only show their capabilities when the task is clear and simple. As soon as the question becomes complex, they fail. Or they fib by arranging beautiful sentences found in their treasure trove of data in such a way that the result sounds intelligent (ChatGPT, LaMDA). They do not work with logic, but with statistics, i.e. with probability. But is what appears to be true always true?

The weaknesses necessarily follow from the design principle of AI. Further articles deal with this:

The Weaknesses of AI (in German)
How dangerous is AI really? (in German)

Overview of the AI systems

5. May 2020 Hans Rudolf Straub

All the systems we have examined so far, including deep learning, can in essence be traced back to two methods: the rule-based method and the corpus-based method. This also applies to the systems we have not discussed to date, namely simple automata and hybrid systems, which combine the two above approaches. If we integrate these variants, we will arrive at the following overview:

A: Rule-based systems

Rule-based systems are based on calculation rules. These rules are invariably IF-THEN commands, i.e. instructions which assign a certain result to a certain input. These systems are always deterministic, i.e. a certain input always leads to the same result. Also, they are always explicit, i.e. they involve no processes that cannot be made visible, and the system is always completely transparent – at least in principle. However, rule-based systems can become fairly complex.
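A minimal sketch in Python may make the IF-THEN idea concrete. The function name, weight limits and class names are invented for illustration and do not come from any real system; the point is only that every rule is explicit and that the same input always produces the same result.

```python
# Illustrative only: the weight limits and class names are invented for this sketch.
def shipping_class(weight_kg: float) -> str:
    """Assign a shipping class to a parcel weight with explicit IF-THEN rules."""
    if weight_kg <= 0.5:
        return "letter"
    if weight_kg <= 5.0:
        return "small parcel"
    if weight_kg <= 30.0:
        return "large parcel"
    return "freight"


# Deterministic: the same input always leads to the same result.
repeated = [shipping_class(2.0) for _ in range(3)]
print(repeated)
```

Because every rule is written out, anyone can read off why a given input received its result, which is what transparency "in principle" means here.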

A1: Simple automaton (pocket calculator type)

Fig. 1: Simple automaton

Rules are also called algorithms (“Algo”) in Fig. 1. Input and outputs (results) need not be figures. The simple automaton distinguishes itself from other systems in that it does not require any special knowledge base, but works with a few calculation rules. Nevertheless, simple automata can be used to make highly complex calculations, too.

Perhaps you would not describe a pocket calculator as an AI system, but the differences between a pocket calculator and the more highly developed systems right up to deep learning are merely gradual in nature – i.e. precisely of the kind that is being described on this page. Complex calculations soon strike us as intelligent, particularly if we are unable to reproduce them that easily with our own brains. This is already the case with simple arithmetic operations such as divisions or root extraction, where we quickly reach our limits. Conversely, we regard face recognition as comparatively simple because we are usually able to recognise faces quite well without a computer. Incidentally, nine men’s morris is also part of the A1 category: playing it requires a certain amount of intelligence, but it is complete in itself and easily controllable with an AI program of the A1 type.

A2: Knowledge-based system

Fig. 2: Compiling a knowledge base (IE=Inference Engine)

These systems distinguish themselves from simple automata in that some of their rules have been outsourced to a knowledge base. Fig. 2 indicates that this knowledge base has been compiled by a human being, and Fig. 3 shows how it is applied. The intelligence is located in the rules; it originates from human beings. In the application, however, the knowledge base is capable of working on its own.

Fig. 3: Application of a knowledge-based system

The inference engine (“IE” in Figs. 2 and 3) corresponds to the algorithms of the simple automaton in Fig. 1. In principle, the algorithms, the inference engine and the rules of the knowledge base are always rules, i.e. explicit IF-THEN commands. However, these can be interwoven and nested in a variety of different ways. They can refer to figures or concepts. Everything is made by human experts.

The rules in the knowledge base are subordinate to the rules of the inference engine. The latter control the flow of the interpretation, i.e. they decide which rules of the knowledge base are to be applied and how they are to be implemented. The rules of the inference engine are the actual program that is read and executed by the computer. The rules of the knowledge base, however, are not executed directly by the computer, but indirectly through the instructions provided by the inference engine. This kind of nesting is typical of commands, i.e. of software in computers; after all, the rules of the inference engine are not implemented directly either, but are read by deeper rules right down to the machine language at the core (in the kernel) of a computer. In principle, however, the rules of the knowledge base are calculation rules just like the rules of the inference engine, only in a “higher” programming language. It is an advantage if the human domain experts, i.e. the human specialists, find this programming language particularly easy and safe to read and use.
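The division of labour between knowledge base and inference engine can be sketched as follows. This is a toy forward-chaining engine, not the system described in these posts; the rules and fact names are hypothetical. The knowledge base is plain data that a domain expert could edit, while the engine is the only part that is actually executed.

```python
# Knowledge base: rules stored as data (hypothetical content for illustration).
# Each rule reads: IF all conditions are known facts THEN add the conclusion.
KNOWLEDGE_BASE = [
    ({"fever", "cough"}, "suspected flu"),
    ({"suspected flu", "high-risk patient"}, "refer to doctor"),
]


def infer(facts, rules):
    """Inference engine: decides which knowledge-base rules fire, and when."""
    known = set(facts)
    changed = True
    while changed:  # repeat until no rule adds anything new
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = infer({"fever", "cough", "high-risk patient"}, KNOWLEDGE_BASE)
print(sorted(derived))
```

Swapping in a different rule list changes the behaviour without touching `infer`, which is the practical benefit of keeping the expert's rules outside the program code.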

With regard to the logic system used in inference engines, we distinguish between rule-based systems

– with a static logic (ontologies type / semantic web type),
– with a dynamic logic (concept molecules type).

For this, cf. the blog post on the three innovations of rule-based AI.

B: Corpus-based systems

Corpus-based systems are compiled in three steps (Fig. 4). In the first step, as large as possible a corpus is collected. The collection does not contain any rules, only data. Rules would be instructions; however, the data of the corpus are not instructions: they are pure data collections, texts, images, game processes, etc.

Fig. 4: Compiling a corpus-based system

These data must now be assessed. As a rule, this is done by a human being. In the third step, a so-called neural network is trained on the basis of the assessed corpus. In contrast to the data corpus, the neural network is again a collection of rules like the knowledge base of the rule-based systems A. Unlike those, however, the neural network is not constructed by a human being but built and trained by the assessed corpus. Unlike the knowledge base, the neural network is not explicit, i.e. it is not readily accessible.
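The three steps can be sketched with the smallest possible "network", a single artificial neuron (perceptron). The feature vectors and assessments below are invented toy data, and a real system would use far more data and many more neurons; the sketch only shows that the program is given assessed examples, not rules.

```python
# Step 1: collect a corpus (pure data, no rules). Toy feature vectors, invented here.
corpus = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

# Step 2: a human assesses every item (1 = belongs to the class, 0 = does not).
assessments = [0, 0, 0, 1]


# Step 3: train a single artificial neuron on the assessed corpus.
def predict(weights, bias, x):
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train(samples, labels, epochs=20, learning_rate=1.0):
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)  # 0 if already correct
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(corpus, assessments)
print([predict(weights, bias, x) for x in corpus])
```

The learned weights play the role of rules, but nobody wrote them down: they are numbers that resulted from the training loop, which is why they are harder to inspect than an explicit knowledge base.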

Fig. 5: Application of a corpus-based system

In their applications, both neural networks and the rule-based systems are fully capable of working without human beings. Even the corpus is no longer necessary. All the knowledge is located in the algorithms of the neural network. In addition, neural networks are also quite capable of interpreting poorly structured contents such as a mess of pixels (i.e. images), where rule-based systems (A type) very quickly reach their limits. In contrast to these, however, corpus-based systems are less successful with complex outputs, i.e. the number of possible output results must not be too large since if it is, the accuracy rate will suffer. Best suited here are binary outputs of the “our tank – foreign tank” type (cf. preceding post) or of “male author – female author” in the assessment of Twitter texts. For such tasks, corpus-based systems are vastly superior to rule-based ones. This superiority quickly declines, however, when it comes to finely differentiated outputs.

Three subtypes of corpus-based AI

The three subtypes differ from each other with regard to who or what assesses the corpus.

Fig. 6: The three types of corpus-based system and how they assess their corpus

B1: Pattern recognition type

I described this type (top in Fig. 6) in the tank example. The corpus is assessed by a human expert.

B2: Search engine type

Cf. middle diagram in Fig. 6: in this type, the corpus is assessed by the customers. I described such a system in the search engine post.
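As a rough illustration of assessment by customers, the following sketch ranks results by how often users clicked them. Treating clicks as the assessment signal is an assumption made for this example, and the query and addresses are placeholders; real search engines combine many more signals.

```python
from collections import Counter

# Hypothetical click log: how often users chose a result for a given query.
click_counts = Counter()


def record_click(query: str, url: str) -> None:
    """Every click is a small assessment made by a customer."""
    click_counts[(query, url)] += 1


def rank(query: str, candidates: list[str]) -> list[str]:
    """Order candidate results by the number of clicks they received."""
    return sorted(candidates, key=lambda url: click_counts[(query, url)], reverse=True)


record_click("tanks", "a.example")
for _ in range(3):
    record_click("tanks", "b.example")

ranking = rank("tanks", ["a.example", "b.example", "c.example"])
print(ranking)
```

Results that are already popular rise to the top and therefore attract even more clicks, which hints at why such systems tend to reinforce existing preferences.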

B3: Deep learning type

In contrast to the above types, this one (bottom in Fig. 6) does not require a human being to train or assess the neural network. The assessment results solely from the way in which the games proceed. The fact that deep learning is only possible in very restricted conditions is explained in the post on games and intelligence.
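How a corpus can assess itself through play can be sketched with a very small game. The example below uses a simple take-away game (take one to three stones; whoever takes the last stone wins), chosen here only because it is short. It covers just the assessment step: positions are labelled by the final outcome of random self-play, with no human involved. Training a network on these labels is left out.

```python
import random


def self_play_game(rng, stones=10):
    """Play one game with random moves; return the positions seen and the winner."""
    history = []  # (stones_left, player_to_move) before each move
    player, winner = 0, None
    while stones > 0:
        history.append((stones, player))
        stones -= rng.randint(1, min(3, stones))  # take one to three stones
        if stones == 0:
            winner = player  # taking the last stone wins
        player = 1 - player
    return history, winner


def assessed_corpus(num_games=100, seed=0):
    """Label every position with +1 if the player to move went on to win, else -1."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(num_games):
        history, winner = self_play_game(rng)
        for stones_left, player in history:
            corpus.append((stones_left, 1 if player == winner else -1))
    return corpus


corpus = assessed_corpus()
print(len(corpus), corpus[:3])
```

This only works because the rules of the game decide unambiguously who has won; without such a built-in verdict, the corpus would again need an outside assessor.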

C: Hybrid systems

Of course the above-mentioned methods (A1-A2, B1-B3) can also be combined in practice.

Thus a face identification system, for instance, may work in such a way that in the images provided by a surveillance camera, a corpus-based system B1 is capable of recognising faces as such, and in the faces the crucial shapes of eyes, mouth, etc. Subsequently, a rule-based system A2 uses the points marked by B1 to calculate the proportions of eyes, nose, mouth, etc., which characterise an individual face. Such a combination of corpus- and rule-based systems allows for individual faces to be recognised in images. The first step would not be possible for an A2 system, the second step would be far too complicated and inaccurate for a B1 system. A hybrid system makes it possible.
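The hand-over between the two parts might look like this in outline. `detect_landmarks` is a stand-in for a trained B1 system and simply returns fixed, made-up coordinates; `face_proportions` is the rule-based A2 part, and the chosen ratio is only one hypothetical example of a proportion.

```python
import math


def detect_landmarks(image):
    """Stand-in for a trained B1 system; returns fixed, made-up pixel coordinates."""
    return {
        "left_eye": (30.0, 40.0),
        "right_eye": (70.0, 40.0),
        "mouth": (50.0, 80.0),
    }


def face_proportions(landmarks):
    """A2 step: explicit calculation rules applied to the marked points."""
    left, right = landmarks["left_eye"], landmarks["right_eye"]
    eye_distance = math.dist(left, right)
    eye_midpoint = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    eye_to_mouth = math.dist(eye_midpoint, landmarks["mouth"])
    # Ratios are independent of image scale, unlike raw pixel distances.
    return {"eye_to_mouth_ratio": eye_to_mouth / eye_distance}


proportions = face_proportions(detect_landmarks(image=None))
print(proportions)
```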

In the following blog post, I will answer the question as to where the intelligence is located in all these systems. But you have probably long found the answer yourself.

This is a blog post about artificial intelligence.

Translation: Tony Häfliger and Vivien Blandford

Artificial Intelligence

Rule-based AI: Where is the intelligence situated

16. April 2020 Hans Rudolf Straub

Two AI variants: rule-based and corpus-based

The two AI variants mentioned in previous blog posts are still topical today, and they have registered some remarkable successes. The two differ from each other not least in where precisely their intelligence is situated. Let’s first have a look at the rule-based system.

Structure of a rule-based system

In the Semfinder company, we used a rule-based system. I drew the following sketch of it in 1999:

Semantic interpretation system

Green: data
Yellow: software
Light blue: knowledge ware
Dark blue: knowledge engineer

The sketch consists of two rectangles, which represent different locations. The rectangle bottom left shows what happens in the hospital; the rectangle top right additionally shows what goes on in knowledge engineering.

In the hospital, our coding program reads the doctors’ free texts, interprets them and converts them into concept molecules, and allocates the relevant codes to them with the help of a knowledge base. The knowledge base contains the rules with which the texts are interpreted. In our company, these rules were drawn up by people (human experts). The rules are comparable to the algorithms of a software program, apart from the fact that they are written in a “higher” programming language to ensure that non-IT specialists, i.e. the domain experts, who in our case are doctors, are able to establish them easily and maintain them safely. For this purpose, they use the knowledge base editor, which enables them to view the rules, to test them, to modify them or to establish completely new ones.

Where, then, is the intelligence situated?

It is situated in the knowledge base – but it is not actually a genuine intelligence. The knowledge base is incapable of thinking on its own; it only carries out what a human being has instilled into it. I have therefore never described our system as intelligent. At the very least, intelligence means that new things can be learnt, but the knowledge base learns nothing. If a new word crops up or if a new coding aspect is integrated, then this is not done by the knowledge base but by the knowledge engineer, i.e. a human being. All the rest (hardware, software, knowledge base) only carry out what they have been prescribed to do by human beings. The intelligence in our system was always and exclusively a matter of human beings – i.e. a natural rather than an artificial intelligence.

Is this different in the corpus-based method? In the following post, we will therefore have a closer look at a corpus-based system.

Translation: Tony Häfliger and Vivien Blandford

The three innovations of rule-based AI

30. March 2020 Hans Rudolf Straub

Have the neural networks outpaced the rule-based systems?

It cannot be ignored: corpus-based AI has overtaken rule-based AI by far. Neural networks are making the running wherever we look. Is the competition dozing? Or are rule-based systems simply incapable of yielding equivalent results to those of neural networks?

My answer is that both methods are predisposed for performing very different functions as a matter of principle. A look at their respective modes of action makes clear what the two methods can usefully be employed for. Depending on the problem to be tackled, one or the other has an advantage.

Yet the impression remains: the rule-based variant seems to be on the losing side. Why is that?

In what dead end has rule-based AI got stuck?

In my view, rule-based AI is lagging behind because it is unwilling to cast off its inherited liabilities – although doing so would be so easy. It is a matter of

– acknowledging semantics as an autonomous field of knowledge,
– using complex concept architectures,
– integrating an open and flexible logic (NMR).

We have been doing this successfully for more than 20 years. What do the three points mean in detail?

Point 1: acknowledging semantics as an autonomous field of knowledge

Usually, semantics is considered to be part of linguistics. In principle, there would not be any objection to this, but linguistics harbours a trap for semantics which is hardly ever noticed: linguistics deals with words and sentences. The error consists in perceiving meaning, i.e. semantics, through the filter of language, and assuming that its elements have to be arranged in the same way as language does with words. Yet language is subject to one crucial limitation: it is linear, i.e. sequential – one letter follows another, one word comes after another. It is impossible to place words in parallel next to each other. When we are thinking, however, we are able to do so. And when we investigate the semantics of something, we have to do so in the way we think and not in the way we speak.

Thus we have to find such formalisms for the concepts as occur in thought. The limitation imposed by the linear sequence of the elements and the resulting necessity to reproduce compounds and complex relational structures with grammatical tricks in a makeshift way, and differently in every language – this structural limitation does not apply to thinking, and this results in structures on the side of semantics that are completely different from those on the side of language.

Word ≠ concept

What certainly fails to work is a simple “semantic” annotation of words. A word can have many and very different meanings. One meaning (= a concept) can be expressed with different words. If we want to analyse a text, we must not look at the individual words but always at the general context. Let’s take the word “head”. We may speak of the head of a letter or the head of a company. We can integrate the context into our concept by associating the concept of <head< with other concepts. Thus there is a <body part<head< and a <function<head<. The concept on the left (<body part<) then states the type of the concept on the right (<head<). We are thus engaged in typification. We look for the semantic type of a concept and place it in front of the subconcept.
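A typified concept can be represented as a small data structure. In the following sketch, the `Concept` class, the list of senses and the context hints are all invented for illustration; only the <type<concept< notation is taken from the text. The word “head” is resolved to one of its typified concepts by looking at the surrounding words.

```python
from typing import NamedTuple


class Concept(NamedTuple):
    semantic_type: str
    name: str

    def __str__(self) -> str:
        return f"<{self.semantic_type}<{self.name}<"


# Hypothetical senses and context hints; a real system would hold far more.
SENSES = {"head": [Concept("body part", "head"), Concept("function", "head")]}
CONTEXT_HINTS = {
    "body part": {"pain", "injury", "neck"},
    "function": {"company", "department"},
}


def interpret(word, context_words):
    """Pick the typified concept whose hints overlap most with the context."""
    context = set(context_words)
    scored = [
        (len(CONTEXT_HINTS[concept.semantic_type] & context), concept)
        for concept in SENSES.get(word, [])
    ]
    best_score, best = max(scored, key=lambda pair: pair[0], default=(0, None))
    return best if best_score > 0 else None


print(interpret("head", ["head", "of", "the", "company"]))
```

The same word thus maps to different concepts, and the type placed in front records which meaning was chosen.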

Consistently composite data elements

The use of typified concepts is nothing new. However, we go further and create extensive structured graphs, which then constitute the basis for our work. This is completely different from working with words. The concept molecules that we use are such graphs, and they possess a very special structure to ensure that they can be read easily and quickly by both people and machines. This composite representation has many advantages, among them the fact that combinatorial explosion is countered very simply and that the number of atomic concepts and rules can thus be drastically cut. Thanks to typification and the use of attributes, similar concepts can be refined at will, which means that by using molecules we are able to speak with a high degree of precision. In addition, the precision and transparency of the representation have very much to do with the fact that the special structure of the graphs (molecules) has been directly derived from the multifocal concept architecture (cf. Point 2).

Point 2: using complex concept architectures

Concepts are linked by means of relations in the graphs (molecules). The above-mentioned typification is such a relation: when the <head< is perceived as a <body part<, then it is of the <body part< type, and there is a very specific relation between <head< and <body part<, namely a so-called hierarchical or ‘is-a’ relation. It is called ‘is-a’ because in the case of hierarchical relations, we can always say ‘is a’, i.e. in our case: the <head< is a <body part<.

Typification is one of the two fundamental relations in semantics. We allocate a number of concepts to a superordinate concept, i.e. their type. Of course this type is again a concept and can therefore be typified again in turn. This results in hierarchical chains of ‘is-a’ relations with increasing specification, such as <object<furniture<table<kitchen table<. When we combine all the chains of concepts subordinate to a type, the result is a tree. This tree is the simplest of the four types of architecture used for an arrangement of concepts.
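Such chains are easy to model: if every concept stores its type, following the links upwards yields the chain, and all chains together form the tree. The dictionary below is a toy example built around the chain from the text; the extra entry for “chair” is added here for illustration.

```python
# Each concept points to its type (its parent in the tree). Toy data for illustration.
IS_A = {
    "kitchen table": "table",
    "table": "furniture",
    "chair": "furniture",
    "furniture": "object",
}


def type_chain(concept):
    """Follow the 'is-a' links upwards and return the chain from root to concept."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return list(reversed(chain))


def notation(concept):
    """Write the chain in the <type<concept< notation used in the text."""
    return "<" + "<".join(type_chain(concept)) + "<"


def is_a(concept, candidate_type):
    """True if candidate_type occurs above the concept in its chain."""
    return candidate_type in type_chain(concept)[:-1]


print(notation("kitchen table"))
```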

This tree structure is our starting point. However, we must acknowledge that a mere tree architecture has crucial disadvantages which preclude the establishment of semantics which are really precise. Those who are interested in the improved and more complex types of architecture and their advantages and disadvantages, will find a short description of the four types of architecture on the website of meditext.ch.

In the case of the concept molecules, we have geared the entire formalism, i.e. the intrinsic structure of the rules and molecules themselves, to the complex architectures. This has many advantages, for the concept molecules now have precisely the same structure as the axes of the multifocal concept architecture. The complex folds of the multifocal architecture can be conceived of as a terrain, with the dimensions or semantic degrees of freedom as complexly interlaced axes. The concept molecules now follow these axes with their own intrinsic structure. This is what makes computing with molecules so easy. It would not work like this with simple hierarchical trees or multidimensional systems. Nor would it work without consistently composite data elements whose intrinsic structure follows the ramifications of the complex architecture almost as a matter of course.

Point 3: integrating an open and flexible logic (NMR)

For theoretically biased scientists, this point is likely to be the toughest, for classic logic appears indispensable to most of them, and many bright minds are proud of their proficiency in it. Classic logic is indeed indispensable – but it has to be used in the right place. My experience shows me that we need another logic in NLP (Natural Language Processing), namely one that is not monotonic. Such non-monotonic reasoning (NMR) enables us to attain the same result with far fewer rules in the knowledge base. At the same time, maintenance is made easier. Also, it is possible for the system to be constantly developed further because it remains logically open. A logically open system may disquiet a mathematician, but experience shows that an NMR system works substantially better for the rule-based comprehension of the meaning of freely formulated text than a monotonic one.
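What “not monotonic” means can be shown with the textbook example of birds and penguins. This is a generic illustration of default reasoning, not the NMR calculus used in the system described here; the rules and the “most specific rule wins” strategy are assumptions of the sketch. In a monotonic logic, adding a fact can never withdraw an earlier conclusion; here it can.

```python
# Toy rules, from general to specific: (conditions, conclusion).
RULES = [
    ({"bird"}, "can fly"),                # default
    ({"bird", "penguin"}, "cannot fly"),  # more specific exception
]


def conclude(facts):
    """Apply the most specific rule whose conditions are all among the facts."""
    applicable = [
        (len(conditions), conclusion)
        for conditions, conclusion in RULES
        if conditions <= facts
    ]
    if not applicable:
        return None
    return max(applicable, key=lambda pair: pair[0])[1]


print(conclude({"bird"}))
print(conclude({"bird", "penguin"}))  # the new fact overrides the default
```

One default plus one exception replaces a long list of rules that would otherwise have to name every flying bird explicitly, which illustrates how such a logic can get by with fewer rules.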

Conclusion

Today, the rule-based systems appear to be lagging behind the corpus-based ones. This impression is deceptive, however, and derives from the fact that most rule-based systems have not yet succeeded in jumping ahead of themselves and becoming more modern. This is why they are either

– only applicable for clear tasks in a small and well-defined domain, or
– very rigid and therefore hardly employable, or
– dependent on an unrealistic use of resources and therefore unmaintainable.

If, however, we use consistently composite data elements and a higher degree of concept architectures, and if we deliberately refrain from monotonic conclusions, a rule-based system will enable us to get further than a corpus-based one – for the appropriate tasks.

Rule-based and corpus-based systems differ a great deal from each other, and depending on the task in hand, one or the other has the edge. I will deal with this in a later post.

The next post will deal with the current distribution of the two AI methods.

Translation: Tony Häfliger and Vivien Blandford

Specification of the challenges for rule-based AI

30. March 2020 Hans Rudolf Straub

Rule-based AI is lagging behind

The distinction between rule-based AI and corpus-based AI makes sense in several respects since the two systems work in completely different ways. This does not only mean that their challenges are completely different, it also means that as a consequence, their development trajectories are not parallel in terms of time.

In my view, the only reason for this is that rule-based AI has reached a dead end from which it will only be able to extricate itself once it has correctly identified its challenges. This is why these challenges will be described in more detail below.

Overview of the challenges

In the preceding post, I listed four challenges for rule-based AI. Basically, the first two cannot be remedied. First, it takes experts to draw up the rules, and these must be experts both in abstract logic and in the specialist field concerned; there is not much that can be changed about this. Second, finding such experts will remain a problem.

The situation is better for challenges three and four, namely the large number of rules required, and their complexity. Although it is precisely these two that represent seemingly unalterable obstacles of considerable size, the necessary insights may well take the edge off them. However, both challenges must be tackled consistently, and this means that we will have to jettison some cherished old habits and patterns of thought. Let’s have a closer look at this.

The rules require a space and a calculus

Rule-based AI consists of two things:

– rules which describe a domain (specialist field) in a certain format, and
– an algorithm which determines which rules are executed at what time.

In order to build the rules, we require a space which specifies the elements which the rules may consist of and thus the very nature of the statements that can be made within the system. Such a space does not exist of its own accord but has to be deliberately created. Secondly, we require a calculus, i.e. an algorithm which determines how the rules thus established are applied. Of course, both the space and the calculus can be created in completely different ways, and these differences “make the difference”, i.e. they enable a crucial improvement of rule-based AI, albeit at the price of jettisoning some cherished old habits.

Three innovations

In the 1990s, we therefore invested in both the fundamental configuration of the concept space and the calculus. We established our rule-based system on the basis of the following three innovations:

– data elements: we consistently use composite data elements (concept molecules);
– space: we arrange concepts in a multidimensional-multifocal architecture;
– calculus: we rely on non-monotonic reasoning (NMR).

These three elements interact and enable us to capture a greater number of situations more accurately with fewer data elements and rules. The multifocal architecture enables us to create better models, i.e. models which are more appropriate to their situations and contain more details. Since the number of elements and rules decreases at the same time, we succeed in going beyond the boundaries which previously constrained rule-based systems with regard to extent, precision and maintainability.

In the next post, we will investigate how the three above-mentioned innovations work.

Translation: Tony Häfliger and Vivien Blandford

AI: Vodka and tanks

16. March 2020 Hans Rudolf Straub

AI in the last century

AI is a big buzzword today but was already of interest to me in my field of natural language processing in the 1980s and 1990s. At that time, there were two methods which were occasionally labelled AI, but they could not have been more different from each other. The exciting thing is that these two different methods still exist today and continue to be essentially different from each other.

AI-1: vodka

The first method, i.e. the one already used by the very first computer pioneers, was purely algorithmic, i.e. rule-based. Aristotle’s syllogisms are a paradigm of this type of rule-based system:

Premise 1: All human beings are mortal.
Premise 2: Socrates is a human being.
Conclusion: Socrates is mortal.

The expert posits premises 1 and 2, the system then draws the conclusion autonomously. Such systems can be underpinned mathematically. Set theory and first-order logic are often regarded as a safe mathematical basis. Theoretically, such systems were thus waterproof. In practice, however, things looked somewhat different. Problems were caused by the fact that even the smallest details had to be included in the rule system; if they were not, the whole system would “crash”, i.e. draw completely absurd conclusions. The correction of these details increased disproportionately to the extent of the knowledge that was covered. At best, the systems worked for small special fields for which clear-cut rules could be found; when it came to wider fields, however, the rule bases were too large and were no longer maintainable. A further serious problem was the fuzziness which is peculiar to many expressions and which is difficult to grasp with such hard-coded systems.
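The syllogism itself can be written as a tiny program. The representation chosen here (facts as pairs, rules as “every X is a Y”) is just one simple possibility and is not taken from any historical system: the expert supplies the premises, and the loop draws the conclusion.

```python
# Premise 1 as a rule: everything that is a human being is mortal.
RULES = [("human being", "mortal")]

# Premise 2 as a fact: Socrates is a human being.
FACTS = {("Socrates", "human being")}


def draw_conclusions(facts, rules):
    """Apply every rule to every fact until nothing new can be concluded."""
    known = set(facts)
    while True:
        new = {
            (individual, conclusion)
            for individual, category in known
            for premise, conclusion in rules
            if category == premise
        } - known
        if not new:
            return known
        known |= new


conclusions = draw_conclusions(FACTS, RULES)
print(("Socrates", "mortal") in conclusions)
```

The sketch also shows the weakness described above: the program knows nothing that has not been entered as a rule or fact, so any missing detail is simply absent from its conclusions.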

Thus this type of AI came in for increasing criticism. The following translation attempt may serve as an example of why this was the case. An NLP program translated sentences from English into Russian and then back again. The input of the biblical passage “The spirit is willing but the flesh is weak.” resulted in the retranslation “The vodka is good but the meat is rotten.”

This story may or may not have happened precisely like this, but it demonstrates the difficulties encountered in attempts to capture language with rule-based systems. The initial euphoria associated with the “electronic brain” and “machine intelligence” since the 1950s fizzled out; the expression “artificial intelligence” became obsolete and was replaced by the term “expert system”, which sounded less pretentious.

Later, in about 2000, the stalwarts of rule-based AI were buoyed up again, however. Tim Berners-Lee, the pioneer of the WWW, launched the Semantic Web initiative with the purpose of improving the usability of the internet. The experts of rule-based AI, who had been educated at the world’s best universities, were ready and willing to establish knowledge bases for him, which they now called ontologies. With all due respect to Berners-Lee and his efforts to introduce semantics to the net, it must be said that after almost 20 years, the Semantic Web initiative has not substantially changed the internet. In my view, there are good reasons for this: the methods of classic mathematical logic are too rigid to map the complex processes of thinking – more about this in other posts, particularly on static and dynamic logic. At any rate, both the classic rule-based expert systems of the 20th century and the Semantic Web initiative have fallen short of the high expectations.

AI-2: tanks

However, there were alternatives which tried to correct the weaknesses of rigid propositional logic as early as the 1990s. For this purpose, the mathematical toolkit was extended.

Such an attempt was fuzzy logic. A statement or a conclusion was now no longer unequivocally true or false; rather, its veracity could be weighted. Besides set theory and predicate logic, probability calculus was now also included in the mathematical toolkit of the expert systems. Yet some problems remained: again, there had to be precise and elaborate descriptions of the rules that were applicable. Thus fuzzy logic was also part of rule-based AI, even though it was equipped with probabilities. Today, such programs work perfectly well in small, well-demarcated technical niches, beyond which they are insignificant.
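A weighted truth value can be sketched in a few lines. The membership function and its breakpoints (15 and 25 degrees) are invented for this example; the minimum, maximum and complement operators are the commonly used standard choices for fuzzy AND, OR and NOT.

```python
def warm(temp_c: float) -> float:
    """Degree (0.0 to 1.0) to which a temperature counts as 'warm'.
    The breakpoints 15 and 25 are invented for this sketch."""
    if temp_c <= 15:
        return 0.0
    if temp_c >= 25:
        return 1.0
    return (temp_c - 15) / 10


def fuzzy_and(a: float, b: float) -> float:
    return min(a, b)


def fuzzy_or(a: float, b: float) -> float:
    return max(a, b)


def fuzzy_not(a: float) -> float:
    return 1.0 - a


print(warm(20), fuzzy_and(warm(20), 0.8), fuzzy_not(warm(30)))
```

Note that the rules still have to be written by hand: someone must decide where “warm” begins and ends, which is why fuzzy logic remains a rule-based method.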

At that time, another alternative was constituted by the neural networks. They were considered to be interesting; however, their practical applications tended to attract some derision. To illustrate this, the following anecdote was bandied about:

The US Army – which has been an essential driver of computer technology all along – is supposed to have set up a neural network for the identification of US and foreign tanks. A neural network operates in such a way that the final conclusions are found through several layers of conclusions by the system itself. People need not input any rules any longer; they are generated by the system itself.

How is the system able to do this? It requires a learning corpus for this purpose. In the case of tank recognition, this consisted of a series of photographs of American and Russian tanks. Thus it was known for every photograph whether the tank was American or Russian, and the system was trained until it was capable of generating the required categorisation itself. The experts only exerted an indirect influence on the program in that they established the learning corpus; the program compiled the conclusions in the neural network autonomously, without the experts knowing precisely what rules the system used to draw which conclusions from which details. Only the result had to be correct, of course. Now, once the system had completely integrated the learning corpus, it could be tested by being shown a new input, for instance a new tank photo, and it was expected to categorise the new image correctly on the basis of the rules it had found in the learning corpus. As mentioned before, this categorisation was conducted by the system on its own, without the experts exerting any further influence and without them knowing how conclusions were drawn in a specific case.

It was said that this worked perfectly with regard to tank recognition. No matter how many photos were shown to the program, the categorisation was always spot on. The experts could hardly believe that they had really created a program with a 100% identification rate. How could this be? Ultimately, they discovered the reason: the photos of the American tanks were in colour, those of the Russian tanks were in black and white. Thus the program only had to recognise the colour; the contours of the tanks were irrelevant.
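The trap can be reproduced with a deliberately simple learner that picks whichever single feature best separates the training photos. The features (`in_colour`, `angular_turret`) and all data are invented for this sketch, and the learner is far simpler than a neural network; the mechanism, however, is the same: the system finds the easiest regularity in the corpus, whether or not it is the one intended.

```python
# Invented training corpus: every US photo is in colour, every foreign one is not.
training = [
    ({"in_colour": 1, "angular_turret": 1}, "US"),
    ({"in_colour": 1, "angular_turret": 0}, "US"),
    ({"in_colour": 1, "angular_turret": 1}, "US"),
    ({"in_colour": 0, "angular_turret": 0}, "foreign"),
    ({"in_colour": 0, "angular_turret": 1}, "foreign"),
    ({"in_colour": 0, "angular_turret": 0}, "foreign"),
]


def classify(feature, photo):
    """Rule learned from one feature: feature present means 'US'."""
    return "US" if photo[feature] == 1 else "foreign"


def accuracy(feature, data):
    hits = sum(classify(feature, photo) == label for photo, label in data)
    return hits / len(data)


def learn(data):
    """Pick the single feature that classifies the training corpus best."""
    features = data[0][0].keys()
    return max(features, key=lambda feature: accuracy(feature, data))


learned = learn(training)
new_photo = {"in_colour": 0, "angular_turret": 1}  # a US tank, black and white
print(learned, classify(learned, new_photo))
```

On the training corpus the colour feature is flawless, so it is chosen; on the new black-and-white photo of a US tank, the same rule gives the wrong answer.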

Rule-based vs corpus-based

The two anecdotes show what problems were lying in wait for rule-based and corpus-based AI at the time.

In the case of rule-based AI (vodka), they were
– the rigidity of mathematical logic,
– the fuzziness of our words,
– the necessity to establish very large knowledge bases,
– the necessity to use specialist experts for the knowledge bases.
In the case of corpus-based AI (tanks), they were
– the lack of transparency of the paths along which conclusions were drawn,
– the necessity to establish a very large and correct learning corpus.

I hope that the two examples above, which are admittedly somewhat unfair, have conveyed the character and mode of operation of the two AI types, including the weaknesses which characterise each type.

Needless to say, the challenges persist. In the following posts, I will show how the two AI types have responded to them and where the intelligence now really resides in the two systems. To begin with, we’ll have a look at corpus-based AI.

Translation: Tony Häfliger and Vivien Blandford