Category Archives: Logic

Specification of the challenges for rule-based AI

30. March 2020 Hans Rudolf Straub

Rule-based AI is lagging behind

The distinction between rule-based AI and corpus-based AI makes sense in several respects, since the two systems work in completely different ways. Not only are their challenges completely different; as a consequence, their development has also not progressed in parallel over time.

In my view, the only reason for this is that rule-based AI has reached a dead end from which it will only be able to extricate itself once it has correctly identified its challenges. This is why these challenges will be described in more detail below.

Overview of the challenges

In the preceding post, I listed four challenges for rule-based AI. Basically, the first two cannot be remedied. The first is that it takes experts to draw up the rules, and these must be experts both in abstract logic and in the specialist field concerned; there is not much that can be changed about this. The second challenge will also remain: such experts will continue to be hard to find.

The situation is better for challenges three and four, namely the large number of rules required, and their complexity. Although it is precisely these two that represent seemingly unalterable obstacles of considerable size, the necessary insights may well take the edge off them. However, both challenges must be tackled consistently, and this means that we will have to jettison some cherished old habits and patterns of thought. Let’s have a closer look at this.

The rules require a space and a calculus

Rule-based AI consists of two things:

rules which describe a domain (specialist field) in a certain format, and
an algorithm which determines which rules are executed at what time.

In order to build the rules, we require a space which specifies the elements which the rules may consist of and thus the very nature of the statements that can be made within the system. Such a space does not exist of its own accord but has to be deliberately created. Secondly, we require a calculus, i.e. an algorithm which determines how the rules thus established are applied. Of course, both the space and the calculus can be created in completely different ways, and these differences “make the difference”, i.e. they enable a crucial improvement of rule-based AI, albeit at the price of jettisoning some cherished old habits.
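The division of labour between space, rules and calculus can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the concept names, the rule format (a set of conditions plus one conclusion) and the two calculi are hypothetical and are not the system described in this post. The sketch only shows that the same rules yield different results under different calculi, and that the space limits what a rule may say.

```python
# Hypothetical concept space: the only elements rules may mention.
SPACE = {"fever", "cough", "infection", "needs_rest"}

# Rules in one fixed format: (set of conditions, conclusion).
RULES = [
    ({"infection"}, "needs_rest"),
    ({"fever", "cough"}, "infection"),
]


def check_rules(rules, space):
    """Reject any rule that uses an element outside the space."""
    for conditions, conclusion in rules:
        unknown = (conditions | {conclusion}) - space
        if unknown:
            raise ValueError(f"outside the space: {sorted(unknown)}")


def single_pass(facts, rules):
    """Calculus A: try each rule exactly once, in the order written."""
    known = set(facts)
    for conditions, conclusion in rules:
        if conditions <= known:
            known.add(conclusion)
    return known


def to_fixpoint(facts, rules):
    """Calculus B: keep applying rules until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


check_rules(RULES, SPACE)
facts = {"fever", "cough"}
result_a = single_pass(facts, RULES)   # derives "infection" only
result_b = to_fixpoint(facts, RULES)   # also derives "needs_rest"
```

With identical rules and facts, calculus A stops before "needs_rest" is reached because the rule that needs "infection" was tried too early, whereas calculus B finds it on a second pass.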

Three innovations

In the 1990s, we therefore invested in both the fundamental configuration of the concept space and the calculus. We established our rule-based system on the basis of the following three innovations:

data elements: we consistently use composite data elements (concept molecules);
space: we arrange concepts in a multidimensional-multifocal architecture;
calculus: we rely on non-monotonic reasoning (NMR).

These three elements interact and enable us to capture a greater number of situations more accurately with fewer data elements and rules. The multifocal architecture enables us to create better models, i.e. models which are more appropriate to their situations and contain more details. Since the number of elements and rules decreases at the same time, we succeed in going beyond the boundaries which previously constrained rule-based systems with regard to extent, precision and maintainability.

In the next post, we will investigate how the three above-mentioned innovations work.

This is a post about artificial intelligence.

Translation: Tony Häfliger and Vivien Blandford

The challenges for rule-based AI

19. March 2020 Hans Rudolf Straub

Rule-based in comparison with corpus-based

Corpus-based AI (the “Tanks” type; cf. introductory AI post) successfully overcame its weaknesses (cf. preceding post). This was the result of a combination of “brute force” (improved hardware) and an ideal window of opportunity: during the super-hot phase of internet expansion, companies such as Google, Amazon, Facebook and many others were able to collect large volumes of data and feed their data corpora with them. A sufficiently big data corpus is the linchpin of corpus-based AI.

Brute force was not enough for rule-based AI, however, nor was there any point in collecting lots of data, since data also have to be organised for rule construction – and largely manually at that, i.e. by human expert specialists.

Challenge 1: different mentalities

Not everyone is equally fascinated by the process of building algorithms. Building algorithms requires a particular faculty of abstraction combined with a very meticulous vein – with regard to abstractions, at any rate. No matter how small an error in the rule construction may be, it will inevitably have an impact. Mathematicians possess the consistently meticulous mentality that is called for here, but natural scientists and engineers are also favourably characterised by it. Of course, accountants must also be meticulous, but AI rule construction additionally requires creativity.

Salespersons, artists and doctors, however, work in a different field. Abstractions are often incidental; the importance lies in what is tangible and specific. Empathy for other people can also be very important, or someone has to be able to act with speed and precision, as is the case with surgeons. These characteristics are all very valuable, but they are less relevant to algorithm construction.

This is a problem for rule-based AI because rule construction requires the skills of one camp and the knowledge of the other: it requires the mentality that makes a good algorithm designer combined with the way of thinking and the knowledge of the specialist field to which the rules refer. Such combinations of specialist knowledge with a talent for abstraction are rare. In the hospitals in which I worked, both cultures were quite clearly visible in their separateness: on the one hand the doctors, who at best accepted computers for invoicing or certain expensive technical devices but had a low opinion of information technology in general, and on the other hand the computer scientists, who did not have a clue about what the doctors did and talked about. The two camps simply avoided each other most of the time. Needless to say, it was not surprising that the expert systems designed for medical purposes only worked for very small specialist fields – if they had progressed beyond the experimentation stage at all.

Challenge 2: where can I find the experts?

Experts who are creative and equally at home in both mentality camps are obviously hard to find. This is aggravated by the fact that there are no training facilities for such experts. Equally realistic are the following questions: where are the instructors who are conversant with the current challenges? Which diplomas are valid for what? And how can an investor in this new field evaluate whether the experts employed are fit for purpose and the project is moving in the right direction?

Challenge 3: the sheer volume of detailed rules required

The fact that a large volume of detailed knowledge is required to be able to draw meaningful conclusions in a real situation was already a challenge for corpus-based AI. After all, it was only with really large corpora, i.e. thanks to the internet and a boost in computer performance, that it succeeded in gathering the huge volume of detailed knowledge which is one of the fundamental prerequisites for every realistic expert system.

For rule-based AI, however, providing this knowledge is particularly difficult, since people have to package it manually into computer-processable rules. This is very time-consuming work, which additionally requires hard-to-find specialist experts who are able to meet the above-mentioned challenges 1 and 2.

In this situation, the question arises as to how larger-scale rule systems which actually work can be built at all. Could there be any possibilities for simplifying the construction of such rule systems?

Challenge 4: complexity

Anyone who has ever tried to really underpin a specialist field with rules discovers that they quickly encounter complex questions to which they find no solutions in the literature. In my field of Natural Language Processing (NLP), this is obvious. The complexity cannot be overlooked here, which is why it is imperative to deal with it. In other words: the principle of hope is not adequate to the task; rather, the complexity must be made the subject of debate and be studied intensively.

What complexity means and how it can be countered will be the subject matter of a further post. Of course, complexity must not result in an excessive increase in rules (cf. challenge 3). The question which therefore arises for rule-based AI is: how can we build a rule system which takes into consideration the wealth of details and complexity while still remaining simple and manageable?

The good news is: there are definitely answers to this question.

In a following post, the challenges will be specified.


Translation: Tony Häfliger and Vivien Blandford

Artificial Intelligence

Corpus-based AI overcomes its weaknesses

19. March 2020 Hans Rudolf Straub

Two AI variants: rule-based and corpus-based

In the preceding post, I mentioned the two fundamental approaches to attempting to imbue computers with intelligence, namely the rule-based approach and the corpus-based approach. In a rule-based system, the intelligence is situated in a rule pool that is deliberately designed by people. In the corpus-based method, the knowledge is contained in the corpus, i.e. in a data collection which is analysed by a sophisticated program.

The performance of both methods has been massively boosted since the 1990s. The most impressive boost has been achieved with the corpus-based method, which is now regarded as the artificial intelligence proper and is making headlines across the board today. What, then, are the crucial improvements of the two methods? To begin with, we’ll have a look at how corpus-based AI works.

How does corpus-based AI work?

Corpus-based AI (c-AI) consists of two parts:

the corpus,
algorithms (neural network).

Fig. 1: Structure of a corpus-based AI system

The corpus, which is also called the learning corpus, is a collection of data. This can consist of photographs of tanks or faces, but also of collections of search queries, for instance those submitted to Google. What is important is that the corpus already contains the data in an assessed (labelled) form. In the tank example, it has been written into the corpus whether the tanks are friendly or hostile. The collection of faces contains information about the owners of those faces. In the case of the search queries, Google records the links that a searcher clicks, i.e. which suggestion offered by Google is successful. Thus the learning corpus contains knowledge which the corpus-based AI is going to use.

Now the c-AI has to learn. The aim is for the AI to be able to categorise a new tank image, a new face or a new query correctly. For this purpose, the c-AI makes use of the knowledge in the corpus, i.e. the pictures of the tank collection, where it is noted for each image whether the tank is ours or foreign – as represented in Fig. 1.

Now the second component of the c-AI comes into play: the algorithm. Essentially, this is a neural network. It consists of several layers of “neurons” which pick up the input signals, process them and then transmit their own signals to the next higher level. Fig. 1 shows how the first (yellow) neuron layer picks up the signals (pixels) from the image and, after processing them, forwards its own signals to the next (orange) layer until finally, the network arrives at the result of “our tank” or “foreign tank”.

When the neural network is now shown a new image that has not been assessed yet, the process is precisely the same as with the other pictures. If the network has been trained well, the program should be able to categorise on its own, i.e. the neural network should be able to discern whether the tank is ours or someone else’s.
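The learning step can be sketched in a few lines. This is a deliberately reduced illustration, not the multi-layer network of Fig. 1: it uses a single artificial neuron (a perceptron), and the feature values that stand in for images are made up. All names in the code are hypothetical.

```python
# Hypothetical learning corpus: each "image" is reduced to two made-up
# feature values; the label is 1 for "our tank" and 0 for "foreign tank".
CORPUS = [
    ((0.9, 0.2), 1), ((0.8, 0.1), 1), ((0.7, 0.3), 1),
    ((0.2, 0.9), 0), ((0.1, 0.7), 0), ((0.3, 0.8), 0),
]


def predict(weights, bias, features):
    """A single artificial neuron: weighted sum followed by a threshold."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train(corpus, epochs=20, rate=0.1):
    """Perceptron rule: nudge the weights whenever a corpus item is misjudged."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in corpus:
            error = label - predict(weights, bias, features)
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias


weights, bias = train(CORPUS)
new_image = (0.85, 0.15)               # not part of the corpus
verdict = predict(weights, bias, new_image)
```

The labels in the corpus are the only source of knowledge here: the training loop merely adjusts numbers until the neuron reproduces them, and the new image is then judged with those same numbers.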

Fig. 2: Search query to the neural network about an unclassified tank

The significance of the data corpus for corpus-based AI

A corpus-based AI finds its detailed knowledge in the corpus that has been specially compiled for it and evaluates the connections which it discovers there. The corpus therefore contains the knowledge which the c-AI evaluates. In our example, the knowledge consists in the connection of the photograph, i.e. a set of wildly arranged pixels, with a simple binary piece of information (our tank/foreign tank). This knowledge is already part of the corpus before the algorithms conduct an evaluation. The algorithms of the c-AI thus do not detect anything that is not already in the corpus. However, the c-AI is now also able to apply the knowledge found in the corpus to new, unassessed cases.

The challenges for corpus-based AI

The challenges for c-AI are unequivocal:

Corpus size: the more images there are in the corpus, the higher the certainty of the categorisation. A corpus that is too small will result in faulty results. The size of the corpus is crucial for the precision and reliability of the results.
Hardware: the processing power required by a c-AI is very high and becomes higher the more precise the method is intended to be. Hardware performance is the decisive factor for the practical applicability of the method.

This quickly clarifies how c-AI has been able to improve its performance so impressively in the last two decades:

The data volumes which Google and other organisations are capable of collecting on the internet have increased drastically. In this respect, Google profits from quite an important amplification effect: the more queries Google receives, the better the corpus and thus its hit rate. The better the hit rate, the more queries Google will receive.
The hardware that is required to evaluate the data is becoming less expensive and more powerful. Today, internet companies and other organisations operate huge server farms, without which the processor-intensive evaluations would not be possible in the first place.

Besides the corpus and the hardware, the sophistication of the algorithms naturally also plays a part. However, the algorithms were not bad even decades ago. In comparison with the other two factors – hardware and corpus – the progress made in the field of algorithms only plays a modest part in the impressive success of c-AI.

The success of corpus-based AI

The big corporations and organisations tackled the challenges for c-AI extremely successfully.

The above description of the operating mode of c-AI, however, should also reveal the weaknesses immanent in the system, which are accorded less media attention. I will discuss them in more detail in a later post.

Next we will have a look at the challenges for rule-based AI.


Translation: Tony Häfliger and Vivien Blandford

Artificial Intelligence

AI: Vodka and tanks

16. March 2020 Hans Rudolf Straub

AI in the last century

AI is a big buzzword today but was already of interest to me in my field of natural language processing in the 1980s and 1990s. At that time, there were two methods which were occasionally labelled AI, but they could not have been more different from each other. The exciting thing is that these two different methods still exist today and continue to be essentially different from each other.

AI-1: vodka

The first method, i.e. the one already used by the very first computer pioneers, was purely algorithmic, i.e. rule-based. Aristotle’s syllogisms are a paradigm of this type of rule-based system:

Premise 1: All human beings are mortal.
Premise 2: Socrates is a human being.
Conclusion: Socrates is mortal.

The expert posits premises 1 and 2; the system then draws the conclusion autonomously. Such systems can be underpinned mathematically. Set theory and first-order logic are often regarded as a safe mathematical basis. Theoretically, such systems were thus watertight. In practice, however, things looked somewhat different. Problems were caused by the fact that even the smallest details had to be included in the rule system; if they were not, the whole system would “crash”, i.e. draw completely absurd conclusions. The effort needed to correct these details grew disproportionately to the extent of the knowledge that was covered. At best, the systems worked for small special fields for which clear-cut rules could be found; when it came to wider fields, however, the rule bases were too large and were no longer maintainable. A further serious problem was the fuzziness which is peculiar to many expressions and which is difficult to grasp with such hard-coded systems.

Thus this type of AI came in for increasing criticism. The following translation attempt may serve as an example of why this was the case. An NLP program translated sentences from English into Russian and then back again. The input of the biblical passage “The spirit is willing but the flesh is weak.” resulted in the retranslation “The vodka is good but the meat is rotten.”

This story may or may not have happened precisely like this, but it demonstrates the difficulties encountered in attempts to capture language with rule-based systems. The initial euphoria associated with the “electronic brain” and “machine intelligence” since the 1950s fizzled out; the expression “artificial intelligence” became obsolete and was replaced by the term “expert system”, which sounded less pretentious.

Later, in about 2000, the stalwarts of rule-based AI were buoyed up again, however. Tim Berners-Lee, the pioneer of the WWW, launched the Semantic Web initiative with the purpose of improving the usability of the internet. The experts of rule-based AI, who had been educated at the world’s best universities, were ready and willing to establish knowledge bases for him, which they now called ontologies. With all due respect to Berners-Lee and his efforts to introduce semantics to the net, it must be said that after almost 20 years, the Semantic Web initiative has not substantially changed the internet. In my view, there are good reasons for this: the methods of classic mathematical logic are too rigid to map the complex processes of thinking – more about this in other posts, particularly on static and dynamic logic. At any rate, both the classic rule-based expert systems of the 20th century and the Semantic Web initiative have fallen short of the high expectations.

AI-2: tanks

However, there were alternatives which tried to correct the weaknesses of rigid propositional logic as early as the 1990s. For this purpose, the mathematical toolkit was extended.

One such attempt was fuzzy logic. A statement or a conclusion was now no longer unequivocally true or false; rather, its veracity could be weighted. Besides set theory and predicate logic, probability calculus was now also included in the mathematical toolkit of the expert systems. Yet some problems remained: again, there had to be precise and elaborate descriptions of the rules that were applicable. Thus fuzzy logic was also part of rule-based AI, even though it was equipped with probabilities. Today, such programs work perfectly well in small, well-demarcated technical niches, beyond which they are insignificant.

At that time, neural networks constituted another alternative. They were considered to be interesting; however, their practical applications tended to attract some derision. To illustrate this, the following anecdote was bandied about:

The US Army – which has been an essential driver of computer technology all along – is supposed to have set up a neural network for the identification of US and foreign tanks. A neural network operates in such a way that the final conclusions are found through several layers of conclusions by the system itself. People need not input any rules any longer; they are generated by the system itself.

How is the system able to do this? It requires a learning corpus for this purpose. In the case of tank recognition, this consisted of a series of photographs of American and Russian tanks. Thus it was known for every photograph whether the tank was American or Russian, and the system was trained until it was capable of generating the required categorisation itself. The experts only exerted an indirect influence on the program in that they established the learning corpus; the program compiled the conclusions in the neural network autonomously – without the experts knowing precisely what rules the system used to draw which conclusions from which details. Only the result had to be correct, of course. Now, once the system had completely integrated the learning corpus, it could be tested by being shown a new input, for instance a new tank photo, and it was expected to categorise the new image correctly on the basis of the rules it had found in the learning corpus. As mentioned before, this categorisation was conducted by the system on its own, without the experts exerting any further influence and without them knowing how conclusions were drawn in a specific case.

It was said that this worked perfectly with regard to tank recognition. No matter how many photos were shown to the program, the categorisation was always spot on. The experts could hardly believe that they had really created a program with a 100% identification rate. How could this be? Ultimately, they discovered the reason: the photos of the American tanks were in colour, those of the Russian tanks were in black and white. Thus the program only had to recognise the colour; the contours of the tanks were irrelevant.
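The effect in the anecdote can be reproduced with a toy learner. The sketch below is an assumption-laden stand-in, not a neural network: it simply picks the single feature that best separates the learning corpus. The feature names and the corpus are invented for illustration.

```python
# Hypothetical learning corpus: (features, label). All American photos
# happen to be in colour, all Russian ones in black and white.
CORPUS = [
    ({"in_colour": True, "long_barrel": True, "round_turret": False}, "American"),
    ({"in_colour": True, "long_barrel": False, "round_turret": False}, "American"),
    ({"in_colour": True, "long_barrel": True, "round_turret": True}, "American"),
    ({"in_colour": False, "long_barrel": True, "round_turret": True}, "Russian"),
    ({"in_colour": False, "long_barrel": False, "round_turret": True}, "Russian"),
    ({"in_colour": False, "long_barrel": True, "round_turret": False}, "Russian"),
]


def accuracy(feature, corpus):
    """Share of corpus items for which 'feature is True' means American."""
    hits = sum(features[feature] == (label == "American")
               for features, label in corpus)
    return hits / len(corpus)


def train(corpus):
    """Pick the one feature that separates the corpus best."""
    candidates = corpus[0][0].keys()
    return max(candidates, key=lambda feature: accuracy(feature, corpus))


def classify(feature, photo):
    return "American" if photo[feature] else "Russian"


chosen = train(CORPUS)

# A black-and-white photo of an American tank, not in the corpus.
new_photo = {"in_colour": False, "long_barrel": True, "round_turret": False}
verdict = classify(chosen, new_photo)
```

The learner scores perfectly on its own corpus because it has latched onto the photo colour, and for exactly that reason it misjudges the new photo.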

Rule-based vs corpus-based

The two anecdotes show what problems were lying in wait for rule-based and corpus-based AI at the time.

In the case of rule-based AI (vodka), they were
– the rigidity of mathematical logic,
– the fuzziness of our words,
– the necessity to establish very large knowledge bases,
– the necessity to use specialist experts for the knowledge bases.
In the case of corpus-based AI (tanks), they were
– the lack of transparency of the paths along which conclusions were drawn,
– the necessity to establish a very large and correct learning corpus.

I hope that the two examples above, which admittedly are somewhat unfair, have allowed me to describe the characters and modes of operation of the two AI types, including the weaknesses which characterise each type.

Needless to say, the challenges persist. In the following posts I will show how the two AI types have responded to them and where the intelligence now really resides in the two systems. To begin with, we’ll have a look at corpus-based AI.

This is a blog post about artificial intelligence.

Translation: Tony Häfliger and Vivien Blandford

Logic

Logodynamics

7. December 2019 Hans Rudolf Straub

What is logic for?

Is logic about thinking? I used to think so, believing that logic was something like the ‘doctrine of thinking’, or even the ‘doctrine of correct thinking’. A closer look, however, reveals that what we call logic, and the field of study that goes by this name, is about proving rather than thinking. Classical logic is in fact the science of the proof.

But there’s a lot more to thinking than proving. If you want to prove something, first you have to find the proofs. Then you have to assess these proofs in context – a context that can change. And what do you do about contradictions? I believe it is the job of logic to investigate the question of how we think in a more general sense. It should be more than just a science of proof. But how do we arrive at such an extended version of logic?

The decisive step for me was the realisation that there are two types of logic: one static and one dynamic. Only when we dare to leave the safe garden of static logic can we begin to examine real thinking.

Classical logic = logostatics

Classical logic shaped Western intellectual life for more than two millennia – from the syllogisms of Aristotle to the scholasticism of the Middle Ages including the teachings of Thomas Aquinas, to the first order logic (FOL) of mathematicians, which represents the widely accepted state of the art today. These systems of logic are truly static. Every statement within them has a generally valid, absolute truth value; the statement is either true or false – and that must not change. In other words: the logical building is static. Mathematicians call such logic monotonic.

Logodynamics

Although contradictions cannot be tolerated in a classical system of logic, in a dynamic one they make up crucial elements in the network of statements. It’s the same in our own minds, where contradictions are nothing more than starting points for our thinking. After all, contradictions, e.g. observations that are incompatible with one another, force us to take a closer look. If statements are contradictory, it makes us want to reflect on where the truth lies. Contradictions, forbidden in classical logic, are actually the starting point for thinking in dynamic logic. Just as in physics, where an electric voltage supplies the energy that allows current to flow, in logic a contradiction provides the tension that drives us to carry on thinking.

But continuing to think also means always being open to completely new statements. This is another way that logodynamics differs from classical logic. The classical system first defines its ‘world’, i.e. all the elements that may be used subsequently, or indeed at all. The system must be closed. Classical logic requires a clear demarcation (definition) of the world of a system of statements (both true and false) before any conclusions can be drawn in this closed world of statements. By contrast, our thinking is by no means closed. We can always include new objects, test new differentiations for known objects, find new reasons and re-evaluate existing ones. In other words: we can learn. Therefore, a system of logic that approximates the way people think must always be open.

In a classical system of logic, time does not exist. Everything that is true is always true. The situation is very different in a logodynamic system. What is considered true today may be recognised as an error tomorrow. Without this possibility there is no learning. The logodynamic system recognises time as a necessary and internal element. This fundamentally changes the logical mechanism, the ‘basic switch’ of logic, namely the IF-THEN. The IF-THEN of dynamic logic always has a time element to it – the IF always comes before the THEN. A static system could, at most, recognise time as an object for consideration, along the lines of one of its variables, but not as something that plays a role in its own functioning.

Thus, a logodynamic system has the following three properties that differentiate it from a logostatic one:

Non-monotony: contradictions in the system are allowed.
Openness: new elements can appear in the system at any time.
System-internal time: time passes between IF and THEN.

(Translation: R. Waddington)

Information, Logic

Is ‘IF-THEN’ static or dynamic?

8. July 2019 Hans Rudolf Straub

IF-THEN and Time

It’s a commonly held belief that there’s nothing complicated about the idea of IF-THEN from the field of logic. However, I believe this overlooks the fact that there are actually two variants of IF-THEN that differ depending on whether the IF-THEN in question possesses an internal time element.

Dynamic (real) IF-THEN

For many of us, it’s self-evident that the IF-THEN is dynamic and has a significant time element. Before we can get to our conclusion – the THEN – we closely examine the IF – the condition that permits the conclusion. In other words, the condition is considered FIRST, and only THEN is the conclusion reached.

This is the case not only in human thinking, but also in computer programs. Computers allow lengthy and complex conditions (IFs) to be checked. These must be read from the computer’s memory by its processor. It may be necessary to perform even smaller calculations contained in the IF statements and then compare the results of the calculations with the set IF conditions. These queries naturally take time. Even though the computer may be very fast and the time needed to check the IF minimal, it is still measurable. Only AFTER checking can the conclusion formulated in the computer language – the THEN – be executed.

In human thinking, as in the execution of a computer program, the IF and the THEN are clearly separated in time. This should come as no surprise, because both the sequence of the computer program and human thinking are real processes that take place in the real, physical world, and all real-world processes take time.

Static (ideal) IF-THEN

It may, however, surprise you to learn that in classic mathematical logic the IF-THEN takes no time at all. The IF and the THEN exist simultaneously. If the IF is true, the THEN is automatically and immediately also true. Actually, even speaking of a before and an after is incorrect, since statements in classical mathematical logic always take place outside of time. If a statement is true, it is always true, and if it is false, it is always false (= monotony, see previous posts).

The mathematical IF-THEN is often explained using Venn diagrams (set diagrams). In these visualisations, the IF may, for example, be represented by a set that is a subset of the THEN set. For mathematicians, IF-THEN is a relation that can be derived entirely from set theory. It’s a question of the (unchangeable) states of true or false rather than of processes, such as thinking in a human brain or the execution of a computer program.
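The set-theoretic reading can be written down directly. In the sketch below, the example sets and the helper name `implies` are assumptions chosen for illustration; the point is only that the static IF-THEN is a relation that either holds or does not, with no step from a "before" to an "after".

```python
# Hypothetical example sets.
humans = {"Socrates", "Plato"}
mortals = {"Socrates", "Plato", "Bucephalus"}


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


# "IF x is human THEN x is mortal" as a relation between sets:
# the IF set is a subset of the THEN set.
subset_view = humans <= mortals

# The same statement checked element by element with material implication.
universe = humans | mortals | {"Mount Olympus"}
pointwise_view = all(implies(x in humans, x in mortals) for x in universe)

# The full truth table of the static IF-THEN.
truth_table = {(p, q): implies(p, q)
               for p in (True, False) for q in (True, False)}
```

Both views agree, and neither depends on the order in which the elements are inspected.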

Thus, we can distinguish between

Static IF-THEN:
In ideal situations, i.e. in mathematics and in classical mathematical logic.
Dynamic IF-THEN:
In real situations, i.e. in real computer programs and in the human brain.

Dynamic logic uses the dynamic IF-THEN

If we are looking for a logic that corresponds to human thinking, we must not limit ourselves to the ideal, i.e. static, IF-THEN. The dynamic IF-THEN is a better match for the normal thought process. This dynamic logic that I am arguing for takes account of time and needs the natural – i.e. the real and dynamic – IF-THEN.

If time is a factor and the world may be a slightly different place after the first conclusion has been drawn, it matters which of two possible conclusions is drawn first. Unless you allow two processes to run simultaneously, you cannot draw both conclusions at the same time. And even if you do, the two parallel processes can influence each other, complicating the matter still further. For this reason along with many others, dynamic logic is much more complex than the static variant. This increases our need for a clear formalism to help us deal with this complexity.
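A small sketch shows why the order matters once each THEN changes the state that the next IF inspects. The two rules and the state elements are hypothetical and serve only as an illustration.

```python
def close_door(state):
    """IF the door is open THEN it gets closed."""
    if "door_open" in state:
        return (state - {"door_open"}) | {"door_closed"}
    return state


def cat_leaves(state):
    """IF the door is open THEN the cat goes outside."""
    if "door_open" in state:
        return state | {"cat_outside"}
    return state


def run(state, rules):
    """Apply the rules one after another; time passes between them."""
    for rule in rules:
        state = rule(state)   # each THEN changes the world the next IF sees
    return state


start = frozenset({"door_open"})
first = run(start, [close_door, cat_leaves])
second = run(start, [cat_leaves, close_door])
```

Starting from the same state, the first order leaves the cat inside and the second leaves it outside, so the two runs end in different logical states.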

Static and dynamic IF-THEN side by side

The two types of IF-THEN are not mutually exclusive; they complement each other and can coexist. The classic, static IF-THEN describes logical states that are self-contained, whereas the dynamic variant describes logical processes that lead from one logical state to another.

This interaction between statics and dynamics is comparable with the situation in physics, where we find statics and dynamics in mechanics, and electrostatics and electrodynamics in the study of electricity. In these fields, too, the static part describes the states (without time) and the dynamic part the change of states (with time).

This is a blog post about dynamic logic. The next post specifies the topic of the dynamic IF-THENs.

Logic

Non-Monotonic Reasoning (NMR)

3. June 2019 Hans Rudolf Straub

Concept Molecules and NMR

In the article Two types of coding 1, I described the challenge of getting computers to ‘understand’ the incredibly diverse range of medical diagnoses that may crop up in a text. To meet this challenge, the computer has to convert the various diagnostic formulations encountered into a consistent format that represents all the semantic details in an easily retrievable form.

With concept molecules we have succeeded in doing this. We were aided here by two properties of the concept molecules method:
a) the consistently composite representation of semantics, and
b) a non-monotonic reasoner.
At the time, the use of a non-monotonic reasoner was very much out of vogue. Most research groups in the field of medical computational linguistics were in the process of switching from First Order Logic (FOL) to Description Logic (DL), believing that DL was the best way to get computers to interpret complex semantics. As it turned out, however, it was we, a small private research company without state support, who were successful. Instead of the accepted doctrine of FOL and DL based upon a monotonic approach, we used a non-monotonic method.

What is monotonic logic?

In logic, monotony means that the truth of statements does not change even if new contradictory information subsequently crops up. Thus, what has been recognised within the system as true remains true, and what has been recognised as false remains false. Under non-monotony, on the other hand, conclusions drawn by the system can be called into question on the basis of additional information.
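The difference can be made concrete with a toy interpreter for a single word. The rules below are invented for illustration and are not the reasoner described in this article; they only show a conclusion being withdrawn when additional information arrives.

```python
def interpret(context):
    """Assign the word 'cold' to a concept, given what is known so far."""
    conclusions = set()
    # Default reading in a medical text (hypothetical rule).
    if "cold" in context and "weather" not in context:
        conclusions.add("diagnosis: common cold")
    # More specific reading once weather vocabulary shows up.
    if "cold" in context and "weather" in context:
        conclusions.add("observation: low temperature")
    return conclusions


before = interpret({"cold"})
after = interpret({"cold", "weather"})

# In a monotonic system, everything concluded before would still hold
# after more information arrives. Here the earlier conclusion is withdrawn.
is_monotonic_here = before <= after
```

In a monotonic system, `before` would always be a subset of `after`; in this sketch it is not, which is precisely what non-monotony permits.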

So, what’s the problem with non-monotony?

It is clear that proof is only possible in a monotonic system. In a non-monotonic system, on the other hand, there is always the possibility of another argument cropping up that leads to completely different conclusions. Since proof is essential in mathematics, it is obvious that mathematical logic relies on monotony.

Computational linguistics, however, is not about proof, but about the correct assignment of words to concepts. Thus, the advantage of being able to supply proof – as important as it clearly is for mathematics – is irrelevant to our task.

And the problem with monotony?

A system that cannot change its conclusions is not able to learn in any real sense. The human brain, for example, is in no way monotonic.

Moreover, a monotonic system must also be closed, whereas in practice scientific ontologies are not closed, but grow as knowledge progresses. Progress of this kind is also evident in the development of an interpretation program with its complex algorithms: here, too, there is continuous improvement and expansion that poses problems for monotonic systems.

In addition, monotonic systems are not particularly efficient when it comes to dealing with exceptions. It is well known that there are exceptions to every rule, and a non-monotonic system can handle these in a much more effective and straightforward way.
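One common way to handle exceptions non-monotonically is to let the most specific applicable rule override the more general ones. The text does not say whether the reasoner discussed here works this way, so the sketch below is an assumption; the rule base and names are hypothetical.

```python
# Each rule: (facts that must all be present, verdict on "can fly").
# Hypothetical rule base; a more specific rule overrides a more general one.
RULES = [
    (frozenset({"bird"}), True),
    (frozenset({"bird", "penguin"}), False),
]


def can_fly(facts, rules):
    """Return the verdict of the most specific applicable rule, or None."""
    applicable = [(conditions, verdict) for conditions, verdict in rules
                  if conditions <= facts]
    if not applicable:
        return None
    conditions, verdict = max(applicable, key=lambda rule: len(rule[0]))
    return verdict


# A further exception is one more rule; the general rule stays untouched.
EXTENDED = RULES + [(frozenset({"bird", "ostrich"}), False)]
```

The general rule never has to list its exceptions. In a monotonic formulation, by contrast, every newly discovered exception would have to be written into the condition of the general rule itself.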

Non-monotony in practice

If we compare rule-based systems, I believe that non-monotonic systems are clearly preferable to monotonic ones for our purposes. Non-monotony is by no means the easy option and has a few pitfalls and knotty issues of its own, but the ease with which even detailed and complex fields can be modelled decides the issue in its favour.