Entropy and Information

The term entropy is often avoided because it contains a certain complexity that cannot be argued away.
But when we talk about information, we also have to talk about entropy, because entropy is the measure of the amount of information. We cannot understand what information is without understanding what entropy is.

Information is Always Relative

We believe that we can pack information, just as we store bits in a storage medium. Bits seem to be objectively available there, like little beads on a chain that can each say yes or no. For us, this is information. But this image is deceptive. We have become so accustomed to this picture that we can hardly imagine it otherwise.

Of course, the bit-spherules do not say ‘yes’ or ‘no’, nor 0 or 1, nor TRUE or FALSE, nor anything else in particular. Bits have no meaning at all unless this meaning has been defined from the outside. Then they can perfectly well say 1, TRUE, ‘I’m coming to dinner tonight’ or something else, but only together with their environment, their context.

This consideration makes it clear that information is relative. The bit only acquires its meaning from its placement. Depending on its context, it means 0 or 1, ‘true’ or ‘false’, etc. The bit is set in its place, but its meaning only comes from its place.

The place and its context must therefore be taken into account so that it becomes clear what the bit is supposed to mean. And of course, the meaning is relative, i.e. the same bit can have a completely different meaning in a different context, a different place.

This relativity characterises not only the bit, but every type of information. Every piece of information only acquires its meaning through the context in which it is placed. It is therefore relative. Bits are just signals. What they mean only becomes clear when you interpret the signals from your perspective, when you look at them from your context.

Only then does the signal take on a meaning for you. This meaning is not absolute, because whenever we try to isolate it from its context, it is reduced to a mere signal. The meaning can only be found relatively, in the interaction between your expectation, the context and the position of the bit. There, it is a switch which can be set to ON or OFF. However, ON and OFF only inform us about the position of the switch. Everything else lies in the surroundings.
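A small sketch may illustrate this (the contexts and wordings are invented for illustration): the same stored bit yields entirely different statements, depending on the interpretation supplied from outside.

    # A minimal sketch (hypothetical contexts): the same raw bit acquires
    # its meaning only through the context used to interpret it.
    raw_bit = 1  # the signal itself says no more than: the switch is ON

    contexts = {
        "logic":       {0: "FALSE", 1: "TRUE"},
        "binary":      {0: "0", 1: "1"},
        "dinner RSVP": {0: "I'm not coming tonight", 1: "I'm coming to dinner tonight"},
    }

    for name, mapping in contexts.items():
        # the meaning lives in the mapping (the context), not in the bit
        print(f"{name:11s} -> {mapping[raw_bit]}")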

Definition of Entropy

Considering how important information and information technologies are today, it is astonishing how little is known about the scientific definition of entropy, i.e. information:

Entropy is a measure of the information that is
– known at the micro level
– but unknown at the macro level.

Entropy is therefore closely related to information at the micro and macro levels and can be seen as the ‘distance’ or difference between the information at the two information levels.

Micro and Macro Level Define Information

What is meant by this gap between the micro and macro levels?  –  When we look at an object, the micro level contains the details (i.e. a lot of information), and the macro level contains the overview (i.e. less, but more targeted information).

The distance between the two levels can be very small (as with the bit, where the micro level knows just two pieces of information: on or off) or huge, as with the temperature (macro level) in the coffee cup, for example, where the kinetic energies of the many molecules (micro level) determine the temperature of the coffee. The number of molecules in this case is of the order of Avogadro’s number (about 6 × 10²³), i.e. very large, and the entropy of the coffee in the cup is correspondingly high.
On the other hand, when the span between micro and macro level becomes very narrow, the information (entropy) will be small and comes very close to that of a single bit (information content = 1). However, it always depends on the relation between micro and macro level. This relation – i.e. what is known at the micro level but not at the macro level – defines the information that you receive, namely the information that a closer look at the details reveals.
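A rough numerical sketch of this gap, under the simplifying assumption that the entropy is simply the logarithm of the number of micro-alternatives that the macro description leaves open (the per-molecule count is invented for illustration):

    import math

    def entropy_bits(n_micro_alternatives: int) -> float:
        """Information (in bits) known at the micro level but not at the macro level."""
        return math.log2(n_micro_alternatives)

    # The single bit: the micro level knows just two alternatives -> 1.0 bit
    print(entropy_bits(2))

    # The coffee cup: on the order of Avogadro's number of molecules, each with
    # (here, crudely) 10 distinguishable micro-alternatives -> roughly 2e24 bits
    alternatives_per_molecule = 10          # assumed coarse-graining
    n_molecules = 6 * 10**23                # order of Avogadro's number
    print(n_molecules * math.log2(alternatives_per_molecule))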

The Complexity of the Macro Level

A state on the macro level always contains less information than the corresponding state on the micro level. The macro state is not complete; it can never contain all the information one could possibly obtain by a closer look, but in most cases it is a well-targeted and intentional simplification of the information at the micro level.

This means that the same micro-state can supply different macro-states. For example, a certain individual (micro level) can belong to the collective macro groups of Swiss inhabitants, computer scientists, older men, contemporaries of the year 2024, etc., all at the same time.
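A hypothetical sketch in code (all attributes are invented): one detailed micro-record is projected onto several coarser macro-groups at the same time.

    # Hypothetical sketch: one micro-state (a detailed individual record)
    # belongs to several macro-states (coarse groups) simultaneously.
    person = {"country": "CH", "profession": "computer scientist",
              "birth_year": 1950, "observed_year": 2024}

    macro_views = {
        "Swiss inhabitants":       person["country"] == "CH",
        "computer scientists":     person["profession"] == "computer scientist",
        "older men":               person["observed_year"] - person["birth_year"] > 65,
        "contemporaries of 2024":  person["observed_year"] == 2024,
    }

    # The same micro-state supplies every macro-group whose condition it satisfies.
    print([group for group, member in macro_views.items() if member])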

The possibility of simultaneously deriving several macro-states from the same micro-state is characteristic of the complexity of micro- and macro-states and thus also of entropy.

If we transfer the entropy consideration of thermodynamics to more complex networks, we must deal with their higher complexity, but the ideas of micro and macro state remain and help us to understand what is going on when information is gained and processed.

Translation: Juan Utzinger


Continued in Entropy, Part 2


See also:
– Paradoxes and Logic, Part 2
– Georg Spencer-Brown’s Distinction and the Bit
– What is Entropy?
– Five Preconceptions about Entropy

Georg Spencer-Brown’s Distinction and the Bit

Continues Paradoxes and Logic, Part 2


History

Before we examine Georg Spencer-Brown’s (GSB’s) distinction as a basic element for logic, physics, biology and philosophy, it is helpful to compare it with another, much better-known basic form, namely the bit. This allows us to better understand the nature of GSB’s distinction and the revolutionary nature of his innovation.

Bits and GSB forms can both be regarded as basic building blocks for information processing. Software structures are technically based on bits, but the forms of GSB (‘draw a distinction’) are just as simple, fundamental and astonishingly similar. Nevertheless, there are characteristic differences.

 Fig. 1: Form and bit show similarities and differences

Both the bit and the Spencer-Brown form were found in the early phase of computer science, so they are relatively new ideas. The bit was described by C. E. Shannon in 1948, the distinction by Georg Spencer-Brown (GSB) in his book ‘Laws of Form’ in 1969, only about 20 years later. 1969 fell in the heyday of the hippie movement, and GSB was warmly welcomed at Esalen, an intellectual hotspot and starting point of this movement. On the other hand, this may have put him in a bad light and hindered the established scientific community from looking more closely into his ideas. While the handy bit invigorated California’s nascent high-tech information movement, Spencer-Brown’s mathematical and logical revolution was largely ignored by the scientific community. It is time to overcome this disparity.

Similarities between Distinction and Bit

Both the form and the bit refer to information. Both are elementary abstractions and can therefore be seen as basic building blocks of information.

This similarity reveals itself in the fact that both denote a single action step – albeit a different one – and both assign a maximally reduced number of results to this action, exactly two.

Table 1: Bit and distinction each contain one action and two possible results (outcomes)

Exactly one Action, Exactly Two Potential Results

The action of the distinction is – as the name says – to distinguish, and the action of the bit is to select. Both actions can be seen as information actions and are as such fundamental, i.e. not further reducible. The bit does not contain further bits, the distinction does not contain further distinctions. Of course, there are other bits in the vicinity of the bit and other distinctions in the vicinity of a distinction. However, both actions are to be seen as fundamental information actions. Their fundamentality is emphasised by the smallest possible number of results, namely two. The number of results cannot be smaller, because a distinction with only one possible outcome is not a distinction, and a selection from only one option is not a selection. Both are only possible if there are two potential results.

Both distinction and bit are thus indivisible acts of information of radical, non-increasable simplicity.

Nevertheless, they are not the same and are not interchangeable. They complement each other.

While the bit has seen a technical boom since 1948, its prerequisite, the distinction, has remained unmentioned in the background. It is all the more worthwhile to bring it to the foreground today and shed new light on what links mathematics, logic, the natural sciences and the humanities.


Differences

Information Content and Shannon’s Bit

Both form and bit refer to information. In physics, the quantitative content of information is referred to as entropy.

At first glance, the information content when a bit is set or a distinction is made appears to be the same in both cases, namely the information that distinguishes between two states. This is clearly the case with a bit. As Shannon has shown, its information content is log2(2) = 1. Shannon called this dimensionless value 1 bit. The bit therefore contains – not surprisingly – the information of one bit, as defined by Shannon.

The Bit and its Entropy

The bit measures nothing other than entropy. The term entropy originally came from thermodynamics and was used to calculate the behaviour of heat engines. In thermodynamics, entropy is the partner concept of energy, but it applies – like the concept of energy – to all fields of physics, not just thermodynamics.

What is Entropy?

Entropy is a measure of information content. If I do not know something and then discover it, information flows. In a bit, before I know which one is true, two states are possible: the two states of the bit. When I find out which of the two states is true, I receive a small basic portion of information with the quantitative value of 1 bit.

One bit decides between two results. If more than two states are possible, the number of bits increases logarithmically with the number of possible states; it takes three binary choices (bits) to find the correct one out of 8 possibilities. The number of binary choices (bits) thus grows with the logarithm of the number of possible outcomes, as the following example shows:

Twofold choice = 1 bit = log2(2)
Fourfold choice = 2 bits = log2(4)
Eightfold choice = 3 bits = log2(8)

The information content of a single bit is always the information content of a single binary choice, i.e. log2(2) = 1.
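A small numerical check of this logarithmic relationship (a sketch using Python’s standard math module):

    import math

    # Number of binary choices (bits) needed to single out one of n
    # equally probable alternatives: log2(n).
    for n in (2, 4, 8, 1024):
        print(f"{n:4d} alternatives -> {math.log2(n):.0f} bit(s)")
    # 2 -> 1, 4 -> 2, 8 -> 3, 1024 -> 10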

The bit as a physical quantity is dimensionless, i.e. a pure number. This is fitting, because the information about the choice is neutral: it is not a length, a weight, an energy or a temperature. The bit serves well as the technical unit of quantitative information content. What is different about the other basic unit of information, Spencer-Brown’s form?


The Information Content of the Form

The information content of the bit is exactly 1 if the two outcomes of the selection have exactly the same probability. As soon as one of the two states is less probable, its selection reveals more information. When it is selected despite its lower prior probability, this makes more of a difference and reveals more information to us. The less probable its selection is, the greater the information will be when it occurs. The classic bit is a special case in this regard: the probability of its two states is equal by definition, and the information content of the choice is exactly 1.
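A minimal sketch of this relationship, using Shannon’s surprisal -log2(p): at p = 0.5 the value is exactly 1 bit, and the less probable an outcome is, the more bits its selection reveals.

    import math

    def surprisal_bits(p: float) -> float:
        """Information revealed when an outcome of prior probability p is selected."""
        return -math.log2(p)

    for p in (0.5, 0.25, 0.1, 0.01):
        print(f"p = {p:4.2f} -> {surprisal_bits(p):5.2f} bits")
    # 0.50 -> 1.00 (the classic bit), 0.25 -> 2.00, 0.10 -> 3.32, 0.01 -> 6.64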

This is entirely different with Spencer-Brown’s form of distinction. The decisive factor lies in the ‘unmarked space’. The distinction distinguishes something from the rest and marks it. The rest, i.e. everything else, remains unmarked. Spencer-Brown calls it the ‘unmarked space’.

We can and must now assume that the remainder, the unmarked, is much larger, and the probability of its occurrence much higher, than the probability that the marked will occur. The information content of the mark, i.e. of drawing the distinction, is therefore usually greater than 1.

Of course, the distinction is about the marked and the marked is what interests us. That is why the information content of the distinction is calculated based on the marked and not the unmarked.

How large is the space of the unmarked? We would do well to assume that it is infinite. I can never know what I don’t know.

The difference in information content, measured as entropy, is the first difference we can see between bit and distinction. The information content of the bit, i.e. its entropy, is exactly 1. In the case of the distinction, it depends on how large the unmarked space is; but the unmarked space is always larger than the marked space, and the entropy of the distinction is therefore always greater than 1.


Closeness and Openness

Fig. 1 above shows the most important difference between distinction and bit, namely their external boundaries. These are clearly defined in the case of the bit.

The meaning in the bit

The bit contains two states, one of which is activated, the other not. Apart from these two states, nothing can be seen in the bit and all other information is outside the bit. Not even the meanings of the two states are defined. They can mean 0 and 1, true and false, positive and negative or any other pair that is mutually exclusive. The bit itself does not contain these meanings, only the information as to which of the two predefined states was selected. The meaning of the two states is regulated outside the bit and assigned from outside. This neutrality of the bit is its strength. It can take on any meaning and can therefore be used anywhere where information is technically processed.

The meaning in the distinction

The situation is completely different with the distinction. Here the meaning is marked. To do this, the inside of the distinction is distinguished from the outside. The outside, however, is open, and there is nothing that does not belong to it. The ‘unmarked space’ is, in principle, infinite. A boundary is defined, but it is the distinction itself. That is why the distinction, unlike the bit, cannot really separate itself from the outside. In other words: the bit is closed, the distinction is not.


Differences between Distinction and Bit

There are two essential differences between distinction and bit.

Table 2: Differences between Distinction (Form) and Bit


Consequences

The two differences between distinction and bit have some interesting consequences.

Example NLP (Natural Language Processing)

The bit, due to its defined and simple entropy and its closed borders, has the technological advantage of simple usability, which we exploit in the software industry. Distinctions, on the other hand, are more realistic due to their openness. For our specific task of interpreting medical texts, we therefore encountered the need to introduce openness into the bit world of technical software through certain principles. The keywords here are:

  1. Introduction of an acting subject that evaluates the input according to its own internal rules,
  2. Working with changing ontologies and classifications,
  3. Turning away from classical, i.e. static and monotonic, logic and turning towards a non-monotonic logic,
  4. Integration of time as a logical element (not just as a variable).

Translation: Juan Utzinger

 

Five Preconceptions about Entropy

An Overview of the Five Preconceptions

  1. Entropy is for nerds
  2. Entropy is incomprehensible
  3. Entropy is thermodynamics
  4. Entropy is noise
  5. Entropy is absolute


Details

1. Entropy is the Basis of our Daily Lives

Nerds like to be interested in complex topics, and entropy fits in well, doesn’t it? It helps them to portray themselves as superior intellectuals. This is not your game, and you might not see any practical reason to occupy yourself with entropy. This attitude is very common and quite wrong. Entropy is not a nerdy topic; it has a fundamental impact on our lives, from elementary physics to practical everyday life.

Examples (according to W. Salm1)

  • A hot coffee cup cools down over time
  • Water evaporates in an open container
  • Pendulums that have been set in motion come to a standstill after a while
  • Iron rusts
  • Magnets become weaker after some years
  • Lessons learnt are forgotten
  • Combed hair becomes dishevelled
  • White shirts become stained
  • Rocks crumble
  • Radioactive elements decay

So there are plenty of reasons to look into the phenomenon of entropy, which can be found everywhere in everyday life. But most people tend to avoid the term. Why is that? This is mainly due to the second preconception.


2. Entropy is a Perfectly Understandable and Indispensable Fundamental Concept

It is true that, at first glance, entropy is rather confusing. However, entropy is only difficult to understand because of persistent preconceptions (see points 4 and 5 below). These ubiquitous preconceptions are the obstacles that make the concept of entropy seem incomprehensible. Overcoming these thresholds not only helps to understand many real and practical phenomena, but also sheds light on the foundations that hold our world together.


3. Entropy Plays a Role Everywhere in Nature

The term entropy stems from thermodynamics. But we should not be misled by this. In reality, entropy is something that exists everywhere in physics, chemistry, biology and also in art and music. It is a general and abstract concept, and it refers directly to the structure of things and the information they contain.

Historically, the term was introduced almost 200 years ago in thermodynamics and was associated with the possibility of allowing heat (energy) to flow. It helped to understand the mode of operation of machines (combustion engines, refrigerators, heat pumps, etc.). The term is still taught in schools this way.

However, thermodynamics only shows a part of what entropy is. Its general nature was only described by C.E. Shannon2 in 1948. The general form of entropy, also known as Shannon or information entropy, is the proper, i.e. the fundamental form. Heat entropy is a special case.

Through its application to heat flows in thermodynamics, entropy as heat entropy was given a concrete physical dimension, namely J/K, i.e. energy per temperature. However, this is the special case of thermodynamics, which deals with energies (heat) and temperature. If entropy is understood in a very general and abstract way, it is dimensionless, a pure number.

As the discoverer of abstract and general information entropy, Shannon gave this number a name, the “bit”. For his work as an engineer at the Bell telephone company, Shannon used the dimensionless bit to calculate the flow of information in the telephone wires. His information entropy is dimensionless and applies not only in thermodynamics, but everywhere where information and flows play a role.


4. Entropy is the Difference between not Knowing and Knowing

Many of us learnt at school that entropy is a measure of noise and chaos. Additionally, the second law tells us that entropy can only ever increase. Thus, disorder can only increase. However, identifying entropy with noise or even chaos is misleading.

There are good reasons for this misleading idea: if you throw a sugar cube into the coffee, its well-defined crystal structure dissolves, the molecules disperse in a disorderly way in the liquid, and the sugar shows a transition from ordered to disordered. This decay of order can be observed everywhere in nature. In physics, it is entropy that drives the decay of order according to the second law. And decay and chaos can hardly be equated with Shannon’s concept of information. Many scientists thought the same way and therefore equated information with negentropy (entropy with a negative sign). At first glance, this doesn’t seem to be a bad match. In this view, entropy is noise, and the absence of noise, i.e. negentropy, would then be information. Sounds logical, doesn’t it?

Not quite, because information is contained both in the sugar cube and in the dissolved sugar molecules floating in the coffee. In some ways, there is even more information in the individually floating molecules, because each has its own path. Their bustling movements contain information. For us coffee drinkers, however, the bustling movement of the many molecules in the cup contains no useful information and appears merely chaotic. Can this chaos be information?

The problem is our conventional idea of information. Our idea is too static. I suggest that we see entropy as something that denotes a flow, namely the flow between not knowing and knowing. This dynamic is characteristic of learning, of absorbing new information.

Every second, an incredible amount of things happen in the cosmos that could be known. The information in the entire world can only increase. This is what the second law says, and what increases is entropy, not negentropy. Wouldn’t it be much more obvious to put information in parallel with entropy and not with negentropy? More entropy would then mean more information and not more chaos.

Where can the information be found? In the noise or in the absence of noise? In entropy or in negentropy?


Two Levels

Well, the dilemma can be solved. The crucial step is to accept that entropy is the tension between two states, the overview state and the detail state. The overview does not need the details; it only sees the broad lines. C.F. Weizsäcker speaks of the macro level. The broad lines are the information that interests us. Details, on the other hand, appear to us as unimportant noise. But the details, i.e. the micro level, contain more information, usually a whole lot more. Just take the movements of the water molecules in the coffee cup (micro level), whose chaotic bustle contains more information than the single indication of the temperature of the coffee (macro level). Both levels are connected, and their information depends on each other in a complex way. Entropy is the difference between the two amounts of information. This amount is always greater at the detail level (micro level), because there is always more to know in the details than in the broad lines and therefore also more information.

But because the two levels refer to the same object, you as the observer can look at the details or the big picture. Both belong together. The gain in information about details describes the transition from the macro to the micro level, the gain in information about the overview describes the opposite direction.

So where does the real information lie? At the detailed level, where many details can be described, or at the overview level, where the information is summarised and simplified in a way that really interests us?

The answer is simple: both the macro and the micro level contain information. Entropy is the transition between the two levels and, depending on what interests us, we can make the transition in one direction or the other.
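A toy sketch of this ‘delta’ view, under the simplifying assumption that each level consists of equally probable alternatives (the counts are invented):

    import math

    # Toy sketch: entropy as the information gap between micro and macro level,
    # assuming equally probable alternatives at each level (illustrative numbers).
    micro_alternatives = 1_000_000   # configurations distinguishable in the details
    macro_alternatives = 10          # categories distinguishable in the overview

    info_micro = math.log2(micro_alternatives)   # ~19.93 bits
    info_macro = math.log2(macro_alternatives)   # ~3.32 bits

    entropy_gap = info_micro - info_macro        # known in the details, missing in the overview
    print(f"{entropy_gap:.2f} bits hidden below the macro description")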


Example Coffee Cup

This is classically demonstrated in thermodynamics. The temperature of my coffee can be seen as a measure of the average kinetic energy of the individual liquid molecules in the coffee cup. The information contained in the many molecules is the micro state; the temperature is the macro state. Entropy is the knowledge that is missing in the macro state but is present in the micro state. For me as a coffee drinker, however, only the knowledge of the macro state, the temperature of the coffee, is relevant. This knowledge is not present in the micro state insofar as it does not depend on the individual molecules, but rather statistically on the totality of all molecules. It is only in the macro state that knowledge about temperature becomes tangible.

For us, only the macro state shows relevant information. But there is additional information in the noise of the details. How exactly the molecules move is a lot of information, but these details don’t matter to me when I drink coffee; only their average speed determines the temperature of the coffee, which is what matters to me.

The information-rich and constantly changing microstate has a complex relationship with the simple macroinformation of temperature. The macro state also influences the micro state, because the molecules have to move within the statistical framework set by the temperature. Both pieces of information depend on each other and are objectively present in the object at the same time. What differs is the level or scope of observation. The difference in the amount of information in the two levels determines the entropy.
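A minimal simulation sketch of this macro-from-micro relationship (a deliberately simplified model, not a faithful picture of molecular physics): many randomly drawn molecular kinetic energies collapse into a single macro number, the temperature.

    import random

    # Sketch: many micro-level kinetic energies collapse into one macro number.
    # (Illustrative model only; the real energy distribution differs.)
    k_B = 1.380649e-23          # Boltzmann constant, J/K
    T_true = 350.0              # a hot coffee, in kelvin

    random.seed(0)
    n_molecules = 100_000
    # assume each molecule's kinetic energy fluctuates around (3/2) * k_B * T_true
    energies = [random.expovariate(1.0 / (1.5 * k_B * T_true)) for _ in range(n_molecules)]

    # Macro state: the temperature recovered from the average kinetic energy.
    T_macro = (2.0 / 3.0) * (sum(energies) / n_molecules) / k_B
    print(f"temperature seen at the macro level: {T_macro:.1f} K")
    # The individual values in `energies` (the micro state) carry far more
    # information, but the coffee drinker only needs this single number.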

These conditions have been well known since Shannon2 and C.F. Weizsäcker. However, most schools still teach that entropy is a measure of noise. This is misleading. Entropy should always be understood as a delta, as a difference (distance) between the information in the overview (macro state) and the information in the details (micro state).


5. Entropy is a Relative Value

The fact that entropy is always a distance, a delta, i.e. mathematically a difference, also results in the fact that entropy is not an absolute value, but rather a relative value.

Example Coffee Cup
Let’s take the coffee cup as an example. How much entropy is in there? If we only look at the temperature, then the micro state consists of the kinetic energies of the individual molecules. But the coffee cup contains even more information: How strong is the coffee? How strongly sweetened? How strong is the acidity? What flavours does it contain?

Example School Building
Salm1 gives the example of a lost door key that a teacher is looking for in a school building. If he knows which classroom the key is in, he has not yet found it. At this moment, the micro state only names the room. Where in the room is the key? Perhaps in a cupboard. In which one? At what height? In which drawer, in which box? The micro state varies depending on the depth of the enquiry. It is a relative value.
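A sketch of this key example in numbers (the counts per level are invented): each additional level of detail adds its own bits, so the total depends on how deep we decide the micro level goes.

    import math

    # Sketch of the key search (invented counts): each refinement of the
    # micro level adds log2(number of alternatives at that level) bits.
    levels = {
        "classroom in the building": 24,
        "cupboard in the classroom":  4,
        "drawer in the cupboard":     6,
        "box in the drawer":          8,
    }

    total_bits = 0.0
    for level, alternatives in levels.items():
        bits = math.log2(alternatives)
        total_bits += bits
        print(f"{level:28s} {bits:5.2f} bits  (running total {total_bits:5.2f})")
    # The 'entropy of the key position' is not absolute: it grows with every
    # additional level of detail that we choose to count as the micro level.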

Because the information entropy is always a difference, the entropy, i.e. the span between overview and details, can always be extended to even more details.

Entropy is a relative value. If we specify it in absolute terms, we set – without explicitly declaring it – a lowest level (classroom, shelf or drawer). This is legitimate as long as we are aware that the seemingly absolute value only represents the distance to the assumed micro level.

Statics and Dynamics 

Energy and entropy are two complementary quantities that permeate the entire description of nature (physics, chemistry, biology). The two fundamental laws of physics each contain one of the two general quantities E (energy) and S (entropy):

  1. Law:  ∆E = 0    or:   dE/dt = 0
  2. Law:  ∆S ≥ 0    or:   dS/dt ≥ 0

Energy remains constant over time (in a closed system), while entropy can only increase. In other words: energy is a static quantity and shows what does not change, while entropy is essentially dynamic and describes flows, e.g. in heat engines, in Shannon’s information flow in telephone wires, and whenever our thoughts flow and we learn and think.

Entropy and Time

Entropy is essentially linked to the phenomenon of time by the second law (∆S ≥ 0). While energy remains constant in a closed system (Noether’s theorem), entropy changes over time and increases in a closed system. Entropy therefore knows time; this holds not only for heat entropy in particular, but also for the much more general information entropy.


Conclusion

  • Entropy is a key concept in physics and information theory.
  • The term entropy comes from thermodynamics, but the concept of entropy refers to information in general.
  • The thermodynamical entropy is the special case, information entropy is the general concept.
  • Everything that happens physically, chemically and in information processing, whether technical or biological, has to do with entropy. In particular, everything that has to do with information flows and structures. In other words, everything that really interests us.
  • Entropy is always relative and refers to the distance between the macro and micro levels.
  • The macro level contains less information than the micro level.
  • The macro level contains the information of interest.
  • Neither is absolute: the micro level can always be described in more detail. The macro level is defined from the outside: What is of interest? The temperature of the coffee? The concentration of sugar molecules? The acidity? The caffeine content …
  • Only the definition of the two states makes it possible to specify the entropy in seemingly absolute terms. However, what counts for entropy is the relative value, i.e. the delta between the two states. This delta, the entropy, determines the flow.
  • The flow happens in time.

(Translation: Juan Utzinger)


1 Salm, W.: Entropie und Information – naturwissenschaftliche Schlüsselbegriffe, Aulis Verlag Deubner, Köln, 1997

2 Shannon, C.E. and Weaver, W.: The Mathematical Theory of Communication, University of Illinois Press, Urbana, 1949


See also: What is Entropy?