Stigmergy as a Universal Coordination Mechanism (I)

Heylighen, F. . Stigmergy as a universal coordination mechanism I: Definition and components . Cognitive Systems Research (Elsevier) 38 . pp. 4-13. 2016

1. Past, present and future of the “stigmergy” concept

The concept was introduced by Pierre-Paul Grassé (1959) to describe a coordination mechanism used by insects: the work of one insect leaves traces in the environment that stimulate subsequent work by that insect or by others: ‘This mediation via the environment ensures that tasks are executed in the right order, without any need for planning, control, or direct interaction between agents’ [p 4]. DPB: how can execution in the right order be assured? It is not certain in what order the other agents will encounter the traces, and hence in what order they will be motivated to act. From the examples in the text it appears that the stage in which the work is left by the previous worker is input for the decision rules of a later worker; this implies that the stage of the work can be recognized. This is not the same as the agents assessing the stage of the work in the sense of attributing a meaning to it, or distinguishing this earlier stage from that later stage, because in that case the agent would need an idea of the finalized work and of how complete the present work is in relation to it. Another example is the pheromone trails that are left by insects and followed by others. These ideas can in some cases explain self-organization in social systems, also known as swarm intelligence (Deneubourg 1977). Conceptually a next step is computer-supported collaboration between human agents, in particular via the World Wide Web; another example is the establishing of a price on a market: a price emerges from the myriad of interactions between people and then serves as a reference for their subsequent decisions. 
DPB: anchoring means that once one has become used to some mark, it serves as a frame of reference thereafter; priming means that once a reference price has been given, it serves as a frame of reference thereafter. Are these stigmergic effects of a Luhmannian communication on the human mind? Is spoken human language also an example, because it damages the direct environment and lasts only as damage in the minds of the people involved in the conversation? Is written language an example in a slow and long-lasting way: once written, its damaging effects remain forever? In that way language (spoken or written) can be deframed and reframed and be assigned a new meaning. Understood in this sense stigmergy is ubiquitous and it can clarify many things: ‘Stigmergy in the most general sense does not require either markers or quantities. Another, even more common misunderstanding is that stigmergy only concerns groups or swarms consisting of many agents. As we will show, stigmergy is just as important for understanding the behavior of a single individual’ [p 5]. The notion of an unintentional trace in a passive medium is far removed from the notion of a direct influence of the behavior of one agent on the behavior of another agent.

2. From etymology to definition

Stigmergy is derived from the Greek stigma, which means mark or puncture, and ergon, which means work, the product of work, or action; as a joint concept it was originally understood as a goad, prod or spur (a stimulus): ‘Thus (Grassé, 1959) defined stigmergy as “the stimulation of workers by the very performances they have achieved” (from the original English abstract)’ [p 6]. More recently it was understood as follows: ‘if we understand stigma as “mark” or “sign” and ergon as “action”, then stigmergy is “the notion that an agent’s actions leave signs in the environment, signs that it and other agents sense and that determine their subsequent actions”’ [p 6]. DPB: the understanding of Grassé is that stigmergy means motivation by the work (of others), and the understanding of Parunak is motivation by marks left by the work. Suppose an uttered word already leaves a mark on the minds of some people in a network that is the environment of someone; then the difference between the two is that in the notion of Grassé one has to be present and in the notion of Parunak one does not. DPB: the expression of a meme leads to other expressions of it. The definition given is: ‘Stigmergy is an indirect, mediated mechanism of coordination between actions, in which the trace of an action left on a medium stimulates the performance of a subsequent action’ [p 6]. Also the picture is interesting:

In the medium: a mark: which stimulates >>

In the agent: an action: which produces >> [p 6].

DPB: this is my Logistical Model exactly! Using memes it is: an expression of a meme produces a mark in a medium and a perception of that mark stimulates an action in an agent. But what I find missing here is the effect of a meme in the internal, the mark that is left within the agent. That is a difference; let’s see how the stigmergy is defined later on, and whether it includes the mind of the agent when it is included in a social system.

3. Basic components of stigmergy

Action is defined as a causal process that produces a change in the world (the real). Agent is defined as a goal-directed autonomous system; this concept is not strictly necessary, because the actions of a single unspecified agent can be coordinated by stigmergy, but it is useful if more than one agent is involved, with different kinds of actions. Stigmergy coordinates actions, which may be merely events or (agentless) processes. An action can be represented by a condition-action rule: the condition specifies the state of the world inducing the action, and the action specifies the subsequent transformation of that state. This can also be written as: a+b+c+.. >> x+y+.., where the + indicates the conjunction of the conditions and of the actions. Chemical Organization Theory (Dittrich & Winter, 2008) shows how collections of these simple reactions tend to become coordinated by acting on a shared medium (the reaction vessel), where they produce an evolving trace expressed by the concentrations of the different ‘molecules’ (a, b, ..). This coordinated pattern of activity defines an organization: a self-sustaining, dynamic network of interacting ‘molecules’. The relation is causal but not deterministic: the probability that an action takes place increases if the conditions are met: P(action | condition) > P(action). DPB: the medium is the whole of the environment that can contain (be damaged to show) data in the sense of a signal, whether fast or slow to disappear and widely or narrowly distributed, e.g. a tombstone (in the real) or a change in the state of mind of one’s interlocutor caused by the irritation of one’s words (in the virtual of Simondon). In the latter case the minds of the interlocutors are a part of the environment of the person: ‘The medium is that part of the world that undergoes changes through the actions and whose states are sensed as conditions for further actions’ [p 7]. The medium is an aspect of the environment: ‘First, .. , the environment is not in general perceivable and controllable. Second, the environment normally denotes everything outside the system or agent under consideration. However, stigmergy can also make use of an internal medium’ (emphasis by the author) [p 7]. DPB: duly noted! As a consequence aspects of the agent system are controllable by elements in the environment and hence they belong to the medium. The environment is that part of the world with which an agent interacts; the phenomena that are perceivable and controllable differ for each agent and hence every agent has a different environment; ‘When we consider stigmergic coordination between different agents, we need to define the medium as that part of the world that is controllable and perceivable by all of them’ [p 7]. DPB: this is reminiscent of the discourse / population idea, where a multitude of people included by a communication (the discourse) is defined as a population. This is different because the discourse includes people who find themselves attracted as a result of their life experience and because of the selections of the communication. The medium is a broader concept because it is determined by what people can perceive and control, which does not necessarily attract them because of their life experience so far. The role of the medium is to allow interaction between different actions to take place, and thus, indirectly, between different agents; this mediating function is the true power of stigmergy. A final component of a stigmergic system is a trace or a mark; it is the result of an action and as such it contains information about the action that produced it: ‘We might see the trace as a message, deposited in a medium, through which the pattern of activity communicates with itself, while maintaining a continuously updated “memory” of its achievements. 
From the point of view of an individual agent, on the other hand, the trace is a challenge: a situation that incites action, in order to remedy a perceived problem or shortcoming, or to exploit an opportunity for advancement (Heylighen, 2012)’ (emphasis by the author) [p 8]. DPB: I think in the Logistical Model the medium is the mind of the person as well as the communication: both are simultaneously and differently damaged through their mutual irritations.
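The components of this section (condition-action rules, a shared medium, traces) can be illustrated with a minimal sketch. It is not taken from the paper: the rule names (dig, build, cap), the trace names, and the deterministic "first applicable rule fires" scheduler are all assumptions made for illustration, whereas the text describes the condition-action relation as probabilistic. The sketch only shows how an ordered sequence of actions can follow from the state of the medium alone, without a plan or direct interaction.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A condition-action rule a+b+.. >> x+y+.. (all names are illustrative)."""
    name: str
    condition: tuple  # traces that must all be present in the medium
    action: tuple     # traces the action leaves behind in the medium


def applicable(rule, medium):
    """True if every trace named in the condition is present in the medium."""
    needed = Counter(rule.condition)
    return all(medium[trace] >= count for trace, count in needed.items())


def fire(rule, medium):
    """Transform the medium: consume the condition, deposit the new traces."""
    medium.subtract(Counter(rule.condition))
    medium.update(Counter(rule.action))


def run(rules, medium, max_steps=100):
    """Fire the first applicable rule until none applies; return the order."""
    history = []
    for _ in range(max_steps):
        rule = next((r for r in rules if applicable(r, medium)), None)
        if rule is None:
            break
        fire(rule, medium)
        history.append(rule.name)
    return history


# The rules are listed in the "wrong" order on purpose: no rule knows the plan.
rules = [
    Rule("cap", ("pillar", "pillar"), ("arch",)),
    Rule("build", ("pellet",), ("pillar",)),
    Rule("dig", ("soil",), ("pellet",)),
]
medium = Counter({"soil": 2})
history = run(rules, medium)
print(history)         # ['dig', 'build', 'dig', 'build', 'cap']
print(medium["arch"])  # 1
```

The order dig, build, dig, build, cap is nowhere written down; it results from which traces happen to be present in the medium at each step.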

4. Coordination

According to the Oxford Dictionary, coordination can be defined as ‘the organization of the different elements of a complex body or activity so as to enable them to work together effectively’ (emphasis by the author) [p 8]. In the case of stigmergy the ‘elements’ are actions or agents; ‘effectively’ means that a goal is pursued; ‘working together’ means that the agents or actions are harmonious or synergetic, ‘the one rather helping than hindering the other’ [p 8]. ‘Organization’ means a structure with a function, where ‘function’ is the achievement of the intended effect and ‘structure’ is the way agents or actions are connected such that they form a coherent whole. ‘This brings the focus on the connections that integrate the actions into a synergetic, goal-oriented whole’ (emphasis by the author) [p 8]. DPB: this reminds me of autopoietic systems: the properties of the elements of a system determine the relations between them. The goal-orientation and the synergy (or harmony) of the elements (or rather of the body they form) are by definition dedicated to their autopoiesis.

5. The benefits of stigmergy

Stigmergic organization limits the gap between planning, instructions and reality; it is robust to contingency and shock; it is less prone to errors of communication and errors of control than traditional forms of organization; it is less dependent on the number of agents or actions involved or on the dependencies between them. The only requirement is that the agents have access to the medium and that they can recognize the conditions to start their actions. There is no need for: planning, memory, direct communication, mutual awareness, simultaneous presence, imposed sequence, imposed division of labor, commitment, centralized control.

6. Self-organization through negative feedback

Error-controlled regulation means that a deviation from the goal of an agent triggers a change of behavior of the agent such that a compensatory action suppresses the effect of the deviation, the error. The agent must be able to sense the error and to execute a compensatory action. With regard to the establishment of effective collective action, the only additional assumption is that the goals of the agents are not contradictory; the goals need not be the same. ‘We may assume that agents have acquired their condition-action rules (and thus their implicit goals) through natural selection of instinctual behavior or differential reinforcement of learned behavior. This means that their condition-action rules are generally appropriate to the local environment, including the other agents with which they regularly interact’ [p 11]. DPB: the entire system maintains its autopoiesis and its parts maintain theirs; the entire system develops (evolutionarily) in its environment of other systems and its parts develop in their environments of other parts; the parts develop autopoietically within the conditions of the autopoiesis of the entire system. Their ‘goals’ are their autopoiesis as it is trained to the requirements of their (local) context.
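Error-controlled regulation as described above can be sketched in a few lines. This is an illustrative assumption, not the paper's model: the state is a single number, the agent senses the error as goal minus state, and the compensatory action is a fixed fraction (gain) of that error. The function name regulate and the disturbance parameter are hypothetical.

```python
def regulate(state, goal, gain=0.5, steps=20, disturbance=None):
    """Return the trajectory of a state under error-controlled regulation.

    disturbance maps a step index to a shock added to the state before the
    agent senses the error at that step (hypothetical, for illustration).
    """
    disturbance = disturbance or {}
    trajectory = [state]
    for step in range(steps):
        state += disturbance.get(step, 0.0)  # the world perturbs the state
        error = goal - state                 # the agent senses the deviation
        state += gain * error                # the compensatory action
        trajectory.append(state)
    return trajectory


trajectory = regulate(0.0, 10.0, gain=0.5, steps=20, disturbance={10: -4.0})
print(trajectory[1], trajectory[2])  # 5.0 7.5
```

With a gain between 0 and 1 each step removes a fixed fraction of the remaining error, so the state returns toward the goal both from the initial deviation and after the shock at step 10.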

7. Self-organization through positive feedback

This is the amplification of movements towards an existing goal; they can be called diversions because they divert action from its ongoing course.
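Amplification by positive feedback can be illustrated with the pheromone trails mentioned in section 1. The sketch below is a deterministic toy model under stated assumptions, not the paper's formulation: two trails compete, the share of agents choosing a trail grows with the square of its pheromone level, each choice deposits pheromone, and a fixed fraction evaporates per step. The function name reinforce and all parameter values are hypothetical.

```python
def reinforce(trail_a, trail_b, steps=200, evaporation=0.1):
    """Return trail A's share of the total pheromone after each step."""
    shares = [trail_a / (trail_a + trail_b)]
    for _ in range(steps):
        # Squaring makes the stronger trail attract more than its
        # proportional share: this is the amplifying (positive) feedback.
        p_a = trail_a**2 / (trail_a**2 + trail_b**2)
        trail_a = (1 - evaporation) * trail_a + p_a
        trail_b = (1 - evaporation) * trail_b + (1 - p_a)
        shares.append(trail_a / (trail_a + trail_b))
    return shares


biased = reinforce(1.1, 1.0)
even = reinforce(1.0, 1.0)
print(round(biased[0], 3))  # 0.524
print(biased[-1] > 0.9)     # True
print(even[-1])             # 0.5
```

A small initial difference between the trails is amplified until nearly all activity follows one trail, while a perfectly symmetric start stays balanced in this deterministic version.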

8. Conclusion

‘Virtually all evolved processes that require coordination between actions seem to rely at some level on stigmergy, in the sense that subsequent actions are stimulated by the trace left by previous actions in some observable and manipulable medium. The trace functions like a registry and map, indicating which actions have been performed and which still need to be performed. It is shared by all agents that have access to the medium, thus allowing them to coordinate their actions without need for agent-to-agent communication. It even allows the coordination of “agentless” actions, as investigated e.g. by Chemical Organization Theory (Dittrich & Fenizio, 2007)’ [p 12]. DPB: I disagree with the ‘that require coordination’ phrase: what about a wandering discussion, where the medium involves the brains of the other participants? This does not require coordination as such, but it is coordinated.

Time and the Other

Fabian, J. . Time and the Other – How Anthropology Makes its Object . Columbia University Press . New York . 1983 . ISBN 0-231-05590-0

Anthropology is the study of humans and their societies in the past and present. Its main subdivisions are social anthropology and cultural anthropology, which describe the workings of societies around the world, linguistic anthropology, which investigates the influence of language in social life, and biological or physical anthropology, which concerns the long-term development of the human organism.

‘Time, much like language or money, is a carrier of significance, a form through which we define the content of relations between the Self and the Other.. Time may give form to relations of power and inequality under the conditions of capitalist industrial production’ [Preface and Acknowledgements p. IX]. This means that time is an aspect that determines the interface between Self and the Other and so Time influences our view of the Other.

How does our use of the concept of time influence the construction of the object of study of anthropology? The difficulty is in our understanding of ‘we’ as the subject of anthropology, because in that role ‘we’ as the subject of history cannot be presupposed or left implicit, nor should it be allowed to define the Other in an easy way. The contradiction is that anthropological study is conducted by engaging intensively with the object of research, but then, based on the knowledge gained in that field research, pronounces a discourse construing the Other in terms of spatial and temporal distance.

Ch 1 Time and the Emerging Other

Knowledge is power and the claim to power of anthropology stems from its roots: the constituting of its own object of study, the Other (originally the object was the savage). All knowledge of the Other also has a historical, and therefore a temporal, element. In this sense accumulating knowledge involves a political act, which may range from systematic oppression to anarchic mutual recognition.

Universal time was established in the Renaissance and spread during the Enlightenment. The confusion exists because of the multitude of historical facts. Universal history is a device to distinguish different times by comparing the histories of individual countries with it: in this way it is what a general map is to particular maps [Bossuet 1845: 1, 2]. Universal can have the connotation of total (the entire world at all times) and general (applicable to many instances). Bossuet doesn’t address the first, but the second: how can history be presented in terms of generally valid principles? This can be done if in the ‘sequence of things, la suite des choses’ one can discern the ‘order of times’, and if that order can be abbreviated to allow an instant view. The ‘epoch’ is proposed as a device, a resting place in time from which to consider everything that happened before that point and everything after it.

Travel gave a new impetus to anthropology and to time. Travel is now a vehicle for self-realization and the documents produced as a result form a new discourse. The new traveler critiqued the existing philosophes: things seen and experienced while traveling do not match a reality distorted by preconceived ideas.

The objective of the modern navigators is ‘to complete the history of man’ [La Pérouse in Moravia 1967:964 f in Fabian p. 8]. ‘To complete’ can mean to bring to self-realisation, and it can also be understood as to fill out (like a form).

The conceived authenticity of a past, found in ‘savage cultures’ is used to denounce an overly acculturated and urbanized present by presenting counterimages to the pristine wholeness of the authentic life. Time is at this point in the nineteenth century secularized.

From history to evolution (from secularization of time to evolutionary temporalizing): 1) time is immanent to the world, nature, the universe, 2) relations between parts of the world can be understood as temporal relations. The theory of Darwinian evolution can only be accepted on the condition that the concept of time that is crucial to it is adapted to the prevailing one. Only then can this theory be applied to projects with the objective to show evolutionary laws in society. Darwin had based his concept of time on [Charles Lyell’s Principles of Geology 1830], and he writes in a section of The Origin of Species named ‘On the Lapse of Time’: ‘He who can read Sir Charles Lyell’s grand work on the Principles of Geology, which the future historian will recognize as having produced a revolution in the natural sciences, yet does not admit how incomprehensibly vast have been the past periods of time, may at once close this volume’ [1861 third edition:111]. Lyell suggests the theory of Uniformitarianism: ‘All former changes of the organic and physical creation are referable to one uninterrupted succession of physical events, governed by laws now in operation’ [quoted in Peel 1971:293n9 in Fabian p.13]. Geological time endowed evolutionary theories with a plausibility and scope they did not have before; biblical time wasn’t the right kind of time, because it relays significant events from a Christian perspective and is not a neutral time independent of the events it marks. And so it cannot be part of a Cartesian time-space system.

Darwin states that time has no inner necessity or meaning: ‘The mere lapse of time itself doesn’t do anything either for or against natural selection. I state this because it has been erroneously asserted that the element of time is assumed by me to play an all-important part in natural selection, as if all species were necessarily undergoing slow modification from some innate law’ [Darwin 1861:110 f]. Also Darwin hinted at the epistemological status of scientific discovery as a sort of developing language or code. The new naturalized time is a way to order the discontinuous and fragmentary record of the natural history of the world. Evolutionists now ‘spatialized’ time: instead of viewing it as a sequence of events, it now becomes a tree of related events.

With the claim to make sense of society in terms of evolutionary stages, Christian Time was now replaced with scientific Time. ‘In fact little more had been done than to replace faith in salvation by faith in progress and industry..’ [Fabian 1983 p 17]. In this way the epistemology of anthropology became intellectually linked to colonization and imperialism. All societies past, present and future were placed on a stream of Time. This train of thought implies that the Other is studied in terms of the primitive, Primitive principally being a temporal concept, a category, not an object, of western thought.

The Use of Time

The use of Time in anthropological field research is different from its use in the theoretical discourse. In the latter, Time is used for different purposes:

  • Physical time used as a parameter to describe sociocultural process.
  • Mundane time used for grand-scale periodizing.
  • Typological time, used to measure the intervals between sociocultural events.
  • Intersubjective time: an emphasis on the action-interaction in human communication.

‘As soon as culture is no longer primarily conceived as a set of rules to be enacted by individual members of distinct groups, but as the specific way in which actors create, and produce beliefs, values, and other means of social life, it has to be recognized that Time is a constitutive dimension of social reality’ [Fabian 1983 p 24].

The naturalization of time defines temporal relations as exclusive and expansive: the pagan is marked for salvation, the savage is not yet ready for civilization. What makes the savage significant for evolutionary time is that he lives in another time. All knowledge acquired by the anthropologist is affected by the historically established relations (of power and domination) between his society and the society of the one he studies; and therefore it is political in nature. The risk however is distancing. Moreover, distancing is often seen as objective by practitioners. Intersubjective time would seem to preclude distancing as the practitioner and the object are coeval (of the same age, duration or epoch, similar to synchronous, simultaneous, contemporary), namely share the same time. But for human communication to occur, coevalness has to be created: communication is about creating the same shared time. And so in human communication recognizing intersubjectivity, establishing objectivity is connected with the creating of distance between the participants or object and subject in research. This distancing is implied in the distinction between the sender, the message and the receiver. Even if the coding and decoding of the message is taken out, the TRANSFER of it implies a temporal distance between the sender and the receiver. Distancing devices produce a denial of coevalness: ‘By that I mean a persistent and systematic tendency to place the referent(s) of anthropology in a Time other than the present of the producer of the anthropological discourse’ [Fabian 1983 p. 31].

Coevalness can be denied by Typological time and by Physical time; intersubjective time may pose the problem described above: if coevalness is a condition for communication, and anthropology is based on ethnography, and ethnography is a form of communication, then the anthropologist is not free to choose whether or not to grant coevalness to his interlocutors. Either he submits to the condition of coevalness and produces ethnographic knowledge or he doesn’t. An anachronism is a fact or statement that is out of place in a certain timeframe: it is a mistake or an accident. Used as a device, and not as a mistake, it is named allochronism.

Coevalness is present in the field research and not in the theory development and writing. This latter activity is political in the sense that it is rooted in the early existence of the science and so it is connected with colonialism. At this point hardly more than technological advance and economic exploitation seem to be available as arguments to explain western superiority (p. 35).

Ch 3 Time and Writing about the Other

‘Even if (an observer) is in communication with other observers, he can only hear what they have seen in their absolute pasts, at times which are also his absolute past. So whether knowledge originates in the experience of a group of people or of a society, it must always be based on what is past and gone, at the moment when it is under consideration’ [David Bohm in Fabian 1983 p. 71].

In previous chapters it was argued that the temporal conditions experienced in the field differ from those expressed when writing or teaching. Empirical research can only be productive if the researcher and the researched share time. Usually the interpretation of the research occurs at a (temporal) distance, denying coevalness to the object of inquiry. This is a problem if both activities are part of the same discipline: this was not always so (travelogues versus armchair anthropology). It is also a problem if the coevalness practiced in field research, and assumed there as a given, indeed contributes to the quality of the research, so that in an ideal world the interpretation should not be distanced either.

‘Now historical discourse introduces two new presuppositions in that it, first, replaces the concept of achronicity with that of temporality. At the same time it assumes that the signifier of the text which is in the present has a signified in the past. Then it reifies its signified semantically and takes it for a referent external to the discourse’ [Greimas 1976:29 in Fabian pp. 77-8]. The referent is a society or a culture of reference; to reify means to ‘render something concrete’.

The Ethnographic Present as a literary convention means to give account of other societies and cultures in the present tense. Historical accuracy, if the past tense in the accounts is used, is a matter of the ‘critique of the sources’. Also the comparison with the referents is not strict anymore, because that needs to be based on past data of the referent also. Another problem is that the present tense may freeze the picture of the state of affairs as it is found in a culture, which is a dynamical thing in nature and freezing it doesn’t take this into account. Another issue is with the autobiographical style of reporting of field research: this has a partly etymological and partly practical backdrop.

This is an important foundation for intersubjective knowledge: ‘Somehow we must be able to share each other’s past in order to be knowingly in each other’s present’ [Fabian 1983 p. 92]. In other words: reflexive (reflexion, revealing the researcher) experience is more important than reflective (reflection, neutralised for the researcher’s presence thus eliminating subjectivity) experience, because if the first were unavailable then the information about the object (the individual and his society) would be unidirectional in time and therefore tangential (irrelevant and beside the point) and therefore another symptom of the denial of coevalness. Additionally reflexion requires the researcher to ‘travel back and forth in time’ and so the researched can know the researcher as well as the converse. The same goes for the storing of data.

The method of observation can also be a source of denial of coevalness: the structure of the observations, the planning, the visual aspects deemed relevant, the representation of the visual data, the indications of speed included in the observations all presuppose a format stemming from one time and projecting itself on and / or conditioning the observation. These are criteria brought to the observation process by the researcher and they form the basis for the production of knowledge. In addition, some criteria deemed relevant by the researcher are changed or emphasized, and other criteria are left out at his choice.

Conclusions

‘Anthropology emerged and established itself as an allochronic discourse; it is a science of other men in another Time. It is a discourse whose referent has been removed from the present of the speaking/writing subject. This “petrified relation” is a scandal. Anthropology’s Other is, ultimately, other people who are our contemporaries’ [Fabian 1983 p. 143].

The western countries needed Time to accommodate the schemes of a one-way history: progress, development, modernity and their negative mirror images: stagnation, underdevelopment and tradition. The fiction is that interpersonal, intergroup and international time is ‘public time’, there for the taking by anyone interested and, as a consequence, allotted by the powers that be. The notion of ‘public time’ provided a notion of simultaneity that is natural and independent of ideology and individual consciousness. And as a result coevalness is no longer required.

‘As soon as it was realized that fieldwork is a form of communicative interaction with an Other, one that must be carried out coevally, on the basis of shared intersubjective Time and intersocietal contemporaneity, a contradiction had to appear between research and writing, because anthropological writing had become suffused with the strategies and devices of an allochronic discourse’ [Fabian 1983 p. 148].

‘they (the sign-theories of culture DPB) have a tendency to reinforce the basic premises of an allochronic discourse in that they consistently align the Here and Now of the signifier (the form, the structure, the meaning) with the Knower, and the There and Then of the signified (the content, the function or event, the symbol or icon) with the Known’ [Fabian 1983 p. 151].

‘It is expressive of a political cosmology, that is, a kind of myth. Like other myths, allochronism has the tendency to establish a total grip on our (the anthropologists DPB) discourse. It must therefore be met by a “total” response, which is not to say that the critical work be accomplished in one fell swoop’ [Fabian 1983 p 152].

‘The ideal of coevalness must of course also guide the critique of the many forms in which coevalness is denied in anthropological discourse’ [Fabian 1983 p. 152].

‘Evolutionism established anthropological discourse as allochronic, but was also an attempt to overcome a paralyzing disjunction between the science of nature and the science of man’ [Fabian 1983 p 153].

‘That which is past enters the dialectics of the present, if it is granted coevalness’ [Fabian 1983 p. 153].

‘The absence of the Other from our Time has been his mode of presence in our discourse – as an object and victim’ [Fabian 1983 p. 154].

‘Is not the theory of coevalness which is implied (but by no means fully developed) in these arguments a program for ultimate temporal absorption of the Other, just the kind of theory needed to make sense of present history as a “world system”, totally dominated by monopoly- and state-capitalism?’ [Fabian 1983 p. 154].

‘Are there, finally, criteria by which to distinguish denial of coevalness as a condition of domination from refusal of coevalness as an act of liberation?’ [Fabian 1983 p. 154].

‘What are opposed, in conflict in fact, locked in antagonistic struggle, are not the same societies at different stages of development, but different societies facing each other at the same Time’ [Fabian 1983 p 155].

Point of departure for a theory of coevalness: 1) recuperation of the idea of totality (‘.. we can make sense of another society only to the extent that we grasp it as a whole, an organism, a configuration, a system’ [Fabian 1983 p 156]). This is flawed because a) system rules are imposed from outside and above, and because culture is now a system, a theory of praxis (the process by which a theory, lesson, or skill is enacted, practiced, embodied, or realized) is not provided; b) if a theory of praxis is not conceived then anthropology cannot be perceived as an activity that is part of what is studied.

‘.. the primitive assumption, the root metaphor of knowledge remains that of a difference, and a distance, between thing and image, reality and representation. Inevitably, this establishes and reinforces models of cognition stressing difference and distance between a beholder and an object’ [Fabian 1983 p 160].

‘A first and fundamental assumption of a materialist theory of knowledge, .. , is to make consciousness, individual and collective, the starting point. Not disembodied consciousness, however, but “consciousness with a body”, inextricably bound up with language. A fundamental role for language must be postulated.. Rather, the only way to think of consciousness without separating it from the organism or banning it to some “forum internum” is to insist on its sensuous nature; .. to tie consciousness as an activity to the production of meaningful sound. Inasmuch as the production of meaningful sound involves the transforming, shaping of matter, it may still be possible to distinguish form and content, but the relationship between the two will then be constitutive of consciousness. Only in a secondary, derived sense (one in which the conscious organism is presupposed rather than accounted for) can that relationship be called representational (significative, symbolic), or informative in the sense of being a tool or carrier of information’ [Fabian 1983 p 161].

‘it is wrong to think of the human use of language as characteristically informative, in fact or in intention. Human language can be used to inform or to mislead, to clarify one’s own thoughts or to display one’s cleverness, or simply for play. If I speak with no concern for modifying your behavior or thoughts, I am not using language any less than if I say exactly the same things with such intention. If we hope to understand human language and the psychological capacities on which it rests, we must first ask what it is, not how or for what purpose it is used’ [Chomsky 1972 p 70 in Fabian p 162]. Chomsky, N. . Language and Mind – Enlarged Edition . New York: Harcourt Brace Jovanovich . 1972

‘Man does not ‘need’ language; man, in the dialectical, transitive understanding of ‘to be’, is language (much like he does not need food, shelter, and so on, but is his food and house). Consciousness, realized by the (producing) meaningful sound, is self-conscious. The Self, however, is constituted fully as a speaking and hearing Self. Awareness, if we may thus designate the first stirrings of knowledge beyond the registering of tactile impressions, is fundamentally based on hearing meaningful sounds produced by Self and Others. .. Not solitary perception but social communication is the starting point for a materialist anthropology, provided that we keep in mind that man does not ‘need’ language as a means of communication, or by extension, society as a means of survival, Man is communication and survival. What saves these assumptions from evaporating in the clouds of speculative metaphysics is, I repeat, a dialectical understanding of the verb ‘to be’ in these propositions. Language is not predicated on man (nor is the ‘human mind’ or ‘culture’). Language produces man as man produces language. Production is the pivotal concept of materialist anthropology’ [Fabian 1985 p 162].

‘The element of thought itself – the element of thought’s living expression-language-is of a sensuous nature. The social reality of nature, and human natural science, or the natural science about man, are identical terms’ [Marx 1953:245 f, translation from The Economic and Philosophic Manuscripts of 1844 1964:143 in Fabian 1985 p 163]. [Marx, K. . Die Frühschriften . Siegfried Landshut, ed Stuttgart: A. Kröner – 1964. The Economic and Philosophic Manuscripts of 1844 . Dirk Struik . ed. New York : International] and [Marx, K. and Engels, F. . Marx and Engels: Basic Writings on Politics and Philosophy . Feuer, L. S. . ed. Garden City. New York: Doubleday]

‘Concepts are products of sensuous interaction; they themselves are of a sensuous nature inasmuch as their formation and use is inextricably bound up with language… it is the sensuous nature .. that makes language an eminently temporal phenomenon. Its materiality is based on articulation, on frequencies, pitch, tempo, all of which are realized in the dimension of time… The temporality of speaking .. implies cotemporality of producer and product, speaker and listener, Self and Other’ [Fabian 1985 p. 163-4].

A New Kind of Science

Wolfram concludes that ‘the phenomenon of complexity is quite universal – and quite independent of the details of particular systems’. This complex behaviour does not depend on system features such as the way cellular automata are typically arranged in a rigid array or the fact that they are processed in parallel. Very simple rules of cellular automata generally lead to repetitive behaviour, slightly more complex rules may lead to nested behaviour and even more complex rules may lead to complex behaviour of the system. Complexity of the underlying rules means that the rules are intricate or that their assembly or make-up is complicated. Complexity of the behaviour of the overall system means that little or no regularity is observed.

The surprise is that the threshold for the level of complexity of the underlying rules to generate overall system complexity is relatively low. Conversely, above the threshold, there is no requirement for the rules to become more complex for the overall behaviour of the system to become more complex.

And vice versa: even the most complex of rules are capable of producing simple behaviour. Moreover: the kinds of behaviour at a system level are similar for various kinds of underlying rules. They can be categorized as repetitive, nested, random and ‘including localized structures’. This implies that general principles exist that produce the behaviour of a wide range of systems, regardless of the details of the underlying rules. And so, without knowing every detail of the observed system, we can make fundamental statements about its overall behaviour. Another consequence is that in order to study complex behaviour, there is no need to design vastly complicated computer programs in order to generate interesting behaviour: the simple programs will do [Wolfram, 2002, pp. 105 – 113].
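The three kinds of behaviour can be reproduced with a few lines of code. The sketch below is a minimal illustration, not Wolfram's own program: it assumes the standard numbering of two-colour, nearest-neighbour rules, a row of cells joined into a ring, and a single black cell as initial condition. The function names `step`, `run` and `show` are made up for this example, and rules 250, 90 and 30 are picked as commonly cited examples of repetitive, nested and seemingly random behaviour.

```python
def step(cells, rule):
    """Apply one step of a two-colour, nearest-neighbour rule on a ring of cells."""
    n = len(cells)
    return [
        # The three neighbouring cells form a number 0..7 that selects one bit of the rule number.
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(rule, width=31, steps=15):
    """Return all rows of the evolution, starting from a single black cell in the centre."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = step(row, rule)
        rows.append(row)
    return rows


def show(rows):
    """Render rows as text: '#' for a black cell, '.' for a white cell."""
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)


if __name__ == "__main__":
    for rule in (250, 90, 30):
        print(f"rule {rule}")
        print(show(run(rule)))
```

Printing the three pictures one below the other makes the point of the paragraph visible: the programs differ only in one number, yet the patterns fall into clearly different categories.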

Numbers
Systems used in textbooks for complete analysis may have a limited capacity to generate complex behaviour, because they are chosen specifically for their amenability to complete analysis and are hence of a simple kind. If we ignore the need for analysis and look only at results of computer experiments, even simple ‘number programs’ can lead to complex results.

One difference is that in traditional mathematics, numbers are usually seen as elementary objects, the most important attribute of which is their size. Not so for computers: numbers must be represented explicitly (in their entirety) for any computer to be able to work with them. This means that a computer uses numbers as we do: by reading them or writing them down fully as a sequence of digits. Whereas we humans do this in base 10 (digits 0 to 9), computers typically use base 2 (digits 0 and 1). Operations on these sequences have the effect that the sequences of digits are updated and change shape. In traditional mathematics, this effect is disregarded: the effect of an operation on the digit sequence is considered trivial. Yet this effect, amongst others, is by itself capable of introducing complexity. Even when only the size is represented as a base 2 digit sequence, and a simple operation such as multiplication by fractions or even whole numbers is executed, complex behaviour is possible.
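The effect of a simple operation on a digit sequence is easy to make visible. The following sketch is an illustration under assumptions: the multiplier 3 and the helper name `digit_rows` are chosen for this example and are not taken from the notes. It repeatedly multiplies a number and records its base 2 digits at each step.

```python
def digit_rows(multiplier, steps, start=1):
    """Return the base 2 digit sequences of start, start*multiplier, start*multiplier**2, ..."""
    rows = []
    value = start
    for _ in range(steps):
        rows.append(format(value, "b"))  # explicit digit sequence, as a computer stores it
        value *= multiplier
    return rows


if __name__ == "__main__":
    rows = digit_rows(3, 24)
    width = len(rows[-1])
    for row in rows:
        # Right-align so that equal digit positions line up in columns.
        print(row.rjust(width).replace("0", ".").replace("1", "#"))
```

The size of the number grows in a perfectly regular way, while the picture of its digits shows no obvious regularity.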

‘Indeed, in the end, despite some confusing suggestions from traditional mathematics, we will discover that the general behavior of systems based on numbers is very similar to the general behavior of simple programs that we have already discussed’ [Wolfram, 2002, p 117].

The underlying rules for systems like cellular automata are usually different from those for systems based on numbers. The main reason for that is that rules for cellular automata are always local: the new colour of any particular cell depends only on the previous colour of that cell and its immediate neighbours. In systems based on numbers there is usually no such locality. Despite the absence of locality in the underlying rules of systems based on numbers, it is possible to find the localized structures also seen in cellular automata.

With recursive functions of a form such as f(n) = f(n – f(n – 1)), subtraction and addition alone are sufficient for the development of small programs based on numbers that generate behaviour of great complexity.
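The notes give only the general form of such a recurrence. As one concrete, well-known instance of this form (chosen here for illustration, not necessarily the function meant in the notes), the sketch below computes Hofstadter's Q-sequence, f(n) = f(n – f(n – 1)) + f(n – f(n – 2)) with f(1) = f(2) = 1. The name `q_sequence` is made up for this example.

```python
def q_sequence(count):
    """First `count` terms of f(n) = f(n - f(n-1)) + f(n - f(n-2)), with f(1) = f(2) = 1."""
    f = [0, 1, 1]  # index 0 is unused so that f[n] corresponds to f(n)
    for n in range(3, count + 1):
        # Only subtraction (to find the indices) and addition (to combine the terms) are used.
        f.append(f[n - f[n - 1]] + f[n - f[n - 2]])
    return f[1:count + 1]


if __name__ == "__main__":
    print(q_sequence(40))
```

The first terms look orderly, but further on the values fluctuate irregularly around n/2, which is the kind of behaviour the paragraph refers to.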

‘And almost by definition, numbers that can be obtained by simple mathematical operations will correspond to simple such (symbolic DPB) expressions. But the problem is that there is no telling how difficult it may be to compute the actual value of a number from the symbolic expression that is used to represent it’ [Wolfram, 2002, p 143].

Adding more dimensions to a cellular automaton or a Turing machine does not necessarily mean that the complexity increases.

‘But the crucial point that I will discuss more in Chapter 7 is that the presence of sensitive dependence on initial conditions in systems like (a) and (b) in no way implies that it is what is responsible for the randomness and complexity we see in these systems. And indeed, what looking at the shift map in terms of digit sequences shows us is that this phenomenon on its own can make no contribution at all to what we can reasonably consider the ultimate production of randomness’ [Wolfram, 2002, p. 155].

Multiway Systems
This class of systems is designed so that a system can have multiple states at any one step. The states at some step generate the states at the next step according to the underlying rules. All states thus generated remain in place after they have been generated. Most multiway systems grow very fast or not at all; slow growth is as rare as randomness. The usual behaviour is that repetition occurs, even if it is after a large number of seemingly random states. The threshold seems to be in the rate of growth: if the system is allowed to grow faster then the chances that it will show complex behaviour increase. In the process, however, it generates so many states that it becomes difficult to handle [Wolfram 2002, pp. 204 – 209].

Chapter 6: Starting from Randomness
If systems are started with random initial conditions (up to this point they started with very simple initial conditions such as one black or one white cell), they manage to exhibit repetitive, nested as well as complex behaviour. They are capable of generating a pattern that is partially random and partially locally structured. The point is that the initial conditions may be in part, but not alone, responsible for the existence of complex behaviour of the system [Wolfram 2002, pp. 223 – 230].

Class 1 – the behaviour is very simple and almost all initial conditions lead to exactly the same uniform final state

Class 2 – there are many different possible final states, but all of them consist just of a certain set of simple structures that either remain the same forever or repeat every few steps

Class 3 – the behaviour is more complicated, and seems in many respects random, although triangles and other small-scale structures are essentially always on some level seen

Class 4 – this class of systems involves a mixture of order and randomness: localized structures are produced which on their own are fairly simple, but these structures move around and interact with each other in very complicated ways.

‘There is no way of telling into which class a cellular automaton falls by studying its rules. What is needed is to run them and visually ascertain which class it belongs to’ [Wolfram 2002, Chapter 6, p. 235].

One-dimensional cellular automata of Class 4 are often on the boundary between Class 2 and Class 3, settling in neither of them. There seems to be some kind of transition. They do have characteristics of their own, notably localized structures, that belong to neither Class 2 nor Class 3 behaviour. This behaviour including localized structures can occur in ordinary discrete cellular automata, in continuous cellular automata and in two-dimensional cellular automata.

Sensitivity to Initial Conditions and Handling of Information
Class 1 – changes always die out. Information about a change is always quickly forgotten

Class 2 – changes may persist, but they remain localized, contained in a part of the system. Some information about the change is retained in the final configuration, but it remains local and is therefore not communicated throughout the system

Class 3 – changes spread at a uniform rate throughout the entire system. Change is communicated long-range given that local structures travelling around the system are affected by the change

Class 4 – changes spread sporadically, affecting other cells locally. These systems are capable of communicating long-range, but this happens only when localized structures are affected [Wolfram 2002, p. 252].
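How a change spreads can be measured directly by running two copies of the same system that differ in a single cell and recording where they disagree. The sketch below is a minimal illustration under assumptions: rules 0, 204, 30 and 110 are used as stand-ins for Classes 1 to 4 (an assignment made for this example, not taken from the notes), the cells form a ring, and the names `step` and `difference_positions` are made up.

```python
import random


def step(cells, rule):
    """One step of a two-colour, nearest-neighbour rule on a ring of cells."""
    n = len(cells)
    return [
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def difference_positions(rule, width=101, steps=30, seed=0):
    """For each step, list the positions where two runs differing in one initial cell disagree."""
    rng = random.Random(seed)
    a = [rng.randint(0, 1) for _ in range(width)]
    b = list(a)
    b[width // 2] ^= 1  # the single change whose fate is followed
    history = []
    for _ in range(steps):
        a, b = step(a, rule), step(b, rule)
        history.append([i for i in range(width) if a[i] != b[i]])
    return history


if __name__ == "__main__":
    for rule in (0, 204, 30, 110):
        sizes = [len(d) for d in difference_positions(rule)]
        print(f"rule {rule}: number of differing cells per step {sizes}")
```

For rule 0 the change is forgotten after one step, for rule 204 it stays in one cell forever, and for rule 30 the affected region grows at every step; rule 110 is included so that its less regular spreading can be compared by eye.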

In Class 2 systems, the logical connection between their eventually repetitive behaviour and the fact that no long-range communication takes place is that the absence of long-range communication forces the system to behave as if its size were limited. This behaviour follows a general result that any system of limited size, discrete steps and definite rules will repeat itself eventually.

In Class 3 systems the possible sources of randomness are the randomness present in the initial conditions (in the case of a cellular automaton the initial cells are chosen at random, versus one single black or white cell for simple initial conditions) and the sensitive dependence of the process on initial conditions. Random behaviour in a Class 3 system can also occur if there is no randomness in its initial conditions. There is no a priori difference between the behaviour of most systems started from random initial conditions and those started from simple initial conditions. The dependence of the patterns in the overall behaviour on the initial conditions is limited: although randomness is evidently produced in many cases, its exact shape differs with the initial conditions. This is a form of stability, for, whatever the initial conditions the system has to deal with, it always produces similar recognizable random behaviour as a result.

In Class 4 there must be some structures that can persist forever. If a system is capable of showing a sufficiently complicated structure then eventually, for some initial condition, a moving structure is found also. Moving structures are inevitable in Class 4 systems. It is a general feature of Class 4 cellular automata that with appropriate initial conditions they can mimic the behaviour of all sorts of other systems. The behaviour of Class 4 cellular automata can be diverse and complex even though their underlying rules are very simple (compared to other cellular automata). The way that different structures existing in Class 4 systems interact is difficult to predict. The behaviour resulting from the interaction is vastly more complex than the behaviour of the individual structures, and the effects of the interactions may take a long time (many steps) after the collision to become clear.

It is common to be able to design special initial conditions so that some cellular automaton behaves like another. The trick is that the special initial conditions must then be designed so that the behaviour of the cellular automaton emulated is contained within the overall behaviour of the other cellular automaton.

Attractors
The behaviour of a cellular automaton depends on the specified initial conditions. The behaviour of the system, the sequences shown, gets progressively more restricted as the system develops. The resulting end-state or final configuration can be thought of as an attractor for that cellular automaton. Usually many different but related initial conditions lead to the same end-state: together they form the basin of attraction of that attractor, which is visible to the observer as the final configuration of the system.
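For a small system the attractors and their basins can be enumerated exhaustively. The sketch below is an illustration under assumptions: it uses a two-colour, nearest-neighbour rule on a small ring, treats an attractor as the cycle of states in which an evolution ends (a fixed final configuration being a cycle of length one), and uses made-up names `step`, `attractor_of` and `basins`.

```python
from itertools import product


def step(cells, rule):
    """One step of a two-colour, nearest-neighbour rule on a ring; states are tuples."""
    n = len(cells)
    return tuple(
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    )


def attractor_of(state, rule):
    """Follow the evolution until a state repeats and return the cycle it ends in."""
    seen = {}
    path = []
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = step(state, rule)
    return frozenset(path[seen[state]:])


def basins(rule, width):
    """Map each attractor to the number of initial conditions that lead to it."""
    sizes = {}
    for cells in product((0, 1), repeat=width):
        attractor = attractor_of(cells, rule)
        sizes[attractor] = sizes.get(attractor, 0) + 1
    return sizes


if __name__ == "__main__":
    for rule in (254, 204, 30):
        sizes = sorted(basins(rule, 8).values(), reverse=True)
        print(f"rule {rule}: {len(sizes)} attractors, basin sizes {sizes}")
```

With rule 254, for example, every initial condition containing at least one black cell ends in the all-black configuration, so that one attractor has a basin covering all states but one.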

Chapter 7 Mechanisms in Programs and Nature
Processes happening in nature are complicated. Simple programs are capable of producing this complicated behaviour. To what extent is the behaviour of the simple programs of for instance cellular automata relevant to phenomena observed in nature? ‘It (the visual similarity of the behaviour of cellular automata and natural processes being, DPB) is not, I believe, any kind of coincidence, or trick of perception. And instead what I suspect is that it reflects a deep correspondence between simple programs and systems in nature’ [Wolfram 2002, p 298].

Striking similarities exist between the behaviours of many different processes in nature. This suggests a kind of universality in the types of behaviour of these processes, regardless of the underlying rules. Wolfram suggests that this universality of behaviour encompasses both natural systems’ behaviour and that of cellular automata. If that is the case, studying the behaviour of cellular automata can give insight into the behaviour of processes occurring in nature. ‘For it (the observed similarity in systems behaviour, DPB) suggests that the basic mechanisms responsible for phenomena that we see in nature are somehow the same as those responsible for phenomena that we see in simple programs’ [Wolfram 2002, p 298].

A feature of the behaviour of many processes in nature is randomness. Three sources of randomness in simple programs such as cellular automata exist:
the environment – randomness is injected into the system from outside from the interactions of the system with the environment.
initial conditions – the initial conditions are a source of randomness from outside. The randomness in the system’s behaviour is a transcription of the randomness in the initial conditions. Once the system evolves, no new randomness is introduced from interactions with the environment. The system’s behaviour can be no more random than the randomness of the initial conditions. In practical terms many times isolating a system from any outside interaction is not realistic and so the importance of this category is often limited.
intrinsic generation – simple programs often show random behaviour even though no randomness is injected from interactions with outside entities. Assuming that systems in nature behave like the simple programs, it is reasonable to assume that the intrinsic generation of randomness occurs in nature also. How random is this internally generated randomness really? Based on tests using existing measures for randomness, the generated sequences are at least as random as any process seen in nature. They are not random by a much used definition that classifies behaviour as random only if it can never be generated by a simple procedure such as the simple programs at hand, but this is a conceptual and not a practical definition. A limit to the randomness of numbers generated with a simple program is that the program is bound to repeat itself if it exists in a limited space. Another limit is the set of initial conditions: because the program is deterministic, running a rule twice on the same initial conditions will generate the same sequence and consequently the same random number. Lastly, truncating the generated number will limit its randomness. The clearest sign of intrinsic randomness is its repeatability: in the generated graphs areas will evolve with similar patterns. This is not possible when starting from different initial conditions or with external randomness injected through interaction. The existence of intrinsic randomness allows a discrete system to behave in seemingly continuous ways, because the randomness at a local level averages out the differences in behaviour of individual simple programs or system elements. Continuous systems are capable of showing discrete behaviour and vice versa.
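Intrinsic generation and its repeatability can be shown with a deterministic program that receives no random input at all. The sketch below is a minimal illustration, assuming rule 30 started from a single black cell and reading off the centre column as the generated sequence; the function name `rule30_bits` is made up for this example.

```python
def rule30_bits(count):
    """Return the first `count` centre-column cells of rule 30, started from one black cell."""
    width = 2 * count + 3  # wide enough that the pattern never reaches the edges of the ring
    centre = width // 2
    cells = [0] * width
    cells[centre] = 1
    bits = []
    for _ in range(count):
        bits.append(cells[centre])
        # Rule 30: new cell = left XOR (centre OR right); index -1 wraps around the ring.
        cells = [cells[i - 1] ^ (cells[i] | cells[(i + 1) % width]) for i in range(width)]
    return bits


if __name__ == "__main__":
    first = rule30_bits(64)
    second = rule30_bits(64)
    print("".join(map(str, first)))
    print("identical on a second run:", first == second)
```

No randomness enters from the environment or from the initial conditions, yet the sequence looks irregular, and running the program again reproduces it exactly.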

Constraints
‘But despite this (capability of constraints to force complex behaviour DPB) my strong suspicion is that of all of the examples we see in nature almost none can in the end best be explained in terms of constraints’ [Wolfram 2002, p 342]. Constraints are a way of making a system behave as the observer wants it to behave. To find out which constraints are required to deliver the desired behaviour of a system in nature is in practical terms far too difficult. The reason for that difficulty is that the number of configurations in any space soon becomes very large and it seems impossible for systems in nature to work out which configuration satisfies the constraints at hand, especially if this procedure needs to be performed routinely. Even if possible, the procedure to find the rule that actually satisfies the constraint is so cumbersome and computationally intensive that it seems unlikely that nature uses it to evolve its processes. As a consequence nature seems to work not with constraints but with explicit rules to evolve its processes.

Implications for everyday systems
Intuitively, from the perspective of traditional science, the more complex the system is, the more complex its behaviour is. It has turned out that this is not the case: simple programs are very capable of generating complicated behaviour. In general the explicit (mechanistic) models show behaviour that confirms the behaviour of the corresponding systems in nature, but often diverges in the details.
The traditional way to use a model to make predictions about the behaviour of an observed system is to input a few numbers from the observed system in your model and then to try and predict the system’s behaviour from the outputs of your model. When the observed behaviour is complex (for example if it exhibits random behaviour) this approach is not feasible.
If the model is represented by a number of abstract equations, then it is unlikely (nor was it intended) that the equations describe the mechanics of the system; they only describe its behaviour in whatever way works to make a prediction about its future behaviour. This usually implies disregarding all the details and taking into account only the important factors driving the behaviour of the system.
Using simple programs, there is also no direct relation between the behaviour of the elements of the studied system and the mechanics of the program. ‘.. all any model is supposed to do – whether it is a cellular automaton, a differential equation or anything else – is to provide an abstract representation of effects that are important in determining the behaviour of a system. And below the level of these effects there is no reason that the model should actually operate like the system itself’ [Wolfram 2002, p 366].
The approach in the case of the cellular automata is to then visually compare (compare the pictures of) the outcomes of the model with the behaviour of the system and try and draw conclusions about similarities in the behaviour of the observed system and the created system.

Biological Systems
Genetic material can be seen as the programming of a life form. Its lines contain rules that determine the morphology of a creature via the process of morphogenesis. Traditional Darwinism suggests that the morphology of a creature determines its fitness. Its fitness in turn determines its chances of survival and thus the survival of its genes: the more individuals of the species survive, the bigger its representation in the gene pool. In this evolutionary process, the occurrence of mutations will add some randomness, so that the species continuously searches the genetic space of solutions for the combination of genes with the highest fitness.
‘The problem of maximizing fitness is essentially the same as the problem of satisfying constraints..’ [Wolfram 2002, p 386]. Sufficiently simple constraints can be satisfied by iterative random searches that converge to some solution, but if the constraints are complicated then this is no longer the case.
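An iterative random search on a simple constraint can be sketched as follows. The example is a made-up illustration, not one from the book: the constraint is that every cell on a ring of even length must differ in colour from both its neighbours, a mutation flips one randomly chosen cell, and the mutation is kept only if the number of violations does not increase. The names `violations` and `random_search` are hypothetical.

```python
import random


def violations(cells):
    """Count neighbouring pairs on the ring that have the same colour."""
    n = len(cells)
    return sum(cells[i] == cells[(i + 1) % n] for i in range(n))


def random_search(n=16, max_steps=200_000, seed=1):
    """Flip random cells, keeping a flip only if it does not make things worse.

    `n` must be even, otherwise the constraint cannot be satisfied at all.
    Returns the final cells and the number of steps used.
    """
    rng = random.Random(seed)
    cells = [rng.randint(0, 1) for _ in range(n)]
    current = violations(cells)
    for step_count in range(max_steps):
        if current == 0:
            return cells, step_count
        i = rng.randrange(n)
        cells[i] ^= 1  # random mutation
        new = violations(cells)
        if new <= current:
            current = new  # keep neutral or improving mutations
        else:
            cells[i] ^= 1  # undo a harmful mutation
    return cells, max_steps


if __name__ == "__main__":
    cells, used = random_search()
    print("".join(map(str, cells)), "violations:", violations(cells), "steps:", used)
```

For this simple constraint the search settles on a solution quickly. The same loop applied to a more intricate constraint would typically get stuck or take far longer, which is the point the paragraph makes.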
Biological systems have some tricks to speed up this process, like sexual reproduction to mix up the genetic offspring large scale and genetic differentiation to allow for localized updating of genetic information for separate organs.
Wolfram however considers it ‘implausible that the trillions or so of generations of organisms since the beginning of life on earth would be sufficient to allow optimal solutions to be found to constraints of any significant complexity’ [Wolfram 2002 p 386]. To add insult to injury, the design of many existing organisms is far from optimal and is better described as a make-do, easy and cheap solution that will hopefully not immediately be fatal to its inhabitant.
In that sense not every feature of every creature points at some advantage towards the fitness of the creature: many features are hold-overs from elements evolved at some earlier stage. Many features are so because they are fairly straightforward to make based on simple programs and then they are just good enough for the species to survive, not more and not less. Not the details filled in afterwards, but the relatively coarse features support the survival of the species.
In a short program there is little room for frills: almost any mutation in the program will tend to have an immediate effect on at least some details of the phenotype. If, as a mental exercise, biological evolution is modeled as a sequence of cellular automata, each using the output of the previous one as input, then it is easy to see that the final behaviour of the morphogenesis is quite complex.
It is, however, not required that the program be very long or complicated to generate complexity. A short program with some essential mutations suffices. Why then is there not vastly more complexity in biological systems, given that it is so easy to come by, and why are the forms and patterns usually seen in biological systems fairly simple? ‘My guess is that in essence it (the propensity to exhibit mainly simple patterns DPB) reflects limitations associated with the process of natural selection .. I suspect that in the end natural selection can only operate in a meaningful way on systems or parts of systems whose behaviour is in some sense quite simple’ [Wolfram 2002, pp. 391 – 92]. The reasons are:
when behaviour is complex, the number of actual configurations quickly becomes too large to explore
when the layout of different individuals in a species becomes very different then the details may have a different weight in their survival skills. If the variety of detail becomes large then acting consistently and definitively becomes increasingly difficult
when the overall behaviour of a system is more complex than that of any of its subsystems, then any change will entail a large number of changes to all the subsystems, each with a different effect on the behaviour of the individual systems, and natural selection has no way to pick the relevant changes
if changes occur in many directions, it becomes very difficult for changes to cancel out or find one direction and thus for natural selection to understand what to act on
iterative random searches tend to be slow and make very little progress towards a global optimum.

If a feature is to be successfully optimized for different environments then it must be simple. While it has been claimed that natural selection increases the complexity of organisms, Wolfram suggests that it reduces complexity: ‘.. it tends to make biological systems avoid complexity, and be more like systems in engineering’ [Wolfram 2002, p 393]. The difference is that in engineering systems are designed and developed in a goal-oriented way, whereas in evolution this is done by an iterative random search process.

There is evidence from the fossil record that evolution brings smooth change and relative simplicity of features in biological systems. If this whole evolutionary process points at simple features and smooth changes, then where does the diversity come from? It turns out that a change in the rate of growth dramatically changes the shape of the organism as well as its mechanical operation.

Fundamental Physics
‘My approach in investigating issues like the Second Law is in effect to use simple programs as metaphors for physical systems. But can such programs in fact be more than that? And for example is it conceivable that at some level physical systems actually operate directly according to the rules of a simple program?’ [Wolfram 2002, p. 434].

Out of 256 rules for cellular automata based on two colours and nearest-neighbour interaction, only six exhibit reversible behaviour. This means that the overall behaviour can be reversed if the rules of each automaton are played backwards. Their behaviour, however, is not very interesting. Out of the roughly 7,600 billion (3^27 = 7,625,597,484,987) rules based on three colours and nearest-neighbour interaction, around 1,800 exhibit reversible behaviour, of which a handful show interesting behaviour.
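A reversible rule must never map two different configurations to the same successor. This can be screened for by brute force on small rings of cells, as in the sketch below. It is a necessary check only: a rule that fails on some ring is certainly not reversible, while passing on a handful of small rings is evidence rather than proof. The ring sizes and the names `step`, `is_injective_on_ring` and `reversible_candidates` are assumptions of this example.

```python
from itertools import product


def step(cells, rule):
    """One step of a two-colour, nearest-neighbour rule on a ring; states are tuples."""
    n = len(cells)
    return tuple(
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    )


def is_injective_on_ring(rule, width):
    """True if no two configurations of this width share the same successor."""
    images = {step(cells, rule) for cells in product((0, 1), repeat=width)}
    return len(images) == 2 ** width


def reversible_candidates(max_width=8):
    """Rules that lose no information on any ring of width 1..max_width."""
    return [
        rule
        for rule in range(256)
        if all(is_injective_on_ring(rule, w) for w in range(1, max_width + 1))
    ]


if __name__ == "__main__":
    print(reversible_candidates())
```

The survivors include the identity (rule 204), its colour-inverted version (rule 51) and the plain and inverted shifts (rules 170, 240, 85 and 15), which fits the remark that the behaviour of the reversible two-colour rules is not very interesting.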

The rules can be designed to show reversible behaviour if their pictured behaviour can be mirrored vertically (the graphs generated are usually from top to bottom, DPB): the future then looks the same as the past. It turns out that the pivotal design feature of reversible rules is that existing rules can be adapted to add dependence on the states of neighbouring cells two steps back. Note that this reversibility of rules can also be constructed by using the preceding step only, if, instead of two states, four are allowed. The overall behaviour shown by these rules is reversible, whether the initial conditions be random or simple. It is shown that a small fraction of the reversible rules exhibit complex behaviour for initial conditions that are simple or random alike.
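One common way to realise the dependence on two steps back is sketched below, as an illustration under assumptions rather than as the exact construction in the book: the new colour of a cell is the output of an ordinary two-colour, nearest-neighbour rule applied to the current row, combined by exclusive-or with the colour of the same cell one row earlier. Rule number 37 is used because the notes mention rule 37R further on; the names `step2` and `evolve` are made up.

```python
import random


def step2(prev, cur, rule):
    """Second-order step: ordinary rule applied to `cur`, XORed with the cell two steps back."""
    n = len(cur)
    return [
        ((rule >> (4 * cur[i - 1] + 2 * cur[i] + cur[(i + 1) % n])) & 1) ^ prev[i]
        for i in range(n)
    ]


def evolve(prev, cur, rule, steps):
    """Advance the pair (previous row, current row) by `steps` steps."""
    for _ in range(steps):
        prev, cur = cur, step2(prev, cur, rule)
    return prev, cur


if __name__ == "__main__":
    rng = random.Random(0)
    row0 = [rng.randint(0, 1) for _ in range(40)]
    row1 = [rng.randint(0, 1) for _ in range(40)]

    # Run forward, then swap the last two rows and apply the very same rule again.
    before_last, last = evolve(row0, row1, 37, 100)
    back_a, back_b = evolve(last, before_last, 37, 100)
    print("initial rows recovered:", (back_b, back_a) == (row0, row1))
```

Swapping the last two rows and applying the same rule retraces the evolution step by step, which is what makes the construction reversible for any underlying rule number.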

Whether this reversibility actually happens in real life depends on the theoretical definition of the initial conditions and on our ability to set them up so as to exhibit the reversible overall behaviour. If the initial conditions are exactly right then increasingly complex behaviour towards the future can become simpler when reversed. In practical terms this hardly ever happens, because we tend to design and implement the initial conditions so that they are easy for the experimenter to describe and construct. It seems reasonable that in any meaningful experiment, the activities to set up the experiment should be simpler than the process that the experiment is intended to observe. If we consider these processes as computations, then the computations required to set up the experiment should be simpler than the computations involved in the evolution of the system under review. So if we start with simple initial conditions, trace the evolution back to the more complex ones and then start the evolution of the system anew from there, we will surely find that the system shows increasingly simple behaviour. Finding these complicated, seemingly random initial conditions in any other way than by tracing a reversible process to and from the simple initial conditions seems impossible. This is also the basic argument for the existence of the Second Law of Thermodynamics.

Entropy is defined as the amount of information about a system that is still unknown after measurements on the system. The Second Law states that if the same measurements are repeated at different times, the entropy deduced from them tends to increase over time. More complete measurements, by contrast, lower the entropy: should the observer be able to know with absolute certainty properties such as the positions and velocities of each particle in the system, then the entropy would be zero. According to the definition entropy is the information with which it would be possible to pick out the configuration the system is actually in from every possible configuration of the distribution of particles in the system satisfying the outcomes of the measurements on the system. To increase the number and quality of the measurements involved amounts to the same computational effort as is required for the actual evolution of the system. Once randomness is produced, the actual behaviour of the system becomes independent of the details of the initial conditions of the system. In a reversible system different initial conditions must lead to a different evolution of the system, for else there would be no way of reversing the system behaviour in a unique way. But even though the outcomes from different initial conditions can be much different, the overall patterns produced by the system can still look much the same. To identify the initial conditions from the state of a system at any time implies a computational effort that is far beyond the effort for a practical and meaningful measurement procedure. If a system generates sufficient randomness, then it evolves towards a unique equilibrium whose properties are for practical purposes independent of its initial conditions. In this way it is possible to identify many systems based on a few typical parameters.

With cellular automata it is possible, using reversible rules and starting from a random set of initial conditions, to generate behaviour that increases order instead of tending towards more random behaviour, e.g. rule 37R [Wolfram 2002, pp. 452 – 57]. Its behaviour neither completely settles down to order nor does it generate randomness only. Although it is reversible, it does not obey the Second Law. To be able to reverse this process, however, the experimenter would have to set up initial conditions exactly so as to be able to reach the ‘earlier’ stages, else the information generated by the system is lost. But how can there be enough information to reconstruct the past? All the intermediate local structures that passed on the way to the ‘turning point’ would have to be absorbed by the system on its way back, so that in the end it reaches its original state. No local structure emitted on the way to the turning point can escape.

The evolution in systems is therefore intrinsically? not reversible. All forms of self organisation in cellular automata without reversible rules can potentially occur?

For these reasons it is possible for parts of the universe to get more organised than other parts, even with all laws of nature being reversible. What cellular automata such as 37R show is that it is even possible for closed systems not to follow the Second Law. If the system gets partitioned then within the partitions order might evolve while simultaneously elsewhere in the system randomness is generated. Any closed system will repeat itself at some point in time. Under the usual assumption of ergodicity it must until then visit every possible configuration. Most of these will be or seem to be random. Rule 37R does not produce this ergodicity: it visits only a small fraction of all possible states before repeating.

Conserved Quantities and Continuum Phenomena
Examples are quantities of energy and electric charge. Can the amount of information in exchanged messages be a proxy for a quantity to be conserved?

With nearest-neighbour rules, cellular automata do exhibit this principle (shown as the number of cells of equal colour being conserved in each step), but without showing sufficiently complex behaviour. Using next-neighbour rules, they are capable of showing conservation while also exhibiting interesting behaviour. Even more interesting and random behaviour occurs when block cellular automata are used, especially using three colours instead of two. In this setup the total number of black cells must remain equal for the entire system. On a local level, however, the number of black cells does not necessarily remain the same.
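
A conserved quantity of this kind is easy to check numerically. The sketch below uses rule 184, which is not named in the source; it is chosen here as a standard nearest-neighbour rule that keeps the number of black cells constant on a ring. The ring size, the seed and the 10-cell window are arbitrary assumptions.

```python
import random

def step(cells, rule):
    """One step of an elementary cellular automaton on a ring of cells."""
    n = len(cells)
    return [
        (rule >> (4 * cells[i - 1] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

rng = random.Random(1)
cells = [rng.randint(0, 1) for _ in range(60)]

totals = []  # black cells in the whole system at each step
local = []   # black cells inside a fixed window of 10 cells
for _ in range(50):
    totals.append(sum(cells))
    local.append(sum(cells[:10]))
    cells = step(cells, 184)

globally_conserved = len(set(totals)) == 1
```

The total stays fixed because every black cell either stays put or moves one position to the right; the count inside the window is free to change as cells enter and leave it.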

Multiway systems
In a multiway system all possible replacements are always executed at every step, thereby generating many new strings (i.e. combinations of added up replacements) at each step. ‘In this way they allow for multiple histories for a system. At every step multiple replacements are possible and so, tracing back the different paths from string to string, different histories of the system are possible. This may appear strange, for our understanding of the universe is that it has only one history, not many. But if the state of the universe is a single string in the multiway system, then we are part of that string and we cannot look into it from the outside. Being on the inside of the string it is our perception that we follow just one unique history and not many. Had we been able to look at it from without, then the path that the system followed would seem arbitrary‘ [Wolfram 2002, p 505]. If the universe is indeed a multiway system then another source of randomness is the actual path that its evolution has followed. This randomness component is similar to the outside randomness discussed earlier, but different in the sense that it would occur even if this universe were perfectly isolated from the rest of the universe.
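
The branching of a multiway system can be sketched as follows. The two replacement rules and the start string are hypothetical and serve only to show the mechanism: every rule is applied at every position where it matches, and each application yields a new string.

```python
def multiway_step(strings, rules):
    """Apply every rule at every matching position of every string.

    Returns the set of all strings that can be reached in one step.
    """
    reached = set()
    for s in strings:
        for lhs, rhs in rules:
            start = s.find(lhs)
            while start != -1:
                reached.add(s[:start] + rhs + s[start + len(lhs):])
                start = s.find(lhs, start + 1)
    return reached

RULES = [("A", "AB"), ("B", "A")]  # hypothetical replacement rules

states = {"A"}
history = [states]
for _ in range(3):
    states = multiway_step(states, RULES)
    history.append(states)

# history[2] == {"ABB", "AA"}
# history[3] == {"ABBB", "AAB", "ABA"}
```

The string "ABA" in the last step can be reached both from "ABB" and from "AA", so it has more than one possible history.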

There are sufficient other sources of randomness to explain interesting behaviour in the universe, so this by itself is no sufficient reason to assume the multiway system as a basis for the evolution of the universe. What other reasons can there be to underpin the assumption that the underlying mechanism of the universe is a multiway system? For one, multiway systems are capable of generating a vast number of different possible strings and therefore many possible connections between them, meaning different histories.

However, looking at the sequences of those strings it becomes obvious that these cannot be arbitrary. Each path is defined by a sequence of ways in which replacements by the multiway system’s rules are applied. And each such path in turn defines a causal network. Certain underlying rules have the property that the form of this causal network ends up being the same regardless of the order in which the replacements are applied, and thus regardless of the path that is followed in the multiway system. If the multiway system ends up with the same causal network, then it must be possible to apply a replacement to a string already generated, to end up at the same final state. Whenever paths always eventually converge then there will be similarities on a sufficiently large scale in the obtained causal networks. And so the structure of the causal networks may vary a lot at the level of individual events. But at a sufficiently large level, the individual details will be washed out and the structure of the causal network will be essentially the same: on a sufficiently high level the universe will appear to have a unique history, while the histories on local levels are different.
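
The convergence of paths can be illustrated with a single replacement rule, BA -> AB, applied at a randomly chosen position each time. The rule, the start string and the seeds are assumptions made for this sketch, and the code only checks that all paths end in the same final string; it is a simplified stand-in and does not construct the causal networks themselves.

```python
import random

def rewrite_randomly(s, lhs, rhs, rng):
    """Apply lhs -> rhs at a randomly chosen match until no match is left.

    Returns the whole path of strings, from the start to the final state.
    """
    path = [s]
    while True:
        spots = [i for i in range(len(s)) if s.startswith(lhs, i)]
        if not spots:
            return path
        i = rng.choice(spots)
        s = s[:i] + rhs + s[i + len(lhs):]
        path.append(s)

START = "BABABA"
paths = [
    rewrite_randomly(START, "BA", "AB", random.Random(seed))
    for seed in range(20)
]

finals = {p[-1] for p in paths}                 # end states over all paths
distinct_paths = {tuple(p) for p in paths}      # the routes actually taken
```

The individual routes may differ in their intermediate strings, but every one of them takes six replacements and ends in "AAABBB".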

Processes of perception and analysis
The processes that lead to some forms of behaviour in systems are comparable to some processes that are involved in their perception and analysis. Perception relates to the immediate reception of data via sensory input, analysis involves conscious and computational effort. Perception and analysis are an effort to reduce events in our everyday lives to manageable proportions so that we can use them. Reduction of data happens by ignoring whatever is not necessary for our everyday survival and by finding patterns in the remaining data so that individual elements in the data do not need to be specified. If the data contains regularities then there is some redundancy in the data. The reduction is important for reasons of storage and communication.
This process of perception and analysis is the inverse of the evolving of systems behaviour from simple programs: to identify whatever it is that produces some kind of behaviour. For observed complex behaviour this is not an easy task, for the complex behaviour generated bears no obvious relation to the simple programs or rules that generate them. An important difference is that there are many more ways to generate complex behaviour than there are to recognize the origins of this kind of behaviour. The task of finding the origins of this behaviour is similar to solving problems satisfying a set of constraints.
Randomness is roughly defined as the apparent inability to find a regularity in what we perceive. Absence of randomness implies that redundancies are present in what we see, hence a shorter description can be given of what we see that allows us to reproduce it. In the case of randomness, we would have no choice but to repeat the entire picture, pixel by pixel, to reproduce it. The fact that our usual perceptual abilities do not allow such a description doesn’t mean that no such description exists. It is very much possible that randomness is generated by the repetition of a simple rule a few times over. Does it, then, imply that the picture is not random? From a perceptual point of view it is random, because we are incapable of finding the corresponding rule; from a conceptual point of view this definition is not satisfactory. In the latter case the definition would be that randomness exists if no such rule exists and not only if we cannot immediately discern it. However, finding the short description, i.e. the short program that generates this random behaviour, is not possible in a computationally finite way. Restricting the computational effort to find out whether something is random seems unsatisfactory, because it is arbitrary, it still requires a vast amount of computational work and many systems will not be labelled as random for the wrong reasons. So in the definition of randomness some reference needs to be made to how the short descriptions are to be found. ‘..something could be considered to be random whenever there is essentially no simple program that can succeed in detecting regularities in it‘ [Wolfram 2002, p 556]. In practical terms this means that if, after applying a few simple programs to the behaviour of the observed would-be random generator, no regularities are found in it, then the behaviour of the observed system is considered random.
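
The practical definition can be tried out with a general-purpose compressor standing in for the "simple program that detects regularities". This is an assumption of the sketch, not the procedure used in the source. It compares a periodic bit sequence, pseudo-random bits from Python's `random` module, and the centre column of rule 30 grown from a single black cell, which is produced by repeating one simple rule.

```python
import random
import zlib

def ratio(data):
    """Compressed size divided by original size.

    A value near or above 1 means the compressor found no regularity.
    """
    return len(zlib.compress(data, 9)) / len(data)

def rule30_centre_bits(steps):
    """Centre column of rule 30, started from a single black cell."""
    x = 1 << steps  # row stored as an integer, black cell at bit `steps`
    bits = []
    for _ in range(steps):
        bits.append((x >> steps) & 1)
        x = (x >> 1) ^ (x | (x << 1))  # rule 30: left XOR (centre OR right)
    return bits

def pack(bits):
    """Pack a list of 0/1 values into bytes, eight bits per byte."""
    return bytes(
        sum(b << k for k, b in enumerate(bits[i:i + 8]))
        for i in range(0, len(bits), 8)
    )

N_BITS = 8192
rng = random.Random(0)

periodic = pack([i % 2 for i in range(N_BITS)])
noise = pack([rng.randint(0, 1) for _ in range(N_BITS)])
rule30 = pack(rule30_centre_bits(N_BITS))

ratios = {
    "periodic": ratio(periodic),
    "noise": ratio(noise),
    "rule 30": ratio(rule30),
}
```

The periodic sequence shrinks to a small fraction of its size, while the rule 30 column is about as incompressible for this detector as the pseudo-random bits, even though a very short program generates it.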

Complexity
If we say that something is complex, we say that we have failed to find a simple description for it hence that our powers of perception and analysis have failed on it. How the behaviour is described depends on what purpose the description serves, or how we perceive the observed behaviour. The assessment of the involved complexity may differ depending on the purpose of the observation. Given this purpose, then the shorter the description the less complex the behaviour. The remaining question is whether it is possible to define complexity independent of the details of the methods of perception and analysis. The common opinion traditionally was that any complex behaviour stems from a complex system, but this is no longer the case. It takes a simple program to develop a picture for which our perception can find no simple overall description.
‘So what this means is that, just like every other method of analysis that we have considered, we have little choice but to conclude that traditional mathematics and mathematical formulas cannot in the end realistically be expected to tell us very much about patterns generated by systems like rule 30‘ [Wolfram 2002, p 620].

Human Thinking
Human thinking stands out from other methods of perception in its extensive use of memory, the usage of the huge amount of data that we have encountered and interacted with previously. The way human memory does this is by retrieval based on general notions of similarity rather than exact specifications of whatever memory item it is that we are looking for. Ordinary hashing could not work, because similar experiences summarized by different words might end up being stored in completely different locations, and the relevant piece of information might not be retrieved when the search word involved produces a different hash code. What is needed is to store the information that really sets one piece of information apart from other pieces and to discard all the rest. The effect is that similar pieces of information have the same representation and are thus retrieved if some remote or seemingly remote association occurs with some situation at hand.
This can be achieved with a number of templates that the information is compared with. Only if the remaining signal per layer of nerve cells generates a certain hash code is the information deemed relevant and retrieved. It is very rare that a small variation in the input results in a variation in the output; in other words: quick retrieval (based on the hash code) of similar (not necessarily exactly the same) information is possible. The stored information is pattern based only and not stored as meaningful or a priori relevant information.
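
The contrast between ordinary hashing and template-based retrieval can be sketched as follows. The templates, the threshold and the 8-bit signals are hypothetical, and the code is a toy illustration of the idea, not a model of nerve cells: an exact hash changes completely when one bit of the input changes, whereas the template signature usually stays the same.

```python
import hashlib

# Hypothetical fixed templates that every incoming signal is compared with.
TEMPLATES = [
    (1, 1, 1, 1, 0, 0, 0, 0),
    (0, 0, 0, 0, 1, 1, 1, 1),
    (1, 0, 1, 0, 1, 0, 1, 0),
]
THRESHOLD = 6  # positions (out of 8) that must agree for a template to fire

def exact_hash(signal):
    """Ordinary hash: any change in the input gives an unrelated code."""
    return hashlib.sha256(bytes(signal)).hexdigest()

def template_signature(signal):
    """One bit per template: 1 if the signal is close enough to it."""
    return tuple(
        int(sum(a == b for a, b in zip(signal, template)) >= THRESHOLD)
        for template in TEMPLATES
    )

original = (1, 1, 1, 1, 0, 0, 0, 0)
variant = (1, 1, 1, 1, 0, 0, 0, 1)    # one bit differs from `original`
different = (0, 0, 0, 1, 1, 1, 1, 1)  # a genuinely different signal

same_bucket = template_signature(original) == template_signature(variant)
same_exact = exact_hash(original) == exact_hash(variant)
```

Here `original` and `variant` share the signature (1, 0, 0) and would be retrieved together, while their exact hashes differ; `different` gets the signature (0, 1, 0).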

‘But it is my strong suspicion that in fact logic is very far from fundamental, particularly in human thinking‘ [Wolfram 2002, p 627]. We retrieve connections from memory without too much effort, but perform logical reasoning cumbersomely, going one step after the next, and it is possible that we are in that process mainly using elements of logic that we have learned from previous experience only.

Chapter 11 The Notion of Computation
All sorts of behaviour can be produced by simple rules such as cellular automata. There is a need for a framework for thinking about this behaviour. Traditional science provides this framework only if the observed behaviour is fairly simple. What can we do if the observed behaviour is more complex? The key idea is the notion of computation. If the various kinds of behaviour are generated by simple rules, or simple programs, then the way to think about them is in terms of the computations they can perform: the input is provided by the initial conditions and the output by the state of the system after a number of steps. What happens in between is the computation, in abstract terms and regardless of the details of how it actually works. Abstraction is useful when discussing systems’ behaviour in a unified way, regardless of the different kinds of underlying rules. For even though the internal workings of systems may be different, the computations they perform may be similar. At this pivot it may become possible to formulate principles applying to a variety of different systems and independent of the detailed structures of their rules.
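
Reading a cellular automaton as a computation can be made concrete with a small example. Rule 132 is chosen here for illustration, and the encoding of input and output is an assumption of this sketch: the input is a block of n adjacent black cells, the rule is run for n steps, and the output is whether a single black cell is left. The block shrinks by one cell at each end per step, so a cell survives exactly when n is odd.

```python
def step(cells, rule):
    """One step of an elementary cellular automaton on a ring of cells."""
    n = len(cells)
    return [
        (rule >> (4 * cells[i - 1] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def is_odd(n):
    """Input: n adjacent black cells. Output: is one black cell left?"""
    # White padding keeps the block away from the wrap-around of the ring.
    cells = [0, 0] + [1] * n + [0, 0]
    for _ in range(n):
        cells = step(cells, 132)
    return sum(cells) == 1

answers = [is_odd(n) for n in range(1, 9)]
# answers == [True, False, True, False, True, False, True, False]
```

Nothing in `step` knows about parity; the answer is simply what the evolution of the system leaves behind, which is the sense in which the evolution is the computation.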

‘At some level, any cellular automaton – or for that matter, any system whatsoever – can be viewed as performing a computation that determines what its future behaviour will be‘ [Wolfram, 2002, p 641]. And for some of the cellular automata described it so happens that the computations they perform can be described to a limited extent in traditional mathematical notions. Answers to the question of the framework come from practical computing.

The Phenomenon of Universality
Based on our experience with mechanical and other devices it can be assumed that we need different underlying constructions for different kinds of tasks. The existence of computers has shown that this is not so: a single underlying construction can make a universal system that can be made to execute different tasks by being programmed in different ways. The hardware is the same, different software may be used, programming the computer for different tasks.
This idea of universality is also the basis for programming languages, where instructions from a fixed set are strung together in different ways to create programs for different tasks. Conversely any computer programmed with a program designed in any language can perform the same set of tasks: any two computer systems or languages can be set up to emulate one another. An analogy is human language: virtually any topic can be discussed in any language and given two languages, it is largely possible to always translate between them.
Are natural systems universal as well? ‘The basic point is that if a system is universal, then it must effectively be capable of emulating any other system, and as a result it must be able to produce behavior that is as complex as the behavior of any other system. So knowing that a particular system is universal thus immediately implies that the system can produce behavior that is in a sense arbitrarily complex‘ [Wolfram 2002, p 643].

Just as the intuition that complex behaviour must be generated by complex rules is wrong, so the idea that simple rules cannot be universal is wrong. It is often assumed that universality is a unique and special quality, but now it becomes clear that it is widespread and occurs in a wide range of systems, including the systems we see in nature.

It is possible to construct a universal cellular automaton and to input initial conditions so that it emulates any other cellular automaton and thus produces any behaviour that the other cellular automaton can produce. The conclusion is (again) that nothing new is gained by using rules that are more complex than the rules of the universal cellular automaton, because given it, more complicated rules can always be emulated by the simple rules of the universal cellular automaton and by setting up appropriate initial conditions. Universality can occur in simple cellular automata with two colours and next-neighbour rules, but their operation is more difficult to follow than cellular automata with a more complex set-up.

Emulating other Systems with Cellular Automata
Cellular automata can emulate mobile automata, Turing machines, substitution systems, sequential substitution systems, tag systems, register machines, number systems and simple operators. A cellular automaton can emulate a practical computer as it can emulate registers, numbers, logic expressions and data retrieval. Cellular automata can perform the computations that a practical computer can perform.
And so a universal cellular automaton is universal beyond being capable of emulating all other cellular automata: it is capable of emulating a vast array of other systems, including practical computers. Reciprocally all these other systems can be made to emulate cellular automata, including a universal cellular automaton, and they must therefore themselves be universal, because a universal cellular automaton can emulate a wide array of systems including all possible mobile automata and symbolic systems. ‘By emulating a universal cellular automaton with a Turing machine, it is possible to construct a universal Turing machine‘ [Wolfram 2002, p 665].

‘And indeed the fact that it is possible to set up a universal system using essentially just the operations of ordinary arithmetic is closely related to the proof of Gödel’s Theorem‘ [Wolfram 2002, p 673].

Implications of Universality
All of the discussed systems can be made to emulate each other. All of them have certain features in common. And now, thinking in terms of computation, we can begin to see why this might be the case. They have common features just because they can be made to emulate each other. The most important consequence is that from a computational perspective a very wide array of systems with very different underlying structures are at some level fundamentally equivalent. Although the initial thought might have been that the different kinds of systems would have been suitable for different kinds of computations, this is in fact not the case. They are capable of performing exactly the same kinds of computations.
Computation therefore can be discussed in abstract terms, independent of the kind of system that performs the computation: it does not matter what kind of system we use, any kind of system can be programmed to perform any kind of computation. The results of the study of computation at an abstract level are applicable to a wide variety of actual systems.
To be fair: not all cellular automata are capable of all kinds of computations, but some have large computational capabilities: once past a certain threshold, the set of possible computations will always be the same. Beyond that threshold of universality, no additional complexity of the underlying rules can increase the computational capabilities of the system. Once the threshold is passed, it does not matter what kind of system it is that we are observing.

The rule 110 Cellular Automaton
The threshold for the complexity of the underlying rules required to produce complex behaviour is remarkably low.

Class 4 Behaviour and Universality
Rule 110 with random initial conditions exhibits many localized structures that move around and interact with each other. This is not unique to that rule; this kind of behaviour is produced by all cellular automata of Class 4. The suspicion is that any Class 4 system will turn out to have universal computational capabilities. Of the 256 nearest-neighbour rules with two colours, only four more or less comply (rule 110 and rules 124, 137 and 193, which follow from it by trivial transformations). But for rules involving more colours, more dimensions and / or next-neighbour rules, Class 4 localized structures often emerge. The crux for the existence of class 4 behaviour is the control of the transmission of information through the system.

Universality in Turing Machines and other Systems
The simplest Universal Turing Machine currently known has two states and five possible colours. It might not be the simplest Universal Turing Machine in existence and so the simplest lies between this and two states and two colours, none of which are Universal Turing Machines; there is some evidence that a Turing Machine with two states and three colours is universal, but no proof exists as yet. There is a close connection between the appearance of complexity and universality.

Combinators can emulate rule 110 and have been known to be universal since the 1930s. Other symbolic systems show complex behaviour and may turn out to be universal too.

Chapter 12 The Principle of Computational Equivalence
The Principle of Computational Equivalence applies to any process of any kind, natural or artificial: ‘all processes, whether they are produced by human effort or occur spontaneously in nature, can be viewed as computations‘ [Wolfram 2002, p 715]. This means that any process that follows definite rules can be thought of as a computation. For example the process of evolution of a system like a cellular automaton can be viewed as a computation, even though all it does is generate the behaviour of the system. Processes in nature can be thought of as computations, although the rules they follow are defined by the basic laws of nature and all they do is generate their own behaviour.

Outline of the principle
The principle asserts that there is a fundamental equivalence between many kinds of processes in computational terms.

Computation is defined as that which a universal system as meant here can do. It is possible to imagine another system capable of computations beyond universal cellular automata or other such systems, but such a system can never be constructed in our universe.

Almost all processes that are not obviously simple can be viewed as computations of equivalent sophistication. In other words: there is just one level of computational sophistication and it is achieved by almost all processes that do not seem obviously simple. Universality allows the construction of universal systems that can perform any computation and thus they must be capable of exhibiting the highest level of computational sophistication. From a computational view this means that systems with quite different underlying structures will show equivalence in that rules can be found for them that achieve universality and that can thus exhibit the same level of computational sophistication.
The rules need not be more complicated themselves to achieve universality hence a higher level of computational sophistication. On the contrary: many simple though not overly simple rules are capable of achieving universality hence computational sophistication. This property should furthermore be very common and occur in a wide variety of systems, both abstract and natural. ‘And what this suggests is that a fundamental unity exists across a vast range of processes in nature and elsewhere: despite all their detailed differences every process can be viewed as corresponding to a computation that is ultimately equivalent in its sophistication‘ [Wolfram 2002, p 719].

We could identify all of the existing processes, engineered or natural, and observe their behaviour. It will surely become clear that in many instances it will be simple repetitive or nested behaviour. Whenever a system shows vastly more complex behaviour, the Principle of Computational Equivalence then asserts that the rules underlying it are universal. Conversely: given some rule it is usually very complicated to find out if it is universal or not.

If a system is universal then it is possible, by choosing the appropriate initial conditions, to perform computations of any sophistication. No guarantee exists, however, that some large portion of all initial conditions result in behaviour of the system that is more interesting and not merely obviously simple. But even rules that are by themselves not complicated, given simple initial conditions, may produce complex behaviour and may well produce processes of computational sophistication.

Introduction of a new law to the effect that no system can carry out explicit computations that are more sophisticated than those that can be carried out by systems such as cellular automata or Turing Machines. Almost all processes except those that are obviously simple achieve the limit of Computational Equivalence, implying that of all possible systems with behaviour that is not obviously simple an overwhelming fraction are universal. Every process in this way can be thought of as a ‘lump of computation’.

The Validity of the Principle
The principle is counter-intuitive from the perspective of traditional science and there is no proof for it. Cellular automata are fundamentally discrete. It appears that systems in nature are more sophisticated than these computer systems because they should from a traditional perspective be continuous. But the presumed continuousness of these systems itself is an idealization required by traditional methods. As an example: fluids are traditionally described by continuous models. However, they consist of discrete particles and their computational behaviour must be that of a system of discrete particles.
‘It is my strong suspicion that at a fundamental level absolutely every aspect of our universe will in the end turn out to be discrete. And if this is so, then it immediately implies that there cannot ever ultimately be any form of continuity in our universe that violates the Principle of Computational Equivalence’ [Wolfram 2002, p 730]. In a continuous system, the computation is not local and every digit has in principle infinite length. And in the same vein: ‘.. it is my strong belief that the basic mechanisms of human thinking will in the end turn out to correspond to rather simple computational processes’ [Wolfram 2002, p 733].

Once a system reaches a relatively low threshold of complexity then any real system must exhibit the same level of computational sophistication. This means that observers will tend to be computationally equivalent to the observed systems. As a consequence they will consider the behaviour of such systems complex.

Computational Irreducibility
Scientific triumphs have in common that almost all of them are based on finding ways to reduce the amount of computational work needed to predict how a system will behave. Most of the time, the idea is to derive a mathematical formula that allows one to determine what the outcome of the evolution of the system will be without having to trace its every step explicitly. There is a great shortage of formulas describing all sorts of known and common systems’ behaviour.
Traditional science takes as a starting point that many of the evolutionary steps performed by a system are an unnecessarily large effort. It attempts to shortcut this process and find an outcome with less effort. However, describing the behaviour of systems exhibiting complex behaviour is a difficult task. In general not only the rules for the system are required to do that, but its initial conditions as well. The difficulty is that, knowing the rules and the initial conditions, it might still take an irreducible amount of time to predict its behaviour. When computational irreducibility exists there is no other way to find out how a system will behave but to go through its every evolutionary step up to the required state. The predicting system can only outrun, with less effort, the actual system whose future we are trying to predict if its computations are more sophisticated. This idea violates the Principle of Computational Equivalence: every system whose behaviour is not obviously simple is computationally exactly equivalent. So predicting models cannot be more sophisticated than the systems they intend to describe. And so for many systems no systematic predictions can be done, their process of evolution cannot be shortcut and they are computationally irreducible. If the behaviour of a system is simple, for example repetitive or nested, then the system is computationally reducible. This reduces the potential of traditional science to advance in studying systems of which the behaviour is not quite simple.
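
The difference between tracing every step and using a shortcut can be shown for a system with nested behaviour. Rule 90 is chosen here as an illustration; it is not discussed at this point in the source. Started from a single black cell it reproduces Pascal's triangle modulo 2, so the colour of any cell follows from a direct formula: the binomial coefficient C(t, k) is odd exactly when k and t - k share no binary digit. The function names are hypothetical.

```python
def simulate_rule90(t):
    """Explicit evolution: run t steps from a single black cell.

    Returns the set of positions of the black cells at step t.
    """
    black = {0}
    for _ in range(t):
        # A cell is black next iff exactly one of its neighbours is black now.
        black = {x - 1 for x in black} ^ {x + 1 for x in black}
    return black

def rule90_cell(t, x):
    """Shortcut: colour of cell x at step t without the steps in between."""
    if abs(x) > t or (t + x) % 2:
        return 0
    k = (t + x) // 2
    return int((k & (t - k)) == 0)  # C(t, k) mod 2

shortcut_matches = all(
    simulate_rule90(t) == {x for x in range(-t, t + 1) if rule90_cell(t, x)}
    for t in range(64)
)
```

For rule 90 the shortcut answers a question about step t in a handful of operations, which is what computational reducibility means. For a rule such as rule 30 no comparable formula is known, and the explicit loop is the only available method.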

To make use of mathematical formulas for instance only makes sense if the computation is reducible, hence the system’s behaviour is relatively simple. Science must constrain itself to the study of relatively easy systems because only these are computationally reducible. This is not the case for the new kind of science, because it relies not on formulas but on pictures of the evolution of systems instead. The observed systems may very well be computationally irreducible. They are not a preamble to the actual ‘real’ predictions based on formulas, but they are the real thing themselves. A universal system can emulate any other system, including the predictive model. Using shortcuts means trying to outrun the observed system with another that takes less effort. Because the latter can be emulated by the former (as it is universal), this means that the predictive model must be able to outrun itself. This is relevant because universality abounds in systems.

As a consequence of computational irreducibility there can be no easy theory for everything, there will be no formula that predicts any and every observable process or behaviour that seems complex to us. To deduce the consequences of these simple rules that generate complex behaviour will require irreducible amounts of computational effort. Any system can be observed but there cannot be a guarantee that a model of that system exists that accurately describes or predicts how the observed system will behave.

The Phenomenon of Free Will
Though a system may be governed by definite underlying laws, its behaviour may not be describable by reasonable laws. This involves computational irreducibility, because the only way to find out how the system will behave is to actually evolve the system. There is no other way to work out this behaviour more directly.
Analogous to this is the human brain: although definite laws underpin its workings, because of irreducible computation no way exists to derive an outcome via reasonable laws. It then seems that, although definite rules underpin it, the system behaves in a way that does not seem to follow any reasonable law at all in doing this or that. And yet the underpinning rules are definite, without any freedom, while allowing the system’s behaviour some form of apparent freedom. ‘For if a system is computationally irreducible this means that there is in effect a tangible separation between the underlying rules for the system and its overall behaviour associated with the irreducible amount of computational work needed to go from one to the other. And it is in this separation, I believe, that the basic origin of the apparent freedom we see in all sorts of systems lies – whether those systems are abstract cellular automata or actual living brains‘ [Wolfram 2002, p 751].
The main issue is that it is not possible to make predictions about the behaviour of a system, for if we could then the behaviour would be determined in a definite way and cannot be free. But now we know that definite simple rules can lead to unpredictability: the ensuing behaviour is so complex that it seems free of obvious rules. This occurs as a result of the evolution of the system itself and no external input is required to derive that behaviour.
‘But this is not to say that everything that goes on in our brains has an intrinsic origin. Indeed, as a practical matter what usually seems to happen is that we receive external input that leads to some train of thought which continues for a while, but then dies out until we get more input. And often the actual form of this train of thought is influenced by the memory we have developed from inputs in the past – making it not necessarily repeatable even with exactly the same input‘ [Wolfram 2002, pp 752 – 53].

Undecidability and Intractability
Undecidability, as in Gödel’s Theorem and the Entscheidungsproblem, is not a rare case: it can be achieved with very simple rules and it is very common. For every system that seems to exhibit complex behaviour, its evolution is likely to be undecidable. Finite questions about a system can ultimately be answered by finite computation, but the computations may have an amount of difficulty that makes them intractable. To assess the difficulty of a computation, one assesses the amount of time it takes, how big the program is that runs it and how much memory it takes. However, it is often not knowable if the program that is used for the computation is the most efficient for the job. Working with very small programs it becomes possible to assess their efficiency.

Implications for Mathematics and its Foundations
Applications in mathematics. In nature and in mathematics simple laws govern complex behaviour. Mathematics has increasingly distanced itself from correspondence with nature. Universality in an axiom system means that any question about the behaviour of any other universal system can be encoded as a statement in the axiom system and that if the answer can be established in the other system then it can also be given by giving a proof in the axiom system. Every axiom system currently in use in mathematics is universal: it can in a sense emulate every other system.

Intelligence in the Universe
Human beings have no specific or particular position in nature: their computational skills do not vary vastly from the skills of other natural processes.

‘But the question then remains why when human intelligence is involved it tends to create artifacts that look much simpler than objects that just appear in nature. And I believe the basic answer to this has to do with the fact that when we as humans set up artifacts we usually need to be able to foresee what they will do – for otherwise we have no way to tell whether they will achieve the purposes we want. Yet nature presumably operates under no such constraint. And in fact I have argued that among systems that appear in nature a great many exhibit computational irreducibility – so that in a sense it becomes irreducibly difficult to foresee what they will do‘ [Wolfram 2002, p 828].

A firm as such is not a complicated thing: it takes one question to know what it is (answer: a firm) and another to find out what it does (answer: ‘we manufacture coffee cups’). More complicated is the answer to the question ‘how do you make coffee cups?’, for this requires considerable explanation. And yet more complicated is the answer to ‘what makes your firm stand out from other coffee cup manufacturing firms?’. The answer to that will have to involve statements about ‘how we do things around here’, the intricate details of which might have taken you years to learn and practice, and would now take long to explain.

A system might be suspected to be built for a purpose if it is the minimal configuration for that purpose.

‘It would be most satisfying if science were to prove that we as humans are in some fundamental way special, and above everything else in the universe. But if one looks at the history of science many of its greatest advances have come precisely from identifying ways in which we are not special – for this is what allows science to make ever more general statements about the universe and the things in it‘ [Wolfram 2002, p 844].

‘So this means that there is in the end no difference between the level of computational sophistication that is achieved by humans and by all sorts of other systems in nature and elsewhere’ [Wolfram 2002, p 844].

Mikhailovsky and Levic: Entropy, Information and Complexity or Which Aims the Arrow of Time?

Below is my summary of a somewhat quirky article by George E. Mikhailovsky and Alexander P. Levic, published by MDPI. It suggests a mathematical model for the variation of complexity, using conditional local maximum entropy for (hierarchical) interrelated objects or elements in systems. I am not able to verify whether this model makes sense mathematically. However, I find its logic appealing because it establishes a relation between entropy, information and complexity. I need this to be able to assess the complexity of my systems, i.e. businesses. It is also based on, or akin to, ‘proven technology’ (i.e. existing models for these concepts in a mathematical grid) and it seems to be more than a wild guess. Additionally, it implies relations between hierarchical levels and objects of a system, using a resources view. Lastly, and connected to this last issue, it addresses the ever-intriguing matter of irreversibility and the concept of time on different scales, and the mutual relation to time at a macroscopic level, i.e. how we experience it here and now.

This quote below from the last paragraph is a clue of why I find it important: “The increase of complexity, according to the general law of complification, leads to the achievement of a local maximum in the evolutionary landscape. This gets a system into a dead end where the material for further evolution is exhausted. Almost everybody is familiar with this, watching how excessive complexity (bureaucratization) of a business or public organization leads to the situation when it begins to serve itself and loses all potential for further development. The result can be either a bankruptcy due to a general economic crisis (external catastrophe) or, for example, self-destruction or decay into several businesses or organizations as a result of the loss of effective governance and, ultimately, competitiveness (internal catastrophe). However, dumping a system with such a local maximum, the catastrophe gives it the opportunity to continue the complification process and potentially achieve a higher peak.”

According to the second law, entropy increases in isolated systems (Carnot, Clausius). Entropy is the first physical quantity that varies asymmetrically in time. The H-theorem of Ludwig Boltzmann shows how the irreversibility of entropy increase is derived from the reversibility of microscopic processes obeying Newtonian mechanics. He arrived at the formula:

(1) $S = k_B \ln W$

$S$ is entropy

$k_B$ is the Boltzmann constant, equal to $1.38 \times 10^{-23}$ J/K

$W$ is the number of microstates related to a given macrostate

This equation relates values at different levels or scales in a system hierarchy, and yields an irreversible parameter as a result.

In 1948, Shannon and Weaver (The Mathematical Theory of Communication) suggested a formula for informational entropy:

(2) $H = -K \sum_i p_i \log p_i$

$K$ is an arbitrary positive constant

$p_i$ is the probability of possible event $i$

If we define the events as microstates, consider them equally probable and choose the nondimensional Boltzmann constant for $K$, the Shannon Equation (2) becomes the Boltzmann Equation (1). The Shannon equation is a generalisation of the Boltzmann equation with different probabilities for the letters making up a message (different microstates leading to a macrostate of a system). Shannon says (p 50): “Quantities of the form $H = -K \sum_i p_i \log p_i$ (the constant K merely amounts to a choice of a unit of measure) play a central role in information theory as measures of information, choice and uncertainty. The form of H will be recognized as that of entropy as defined in certain formulations of statistical mechanics, where $p_i$ is the probability of a system being in cell i of its phase space.” Note that no reference is made to a difference between information and information entropy. Maximum entropy exists when the probabilities $p_i$ in all locations are equal and the information of the system (message) is in maximum disorder. Relative entropy is the ratio of H to maximum entropy.
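The reduction of Equation (2) to Equation (1) for equally probable microstates, and the notion of relative entropy, can be checked numerically. The following is a minimal sketch, assuming natural logarithms and K = 1; the function names and the skewed example distribution are hypothetical and chosen only for illustration.

```python
import math

def shannon_entropy(probs, k=1.0):
    """H = -K * sum(p_i * ln p_i); events with p_i == 0 contribute nothing."""
    return -k * sum(p * math.log(p) for p in probs if p > 0)

def relative_entropy(probs):
    """Ratio of H to the maximum entropy ln(n) of n equally probable events."""
    return shannon_entropy(probs) / math.log(len(probs))

W = 8                                   # number of microstates (illustrative)
uniform = [1.0 / W] * W                 # all microstates equally probable
skewed = [0.65, 0.2, 0.05, 0.04, 0.03, 0.01, 0.01, 0.01]  # made-up distribution

# For equal probabilities, H equals ln W, i.e. Equation (1) with k_B set to 1.
print(shannon_entropy(uniform), math.log(W))
# Relative entropy is 1 at maximum disorder and lower for any other distribution.
print(relative_entropy(uniform), relative_entropy(skewed))
```

With unequal probabilities the entropy falls below ln W, which is the sense in which Equation (2) generalises Equation (1).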

The meaning of these values has proven difficult, because the concept of entropy is generally seen as something negative, whereas the concept of information is seen as positive. This is an example by Mikhailovsky and Levic: “A crowd of thousands of American spectators at an international hockey match chants during the game “U-S-A! U-S-A!” We have an extremely ordered, extremely degenerated state with minimal entropy and information. Then, as soon as the period of the hockey game is over, everybody is starting to talk to each other during a break, and a clear slogan is replaced by a muffled roar, in which the “macroscopic” observer finds no meaning. However, for the “microscopic” observer who walks between seats around the arena, each separate conversation makes a lot of sense. If one writes down all of them, it would be a long series of volumes instead of three syllables endlessly repeated during the previous 20 minutes. As a result, chaos replaced order, the system degraded and its entropy sharply increased for the “macro-observer”, but for the “micro-observer” (and for the system itself in its entirety), information fantastically increased, and the system passed from an extremely degraded, simple, ordered and poor information state into a much more chaotic, complex, rich and informative one.” In summary: the level of order depends on the observed level of hierarchy. Additionally, the value attributed to order has changed over time, and so the qualifications ‘bad’ and ‘good’ used for entropy and information respectively may have changed as well.

A third concept connected to order and chaos is complexity. The definition of algorithmic complexity K(x) of the final object x is the length of the shortest computer program that prints a full, but not excessive (i.e. minimal), binary description of x and then halts. The equation for Kolmogorov complexity is:

(3) $K(x) = l_{pr} + \min(l_x)$

$D$ is the set of all possible descriptions $d_x$ in range $x$

$L$ is the set of equipotent lengths $l_x$ of the descriptions $d_x$ in $D$

$l_{pr}$ is the binary length of the printing algorithm mentioned above

In case x is not binary, but some other description using n symbols, then:

(4) $K(x) = l_{pr} + \min\left(\frac{1}{n} \sum_i p_i \log_2 p_i\right)$

Mikhailovsky and Levic conclude that, although Equation (4) for complexity is not completely equivalent to Equations (1) and (2), it can be regarded as their generalization in a broader sense.
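Algorithmic complexity itself is not computable, but the length of a compressed description gives a rough, computable upper-bound proxy for it. The sketch below assumes zlib compression as that proxy (my choice, not something the article uses) and compares an endlessly repeated chant with a sequence of pseudo-random bytes of the same length.

```python
import random
import zlib

def compressed_length(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data: a crude upper-bound
    stand-in for the shortest description, NOT the true K(x)."""
    return len(zlib.compress(data, 9))

chant = b"U-S-A! " * 600                 # highly ordered, repetitive message
rng = random.Random(0)                   # fixed seed so the run is repeatable
noise = bytes(rng.randrange(256) for _ in range(len(chant)))  # no visible pattern

print(len(chant), compressed_length(chant))
print(len(noise), compressed_length(noise))
```

The repeated chant shrinks to a small fraction of its length, while the pseudo-random bytes hardly compress at all, which matches the intuition that an ordered state has a short description and a disordered one does not.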

Now we define an abstract representation of the system as a category that combines a class of objects and a class of morphisms. Objects of the category explicate (make explicit) the system’s states and morphisms define admissible transitions from one state to another. Categories with the same objects but differing morphisms are different and describe different systems. For example, a system with transformations as arbitrary conformities differs from a system where the same set of objects transforms only one-to-one. Processes taking place in the first system are richer than in the latter because the first allows transitions between states of a variable number of elements, while the second requires the same number of elements in different states.

Let us take a system described by category S and the system states X and A, identical to objects X and A from S. Invariant I {X in S} (A) is a number of morphisms from X to A in the category S preserving the structure of objects. In the language of systems theory, invariant I is a number of transformations of the state X into the state A, preserving the structure of the system. We interpret the structure of the system as its “macrostate”. Transformations of the state X into the state A will be interpreted as ways of obtaining the state A from state X, or as “microstates”. Then, the invariant of a state is the number of microstates preserving the macrostate of the system, which is consistent with the Boltzmann definition of entropy in Equation (1). More strictly: we determine generalized entropy of the state A of system S (relating to the state X of the same system) as a value:

(5) $H_X(A) = \ln\left( I_X^{Q}(A) \, / \, I_X^{\tilde{Q}}(A) \right)$

$I_X^{Q}(A)$ is the number of morphisms from set X into set A in the category of structured sets $Q$, and $I_X^{\tilde{Q}}(A)$ is the number of morphisms from set X into set A in the category of structureless sets $\tilde{Q}$ with the same cardinality (number of dimensions) as in category $Q$, but with an “erased structure”. In particular cases, generalized entropy has the usual “Boltzmann” or, if you like, “Shannon” look (example given). This represents the ratio of the number of transformations preserving the structure to the total number of transformations (those in $\tilde{Q}$), which can be interpreted as the probability of the formation of the state with a given structure. Statistical entropy (1), information (2) and algorithmic complexity (4) are only a few possible interpretations of Equation (5). It is important to emphasize that the formula for the generalized entropy is introduced with no statistical or probabilistic assumptions and is valid for any number of elements of the system, large or small.
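To see how a count of morphisms can take on a “Shannon” look, here is a small brute-force sketch. It rests on assumptions of mine that are not taken from the article: X and A are the same finite set, the “structure” is a partition of that set into blocks, and a structure-preserving morphism is a map that keeps every element inside its own block. The function names are hypothetical.

```python
import math
from itertools import product

def count_maps(block_sizes):
    """Enumerate all maps of an n-element set to itself and count those
    that send every element to an element of the same block."""
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    total = preserved = 0
    for images in product(range(n), repeat=n):      # one tuple per map
        total += 1
        if all(labels[x] == labels[y] for x, y in enumerate(images)):
            preserved += 1
    return preserved, total

def log_ratio(block_sizes):
    """ln(structured morphisms / structureless morphisms), as in Equation (5)."""
    preserved, total = count_maps(block_sizes)
    return math.log(preserved / total)

def shannon(block_sizes):
    """Shannon entropy (natural log) of the block-size frequencies n_i / n."""
    n = sum(block_sizes)
    return -sum((s / n) * math.log(s / n) for s in block_sizes)

blocks = [2, 2]                         # a 4-element set split into two blocks
preserved, total = count_maps(blocks)
print(preserved, total)
print(log_ratio(blocks), -sum(blocks) * shannon(blocks))
```

Under these assumptions the logarithm of the ratio equals minus n times the Shannon entropy of the block frequencies. Because the ratio is at most 1 its logarithm is never positive, so the sign convention of Equation (5) is worth checking against the original article.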

The amount of “consumed” (plus “lost”) resources determines the “reading” of the so-called “metabolic clock” of the system. Construction of this metabolic clock implies the ability to count the number of elements replaced in the system. Therefore, a non-trivial application of the metabolic approach requires the ability to compare one structured set to another. This ability comes from a functorial method of comparing structures that offers system invariants as a generalization of the concept “number of elements” for structureless sets. Note that a system that consumes several resources exists in several metabolic times. The entropy of the system is an “averager” of metabolic times, and entropy increases monotonically with the flow of each metabolic time, i.e., entropy and the metabolic times of a system are linked uniquely and monotonically, and can be calculated one through the other. This relationship is given by:

(7)

Here, H is structural entropy, L ≡ {L1 , L2 , . ., Lm} the set of metabolic times (resources) of system and Lagrange multipliers of the variational problem on the conditional maximum of structural entropy, restricted by flows of metabolic times. For the structure of sets with partitions where morphisms are preserving the partition mapping (or their dual compliances), the variational problem has the form:

(8)

It was proven that the change of H with respect to the metabolic times is ≥ 0, i.e., structural entropy monotonically increases (or at least does not decrease) in the metabolic time of the system, or entropy “production” does not decrease along a system’s trajectory in its state space (the theorem is analogous to the Boltzmann H-theorem for physical time). Such a relationship between generalized entropy and resources can be considered a heuristic explanation of the origin of the logarithm in the dependence of entropy on the number of transformations: with logarithms the relationship between entropy and metabolic times becomes a power law rather than an exponential one, which in turn simplifies the formulas that involve both parameterizations of time. Therefore, if the system's metabolic time is, generally speaking, a multi-component magnitude and level-specific (relating to hierarchical levels of the system), then entropy time, “averaging” the metabolic times of the levels, parameterizes system dynamics and returns the notion of time to its usual universality.

The class of objects that explicates a system of categories can be presented as a system’s state space. An alternative to the postulation of the equations of motion in theoretical physics, biology, economy and other sciences is the postulation of extremal principles that generate variability laws of the systems studied. What needs to be extreme in a system? The category-functorial description gives a “natural” answer to this question, because category theory has a systematic method to compare system states. The possibility to compare the states by the strength of their structure allows one to offer an extremal principle for systems’ variation: from a given state, the system goes into the state having the strongest structure. According to the method, this function is the number of transformations admissible by the structure of the system. However, a more usual formulation of the extremal principle can be obtained if we consider the monotonic function of the specific amount of admissible transformations that we defined as the generalized entropy of the state; namely, a given state of the system goes into a state for which the generalized entropy is maximal within the limits set by available resources. A generalized category-theoretic entropy makes it possible not to guess or postulate the objective functions, but to calculate them strictly from the number of morphisms (transformations) allowed by the system structure.

Let us illustrate this with an example. Consider a very simple system consisting of a discrete space of 8 × 8 (like a chess board without the division of the fields into black and white) and eight identical objects distributed arbitrarily over these 64 elements of the space (cells). These objects can move freely from cell to cell, realizing two degrees of freedom each. The number of degrees of freedom of the system is twice the number of objects due to the two-dimensionality of our space. We will consider a particular distribution of eight objects over the 64 elements of our space (cells) as a system state that is equivalent in this case to a “microstate”. Thus, the number of possible states equals the number of combinations of eight objects out of 64: $W_8 = 64!/((64-8)! \, 8!) = 4{,}426{,}165{,}368$.

Consider now more specific states in which seven objects have arbitrary positions, while the position of the eighth one is completely determined by the positions of one, a few or all of the others. In this case, the number of degrees of freedom reduces from 16 (eight times two) to 14 (seven times two), and the number of admissible states decreases to the number of combinations of seven objects out of 64: $W_7 = 64!/((64-7)! \, 7!) = 621{,}216{,}192$.

Let us name a set of these states a “macrostate”. Notice that the number of combinations of k elements from n calculated by the formula

(9) $\binom{n}{k} = \frac{n!}{k! \, (n-k)!}$

is the cumulative number of “microstates” for “macrostates” with 16, 14, 12, and so on, degrees of freedom. Therefore, to reveal the number of “microstates” related exclusively to a given “macrostate”, we have to subtract W7 from W8 , W6 from W7, etc. These figures make quite clear that our simple model system being left to itself will inevitably move into a “macrostate” with more degrees of freedom and a larger number of admissible states, i.e., “microstates”. Two obvious conclusions immediately follow from these considerations:

• It is far more probable to find a system in a complex state than in a simple one.

• If a system came to a simple state, the probability that the next state will be simpler is immeasurably less than the probability that the next state will be more complicated.
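The counts in this chess-board example are easy to reproduce. The following is a minimal sketch using Python's `math.comb` (Python 3.8 or later); the helper names are hypothetical, and the “exclusive” count simply applies the subtraction described above.

```python
from math import comb

CELLS = 64  # the 8 x 8 discrete space of the example

def cumulative_states(k, cells=CELLS):
    """Number of ways to place k identical objects on the cells: formula (9)."""
    return comb(cells, k)

def exclusive_states(k, cells=CELLS):
    """Microstates belonging only to the macrostate with k free objects (k >= 1),
    obtained by subtracting the count for k - 1 free objects."""
    return comb(cells, k) - comb(cells, k - 1)

for k in range(8, 4, -1):
    # columns: free objects, degrees of freedom, cumulative count, exclusive count
    print(k, 2 * k, cumulative_states(k), exclusive_states(k))

# How many times larger W8 is than W7.
print(cumulative_states(8) / cumulative_states(7))
```

Each step down in degrees of freedom shrinks the number of admissible states by a sizeable factor, which is why a system left to itself is far more likely to be found in the macrostate with more degrees of freedom.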

This defines a practically irreversible increase of entropy, information and complexity, leading in turn to the irreversibility of time. For space 16 × 16, we could speak about practical irreversibility only, when reversibility is possible, although very improbable, but for real molecular systems where the number of cells is commensurate with Avogadro’s number ($6.02 \times 10^{23}$), irreversibility becomes practically absolute. This absolute irreversibility leads to the absoluteness of the entropy extremal principle, which, as shown above, can be interpreted in an information or a complexity sense. This extremal principle implies a monotonic increase of state entropy along the trajectory of the system variation (sequence of its states). Thus, the entropy values parametrize the system changes. In other words, the system’s entropy time does appear. The interval of entropy time (i.e., the increment of entropy) is the logarithm of the value that shows how many times the number of non-equivalent transformations admissible by the structure of the system has changed.

Injective transformations ordering the structure are unambiguous nesting. In other words, the evolution of systems, according to the extremal principle, flows from sub-objects to objects: in the real world, where the system is limited by the resources, a formalism corresponding to the extremal principle is a variation problem on the conditional, rather than global, extremum of the objective function. This type of evolution could be named conservative or causal: the achieved states are not lost (the sub-object “is saved” in the object like some mutations of Archean prokaryotes are saved in our genomes), and the new states occur not in a vacuum, but from their “weaker” (in the sense of ordering by the strength of structure) predecessors.

Therefore, the irreversible flow of entropy time determines the “arrow of time” as a monotonic increase of entropy, information, complexity and freedom as the number of its realized degrees up to the extremum (maximum) defined by resources in the broadest sense and especially by the size of the system. On the other hand, available system resources that define a sequence of states could be considered as resource time that, together with entropy time, explicates the system’s variability as its internal system time.

We formulated and proved a far more general extremal principle applicable to any dynamic system (i.e., described by categories with morphisms), including isolated, closed, opened, material, informational, semantic, etc., ones (rare exceptions are static systems without morphisms, hence without dynamics described exceptionally by sets, for example a perfect crystal in a vacuum, a memory chip with a database backup copy or any system at a temperature of absolute zero). The extremum of this general principle is maximum, too, while the extremal function can be regarded as either generalized entropy, or generalized information, or algorithmic complexity. Therefore, before the formulation of the law related to our general extremal principle, it is necessary to determine the extremal function itself.

In summary, our generalized extremal principle is the following: the algorithmic complexity of the dynamical system, either being conservative or dissipative, described by categories with morphisms, monotonically and irreversibly increases, tending to a maximum determined by external conditions. Accordingly, the new law, which is a natural generalization of the second law of thermodynamics for any dynamic system described by categories, can be called the general law of complification:

Any natural process in a dynamic system leads to an irreversible and inevitable increase in its algorithmic complexity, together with an increase in its generalized entropy and information.

The differences between this new law and the existing laws of nature are:

1) It is asymmetric with respect to time;

2) It is statistical: chances are larger that a system becomes more complex than that it will simplify over time. These chances for the increase of complexity grow with the increase of the size of the system, i.e. the number of elements (objects) in it;

The vast majority of forces considered by physics and other scientific disciplines could be determined as horizontal or lateral ones in a hierarchical sense. They act inside a particular level of hierarchy: for instance, quantum mechanics at the micro-level, Newton’s laws at the macro-level and relativity theory at the mega-level. The only obvious exception is thermodynamic forces when the movement of molecules at the micro-level (or at the meso-level if we consider the quantum mechanical one as the micro-level) determines the values of such thermodynamic parameters as temperature, entropy, enthalpy, heat capacity, etc., at the macro-level of the hierarchy. One could name these forces bottom-up hierarchical forces. This results in the third difference:

3) Its close connection with hierarchical rather than lateral forces.

The time scale at different levels of the hierarchy in the real world varies by orders of magnitude, the structure of time moments (the structure of the present) on the upper level leads to the irreversibility on a lower level. On the other hand, the reversibility at the lower level, in conditions of low complexity, leads to irreversibility on the top one (Boltzmann’s H-theorem). In both cases, one of the consequences of the irreversible complification is the emergence of Eddington’s arrow of time. Thus:

4) The general law of complification, leading to an increase in diversity and, therefore, accumulation of material for selection, plays the role of the engine of evolution, while selection of “viable” stable variants from all of this diversity is a kind of driver of evolution that determines its specific direction. The role of “breeder” in this selection is played by other, usually less general, laws of nature, which remain unchanged.

External catastrophes include the unexpected and powerful impacts of free energy, to which the system is not adapted. The free energy as an information killer drastically simplifies the system and throws it back in its development. However, the complexity and information already accumulated by the system are, as a rule, not destroyed completely, and the system, in line with conservative or causal evolution, continues developing not from scratch but from some already achieved level.

Internal catastrophes are caused by ineffective links within the system, when complexity becomes excessive for a given level of evolution and leads to duplication, triplication, and so on, of relations, circuiting them into loops, nesting loops into one another and, as a result, to the collapse of the system due to loss of coordination between the elements.

Information, Entropy, Complexity

Original question

If information is defined as ‘the amount of newness introduced’ or ‘the amount of surprise involved’ then chaotic behaviour implies maximum information and ‘newness’. Systems showing periodic or oscillating behaviour are said to ‘freeze’ and nothing new emerges from them. New structure or patterns emerge from systems showing behaviour just shy of chaos (the edge of chaos) and not from systems showing either chaotic or oscillating behaviour. What is the, for lack of a better word, role of information in this emergent behaviour of complex adaptive systems (cas)?

Characterizing cas

One aspect characterizing cas is generally associated with complex behaviour. This behaviour in turn is associated with emergent behavior or forming of patterns new to the system, that are not programmed in its constituent parts and that are observable. The mechanics of a cas are also associated with large systems of a complicated make-up and consisting of a large number of hierarchically organised components of which the interconnections are non-linear. These ‘architectural’ conditions are not a sine-qua-non for systems to demonstrate complex behaviour. They may very well not show behaviour as per the above, and they may for that reason not be categorised as cas. They might become one, if their parameter space is adapted via an event at some point in time. Lastly systems behaviour is associated with energy usage (or cost) and with entropy production and information. However, confusion exists as to how to perform the measuring and interpret the outcomes of measurements. No conclusive definition exists about the meaning of any of the above. In other words: to date to my knowledge none of these properties when derived from a cas give a decisive answer to the question whether the system at hand is in fact complex.

The above statements are obviously self-referencing, unclear and indecisive. It would be useful to have an objective principle by which it is possible to know whether a given system shows complex behaviour and is therefore to be classified as a cas. The same goes for clear definitions of the meaning of the terms energy, entropy (production) and information in this context. It would also be useful to have a clear definition of the relationships of such properties among themselves and between them and the presumed system characteristics. This would enable an observer to identify characteristics such as newness, surprise, reduction of uncertainty, meaning, information content and their change.

Entropy and information

It appears to me (no more than that) that entropy and information are two sides of the same coin, or in my words: though not separated within the system (or aspects of the same system at the same time), they are so to speak back-to-back, simultaneously influencing the mechanics (the interrelations of the constituent parts) and the dynamics (the interactions of the parts leading up to overall behavioral change of the system in time) of the system. What is the ‘role’ of information when a cas changes, and how does it relate to the properties mentioned?

The relation between information and entropy might then be: structures/patterns/algorithms distributed in a cas enable it in the long run to increase its relative fitness by reducing the cost of energy used in its ‘daily activities’. The cost of energy is part of the fitness function of the agent and stored information allows it to act ‘fit’. Structures and information in a cas are distributed: the patterns are properties of the system and not of individual parts. Measurements therefore must lead to some system characteristic (i.e. overall, not stopping at individual agents) to get a picture of the learning/informational capacity of the entire cas as a ‘hive’. This requires correlation between the interactions of the parts to allow the system to ‘organize itself’.

CAS as a TM

I suspect (no more than that) that it is in general possible to treat a cas as a Turing Machine (TM), ‘disguised’ in any shape or, conversely, to treat complex adaptive systems as an instance of a TM. That approach makes the logic corresponding to TMs available to the observer. An example of systems for which this classification has been proven are 2-dimensional Cellular Automata of Wolfram class 4. That the proof is limited to this case decreases the general applicability, because complex adaptive systems, unlike TMs, are in all aspects parallel, open and asynchronous.

Illustration

Perhaps illustrative of a possible outcome is to ‘walk the process’ along the Logistic map by changing the parameter mu (this misuses the Logistic map, because no complexity lives there). Start at the right: in the chaotic region, newness (or reduction of uncertainty / surprise / information) is large, bits are very many, and meaning (as in emerging patterns) is small. Travel left to any oscillating region: newness is small, bits are very few, meaning is small. Now in between, where there is complex behaviour: newness is high, bits are fewer than in the chaotic region, meaning is very high.
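The ‘bits’ part of this walk can be sketched numerically. The code below is a minimal illustration, assuming the usual form of the Logistic map x → mu·x·(1 − x) and using the Shannon entropy of a coarse 16-bin histogram of the orbit as a stand-in for ‘bits’; the bin count, starting value and sample parameters are arbitrary choices of mine, and ‘meaning’ is not measured at all.

```python
import math
from collections import Counter

def logistic_orbit(mu, x0=0.2, transient=1000, n=4096):
    """Iterate x -> mu * x * (1 - x), discard a transient, return n values."""
    x = x0
    for _ in range(transient):
        x = mu * x * (1.0 - x)
    orbit = []
    for _ in range(n):
        x = mu * x * (1.0 - x)
        orbit.append(x)
    return orbit

def histogram_entropy_bits(orbit, bins=16):
    """Shannon entropy in bits of the orbit values sorted into equal-width bins."""
    counts = Counter(min(int(x * bins), bins - 1) for x in orbit)
    total = len(orbit)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# From an oscillating region (3.2), via the onset of chaos (about 3.57),
# to the fully chaotic region (4.0).
for mu in (3.2, 3.57, 4.0):
    print(mu, round(histogram_entropy_bits(logistic_orbit(mu)), 3))
```

In the oscillating region the orbit visits only a couple of bins, so very few bits are needed; in the chaotic region it spreads over nearly all bins. A histogram this coarse cannot distinguish pattern from noise, which is exactly the gap the ‘meaning’ column in the walk above points at.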

The logical underpinning of ‘newness’ or ‘surprise’ is: if no bit in a sequence can be predicted from a subset of that sequence, it is random. Each bit in the sequence is a ‘surprise’ or ‘new’ and the amount of information is highest. If 1 bit can be predicted, there is a pattern, an algorithm can be designed and, given it is shorter than this bit (this is theoretical) the surprise is less, as is the amount of information. The more pattern, the less surprise it holds and the more information appears to be stored ‘for later use’ such as processing of a new external signal that the system has to deal with. What we observe in a cas is patterns and so a limitation of this ‘surprise’.

A research project

I suggest that the objective of such a project is to design and test meaningful measurements for the entropy production, energy cost and information processing of a complex adaptive system, so as to relate them to each other and to the system properties of a cas in order to better recognize and understand them.

The suggested approach is to use a 2-dimensional CA structure parameterized to show complex behavior as per Wolfram class 4 as described in ‘A New Kind of Science’ of Stephen Wolfram.

The actual experiment is then to use this system to solve well-defined problems. As the theoretical characteristics of (the processing of and the storage by) a TM are known, this approach allows for a reference for the information processing and information storage requirements that can be compared to the actual processing and storing capacities of the system at hand.
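As a starting point for such an experiment, the sketch below implements one 2-dimensional cellular automaton together with a very simple entropy measurement. It assumes Conway's Game of Life as a stand-in for a class 4 system (it is commonly cited as showing that kind of behaviour, but it is my choice, not a rule prescribed here), and the entropy function is only an illustrative first measurement based on the density of live cells in a window.

```python
import math
from collections import Counter

def life_step(alive):
    """One synchronous update of Conway's Game of Life on an unbounded grid.
    `alive` is a set of (x, y) coordinates of live cells."""
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in alive
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell lives next step with exactly 3 live neighbours,
    # or with 2 live neighbours if it is alive now.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in alive)
    }

def density_entropy_bits(alive, width, height):
    """Shannon entropy (bits per cell) of the alive/dead frequencies in a window."""
    inside = sum(1 for (x, y) in alive if 0 <= x < width and 0 <= y < height)
    p = inside / (width * height)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}  # a small moving pattern
state = glider
for generation in range(4):
    print(generation, len(state), round(density_entropy_bits(state, 8, 8), 4))
    state = life_step(state)
```

A density-based entropy like this ignores spatial correlations, so it would have to be replaced by a measure over blocks or patterns before it could say anything about the distributed information discussed above.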

Promising measurements are:

Measurement | Description | Using
Entropy | Standard: this state related to possible states | Gibbs or similar
Energy cost | Theoretical energy cost required to solve a particular problem versus the energy the complex adaptive system at hand uses | See slide inserted below; presentation e-mailed earlier: https://www.youtube.com/watch?v=9_TMMKeNxO0#t=649
Information | Earlier discussion: Using this approach, we could experimentally compute the bits of information that agents have learned resulting from the introduction of new information into the system. I suggest to add: ..compute the bits of information that agents have learned relating to the system…. That subset of the information stored in a distributed way in the system that represents the collective aspect of the system, i.e. distributed collective information. Amount of information contained in the co-evolving interfaces of the agents or parts of the system, equivalent to labels as suggested by Holland. |

[Screenshot of 2015-06-09 12:56:03]

Heat and Information

David Wolpert of SFI presents a logical model of the relation between heat and information. Many things fall out of this, and satisfying, for me at least, is that complex adaptive systems are large and can be considered to be engineered systems: computers, in other words. Ground-breaking stuff that I want to learn much more about.

Turing Machines and Beyond

To put to bed the discussion about companies being the computer – and for me to finalize the invention of yet another existing wheel, find attached this document. The author surveys the latest in computational logic, in the process describing Natural Computation. This is apparently an existing name for the beast I described in the posts categorized under Turing Machines so far!

Networks are capable of processing information in parallel, while interacting dynamically with their changing environment, asynchronously if necessary (companies!). TM as defined here compute solutions for given problems using algorithms and as such are a special case for the general principle of Natural Computation.

SignificanceOfModelsOfComputation

Lane on the MAX Rule

This post is also based on the article ‘Information Contagion: Is what is good for each best for all?’ by David Lane, 1997, in the SFI Proceedings The Economy as an Evolving Complex System. It is such a remarkable part of that article that I am devoting a separate post to it.

Lane on Individual Choices and Market Share

This post is based on the article ‘Information Contagion: Is what is good for each best for all?’ by David Lane, 1997, in the SFI Proceedings The Economy as an Evolving Complex System. The question is how the choices of individual buyers lead, via their interactions, to a market share.

Mandelbrot set zoomed in

In this video on YouTube the Mandelbrot fractal is zoomed in on 10^340 times. This means nothing in itself, but it does give an idea of the (semi-)self-similarity at all kinds of zoom levels. The connection is that the attractors of many (not all) chaotic systems have a fractal structure. That means a degree of (not always complete) self-similarity at every scale. In other words, the patterns at different scales closely resemble one another.