Aceasta este versiunea html a fişierului http://www.iis.ee.ic.ac.uk/~g.briscoe/transferReport.pdf.
G o o g l e generează automat versiuni html ale documentelor căutate pe web.
Pentru un link sau un semn de carte la această pagină, folosiţi următoarea adresă: http://www.google.com/search?q=cache:qNEcEDHbKeAJ:www.iis.ee.ic.ac.uk/~g.briscoe/transferReport.pdf+business+digital+ecosystems+thesis&hl=ro&gl=ro&ct=clnk&cd=37


Google nu este afiliat cu autorii acestei pagini şi nici nu este responsabil pentru conţinutul ei.
Termenii din căutare au fost evidenţiaţi.  business  digital  ecosystems  thesis 

Page 1
Self-Organisation
of
Evolving Agent Ecosystems
M.Phil to Ph.D Transfer Report
Gerard Briscoe
Supervisor
Dr. Philippe De Wilde
Intelligent Systems and Network Group
Department of Electrical and Electronic Engineering
Imperial College London
February 2005

Page 2
Abstract
A mechanism to achieve clustering based self-organisation in evolving agent populations is
sought. A first step in achieving this is the study of a definition for self-organisation based
on statistical physics. The measure calculates the entropy in the population to determine the
randomness, which is then used in determining the organisational complexity. An efficiency measure
is then presented for the organisational complexity present, relative to the maximum possible in
the population, which is useful for detecting clustering.
Keywords: self-organisation, clustering, agents, complexity, entropy, evolution,
population, ecosystem
Acronyms
DBE
= Digital Business Ecosystem
DEC
= Distributed Evolutionary Computing
DE
= Digital Ecosystem
DNA = Deoxyribose Nucleic Acid
EC
= Evolutionary Computing
ESS
= Evolutionary Stable Strategies
KC
= Kolmogorov-Chaitin
MAS = Mobile Agent System
MDL = Minimum Description Length
MFT = Mean Field Theory
SOC
= Self-Organised Criticality
UTM = Universal Turing Machine
VLP
= variable length population

Page 3
Contents
1 Introduction
1
1.1 Research Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1
1.2 Report Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1
2 Digital Ecosystem Definition
2
2.1 Ecosystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2
2.2 Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
2.3 Evolutionary Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
2.4 Distributed Evolutionary Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
2.5 Mobile Agent System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
2.6 Digital Ecosystem Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5
3 Literature Review
7
3.1 Philosophy of Organisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7
3.1.1 System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7
3.1.2 Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7
3.1.3 Self . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7
3.1.4 Organisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8
3.2 Current Self-Organisation Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8
3.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10
4 Measure of Organisation
11
4.1 Existing Physical Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
11
4.1.1 Origins of Physical Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . .
11
4.1.2 Definition of Physical Complexity . . . . . . . . . . . . . . . . . . . . . . . . .
12
4.2 Physical Complexity Extended . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
12
4.2.1 Physical Complexity for Variable Length Populations(VLPs) . . . . . . . . . .
13
4.2.2 Performance Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
15
4.2.3 Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
16
4.2.4 Atomicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
20
4.3 Relation to Fitness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
21
4.3.1 Original Physical Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . .
21
4.3.2 Physical Complexity for VLPs . . . . . . . . . . . . . . . . . . . . . . . . . . .
21
5 Methods
22
5.1 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
22
5.1.1 Mobile Agent System Components . . . . . . . . . . . . . . . . . . . . . . . . .
22
5.1.2 Evolutionary Computing Components . . . . . . . . . . . . . . . . . . . . . . .
23
5.1.3 Organisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
23
5.2 Hypotheses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
24
5.2.1 Physical Complexity for VLPs . . . . . . . . . . . . . . . . . . . . . . . . . . .
24
5.2.2 Efficiency Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
24
5.2.3 Investigate Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
25

Page 4
6 Results
26
6.1 Visualisation of Population . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
26
6.2 Physical Complexity for VLPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
27
6.3 Efficiency Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
28
6.4 Investigate Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
29
7 Conclusions
30
7.1 Achievements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
30
7.1.1 Digital Ecosystem Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
30
7.1.2 Definition for Organisational Complexity . . . . . . . . . . . . . . . . . . . . .
30
7.1.3 Performance Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
30
7.1.4 Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
30
7.1.5 Atomicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
31
7.1.6 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
31
7.1.7 Visualisation & Semantic Filter . . . . . . . . . . . . . . . . . . . . . . . . . . .
31
7.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
32
7.2.1 Possible Limitations of the Chain Structure . . . . . . . . . . . . . . . . . . . .
32
7.2.2 Clustering Indicator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
32
7.2.3 Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
32
7.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
32
8 Future Work & Research Plan
33
8.1 PhD Thesis Proposed Chapters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
33
8.2 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
34
References
35
A Interdisciplinary Dictionary
37
B Spreadsheet Tool
39

Page 5
1 Introduction
Creating a Digital Ecosystem (DE) capable of the self-organising complex behaviour of a biological
ecosystem is of great interest to many, including an EU Framework 6 Integrated Project, Digital
Business Ecosystems (DBEs), which funds this work. The DBE is modelled as a Mobile Agent System
(MAS) with Distributed Evolutionary Computing (DEC) as an optimisation technique. At each Agent
Station an evolutionary optimisation process is used to find the optimal combination of agents to meet
user defined requests [11].
1.1 Research Background
Over time a biological ecosystem becomes increasingly more complex, driven by the evolution of
the populations within the ecosystem. A Digital Ecosystem’s increasing complexity comes from the
populations of agents being evolved to meet user requests.
Self-organisation is an emergent global ’behaviour’ of complex systems, which consist of many
autonomous components with local interaction rules. In our Digital Ecosystem the components are
the agents which interact with one another in evolving populations, forming agent-chains to provide,
at the global level, optimal solutions to user requests.
This research will initially focus on the self-organisation within a single evolving population. A
mechanism of self-organisation is sought to form clusters within an evolving population to optimise
the evolutionary process. A cluster is a grouping of same or similar individuals, and when genetic
recombination is encouraged within a cluster it can potentially reduce the number of generations
required for an evolving population to generate a solution.
This principle of encouraging inter-cluster genetic recombination to promote specific features is
known in more practical social contexts. For example, there is the idea that academics that marry
can potentially have children who will become even better academics in the future. The government
of Singapore, which is well known for not leaving anything to chance, took this idea of encouraging
academics to marry one another to create even ’brainier’ children to the extent of offering free dates
between post-graduate students.
A biological ecosystem becomes increasingly complex through a process called succession. There
are defined stages for the increasing complexity, finally resulting in what is called a climax community.
The complexity of an ecosystem comes from the evolving populations within, so we will first focus on
evolving populations. A measure of the complexity which can identify clustering is sought to encourage
inter-cluster genetic recombination within an evolving population. A self-organisation mechanism to
form clusters, within an evolving population, will potentially optimise the evolutionary process by
reducing the number of generations required to generate a solution.
Due to the inter-disciplinary nature of the research, a multi-disciplinary approach will be used,
drawing on ideas and understanding from MAS, specifically mobile agent migration, biological theory,
specifically ecosystems theory and evolutionary theory; and as it turned out statistical physics,
specifically entropy theory.
1.2 Report Outline
In the following section, Section 2, the key concepts drawn upon are explained in general and in their
context of a Digital Ecosystem.
1

Page 6
Section 3 is a summary of the literature review performed, including the philosophical meaning
of organisation and of self in organisation. The possibly applicable measures of organisation will
be described and considered. In Section 4, the measure chosen to be investigated further will
be introduced, and then adapted for evolving agent populations. Its suitability as a measure of
organisation in evolving agent populations will then be considered.
The hypotheses to be tested will be described with the simulation scenarios for their testing, in
Section 5. The results from these simulations will then be shown and analysed, in Section 6.
Concluding remarks regarding the achievements and implications will be made in Section 7, with the
future work and research plan presented in Section 8.
Appendix A is an Interdisciplinary Dictionary of the key terms used within this document to
translate between biological terminology and MAS terminology. Appendix B shows the spreadsheet
tool created to allow experiments with the organisation measure defined in Section 4.
2 Digital Ecosystem Definition
The key concepts and models upon which a DE is defined will be brought together to create a consistent
hybrid model, so that we can clearly define an evolving population in a DE. Then we can focus on an
evolving population, as stated previously.
2.1 Ecosystem
An ecosystem is a natural unit made up of living and non-living components whose interactions give
rise to a stable, self-perpetuating system. It is made up of one of more habitats and communities of
organisms, consisting of populations existing in their microhabitats[27].
environment
community
habitat
niche
niche
niche
ecosystems
microhabitats
populations
Evolving Population = microhabitat + population
Figure 1: Abstract Ecosystem Definition
2

Page 7
Therefore an evolving population is a population in its microhabitat, in which a population comes
to occupy a niche, which is the functional relationships of a population to the environment that it
occupies[9]. Niche is generally used to refer a highly specialised adaptation of a population to its
microhabitat.
2.2 Evolution
Evolution is the process that governs the adaptation of populations to their environment, which
involves changes to the genetic-makeup of these populations. These genetic changes are passed on
through successive generations[27, 20]. A population adapts over several generations through a cyclic
process of mutation, replication, and death.
mutation
replication
death
Figure 2: Cyclic Process of Evolution
Genetic change is introduced through the mutation and replication of individuals within a
population. A ’selection pressure’ is defined by the environment and is the sum total of the forces
acting upon a population, which results in genetic change and natural selection. Those individuals
best fitted to survive the selection pressure operating upon them, will pass on their biological fitness
to their progeny through the replication process.[27, 20]
2.3 Evolutionary Computing
Evolutionary Computing is an optimisation technique capable of finding solutions in a large
combination space of possible solutions, and is able to avoid being trapped at local optima. It works
on the principle of using evolution to generate a solution to a specific problem. The cyclic process
of mutation, replication and death is applied to a population of possible solutions. The ’selection
pressure’, which is defined as the problem to be solved, determines the fitness of the different solutions
and therefore their replication rate. This determination of fitness for each individual is achieved by
a fitness function which is instantiated with the ’specific problem’ to which a solution is to be
found. So the better solutions get to replicate more. For this process to work it must be possible
to incrementally build solutions, just as is done in nature. Also, often a replication operator is used
which includes crossover to add further genetic recombination to help generate solutions faster.
2.4 Distributed Evolutionary Computing
Distributed Evolutionary Computing aims to achieve parallel processing to find solutions more
efficiently. We will focus on the ’Island Model’ which is probably the most closely related to the
notion of migration in nature[18]. There is a distance between the sub-populations on each island,
and by the same token a probability of migration between one island and another.
3

Page 8
Island
1
Island
2
Island
3
Island
4
Island
5
P
13
P
31
P
12
P
21
P
15
P
51
P
14
P
41
P
32
P
23
P
42
P
24
P
52
P
25
P
34
P
43
P
35
P
53
P
54
P
45
Figure 3: Island Model of Distributed Evolutionary Computing
In Figure 3 there are different probabilities of going from island ’1’ to island ’2’ as there is of going
from island ’2’ to island ’1’. This allows maximum flexibility for the migration process. It also mirrors
the naturally inspired quality that, although two populations have the same separation no matter
what way you look at them, it may be easier for one population to migrate to another than vice-versa.
An example of this from nature could be the situation whereby it is easier for one species of fish to
migrate down-stream than for a different species to migrate up-stream.
This model has been used successfully in the determination of investment startegies in the
commercial sector, in a product known as the Galapgos toolkit.[34]
2.5 Mobile Agent System
A mobile agent is a program that can migrate from host to host in a network of heterogeneous computer
systems and fulfill a task specified by its owner. The agents self-initiate migration, work autonomously
and communicate with other agents and host systems. On each host they visit, mobile agents need a
special software that we name an Agent Station, which is responsible for executing agents, providing
a safe execution environment, and offers several services for agents residing on this host. A Mobile
Agent System is the union of all the Agent Stations, with agents running on these Agent Stations
being part of agent-based applications.
4

Page 9
Agent
Station
Agent
Station
Agent
Station
Figure 4: Mobile Agent System
Migration of agents is based on an infrastructure that has to provide the necessary services in the
network. The infrastructure is a set of Agent Stations that run on platforms (nodes) within a possibly
heterogeneous network. Each Agent Station hides the vendor specific aspects of its host platform and
offers standardised services to an agent that is visiting such an Agent Station. Services include access
to local resources and applications, for example Web servers or Web services, the local exchange of
information between agents via message passing, basic security services, creation of new agents, etc.
I have purposely left defining the agents until the last as it is difficult to find any definition for
an agent that includes all of the things that most researchers and developers consider agents to be,
and excludes all of the things they aren’t. However, a generally accepted abstract definition is a
computer system that is situated in some environment, and that is capable of autonomous action
in this environment in order to meet pre-defined objectives[35]. Generally the definition of an agent
depends on the context in which its used, so it will be more accurately defined in the next subsection
within the context of a Digital Ecosystem.
2.6 Digital Ecosystem Model
The Digital Ecosystem (DE) model contains elements from MAS, DEC and ecosystem theory. The
DE will consist of interconnected habitats just as in a biological ecosystem. These interconnected
Habitats will provide facilities similar to an Agent Station, including an ’Agent Pool’ similar to ’gene
pool’ (please see Appendix A). At each Habitat, Evolving Population objects (similar to an ’Island’
in the DEC Island Model) will be instantiated which use Evolutionary Computing to evolve the fittest
(optimal) solution to a user defined request. The agents can migrate through the interconnected
Habitats combining with one another in the Evolving Populations to meet user requests. The set of
agents registered at the Habitat are used as the gene pool for Evolving Populations.
5

Page 10
Habitat
Habitat
Habitat
Habitat
Habitat
Habitat
Habitat
Agent Pool
Evolving
Population
migrating
agent
Evolving
Population
Agent
Agent-chain
Fitness
Function
mutation
replication
death
Figure 5: Digital Ecosystem Model
The agents in the DE are light-weight entities consisting of an ontology-based semantic description
of their functionality and a reference (local or remote) to their executable component. The description
contained within each agent acts as a guarantee of its functionality, and is the inheritable component
from one generation to the next. Although there is no constraint on the language used for the
executable components, they should have compatible interfacing, so that agents have the ability to
aggregate into chains to perform more complex tasks. As the agent-chains have to be comparable to
the user request, the semantic descriptions in both are written in the same language. The fitness will
be based on comparing the descriptive components of the agents with the complex description of the
user request.
Each Evolving Population of agent-chains will search the agent-chain combination space through
evolution to find the optimal solution(s) to a user request. The fitness of individuals within a
population will be determined by a selection pressure applied as a fitness function instantiated from
the user request, and works primarily on comparing the semantic descriptions of the agent-chains with
the semantic description in the user request. Mutations can occur by switching agents in and out of
the chain structure. Recombination (Crossover) can occur by combining elements of two agent-chains
into a new agent-chain. Now that the DE model has been defined, the question of the self-organisation
within an evolving population of agent-chains can be investigated.
6

Page 11
3 Literature Review
A review of the available literature on self-organisation will be presented, with respect to its general
properties, its application to MASs, and its application to populations in MASs. Self-organisation has
been around since the late 1940s[6], but has escaped general formalisation. There have been many
attempts at creating a general formalisation of self-organisation[24, 14]. This will not be the case here,
because it shall be argued that any definition of self-organisation is context dependent, in the same
way that the choice of statistical measures is dependent on the data being analysed.
3.1 Philosophy of Organisation
The philosophy of organisation is complicated, because organisation has different meanings to different
people. There have been many notions and definitions of organisation, useful within their different
contexts. They have come from cybernetics, thermodynamics, mathematics, information theory,
synergetics, and others. Many people use the term self-organising, but it has no generally accepted
meaning, as the abundance of definitions suggests.[17]
Proposing a definition or organisation faces the cybernetic problem of defining system, the cognitive
problem of perspective, the philosophical problem of defining self, and the universal problem of defining
organisation.[17]
3.1.1 System
The system in this context is the evolutionary system, which includes a population of agent-chains,
recombination of the agent-chains, replication of the agent-chains from one generation to the next,
and a selection pressure which causes differential fitness between the agent-chains.
3.1.2 Perspective
This can be defined as the perception of the observer, in perceiving the organisation of a system. This
matches the intuitive definition, Ill know it when I see it[29]. Although this intuitive definition may
make formalisation difficult, it does show that organisation is perspective-dependent, i.e. relative to
the context in which it occurs.
In the context of an evolutionary system, the observer does not exist in the traditional sense, but
is the selection pressure(environment). Therefore, for an evolutionary system, the organisation of its
population is relative to its environment.
3.1.3 Self
Whether a system is self-organising or merely organised depends on whether or not the process causing
the organisation is an internal component of the system under consideration. Although somewhat
simplistic, it does intuitively make sense and relegates the argument to defining the boundaries of the
system being considered, in order to determine if the force causing organisation is internal or external
to the system. For an evolving population the force leading to organisation within the population
is the selection pressure, which is formed by the environment of the populations existence and the
competition between the individuals of the population. As these are internal components of the system,
the system is self-organising. Put simply, an evolutionary system without a selection pressure is not
evolving, and therefore would not be an evolving system.
7

Page 12
3.1.4 Organisation
Now that definitions for the system for which organisation is context dependent, the perspective to
which it is relative, and the self by which it is caused have been proposed, a definition of organisation
can be considered. To visualise the context, an evolving population of agent-chains which lack a 2D
or 3D metric space, it is necessary to consider it in a more abstract form. We will let a single square
represent an agent, with different colours to represent the different agents. agent-chains will be
represented by a sequence of squares
. A population is represented by multiple agent-chains, as
shown below:
agent
agent-chain
population (HIGH organisation)
population (LOW organisation)
Figure 6: Visualisation of Organisation in agent-chain Populations
For Figure 6, the number of agents in total and of each type(colour) is the same in both
populations. However, the population of agent-chains on the left intuitively shows organisation through
the uniformity of the colours across the agent-chains (clustering), whereas the population to the right
shows little or no organisation. The organisation of a system is the clustering of its components, in
this case the agents, into coherent patterns and structures, where the complexity of these patterns and
structures can be a measure of its organisational complexity. So the organisational complexity
measures the information content in a population of agent-chains.
3.2 Current Self-Organisation Definitions
There are biologically inspired definitions of self-organisation defined for MASs without evolution[25],
such measures, although valuable in their own right, are not applicable to evolving agent populations.
Many alternative techniques have been proposed to model self-organisation within populations in
general, and agent populations specifically. Each with their own definition of what property or
properties demonstrate organisation of, or within, a population. The possible applicable alternatives
will be considered for their suitability in defining the organisational complexity (organisation) of a
population of agent-chains.
The relationship of fitness and self-organisation is made confusing by the multiple definitions
of both. The widely accepted definition of fitness is that it is a property of individual phenotypes,
and is a measure of the ability to produce mature offspring in the next generation which themselves
will be able to reproduce. The self-organisation sought is a property of an entire population, not an
individual. Therefore, fitness cannot be the self-organisation of a population. The two are indeed
connected and will be discussed more, once a definition of self-organisation is determined.
Self-organised criticality(SOC) in evolution is defined as punctuated equilibrium, in which the
populations critical state is when the fitness of the individuals is uniform, and an avalanche is caused
by the appearance and spread of advantageous mutations within the population, which temporarily
8

Page 13
disrupts the uniformity of individual fitness across the population. Whether this process displays SOC
remains unclear. There are those who claim that SOC is demonstrated by the available fossil data[31]
with a power law distribution on the lifetimes of genera drawn from fossil records, and artificial life
simulations[1] with a power law distribution on the lifetimes of competing species. On the other
hand, there are those who feel that the fossil data is inconclusive, and the artificial life simulations
do not show SOC, because the key power law behaviour in both can be generated by models without
SOC[23]. SOC has nothing to say about the organisation of the population at the critical state, only
the organisation of the events, avalanches, which moves the population temporarily away, and then
back to the critical state.
One possibility for determining the organisational complexity (organisation) of the population of
agent-chains would be the application of the Minimum Description Length(MDL) principle[7] to
the executable components of the agent-chains in the population. The best model, among a collection
of tentatively suggested ones, is the one that gives the smallest stochastic complexity to the given
data. However, the MDL principle has nothing to say about how to select the family of model classes
to be applied for determining the stochastic complexity. In fact, this problem cannot be adequately
formalised[28]. In practice, their selection is based on human judgement and prior knowledge of the
kinds of models that have been used in the past. This limitation would make it impossible to automate
the organisational complexity estimation process, due to the necessity of human intervention at model
selection for every different user request.
Mean Field Theory (MFT) requires a neighbour model to describe the interactions between
neighbours in the systems it is applied to, and is therefore easily applied to Cellular Automata[19].
The main concept in MFT is that, for a single particle, the most important contribution to its
interactions comes from its neighbouring particles, and therefore its behaviour can be approximated
by relying upon the mean field caused by its neighbouring particles. MFT application to the evolution
of a population[16] requires a neighbour model, which in actual biological systems is a reasonable
assumption. agent populations lack neighbour models based on a 2D or 3D metric space. The only
available neighbour model becomes a distance measure on a parameter space measuring dissimilarity.
However, such a neighbourhood model cannot represent the information-based interactions between
individuals of a population of agent-chains.
In Replicator dynamics, of evolutionary game theory, agents of a population play a game and do
not optimise over strategic alternatives, but inherit a fixed strategy and then replicate depending on
the strategys payoff (fitness)[15]. An equilibrium, a stable steady state, can be reached in which all the
strategies have the same expected payoff. It is called a stable steady state, because if pushed slightly
off, it returns back to the equilibrium of the stable steady state. An evolutionarily stable strategy
(ESS) is an equilibrium strategy that can overcome the presence of a small number of invaders, so
it is an asymptotically stable steady state, which is a more stringent stable equilibrium concept[33],
than a stable steady state. The self-organisation found in replicator dynamics is not the composition
of the population directly, but the presence of stable steady states, in which the genotype frequencies
of the population cease to change from one generation to the next. The composition of the resulting
population can even be randomly distributed, but stable. This self-organisation measure is more
relevant to the genetic stability of the population from one generation to the next, rather than the
organisational complexity (organisation).
Kolmogorov-Chaitin(KC) complexity measures the complexity of binary sequences by the
smallest possible Universal Turing Machine (UTM), algorithm (program and input), that produces
9

Page 14
the sequence. A sequence is said to be regular if the algorithm necessary to produce it on a UTM is
shorter than the sequence itself. A regular sequence is said to be compressible, whereas its compression
into the most succinct UTM is said to be incompressible, as it cannot be reduced any further in
length. A purely random sequence is said to be incompressible, because the UTM to represent it
cannot be shorter than the random sequence itself. This intuitively makes sense from the point of
view of algorithmic complexity, because algorithmically regular sequences require a shorter program
to produce them. To measure a population of sequences, the KC-complexity would be the shortest
UTM to produce the entire population of sequences. Chaitin himself has considered the application of
KC-complexity to evolution[12], and realised that although KC-complexity represents a satisfactory
definition of randomness in algorithmic information theory, it is not so useful in biology. For example
applying it to physical structures, one sees that a gas is the most random, and a crystal the least
random, but neither has any significant biological organisation. For populations of evolving agent-
chains, this problem manifests itself most significantly when the agents are randomly distributed
within the agent-chains of the population, having maximum KC-complexity, rather than a complexity
of zero which it ought to have. This property makes KC-complexity unsuitable as a measure of the
organisational complexity (organisation) of a population of agent-chains.
The Prugell-Bennett Shapiro formalism models the evolutionary dynamics of a population of
sequences, using statistical mechanics techniques, focusing on replica symmetry[26]. The individual
sequences are not considered directly, but in terms of the statistical properties of the population. So,
it is described as using a macroscopic level of description, and the particular statistical properties,
which are used to characterise the population, are called macroscopics. A macroscopic formulation
of an evolving population reduces the huge number of degrees of freedom to the dynamics of a few
quantities. A non-linear system of a few degrees of freedom can be readily solved or numerically
iterated. So, the formalism can predict the optimal amount of selection. However, since a macroscopic
description throws information away, human insight is essential so that the appropriate macroscopics
are chosen[30]. There is no procedural method for determining the macroscopics, which would make
it impossible to automate, due to the necessity of human intervention at every different user request.
A measure called the Physical complexity can be estimated for a population of sequences,
calculated from the difference between the maximal entropy of the population, and the actual
entropy of the population when in its environment[4]. In effect, the environment(selection
pressure) entropically weights the combination space of sequences, which is shown by
the composition of the population. This Physical complexity is based on Shannons entropy
measure of information, and measures the amount of information contained in the population about
its environment, and is therefore conditional on its environment. The Physical complexity can be
estimated by counting the number of loci per sequence that are fixed in the population. The measure
is however formulated for a population of sequences with the same length.
3.3 Summary
Any definition chosen should define the organisational complexity (organisation), the clustering, of
the agents within an evolving population of agent-chains, with no initial constraints on the number
of genotypes or phenotypes, or models specifically seeking speciation. Neither will organisation be
defined by the presence or lack of evolutionary stable strategies or self-organised criticality. The
organisational complexity sought after is the clustering within a population in general, without the
specific inclusion of the models previously mentioned, but capable of representing their appearance in
10

Page 15
the population.
None of the proposed measures are directly applicable as a measure of organisational complexity
(organisation) for a population of agent-chains. Self-organised criticality is concerned with the
organisation of events affecting the population, rather than the organisation within the population.
Mean Field Theory is not applicable due to its necessity of a neighbour model for defining interaction.
Replicator dynamics measures the genetic stability of the population, rather than the organisational
complexity. KC-complexity is not applicable as random sequences, and randomness in general, have
maximum complexity. The Prugell-Bennett Shapiro formalism also cannot be automated, as it requires
human intervention to choose the appropriate properties of the population, for constructing the model.
The Physical complexity for a population of sequences is in essence the organisational complexity
(organisation), because it estimates complexity based upon the individuals of the population within
the context of their environment. The formulation is not without problems, as it is for fixed length
populations. However, this problem is not a fundamental property of the measure, so it should not be
insurmountable. The use of Physical complexity as a measure of the organisational complexity will
be investigated further to determine its suitability.
4 Measure of Organisation
4.1 Existing Physical Complexity
4.1.1 Origins of Physical Complexity
Physical complexity was born from the need to determine the proportion of information in sequences
of DNA. It has long been established that the length of DNA sequences is an unsatisfactory measure
of the information content, as they contain significant redundancy. So the information contained is
not directly proportional to the length[5]. Understanding the DNA requires knowing the environment
(context) in which it exists.
The purpose of DNA in the context of life may initially sound straightforward, as DNA is considered
to be language of life. However, consider viruses which can consist of only a strand of DNA covered
by a protein coating. Scientists are still split on whether viruses are alive or not. The only function
viruses perform is to replicate, not showing any of the other commonly accepted signs of life in more
complex organisms, such as respiration, growth, etc. We shall consider viruses to be alive, and accept
that their only function is to replicate. The process of replication requires resources, energy and
matter, to be harvested. Viruses are the simplest form of life known. More complex forms of life have
evolved far more complex, specialised, specific and effective ways to acquire the necessary energy and
matter for replication.
So consider that for any individual the environment represents the problem of extracting energy
for replication. Then the DNA sequence of an individual represents a solution to this problem. This
hopefully alleviates the misconception that the DNA of individuals encodes their environment; it
does not, it encodes a solution to extract energy and matter from their environment for its replication.
Furthermore, the individual DNA solution is not a simple inverse of the ’problem’ that the environment
represents.
Even with this understanding, the problem remains of the need to define the environment, to be able
to distinguish the information from the redundancy in the solution. This situation is resolved in the
Physical complexity measure by analysing a group of solutions to the same problem. The consistency
11

Page 16
between the different solutions shows the information, and the differences are the redundancy. Entropy
is a measure of disorder, as it measures the number of states accessible to a system with equal
probability. A large number of accessible states is usually associated with disorder. So entropy is used
to determine the redundancy from the information, in a population of solutions. The measure therefore
provides a context-relative measure of organisational complexity, by measuring the population, which
relives us of the need to define the context (environment).
4.1.2 Definition of Physical Complexity
Physical complexity is derived[4] from the notion of conditional complexity defined by Kolmogorov,
which is different from traditional Kolmogorov complexity. It states that the determination of
complexity of a sequence is conditional on the environment in which the sequence is interpreted.
The complexity C of a population of sequences,
C = −
i=1
H(i)
(1)
where is the maximal entropy (length) of the population minus the sum over the length of the
per-site entropies H(i),
H(i) = −
d∈D
p
d
(i)log
|D|
p
d
(i)
(2)
where is the length, D is the alphabet of characters found in the sequences, the site i varies between
1 ≤ i ≤ and the p
d
(i) is the probability that at a site i (in the sequences) takes on character d from
the alphabet D, rather than any other character. So the sum of the p
d
(i) probabilities equals one,
d∈D
p
d
(i) = 1. Taking the log to the base |D| results in H(i) ranging between 0 and 1.
If G represents the set of all possible genotypes constructed from an alphabet D and of length ,
|G| = |D|
(3)
where the size(cardinality) of |G| is equal to the size of the alphabet |D| raised to the length . For
the complexity measure to be accurate, a population sample size of |D| is suggested to minimise the
error[4, 8]. This quantity can be computationally unfeasible. For practical applications, Adami[5]
chooses a population size of 3600 for an alphabet of size twenty eight, |D| = 28, and a length of one
hundred, = 100. This is a population size of about 1.29|D| . The reason it is greater than |D| is
because the population size will fluctuate in the simulation, and it is necessary to maintain a minimum
of |D| for statistical reliability of any trends present. So, for a population S, we choose with Adami a
computationally feasible population size of |D| times , which is sufficient to show any trends present:
|S| ≥ |D|
(4)
4.2 Physical Complexity Extended
Reformulating the complexity measure for a population of agent-chains requires consideration of the
following issues; managing a population of variable length sequences, the creation of a performance
measure, and clustering within populations.
12

Page 17
4.2.1 Physical Complexity for Variable Length Populations(VLPs)
Managing populations of variable length agent-chains is the most significant part of the reformulation,
because unlike the other issues which are extensions of the measure, this requires changing and re-
justifying the fundamental assumptions.
It is necessary to understand the measures conditions and limits, in terms suitable for extending the
measure to variable length populations. In (1) the Physical Complexity C is defined for a population
in which the individuals all have length . The most important question is what does the length equal
if the population is of variable length? The problem is what represents, which is the complexity
potential C
p
, the maximum complexity possible for the population. The maximum complexity occurs
when the per-site entropies sum to zero, as there is no randomness in the sites (all contain information),
in (1)
if
i=1
H(i) → 0
then C →
(5)
So the complexity potential equals the length,
C
P
=
(6)
provided the population S is of sufficient size |S| for accurate calculations, i.e. |S| is equal or greater
than |D| , as found in (4), justified at the end of section 4.1.2. For a variable length population(VLP)
S, the complexity potential for a VLP, C
V
P
, cannot be equivalent to the length , because it does not
exist. Based on the minimum sample size from (4) when = 1, if the population size is less than the
alphabet size, |S| ≤ |D|, then C
V
P
equals zero. If |S| ≥ |D|, then there are at least |D| individual
samples of length one or more, so the C
V
P
will be greater than zero and will be equivalent to the
length of a VLP
V
(which will be defined shortly),
C
V
P
=
0
if |S| < |D|
V
if |S| ≥ |D|
(7)
where S is the population, D is the alphabet,
V
≥ 1 and |D| > 0. If
V
where to be equal to the length
of the longest individual
max
in the VLP S, then the operational problem is that for some of the later
sites i in the range of one to
max
, the number of samples per-site will be less than the population size.
So the complexity potential C
V
P
=
V
=
max
would be incorrect, as there are insufficient samples at
the later sites. Consider the samples of DNA sequences shown below in Figure 7.
site
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Sample 1:
C G C G A T A C C T T T T G A T T G G
Sample 2:
C G C G A T A C C T A T T G A T T G G
Sample 3:
C G C G A T A C C T G T T G A T T
Sample 4:
C G C G A T A C C T C T T G A T T
Figure 7: Genome Samples
If the entropy is calculated for site 18 in Figure 7 it is H(18) = 0, but there is an insufficient
13

Page 18
sample size for the estimated probabilities that are used in the calculation to be accurate. Therefore
the length
V
for a VLP is defined as the highest value within the range between one and the maximum
length 1 ≤
V
max
, for which there are sufficient samples to calculate the complexity. To specify
the value of
V
precisely, a function is required which provides the number of samples at a given site,
sampleSize(i : site)output : int
(8)
where the output varies between 1 ≤ output ≤ |S| and |S| is the population size. Therefore the length
V
for a VLP is defined as the highest value within the range between one and the maximum length
1 ≤
V
max
, for which the number of samples at a site specified by
V
is greater than or equal to
the alphabet size |D| multiplied by the length
V
,
sampleSize(
V
+ x) ≤ sampleSize(
V
) ≥ |D|
V
(9)
where x varies between 0 < x <
max
V
,
V
varies between 1 ≤
V
max
, D is the alphabet
and |D| > 0,
V
is the length for a VLP, and
max
is the maximum length in a VLP. This definition
intrinsically includes the minimum population size for VLPs, |D|
V
. This replaces the minimum
population size for same length populations as specified in (4).
C
V
P
in (7) will always equal
V
provided the condition |S| ≥ |D| is true. If it is false, the C
V
P
will
be zero and therefore the complexity of the VLP S will be zero, so no calculation involving
V
will be
performed.
The length used in the limits of (2) no longer exists, and therefore (2) must be updated. So the
per-site entropy calculation for VLPs will be denoted by H
V
(i),
H
V
(i) = −
d∈D
p
d
(i)log
|D|
p
d
(i)
(10)
where the site i ranges beteeen 1 ≤ i ≤
V
, the probability p
d
(i) ranges between 0 ≤ p
d
(i) ≤ 1, the
sum over D of the p
d
(i) is 1,
V
is the length for a VLP, and D is the alphabet. It remains algebraically
almost identical to (2), but the conditions an constraints of its use will change, specifically is replaced
by
V
. H
V
(i) in (10) ranges between 0 and 1. When the entropy is minimum, zero, then the site i
holds information, as every sample shows the same character of the alphabet at that site. When the
entropy is maximum, the character found in the site i is uniformly random and therefore holds no
information.
The complexity C
V
of a variable length population(VLP) is the complexity potential of the VLP
C
V
P
(7), minus the sum over the length of the VLP
V
of the per-site entropies (10),
C
V
=
0
if C
V
P
= 0
V
V
i=1
H
V
(i) if C
V
P
> 0
(11)
where
V
is the length for a VLP, and H
V
(i) is the VLP entropy for a site i. Now that the Physical
complexity can be applied to VLPs, and the site mapping has been resolved, it can used to measure
the organisational complexity of agent-chains populations shown in Figure 8, where l
V
is calculated
from (9):
14

Page 19
agent
agent-chain
population (HIGH organisation)
population (LOW organisation)
C
V
= 0.575
C
V
= 4.420
D = alphabet = { ,
,
}
|D| = alphabet size = 3
max
= maximum length = 6
V
= length for a V LP = 5 complexity potential C
V
P
=
V
= 5
Figure 8: Organisational Complexity in Populations of agent-chains
If we recall the definition of organisational complexity(organisation), which measures the
complexity of the coherent patterns and structures, formed by the clustering of the agents within
the population of agent-chains. The Physical complexity values match the intuitive understanding
that one gets visually about the organisational complexity in the populations.
4.2.2 Performance Measure
Based on the Physical Complexity measure of organisation, a performance measure can be
constructed which shows the use of the information space, i.e. the organisational complexity relative
to the maximum possible organisational complexity. The efficiency E for a VLP,
E =
C
V
C
V
P
(12)
is the actual complexity C
V
over the complexity potential C
V
P
. The efficiency E ranges between zero
and one:
0 ≤ E ≤ 1
(13)
when the actual complexity C
V
equals the complexity potential C
V
P
, when there is no randomness
in the population. The agent-chain populations considered earlier, for which the complexity measure
C
V
was applied are shown below with their respective efficiencies in Figure 9.
agent
agent-chain
population (HIGH organisation)
population (LOW organisation)
C
V
= 0.575
C
V
= 4.420
%E = 88.4%
%E = 11.5%
Figure 9: Efficiency in Populations of agent-chains
The efficiency of the populations is as expected. It shows the population with high organisational
complexity is efficient, but that there is still some randomness in the population. For the population
15

Page 20
with low organisational complexity, it shows that the population is almost entirely random. The
performance measure is concise, succinct, and matches ones intuition.
4.2.3 Clustering
Clustering in a population of evolving agent-chains is the grouping of same or similar sequences around
an optimum genome on the fitness landscape. If the evolutionary process does not become trapped at
local optima, then it will reach the global optima. It is important to realise that the fitness landscape
is the combination space of the agents weighted with their fitness values. As such it is dependent on
the set of agents(alphabet) from which solutions are constructed as much as it is dependent on the
selection pressure (fitness function).
Figure 10: Fitness Landscape - Single Global Optimum
The population will move across the fitness landscape, temporarily clustering over local optima, on
the way to clustering around the optimal genome at the peak of the global optimum, assuming that the
evolutionary process does not become trapped at local optima. The population will eventually cluster
over all global optima. In Figure 10 the population will cluster into a single cluster at the peak of the
global optimum. This process of clustering at the peak of a global optimum is indicated by the the
average population fitness F
avg
tending towards the maximum fitness F
max
, the clustering indicator.
The optimal solution(agent-chain) which has maximum fitness F
max
is increasingly becoming the
dominant agent-chain in the population.
F
avg
→ F
max
(14)
The clustering indicator is especially useful as it easily and clearly indicates the occurrence of
clustering, independent of the number clusters in the population. Also its output is much more
reliable to measure and programme than the clustering coefficient which is about to be introduced.
This will be discussed further in section (7.2.2), Conclusions.
At the same time, the population complexity C
V
(11) tends towards the complexity potential
C
V
P
(7), because the uniformity of sites across the population is increasing as the optimal solution
(sequence) is increasingly becoming the dominant sequence in the population,
E =
C
V
C
V
P
→ 1 ∧ |T| → 1
(15)
16

Page 21
when C
V
→ C
V
P
, given (12), and where |T| is the number of clusters. The efficiency E will not quite
reach one due to mutations. The efficiency E acts as clustering coefficient, tending towards its
maximum, and when the population consists of only one cluster then |T| = 1. The other extreme is
when the number of clusters equals the size of the population, |T| = |S|. This would only occur with a
non-discriminating selection pressure, in which the fitness landscape would be perfectly flat, as shown
in Figure 11 below:
Figure 11: 3D Fitness Landscape - Perfectly Flat
The population occupancy of the fitness landscape would be uniformly random, as any position(agent-
chain combination) has the same fitness for any agent-chain in the population.
So the en-
tropy(randomness) would be tending towards maximum, resulting in the complexity C
V
(11) tending
towards zero, the efficiency E would tend to zero,
E =
C
V
C
V
P
→ 0 ∧ |T| → |S|
(16)
when C
V
→ 0, given (12), where |T| is the number of clusters and |S| is the population size. It
will generally never quite reach zero, due to mutation. So the number of clusters |T| will tend to the
population size and each cluster would in fact consist of only one agent-chain. The clustering indicator
(14) would be showing clustering, as the average fitness is always the maximum fitness, F
avg
= F
max
.
If there are multiple global optima, the clustering indicator F
avg
= F
max
behaves as before, because
it is independent of the number of clusters.
Figure 12: 3D Fitness Landscape - Multiple Global Optima
However, the efficiency E will no longer tend towards its maximum (13), precisely for the reason
that the population consists of more than one cluster. For a population S, a cluster is a sub-population
with an efficiency that tends towards the maximum (13).
17

Page 22
The simplest scenario of multiple clusters, is when there are pure clusters. Pure meaning that
there are no agents shared between the clusters. So, there are as many totally distinct optimal
solutions with maximum fitness F
max
, as there are clusters. In this scenario, the value the efficiency E
no longer tends towards one, but a value based on the number of clusters |T|, because a number of the
probabilities p
d
(i) in (10) at each site become one over the number of clusters |T|. Given p
d
(i) =
1
|T|
,
(7), (10), (12) and the fact that the number of the probabilities is equal to the number of clusters.
Then the per-site entropy calculation for VLPs H
V
(i),
H
V
(i) = log
|D|
|T|
(17)
where i is the site, |D| is the alphabet size, and |T| is the number of clusters. Hence the efficiency E,
E → 1 − (log
|D|
|T|)
(18)
given (7), (11) and (12). So with pure clusters, the value to which the clustering coefficient E is tends
towards can be used to determine the number of clusters |T|.
For clusters that are not pure, the relationship cannot be specified so succinctly. Environments
with multiple optima will potentially lead to the population clustering around each global optimum,
in which case the the efficiency E will no longer tend towards the maximum. For a population S
with clusters, each cluster is a sub-population with an efficiency that tends towards the maximum.
To specify this more accurately the following function is required,
efficiency(input : population)output : int
(19)
which provides the efficiency E between the range 0 ≤ output ≤ 1, as defined in (12), from the input
of a population.
Assuming that the clustering indicator is actively indicating clustering (F
avg
→ F
max
) for a multiple
global optima fitness landscape, then a cluster t,
t ∈ T → t ⊆ S ∧ efficiency(t) → 1 ∧ |t| ≈
|S|
|T|
t∈T
|t| = |S|
(20)
is a member of the set of clusters T in the population S, and therefore a sub-population of the
population S. A cluster by definition has an efficiency E tending towards the maximum, one (12).
Furthermore, the cluster size |t| is roughly equal to the population size |S| divided by the number
of clusters |T|. It is only roughly equal, as the division may not result in a whole number. These
conditions are true for all members of the set of clusters T, and the summation of the cluster sizes in
T equals the size of the population |S|. This last condition ensures that clusters are non-overlapping,
i.e. do not share agent-chains(members).
If we visualise the population on the 3D fitness landscape of Figure 12, in Figure 13, even though
the clustering indicator is active, the clustering cannot be seen in the visualisation of the population.
18

Page 23
F
avg
F
max
C
V
P
=
V
= 3
D = alphabet size = 3
D = alphabet ={ , , }
C
V
= 1.107
E = 0.369
agent
agent-chain
Figure 13: Population with Hidden Clusters
If we arrange the population to show the clustering, then we can clearly see the two clusters present
in the population, in Figure 14.
Cluster 2
Cluster 1
Figure 14: Population with Clusters Showing
Figure 14 clearly shows that the clusters in the population both have efficiencies E tending towards
their maximum, compared to the efficiency E of the population as a whole which is tending towards
a value significantly below the maximum. This is the behaviour of clusters as specified in (20).
The population size |S| in Figures 11 and 12 is double the minimum requirement, specified in
(9), so that the complexity C
V
in (11) and efficiency E in (12) could be applied to the clusters
without introducing new definitions, while simultaneously explaining the basic principles of clustering.
However, when determining the variable length
V
of a cluster t the sample size requirements are
different, specifically a cluster t is a sub-population of S, and therefore by definition cannot have a
population size equivalent to S (except when the population consists of only one cluster). Therefore
(9) must be updated to manage clusters,
V
=
sampleSize(
V
+ x) < sampleSize(
V
) ≥ |D|
V
if for population S
sampleSize(
V
+ x) < sampleSize(
V
) ≈
|D|
V
|T|
if for cluster t
(21)
where the length for a VLP ranges between 1 ≤
V
max
, x ranges between 0 < x <
max
V
, S is
the population and |S| ≥ |D| > 0, D is the alphabet, t is a member of the set of clusters T,
max
is the
maximum length in a VLP, and T is the set of clusters in population S. A population with multiple
clusters will always have an efficiency E that never tends towards the maximum. A reformulation is
required of the efficiency measure E, to an efficiency measure capable of managing populations with
multiple clusters E
m
,
E
m
(S) =
C
V
C
V
P
if |T| = 1
t∈T
E
m
(t)
|T|
if |T| > 1
(22)
where T is the set of clusters of population S as defined in (18). It is equivalent to E if the population
consists of only one cluster, and if there are multiple clusters E
m
is the average of the efficiencies of
19

Page 24
the clusters.
Different clusters in a population are not necessarily different species. This depends on the
definition of species within the DE model. In biology, different species are often defined by the
lack of ability to interbreed. This is represented in evolutionary computing by crossover not being
able to be performed.
4.2.4 Atomicity
The property of atomicity in the Agent Pool, which is the set of available agents at the Habitat,
requires no agent be able to functionally replace an agent-chain with a length of two or more.
a
b
b
c
c
d
a
c
D = alphabet = { ,
,
,
}
Figure 15: Non-atomic agents
This requirement was necessary, because non-atomic agents adversely affect the uniformity
calculated in the per-site entropies, which ultimately indicates information. So, information is being
lost. Consider the following simple example:
site 2 =
population
Cluster 1
Cluster 2
Figure 16: Population including Non-atomic agents
As
functionally replaces
, the uniformity across site two is lost, shown in Figure 16. The
efficiency E of the population is 0.5 rather than 1, because of the non-atomicity. The clustering
defined in the previous subsection is vital in managing the non-atomicity, because the non-atomicity
leads to the formation of clusters. These clusters can be found, indicated by the clustering indicator
(14) and clustering coefficient (15), and the efficiency for populations with multiple clusters E
m
(22)
can be used to calculate the actual efficiency, which in this case is one, as shown in Figure 16.
It should be noted that clusters formed by the presence of non-atomic services cannot be compared
by their complexity values, C
V
(11), even though they are functionally identical, i.e. perform the same
task. Due to the non-atomicity they have different complexity values C
V
, as shown in Figure 16. The
complexity C
V
is an absolute measure, whereas the efficiency E (12) is a relative measure. So, the
clusters can be compared by their efficiencies, as shown in Figure 16.
With the efficiency measure E
m
(22) for populations with multiple clusters, and with non-atomicity
fitting the clustering model, atomicity is no longer required in the agent pool for the application of
the organisational complexity measure, Physical complexity.
20

Page 25
4.3 Relation to Fitness
Now that organisational complexity (11) has been defined, its relation to fitness can be considered.
Fitness is a property of individuals, not of the population. The average fitness, although a population
measure, does not measure the organisation(clustering).
4.3.1 Original Physical Complexity
The maximum fitness is part of the evolutionary process and increases over the generations. The
maximum fitness never decreases as the selection pressure is static and the mutation rate is not high
enough to cause the loss of all the maximum fitness solutions from the population[5].
0
2
4
6
8
10
50
60
70
80
90
100
Approximate Complexity (C
1
)
Updates [x10
4
]
0
2
4
6
8
10
10
−2
10
0
10
2
10
4
Fitness
Updates [x10
4
]
Figure 17: Original Physical Complexity Graph [3]
The complexity increases over the generations, but can suffer short-term decreases due to the arrival
of fitter mutants. As new fitter mutants arrive and spread throughout the population over several
generations, the uniformity of the sites will decrease, increasing the randomness, which is measured
in the per-site entropies. Therefore, the complexity, which is the complexity potential minus the sum
of the per-site entropies, decreases temporarily[5].
The complexity initially starts high, which is due to the population being seeded with a single
sequence (individual), that temporarily takes over the population[3].
4.3.2 Physical Complexity for VLPs
The Physical Complexity for VLPs in (11) has the same structure and properties as the original
Physical complexity in (1), only the input has changed, so the relation between fitness and complexity
is the same.
The further work using fitness as a clustering indicator in (14), can be summarised as follows. The
convergence of the average fitness to the maximum fitness indicates clustering within the population
(14), and the efficiency (12) acts as clustering coefficient, showing if there are one or more clusters. In
the case of pure clusters, the clustering coefficient can show the exact the number of clusters (18).
21

Page 26
5 Methods
5.1 Simulation
The Digital Ecosystem model from Section 2 is the primary requirements for the simulation. A
simulation is required that has the basic features from MASs and Evolutionary Computing, so that
the organisational complexity measure can tested.
5.1.1 Mobile Agent System Components
The Habitat (Agent Station) provides a location where agents can be created, where they can migrate
to, and where Evolving Populations are used to find solutions to user Requests. agents registered
at the Habitat are found in the Agent Pool, which provides the alphabet from which to construct
solutions. For the scope of these simulations, there will be one Habitat with an Agent Pool will of 15
randomly generated agents.
Each agent contains a Process object to represent the agent’s semantic description. A Process
object provides an abstract representation of a semantic description, in the form of a list of integer
tuples. Each tuple representing an attribute of the semantic description, one integer for the attribute
identifier, and one for the attribute value. The attribute identifiers, range in value between one and
ten, and in quantity between three and five. The values of the attributes range between one and a
hundred. A sample is shown below:
agent s Semantic Description = [(1,40),(5,67),(3,88)]
Figure 18: agent’s Abstract Semantic Description - Process Object
The user Requests are handled by the Habitat by instantiating an Evolving Population object
which uses EC to find the optimal solution(s). The Request object consists of a list of Process objects.
A sample is shown below:
User Request = [[(3,91),(4,15)],[(8,57),(6,45),(3,88)],[(1,40),(5,67)]]
Figure 19: User Request - List of Process Object
For these simulations there will be one request consisting of 20 randomly generated Process objects.
The request will remain unchanged during the simulation run as it is used to instantiate the Fitness
Function object and provide the ’selection pressure’.
Although the Process objects abstract definition makes it widely applicable and easy to implement,
feedback suggested that it was unclear. So a version of the simulation was constructed showing the
agents and the user Request in the context of the travel industry. Basically the numeric semantic
descriptions where given meaning. Although the simulation still works on the numeric representation,
a filter was created that assigns meaning to the numbers. An example is shown below:
22

Page 27
agent’s Abstract Semantic Description = (1,25) , (2,35) , (3,55) , (4,6) , (5,37) , (6,18)
agent’s Semantic Description = Business = Airline
Company = Luthansa
quality = economy
cost = 60 per person
from London to Rome
Figure 20: Agent’s Semantic Description
5.1.2 Evolutionary Computing Components
A simple evolutionary process is required to evolve a population of agent-chains to meet a user Request.
The most important aspect is the Fitness Function which produces a selection pressure from a user
Request, to guide the evolutionary process. The Fitness Function is instantiated with a Request object
(user request) to determine the fitness of the solutions in the population. This is done by comparing
the Process objects of the agent-chains with the list of Process objects in the Request object (user
request). The comparison calculates the distance as a percentage value (real number), between the
agent-chains and the user request with respect to their Process objects. So, the Fitness Function
assigns fitness values between 0.0 and 100.0 to each agent-chain in the population.
An Evolving Population object is instantiated to find a solution to a user Request, from the set of
agents registered at the Agent Pool of the Habitat (Agent Station) handling the request. The solution
is found by evolving the population of solutions, which are initially created by copying the set of agents
from the Habitat’s Agent Pool to the Evolving Population object.
The evolutionary process works by assigning fitness values to the current population using the
Fitness Function. Then the bottom 10% of the population is deleted (death rate), and then the top
10% of the population is allowed to replicate (replication rate) more than once. The middle 80% are
allowed to only replicate once. During replication crossover can occur to 10% of the population, in
which two agent-chains exchange their later halves. Subsequently, 10% of the population is mutated
randomly using point mutations (mutation rate), with agents drawn from the Agent Pool. The point
mutations consists of insertions (an agent is inserted into an agent-chain), replacements (an agent is
replaced in an agent-chain), and deletions (an agent is deleted from an agent-chain).
The population size |S| is maintained at 1.2|D|
V
, to ensure (9) remains true, so that the
organisational complexity measure can be calculated, Physical complexity for VLPs C
V
in (11). This
is done by temporarily increasing the replication rate or death rate. If the population size drops
below 1.2|D|
V
, then the replication rate is increased proportionally to the difference in size until
the population size returns to 1.2|D|
V
, at which point it is dropped back to 10%. If the population
size increases above 1.2|D|
V
, then the death rate is increased to proportionally to the difference in
size until the population size decreases to 1.2|D|
V
, at which point the death rate is dropped back
to 10%. So the population size |S| is maintained around 1.2|D|
V
by a negative feedback process, to
ensure that the population size |S| is always greater than |D|
V
.
5.1.3 Organisation
The Physical complexity for VLPs will be implemented, C
V
in (11), to determine the organisational
complexity. The involves calculating
V
in (9) and then the per-site entropies H
V
in (10). The
23

Page 28
Efficiency E in (12) will also be implemented.
5.2 Hypotheses
5.2.1 Physical Complexity for VLPs
The modified measure C
V
in (11) is not only expected, but is required to increase over the generations,
and suffer dips from the arrival of new mutants. The change to VLPs should not affect this, and if it
does then the construction of the modified measure would be flawed.
generations
complexity
C
V
0
Figure 21: Expected Results for Physical Complexity for VLPs
The Physical complexity C
V
can is expected to increase over the generations, as the complexity
potential will increase as the length of the agent-chains increase. Realised complexity (information)
increases with the uniformity of sites across the population, which will in general increase over the
generations. However diversity, such as the arrival of a new fitter mutation can cause the Physical
complexity C
V
to decrease in the short-term, due to the temporary increase in diversity.
5.2.2 Efficiency Measure
The efficiency measure should converge towards its maximum one, in populations with only one
cluster. This convergence should be indicated by the fitness measures, specifically the average fitness
converging to the maximum fitness.
generations
efficiency
E
0
1
Figure 22: Expected Results for Efficiency
The efficiency is expected to tend towards one, although erratically, as it is dependent on the
complexity C
V
which itself suffers from decreases in the short-term.
24

Page 29
5.2.3 Investigate Clustering
The efficiency measure E (clustering coefficient) should act in conjunction with the fitness measures
(clustering indicator) to allow the determination of the number of clusters. The agents in the Habitat
will be instantiated to allow the formation of two pure clusters within the population. They will be
measured after the population reaches equilibrium. So the efficiency E, due to the multiple clusters
within the population, should tend towards a value significantly below the maximum:
generations
efficiency
E
0
1
0.74
Figure 23: Expected Results for Efficiency with Two Pure Clusters
Given |D| is 15, |T| is 2, and (18), then the efficiency E,
E → 0.744
The efficiency E will tend towards 0.744 rather than the maximum, as there will be two clusters within
the population. This will reduce the uniformity across the sites, and therefore the complexity C
V
(11)
upon which the efficiency E depends. Furthermore the efficiencies E of both clusters should be near
their maximum, which is one (12).
25

Page 30
6 Results
6.1 Visualisation of Population
The simulation was used in conjunction with a spreadsheet macro to create the visualisation shown
below. It shows the extremes of organisation and disorganisation, within an evolving agent population.
Both populations were run for a 1000 generations, the one on the left under normal conditions, and
the one on the right without a selection pressure. Each coloured square
represents and individual
agent, and a horizontal line of coloured squares
represents an agent-chain.
Population (with selection pressure)
agent-chain length
1
v
1
v
Population (without selection pressure)
agent-chain length
C
v
= 19.351
C
v
= 4.227
Figure 24: Visualisation of Organisational Complexity
The difference is evident, and shows that their respective complexity C
V
(11) values are accurate
in describing their organisational complexity. The visualisation also shows, that the highly ordered
population on the left is not without significant variation. This is good, because it shows that the
evolutionary computing process creates the opportunity to find fitter(better) solutions, not getting
trapped at local optima.
To show that the abstract Process objects which represent the agents semantic descriptions are
a reasonable abstraction, a version of the simulation was constructed showing the agents and the
user Request in the context of the travel industry. The abstract numeric semantic descriptions are
26

Page 31
processed through a filter, a small sample of the output is shown below:
Generation = 17
Population size = 296
Highest % Match = 85.53600000000002
Average % Match = 68.85558445945946
Organisational Complexity = 3.7579389676583435
Complexity Potential = 5
Efficiency(Certainty Best Solution) = 0.7515877935316687
USER REQUEST:
Business = Airline Company = Air France quality = economy cost = 60 per person Paris to Rome
Business = Hotel Company = Intercontinental quality = 2* cost = 110 per night Rome for 3 nights
Business = Airline Company = Air France quality = economy cost = 60 per person Rome to Monte Carlo
Business = Hotel Company = Intercontinental quality = 2* cost = 200 per night Monte Carlo for 2 nights
Business = Airline Company = KLM quality = economy cost = 50 per person Monte Carlo to London
BEST SOLUTIONS(S) = 1
SERVICE: fitness=85.53600000000002 size=5
Business = Airline Company = Luthansa quality = economy cost = 90 per person Paris to Madrid
Business = Hotel Company = Marriot quality = 3* cost = 90 per night Rome for 3 nights
Business = Airline Company = Air Italia quality = economy cost = 100 per person Rome to Monte Carlo
Business = Hotel Company = Intercontinental quality = 3* cost = 200 per night Milan for 2 nights
Business = Airline Company = Luthansa quality = economy cost = 60 per person London to Rome
Figure 25: Semantic Filtered Output
6.2 Physical Complexity for VLPs
Below shows a graph of the Physical complexity modified for VLPs C
V
(11), against the generation.
The maximum fitness F
max
has also been included.
50
40
30
20
10
0
generation
1
50
100
150
200
250
300
Graph1: Population Complexity
complexity
maximum fitness
F
max
C
v
Figure 26: Graph 1: Population Complexity
27

Page 32
The complexity for a VLP, C
V
, increases over the generations, and shows short-term decreases as
expected. The increase is due to increasing information being stored, and the small jumps is when the
effective length
V
of the population increases. A temporary decrease that starts at generation 140 is
preceded by the arrival of a new fitter mutant, shown by a jump in maximum fitness F
max
.
6.3 Efficiency Measure
The graph below shows the efficiency measure E (12) over the generations. The efficiency measure
E(12) is the complexity C
V
(11) over the complexity potential C
V
e
(7).
1
0.9
0.8
0.7
0.6
e
f
f
i
c
i
e
n
c
y
300
250
200
150
100
50
0
generation
Graph 2: Population Complexity
Figure 27: Graph 2: Population Efficiency
The population being used has only one cluster, so the efficiency E in (12) tends to its maximum
1, as expected. The significant decreases on the way towards the maximum, can be seen to decrease in
magnitude and frequency over the generations. The measure is directly dependent on the complexity
C
V
, so decreases in C
V
are mirrored here. These decreases are caused by the creation of fitter(better)
mutants within the population which eventually becomes the dominant genotype, but during the
process causes the complexity and efficiency to decrease in the short-term.
28

Page 33
6.4 Investigate Clustering
The scenario considered, is a population with two pure clusters, such that the clusters share no agents.
Below the graph shows the efficiency E over the generations:
0.8
0.7
0.6
0.5
e
f
f
i
c
i
e
n
c
y
600
400
200
0
generation
Graph 3: Efficiency staying below a Bound
0.744
Figure 28: Graph 3: Clustering Coefficient
The efficiency E, the clustering coefficient, tends towards a value significantly below the maximum
of one, 0.744 as predicted. This indicates clustering with more than one cluster in the population. A
visualisation of the population is shown below, indicating the clusters.
agent-chain length
1
v
Cluster2
C
v
= 19.663
Population
C
v
=14.630
Cluster1
C
v
=
19.513
Figure 29: Visualisation of Clusters
The population contains two clusters, which can be clearly seen from the visualisation. The
clusters, as expected, each have a much higher complexity C
V
on their own, and near maximum
efficiency E on their own. The efficiency E shown in Graph 3 (Figure 28) appears to oscillate around
the best fit curve. It is mostly above the best fit curve for the first two hundred generations, after
which it only shows minor oscillations away from the best fit curve. The large peaks and decreases
29

Page 34
initially are due to the creation of fitter(better) longer mutants(agent-chains) in the population. As
they spread initially they cause decreases in efficiency, but once they take over the population the
efficiency rises to a new higher level.
7 Conclusions
7.1 Achievements
7.1.1 Digital Ecosystem Model
An abstract model for a Digital Ecosystem has been constructed in terms of Mobile Agent Systems and
Evolutionary Computing. Finally, the flexibility of the MAS model, due to the nature of agents, will
allow the work to continue without suffering from any interface or execution technology advancements
that may occur in the future.
7.1.2 Definition for Organisational Complexity
The alternative measures for representing organisational complexity in evolving populations have
been evaluated and considered for their suitability to the Digital Ecosystem. The Physical complexity
measure was then chosen for further investigation, as its properties closely matched our intuitive
definition of organisational complexity.
The Physical complexity measure has been studied in detail, and a review of the literature
beyond that of the measure’s author, Adami, has been performed to confirm that the measure is
the most suitable choice. It effectively measures a populations organisational complexity relative to
the environment, without actually considering the environment (selection pressure/fitness function)
directly. This is a very powerful and useful property which effectively insulates the formulation of the
measure from any changes in the fitness function.
It has been adapted to provide a general definition of organisational complexity (organisation) in
agent populations of real agent systems, rather than the near machine code self-replicating programs
of the Avida artificial life simulator[2]. This definition of Physical complexity matches the intuitive
definition of organisational complexity in a population. Most significantly it has been reformulated
algebraically, for populations of variable length individuals, which has been shown to be correct
experimentally through simulations.
7.1.3 Performance Measure
An effective performance measure, the efficiency E (12), has been constructed from the definition of
Physical complexity for variable length populations, and describes the efficiency of information storage
within a population.
The efficiency measure is always between zero and one, so it can be used to compare any two
populations, independent of their size, their length, and whether the length is variable or not.
7.1.4 Clustering
The Physical complexity measure intrinsically and when extended to variable length populations, has
minimal requirements to be applicable. This means it can cater for many different scenarios. One
being clustering, for which the preliminary work is promising.
30

Page 35
When the efficiency E performance measure tends to a bound it acts as a clustering coefficient
(15), which not only indicates the level of clustering, but can also distinguish between a population
with a single cluster and population of multiple clusters. In the case of pure clusters, the number of
clusters can be determined from the value that the clustering coefficient tends towards. It is suspected,
and will be further investigated, that for impure clusters the number of clusters can be estimated.
The clustering coefficient (15) is when the efficiency E performance measure tends to a bound.
7.1.5 Atomicity
This atomicity of agents, requiring that no single agent can functionally replace an agent-chain of
length two or more, represented a substantial challenge. It would have been an unrealistic constraint
to propagate form the model to the real-world. Non-atomic agents were problematic, because they
affect the summing of the per-site entropies, which is the main construct of the Physical complexity
measure.
The non-atomicity in itself causes clustering, so the clustering concepts were used to resolve the
problem by creating a variant of the efficiency measure for multiple clusters E
m
(22). Thereby removing
the effect on the per-site entropies, as the complexity of the clusters are calculated separately, and
their individual efficiency measures averaged. So, atomicity is no longer required for the application
of the Physical complexity measure.
7.1.6 Experimental Results
The experimental results, though preliminary, are very promising. The results show that the Physical
complexity measure of organisational complexity has been successfully extended for variable length
populations, shown by the similarity of Graph 1 (Figure 26) to its prediction in Figure 21. However,
the temporary decreases in the Physical complexity C
V
were not always as severe as predicted. The
reason for decreases is the arrival of new mutants to the population. The decreases were small when
only a single mutation had generated the mutant, whereas large decreases occurred when the mutant
had several mutations. Thus, the fundamental reason is because the mutation rate was relatively low
at 10%.
The results for the efficiency E performance measure were as predicted, shown by the similarity
of Graph 2 (Figure 27) to its prediction in Figure 22. Although the presumed oscillations were
underestimated, and were caused by the increasing length
V
in the population. As
V
increases, so
does the complexity potential C
V
P
, which is the denominator of the efficiency E. So as the population
length
V
came to stabilise to its optimum, so did the efficiency E.
With respect to the clustering, the clustering coefficient, efficiency E, under clustering tended
towards the predicted value, shown by Graph 3 (Figure 28) compared to its prediction in Figure 23.
Again, here the oscillations were underestimated, and have the same cause, the initially increasing
length
V
.
Overall, the simulation results have supported the hypotheses, and have provided more detail to
the behaviour of the phenomena under investigation.
7.1.7 Visualisation & Semantic Filter
The results can be visualised to allow results to be checked intuitively as well as numerically. It is
an important tool for comparing the intuitive understanding from the visualisation to the numerical
31

Page 36
conclusions.
The semantic filter output not only shows that the numerical semantic descriptions are a reasonable
modelling assumption, but can show the semantic descriptions in a human readable form. As the filter
acts as a kind of translation matrix, theoretically it can be altered without too much difficulty to show
output for applications other than the tourism industry.
7.2 Limitations
7.2.1 Possible Limitations of the Chain Structure
The one significant limitation of the Physical complexity measure is the requirement that when
agents/services aggregate, they do so in the form of chains. Their linear structure is the basis for the
measure and for industry sectors other than the tourism industry it not be applicable. For example,
the manufacturing industry supply chains can take the form of tree structures, but in their simplest
form are equivalent to the linear chain structure proposed. Also, a supply tree can be split into two
chains with partial duplication. The tourism industry need not have agents chained. Consider the
processes in a package holiday; flight, hotel, limousine. They need not be organised into a chain, as
they do not have strong interdependencies, for example airlines do not refuse to serve you based on the
hotel that you use. Although the tourism scenarios do not require chains they can be represented by
chains, based on the expected order in which events are supposed to occur, i.e. you want a limousine
after the flight lands.
The most logical step from the atomic agents, is a chain structure for combining agents. At the
moment on the available information the chain structure is a viable assumption for combining agents.
7.2.2 Clustering Indicator
The clustering indicator, although not a construct of the complexity measure, but of the fitness
information is very useful for indicating if the efficiency measure is tending towards its limit. When
the efficiency tends towards its limit, the exact numerical value is not known, and the plot of the
measure oscillates uncertainly. Plotting the log of one or both of the variables unfortunately yields
no better indication. Therefore, the fitness based clustering indicator is very useful in indicating
that clustering is occurring, and then the efficiency measure can be used to determine the number of
clusters.
7.2.3 Clustering
In the results, only the original efficiency measure E and a visualisation of clustering in the population
after the evolutionary process had reached equilibrium could be provided. The efficiency E
m
for
multiple cluster previously defined could not be implemented efficiently, because the clustering, as
currently defined algebraically, lacks a computationally feasible definition for finding the clusters.
The solution is not obvious, and at the time of writing this report experimental simulations using a
distance matrix that shows the distance between individuals within a population is being prototyped.
The distance determined by either the fitness function or the Physical complexity for VLPs.
7.3 Summary
32

Page 37
We recommend adopting the Physical complexity measure for VLPs as the organisational complexity
(organisation) measure, based on the strength of the experimental results. The core work has been
done in defining the organisational complexity (organisation), and in creating an effective performance
measure from the Physical complexity, which has been tested experimentally with the simulations.
With the Physical Complexity as the organisational complexity (organisation) measure and an Digital
Ecosystem model, we can begin to investigate deeper questions described in section 8. The simulation
will be used for experimenting with different scenarios and algebraic formulations, so that the intuitive
understating from the visualisations can be compared to the numerical results.
8 Future Work & Research Plan
8.1 PhD Thesis Proposed Chapters
Here the future work is presented, structured as the expected chapters of my PhD thesis.
Chapter 1 will be the Introduction which will discuss the research background, including
all the biological and Mobile Agent Systems terminology. The Digital Ecosystem model will be
introduced so that self-organisation of agent populations can be investigated. A Digital Ecosystem is
extremely complex and we wish to understand generically how such a system works, then contribute
by changing the evolutionary dynamics with the addition of local intelligence in the agents. This effect
of intelligence on evolution is related to the Baldwin effect, where a biological trait becomes innate as
a result of first being learned.
Chapter 2 will be an Explanation of the Digital Business Ecosystem. This chapter will
describe the EU Framework 6 project, Digital Business Ecosystems (DBEs). The aspects of the
project we wish to investigate and model. Then the Digital Ecosystem (DE) model we intend to
use to investigate the DBE will be defined, extending on the information presented in section 2 of
this report. The DE model will be used as the basis for the architectural specification of the DBE
Evolutionary Environment (EvE), the project’s Digital Ecosystem. Effort has already been made, and
will continue, to ensure that the models used are applicable to the DBE.
Chapter 3 will be the Complexity work presented here that shows how to measure the
organisational complexity of an evolving population. For an evolving population the organisational
complexity measures the amount of information present by measuring the uniformity within the
evolving population. For an ecosystem the main measure of complexity is the level of diversity.
In collaboration with a Master’s student a complexity measure for habitats in the Digital Ecosystem
is being investigated. As the main theoretical difference in complexity for an ecosystem is to measure
the diversity, an inverted version of the organisational complexity (Physical complexity for VLPs)
which measures diversity rather than uniformity will be used.
Chapter 4 will be the Complexity Clustering based Self-Organisation which will provide
a method to partition an evolving population into clusters. Although the number of clusters can be
calculated from the efficiency E, it is not possible to partition the evolving population into clusters
efficiently. A hierarchical clustering methodology is currently being investigated to do this. It uses
a distance matrix, of the distance between individuals, to partition the evolving population. The
distance determined by either the fitness function, in which the number of clusters is dynamically
determined by the inter-cluster distance, or the Physical complexity for VLPs in which the number of
clusters estimated from the efficiency E can be used.
Chapter 5 will be Evolutionary Dynamics with Neural Networks based pattern recognition.
33

Page 38
Pattern recognition for decision making purposes is seen as the key requirement for intelligent
behaviour. The ability to perceive similarity or difference in objects is fundamental in decision making.
Distributed intelligence will be investigated by applying neural networks for pattern recognition at the
level of individual agents. This will encourage the formation of clusters within Evolving Populations,
for preferential crossover (genetic recombination) of individuals within the clusters. This Baldwin
style effect allows phenotypic plasticity (adaptation through learning), and would smooth the fitness
landscape increasing the efficiency of the evolutionary process[13, 32]. However, phenotypic plasticity
has inherent costs associated with the training phase in terms of energy and time. For these reasons, in
a second phase, evolution may find a way to achieve the same successful behaviours without plasticity.
Phenotypic plasticity is analogous to a local search strategy. The evolutionary process and the local
search may be used in combination, often achieving higher efficiency than either of the methods
alone[10, 21].
Chapter 6 will be a Comparison of Clustering, Neural Networks & Local Search to show
the effect and significance of the research. The results will be presented, compared and contrasted
with one another to determine if the investigated methods can optimise the evolutionary process,
and by how much. This chapter will primarily consist of visualisations, graphs, and statistical tests.
Deviations of the results from their hypothesis and predications will be discussed, and where possible
explanations for the unexpected behaviour will be provided.
Chapter 7 will be the Conclusion in which I will discuss how the techniques described in
Chapters 3, 4, and 5 effect the evolutionary dynamics, using the results in Chapter 6. Furthermore,
what consequences or side effects there are, positive or negative, from the use of these techniques.
Consideration will then be given to the implications of this work for the DBE project. As the models
proposed have been, and will be, used as the architectural basis of the Evolutionary Environment
(EvE) and Distributed Intelligence System (DIS), the general level of applicability of the work to the
DBE project is expected to be high. The DIS is the software component intended to optimise the
EvE and will integrate the complexity, clustering and neural networks research. Finally, to judge the
value of the work, its wider applicability and implications will be evaluated.
8.2 Timeline
The timeline proposed for the Research Plan is coordinated with the funding I current have from an
EU Framework 6 project, Digital Business Ecosystems. From the 1st February 2005 I have 21 months
funding remaining until 31st October 2006.
2
0 0
5
2
0 0
6
2
0
0 7
F M A M J
J A S O N D J
F M A M J
J A S O N D J
F M A
Chap. 4
Clustering
7m
Chap. 5
Neural Networks
6m
Chap. 6
Comparison
6m
DBE inte-
gration
1m
1m
Thesis
Write-Up
6m
Coordination
Figure 30: Research Plan Timeline
34

Page 39
References
[1] C Adami. Self-organized criticality in living systems. Physical Review Letters, A 203:23, 1995.
[2] C Adami. What is complexity? BioEssays, 24:1085–1094, 2002.
[3] C Adami. Sequence complexity in darwinian evolution. Complexity, 8(2):49–56, 2003.
[4] C Adami and N Cerf. Physical complexity of symbolic sequences. Physica D, 137:62–69, 2000.
[5] C Adami, C Ofria, and T Collier. Evolution of biological complexity. Proceedings of the National
Academy of Sciences, 97(9):4463–4468, 2000.
[6] W Ashby. Principles of the self-organizing dynamic system. Journal of General psychology,
37:125–128, 1947.
[7] A Barron, J Rissanen, and B Yu. The minimum description length principle in coding and
modeling. IEEE Transactions on Information Theory, 44:2743–2760, 1998.
[8] G Basharin. On a statistical estimate for the entropy of a sequence of independent random
variables. Theory Probability and its Applications, 4:333–336, 1959.
[9] W Beck, K Liem, and G Simpson. LIFE: An Introduction to Biology. HarperCollins Publishers
Inc., 3 edition, 1991.
[10] R K Belew and M Mitchell, editors. Adaptive individuals in evolving populations: models and
algorithms. Addison-Wesley, 1996.
[11] G Briscoe. Self-organisation in multi-agent systems. Digital Business Ecosystem, Contract no
507953, 2004.
[12] G Chaitin. Algorithmic information and evolution. In Perspectives on Biological Complexity,
pages 51–60. IUBS Press, 1991.
[13] K Downing. Reinforced genetic programming. Genetic Programming and Evolvable Machines,
pages 259–288, 2001.
[14] R D’Souza and N Margolus. Thermodynamically reversible generalization of diffusion limited
aggregation. Physical Review E, 60:264–274, 1999.
[15] J Epstein. Zones of cooperation in demographic prisoner’s dilemma. Complexity, 4:36–48, 1998.
[16] H Flyvbjerg, K Sneppen, and P Bak. Mean field theory for a simple model of evolution. Physical
Review Letters, 71:4087–4090, 1993.
[17] C Gershenson and F Heylighen. When can we call a system self-organizing?
In Advances
in Artificial Life, 7th European Conference, ECAL 2003 LNAI 2801, pages 606–614. Springer-
Verlag, 2003.
[18] D.E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-
Wesley, 1989.
[19] H Gutowitz, J Victor, and B Knight. Local structure theory for cellular automata. Physica D,
28:18–48, 1987.

Page 40
[20] W G Hale, J P Margham, and V A Saunders. Dictionary of BIOLOGY. HarperCollins Publishers
Inc., 2 edition, 1995.
[21] G E Hinton and S J Nowlan. How learning can guide evolution. Complex Systems, 1:495–502,
1987.
[22] T Kurz. Dbe term glossary 04070 v00.00.01. 2004.
[23] M Newman. Evidence for self-organized criticality in evolution. Physica D, 107:293–296, 1997.
[24] G Nicolis and I Prigogine. Self-organization in Nonequilibrium Systems: From Dissipative
Structures to Order Through Fluctuations. John Wiley & Sons Inc, 1997.
[25] H Parunak and S Brueckner. Entropy and self-organization in multi-agent systems. In Proceedings
of the Fifth International Conference on Autonomous Agents, pages 124 – 130, 2001.
[26] A Prugel-Bennett. Modelling evolving populations. Journal of Theoretical Biology, 185:81–95,
1997.
[27] A Redmore and M Griffen. Longman Reference Guides: Biology. Longman Group Limited, 7th
edition, 1994.
[28] J Rissanen.
Lectures on statistical modeling theory.
Technical report, Department of
Computer Science, University of Helsinki, 2004. Available from: http://www.cs.helsinki.
fi/u/tmononen/SMT/rissanen_lect.ps [cited 25/02/04].
[29] C Shalizi and K Shalizi. Quantifying self-organization in cyclic cellular automata. In Noise in
Complex Systems and Stochastic Dynamics, Proceedings of SPIE, volume 5114, pages 108–117,
2003.
[30] J Shapiro, M Rattray, and A Prugel-Bennett. The statistical mechanics theory of genetic
algorithm dynamics. In Proceedings of the First International Conference on Evolutionary
Computation and Its Applications. Kluwer, 1996.
[31] K Sneppen, P Bak, H Flyvbjerg, and M Jensen. Evolution as a self-organized critical phenomenon.
Proceedings of the National Academy of Sciences of the United States of America, 92:5209–5213,
1995.
[32] P Turney. Myths and legends of the baldwin effect. In Proceedings of the 13th International
Conference on Machine Learning, pages 135–142, 1996.
[33] J Vidal. Learning in multiagent systems. Technical report, Swearingen Engineering Center,
University of South Carolina, 2003. Available from: http://jmvidal.cse.sc.edu/papers/
vidal03a.pdf [cited 21/04/04].
[34] M Ward. Life offers lessons for business. Available from: http://news.bbc.co.uk/2/hi/
technology/3752725.stm [cited 13/10/2004].
[35] M Wooldridge. Introduction to MultiAgent Systems. John Wiley & Sons Inc, 2002.

Page 41
A Interdisciplinary Dictionary
The key scientific terms used within this document will be explained here, in general terms and in
terms of the Multi-Agents Systems(MAS) abstract model. These terms when first mentioned in the
main text, will be defined there and then, relative to their context. These definitions in context, and
this dictionary should hopefully make the terms clearly understood.
The scientific definitions are given in green italics[27, 9], and the Digital Ecosystem (MAS and DEC)
definitions are given in red. It is hoped that this dictionary will complement, and can be added to the
other efforts to create a cross disciplinary glossary of terms within the DBE[22].
DNA / Semantic Descriptions
DeoxyriboNucleic Acid whose sequence encodes the genetic information of living organisms, provides
two primary functions. The holder of virtually all information in inheritance, and the controller of
protein synthesis.
The equivalent is the process descriptions associated with each Agent.
Ecosystem / Digital Ecosystem
An ecosystem is a natural unit made up of living and non-living components whose interactions give rise
to a stable, self-perpetuating system. It is made up of one of more communities of organisms, consisting
of populations existing in their microhabitats. The Agent Stations are equivalent to Habitats, and the
Evolcingpopulation objects are equivalent to the populations in Figure A.1. The fitness function is
equivalent to the microhabitats in Figure A.1. The evolving population, under the influence of the
fitness function, is equivalent to the niche in Figure A.1.
Evolutionary Stable Strategies (ESS)
It is based on the concept of a population of organisms, playing a certain strategy, and the mutant
form of a gene that causes organisms to adopt a different strategy cannot invade the population, but
will instead be selected out by natural selection.
This is equivalent to a population of Agent-chains that have found all the global optima in the search
space (fitness landscape). So any mutant Agent-chains invading the population will be sub-optimal,
and therefore removed by the selection pressure (fitness function).
Fitness
Measure of the ability to produce mature offspring in the next generation which themselves will be able
to reproduce.
This is equivalent to the fitness used in the evolutionary computing, where it is dependent on the
desired ability of the software requested.
Gene / Semantic Description Process Object
A small length of the total DNA sequence of an individual, to which a specific function can be assigned.
The equivalent unit is the semantic description Process object of an agent.
Gene Pool / Agent Pool
All the genes in a population or species, therefore its genetic constitution.
This is the set of agents registered at an Agent Station, found in the Agent Pool.

Page 42
Genotype / Semantic Description Process Objects
The genetic makeup of an organism, independent of it physical or functional traits.
This is equivalent to the semantic description Process Objects of agents and agent-chains.
Migration
This is the movement of individuals between different habitats (areas), often due to seasonal variations.
This is equivalent to the movement of mobile agents from one Agent Station to another.
Mutation
A permanent transmissible change in genetic material(DNA).
In its simplest form, this is equivalent to replacing one of the agents in an agent-chain.
Organism / Agent
An organism(phenotype) is an individual that is capable of reproducing itself and existing as a separate
entity.
This is equivalent to the instantiation of an agent, or an agent-chain generated by evolutionary
computing in response to a user request.
Population
All members of a species that occupy a particular area at a given time. Statistically, the group of
entities under study which can be sampled.
This is the population of agent-chains created, using Evolutionary Computing at an Habitat (Agent
Station), to generate a solution to a user request.
Selection Pressure / Fitness Function
Describes the sum total of the forces acting upon a population, resulting in genetic change and natural
selection. Those organisms best fitted to survive the selection pressures operating upon them will pass
on their biological fitness to their progeny through the inheritance process.
This is equivalent to the fitness function in the MAS model, which assigns fitness to the Agent-chains
of the evolving population, and thereby determines which will survive.
Species
A series of populations within which significant gene flow occurs, so groups of organisms showing a
very similar genetic makeup.
Pure clusters are an example of different species within the Digital Ecosystem, as no agents are
exchanged (no gene flow occurs) between them.

Page 43
B Spreadsheet Tool
Here you can investigate the Physical complexity measure by changing the alphabet size and population
in red. Please note that it is only for fixed length populations.
Here you can investigate the Physical complexity measure
by changing the alphabet size and population in red.
Population S
1
2
3
4
Persite Entropy
Persite Information
A
A
A
A
H(1)=
0.000
I(1)=
1.000
A
B
B
C
H(2)=
0.750
I(2)=
0.250
A
C
C
D
H(3)=
0.750
I(3)=
0.250
H(4)=
I(4)=
alphabet D =
A,B,C,D
alphabet size |D|=
4
where: 1<=|D|<=5
population size |S|=
4
where: 1<=|S|<=5
length l =
3
where: 1<=l<=4
Complexity Potential Cp =
3
Complexity C =
1.500
Efficiency E =
0.500
%Efficiency %E =
50.00%
The spreadsheet can be found at:
http://dbe.hopto.org/PhysicalComplexity.xls