PLEKTIX: Information and Structure in Complex Systems

Information and Structure in Complex Systems

By Ben Allen at 10/24/2014 12:46:00 PM

Eight years ago, I had finished my first year of graduate school in math, and I was at a loss as to what to research. My original focus, differential geometry, was a beautiful subject to learn about, but the open research questions were too abstract and technical to sustain my interest. I wanted something more relevant to the real world, something I could talk to people about.

Looking for new ideas, I took a course in complex systems, run by the New England Complex Systems Institute. The director, Yaneer Bar-Yam, had pioneered a new way of representing structure in a systems. I was fascinated by this idea but also puzzled. As a mathematician, I wanted to understand the basis of this idea. What assumptions does it rely on? How are its basic concepts defined?

My attempt to answer these questions turned into one of the longest and most demanding projects I’ve worked. After an eight-year collaboration Yaneer and my friend Blake Stacey, we finally have a preliminary manuscript up on the web. It is currently under review for publication. And to my pleasant surprise, we got a nice write-up in ScienceNews.

So what is this project all about? The idea is that we're using information theory (which I've written about previously) as a tool to represent and quantify the structure of a system.

Before I explain what any of this means, let's consider some motivating examples. Here's a system (call it system A):

You wouldn't really call this a complex system. It has only one component (a ball) that bounces around in a fairly simple way. Since there's not much to see here, let's turn to system B:

Source: Wikimedia Commons

This system has many particles, which bounce around and bump into each other. In one sense, this system is quite complex: it is very difficult to describe or predict its exact state at any given time. But looking beyond the level of individual particles reveals a kind of simplicity: since the particles behave independently of each other, overall measures such as the average particle velocity or the rate of collisions are relatively stable. In other words, the individual complexity "averages out", so that on the whole, the system behaves quite simply.

Contrast that to the behavior of system C:

Source: A Bird Ballet by Niels Castillon

This is a murmuration of starlings. The starlings fly in a semi-coordinated, semi-independent way, creating intricate shapes and patterns that you would never observe in systems A and B. This is a prototypical "complex system"—the kind that has intrigued researchers since the 70's.

It is intuitively clear that systems A, B, and C have entirely different kinds of structure. But it is surprisingly difficult to capture this intuition mathematically. What is the essential mathematical property of system C that can allow us to differentiate it from A and B?

We try to answer this question using information theory. Information theory was first invented by mathematician Claude Shannon in 1948 to address problems of long-distance communication (e.g. by telegraph) when some signals may be lost along the way. Shannon's ideas are still used, for example, in the development of cell phone networks. But they also have found applications in physics, computer science, statistics, and complex systems.

To explain the concept of information, let's look at a system consisting of a single blinking light:

This is one of the simplest systems you could possibly imagine. In fact, we can quantify this simplicity. To describe the state of the system at any given time, you only have to answer one yes/no question: "Is the light on?"

The amount of information conveyed in one yes/no question is called one bit. "Bit" is short for "binary digit", and is the same unit used to quantify computer memory. In other words, the state of this light can be described in one binary digit, 0 for OFF and 1 for ON.

Now let's add another light:

Let's say these lights are statistically independent. This means that knowing the state of one doesn't tell you anything about the other. In this case, to identify the state of the system requires two bits of information—that is, two yes/no questions, one for the first light and one for the second. We can depict this situation with a diagram like this:

The circles are drawn separately, since information describing one of them tells us nothing about what the other is doing. We could say that each of these bits applies at "scale one", since each describes only a single light bulb.

Here are two lights that behave in a completely different fashion:

Note that the two light bulbs are always either both on or both off. Thus, even though there are two components, the system can still be described by a single bit of information—a single yes/no question. The answer to this question (e.g. "are they on?") applies to both bulbs at once. The "information diagram" for this system looks like two completely overlapping circles:

We could say that the one bit of information describing this system applies at "scale two", since it describes two light bulbs at once.

A more interesting case occurs between these two extremes:

It's hard to see it, but I've animated these bulbs to be in the same state 3/4 of the time, and the opposite state 1/4 of the time. If I told you the state of the first bulb, you wouldn't completely know the state of the second, but you could make an educated guess. Specifically, if I told you the first bulb is ON, you could guess that the second is ON too, and you'd be right 75% of the time. So there is information overlap: Information about the first bulb gives partial information about the second. In fact, we can use Shannon's formulas to actually calculate how much overlap there is: approximately 0.19 bits. So if you know the state of the first bulb (1 bit), then you also know 0.19 bits about the second bulb—not enough to know its state with certainty, but enough to make a guess that is 75% accurate. The overlapping information can be depicted like this:

As you can see, 0.19 bits of information apply to both light bulbs at once (scale two), while the remaining 0.81+0.81=1.62 bits apply only to a single bulb (scale one).

In principle, these "information diagrams" (we call them dependency diagrams) exist for any system. Highly ordered systems, like system A above, have lots of overlapping, large-scale information. Highly disordered systems like B have mostly small-scale, non-overlapping information. The systems that are most interesting to complex-systems researchers, like the starlings in example C, have lots of partial overlaps, with information distributed over a wide range of scales.

And that's the basic premise of our theory of structure. The structure of a system is captured in the overlaps of information describing different components, and the way information is distributed across scales. While we take these concepts quite a bit further in our paper, the central idea is right here in these blinking lights.

Thanks for reading!

8 comments:

Victor VenemaOctober 24, 2014 at 1:10 PM
In the geosciences many people use the autocorrelation function to describe the temporal structure of a time series. This is derived by computed the linear correlation for pairs of data points separated by certain temporal difference, for a large number of temporal differences.

Theoretically it is possible that the scatterplot of the pairs shows no linear correlation, but that there is a relationship. For example all points could lie on a circle. With your dependency measure (entropy) you could see the difference between an uncorrelated cloud of points and point lying, e.g. on a circle.

Do you know of anyone using an "auto-entropy function" to describe the temporal structure of a time series? Do you know of any applications where such a description of the temporal behaviour of a time series might be useful?
ReplyDelete
Replies
AnonymousOctober 25, 2014 at 6:48 AM
Nice intro into the Information Theory. Too bad it took that much space. I would very much love to read more about how all this related to the ideas presented in the newest paper "An Information-Theoretic Formalism for Multiscale Structure in Complex Systems" and in your project to characterize structure of complex systems in general.
ReplyDelete
Replies
DiegoOctober 25, 2014 at 10:17 AM
Great intro. Raises a few questions in my mind.

Complex system outcomes that conform to normal statistical distributions lend themselves to "information compression", i.e. one can describe the system using mean and variance. However, most complex systems either do not output stable distributions, or their distributions do not conform to normal. In that case, their "compressibility" is lower/Shannon entropy higher. So the challenge with complex systems is that self-organization implies some lower entropy/higher order, but it is tough to translate this into a lower Shanon entropy. To do so requires discovering a shorthand way to describe the more-or-less stable higher-order dynamic at work. The lightbulb example is one in which the system outputs a nice statistical relationship. I just wonder what happens when there is true novelty produced, one that is also subject to non-normal behavior like feedback loops and/or phase shifts? I suppose it will be in your paper, but anticipating that it may be quite technical, perhaps you could explain how the overlap concept works at higher levels of complexity?
ReplyDelete
Replies
DiegoOctober 26, 2014 at 12:53 PM
I agree -- looking at the scale that is producing order (i.e. driving self-organization) helps to convert the system to a lower Shannon entropy. I am looking at doing this informally in finance, management and economics. My approach is to tag recurrent dynamics produced by complex systems and thereby compress information about the system. A good example is identifying a positive feedback loop and its associated resource constraint. Just doing this would have helped to predict the 2008 crisis (the feedback loop was in credit/housing, and the resource constraint was the population that qualified for a mortgage at the lower bound of mortgage underwriting standards). This alone would be an improvement over macroprudential analysis that tends to look only at mean and variance as system descriptors.

What I would like to see is a merging of informal and formal approaches, and an explicit "fine-tuning" of the trade-off between rigor and relevance to decision making. Analysis of simple systems (mean/variance) have just such a "mix", and complex systems will get there eventually.
ReplyDelete
Replies
AnonymousJanuary 5, 2015 at 8:01 PM
Very interesting paper. The concepts suggest applications in computer science, and in particular to the area of software complexity metrics, an area that has long needed an objective, solid foundation.

Currently, most software complexity metrics produce a single, scalar value. What we need is something like your complexity profile, which would give a series of metrics, one per scale. For a well-structured software system (with low coupling and high cohesion), the metric should give a result similar to your metric on a system of largely independent components. For a poorly-structured software system, the complexity profile should be relatively flat.

One problem I encounter in applying your idea is that software really only has low-level couplings: Software components don't refer to other components in their entirety. Rather, lots of individual lines of code in one reference individual lines of code in others. But if I account for dependencies at the lowest scale, then there is nothing to count at higher scales. Yet intuitively, software modules and sub-systems do depend on other ones.

Are you aware of anyone employing your ideas in this area? If so, how have they adapted it to quantifying software complexity?

Thank you very much.
ReplyDelete
Replies

Add comment

Markup Key:
- <b>bold</b> = bold
- <i>italic</i> = italic
- <a href="http://www.fieldofscience.com/">FoS</a> = FoS

Field of Science

Information and Structure in Complex Systems

8 comments: