Eight years ago, I had finished my first year of graduate school in math, and I was at a loss as to what to research. My original focus, differential geometry, was a beautiful subject to learn about, but the open research questions were too abstract and technical to sustain my interest. I wanted something more relevant to the real world, something I could talk to people about.
Looking for new ideas, I took a course in complex systems, run by the New England Complex Systems Institute. The director, Yaneer Bar-Yam, had pioneered a new way of representing structure in a system. I was fascinated by this idea but also puzzled. As a mathematician, I wanted to understand its foundations. What assumptions does it rely on? How are its basic concepts defined?
My attempt to answer these questions turned into one of the longest and most demanding projects I’ve worked on. After an eight-year collaboration with Yaneer and my friend Blake Stacey, we finally have a preliminary manuscript up on the web. It is currently under review for publication. And to my pleasant surprise, we got a nice write-up in ScienceNews.
So what is this project all about? The idea is that we're using information theory (which I've written about previously) as a tool to represent and quantify the structure of a system.
Before I explain what any of this means, let's consider some motivating examples. Here's a system (call it system A):
You wouldn't really call this a complex system. It has only one component (a ball) that bounces around in a fairly simple way. Since there's not much to see here, let's turn to system B:
This system has many particles, which bounce around and bump into each other. In one sense, this system is quite complex: it is very difficult to describe or predict its exact state at any given time. But looking beyond the level of individual particles reveals a kind of simplicity: since the particles behave independently of each other, overall measures such as the average particle velocity or the rate of collisions are relatively stable. In other words, the individual complexity "averages out", so that on the whole, the system behaves quite simply.
Contrast that to the behavior of system C:
This is a murmuration of starlings. The starlings fly in a semi-coordinated, semi-independent way, creating intricate shapes and patterns that you would never observe in systems A and B. This is a prototypical "complex system"—the kind that has intrigued researchers since the 1970s.
It is intuitively clear that systems A, B, and C have entirely different kinds of structure. But it is surprisingly difficult to capture this intuition mathematically. What is the essential mathematical property of system C that allows us to distinguish it from A and B?
We try to answer this question using information theory. Information theory was invented by the mathematician Claude Shannon in 1948 to address problems of long-distance communication (e.g. by telegraph) when some signals may be lost along the way. Shannon's ideas are still used, for example, in the development of cell phone networks. But they have also found applications in physics, computer science, statistics, and complex systems.
To explain the concept of information, let's look at a system consisting of a single blinking light:
This is one of the simplest systems you could possibly imagine. In fact, we can quantify this simplicity. To describe the state of the system at any given time, you only have to answer one yes/no question: "Is the light on?"
The amount of information conveyed in one yes/no question is called one bit. "Bit" is short for "binary digit", and is the same unit used to quantify computer memory. In other words, the state of this light can be described in one binary digit, 0 for OFF and 1 for ON.
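If you'd like to see where that number comes from, here is a minimal sketch (in Python, not from our paper) of Shannon's entropy formula applied to this one-bulb system, assuming the bulb is ON half the time:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A single bulb, assumed to be ON half the time and OFF half the time.
print(entropy([0.5, 0.5]))  # 1.0 -- exactly one bit, one yes/no question
```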
Now let's add another light:
Let's say these lights are statistically independent. This means that knowing the state of one doesn't tell you anything about the other. In this case, to identify the state of the system requires two bits of information—that is, two yes/no questions, one for the first light and one for the second. We can depict this situation with a diagram like this:
The circles are drawn separately, since information describing one of them tells us nothing about what the other is doing. We could say that each of these bits applies at "scale one", since each describes only a single light bulb.
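As a sanity check on the "two bits" claim, the same formula applied to the joint behavior of two independent bulbs (again just a sketch, assuming each bulb is ON half the time) gives exactly 2 bits:

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Two independent bulbs, each ON with probability 1/2: the four joint
# states (off,off), (off,on), (on,off), (on,on) are equally likely.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 -- two bits, no overlap
```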
Here are two lights that behave in a completely different fashion:
Note that the two light bulbs are always either both on or both off. Thus, even though there are two components, the system can still be described by a single bit of information—a single yes/no question. The answer to this question (e.g. "are they on?") applies to both bulbs at once. The "information diagram" for this system looks like two completely overlapping circles:
We could say that the one bit of information describing this system applies at "scale two", since it describes two light bulbs at once.
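Here is the corresponding calculation for the synchronized bulbs (a sketch, assuming they spend half their time ON together and half OFF together). The overlap is the mutual information: the sum of the two single-bulb entropies minus the joint entropy of the pair.

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Perfectly synchronized bulbs: only (on,on) and (off,off) ever occur,
# each half the time, so the whole system needs just one bit.
h_joint = entropy([0.5, 0.5])    # 1.0 bit for the pair as a whole
h_each = entropy([0.5, 0.5])     # 1.0 bit for either bulb alone
overlap = 2 * h_each - h_joint   # mutual information between the bulbs
print(h_joint, overlap)          # 1.0 1.0 -- all the information is shared
```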
A more interesting case occurs between these two extremes:
It's hard to see it, but I've animated these bulbs to be in the same state 3/4 of the time, and the opposite state 1/4 of the time. If I told you the state of the first bulb, you wouldn't completely know the state of the second, but you could make an educated guess. Specifically, if I told you the first bulb is ON, you could guess that the second is ON too, and you'd be right 75% of the time. So there is information overlap: Information about the first bulb gives partial information about the second. In fact, we can use Shannon's formulas to actually calculate how much overlap there is: approximately 0.19 bits. So if you know the state of the first bulb (1 bit), then you also know 0.19 bits about the second bulb—not enough to know its state with certainty, but enough to make a guess that is 75% accurate. The overlapping information can be depicted like this:
As you can see, 0.19 bits of information apply to both light bulbs at once (scale two), while the remaining 0.81+0.81=1.62 bits apply only to a single bulb (scale one).
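And here is where the 0.19 comes from. This sketch (again assuming each bulb is ON half the time and the two agree 3/4 of the time) computes the mutual information between the bulbs and the information unique to each one:

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint distribution: (on,on) and (off,off) each have probability 3/8,
# the two mismatched states each 1/8, so the bulbs agree 3/4 of the time.
h_joint = entropy([3/8, 3/8, 1/8, 1/8])  # about 1.81 bits for the pair
h_each = entropy([0.5, 0.5])             # 1.0 bit per bulb
overlap = 2 * h_each - h_joint           # mutual information
print(round(overlap, 2))                 # 0.19 -- shared, "scale two" bits
print(round(h_each - overlap, 2))        # 0.81 -- unique to each bulb
```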
In principle, these "information diagrams" (we call them dependency diagrams) exist for any system. Highly ordered systems, like system A above, have lots of overlapping, large-scale information. Highly disordered systems like B have mostly small-scale, non-overlapping information. The systems that are most interesting to complex-systems researchers, like the starlings in example C, have lots of partial overlaps, with information distributed over a wide range of scales.
And that's the basic premise of our theory of structure. The structure of a system is captured in the overlaps of information describing different components, and the way information is distributed across scales. While we take these concepts quite a bit further in our paper, the central idea is right here in these blinking lights.
Thanks for reading!