Field of Science

Andi's Factor Game

Two weeks ago, my friend Andi messaged me about a mathematical game she had invented.  She was so excited to share it.  She had coded up a "proof of concept" version in HTML, and had come up with a mathematical proof about its winning strategies.  She was enthusiastic about its potential to make math fun even for non-math people, and full of ideas for next steps.

Then two days ago, I learned that Andi died.  It seems that this game is one of the last things she put into the world.  Although I didn't know her as well as I might have, her excitement about sharing this game seems to typify the passion and determination with which she approached all her projects.  Andi was an uncompromising advocate for social justice with a poetic eye and a keen sense of humor. Also, she was a transgender woman; I say this because visibility matters and because I believe she would not want this aspect of her identity to be erased.

The best way I can personally think of to honor Andi's memory is to share her final game with the world.  Like any well-designed game, it is easy to play but difficult to master. The rules are deceptively simple:

  1. A large whole number, called the Magic Number, is specified and known to both players (it could be randomly generated by computer, for example). All factors of the Magic Number are listed out, including 1 and the number itself.
  2. Two players take turns choosing factors of the Magic Number. Every time one player chooses a factor, that factor and all multiples of it are crossed out. Once a factor has been crossed out, neither player can choose it.
  3. Whoever chooses 1 loses. In other words, the goal is to eliminate the factors in such a way that the other player is forced to choose 1.

For example, let's say the Magic Number is 12.  The factors of 12 are 1, 2, 3, 4, 6, and 12.  These are all the numbers that can be chosen.

Say player 1 chooses 12 itself.  Then 12 is eliminated, so the "board" looks like this:

1 2 3 4 6 12   (12 crossed out)

Now player 2 chooses 3.  So 3 and all multiples of 3 are crossed out:

1 2 3 4 6 12   (3, 6, and 12 crossed out)

Next player 1 chooses 2.  So 2 and all multiples of 2 are crossed out:

1 2 3 4 6 12   (2, 3, 4, 6, and 12 crossed out)

Only the number 1 is left. Player 2 is forced to choose 1, so Player 1 wins.
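To make the rules concrete, here is a minimal Python sketch of the game mechanics (my own illustration, not Andi's code) that replays the example above:

```python
# A minimal sketch of the game mechanics: list the factors of a Magic
# Number, and apply a move by crossing out the chosen factor and all of
# its multiples still on the board.

def factors(n):
    """All factors of n, including 1 and n itself."""
    return [d for d in range(1, n + 1) if n % d == 0]

def choose(board, pick):
    """Cross out `pick` and every multiple of it still on the board."""
    return [f for f in board if f % pick != 0]

# Replay the example with Magic Number 12.
board = factors(12)        # [1, 2, 3, 4, 6, 12]
board = choose(board, 12)  # player 1 picks 12 -> [1, 2, 3, 4, 6]
board = choose(board, 3)   # player 2 picks 3  -> [1, 2, 4]
board = choose(board, 2)   # player 1 picks 2  -> [1]
print(board)               # only 1 remains, so player 2 must take it and lose
```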

To visualize what's happening in this game, it helps to draw a diagram like this:

Every time a player picks a number, that number and all numbers downstream of it are eliminated. (Here "downstream" refers to the direction the arrows point, which is visually upward.)  So, if 2 is picked, that eliminates 2, 4, 6, and 12.

To mathematicians, a diagram like this is called a lattice. The game-play for a given Magic Number is determined by the structure of the lattice, which in turn is determined by the Magic Number's prime factorization, as you can see in this lattice for 120.

But enough theory, go ahead and play!  Here's a link to the "proof of concept" version that Andi coded up.  You play against the computer, who goes first.  To try again with a different Magic Number, click "New Game". You can put in whatever Magic Number you choose, or have the computer randomly pick one.

Did you win? No, you didn't. But don't feel bad: Andi proved that, with optimal play, Player 1 will always win the game.

It's a proof by contradiction.  Assume, for the sake of contradiction, that for some particular Magic Number, Player 2 has a winning strategy. In other words, Player 2 has a winning response to any first move that Player 1 might make.  In particular, if Player 1 chooses the Magic Number itself, Player 2 must be able to choose some other number—call it n—which puts them in a winning position.  But then Player 1 could have chosen n as their first move, which would have put Player 1 in this same winning position.  This contradicts our assumption that Player 2 has a winning response to any first move of Player 1.  Therefore, by contradiction, Player 1 must win if both sides play perfectly.

The interesting thing about this proof is that it's non-constructive. It says that there exists a winning strategy for Player 1, but gives no indication of what this winning strategy might be!

Andi designed her code to search through all possible game outcomes for a winning one. While this guarantees that the computer always wins, it doesn't give much insight into how one ought to play, or why certain strategies might work better than others.
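For the curious, here is a sketch of what such an exhaustive search might look like (my own memoized version, not Andi's actual implementation):

```python
# A brute-force solver: search the game tree to decide whether the
# player about to move can force a win from a given board.  Memoization
# (lru_cache) keeps the search from revisiting the same position.

from functools import lru_cache

def factors(n):
    return frozenset(d for d in range(1, n + 1) if n % d == 0)

@lru_cache(maxsize=None)
def player_to_move_wins(board):
    """True if the player about to move wins with optimal play."""
    if board == frozenset({1}):
        return False  # forced to choose 1 and lose
    for pick in board - {1}:  # choosing 1 voluntarily is never better
        remaining = frozenset(f for f in board if f % pick != 0)
        if not player_to_move_wins(remaining):
            return True  # this move leaves the opponent in a losing position
    return False

# Andi's theorem says Player 1 always wins, and the search agrees:
print(player_to_move_wins(factors(12)))   # True
print(player_to_move_wins(factors(360)))  # True
```

Note that this search reports *that* a winning move exists without explaining *why*, which is exactly the limitation discussed below.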

There are many interesting open questions here: Can the winning strategy be described concisely? Is there a polynomial-time algorithm to find the winning strategy for a given Magic Number? And can the game be generalized to other kinds of lattices?

Andi's final gift to the world is a good one.  Her code is available on GitHub; please use it and build on it if you are inspired. I hope she is remembered for this and for everything else she put out into the world.

I'll close with this mathematical meditation, which was one of Andi's last Facebook posts:

The set of rational numbers is continuous, in the sense that between any two distinct rational numbers, there exist more distinct rational numbers. If you only look at the rationals, you'll miss uncountably many reals. If you insist on defining reals in terms of rationals, you'll need to take rationals to their limits.

Rest in Power.

You can win the Electoral College with 22% of the vote

Donald Trump is poised to become the next US president, despite the fact that Hillary Clinton received over a million more votes than him (and counting). This would mark the second time in sixteen years, and either the fourth or fifth time in history (depending on how you count) that the Electoral College winner has lost the popular vote.

How is it possible to win the Electoral College but lose the popular vote?  The answer lies in a combination of two factors.  The first is the winner-take-all nature of the state contests. All states except for Maine and Nebraska deliver all their electors to the candidate with the plurality of votes.  This means that if you win by slim margins in a sufficient set of states, you can lose badly in all other states and still secure an Electoral College victory.

The second factor is the disproportionate representation of small states.  Each state has a number of electors equal to its total number of congresspeople (senators plus representatives).  The number of representatives is roughly proportional to population size, but adding in the two senators per state gives the smaller states more per-capita representation.  For example, Wyoming has approximately 7 electors per million eligible voters, while California has 2 per million.  So a Wyomingite has more than three times the Electoral College representation of a Californian (calculations here).

So if you want to become president without winning the most votes, your strategy is to aim for narrow victories in a set of smaller states that add up to 270 electoral votes, while ceding the other states to your opponent.  This raises the question: what is the smallest popular vote percentage one could receive while still winning the presidency?

The answer—according to my best calculations—is 22%.  You could capture the Electoral College, and become President of the United States, with only 22% of the vote.

I got this number by starting with the states with the most electors per eligible voter (Wyoming, Vermont, Delaware, Alaska, ...).  For each of these, I gave 50.1% of the vote to "Team Red", and the remaining 49.9% to "Team Blue".  I continued down the list of states with the most electors per capita, giving 50.1% to Team Red, until the total electoral votes exceeded the 270 needed to win.  I then gave Team Blue 100% of the vote for all other states.  It turns out Team Red didn't need New Jersey, so I threw that over to Team Blue as well.  The result: Team Blue captures 77.7% of the popular vote, but Team Red wins the Electoral College vote 270 to 268.  You can check my math in this spreadsheet.  My answer agrees with a similar calculation done in 2011.
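For anyone who wants to tinker, here is the greedy calculation as a Python sketch.  The three states below are made-up toy numbers purely for illustration; the real calculation uses the actual elector counts and eligible-voter estimates in the spreadsheet.

```python
# A sketch of the greedy minimal-vote calculation, using invented toy
# numbers.  States are (name, electoral_votes, eligible_voters_in_millions).

toy_states = [
    ("Smallia",  3,  0.4),   # hypothetical small state: many electors per voter
    ("Midville", 10, 3.0),
    ("Bigland",  55, 25.0),  # hypothetical big state: few electors per voter
]

def minimal_winning_share(states, needed):
    """Win states in order of electors per eligible voter, each with just
    over half the vote, until `needed` electors are reached (270 in the
    real Electoral College); cede all other states entirely."""
    ranked = sorted(states, key=lambda s: s[1] / s[2], reverse=True)
    electors = 0
    red_votes = 0.0
    for name, ev, voters in ranked:
        if electors >= needed:
            break  # remaining states go 100% to the opponent
        electors += ev
        red_votes += 0.501 * voters  # a narrow win here
    total_votes = sum(s[2] for s in states)
    return red_votes / total_votes

share = minimal_winning_share(toy_states, needed=13)
print(f"{share:.1%}")  # 6.0% with these toy numbers
```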

Figure 1: One can capture the Electoral College with only 22.3% of the vote, by receiving 50.1% of the vote in the red states above and 0% in the blue states.

It makes sense that the 22.3% figure is close to one quarter.  If all states were equal in both population and electoral votes, one could tie the electoral college with slightly more than one quarter of the vote, by winning slightly more than half the vote in half the states, while losing the others completely (see below).  The fact that one can win the US electoral map with less than 25% is due to the disproportionate representation of small states.

Figure 2: A hypothetical electoral map of four states with equal populations and electoral votes.  Pie charts show the popular votes in each state.  One can tie the electoral college with slightly more than 25% of the vote, by winning narrow majorities in two states and receiving no votes in the other two.

The above calculations assume that there are no third party candidates, and that voter turnout is the same in each state.  Dropping these assumptions can lead to even more lopsided possibilities.  For instance, with one third-party candidate, we only need to give Team Red 33.4% in the red states of Figure 1, while Team Blue and the third party each get 33.3%.  This leads to an Electoral College win for Team Red with 14.9% of the vote.  Alternatively, suppose that the turnout in the red states of Figure 1 is half that of the blue states.  Then Team Red wins with 14.3% of the vote.

Of course, possible is not the same as likely.  It would be very unlikely, for instance, for a candidate to receive 50.1% of the vote in Oklahoma but 0% in Texas.  What does not seem unlikely, on the other hand, is that the Electoral College winner loses the popular vote.  This has happened in at least 4 out of 58 elections, or 6.8%, which is not that rare of an occurrence.  What we need to decide, as a country, is whether we support an electoral system that does not always align with the majority of votes.

Information and Structure in Complex Systems

Eight years ago, I had finished my first year of graduate school in math, and I was at a loss as to what to research.  My original focus, differential geometry, was a beautiful subject to learn about, but the open research questions were too abstract and technical to sustain my interest.  I wanted something more relevant to the real world, something I could talk to people about.

Looking for new ideas, I took a course in complex systems, run by the New England Complex Systems Institute.  The director, Yaneer Bar-Yam, had pioneered a new way of representing structure in a system.  I was fascinated by this idea but also puzzled.  As a mathematician, I wanted to understand the basis of this idea.  What assumptions does it rely on?  How are its basic concepts defined?

My attempt to answer these questions turned into one of the longest and most demanding projects I've worked on.  After an eight-year collaboration with Yaneer and my friend Blake Stacey, we finally have a preliminary manuscript up on the web.  It is currently under review for publication.  And to my pleasant surprise, we got a nice write-up in ScienceNews.

So what is this project all about?  The idea is that we're using information theory (which I've written about previously) as a tool to represent and quantify the structure of a system.

Before I explain what any of this means, let's consider some motivating examples.  Here's a system (call it system A):
You wouldn't really call this a complex system.  It has only one component (a ball) that bounces around in a fairly simple way.  Since there's not much to see here, let's turn to system B:

Source: Wikimedia Commons
This system has many particles, which bounce around and bump into each other.  In one sense, this system is quite complex: it is very difficult to describe or predict its exact state at any given time.  But looking beyond the level of individual particles reveals a kind of simplicity: since the particles behave independently of each other, overall measures such as the average particle velocity or the rate of collisions are relatively stable.  In other words, the individual complexity "averages out", so that on the whole, the system behaves quite simply.

Contrast that to the behavior of system C:
Source: A Bird Ballet by Niels Castillon
This is a murmuration of starlings.  The starlings fly in a semi-coordinated, semi-independent way, creating intricate shapes and patterns that you would never observe in systems A and B.  This is a prototypical "complex system"—the kind that has intrigued researchers since the 1970s.

It is intuitively clear that systems A, B, and C have entirely different kinds of structure.  But it is surprisingly difficult to capture this intuition mathematically.   What is the essential mathematical property of system C that can allow us to differentiate it from A and B?

We try to answer this question using information theory.  Information theory was first invented by mathematician Claude Shannon in 1948 to address problems of long-distance communication (e.g. by telegraph) when some signals may be lost along the way.  Shannon's ideas are still used, for example, in the development of cell phone networks.  But they also have found applications in physics, computer science, statistics, and complex systems.

To explain the concept of information, let's look at a system consisting of a single blinking light:
This is one of the simplest systems you could possibly imagine.  In fact, we can quantify this simplicity. To describe the state of the system at any given time, you only have to answer one yes/no question: "Is the light on?"

The amount of information conveyed in one yes/no question is called one bit.  "Bit" is short for "binary digit", and is the same unit used to quantify computer memory.  In other words, the state of this light can be described in one binary digit, 0 for OFF and 1 for ON.

Now let's add another light:
Let's say these lights are statistically independent.  This means that knowing the state of one doesn't tell you anything about the other.  In this case, to identify the state of the system requires two bits of information—that is, two yes/no questions, one for the first light and one for the second.  We can depict this situation with a diagram like this:

The circles are drawn separately, since information describing one of them tells us nothing about what the other is doing. We could say that each of these bits applies at "scale one", since each describes only a single light bulb. 

Here are two lights that behave in a completely different fashion:
Note that the two light bulbs are always either both on or both off.  Thus, even though there are two components, the system can still be described by a single bit of information—a single yes/no question.  The answer to this question (e.g. "are they on?") applies to both bulbs at once.  The "information diagram" for this system looks like two completely overlapping circles:
We could say that the one bit of information describing this system applies at "scale two", since it describes two light bulbs at once.

A more interesting case occurs between these two extremes:
It's hard to see it, but I've animated these bulbs to be in the same state 3/4 of the time, and the opposite state 1/4 of the time.  If I told you the state of the first bulb, you wouldn't completely know the state of the second, but you could make an educated guess.  Specifically, if I told you the first bulb is ON, you could guess that the second is ON too, and you'd be right 75% of the time.  So there is information overlap: Information about the first bulb gives partial information about the second.  In fact, we can use Shannon's formulas to actually calculate how much overlap there is: approximately 0.19 bits.  So if you know the state of the first bulb (1 bit), then you also know 0.19 bits about the second bulb—not enough to know its state with certainty, but enough to make a guess that is 75% accurate.  The overlapping information can be depicted like this:
As you can see, 0.19 bits of information apply to both light bulbs at once (scale two), while the remaining 0.81+0.81=1.62 bits apply only to a single bulb (scale one).
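You can check these numbers yourself with Shannon's formulas.  The sketch below (my own illustration) computes the information overlap, known as mutual information, for all three pairs of bulbs: independent, perfectly synchronized, and correlated 75% of the time.

```python
# Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), measured in bits,
# where H is Shannon entropy.  This is the "overlap" in the diagrams.

from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(joint):
    # joint[(x, y)] = probability bulb 1 is in state x and bulb 2 in state y
    px = [sum(p for (x, _), p in joint.items() if x == s) for s in (0, 1)]
    py = [sum(p for (_, y), p in joint.items() if y == s) for s in (0, 1)]
    return entropy(px) + entropy(py) - entropy(joint.values())

independent = {(0, 0): 0.25,  (0, 1): 0.25,  (1, 0): 0.25,  (1, 1): 0.25}
identical   = {(0, 0): 0.5,   (0, 1): 0.0,   (1, 0): 0.0,   (1, 1): 0.5}
partial     = {(0, 0): 0.375, (0, 1): 0.125, (1, 0): 0.125, (1, 1): 0.375}

print(mutual_information(independent))        # 0.0 bits: no overlap
print(mutual_information(identical))          # 1.0 bit:  complete overlap
print(round(mutual_information(partial), 2))  # 0.19 bits of overlap
```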

In principle, these "information diagrams" (we call them dependency diagrams) exist for any system.  Highly ordered systems, like system A above, have lots of overlapping, large-scale information.  Highly disordered systems like B have mostly small-scale, non-overlapping information.  The systems that are most interesting to complex-systems researchers, like the starlings in example C, have lots of partial overlaps, with information distributed over a wide range of scales. 

And that's the basic premise of our theory of structure.  The structure of a system is captured in the overlaps of information describing different components, and the way information is distributed across scales.  While we take these concepts quite a bit further in our paper, the central idea is right here in these blinking lights. 

Thanks for reading!

The time the cops pulled their guns on me

This post is not about science.

I'm writing this because the horrific news out of Ferguson, Missouri—the killing of an unarmed man and the subsequent assault on the populace and media—has been bringing back memories of an experience I had with the police ten years ago in Chicago.

I should be clear about why I'm choosing to share this. It's not because I think my own problems are particularly deserving of attention in comparison to the violence done to Michael Brown, Eric Garner, and other recent victims of police violence.  In fact, what I experienced was relatively tame in comparison.  But that's kind of the point. This incident instantly brought my white privilege into sharp focus, in a way that has stuck with me ever since.  Issues like racial profiling can be somewhat abstract for white people.  I hope my story can open a new entry point into these issues for those who rarely experience them directly.

After college, I joined Teach for America.  I was assigned to a high school on the West side of Chicago, where I taught math and coached the chess team.  The school and the surrounding neighborhood were nearly 100% black.  (Yep, Chicago is segregated.)  It was also a rough neighborhood in the sense that drug dealers and prostitutes operated openly within a block of the high school, and students talked about gang warfare the way those at other schools might gossip about the Homecoming dance.  I was not a great teacher in that environment, but I felt a strong bond with the students—especially with those on the chess team, who would squeeze into my tiny Civic every month or so to face off against other teams, often from much more affluent suburban schools.

One Saturday, we got back to the West side around 10pm, and I decided to give each of the team members a ride home.  After I dropped the last student off, I got back into the car to head home. But as I tried to start out, there was another car right next to me, blocking me into my parking space.  And the driver was looking at me.

I didn't know what they wanted.  Maybe they wanted my parking spot.  To try to get out of their way, I pulled forward a bit.  But they moved in parallel, blocking me in again.  We repeated this dance two or three times.  They motioned to me to roll down my window.  But seeing as I had no idea who they were, I thought this was probably a bad idea and kept my window shut.

Then the driver and passenger got out, walked in front of my car, pulled guns out, and pointed them at me.

As a child, I frequently had nightmares in which "bad guys" would shoot me with guns.  I started to feel like I had slid into one of those nightmares.  It didn't feel like reality—it felt like a dream that was happening to me.  I thought maybe I was mistakenly mixed up in a criminal conspiracy, and they were going to kidnap me or worse.

They shouted "PUT THE FUCKING CAR IN PARK!"  I complied.  Then one of them yanked open my car door and put his gun to my head (literally, it was touching my temple).  He shouted "TAKE YOUR FUCKING SEATBELT OFF", which I did as well as I could given how much I was shaking.  He then pulled me out, put me in handcuffs, and bent me over the trunk of their car.

It was at this point that I realized I was probably dealing with the police, rather than some criminal organization.  I told them I didn't know they were police.  One of them responded "Who else would be going the wrong way down a motherfucking one way street?"

Ummm,  I guess this chain of logic might have occurred to me if I wasn't scared shitless by the fact that strangers were blocking me in and pointing guns at me. 

The other one, who still had his gun to my head, said "We don't want to hurt you, we just want to know your source!" I had no idea what they were talking about.  I told them that I was a math teacher at the local high school.  His response was "Oh yeah?  Well how long have you been doing heroin?" They continued to interrogate me and searched my pockets as I told them about the chess team, the tournament, and the student I had just dropped off. 

After a minute or so, it became clear to them that I was not, in fact, a heroin user.  It was remarkable how quickly I shifted in their view from "junkie" to "white do-gooder".  Within sixty seconds, their tone of voice changed, they took me out of cuffs, and they started explaining why they had taken the approach that they did.

Their explanation went like this: The corner where I had dropped off this student was a well-known heroin spot.  White people are so rare in this neighborhood that those who are around after dark are usually there for the drugs.  Transactions often occur in the buyer's car, with the buyer driving the dealer around the block as the deal is made.  So I fit the profile of a heroin buyer.  When I failed to stop for them, they escalated by getting out and drawing guns.  When I continued to creep my car forward towards them (unintentionally, since I had no idea what I was doing at that point), they felt they had to escalate further by opening the door and putting a gun to my head.

It almost makes sense, except that they never identified themselves as cops.  They were in an unmarked car and never bothered to show me a badge.  Because they read me as a heroin junkie, they assumed I would be familiar with the routine of being pulled over by an unmarked car.  Just to emphasize the point: They were quicker to pull their guns on me than to show me any kind of police identification.

The next week, I told the chess team what happened during practice.  I'll never forget what one of them said to me next: "Mr. Allen, I'm sorry you had to go through that, but you know what that makes you?  A black man.  We go through that shit every day."  He then told me about a time the cops made him strip to his underwear and stand outside in the middle of winter for hours, cuffed to a police car, before they released him without charge.  All of my students had stories.  They all had stories of the cops treating them as if their time, their dignity, and even their lives were worthless.

I did end up filing a complaint with the Chicago Police Department, but I was unable to ID the officers.  I had (and still have) a clear mental picture of one of them, but none of the photos they showed me matched him.  So the case was dropped.

What do I take from this experience?  For one thing, some very real anxiety.  It still haunts me sometimes when I'm trying to sleep, and I was shaking when typing this out.  But I also try to accept it as an alternate-reality window into something I would never have otherwise experienced.  For a brief moment in time, the usual dynamics were reversed: I was profiled for being a white person in an all-black neighborhood.  Because of the color of my skin and the block I was on, the cops read me as a criminal and treated me like one.  But only for about a minute.  Once they realized I was not a junkie, my white privilege reasserted itself and suddenly they were there to serve rather than threaten me. 

As a white person with financial and educational privilege to boot, I can be reasonably certain that I will not experience such an incident again, unless I choose to return to a situation like urban teaching in which the usual rules become twisted.  But imagine (and I'm talking to white folks here) if you had no choice.  Imagine if you could never tell whether the cops—the people who are supposed to protect you—would arbitrarily read you as a criminal and decide to threaten your life before even explaining who they are or what they want.  Imagine how that might change your concept of safety, the way you present yourself outside, or even your plans for any given evening.  That is the reality that my chess team described to me.  It is the reality that underlies the headline-grabbing incidents like Michael Brown, Eric Garner, or Trayvon Martin.  It is the reality that millions of people live every day.

Brian Arthur's vision of Complexity Economics

My friend Daria Roithmayr alerted me to a working paper by Brian Arthur laying out a vision for a new approach to studying economics.  Brian Arthur is one of the pioneers of complex systems thought, and has devoted his life to understanding what really happens in our economy, and why this behavior is so different from what classical economics predicts.

Classical economics is a theory based on the concept of equilibrium.  Equilibrium, in economics, is a state in which everyone is doing the best thing they could possibly do, relative to what everyone else is doing.  And since everyone is doing the best possible thing, no one has incentive to change.  So everything stays the same.  Forever.

Okay, that doesn't sound much like our actual economy.  So why is the equilibrium concept so central to economics?  The answer is that equilibria can be calculated.  If you make certain simplifying assumptions about how economic actors behave, you can prove that exactly one equilibrium exists, and you can calculate exactly what every actor is doing in this equilibrium.  This allows economics to make predictions.

These predictions are useful in explaining many broad phenomena—for example, the relationship between supply, demand, and price.  But they exclude any possibility of movement or change, and therefore exclude what is really interesting (and lucrative!) about the economy.  Arthur explains it this way:
We could similarly say that in an ocean under the undeniable force of gravity an approximately equilibrium sea level has first-order validity. And this is certainly true. But, as with markets, in the ocean the interesting things happen not at the equilibrium sea level which is seldom realized, they happen on the surface where ever-present disturbances cause further disturbances. That, after all, is where the boats are.
T-Pain understands the need for nonequilibrium theories.

The vision of economics that Arthur lays out is based not on equilibrium, but on computation:
A better way forward is to observe that in the economy, current circumstances form the conditions that will determine what comes next. The economy is a system whose elements are constantly updating their behavior based on the present situation. To state this in another way, formally, we can say that the economy is an ongoing computation—a vast, distributed, massively parallel, stochastic one. Viewed this way, the economy becomes a system that evolves procedurally in a series of events; it becomes algorithmic.
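Arthur's own El Farol bar problem is a classic illustration of this view.  Here is a toy version (my sketch, not code from the paper): agents repeatedly decide whether to attend a bar that is enjoyable only when fewer than 60 of 100 people show up, each drifting toward whichever prediction rule has been working.  Nothing is solved for equilibrium; the dynamics are an ongoing computation.

```python
import random

# A toy El Farol simulation.  The specific predictor rules, switching
# rate, and horizon here are illustrative choices, not Arthur's.

random.seed(1)
N, THRESHOLD, WEEKS = 100, 60, 20
history = [random.randint(0, N) for _ in range(3)]  # seed the attendance history

predictors = [
    lambda h: h[-1],                      # same as last week
    lambda h: sum(h[-3:]) / 3,            # average of the last three weeks
    lambda h: max(0, 2 * h[-1] - h[-2]),  # extrapolate the recent trend
]

agents = [random.randrange(len(predictors)) for _ in range(N)]

for week in range(WEEKS):
    # each agent attends if their current rule predicts a comfortable bar
    attendance = sum(1 for a in agents if predictors[a](history) < THRESHOLD)
    history.append(attendance)
    # find the rule that came closest this week...
    best = min(range(len(predictors)),
               key=lambda i: abs(predictors[i](history[:-1]) - attendance))
    # ...and let a fraction of agents switch to it
    agents = [best if random.random() < 0.1 else a for a in agents]

print(history[3:])  # the attendance series keeps fluctuating
```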
The part of this essay that was most challenging to me personally was where he talks about the limitations of mathematics:

...the reader may be wondering how the study of such computer-based worlds can qualify as economics, or what relationship this might have to doing theory. My answer is that theory does not consist of mathematics. Mathematics is a technique, a tool, albeit a sophisticated one. Theory is something different. Theory lies in the discovery, understanding, and explaining of phenomena present in the world. Mathematics facilitates this—enormously—but then so does computation. Naturally, there is a difference. Working with equations allows us to follow an argument step by step and reveals conditions a solution must adhere to, whereas computation does not. But computation—and this more than compensates—allows us to see phenomena that equilibrium mathematics does not. It allows us to rerun results under different conditions, exploring when structures appear and don’t appear, isolating underlying mechanisms, and simplifying again and again to extract the bones of a phenomenon. Computation in other words is an aid to thought, and it joins earlier aids in economics—algebra, calculus, statistics, topology, stochastic processes—each of which was resisted in its time.
He later explains the limitations of mathematics with an analogy to biology:
Even now, 150 years after Darwin’s Origin, no one has succeeded in reducing to an equation-based system the process by which novel species are created, form ecologies, and bring into being whole eras dominated by characteristic species. The reason is that the evolutionary process is based on mechanisms that work in steps and trigger each other, and it continually defines new categories—new species. Equations do well with changes in number or quantities within given categories, but poorly with the appearance of new categories themselves. Yet we must admit that evolution’s central mechanisms are deeply understood and form a coherent group of general propositions that match real world observations, so these understandings indeed constitute theory. Biology then is theoretical but not mathematical; it is process-based, not quantity-based. In a word it is procedural. By this token, a detailed economic theory of formation and change would also be procedural. It would seek to understand deeply the mechanisms that drive formation in the economy and not necessarily seek to reduce these to equations.
Or, as Stuart Kauffman asked me when I told him about my mathematical biology research, "Can any of your equations predict rabbits fucking?"

How natural processes can create meaning

The project of science is largely about asking why things happen.  We seek causal explanations: Why do planets follow elliptical orbits? Why does water become solid in cold temperatures?

Historically, this project has been largely reductionist in its approach.  That is, scientists have generally taken the view that phenomena can be explained in terms of smaller components.  We can understand how molecules behave by looking at their atoms; we can understand how atoms behave by looking at subatomic particles, etc. This program has been extremely productive: we can explain why oceans have tides and why prisms make rainbows.  Because of this success, some people believe that science will eventually be able to explain everything this way.  They argue that, if we can just understand matter at its tiniest level—quarks or whatever else is smaller than them—explanations for everything else will follow as a matter of course.

A postulated interior of the Duck of Vaucanson (1738-1739) by an American observer.  SOURCE: Wikimedia Commons
I encounter this extreme view not so much in academic papers, but more so in casual conversations among people who want to ground their arguments in science.  It seems to be a common "move" to argue that some concept is meaningless or illusory, because it can ultimately be reduced to the level of atoms, genes, or some other constituent entity.  Jerry Coyne, for example, argues in a recent essay that free will does not exist, because our brains are composed of atoms that must obey the laws of physics.

I argue that this extreme reductionism does not make for convincing arguments, on two grounds.  (I should pause to say that the ideas here are heavily influenced by many other thinkers—Stuart Kauffman in particular.) The first is that understanding the behavior of the parts of a system doesn't necessarily imply an understanding of the behavior of the whole.  This is a result of chaos theory. It can be shown that most systems with many interacting parts are chaotic, meaning that even if one could measure the present behavior of each component to within arbitrary precision, this would not suffice to predict the system's behavior for more than a brief window of time.  Any initial inaccuracies in measurement rapidly compound until all predictive power is lost. (This is the famous "butterfly effect": the future can be changed by a flap of a butterfly's wings.)  Additionally, quantum effects add another source of indeterminacy to any physical system.  Thus it is impossible, for example, to predict the advent of mantis shrimp or David Bowie by starting from the Big Bang and applying the laws of physics.  These entities do not contradict the laws of physics, but they're not predicted by them either.  (Okay, maybe Bowie contradicts the laws of physics just a little bit.)

The laws of physics do not predict this hotness.

The second ground—and the idea I most want to explore here—is the following:

Natural processes create new reasons for things to happen.

The prime example of this is evolution.  Consider, for example, a bacterium swimming up a glucose gradient—perhaps the simplest goal-directed behavior in nature.  The bacterium senses more glucose on one of its sides than the other, and swims in the direction of more glucose.  What would we say is the reason for this behavior?  One could investigate the physics and chemistry of the bacterium and identify mechanisms that cause it to move this way.  But this does not explain the apparent agency in the bacterium's movement.  The more satisfying explanation appeals to evolution: the bacterium moves toward greater sugar concentrations because evolution has equipped it with this mechanism to find the food it needs to survive and reproduce.

Simulation of bacteria undergoing biased random walk toward a food source.  SOURCE:
Notice, however, that this explanation only makes sense on the level of the whole organism.  The carbon and other atoms that comprise this bacterium do not act as if they had any goal.  Only the bacterium as a whole appears to be goal-oriented.  Thus reductionism completely fails to explain the bacterium's behavior.  Evolution—a natural and spontaneous process—has created a new reason for something to happen. This reason applies to the whole organism, but not to its parts.
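The biased random walk in the figure can be caricatured in a few lines.  This is my own cartoon (the function name and parameters are mine, not Andi's code or the simulation pictured): a walker on a one-dimensional glucose gradient that steps up the gradient slightly more often than down, loosely mimicking the run-and-tumble strategy of real bacteria.

```python
import random

def biased_walk(steps, bias=0.7, seed=0):
    """Walk on a line; step +1 (up the gradient) with probability `bias`."""
    rng = random.Random(seed)
    position = 0
    for _ in range(steps):
        position += 1 if rng.random() < bias else -1
    return position

print(biased_walk(10_000))  # positive: the walker drifts toward the food source
```

Note that the bias lives in the rule governing the whole walker; no individual step "wants" to go anywhere, which is the point about levels of explanation.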

Once we accept that natural processes create new reasons for things to happen, many new questions arise.  For instance, do different kinds of evolutionary processes create different reasons?  Yes!  It turns out that evolution in spatially dispersed populations can select for cooperative behaviors that would be disfavored if all individuals were mixed together.  So the explanation "it behaves that way in order to help its neighbors" makes sense under some evolutionary conditions but not others.

We can also ask what other kinds of processes can create new causal explanations.  Humans, for instance, engage in many activities that do not seem to be directly related to survival or reproduction; I would argue that this is due to a complex process in which our genes co-evolved with our cultures.

This man wants a slippery butt, but the individual cells that comprise him do not much care how slippery his butt is.  SOURCE: Three Word Phrase by Ryan Pequin
In short, nature can be creative.  Not only can it create new objects and life forms, it can also create new meanings, in the sense of reasons for things to happen.  These new meanings arise via naturally occurring processes that are consistent with—but not predicted by—the laws of physics.  These processes can even generate new, higher-level processes, which then create additional new layers of meaning.  If we, as scientists and as humans, want to understand why things happen, we must first understand the multiple, distinct ways that meaning and causality can arise. 

What's the deal with inclusive fitness theory?

You may not be aware of it, but there is a battle afoot in the theory of evolution.  The fight is over inclusive fitness theory—an approach to studying the evolution of cooperation.  I, together with mathematical biologist Martin Nowak and naturalist E. O. Wilson, just published an article pointing out weaknesses in the theory, and suggesting that it might not tell us much about why cooperation actually evolves.  This is my attempt to explain the controversy—and our new paper—to those who may not know anything about it.

The essential question is, "Why do organisms sometimes help others at a cost to themselves?"  Such helping behaviors have been observed from microbes to insects to humans.  At first glance, they may appear to contradict natural selection, since the cost of helping reduces the chances that the behavior is passed on to offspring.

Theorists have identified a number of different ways that costly helping can actually be favored by natural selection.  One way is if the help is primarily directed toward close relatives. These relatives have a good chance of sharing the "helping" gene, so that help increases the overall prevalence of this gene.  This mechanism is called kin selection.

Inclusive fitness theory is one way of representing the idea of kin selection.  Let's say you have some gene that makes you sacrifice your time and energy to help others.  This help affects fitness—the number of healthy offspring you produce.  ("Healthy" offspring are the ones that will eventually grow up and have offspring of their own.)  The first idea is to split fitness into the offspring you produce on your own and those that can be attributed to help from others:

The idea of inclusive fitness is to disregard the offspring that others help you produce, and instead count the ones that you help others produce:

To determine the overall effect on the helping gene, offspring that you help others produce must be weighted by the probability that they share the helping gene, which can be interpreted as your "relatedness" to them.  (For example, help you give to your siblings is weighted by one-half, equal to the probability that you inherited the same parental copy of the helping gene.)  Adding up these amounts of help times relatedness gives your inclusive fitness.  In some simplified models, it can be shown that natural selection favors organisms that have the highest inclusive fitness. 
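As a toy calculation (the numbers below are hypothetical, mine rather than the paper's), inclusive fitness adds your own offspring to the offspring you help others produce, each weighted by your relatedness to the recipient:

```python
def inclusive_fitness(own_offspring, help_given):
    """help_given: list of (relatedness, extra offspring caused) pairs."""
    return own_offspring + sum(r * b for r, b in help_given)

# You raise 2 offspring yourself, and your help lets a sibling (r = 1/2)
# raise 3 extra offspring and a cousin (r = 1/8) raise 1 extra.
w = inclusive_fitness(2, [(0.5, 3), (0.125, 1)])
print(w)  # 3.625
```

In the simplified models mentioned above, natural selection favors whichever strategy maximizes this weighted sum.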

At this point you may be asking "Wait, does it really make sense to divide offspring into those produced on one's own versus those produced by help from others?"  This is exactly the problem!  Aside from the obvious point that no one reproduces without help in sexual species, nature is full of synergistic and nonlinear interactions, so that making clean divisions like this is impossible in most situations.  Thus inclusive fitness theory only works in simplified toy models of reality.

Nowak and Wilson, together with mathematician Corina Tarnita, made this point forcefully in a 2010 Nature article.  In response, more than 100 authors signed a letter saying that inclusive fitness theory has no limitations, and is as general as natural selection itself.  (There were also heated blog posts and a talking bear video!)

What are we to make of this claim that inclusive fitness theory has no limitations at all?  This claim turns out to be based on the idea that, however complex the interactions are in nature, one can always use linear regression to split one's offspring into those attributable to oneself versus others.

Our new paper shows that this approach is not exactly wrong, but nonsensical.  To see why, let's consider a hypothetical helping trait (call it Trait X) and ask whether this approach can tell us whether and how the trait is selected for.

Can this method predict whether Trait X will succeed in evolution?  No, because in order to even set up the regression, one must know in advance whether it succeeds or not.  The whole method is based on retrospectively analyzing known outcomes of natural selection, and so it logically cannot predict anything new.

OK, so if we must know in advance whether or not Trait X is favored, can this method at least help us understand why it succeeds or fails?  The answer is no again, at least not in general.  The reason is that the regression method looks for correlations between having Trait X as a partner and having high fitness.  If there is a positive correlation, the method labels Trait X "altruistic".  But as any statistics student knows, correlation does not imply causation.  In fact, it is easy to come up with examples where the regression method misidentifies the nature of a trait.

For example, suppose Trait X is actually a jealous trait—if you have it, it makes you want to find high-fitness individuals and attack them, reducing their fitness as well as your own.  A hypothetical example with numbers is illustrated here:

The greenish numbers are the fitnesses before the attack, while the red numbers indicate the results of the attack.  The individual with Trait X (indicated in red) found the highest-fitness individual (5, in this case) and attacked it, reducing each of their fitnesses by one.  But since the attacked individual still has fitness 4, there is a positive correlation between having Trait X as your partner and having high fitness.  So the regression method calls this "altruism" when it clearly is not.
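The same accounting can be run in a few lines.  The numbers below are hypothetical (the post's figure is not reproduced here), but they follow the story: the jealous individual attacks the fittest member of the group, the victim nevertheless remains the fittest, and so having a Trait X partner still correlates with high fitness.

```python
# Hypothetical post-attack data: (partner has Trait X?, fitness).
# The jealous individual (fitness 2 -> 1) attacked the fittest
# member (fitness 5 -> 4); everyone else is untouched.
population = [
    (0, 1),  # the attacker itself
    (1, 4),  # the victim: still the fittest
    (0, 2),
    (0, 3),
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

partner_x = [p for p, _ in population]
fitness = [f for _, f in population]
print(f"correlation = {pearson(partner_x, fitness):.2f}")  # positive
```

A positive correlation is all the regression method needs in order to call Trait X "altruistic", even though the trait only ever destroys fitness.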

In short, the regression method generates a "just-so story", which is often wrong, for an outcome that is already known.  The fact that this method is trumpeted as "the very foundation of social-evolution theory" indicates a weird state of affairs in this corner of biology.  My reading is that many researchers fell in love with inclusive fitness theory (which admittedly can be elegant and intuitive when it works), and tried to stretch it to cover all of natural selection.  Similar problems exist in economics, where some researchers fall in love with the elegant mathematics of their theories and forget that they may not always apply to the real world.

I'm not proposing that we replace inclusive fitness theory with some other all-encompassing theory or framework.  Rather, I'm suggesting that the method of analysis be tailored to the problem at hand.  A variety of mechanisms can support the evolution of cooperation, and a variety of approaches are needed to understand them.  The only truly general theory in evolutionary biology is the theory of evolution itself. 

Allen B, Nowak MA, & Wilson EO (2013). Limitations of inclusive fitness. Proceedings of the National Academy of Sciences of the United States of America PMID: 24277847

Gardner A, West SA, & Wild G (2011). The genetical theory of kin selection. Journal of evolutionary biology, 24 (5), 1020-43 PMID: 21371156

Nowak MA, Tarnita CE, & Wilson EO (2010). The evolution of eusociality. Nature, 466 (7310), 1057-62 PMID: 20740005