Statistical analysis of agent-based models

I have observed that, when one writes a paper using one's own agent-based model, it is now common practice to perform statistical analysis of the output of the model.

This is like hiding an Easter egg under a shrub so that your paper can "discover" it there in its conclusion.


Worst use of "methodology", 2017

FBI profiler commenting on a series of murders: "They were all done with the same methodology."



Read into Things


A few weeks ago I walk into a coffee shop. I have a book in hand, and as I lean in to look at the menu, I place my book on the counter. The barista observes innocently, "Hey! Another customer came in with a book earlier. Is there a book sale going on around here or something?"

Merry on Rome and America

I don't think I have ever been cited this much in an essay.

What Is a Planet?


Fights over the best definition of a term are often a quagmire: there is no "correct" or "incorrect" definition in the same sense that there is a correct answer to what 2 + 2 equals. Instead, definitions are either more or less useful. If someone tries to define "animal" as "any entity in the physical universe," that definition is not wrong in the same sense that answering "5" to the 2 + 2 problem is wrong. The right attack on that definition is to point out that it renders the word "animal" less useful than does the currently prevailing definition.

"Common usage" is one factor in deciding how we should define a term. All other things being equal, we should defer to common usage. But common usage is not a trump card that defeats all other considerations.

For instance, when Copernicus put forward his heliocentric model of the solar system, he was, among other things, offering a new definition of "planet." For many centuries before him, "planet" meant "a celestial entity that wanders among the fixed stars." The planets, under that definition, were the Sun, the Moon, Mercury, Venus, Mars, Jupiter, and Saturn. And please note: so long as we accept that definition of "planet," that list is correct. (Yes, it is incomplete, missing other "planets" that would only be discovered with telescopes.)

Copernicus's system changed that definition to "major celestial objects orbiting the sun." At the time he did this, his new definition certainly violated common usage! But it would not have been a cogent complaint about his work to say, "But Nicolaus, 'a wanderer amongst the fixed stars' is THE definition of a planet!"


The Real Meaning of "Due to Chance"

Sometimes, people have become so enamored of statistical methods that they have hypostatized the terms used in such analysis, and have taken to treating ideas like "chance" or "regression to the mean" as if they could be the actual causes of events in the real world.

The analysis of probability distributions arose largely in the context of dealing with errors in scientific measurements. Ten astronomers all measured the position of Mercury in the sky at a certain time on a particular evening, and got ten different results. What should we make of this mess?

It was a true breakthrough to analyze such errors as though they were results in a game of chance, and to realize that averaging all the measurements was a better way to approach the true figure than was asking "Which measurement was the most reliable?"

This breakthrough involved regarding the measurement error in a population of measurements as being randomly distributed around the true value that a perfect measurement would have reported. The errors were "due to chance." And also, we could perform a statistical test to see which deviations from the perfect measurement were most likely not due to chance, and perhaps were the result of something like a deliberate attempt to fix the outcome of a test.

The phrase "due to chance" is just fine in the context of this statistical analysis: it means something like "We don't detect any causal factor so dominant in what we are analyzing that we should single it out as the cause of what occurred." But what it does not mean is that a causal agent called "chance" produced the result! No, it means that a large number of causal factors were at work, and that there is no way our test can isolate one in particular as "causing" the outcome.

In the context of measurement error, the fact that Johnson's measurement differed from Smith's, and from Davidson's, was caused by Smith's shaky hands, and Johnson having a smudge on his glasses, and the wind being high at the place Davidson was working, and Smith having slightly miscalibrated his measuring device, and Johnson being distracted by a phone call, and Davidson misreading his device, and... so on and so on. So long as lots of causal factors influence each measurement, and none of them dominate the outcome of the measurement, we can treat their interplay as if some factor called "chance" were at play: but there is no such actual factor!
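To make this concrete, here is a rough sketch of the idea (the true value, the number of measurers, and the count and size of the small error factors are all invented for illustration): each measurement is the true value plus many small, independent disturbances, none of which dominates, and averaging the measurements lands us closer to the truth than a typical single measurement does.

```python
import random

random.seed(1)

TRUE_VALUE = 100.0   # the value a perfect measurement would report
N_MEASURERS = 10     # ten astronomers, ten different results
N_FACTORS = 50       # shaky hands, smudged glasses, high wind, ...

def one_measurement():
    # Each measurement is perturbed by many small, independent causal
    # factors; their sum is the "random" error in that measurement.
    error = sum(random.uniform(-0.1, 0.1) for _ in range(N_FACTORS))
    return TRUE_VALUE + error

measurements = [one_measurement() for _ in range(N_MEASURERS)]
average = sum(measurements) / len(measurements)
worst = max(abs(m - TRUE_VALUE) for m in measurements)

print("Measurements:", ["%.2f" % m for m in measurements])
print("Error of the average:  %.3f" % abs(average - TRUE_VALUE))
print("Worst single error:    %.3f" % worst)
```

No single disturbance here "causes" the spread of results; the spread is the joint product of all of them, which is exactly what "due to chance" is shorthand for.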




A Fixed Roulette Wheel

In the comment section of this post, Bob Murphy asks how I would respond to a paper beginning:

"Abstract: It is well-known that players at the craps table are said to have a 'hot hand' after several advantageous rolls. The rollers themselves often report subjectively feeling 'in the zone' during streaks of successful rolls. However, using both Monte Carlo simulations and Bayesian inference models, we conclude that such 'patterns' are illusory and provide no operationally useful betting opportunity."

The idea is sound, but I think the point Bob wants made can be illustrated even better with an example from Willful Ignorance, a book which Ken B. recommended to me, but now seems to be willfully ignoring! (Sorry, Ken, I could not resist that joke.)

The author tells the story of George, a bright inventor who has figured out how to hack a casino's roulette wheel so that it produces a winning number he wants on command. So he could, say, produce one hundred 26s in a row, and clean up by continually betting on 26. But George is a lot smarter than that: he has seen the movies where people are beaten up in the back room of the casino for doing that sort of thing. What he does instead is to grab a random number generator app for his phone, and have it randomly pick a number between 0 and 37 (with 37 representing 00), and then cause that number to "hit" on the wheel. (And of course he has several different accomplices win, rather than winning himself, and only on a few spins an evening.)

Clearly, this is no longer a "fair" roulette wheel, at least for George and his friends or for the casino. (It still is fair for the other players! Their chance of winning is unchanged by George's scheme.) On whatever occasions George decides to use his device, the outcome is not due to "chance,"* but is being deliberately selected.

But no statistical test applied to the pattern of winning numbers will detect anything but chance at work. If Gilovich, Vallone, and Tversky applied the method of their famous hot hand paper to this wheel, they would have to conclude that George's idea that he could beat the wheel was just an illusion! (Of course, if researchers had more knowledge, specifically, the knowledge of who George's accomplices were, they could detect the scheme by analyzing those players' winning percentages.)
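For what it's worth, a toy simulation of George's scheme makes the point vivid (the rig rate, the accomplice's cover-bet rate, and the spin count are all invented): the winning numbers stay uniformly distributed, so frequency-based tests see nothing, while the accomplice's win rate is wildly above the fair 1-in-38.

```python
import random
from collections import Counter

random.seed(2)

POCKETS = 38       # 0, 00, and 1-36 on an American wheel
N_SPINS = 380_000
RIG_RATE = 0.01    # George only uses his device on a few spins

winning_numbers = []
accomplice_bets = 0
accomplice_wins = 0

for _ in range(N_SPINS):
    rigged = random.random() < RIG_RATE
    # The winning number is uniformly random either way: George's gadget
    # picks a uniformly random pocket, just as an honest wheel would.
    number = random.randrange(POCKETS)
    winning_numbers.append(number)
    if rigged:
        # On rigged spins, an accomplice has bet on the forced number.
        accomplice_bets += 1
        accomplice_wins += 1
    elif random.random() < 0.05:
        # On some honest spins the accomplice bets a random pocket as
        # cover, winning only at the fair 1-in-38 rate.
        accomplice_bets += 1
        accomplice_wins += (random.randrange(POCKETS) == number)

counts = Counter(winning_numbers)
expected = N_SPINS / POCKETS
max_dev = max(abs(c - expected) / expected for c in counts.values())
print("Max relative deviation from uniform: %.3f" % max_dev)
print("Accomplice win rate: %.3f (fair rate: %.3f)"
      % (accomplice_wins / accomplice_bets, 1 / POCKETS))
```

A test aimed only at the sequence of winning numbers finds a clean uniform distribution; a test aimed at the right players finds the fix immediately.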

The point of the story is that there can be real causal factors at play in a situation that will not be revealed by the obvious statistical tests. A statistical test that concludes "No significant effect was found" should be a piece of evidence in the trial of a hypothesis, and not the verdict of the trial.

* A side note: "chance" is not properly speaking the cause of anything. At the quantum level, as Ken pointed out, we perhaps find truly random events. But that is just to say that it is possible that, for instance, an excited electron dropping back to a lower atomic orbital is a causeless event. It does not mean some pagan god called "Chance" made the electron shift orbits. And at the macro level, "chance" is just the name we give to a situation in which a myriad of causal factors are at play, and it is beyond our ken (b.) to sort them all out.

A problem with Computer Science education, at present

The approach of giving students "little" problems, and rewarding students who are able to "solve" the problem as rapidly as possible with a high grade, teaches an "anti-pattern": hack your way as fast as possible to any program that can solve the problem you have been assigned.

A skilled software engineer does not approach a "customer" (which customer might actually be his boss, or a marketing executive, etc.) request in that way at all: instead, given X has been requested by "the customer," a skilled software engineer resists fulfilling the request as fast as possible, and instead begins to think:
  • Is it really necessary to program anything at all to fulfill this request? Perhaps some existing capability in the system actually already satisfies the customer request, if only the customer is educated on how to properly use that capability.
  • Is the request so hard to fulfill, and its fulfillment of such marginal value, that the customer should just be advised, "You don't really want us to program this: it will cost too much"?
  • Is the request one that can be met by simply installing some third-party library or a commercially available application? If so, it would be wasteful for the developer to write a program to fulfill it.
  • If it turns out that, after considering all the above points, there really is some in-house programming necessary to satisfy the customer request:
    • Are there likely to be similar requests in the pipeline, so that it will be useful to program a generic capability rather than simply one that fulfills the current request?
    • How can the code to fulfill this request be made an integral part of a coherent software system, rather than simply being an isolated chunk of code?
The "solve this isolated problem as fast as possible to receive an A" method of giving CS students "actual" work to do does not teach them anything at all about how to address the real-world software engineering questions listed above.

Given the semester-oriented nature of modern university education, I don't think there is an easy solution to this problem. But at least keeping the above points in students' minds, even if we have to assign "mini-problems," might help.

No, I Don't Believe Probability Judgments Are "Subjective"

Tom was, I think, worried that this is what I was suggesting. Then he grasped what my claim is. But in case others misapprehend it...

1) There are no judgments whatsoever that are "purely subjective." Any judgment is an attempt to assert something about the world. Although Oakeshott's arguments on this point (in Experience and Its Modes, chiefly) are more robust, I think M. Polanyi's arguments in Personal Knowledge are still very good but also more accessible. If I claim that "The odds of that coin coming up heads are one in two," I am saying something about the world "out there," rather than commenting upon some "purely personal" state of my own.

2) As such, there are better and worse judgments about what the probability of some event is. If all I know is, "Tom is flipping a fair coin," then the correct probability to assign to "The coin will come up heads" is .50. One way to defend my claim here is to note that anyone else having only the same knowledge as me about the situation can assuredly win money from me in the long run if I choose any other probability while they choose .50.

3) But that perfectly correct probability judgment, given my state of ignorance about the flipping, will become decidedly mistaken should my knowledge of what is going on change: for instance, suppose I suddenly gain the superpower of instantaneously being able to assess all the forces acting on a coin at the moment it is flipped so as to "see" whether any particular flip will come up heads or tails. If I gain that superpower, my correct assignment of probability to "The coin will come up heads" is either zero or one, depending on what I "see."

4) And finally, even if I have that superpower, should the casino in which I am betting become suspicious, and only allow me to bet on coin flips from another room (so that I can't gauge the forces at play in the flip), my correct probability judgment reverts to .50.

So, the objectively correct judgment of the probability of some event occurring depends on how much knowledge we have when making that judgment: if all we know is that Joe is a 50-year-old American male, we might be correct in judging that the probability he will live to 80 is .50. (I just picked .50 as a plausible number: I'm not looking this up in the mortality tables at the moment!) But if we then learn he is planning on committing suicide tonight, we would be correct in revising our estimate to, "Well, his probability of living to 80 is pretty close to 0."
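The claim in point 2, that anyone who assigns the fair coin anything other than .50 can be milked in the long run, can be sketched in code (the mistaken probability, the midpoint pricing rule, and the number of flips are all invented for illustration): suppose I mistakenly assign .60 to heads, and on each flip we trade a contract paying $1 if heads at the midpoint of our two stated probabilities. Since I value the contract above that price, I buy and you sell, and you grind out a steady profit.

```python
import random

random.seed(3)

TRUE_P = 0.5       # the coin really is fair
MY_P = 0.6         # my mistaken probability for heads
YOUR_P = 0.5       # your (correct) probability for heads
N_FLIPS = 100_000

# Each flip we trade a $1-if-heads contract priced at the midpoint of
# our stated probabilities. Since MY_P > price > TRUE_P, I overpay.
price = (MY_P + YOUR_P) / 2

your_profit = 0.0
for _ in range(N_FLIPS):
    heads = random.random() < TRUE_P
    # You (the seller) pocket the price and pay out $1 only on heads.
    your_profit += price - (1.0 if heads else 0.0)

print("Your average profit per flip: %.4f" % (your_profit / N_FLIPS))
print("Expected edge per flip:       %.4f" % (price - TRUE_P))
```

Any stated probability other than .50, higher or lower, opens a gap like this that the better-calibrated bettor can exploit; that is the sense in which .50 is the objectively correct judgment given that state of knowledge.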

Hot Streak Length

The critics of this model claimed "It implies a streak length of one."

Well, it doesn't:

#!/usr/bin/env python3
import random

SHOTS = 50
in_streak = False
hot_streaks = 0
hot_total = 0

print("Shooting with hot streaks:")
for shot in range(SHOTS):
    hot = (random.random() < .5)    # hotness is re-drawn on every shot
    if hot:
        hot_total += 1
        if not in_streak:
            in_streak = True
            hot_streaks += 1
        make = (random.random() < .66)    # hot shooter: 66% chance of a make
    else:
        in_streak = False
        make = (random.random() < .33)    # cold shooter: 33% chance of a make
    mark = 'X' if make else 'O'
    print(mark, end='')
print("")
print("Average hot streak length = " + str(hot_total / hot_streaks))

print("Shooting without hot streaks:")
for shot in range(SHOTS):
    make = (random.random() < .5)
    mark = 'X' if make else 'O'
    print(mark, end='')
print("")




And the output is:

Macintosh:statistics gcallah$ ./hotstreak.py
Shooting with hot streaks:
OOXXOXXOXXOXXXOOXXXOXXXXOOXXXXXXXOOOOOXOXXOXXOOXO
Average hot streak length = 2.0
Shooting without hot streaks:
OOXXOOXOXXOXOXXXXXXOXOOOXOXOXOOOOOOOOOXXOXOXOOOOO

What the model actually codes, and was meant to code, is the possibility that a player could be genuinely "hot" for some period; but since the hot streak might end at any moment, the streak has no predictive value, and "feeding the hot hand" will not help a team.
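That lack of predictive value can be checked directly with a quick Monte Carlo run using the same .5/.66/.33 parameters as the model above (the simulation size is arbitrary): because hotness is re-drawn on every shot, knowing the previous shot was made tells you essentially nothing about the next one.

```python
import random

random.seed(4)

N = 500_000
prev_make = None
makes = 0
makes_after_make = 0
shots_after_make = 0

for _ in range(N):
    hot = random.random() < .5        # hotness re-drawn on every shot
    p = .66 if hot else .33           # hot and cold make probabilities
    make = random.random() < p
    makes += make
    if prev_make:                     # this shot follows a made shot
        shots_after_make += 1
        makes_after_make += make
    prev_make = make

print("P(make)             = %.3f" % (makes / N))
print("P(make | prev make) = %.3f" % (makes_after_make / shots_after_make))
```

Both frequencies come out around .495, so streaks exist in the model, yet a string of makes gives a teammate no basis for "feeding" the shooter.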

The Internet Is a Wondrous Place!

I have programmed for 30 years now. I have published dozens of articles in professional software engineering journals. I have written programs used to trade tens of millions of dollars of securities each day. I teach computer science.

And today Ken B. informed me that if I set a random variable once outside of a loop the result will be different than if I set it anew each time around the loop!