Homeopathic Bayes...

The problem with debates about the philosophy of statistics is that they influence what you do: Minimize regret or minimize expected loss? Coverage or coherence? Is nature (and our own brains) our friend or our foe? There are actual real stakes here, in a way that there are not real stakes in philosophy of quantum mechanics or of economics...


Suppose we have some knowledge of the distribution of a parameter—Gaussian, with mean and variance known, because why not.

Then suppose we get more data from some sort of natural or artificial experiment about the parameter: data that increases our knowledge about the parameter by, say, 11.1% or so. We then update our initial knowledge via the Rule of the Reverend Thomas Bayes: there is nothing else very sensible to do. And, yes, we can say that the true parameter is unknown and fixed, and that our estimates—the mean and variance and the entire honking distribution of our estimate—are random variables. And, yes, we can say that the 95% is not the chance that a random Tykhe-drawn parameter is in the (now fixed) confidence interval, but rather the chance that the random Tykhe-drawn confidence interval surrounds the (fixed) parameter. But the Rule of the Reverend Thomas Bayes is still the sarissa in our panoply.
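To make the arithmetic concrete, here is a minimal sketch of that update for the Gaussian case with known variances. The specific numbers (prior variance 1, data variance 9, prior mean 0, data mean 1) and the function name `gaussian_update` are illustrative assumptions, chosen so that the prior supplies 90% of the combined precision and the data raises precision by about 11.1%.

```python
def gaussian_update(prior_mean, prior_var, data_mean, data_var):
    """Conjugate update: Gaussian prior combined with a Gaussian data summary.

    Precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the prior mean and the data mean.
    """
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / data_var
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * data_mean) / post_precision
    return post_mean, 1.0 / post_precision


# Illustrative numbers (assumed, not from the text beyond the 90% / 11.1% split).
prior_mean, prior_var = 0.0, 1.0   # precision 1
data_mean, data_var = 1.0, 9.0     # precision 1/9

post_mean, post_var = gaussian_update(prior_mean, prior_var, data_mean, data_var)

# Share of the combined precision that comes from the prior.
prior_share = (1.0 / prior_var) / (1.0 / post_var)
# Proportional gain in precision ("knowledge") from the new data.
knowledge_gain = (1.0 / post_var) / (1.0 / prior_var) - 1.0

print(f"posterior mean = {post_mean:.3f}, posterior variance = {post_var:.3f}")
print(f"prior share of knowledge = {prior_share:.1%}, gain from data = {knowledge_gain:.1%}")
```

Under these assumed numbers the posterior mean moves only a tenth of the way from the prior mean toward the data mean, which is what "the prior is 90% of what we know" amounts to.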

Now suppose that we dilute, and dilute, and dilute again, so that our initial knowledge of the distribution of the parameter is not 90% of (initial knowledge) + (data knowledge) but only 1%. And then suppose we keep going. We dilute until our initial knowledge is reduced to a homeopathic level.

Some people say that you should approach statistical problems without any assumptions about what initial knowledge you have, and so should use “frequentist” rather than “Bayesian” tools. By continuity, when you have very little initial knowledge you should use almost-“frequentist” tools.
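The continuity claim can be checked numerically in the Gaussian, known-variance case. The sketch below is an illustration under assumed numbers (prior mean 0, data mean 1, data variance 1), and the helper names are hypothetical. It parameterizes the update by `w`, the share of total precision contributed by the prior, and compares the resulting 95% credible interval with the usual 95% confidence interval built from the data alone.

```python
import math


def posterior_given_prior_weight(w, prior_mean, data_mean, data_var):
    """Posterior mean and variance when the prior supplies a fraction w
    (0 <= w < 1) of the total precision and the data supplies the rest."""
    data_precision = 1.0 / data_var
    total_precision = data_precision / (1.0 - w)
    post_mean = w * prior_mean + (1.0 - w) * data_mean
    return post_mean, 1.0 / total_precision


def interval95(mean, var):
    """Symmetric 95% interval for a Gaussian with the given mean and variance."""
    half_width = 1.96 * math.sqrt(var)
    return mean - half_width, mean + half_width


# Assumed illustrative values.
prior_mean, data_mean, data_var = 0.0, 1.0, 1.0

# Confidence interval from the data alone (known variance).
freq_lo, freq_hi = interval95(data_mean, data_var)

for w in (0.9, 0.5, 0.01, 1e-6):
    m, v = posterior_given_prior_weight(w, prior_mean, data_mean, data_var)
    lo, hi = interval95(m, v)
    print(f"prior weight {w:>8}: credible interval [{lo:+.4f}, {hi:+.4f}]")

print(f"data-only confidence interval: [{freq_lo:+.4f}, {freq_hi:+.4f}]")
```

In this simple setting the two intervals coincide in the limit of zero prior weight, so there is no discontinuity in the numbers. Whatever switch happens is in the interpretation, which is the point of the question that follows.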

Question: At what point do you switch over, and why, exactly? Or is it a mistake to use the Rule of the Reverend Thomas Bayes whenever you get an additional 10% of information about a parameter? What should you do instead?


References:


Robert Waldmann: Why Statistics Is So Tightly Integrated with the Historically-Contingent Institutions of Games of Chance: "I want to talk about Keynes (that is: the pages of the Treatise on Probability that I read)...

...There is something else about gambling and probability—there is a correct answer. It is tempting to generalize the thought of a rational gambler (who will always walk right out of the casino once she figures out what is going on) to assume that all problems we solve are similar. But they aren't. We are mere savages not Savages, and we do not have a complete set of subjective probabilities.

Here, I think, the historic importance of gambling isn't just that it consisted of problems hard enough that our intuition misleads us. I think it is also vital that it consists of problems simple enough that the sneaky guy who invents the game can solve it and see that he will win in the long run. Regular problems (betting on the weather say) are much more complicated, so the house doesn't systematically win.

I am not just claiming that we do not have rational expectations—that our subjective probabilities aren't identical to objective probabilities. The [subjective probabilities] are not in here. (Well, they are not in here in my skull—I don't know about yours (By "you" I mean Brad DeLong (not "one, but I'm American and won't type the God damn Queen's f-cking English"). I have noticed patterns of human behavior which fit everyone I've met, with one exception—Brad DeLong. In general people are not capable of doing algebra in their head while writing (not typing—composing) 60 words a minute on a different topic. I have met one such creature (Brad). For all I know there are subjective probabilities inside his skull (I cannot do it any more—editor)). If Bayes were incorrectly portrayed as a psychologist, one would falsely conclude that he was a very bad psychologist. Only rarely do we have priors, and most such cases are artificial: involving dice, roulette wheels, and playing cards.

We do have hypotheses—extremely specific assertions which we can't help thinking might be plain true (even though we can't really hope to win a contest which Newton lost). We can test them. Occasionally we can face the facts that prove us wrong. Some of us have learned things this way. But I don't think they had subjective probabilities and Bayesian updating either.

Also, I must respond to Shalizi by attempting to robustify Bayesian analysis. We don't have subjective probabilities—if we did then updating them would give messy but exact numbers—the probability that Trump is re-elected is 5.7357%. Oh hell that's toooo high. This is an obviously false claim about psychology and always will be.

We are not sure of a prior distribution—not convinced that the probability density of a parameter is not only meaningful but also exactly one of the 2^c (that's a realllllllllly big infinity) possibilities. But I think we can believe that probabilities could be represented imagining a stochastic process at the beginning of time in which, with 90% probability, the parameters are drawn from a distribution (and are meaningful and describe the laws of nature) and with 10% probability something else happens. That's not a prior. That allows 2^c different possible worlds. It means that after applying Bayes's formula we will always get to probabilities somewhere in the range p to p+10%. But it's something. And it might even be valid...


Cosma Shalizi (2016): On the Uncertainty of the Bayesian Estimator: "A: I hardly know where to begin...

...I will leave aside the color commentary. I will leave aside the internal issues with Dutch book arguments for conditionalization. I will not pursue the fascinating, even revealing idea that something which is supposedly a universal requirement of rationality needs such very historically-specific institutions and ideas as money and making book and betting odds for its expression. The important thing is that you're telling me that α, the level of credibility or confidence, is really about your betting odds.

Q: Yes, and?

A: I do not see why I should care about the odds at which you might bet. It's even worse than that, actually, I do not see why I should care about the odds at which a machine you programmed with the saddle-blanket prior (or, if we were doing nonparametrics, an Afghan jirga process prior) would bet. I fail to see how those odds help me learn anything about the world, or even reasonably-warranted uncertainties in inferences about the world.

Q: May I indulge in mythology for a moment?

A: Keep it clean, students may come by.

Q: That leaves out all the best myths, but very well. Each morning, when woken by rosy-fingered Dawn, the goddess Tyche picks θ from (what else?) an urn, according to π(θ). Tyche then draws x from p(X;θ), and x is revealed to us by the Sibyl or the whisper of oak leaves or sheep's livers...
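The Tyche story is a two-stage sampling scheme, and it can be simulated. The sketch below assumes, purely for illustration, that π(θ) is a standard Gaussian and that p(X;θ) is Gaussian around θ with unit variance; the function name `tyche_coverage` and all default values are hypothetical. It counts how often the 95% posterior interval computed from x contains the θ that Tyche actually drew.

```python
import math
import random


def tyche_coverage(n_mornings=20000, prior_mean=0.0, prior_var=1.0,
                   noise_var=1.0, seed=0):
    """Fraction of mornings on which the 95% posterior interval contains theta,
    when theta really is drawn from the prior used in the update."""
    rng = random.Random(seed)
    post_precision = 1.0 / prior_var + 1.0 / noise_var
    half_width = 1.96 * math.sqrt(1.0 / post_precision)
    hits = 0
    for _ in range(n_mornings):
        theta = rng.gauss(prior_mean, math.sqrt(prior_var))  # Tyche's urn
        x = rng.gauss(theta, math.sqrt(noise_var))           # what is revealed
        post_mean = (prior_mean / prior_var + x / noise_var) / post_precision
        if post_mean - half_width <= theta <= post_mean + half_width:
            hits += 1
    return hits / n_mornings


coverage = tyche_coverage()
print(f"share of mornings with theta inside the 95% interval: {coverage:.3f}")
```

If Tyche really draws from the π(θ) we used, the share should sit near 95%. The simulation says nothing about what happens when she draws from some other urn, which is where the argument in the dialogue turns.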


We express Bayesian ideas in gambling contexts because those were the first contexts complicated enough for us to need to formalize and develop what we already knew: Judea Pearl and Dana Mackenzie: The Book of Why: The New Science of Cause and Effect (New York: Basic Books: 0465097618) https://books.google.com/books?isbn=0465097618: "Belated awakenings of this sort are not uncommon in science...

...For example, until about four hundred years ago, people were quite happy with their natural ability to manage the uncertainties in daily life, from crossing a street to risking a fistfight. Only after gamblers invented intricate games of chance, sometimes carefully designed to trick us into making bad choices, did mathematicians like Blaise Pascal (1654), Pierre de Fermat (1654), and Christiaan Huygens (1657) find it necessary to develop what we today call probability theory. Likewise, only when insurance organizations demanded accurate estimates of life annuity did mathematicians like Edmond Halley (1693) and Abraham de Moivre (1725) begin looking at mortality tables to calculate life expectancies. Similarly, astronomers’ demands for accurate predictions of celestial motion led Jacob Bernoulli, Pierre-Simon Laplace, and Carl Friedrich Gauss to develop a theory of errors to help us extract signals from noise. These methods were all predecessors of today’s statistics...


Cosma Shalizi: "I do think there's something to the expression of decision theory in terms of bets and money...

...that isn't handled by just observing "well, that was the first technological context where we really needed to understand all this". The relevant analogy would be if we'd formulated the laws of Newtonian physics in terms which only made sense for 17th century clockwork, so that before you could try to study, say, cloud formation or polymers you had to pretend everything was really a rigid body...


#cognition
#statistics
#bayesianism
#behavioraleconomics
