links for 2009-11-09

The Neyman-Pearson lemma, as many though not all schoolchildren know, says that, among all tests of a given size s, the one with the smallest miss probability, or highest power, has the form "say 'signal' if q(x)/p(x) > t(s), otherwise say 'noise'," and that the threshold t varies inversely with s. Here p is the distribution of the data under noise and q its distribution under signal. The quantity q(x)/p(x) is the likelihood ratio; the Neyman-Pearson lemma says that to maximize power, we should say "signal" if it's sufficiently more likely than noise. The likelihood ratio indicates how different the two distributions (the two hypotheses) are at x, the data point we observed. It makes sense that the outcome of the hypothesis test should depend on this sort of discrepancy between the hypotheses. But why the ratio, rather than, say, the difference q(x) - p(x), or a signed squared difference, etc.? Can we make this intuitive?
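
To make the quoted setup concrete, here is a minimal sketch of a likelihood ratio test in Python. The two Gaussians are an assumption chosen purely for illustration (noise p = N(0, 1), signal q = N(1, 1)), and the function names are hypothetical; nothing here comes from the quoted source. In this particular case the ratio q(x)/p(x) is increasing in x, so the threshold t(s) can be read off from the noise quantile that gives false-alarm probability s.

```python
from statistics import NormalDist

# Illustrative assumption: unit-variance Gaussians for noise and signal.
noise = NormalDist(mu=0.0, sigma=1.0)   # p
signal = NormalDist(mu=1.0, sigma=1.0)  # q


def likelihood_ratio(x):
    """Return q(x) / p(x) at the observed data point x."""
    return signal.pdf(x) / noise.pdf(x)


def cutoff_for_size(s):
    """Return c such that P_noise(X > c) = s."""
    return noise.inv_cdf(1.0 - s)


def threshold_for_size(s):
    """Return t(s): the likelihood ratio evaluated at the cutoff.

    Valid here because q(x)/p(x) is increasing in x for these two
    Gaussians, so {ratio > t(s)} is the same event as {x > c}.
    """
    return likelihood_ratio(cutoff_for_size(s))


def decide(x, s):
    """Say 'signal' if the likelihood ratio exceeds t(s), else 'noise'."""
    return "signal" if likelihood_ratio(x) > threshold_for_size(s) else "noise"


def power(s):
    """Probability of saying 'signal' when the data really come from q."""
    return 1.0 - signal.cdf(cutoff_for_size(s))


if __name__ == "__main__":
    for s in (0.01, 0.05, 0.20):
        print(f"size={s:.2f}  t(s)={threshold_for_size(s):.3f}  power={power(s):.3f}")
```

Running the loop shows the threshold falling as the size grows, which is the "t varies inversely with s" part of the statement. The sketch does not answer the question of why the ratio is the right statistic; it only shows what the test looks like in one simple case.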