Ten Years and 29 Days Ago at Grasping Reality: Hoisted from the Archives for August 2, 2007

Comment of the Day: Maynard Handley: For the Filing Cabinet: Using Computers and Data Science Here in the Twenty-First Century...: "Personally I'm a Mathematica fan. Yeah, you PAY (and pay and pay) for it... http://www.bradford-delong.com/2017/08/for-the-filing-cabinet-using-computers-and-data-science-here-in-the-twenty-first-century.html#comment-6a00e551f08003883401b8d2a61adf970c

...but in return you get a power tool beyond your wildest imaginings.

I took a lot longer than five minutes to run through the stochastic simulation part of the exercise, but that was because, geek that I am, once I got the damn thing to work it became imperative to keep honing the solution to get it faster and faster!

My final solution runs in 62 seconds for 500 million samples on a 2012 MacBook Pro (quad-core i7, nominally 2.6GHz):

dice = {4, 6, 8, 12, 20}; diceBins = {1} ~Join~ (dice + 1);
nTrials = 1000000; nOuterLoops = 500;
AbsoluteTiming[
 c = ParallelTable[
      BinCounts[#, {diceBins}] & /@
        Table[RandomInteger[{1, i}, nTrials], {i, dice}] // Transpose,
      nOuterLoops] // Total;
 ]
cn = #/Total[#] & /@ c // N[#, 5] & // 100 # & // TableForm

 roll     d4      d6      d8     d12     d20
 1-4    37.038  24.690  18.518  12.346   7.4078
 5-6     0      39.217  29.411  19.610  11.762
 7-8     0       0      48.391  32.257  19.352
 9-12    0       0       0      62.497  37.503
13-20    0       0       0       0     100.00

(This solution collapses all the identical cases (e.g. rolls 1, 2, 3, 4) together to aggregate their statistics. A similar solution calculates the results per die outcome, like Brad's table; see the sketch below.)...
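Maynard doesn't show that per-die-outcome variant, but a minimal sketch, reusing the definitions above and swapping the collapsed ranges for unit-width bins (the names valueBins and cFull are mine, not his), might look like:

(* per-outcome variant: one bin per face value 1..20, so each row of the
   result is the sample count for one exact roll value *)
valueBins = Range[1, Max[dice] + 1];
cFull = ParallelTable[
     BinCounts[#, {valueBins}] & /@
       Table[RandomInteger[{1, i}, nTrials], {i, dice}] // Transpose,
     nOuterLoops] // Total;
#/Total[#] & /@ cFull // N[#, 5] & // 100 # & // TableForm

This gives a 20-row table (one row per possible roll) rather than the 5-row collapsed table above.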

It's interesting to note where the [computer] time goes. My first few variants all spent most of their time in some version of the histogramming part of the process, and it was only when I managed to cram as much of that as possible into a Mathematica built-in (namely BinCounts[]) that the bulk of the time (about 2/3) moved to the generation of the random numbers. One could, given the trivial nature of the problem, likely still improve performance (even while staying within Mathematica) by replacing the (doubtless high-quality) random number generator Mathematica is using with something substantially cheaper.
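Mathematica does expose this knob via SeedRandom's Method option, so a minimal sketch of that experiment (the method names are real built-ins; the comparison itself is my assumption, not code from the comment) would be:

(* time 10^8 draws under the default generator vs. a cheap LCG *)
SeedRandom[1234, Method -> "ExtendedCA"];   (* the default cellular-automaton RNG *)
AbsoluteTiming[RandomInteger[{1, 20}, 10^8];]
SeedRandom[1234, Method -> "Congruential"]; (* a much cheaper linear congruential generator *)
AbsoluteTiming[RandomInteger[{1, 20}, 10^8];]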

Now I have to think of the neatest way of implementing the more intelligent solution in Mathematica :-)...


Maynard Handley: For the Filing Cabinet: Using Computers and Data Science Here in the Twenty-First Century...: "So, as expected, the thinking-based solution (represent the prior, form the joint probability of a particular die/value pair, sum out to get the value PDF, and use Bayes) was so trivial... http://www.bradford-delong.com/2017/08/for-the-filing-cabinet-using-computers-and-data-science-here-in-the-twenty-first-century.html#comment-6a00e551f08003883401b8d2a6479f970c

...there was nothing to optimize or obsess over, no fun! But if anyone cares, I can post it. Of course, being Mathematica, that type of calculation is all done in rationals, so you can print the table as exact fractions rather than as decimals if you like.
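He didn't end up posting it, but a minimal sketch of the calculation as described, assuming a uniform prior over the five dice (the variable names and the example lookup are mine, not Maynard's), would be:

dice = {4, 6, 8, 12, 20};
prior = 1/Length[dice];  (* uniform prior over which die was picked *)
(* joint[[v, d]] = P(die d and roll v): prior times 1/d when v <= d, else 0 *)
joint = Table[If[v <= d, prior/d, 0], {v, 1, Max[dice]}, {d, dice}];
(* Bayes: normalize each row by its total, i.e. divide by P(v) *)
posterior = #/Total[#] & /@ joint;
posterior[[6]]  (* posterior after rolling a 6: {0, 20/51, 5/17, 10/51, 2/17} *)

Everything stays rational, so posterior // TableForm prints exact fractions, and the row for a roll of 5 or 6 matches the 39.217/29.411/19.610/11.762 line of the simulated table above to within sampling noise.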

Much more interesting is that I kept playing with the calculational solution and managed to shrink the time by another third or so (from 65s to 40s for half a billion samples) by carefully scaling the size of the tables being worked on at any given time so that everything fits in L3 cache. This requires shrinking the variable called nInstances down to, I think it was, 50,000, and commensurately cranking up the number of outer loops.
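In terms of the code posted earlier (where the inner-table size is the variable nTrials; "nInstances" presumably plays the same role in his later version), the tweak would look something like this; the 50,000 figure is his, the rest is my reconstruction:

(* same 5*10^8 samples per die, split as 50000 * 10000; each inner batch is
   now roughly 5 dice * 50000 * 8 bytes ~ 2 MB per kernel, small enough for L3 *)
nTrials = 50000; nOuterLoops = 10000;
AbsoluteTiming[
 c = ParallelTable[
      BinCounts[#, {diceBins}] & /@
        Table[RandomInteger[{1, i}, nTrials], {i, dice}] // Transpose,
      nOuterLoops] // Total;
 ]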

It's actually rather remarkable seeing the theory you know play out just like it's supposed to.

As you crank up the working set size you speed up until you exceed L3, at which point you start dropping (from 40s back to 65s). If you keep cranking up the working set size beyond the size of RAM, you start paging (actually, this being macOS, it first tries to compress RAM before real paging), and then the wheels fall off: your job that should have taken a minute is off in the wilds, until you eventually give up and shoot it to put it out of its misery.

So for those playing along at home, the rules remain the same as they always have been... Use fast code and a fast algorithm, but ALSO pay some attention to L3 size, and MASSIVE attention to RAM size.
