::     ::

### Intro

So, some day in April 2023 Prof. James Grime asked for help on Twitter with some problem he was tinkering with. This is his tweet:

"OK, I have a problem someone could probably simulate faster than I can work it out. I have a set of dice with values: 0, 0, 0, 0, 2, 3. You roll the dice, add up the total, then roll that number of dice. And repeat. If I start with five dice, how long will the game last?"

I quickly wrote a little program to perform the simulation, and let it run overnight not only with 5 dice as a starting point, but also 1, 2, 3 and 5. These are the results:

 Initial Rolls Average Length Simulated games Longest game 1 2.22307... 2.7 x 1012 150 2 3.23895... 1.5 x 1012 151 3 4.09949... 1.1 x 1012 154 4 4.84160... 0.8 x 1012 151 5 5.49181... 1.2·1012 143

Here the decimals are all correct, in that I truncated the results a couple of digits sooner than the most stable digits of the simulation. So, the answer he was looking for is 5.491816...

This is the histogram of the lengths of the 1.2 x 1012 games, for initial number of dice # = 5: Which in logarithmic Y scale looks like this: Which show a nice linear decay. The histograms for #=1,2,3 and 4 follow linear decays as well.

Regarding the code, I run the simulation with a direct and straightforward encoding of the problem statement. It's written in C, and runs at about 8 million games per second as is:

#include <stdint.h> #include <stdio.h> #include <random> int main( void ) { int dice = { 0, 0, 0, 0, 2, 3 }; std::mt19937 gen; uint32_t longestGame = 0; uint64_t runningTotal = 0; for( uint64_t numGames=1; ; numGames++ ) { uint32_t numThrows = 0; uint32_t numDiceToRoll = 5; for( numThrows=1; ; numThrows++ ) { uint32_t total = 0; for( uint32_t i=0; i<numDiceToRoll; i++ ) total += dice[ gen() % 6 ]; numDiceToRoll = total; if( numDiceToRoll==0 ) break; } runningTotal += uint64_t(numThrows); longestGame = std::max( longestGame, numThrows ); if( (numGames & 0x00000000001fffff )==0 ) { printf( "%I64u %d %.16g\n", numGames, longestGame, double(runningTotal)/double(numGames) ); } } return 0; }

This is rather unoptimized in terms of low level programing like SIMD instructions or multithreading, not to speak of a GPU implementation. But a simple optimization can be done by creating a look up table (LUT) that encodes the results of 1, 2, 3, 4 or 5 rolls of the dice, so that we can have a lot less random number generation invocations, memory lookups and additions the innest of the loops. Something like this:

#include <stdint.h> #include <stdio.h> #include <random> int main( void ) { uint8_t dice1 = { 0, 0, 0, 0, 2, 3 }; uint8_t dice5[6*6*6*6*6]; // 7776 bytes uint32_t len[] = { 1, 6, 6*6, 6*6*6, 6*6*6*6, 6*6*6*6*6 }; // precompute results for 5 rolls for( int m=0; m<6; m++ ) for( int l=0; l<6; l++ ) for( int k=0; k<6; k++ ) for( int j=0; j<6; j++ ) for( int i=0; i<6; i++ ) dice5[6*6*6*6*m+6*6*6*l+6*6*k+6*j+i] = dice1[i] + dice1[j] + dice1[k] + dice1[l] + dice1[m]; std::mt19937 gen; uint32_t longestGame = 0; uint64_t runningTotal = 0; for( uint64_t numGames=1; ; numGames++ ) { uint32_t numThrows = 0; uint32_t numDiceToRoll = 5; for( numThrows=1; ; numThrows++ ) { uint32_t total = 0; while( numDiceToRoll>0 ) { uint32_t num = std::min(numDiceToRoll,5u); total += dice5[ gen() % len[ num ] ]; numDiceToRoll -= num; } numDiceToRoll = total; if( numDiceToRoll==0 ) break; } runningTotal += uint64_t(numThrows); longestGame = std::max( longestGame, numThrows ); if( (numGames & 0x00000000001fffff )==0 ) { printf( "%I64u %d %.16g\n", numGames, longestGame, double(runningTotal)/double(numGames) ); } } return 0; }