Statistical probability of rolling all 18s

FinneousPJ · September 2013

@Mathsorcerer This isn't a matter of approach or opinion. Fact is 1/56 is wrong.

Mathsorcerer · September 2013

*shrug* The horse is dead and there is no point in trying to convince anyone who has decided that their mind is closed. I have indeed studied the math, I have crunched the numbers, and I stand by my results as accurate, a fact with which even my statistics professor agreed when I showed him the results.

Nothing more to see here so just move on.

Corvino · September 2013

It depends on your perspective, and the expected outcome to an extent. If we were talking about rolling a 15 then it would be relevant. There is only 1 way to roll an 18.

Mathsorcerer · September 2013

The only results from 3d6, presuming you are interested in the sum total, are
111, 112, 113, 114, 115, 116, 122, 123, 124, 125, 126, 133, 134, 135, 136, 144, 145, 146, 155, 156, 166, 222, 223, 224, 225, 226, 233, 234, 235, 236, 244, 245, 246, 255, 256, 266, 333, 334, 335, 336, 344, 345, 346, 355, 356, 366, 444, 445, 446, 455, 456, 466, 555, 556, 566, and 666.
Interestingly, this is the sum of the first 6 triangular numbers. There are 6+5+4+3+2+1=21 results beginning with 1, 5+4+3+2+1=15 results beginning with 2, 4+3+2+1=10 results beginning with 3, etc. until we have only one result beginning with 6. 21+15+10+6+3+1=56, the total set of possible results when rolling 3d6 *if we consider only the sum total and not the individual die results*. That is the key to what I have been saying--ignore the individual results and look only at the sum. Based on this reasoning, p(rolling an 18) = 1/56.
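
The count of 56 can be checked by brute force. The following is a minimal Python sketch, not something posted in the thread; the variable names are illustrative. It lists the unordered results described above next to the ordered results in which each die is tracked separately.

```python
from itertools import combinations_with_replacement, product

FACES = range(1, 7)

# Unordered results: three faces where only the multiset matters (e.g. 1,3,4).
unordered = list(combinations_with_replacement(FACES, 3))

# Ordered results: (first die, second die, third die).
ordered = list(product(FACES, repeat=3))

print(len(unordered))  # 56
print(len(ordered))    # 216

# In both listings exactly one result sums to 18.
print(sum(1 for combo in unordered if sum(combo) == 18))  # 1
print(sum(1 for roll in ordered if sum(roll) == 18))      # 1
```

Both listings contain a single way to reach 18; they differ only in how many results the 18 is being compared against.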

Hadar · September 2013

But you assume that rolling each sum has equal probability... If the dice are ideal and the environment does not affect the rolling, your assumption is false. And your way of thinking has nothing to do with rolling dice in BG.

Mathsorcerer · September 2013

No, each sum does not have equal probability. p(14) = 4/56 = 1/14 but p(6) = 3/56, so you are more likely to roll a 14, which we know should be true because 14 is closer to the expected value of 10.5 than 6 is.

If you want a more difficult challenge at some point try analyzing the results of 5d4-2, which will also give you the range of results 3 to 18. The expected value is still 10.5 but the distribution is concentrated around the mean even more than 3d6.
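
For anyone who wants to try the 5d4-2 comparison, here is a small Python sketch (an illustration only; `mean_and_variance` is a made-up helper name). It uses the standard ordered-roll enumeration, so it reflects the textbook distribution rather than the collapsed 56-result view.

```python
from fractions import Fraction
from itertools import product

def mean_and_variance(num_dice, sides, modifier=0):
    """Exact mean and variance of NdS+modifier over every ordered roll."""
    totals = [sum(roll) + modifier
              for roll in product(range(1, sides + 1), repeat=num_dice)]
    mean = Fraction(sum(totals), len(totals))
    variance = Fraction(sum(t * t for t in totals), len(totals)) - mean ** 2
    return mean, variance

print(mean_and_variance(3, 6))      # mean 21/2, variance 35/4
print(mean_and_variance(5, 4, -2))  # mean 21/2, variance 25/4
```

Both have mean 10.5, and the smaller variance for 5d4-2 (6.25 against 8.75) is what "closer to the mean" looks like in numbers.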

Jarrakul · September 2013

@Mathsorcerer, I'm sorry that your statistics professor agreed, because frankly you're wrong. And here's why. You're saying that we don't care about the order of the dice, only the sum total at the end. This is true, as far as it goes. 4/3/1 is the same as 1/4/3. The problem is that they're still unique sets of die rolls. They have the same result, so the probability of rolling an 8 must include both of them, but they are two different ways to reach that result, and so both must be factored in.

I'll give an example using 2d6, because it's simpler, and trying to get an 11. You say there's only one way to do that, because it's always going to be a 5/6 set. That's wrong, and I'll explain why. So, how can we get that 5/6 set? Well, let's do this sequentially. It's the same thing, since the events are independent. It's definitionally impossible for rolling one die first to influence the probability at the end (and I can demonstrate why it's the same in more explicit detail, if you'd like). When you roll the first die, think about what you can possibly roll that will let us get this 5/6 spread when we roll the second. It's pretty obvious. We have to roll a 5 or a 6. Otherwise there's no way we can get a 5 and a 6. So we have to roll a 5 or a 6 on the first die, or in other words we have a 1/3 chance of getting a roll that allows us to potentially roll an 11 once we add the second die. Now we roll the second die. No matter what the first die rolled, there's precisely one roll that will add up to 11. If we rolled a 5 before, we need a 6 now. If we rolled a 6 before, we need a 5 now. So either way, we have a 1/6 chance of getting our 11. Now, using the definition of sequential probability, we have a 1/3 chance and a 1/6 chance, so the chance of both of them happening is 1/3 * 1/6 = 1/18, which is consistent with the 2/36 predicted by the idea that there are two possible ways to make 11. In other words, this sequential model makes it obvious that 5/6 and 6/5 must be unique events with unique probabilities that must be summed to get the total probability of rolling an 11.

Now, if that doesn't convince you, I urge you to find a pair of d6s and roll them a couple hundred times. See how many 11s you roll, and how many 12s. I don't like using observed frequencies to argue the definition of probability, because that's a remarkably frequentist way of thinking, but it is nonetheless illustrative of the fact that you tend to get roughly twice as many 11s as 12s.
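
If no dice are at hand, the 2d6 argument can also be checked by listing all 36 ordered rolls. This is a minimal Python sketch with illustrative names, assuming fair and independent dice.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered (first die, second die) results.
rolls = list(product(range(1, 7), repeat=2))

def prob_of_sum(target):
    """Exact probability that the two dice add up to target."""
    hits = sum(1 for first, second in rolls if first + second == target)
    return Fraction(hits, len(rolls))

p11 = prob_of_sum(11)
p12 = prob_of_sum(12)

# Sequential view: a 5 or 6 first (2/6), then the one matching face (1/6).
sequential = Fraction(2, 6) * Fraction(1, 6)

print(p11, p12, sequential)  # 1/18 1/36 1/18
```

The enumeration and the sequential product agree, and an 11 comes out exactly twice as likely as a 12.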

Mathsorcerer · September 2013

That's just it--I don't disagree with that assessment precisely because it *is* correct. p(11 on 2d6) = 2/36 no matter how you examine it. I am merely proposing a slightly different way of looking at dice results considering only the sum, not the order of the individual results. Collapsing the results in this manner would show that only one result gives an 11, whether that roll is 5/6 or 6/5. Believe me, on a test I would be using the standard way of analyzing the problem--p(11 on 2d6) = 2/36.

I never knew that discussing probability analysis could make some people so riled up. At least they take it seriously, which is a good thing. This is almost as good as the time on a different board we were discussing the Monty Hall problem, a classic. If you don't like my way of analyzing dice throws then don't use it.

FinneousPJ · September 2013

@Mathsorcerer Would you say that on a 2d6 the probability of rolling 2 equals the probability of rolling 12 equals 1/21?

Jarrakul · September 2013

Oh, the Monty Hall problem. That one does make people rage spectacularly. :P

So, @Mathsorcerer, my problem with your analysis is that, while I understand how you got it, I don't understand what definition of probability you think it works under. I know you have a formula that yields that as an answer, and that's cool, but what does it mean? Granted, certain subjectivist sub-camps would accept it as a correct measure of your confidence of rolling an 11 (for example), but they'd also accept any other number you gave between 0 and 1, so that's not terribly useful. I think you have something more concrete in mind, and I simply have no idea what.

Also, if you only care about the sums, why isn't everything on 2d6 just 1/11? If 4/3/1 is the same as 1/4/3, why is 2/2/4 different?

Mathsorcerer · September 2013

If you collapse the results like I normally do, yes. If you want the textbook results, then no--p(2 on 2d6) = p(12 on 2d6) = 1/36.

Jarrakul said:
Also, if you only care about the sums, why isn't everything on 2d6 just 1/11? If 4/3/1 is the same as 1/4/3, why is 2/2/4 different?

Because 4/3/1 normally has 6 ways of showing up--134, 143, 314, 341, 413, and 431--but 2/2/4 has only three ways to show up--224, 242, and 422. So sometimes results collapse from 6 to 1 and other times from 3 to 1.
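
The "6 to 1" and "3 to 1" collapsing can be made explicit by counting the distinct orderings of each unordered result. A short Python sketch follows (illustrative only; `orderings` is a hypothetical helper).

```python
from itertools import combinations_with_replacement, permutations

def orderings(combo):
    """Number of distinct ordered rolls that show the same set of faces."""
    return len(set(permutations(combo)))

print(orderings((1, 3, 4)))  # 6
print(orderings((2, 2, 4)))  # 3
print(orderings((6, 6, 6)))  # 1

combos = list(combinations_with_replacement(range(1, 7), 3))

# Weighting each of the 56 unordered results by its orderings recovers 216.
print(sum(orderings(c) for c in combos))                 # 216

# Ordered ways to roll a total of 8 on 3d6.
print(sum(orderings(c) for c in combos if sum(c) == 8))  # 21
```

Once each unordered result is weighted by its number of orderings, the counts are the standard 216-outcome ones again.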

FinneousPJ · September 2013

Mathsorcerer said:
If you collapse the results like I normally do, yes. If you want the textbook results, then no--p(2 on 2d6) = p(12 on 2d6) = 1/36.

Jarrakul said:
Also, if you only care about the sums, why isn't everything on 2d6 just 1/11? If 4/3/1 is the same as 1/4/3, why is 2/2/4 different?
Because 4/3/1 normally has 6 ways of showing up--134, 143, 314, 341, 413, and 431--but 2/2/4 has only three ways to show up--224, 242, and 422. So sometimes results collapse from 6 to 1 and other times from 3 to 1.

And you're saying that the textbook is wrong and you're right? You must understand they can't both be right, lol

Jarrakul · September 2013

I'm not totally sure you're answering what I'm asking. If you are, please go into more detail, because I'm not getting it. What I'm asking is, you're clearly counting the entire 4/3/1 set as one result and the entire 4/2/2 set as a second result. Given that they both sum to 8, why aren't you counting them both combined as a single result? What's special about different die faces that isn't special about different dice?

But that's a secondary concern. What I'd really like to know is not what method you're using to get your results, or even the logic behind that method, but rather what your results actually mean. Does your 1/56 answer reflect anything we can expect to see in the real world, or is it just the output of your formula? Because, to be frank, if it's just the output of your formula I don't really see the point.

Corvino · September 2013

As with so many things, we see a normal(ish) distribution arise with repeated 3d6 rolling. It depends on the r, and should normalise as r increases. There are a lot of rolls favoring the 3.5x3x6, adjusted for minimums. So there will be a significant amount of skew built into the system.

6 18s is well beyond a 2 standard deviation cutoff though. That being the normal standard in human populations. I honestly can't think of a character who would need it. For any class. Even for a paladin, which has crazy requirements, what would 18 Int add to the character?

Given that I've rolled at least a couple of thousand times, and so have a lot of other people, it would not surprise me had someone rolled an 18/18/18/18/18/18 but it would be vanishingly rare.

*edit* I can contribute to stats discussions, but probably shouldn't do so while drunk. My points stand.

Mathsorcerer · September 2013

FinneousPJ said:
And you're saying that the textbook is wrong and you're right? You must understand they can't both be right, lol

Actually...both results can be correct because the textbook and I are viewing the problem slightly differently. As I have said a handful of times now, if you are taking a test or working a problem in the real world then use the textbook formula.

@Jarrakul, the way I am collapsing the results would count both 4/3/1 and 4/2/2 as an 8, which they are, so those 6+3=9 results would all go into the "8" bucket. There isn't anything "real world" about it; rather, it was an exercise in intellectual curiosity that struck me back when I was playing PNP games and I wondered what the results would be if I analyzed 3d6 and cared only about the sum rather than the individual rolls. The results don't necessarily "mean" anything but they do make the case that extreme results--3s, 4s, 17s, 18s-- are *more* likely to happen than traditional analysis would state, given that 1/56 is much more likely than 1/216. I suppose I should put together a short program and have the computer give me 1,000,000 rolls of 3d6 over the weekend to get some experimental data for comparison.

FinneousPJ · September 2013

@Mathsorcerer Now I know you're shitting me.

Jarrakul · September 2013

@Mathsorcerer, I'm confused now. I thought you were collapsing the 4/3/1 and 4/2/2 groups into 1 each. Now you're telling me you get 9? That's precisely what traditional probability will tell you. Do the math that way and you will precisely reproduce the predictions of traditional probability. Or are you telling me they all get grouped into a single 8 group? In which case, why do you have more than 11 buckets at the end with equal weight?

I guarantee you that if you run that experiment you will get roughly the distribution predicted by traditional probability. I guarantee this because current traditional probability was made up by frequentists who used exactly that sort of experiment to define what they thought probability was. I happen to disagree with their definition, but the Bayesian camp would still predict exactly the same results and use the same math to do it.

Mathsorcerer · September 2013

About the results or about making a program to generate 1,000,000 random rolls?

Either way, at least I have helped make the day more interesting.

pixie359 · September 2013

As far as I can tell, 1/56 means that the given result (rolling 18) is one of a set of 56 possible results.

Unfortunately, without knowing the relative likelihood of those results, it's not very illuminating. It's wrong to call this the 'probability' of that outcome.

I'd also disagree with the way that the outcomes are grouped, as I think that if you are going to call (eg) 1-1-3 and 1-3-1 the same result, you might as well go whole hog and say 1-2-2 is as well, as that also adds up to five, which is the final outcome.

So actually we have reduced the outcome from 1/216 to 1/16 (there are 16 possible sums, 3 through 18). Pointlessly, admittedly, but we've done it.

Mathsorcerer · September 2013

The way I was looking at the results, you *would* group 1/1/3 and 1/2/2 into the "5" bucket. Many hobbies, as this is one of mine, ultimately wind up being pointless...unless you can find a way to make a point with it. I had a different professor who published a small book that was a compilation of ways *not* to be able to trisect an angle using straightedge and compass alone.

Jarrakul · September 2013

That sounds like an awesome book and I want to read it. Who was the professor, so that I might find this book?

I remain confused by much of what you're saying. Oh well. Life goes on. And for the record, I have no problem with you having an ultimately pointless hobby. I just feel like your bringing it up here added unnecessary confusion to the thread, since it doesn't really answer the question at hand. Of course, then I had to go exacerbate the problem by calling you out on it, thereby derailing the thread, so perhaps I shouldn't talk. :P

Mathsorcerer · September 2013

If a thread doesn't get derailed at some point, then it isn't worth posting in. *laugh*

I'll have to look up the book's info; it may or may not be in print anymore even though I have a copy--he gave me one himself.

I just ran 50,000 test rolls--a good start--and the experimental results match what we would expect from the standard 216 results, not my 56. *shrug* It is what it is--a wonderful hypothesis ruined by an ugly fact. I don't expect those results to change if I run 500,000 or the full million.
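
A run like this is easy to reproduce. The sketch below is not the program used for the numbers above; it is a minimal Python version with an arbitrary fixed seed and illustrative names, comparing the observed share of 18s against both 1/216 and 1/56.

```python
import random
from collections import Counter

def simulate_3d6(num_rolls, seed=2013):
    """Roll 3d6 num_rolls times and count how often each total appears."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the run is repeatable
    counts = Counter()
    for _ in range(num_rolls):
        counts[sum(rng.randint(1, 6) for _ in range(3))] += 1
    return counts

NUM_ROLLS = 50_000
counts = simulate_3d6(NUM_ROLLS)
observed = counts[18] / NUM_ROLLS

print(f"observed p(18) = {observed:.5f}")
print(f"1/216          = {1 / 216:.5f}")
print(f"1/56           = {1 / 56:.5f}")
```

With 50,000 rolls the observed frequency should land far closer to 1/216 than to 1/56, though the exact figure depends on the seed.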

pixie359 · September 2013

I dunno, I enjoyed the discussion, and it seemed (to me) to be held in good spirits.

FinneousPJ · September 2013

Here's a quick program if anyone's interested. Just pop it into a C compiler and it should run on most OSes. I can upload the compiled exe if you need.


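#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
	size_t n, d, runs;
	size_t *results;

	srand((unsigned)time(NULL));

	printf("Welcome to Die Simulator!\n");

	printf("Please input die to simulate (e.g. \"3d6\")\n");
	if (scanf("%zud%zu", &n, &d) != 2 || n == 0 || d == 0)
		return EXIT_FAILURE;

	/* results[s - 1] counts how often the dice summed to s */
	results = calloc(n * d, sizeof *results);
	if (results == NULL)
		return EXIT_FAILURE;

	printf("Please input number of simulations to run in millions\n");
	if (scanf("%zu", &runs) != 1 || runs == 0)
	{
		free(results);
		return EXIT_FAILURE;
	}

	const size_t total = runs * 1000000;

	for (size_t i = total; i != 0; i--)
	{
		size_t result = 0;

		/* roll n dice; rand() % d has a slight modulo bias, fine for a demo */
		for (size_t c = n; c != 0; c--)
			result += (size_t)rand() % d + 1;

		results[result - 1]++;
	}

	printf("\nResults:\n");
	for (size_t i = n - 1; i < n * d; i++)
		printf("%zu:\t%zu\t%f %%\n", i + 1, results[i], 100.0 * results[i] / total);

	free(results);
	return 0;
}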

Example run:

Welcome to Die Simulator!
Please input die to simulate (e.g. "3d6")
10d6
Please input number of simulations to run in millions
10

Results:
10:     0       0.000000 %
11:     0       0.000000 %
12:     11      0.000110 %
13:     37      0.000370 %
14:     127     0.001270 %
15:     307     0.003070 %
16:     869     0.008690 %
17:     1840    0.018400 %
18:     3957    0.039570 %
19:     7558    0.075580 %
20:     14047   0.140470 %
21:     24689   0.246890 %
22:     40392   0.403920 %
23:     63614   0.636140 %
24:     95181   0.951810 %
25:     137850  1.378500 %
26:     189749  1.897490 %
27:     254984  2.549840 %
28:     324001  3.240010 %
29:     405884  4.058840 %
30:     482214  4.822140 %
31:     564486  5.644860 %
32:     626947  6.269470 %
33:     684563  6.845630 %
34:     710792  7.107920 %
35:     731047  7.310470 %
36:     711877  7.118770 %
37:     684733  6.847330 %
38:     625788  6.257880 %
39:     563186  5.631860 %
40:     482904  4.829040 %
41:     404853  4.048530 %
42:     324219  3.242190 %
43:     255140  2.551400 %
44:     190944  1.909440 %
45:     137894  1.378940 %
46:     95466   0.954660 %
47:     63741   0.637410 %
48:     40868   0.408680 %
49:     24501   0.245010 %
50:     14049   0.140490 %
51:     7585    0.075850 %
52:     3908    0.039080 %
53:     1877    0.018770 %
54:     820     0.008200 %
55:     345     0.003450 %
56:     116     0.001160 %
57:     28      0.000280 %
58:     11      0.000110 %
59:     1       0.000010 %
60:     0       0.000000 %

EDIT: tags

@Jarrakul @Mathsorcerer

EDIT again error in formatting

EDIT 3 here's the file https://forums.beamdog.com/uploads/FileUpload/b6/2f3a2fbd53d916a0a369d5da0722be.7z

@Dee please remove it if sharing such files is not allowed.

On the topic: you can simulate all 18s by running the program with 18d6 (i.e. 6 times 3d6) and looking at the count for 108, which only comes up when every die shows a 6.
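
For a sense of scale, the exact figure under plain 3d6 can be worked out directly. This Python sketch assumes six independent 3d6 rolls with no rerolls or minimums, which may not match what the game actually does.

```python
from fractions import Fraction

p_one_18 = Fraction(1, 6) ** 3  # all three dice show a 6
p_all_18 = p_one_18 ** 6        # six independent stats, each an 18

print(p_one_18)                          # 1/216
print(p_all_18 == Fraction(1, 6 ** 18))  # True
print(f"about 1 in {float(1 / p_all_18):.3e}")
```

That is roughly one in a hundred trillion, so even tens of millions of simulated 18d6 rolls would be expected to show a zero in the 108 row.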

SCARY_WIZARD · September 2013

kamuizin said:

alnair said:
@nihility00 It's 3d6 for each stat, so the theoretical maximum is (3x6)x6 = 108. But you'd have to be VERY lucky to roll that...
And you will probably reroll before you notice it, I do this all the time TT!

"Uhh..."
*click, click, click, click*
"74... 77... 78... 79... 97... 82... Oh, **** me!"

taltamir · September 2013

Just use CLUA console to award yourself as many tomes as needed.

I recently did a playthrough with all 19 to my stats.. I figured that was proper for the prophesied demigod. GREAT FUN! (I could have gone up to 25 but that felt too high).

clanqui · September 2013

The roller is definitely not using 3d6. The averages skew too high. Most likely it is using 4d6 drop lowest, which gives a 1 in 54 chance of rolling an "18". (24 possibilities evaluate to 18 out of 1296 possible combinations)

TJ_Hooker · September 2013

clanqui said:
The roller is definitely not using 3d6. The averages skew too high. Most likely it is using 4d6 drop lowest, which gives a 1 in 54 chance of rolling an "18". (24 possibilities evaluate to 18 out of 1296 possible combinations)

As was said earlier in the thread, high average rolls are due to the fact that the game re-rolls whenever a stat is less than the race/class minimum, or when the total is less than 75. So it could still be 3d6 (or simply a random number between 3 and 18).

nihility00 · September 2013

No doubt it will be programmed to discredit a total roll value less than a predesignated number. I've used the 3d6 rule for D&D but sometimes the rolls were so poor we implemented a max re-roll of x3.

arondes · September 2013

The D&D rule for generating a character may or may not be exactly the algorithm the BG game uses, but I would like to introduce one possible rule:

4d6, drop the lowest die

reference:
http://web.fisher.cx/robert/infogami/Classic_D&D_house_rules
http://catlikecoding.com/blog/post:4d6_drop_lowest
http://prestonpoulter.com/2010/10/19/the-mathematics-behind-4d6-drop-the-lowest/
http://www.sosmath.com/CBB/viewtopic.php?t=49843
http://kill-0.com/duplo/2008/06/02/dd-dice-probabilities/

Under that rule, the result for a single ability, e.g. strength, is the sum of three order statistics. Someone claimed that the game simply uses a "discrete uniform" distribution to generate 3~18, but I do not think this is true. In my experience, the generated values tend to stay "in the middle" rather than being evenly distributed from 3 to 18.

If you would like to calculate the probability that sum=108, it is "roll 4d6 and get at least three 6s", repeated 6 times assuming independence. There are 21 combinations that generate 18: one for (6,6,6,6), and four for each (6,6,6,i), i=1,2,3,4,5, so 1+4*5=21, out of 6^4=1296 possible rolls.
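
The count of 21 can be confirmed by enumerating all 4d6 rolls. Here is a minimal Python sketch (the function name is illustrative, and 4d6-drop-lowest is only a candidate rule, not a confirmed description of the game's roller).

```python
from fractions import Fraction
from itertools import product

def drop_lowest_sum(dice):
    """Sum of the dice after discarding one lowest die."""
    return sum(dice) - min(dice)

outcomes = list(product(range(1, 7), repeat=4))  # all ordered 4d6 rolls
hits = sum(1 for dice in outcomes if drop_lowest_sum(dice) == 18)

p18 = Fraction(hits, len(outcomes))
p_all_18 = p18 ** 6  # six independent abilities, no rerolls assumed

print(hits, len(outcomes))  # 21 1296
print(float(p18))
print(float(p_all_18))
```

So a single 18 is 21/1296, or about 1 in 62, under this rule.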

Here I would like to extend the problem a little. Call the final result big3sum, and let us first calculate its mean.

E(big3sum)=E(x1+x2+x3+x4)-E(min)

where min is the smallest die, which we can also write as x(1), following the textbook notation for order statistics. The above equation holds by linearity of expectation, no matter whether the variables are independent.

It is easy to see that E(x1+x2+x3+x4)=4*3.5=14. So the next step is to calculate E(min).

Let us look closely at the probability mass function of min.

(1) P(min < k)=1-P(min>=k)=1-P(x1>=k,x2>=k,x3>=k,x4>=k)=1-((7-k)/6)^4
(2) P(min <= k)=1-P(min>k)=1-P(x1>k,x2>k,x3>k,x4>k)=1-((6-k)/6)^4

Subtracting (1) from (2), we get:
P(min = k)=P(min<=k)-P(min<k)=((7-k)/6)^4-((6-k)/6)^4

Then by definition,

E(min)=sum(k=1 to 6) k*P(min=k)

The result is 1.7554. Thus, E(big3sum)=14-1.7554=12.2446
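
These two numbers can be cross-checked with exact fractions. The Python sketch below (illustrative names) evaluates the mass function derived above and compares it with a direct enumeration of all 1296 rolls.

```python
from fractions import Fraction
from itertools import product

def p_min(k):
    """P(min = k) for four fair d6, from the formula above."""
    return Fraction(7 - k, 6) ** 4 - Fraction(6 - k, 6) ** 4

e_min = sum(k * p_min(k) for k in range(1, 7))

# Direct check: average of (sum minus lowest die) over every ordered roll.
outcomes = list(product(range(1, 7), repeat=4))
e_big3sum = Fraction(sum(sum(d) - min(d) for d in outcomes), len(outcomes))

print(e_min, float(e_min))          # 2275/1296, about 1.7554
print(e_big3sum, float(e_big3sum))  # 15869/1296, about 12.2446
```

The enumeration agrees with 14 - E(min), as expected.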

However, the calculation of the variance is much more difficult, because we have to consider the covariance.

V(x1+x2+x3+x4-min)=V(x1+x2+x3+x4)+V(min)-2*Cov(x1+x2+x3+x4,min)

V(x1+x2+x3+x4) is again easy to calculate. The answer is 11.66667. V(min) is also easy, since we have got the mass function. The V(min) is 0.91.

Next step is really dirty. Cov(x1+x2+x3+x4,min)=Cov(x(1),x(1))+Cov(x(1),x(2))+Cov(x(1),x(3))+Cov(x(1),x(4)). Here I call min as x(1) in order to make it clearer.

Cov(x(1),x(1)) is exactly V(min), which we have got. In order to calculate other numbers, you have to deal with the joint distribution of order statistics.

But let us do it in a "cheating" way. It is well known that, for n continuous uniform variables u's in [0,1], if we denote u(i) as the ith value in increasing order, then the distribution of u(i) is Beta(i,n+1-i). In our problem, n=4. If we just assume that x(i)=6*u(i), then by calculating the covariance of the u(i)'s and multiplying by 36 we can "estimate" the covariance of the x(i)'s. This is "non-rigorous" but workable. By the textbook, for i<=j,
cov(u(i),u(j))=i(n+1-j)/(n+1)^2/(n+2)

If we plug n=4, i=1, j=2,3,4 into the formula and multiply by 36, we get:
cov(x(1),x(2))=0.72
cov(x(1),x(3))=0.48
cov(x(1),x(4))=0.24

Thus, Cov(x1+x2+x3+x4,min)=Cov(x(1),x(1))+Cov(x(1),x(2))+Cov(x(1),x(3))+Cov(x(1),x(4))=2.35

Next we combine everything together, V(x1+x2+x3+x4-min)=V(x1+x2+x3+x4)+V(min)-2*Cov(x1+x2+x3+x4,min)=11.66667+0.9100789-2*2.35=7.9

The big3sum follows some distribution with mean 12.2446 and variance 7.9. The probability mass function is not easy to get, but luckily we are more interested in the distribution of the sum of "six independent big3sum". By the i.i.d. condition and the central limit theorem, we may assume that the sum of a character's abilities is approximately normal, with mean 73.4676 and variance 47.4.
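
Since there are only 1296 rolls, the variance can also be computed exactly and set beside the continuous-uniform estimate. This is a Python sketch with illustrative names, assuming six independent 4d6-drop-lowest abilities with no minimums or modifiers.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=4))
values = [sum(d) - min(d) for d in outcomes]  # big3sum for every ordered roll

mean_one = Fraction(sum(values), len(values))
var_one = Fraction(sum(v * v for v in values), len(values)) - mean_one ** 2

# Independent abilities: means and variances both add.
mean_six = 6 * mean_one
var_six = 6 * var_one

print(float(mean_one), float(var_one))  # about 12.2446 and 8.1045
print(float(mean_six), float(var_six))  # about 73.4676 and 48.627
```

The exact single-ability variance comes out slightly above the 7.9 estimate, so the normal approximation for the total would use a variance nearer 48.6 than 47.4.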

However, it is worth noting that tail probabilities, such as P(sum>=107), may not be approximated well due to the nature of asymptotic theory, so this normal model is only a rough general description of the problem. I also wrote some R code for a Monte Carlo study:

generateone<-function(lower,add,n)
{
# roll 4d6 for each of n characters, one row per character
temp=sample(1:6,4*n,replace=TRUE)
temp=matrix(temp,ncol=4)
# drop the lowest die, then apply the racial modifier
temp2=rowSums(temp)-apply(temp,1,min)
temp2=temp2+add
# raise anything below the lower bound up to the lower bound
pmax(temp2,lower)
}

This function generates one of the character's abilities. lower sets the lower bound. add is the modifier for an extra bonus (for an elf, dex has add = +1).

Here is a simple example. We draw 100000 observations, do a kernel density estimation and then integrate numerically. The result estimates P(sum>=94).

N=100000
str=generateone(9,1,N) #9-19
dex=generateone(3,0,N) #3-18
con=generateone(4,1,N) #4-19
int=generateone(1,-2,N) #1-16
wis=generateone(3,0,N) #3-18
cha=generateone(3,0,N) #3-18
total=str+dex+con+int+wis+cha
pdf=density(total)
apdf<- approxfun(pdf$x, pdf$y, yleft=0, yright=0)
integrate(apdf, 93.5, 108)$value

There could be something incorrect in the above; I appreciate any suggestions. Also, could the programmers give some hints about how the game actually generates those numbers?
