Is the save-rolling mechanism fair?
chimeric
Member Posts: 1,163
Just now I cast a 1st-level spell at Kagain about 20 times in a row. His save vs. spell is way lower than it has any right to be: 8. Even so, the spell only got through 2 or 3 times. This makes me suspicious. If the engine is skewed, does that only work to the benefit of the party, or do monsters get unusually nice results as well? I'm playing on the Core Rules difficulty.
Comments
If Kagain was 1st level, an 8 would be low without buff spells or items. If Kagain is level 7 or 8, then an 8 is the correct save vs. spells for a dwarf with 20 Constitution (see the 2nd Edition Player's Handbook). So, what level was Kagain?
Was Kagain wearing any save-buffing items (e.g., Ring of the Princes)? Was he under the effects of any buff spells, bard songs, etc.?
What spell were you casting? Some spells have save bonuses/penalties, e.g., Charm Person gives a +3 save bonus to the target.
Was the caster a specialist? Was the spell from their school of specialization?
Paralysis / Poison / Death: 6
Rod / Staff / Wand: 8
Petrification / Polymorph: 12
Breath weapon: 13
Spell: 9
This is in BG. The save values for different classes in the pen-and-paper game are collected in Table 60 of the PHB. For a 6th level fighter they are: 11 13 12 13 14.
Here, apparently, is where the dwarf, halfling and gnome spell resistance comes in. They all get +1 for every 3 points of Constitution, starting from Constitution 4-6 according to Table 9 in the PHB and bumping up every 3 points after that. So, although this is off the chart in the handbook, at 20 Con the bonus should already be +6 for vs. spell and vs. rod, staff or wand for those three races. Dwarves and halflings, but not gnomes, also get this bonus for rolls vs. poison - only poison, not all vs. death rolls. A crushing boulder crushes them just as well as it does a human. Apparently this was simplified for the Infinity Engine.
Kagain's saving throw of 8 was an adjusted one, by one point from a ring. It is actually 9. Since 14 - 6 = 8, his save vs. spell value is actually worse by a point than it ought to be, by the rules! And so on for the other values. It seems that the designers stopped with the limits of Table 9, so Kagain is actually due another point for vs. death, vs. rod and vs. spell.
With this out of the way, the spell I cast on him is one I had made myself: an enchantment spell, 1st level, cast by Edwin the conjurer, with a straight vs. spell roll. As I said, Kagain was wearing a ring, so his save value vs. spell was 8. He should have failed to save on any roll of 1-7, i.e. 35% of the time. I'm going to do some testing and cast the spell on him 50 times or so to figure out whether the number generator is fair or not, because that determines whether I ought to build a save penalty into my spells. I would appreciate input on the saves of monsters, too.
Table 9 only goes up to 19 Con (the limit for a freshly rolled dwarf). Bioware did take the +5 at 19 Con to be the maximum (see SAVECNDH.2DA). However, even if you extend the pattern above 19, the next bump in saving throw would be at 21 Con to +6 (and then +7 at 25). So, Kagain's save is just fine.
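To make the arithmetic in this exchange concrete, here is a minimal Python sketch of how the quoted numbers fit together. The function name is made up for illustration; the figures are the ones cited above (6th-level fighter base of 14 from Table 60, the +5 cap at 19 Con per SAVECNDH.2DA, and Kagain's +1 ring).

```python
# Rough sketch of the save-vs.-spell arithmetic discussed above.
# Illustrative only -- not engine code.

def effective_save(class_base, con_bonus, item_bonus):
    """Lower is better: the d20 roll must meet or beat this value to save."""
    return class_base - con_bonus - item_bonus

# 6th-level fighter base save vs. spell (PHB Table 60): 14
# Shorty Con bonus as capped in-game (+5, per SAVECNDH.2DA): 5
# Kagain's ring in the OP's game: +1
print(effective_save(14, 5, 1))   # 8 -> what the game shows
# If the bonus were extended to +6 at 20 Con, as the earlier post argues:
print(effective_save(14, 6, 1))   # 7
```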
Rolling 50 times and then suspecting something is off is not exactly a good method to go by here.
If you want to empirically investigate skewed rolling behavior in this manner, 50 tries simply won't do. 500 could be some indication; 5,000 or 50,000 would give you a solid lead.
If 5,000 rolls show a significant deviation, I'd start investigating systemic issues. Any order of magnitude below that can still be simply explained by RNGesus frowning upon you.
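To show what such a test might look like, here's a rough Python sketch: simulate a pile of d20 rolls and compare the observed counts to a flat distribution with a chi-square statistic. Everything here (the sample sizes, Python's random module standing in for the engine's RNG) is illustrative, not a claim about how BG generates its rolls.

```python
import random
from collections import Counter

def chi_square_d20(rolls):
    """Chi-square statistic for a list of d20 rolls vs. a flat distribution."""
    n = len(rolls)
    expected = n / 20
    counts = Counter(rolls)
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, 21))

for n in (50, 500, 5000, 50000):
    rolls = [random.randint(1, 20) for _ in range(n)]
    stat = chi_square_d20(rolls)
    # With 19 degrees of freedom, values much above ~30 start to look
    # suspicious; small samples bounce around far more than large ones.
    print(f"n={n:6d}  chi-square={stat:6.2f}")
```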
Exempli gratia, the bonus vs. poison in pen-and-paper is a nice assurance, but not a big deal. How often do you get poisoned, anyway? The rest of the vs. death save is regular, which leaves the midget races just as susceptible to, for example, dying in a spiked pit. When this mechanic is transported without differentiation to BG, what we get is solid protection against a whole lot of possible misadventures - or just adventures. If something happens now to the party that requires a roll vs. death or vs. spell, humans, elves and half-elves are going to be severely disadvantaged. They are about half as likely to survive. And the midgets, on the other hand, needn't worry or be excited much. Oh, my dwarf is so tough, he don't care. From this it follows that whoever decides to put a spiked pit in the game, or just some new magic, had better use another save, like vs. breath weapon. And so on.
And remember the topic of this thread: I want to know above all if the rolling itself is fair. If it's skewed towards good results, penalties to the save all-around may be necessary.
Remove the Save vs. Death bonuses in "SAVECNDH.2DA".
Alter any item/spell that poisons to use the following format:
Opcode: 326 (Race: Dwarf or Halfling) Subspell1, no save
Opcode: 326 (Race: Not Dwarf or Halfling) Subspell2, Save vs. Death
Subspell1- Opcode: 326 (CON: 1-25, one for each) Subspell2, Save vs. Death with proper bonus
Subspell2- Poison Effect, no save
Anyway. It depends on what you are looking at. The example case was a pretty standard dice-roll scenario, i.e. a repeatable event of little significance that can be generalized across a large sample size. That doesn't translate into "low numbers don't matter" per se - context matters a LOT. In fact, context is pretty much ALL that matters, and it is WHY in this case low numbers MUST be disregarded.
The key here is the so-called "expected value", i.e. the mathematical average of a random variable. To take dice as an example, the average of, say, 2d6 is 7 ((2+12)/2), meaning you're most likely to roll a 7, with the other totals less and less likely the further away you go (in either direction), all the way to 2 and 12, the least likely.
Why do low numbers not matter? Because they tell you NOTHING. If you throw 2d6 and roll 12, then throw them again and roll another 12 - does that mean the dice are loaded? Or did you just get lucky? Well, you HAVE NO IDEA. You cannot make a determination one way or the other because you lack sufficient data. You would need to test and try things out before you made any sort of assertion with any sort of truth value. A lot of testing. 10 rolls? Nope. Might be anything. 100? Better, but you could still see weird results. 1,000? Now we're getting towards a much clearer picture. That is what is called the "law of large numbers", i.e. that you approach the expected value (EV) more closely the more attempts you analyze.
Note however that this is an abstract mathematical example with very clear context and limitations. Also, this is only about making assertions about the validity of the process - we're trying to determine here if the "randomness" of the dice works as it should (mathematically) or whether something in the system is screwed up and producing bad random numbers. You can't say EITHER WAY without testing, but it's often good to assume a sensible default position - such as, in this case, that the system works as it is described, i.e. that dice rolls are actually modeled to be sufficiently ideal dice that mimic mathematical EV. It says so on the box, so we assume it to be true. Now, if you think something ELSE is true... then you have to first disprove the default position. I.e. to show the system is NOT working properly, you'd have to have some sort of statistical evidence, or at least an indication. For that, you need a LOT of tries, because only then can you actually see whether anything is wrong. And you need ever higher numbers the smaller the wrongness you want to detect is (it's easy to see if something is 10% but should be 50%, but it's MUCH harder to realize something is 34.9% when it should be 35%, to give some examples).
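A quick sketch of the law of large numbers in Python, using the 2d6 example from above: the running average drifts toward the expected value of 7 as the number of throws grows. The sample sizes are arbitrary.

```python
import random

def average_2d6(n):
    """Average total of n throws of 2d6 (expected value is 7)."""
    total = sum(random.randint(1, 6) + random.randint(1, 6) for _ in range(n))
    return total / n

for n in (10, 100, 1000, 100000):
    # Small samples can land far from 7; large ones hug it closely.
    print(f"{n:7d} throws: average {average_2d6(n):.3f}")
```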
That depends entirely on the situation. These mathematical models are all abstractions, and only work within very specific frameworks. However, you can in fact very much make an argument similar to the one you made.
The key in this case would be the TOTAL number of employees. If you're meeting 10 people from a 20-person company and 9 of them turn out to be jerks - that's a staggering percentage (45%) and you're probably right to be concerned. But let's say you meet 10 employees of a company like e.g. Amazon.com - the same 9 jerks there tell you fairly little, considering they have about 230,000 employees (0.0039%). To put it in everyday terms, would you really condemn 229,991 people based on 9 people you met? Or would you at least consider that you just got a bad sample and ran into the Jersey Shore squad for some reason. Now, that's not to say that you wouldn't still be negatively affected even by such an insignificant sample - and indeed, many people are (because most people don't do math). However, the company will probably be fine with that. They know they can't control everything, and they'll focus on controlling the average - our good friend expected value, once again (albeit in a bit more complex form). For even if you happened to pick the 9 bad apples out of 230,000, they can rest assured that on average people will not have that same rotten luck. And averages, to them, are all that count because they know they have a large sample size. They have more than 1 customer. More than 10. More than 100, or 1,000, or even 10,000. It all evens out in the end.
Consider Charm Person cast by Edwin (a non-enchanter): it gives a +3 save bonus to the target. So, if Kagain has a save vs. spell of 8, his save vs. Charm Person is 5. On average he should save 80% of the time (i.e., only a roll of 1-4 fails). The expected number of failed saves is 4 out of 20. You got 2 or 3 out of 20, which is certainly consistent with the expected value (a binomial calculation shows a ~40% chance of 3 or fewer failed saves out of 20).
On the other hand, consider Spook cast by an 8th-level illusionist. The spell has a -4 save penalty, with an additional -2 save penalty for an illusionist casting an illusion spell, so Kagain's effective save is 14. You would expect 13 failed saves out of 20. The chance of 3 or fewer failed saves is now ~0.0006%.
As you can see, the spell being cast (and the caster) makes a big difference.
Your apparent situation is nowhere near as unlikely as the Spook example. With a straight save of 8 vs. spells, the chance of a failed save is 35% (7 of 20 failed saves expected). The odds of 3 or fewer failed saves are 4.4%. Not a likely outcome, but not exceedingly rare either. Think of it this way: if you ran this 20-roll test 20 times, you'd expect this result to happen about once.
Hence, a greater sample size is needed to make an assessment.
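The binomial figures above are easy to check yourself. A short Python sketch (math.comb is in the standard library; the per-roll fail chances are the ones quoted in the posts above):

```python
from math import comb

def p_at_most(k, n, p):
    """Probability of k or fewer failures in n trials, fail chance p per trial."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Charm Person vs. an effective save of 5: fail chance 4/20 = 0.20
print(p_at_most(3, 20, 0.20))   # ~0.41
# Spook from an illusionist, effective save 14: fail chance 13/20 = 0.65
print(p_at_most(3, 20, 0.65))   # ~6e-06
# Straight save of 8: fail chance 7/20 = 0.35
print(p_at_most(3, 20, 0.35))   # ~0.044
```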
And if you're primarily concerned about role-playing, then just dive into the story and don't worry about the mechanics. Is the Infinity Engine a perfect implementation of PnP rules? No, of course not. It's got some simplifications and outright deviations, but overall it feels like AD&D. So, fire up the game, roll up a character, and go talk to your foster father (after you fetch some Pepto for the cow).
Sometimes it works like that for saves too. Sometimes a character makes three or four saves in a row, and sometimes he fails the next few in a row.
Still, I think the mechanic tries to be fair. When you do the math, the results often don't seem too weird or unexpected. Also, specialist mage saving throw penalties are applied correctly. I tried this by creating a necromancer and casting Finger of Death repeatedly on a dwarf with a save vs. spell of -1. If it were just the spell's own penalty, the dwarf should always save: even if he rolls a 1, with -2 from the spell it becomes -1, and I have seen the "save vs. spell: -1" message. However, before even the tenth casting, he failed once, because a necromancer applies a further -2 to the save, so he has to save at -4. Thus any roll of 3 or better saves, but rolls of 1 and 2 fail (a 10% fail chance).
Note that a roll of 1 or a roll of 20 are not considered critical failures or successes when it comes to saving throws.
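Putting the mechanic described above into a few lines of Python (this is a reconstruction of the behavior described in this thread, not engine code): the d20 roll plus all modifiers is compared against the save value, and a natural 1 or 20 gets no special treatment.

```python
import random

def saving_throw(save_value, modifier=0):
    """Save succeeds if d20 + modifier >= save_value.
    A natural 1 or 20 is not an automatic failure or success."""
    roll = random.randint(1, 20)
    return roll + modifier >= save_value

# The Finger of Death test above: save vs. spell of -1, with a -2 spell
# penalty plus a -2 necromancer specialist penalty. Only rolls of 1 or 2
# fail, so roughly 10% of attempts should get through.
fails = sum(not saving_throw(-1, modifier=-4) for _ in range(100000))
print(fails / 100000)   # ~0.10
```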
Ok, please correct me if I'm wrong, but there are a few things involved in estimating the true rate from a sample. Basically, the claim you want to make is that you are a certain percent sure, *ci*, that the actual mean of the distribution is within a certain margin of error, *me*, of the mean calculated from the sample, *p*.
From this, the calculation is
me(z, p, n) = z * sqrt(p*(1-p)/n)
where n is the sample size and z is a number corresponding to the confidence level *ci*: z = 1.645 corresponds to ci = 90%, z = 1.96 to ci = 95%, and z = 2.58 to ci = 99%. So, using the 95% interval:
me(1.96, 3/20, 20) ≈ 0.16
me(1.96, 3/20, 50) ≈ 0.10
me(1.96, 3/20, 500) ≈ 0.03
So, for the sample of fifty, and assuming you still had the same ratio of fails, you can be 95 percent certain that the true rate is actually somewhere between 5% and 25%, i.e. 15% +/- 10%.
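In code, the same margin-of-error calculation looks like this (a sketch of the normal-approximation interval used above; the z values are the standard ones quoted in the post):

```python
from math import sqrt

def margin_of_error(z, p, n):
    """Normal-approximation margin of error for an observed proportion p over n trials."""
    return z * sqrt(p * (1 - p) / n)

p = 3 / 20          # observed fail rate from the 20-roll test
for n in (20, 50, 500):
    me = margin_of_error(1.96, p, n)   # z = 1.96 -> 95% confidence
    print(f"n={n:3d}: {p:.0%} +/- {me:.1%}")
```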
Computers can't roll dice, flip coins, or in any other such way generate truly random numbers. Instead, they produce pseudorandom numbers: deterministic sequences generated by an algorithm from a seed. The sequences are very long, but they are not infinite and eventually begin to repeat.
There are problems inherent in designing and implementing these generators. The algorithms used are mathematically complicated and have changed over time. We don't know what RNG algorithm or seed the BG program uses. I don't even think the current devs know that.
For detailed information on computer RNG, the wiki article is a good place to start:
https://en.wikipedia.org/wiki/Random_number_generation
There IS a pattern to the RNG's output, but the pattern is so complex that our practical ability to figure it out is zero.
You can treat it as effectively random.
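For illustration, here's a toy linear congruential generator in Python. It is entirely deterministic once seeded and its output eventually cycles, which is all the point above amounts to. This is not a claim about which algorithm BG actually uses.

```python
class TinyLCG:
    """Toy linear congruential generator: deterministic, seeded, periodic.
    Illustrative only -- not the algorithm any Infinity Engine game uses."""
    def __init__(self, seed):
        self.state = seed

    def next(self):
        # Small modulus so the sequence repeats after at most 2**16 draws.
        self.state = (self.state * 1103515245 + 12345) % 2**16
        return self.state

    def d20(self):
        return self.next() % 20 + 1

rng1 = TinyLCG(seed=42)
rng2 = TinyLCG(seed=42)
# Same seed, same "rolls": the sequence only looks random.
print([rng1.d20() for _ in range(5)] == [rng2.d20() for _ in range(5)])   # True
```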