05 December, 2013

Yeah, so... I'm not dead

I apologize for the extended absence.  My research and dissertation kicked up into high gear in April, and have refused to let go of my life.

I've also moved to a new part of California for research and got divorced... those are the big issues.  I'm going to try to get back to writing; I've got a few ideas for my next article about games and the OODA loop concept.

So, stay tuned.  Good stuff to come.

/endofline

19 January, 2013

Expanded Response to Jay Little's "Nerd Numbers: Terminal Outcomes"


This post is an expansion of my comment on Jay Little's article "Nerd Numbers: Terminal Outcomes" on the Gaming Security Agency's website.  Jay is the designer of great games like the Warhammer Fantasy Roleplaying Game.  You should seriously keep up with his "Painted Thumb" blog.  In this article, Jay describes how the dice pools in the new Edge of the Empire RPG (which he also designed) produce results on multiple axes, in contrast to the dice mechanics of most classic RPGs, which typically just produce results along a single axis.  You should read the article for the details, but the point is that results from a system that uses multiple axes provide exponentially more 'terminal outcomes' than uniaxial dice systems, which has implications for design outlook.  In the comments, we've started to discuss the implications of the design choice.

I'm not sure that more terminal outcomes are uniformly better.  Edge's dice system does handily produce a large variety of distinct/terminal outcomes, and this is an important feature of the game.  I still think there are some issues, though.  In his response, Jay refers to an "oversight", in that he didn't state that he wasn't referring to a single dice pool composition, but a generalization of the pool effect.  Further, he's somewhat dismissive of the need to understand the probability associated with the terminal outcomes, stating:

"This merely speaks to the potential for various Terminal Outcomes, not the probabilities of any single result."

I'm worried about these statements, because they do relate to design outlook, but, in my interpretation, not necessarily in the way I think Jay believes they do.  To explain my position, I need to take a little detour through decision making in games.  So keep reading, and please be patient, I'll get to the good stuff soon.

When making design choices, you are creating the way players interact with your game system, and players interact with a game system by making choices, such as "play a healer or a tank", or "Shoot the storm trooper or apply a medpack to my wounded friend", or "Throw a frag grenade or thermal detonator", etc.  Now, how do players decide between these options?  The same way we make any decisions as humans: We identify a goal, evaluate our options, and choose the option most likely to achieve the goal with the fewest negative consequences.  Simple, right?

It helps to break it down into some discrete parts.  On a very micro scale, a player takes the following path when deciding what actions to take:

Action Evaluation & Selection (AES) -> Dice Roll (DR) -> Outcome Resolution (OR)

In the AES step, a player reviews all (or a reasonable subset) of his available actions, and decides on one action or set of actions.  It is in this step that the player has the most control.

In the DR step, he throws the appropriate dice that correspond to his selected options, and includes any modifiers.   This is where the terminal outcome is produced/determined.  The player has no control in this phase, since there is no way to roll the dice to manipulate the outcome.

In the OR step, any further decisions that are called for (possibly none) are made by the appropriate parties, and then the outcome of the roll, the terminal outcome, is applied to the game.  This is where EotE players decide how to spend triumph and advantage, damage is applied to targets, etc.  

So, back to the original point, "how do we make decisions?"  Exactly as I stated above, we choose the option that's going to get us closest to our goal with the least negative consequences.  But how do we know which option that is?  We rely on our previous experience (observed outcomes of previous decisions, rolls, etc) and intuition (unobserved results that can be reasonably expected based on our understanding of the rules) to compare these.  Using this information, players evaluate the relative probabilities of the possible terminal outcomes for each of their options, and choose the option that best suits their goals.

How does this relate to the AES -> DR -> OR path?  Our knowledge about the relative probabilities of each of the terminal outcomes is determined by the DR step, and our knowledge about the impact of each of the terminal outcomes is determined by the OR step.  It should be clear, then, that the experience and intuition we derive from the DR & OR steps are absolutely critical for use in the AES step.

Now, when evaluating options that lead to only a few terminal outcomes (say, four, like in Jay's first example), it is easy to evaluate the value of each terminal outcome in the game.  However, if there are over two hundred terminal outcomes, this becomes vastly more difficult.  Further, you need more rules to interpret all of these additional outcomes.  The latter is less of a problem if you can apply general rules, but there's no getting around the first: As you increase terminal outcomes, a player's ability to compare the value of options decreases.
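To get a feel for how quickly the outcome space grows, here's a rough sketch that brute-force counts the distinct totals a small two-axis dice pool can produce.  The die faces below are completely made up for illustration (they are NOT the real Edge of the Empire dice), and the names HYPOTHETICAL_FACES and distinct_outcomes are mine, so treat the counts as a demonstration of the idea and nothing more.

```python
from itertools import product

# Hypothetical two-axis die: each face is (successes, advantages).
# These faces are invented for illustration; they are NOT the real EotE dice.
HYPOTHETICAL_FACES = [(0, 0), (1, 0), (1, 0), (0, 1), (1, 1), (0, 2)]


def distinct_outcomes(num_dice):
    """Enumerate every possible roll of num_dice dice and collect the distinct totals."""
    totals = set()
    for roll in product(HYPOTHETICAL_FACES, repeat=num_dice):
        successes = sum(face[0] for face in roll)
        advantages = sum(face[1] for face in roll)
        totals.add((successes, advantages))
    return totals


# Each added die widens the set of (success, advantage) totals a player has to weigh.
for n in range(1, 5):
    print(n, "dice ->", len(distinct_outcomes(n)), "distinct totals")
```

Even with one made-up die type and only two axes, the number of distinct results a player has to weigh climbs with every die added to the pool.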

Also, when we are presented with an option that has four terminal outcomes, and the generating method is relatively simple (e.g. roll a d20, apply a modifier), our minds have little trouble comprehending the relative likelihoods of the various outcomes by either intuition or experience.  However, in an exponentially more complex system like EotE's dice pools, our brains are NOT able to easily intuit the likelihoods of different outcomes, which leaves us to depend on previous experience.  But this method ALSO collapses because there are so many possible outcomes, we would need to have rolled an individual dice pool thousands of times to have a feel for how JUST THAT DICE POOL behaves.  Finally, Jay comments:

"And finally. I’m not a mathematician. That much should be clear. An understanding of math, probabilities and percentages is certainly helpful, but ultimately instincts, gut feelings, and intuition play as large a part as any other factor."

I think there are 2 possible interpretations of this: Jay is talking about either how he makes decisions while playing a game, or how he makes decisions while designing a game.  If it's the former, okay, that's cool if the game system behaves in the way that is intuitive based on its presentation.  Unfortunately, the EotE dice mechanic fails on this account.  This has been documented previously in the long thread on the FFG forums: The upgrade mechanism, which is portrayed as the 'king' of the dice pool modification mechanics, provides little benefit.  This has been shown to be true using not only theoretical calculations comparing the two mechanics AND large numbers of simulated results, but also empirically at my game table and others.

If Jay is referring to the latter situation, that he relies on gut feelings and intuition as much or more than math when he's designing games, then... 

I'm speechless.  

No, seriously, when I typed that, I just sat here with my jaw hanging open for a few seconds making vowel sounds.

I don't understand how you can tweak and balance and appropriately design an intricate system like a role-playing game, even with huge amounts of controlled play-testing, without a very solid understanding of the underlying dice mechanic.  I have HUGE amounts of respect for Jay because he's produced some of the best games I've played, and maybe he can write these rules off the cuff with an intuitive understanding, but I think we see examples of unforeseen, uh, hiccups in the system.  It also seems to me an odd sentiment to end a comment on after a math-heavy and self-described "nerdy" article about multidimensional outcome spaces.

I think outcomes on multiple axes are fucking fantastic!  I have no problem with this mechanical concept in games!  They add depth to the game and remove the admittedly boring action resolutions we've seen in previous games.  I love the dice mechanics of WFRPG, especially the way the rolled results could be spent on:

  • A few, clear options presented on the action card that varied between actions,
  • A few, clear options presented in the book that were relatively constant between actions, or
  • Narratively appropriate actions/effects

However, like any mechanic, it needs to be appropriately implemented in a game to live up to its full potential.  It's here where I worry a "go-with-your-gut" design method may have been a poor choice and created a situation where the mechanism may not be able to really shine.  The dice just don't seem to want to behave in the game, and frequently lead to some weird and game-slowing results.

Jay, if you're reading this, I hope this hasn't been too antagonistic.  I'm just being honest about how I see the dice mechanic.

Anyway, I think that's everything I needed to say to fully support my position.  This was way too long for the comment section on GSA, and actually gave me a chance to touch on a few upcoming subjects of future blogs, so it all worked out.  Hope to have the definition blog finished soon.

/endofline

EDIT: I mis-credited Jay as the designer of the second ed. of Descent: Journeys in the Dark, which was designed by Clark, Konieczka, Sadler & Wilson.  It has been fixed above.

13 January, 2013

My History with Games


In my last post, I spoke of my passion for games.  I’d like to spend a short post to explain my background with games to provide a view on why I am so fond of them.  Obviously, everything we write, do, think, believe is colored by our experiences, so I see no reason not to be forthcoming about mine which have led to these posts.  There’s also little reason to say I play games for reasons other than fun.  Fun really should be the reason anyone, anywhere games.  However, not all players find the same games fun, and those that do find the same games fun may not find them fun for the same reasons.  So when I say “I think this game is fun”, I find it to be a very superficial statement that bears some support. 

Games as a Mathematical Exercise

The math in most games is pretty simple, and should be simple so that the games are accessible and quick to play.  However, the probability structures in many of these games are substantially more complex than many players realize, though a full understanding of these systems isn’t really necessary to play the games.  I find that digging into what the dice are doing numerically, and how we interpret their results, is a great way to practice basic probability, and some methods I’ve applied to my simulation research projects originally came from thinking about dice results from WEG Star Wars, Axis & Allies, or Dungeons & Dragons.  I love this kind of analysis, and really, it’s a major reason I’ve started this blog.


Games as a Social Gateway

Prior to high school, both electronic and tabletop games provided me a measure of common experience and common language that allowed me to relate to my peers.  Growing up, I was socially withdrawn, and often felt I fit poorly into my environment.  This made the common ground created by my interest in games all the more valuable.  In grade school, I spent plenty of nights playing HeroQuest with friends and talking about Dragon Warrior.  In high school, discussing various flavors of X-Com occupied a lot of my time.  I also had a few fitful starts at trying to run a WEG Star Wars game during high school.  The plot never really went anywhere, but it had me spending time with my friends socially outside of school, which was a major step forward for me.

Gaming as a social activity expanded after I started college, and during my early university semesters, there was little difference between my social circle and gaming circle.  When I started veterinary school I was lucky to find a local gaming group that I meshed with very well.  And again when I returned to graduate school, I found friends in gaming that formed the basis of my social circle.  

While some people in this world can move to a new area and find friends with little effort, I am not one of them.  On all of these occasions, these games provided me with a critical foundation on which I could build a social network and find friends.  I don’t believe I could have done this without these games.

Games as Empowerment & Escapism

Escapism and empowerment are two reasons I play games, but I think it’s difficult to separate them since they feel like two sides of the same coin.  It is well established that games can be empowering: they allow us to take on a role we don’t usually occupy in ‘real life’, and, in this role, make decisions that have far-reaching consequences, albeit in an imaginary world.  By taking on a new role in this other world, we allow ourselves to escape our realities, for a brief time.  But beyond this first level of empowerment and escapism, I find a second level where these two are entwined.  Not only am I escaping the real world by engaging my mind in the rules of a fictional one, I’m personally immune from the consequences of the fictional decisions here in the real world.  Games can be incredibly liberating.  This has been a valuable outlet for me to have, and provided a place I could feel welcome, regardless of what was occurring in other parts of my life.

Games and my ADD

While a player’s objectives in a game are relatively constant, the environment in which the player pursues these goals is dynamic.  The options available to each player and the relative value of each option change with each player’s move.  For example:
  • The values of Monopoly properties change as the game progresses
  • Moves in chess may become open or blocked following an opponent’s move
  • Party members’ health pools in Final Fantasy games dwindle during combat
  • Buffs and procs occur during boss fights in WoW

This dynamic environment requires substantial attention to track, and the various sources of input are extremely stimulating.  This type of dynamic environment is extremely appealing to people with attention deficit disorder (ADD), like myself.  While we have substantial difficulty focusing on individual tasks, we tend to find great enjoyment when we can divide our attention between several sources of information simultaneously.  

I don’t think I recognized this appeal of games when I was younger, but certainly do now.  When I’m slaying a group of Draugr while avoiding traps in Skyrim or watching outbreaks and seeking cures in Pandemic, I enjoy a relaxed feeling.  It’s difficult to explain how stimulating activities can lead to relaxation, but in my case they do, and it’s a very comfortable way for me to enjoy myself.

Games as a Creative Outlet

Games, both electronic and tabletop, are interactive.  Truly, interactivity is one of the very few qualities all games must share to be considered games.  By their interactive nature, the player is able to influence the outcome.  By giving the player influence on the direction of events in a prefabricated setting, it becomes very easy for players to overlay their own narrative onto the events, which may further guide later player choices.  This is essentially what role-playing is: application of a narrative to otherwise purely objective dice rolls in a relatively arbitrary system of rules.  While this method of creating stories is very easy, it is also very shallow.  The magnitude of decisions and consequences is dictated by the narrator and the game system, and the results are decided by the interpretation of the dice by the narrator and players.  But nevertheless, I’ve found it to be an easy stepping off point for writing, a valuable creative outlet for me.

And such are the major highlights of why I find games so appealing.  Hopefully this history and set of opinions will provide some context for the arguments yet to come.  I'll be starting the actual discourse on the game shortly by examining what is necessary to make a game.

/endofline  

06 January, 2013

What Makes Games Great?


I love gaming.  It's great.  For so many reasons.  It's engaging.  It's stimulating.  It's social.  Playing a game is a chance to  escape reality and step into another life.  There are puzzles to solve, obstacles to overcome, decisions to make, foes to best and adventure to be had!

As you may have guessed, gaming (specifically tabletop gaming) is one of my favorite pastimes.  I've played a lot of games during my [almost] 33 years on this planet and they continue to occupy a substantial amount of my free time.  There are a number of reasons I've developed such a passion for gaming.  These reasons will be the topic of a future post, because I think they will help provide important context for these posts, but they are beyond the scope of this post.  When you spend a lot of time engaged in an activity, you naturally develop an appreciation for the activity and its facets.  Eventually, I started to wonder WHY I enjoy games so much.  This wondering led to curiosity, which in turn led to examination.

I don't think there's any media that hasn't been improved by objective examination.  Movies, music, literature, theater; all of these media have dedicated fields of study that led to substantial improvements in our understanding of the media that has subsequently led to an increase in quality of the authored material.  I find it reasonable to view tabletop games as another form of entertainment or educational media, depending on the game's purpose or context.  Following the fusion of these two ideas (tabletop games as media & objective evaluation leads to improved content), I believe that our tabletop games can benefit from similar objective scrutiny.  It's for this reason I want to start putting my thoughts and observations on the matter to text.

I really can't be sure how novel this work is going to be.  While I have found several previous studies and essays on gaming, they have all concerned themselves primarily with the individuals and groups of individuals, instead of the games themselves.  Since this is being treated as unknown territory, I expect the tone of the initial posts will be exploratory, and their primary purpose will be descriptive.  Overall, the purpose of these "What makes games great?" (WMGG) posts will be to first discuss the variety of current tabletop games and establish a functional vocabulary to succinctly describe their characteristics.  After we have an idea of the variety present in the games and we can easily communicate what we are looking at, we can start getting into more advanced ideas and concepts.

Because these posts will start as an exploratory exercise, I expect that some important points will be missed, mistakes will be made, and missteps taken, and for this reason I encourage my readers to provide constructive feedback on the material I post.  I am looking forward to seeing how the community receives these ideas.  Truly, what will be posted here is not intended to be a final product.  There's still a lot for me to read, a lot to learn, a lot to find, a lot to integrate.  These concepts will undoubtedly need to be refined and reworked, improving the quality of the idea with each iteration.

I would like to be clear that these posts are NOT "What makes great games".  I feel this would be placing the cart before the horse.  The goal of this exercise is to create or define a framework in which we can work to describe and understand games (i.e. "what makes games great").  With this information, we will hopefully have better tools with which to design new games (i.e. "what makes great games").  I believe it is entirely possible to make a good game or a great game [or,  for that matter, a horrible game] with any concept or any rules.  The rules or concepts simply need to appropriately fit the context of the game.  As an analogy, the former exercise is similar to the examination of styles, techniques, materials, tools, et cetera used to create paintings to understand how they influence the final product, whereas the latter is an exercise in application of the styles, techniques, materials, tools, et cetera to create a painting.  Without the former, the latter is much more difficult.

Further, the purpose of these posts is not an absolutely exhaustive exercise in categorization, or a dissection of games into some atomic form.  These goals would be futile, since the useful information would be lost to us in such an exercise; the scale of our examination will be tailored to suit our goals, instead of forced to conform to some arbitrary degree of consistency.

As a brief preview, some concepts I expect to address in this series are:
  • Cooperation vs competition
  • Symmetry
  • Stochasticity
  • Automation
  • Persistence
  • Degree of interaction

So, with all that, I'm undertaking this substantial project, but they say the longest journey starts with a single step.  It looks like I've just taken that step, now let's see where it goes.  I appreciate everyone and anyone who comes along for the ride.  I hope this is a productive exercise, and who knows, maybe we'll even make some progress.

/endofline

03 January, 2013

New year, new plan

I'm going to try to tackle these posts from a different angle this year.  Instead of massive posts that are a bitch to read and worse to write, I'm going to try breaking them into smaller chunks.  Hopefully I can get more out if I'm not exhausting myself with extended posts.

In this spirit, I'll end this post here.  See, it's working already!  =D

/endofline

24 September, 2012

Now for something completely different

I want to take a break from all these heavy numbers and statistics to talk about video games for a second.  What I'm talking about may be yesterday's news, but... shut up.  There's enough news here to justify talking about it.

X-COM: Enemy Unknown

In X-COM: Enemy Unknown, you are placed in command of an elite agency of soldiers and scientists tasked with protecting the world from an invasion by an unknown alien menace.  This game was announced about 8 months ago, and caused the internet to collectively shit its pants.  The original X-COM (tagline: "UFO Defense" or "Enemy Unknown", depending on where in the world you were when it came out) was released originally by Microprose in 1994, almost two decades ago.  Since then, it has been widely hailed as one of the best games of all time, and I strongly agree.  As has happened with so many of our favorite IPs in the last 10 years, X-COM has been "reimagined" by a new developer: Firaxis.

Important note: this is not the other rebooted X-COM game under development at 2K Marin.

The game is due out on 9 Oct in North America (12 Oct elsewhere), and Firaxis just released a playable demo today on Steam.

After playing the demo, any doubts that this game could be anything less than amazing died faster than a panicking rookie armed with a 9mm pistol.

Like the original, the game is comprised of a tactical and a strategic portion.  During the game's tactical engagements, you are responsible for using your squad of soldiers to eliminate the alien threat in an urban or natural environment.  In the strategic portion, you make decisions about funding allocation, research, soldier load-out and promotion, etc.

The demo consists of 2 tactical missions, and the chance to make some meaningless research decisions and promote a unit back at the base.  We get to see some basic interfaces and systems that are critical to playing the game.  Just enough to whet our appetites, and make us froth at the mouth like rabid badgers for two weeks while we wait for the damn thing to be released for reallies.

Even though it's just a tutorial, the first mission perfectly encapsulates the "X-COM Experience": Brutality.  I won't give away what that actually means; it would be unfair of me to ruin it for the uninitiated, but veterans of the original can guess what happens.  The effect is blunted a little bit because, like so many other tutorials, it tells you/forces your troop movements, including a move no veteran would ever make.  But younger players will happily march right in without thinking.

The second mission gives you complete control except for the first few moves.  It introduces some of the aliens you'll be fighting as you try to defend the earth in the full game.  I lost my assault trooper in one round, even though I was being careful, and tossed a smoke grenade, but my other guys made it out w/o a scratch. I can only hope the rest of my troops will be able to be so lucky.

Anyway, make no mistake, I think this will be the gotta-have-it game of the year.


Faster-Than-Light

Faster-Than-Light, or FTL, is a crowd-funded indie game I picked up late last week via a sale somewhere (can't remember) that was released earlier this month.  The game is considered to be "Rogue-like" in that it is a top-down view of several rooms.  While they are very clean and refined, the graphics look like they would be at home on a 486 DX2 and an 800 x 600 monitor, but don't let that fool you; this game is a perfect example of why game budgets should concentrate on design, not graphics.

You are given command of a crew and starship and are tasked with delivering the plans for a super-weapon to your commanders 8 sectors away.  The primary interface is a top down view of the ship (This supposedly makes it "rogue-like"), which is divided into rooms/compartments, which your crew-members occupy.  Most compartments contain vital ship systems (weapons, helm, shields, engines, etc). Crew members occupying these compartments interact with these systems to give them small boosts, repair them, or receive bonuses from them.  You have direct control over the ship's navigation, energy distribution, modification, and attack strategy, in addition to control over where your crew members are stationed.  

The game play is really very simple and quick, but the real meat of the game is the tactical and strategic decisions that lie just below the surface.  The different play styles that are possible given the variety of available ship modifications are extremely diverse.  You are able to attack enemy ships with weapons, employ a variety of drones, teleport your crew onto enemy ships to sabotage systems or kill their crew, or FTL jump away from combat if you find yourself at a disadvantage.  At the same time you will have to protect your ship from enemy weapons fire, fend off enemy boarding actions, repair hull breaches, and avoid asteroids and solar flares.

The game's difficulty has been well-tuned to be very difficult, but not impossible, even on the "easy" setting.  You don't know what you're getting into with each encounter until you're deep in it.  There's also no "reload last save" option; when you die (all your crew are dead, or the ship explodes), you failed, and have to start over from the beginning.  Game over.  This means a lot of play-throughs end well before the ship reaches its destination.  But given the wide variety of play options available, restarting isn't all that bad.  You can use what you learned in the previous attempt to make it a little further.

If you are able to survive all of your encounters, you will be "rewarded" with a boss to defeat in the final sector.  I almost shit a brick when I discovered I would have to fight the boss not once, but three times.  Each time it had a new configuration that required an entirely different approach to be successful.  Out of 20-30 play-throughs on easy, I have beaten the game once.  And about half a second after I scored the final blow on the boss, my ship took a fatal blow.  But when I saw I had won after all, I was stoked!

This game is not going to be for everyone, given that it really is pretty damned hard: you can play through great and lose on random encounters, or make it to the end only to discover you don't have a chance in hell at beating the boss.  All this on top of no "reload from last save" option.  But if you're able to strap on your big-boy space pants, and learn to take your losses with your wins, there is a ton of fun to be had in this game.

As an aside, FTL bears a striking resemblance to a board game known as Space Alert.  SA is one of my gaming group's all time favorites.  The game consists of 2-5 players cooperatively moving around a starship and activating consoles (shields, weapons, etc) to deal with threats trying to destroy the ship inside and out.  The game is also pretty brutal when you move beyond the easiest levels.  If you like this game, check out Space Alert if you can find a copy.  And if you like Space Alert, you're already 90% of the way to enjoying FTL.


/endofline

22 September, 2012

A quick primer on statistics, pt 2. Inferential Stats and Simulation

Last time I talked about statistics, I limited my discussion to the statistics used to describe the distribution of results from random processes.  These methods are the fundamental parts that can be assembled to understand the stat methods used to understand unknown parameters, and differences between unknown parameters.

What follows below is a whirlwind tour of what is essentially at least a quarter-long class in upper division undergraduate statistics.  Again, Wikipedia and Khan Academy are great resources to learn more.

Inferential Statistics

Inferential statistics describes the set of methods used to estimate the unknown parameters of random events.  The most common of these methods rely on observed data to produce estimates of these unknown parameter values.

A classic example of the use of inferential statistics is to estimate the probability of heads for an unfair coin, i.e. a coin that may not come up heads as frequently as it comes up tails when flipped in the air.  Another similar example (more applicable) would be to calculate the probability that an Edge of the Empire dice pool would produce more successes than failures.

Presume we have no reliable way to calculate how often this coin will come up heads based on its physical qualities, and need an alternate method to estimate this probability.  Essentially, we have the following situation, expressed in the notation I explained before:

Pr(Heads) = p

But we do not know the value of p, beyond the fact it lies between 0 (it never comes up heads, the probability of heads is 0%) and 1 (it always comes up heads, the probability of heads is 100%).

Now that we've identified the problem, and what we're trying to find (the value of p), we will make some assumptions to VASTLY simplify our problem:
  1. p is constant during the experiment, i.e. p does not change value between flips.
  2. The results of each flip are independent, i.e. the results of one flip do not affect the results of any other flip.
  3. The only possible outcomes for each flip are heads or tails.
  4. Every flip produces a valid outcome (either heads or tails).
  5. The variable X (the total number of heads in a set of n trials) has a binomial distribution, with parameters p and n.
The first two are basic, and I'm not going to discuss them with any depth, but in statistics we call this iid (independent and identically distributed).  The third and fourth assumptions allow us to make the fifth, which states that we will assume that X conforms to the binomial distribution.  This is a very commonly used distribution when we want to calculate the probability of an event.  Technically, the number of heads produced from a number of flips, X, is what is truly binomially distributed (as stated above), not the probability, p, but I'll reconcile this in a moment.
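As a quick illustration of what assumption 5 buys us, here's a small sketch of the binomial probability function.  The function name binomial_pmf and the value p = 0.5 are my own choices for the example; remember that in the coin problem, p is exactly the thing we don't know.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x heads in n independent flips, each with Pr(Heads) = p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# p = 0.5 is an assumed value for illustration only; the real p is unknown.
n, p = 20, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(sum(probs))             # the probabilities of all possible X values add up to 1
print(binomial_pmf(7, n, p))  # chance of exactly 7 heads in 20 flips if the coin were fair
```

Notice the function takes p as an input and tells us about X.  Inference runs the other direction: we observe X and work back to a guess at p.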

With the distribution defined, we have a paradigm to work within, and useful defined equations to produce parameter estimates.  The distribution has two parameters: the number of trials (in this case, each flip is a 'trial') and the probability of the trial being a success (in this case, success is the flip coming up heads).  Note that this second parameter is exactly what we are interested in estimating: Pr(Heads) = p.  We also have control over the number of trials we perform, n.  Since the expected value of X is n times p, it can be shown that a good estimate of p is the number of successes divided by the number of trials performed.  Essentially:

X/n = estimate of p = estimate of Pr(Heads)

This formula represents 2 different concepts:
  1. X/n is the proportion of trials in our experiment that came up heads.
  2. X/n is our estimate of the probability of a single trial in our experiment coming up heads, Pr(Heads).
These two concepts are equivalent: The proportion of successes on all trials may be interpreted as the probability of success on a single trial.  This will be important below.

All we need now is the data.  To generate the data, we have to perform an experiment, in which we can simply take the coin, and flip it 20 times.  Or 100 times.  Or 100,000 times.  But let's start out small, with 20 coin flips, and we'll say this produced 7 heads, so X = 7.  Now we can calculate our estimate of p:

7 heads total/20 trials = .35 = Pr(Heads)

This is essentially the scenic route to do exactly what you would have done anyway to figure this out.  But by giving the justification and walking through these steps we have a simple example that shows our route to get what we need: an estimate, or inference of the value of a previously unknown and incalculable parameter in a situation where we don't fully understand the underlying mechanism that produces the results.
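For anyone who prefers to see it as code, the calculation above can be sketched in a few lines of Python.  The function names estimate_p and flip_coin are made up for this example, and flip_coin takes a true_p value that we would NOT know in a real experiment; it is only there so you can generate pretend data.

```python
import random

def estimate_p(flips):
    """Point estimate of Pr(Heads): the proportion of flips that are heads.
    Each flip is recorded as True (heads) or False (tails)."""
    return sum(flips) / len(flips)

def flip_coin(true_p, n, rng):
    """Simulate n flips of a coin that comes up heads with probability true_p.
    In a real experiment true_p is unknown; it is assumed here for play."""
    return [rng.random() < true_p for _ in range(n)]

# The worked example from the text: 7 heads in 20 flips.
observed = [True] * 7 + [False] * 13
print(estimate_p(observed))

# A pretend experiment with an assumed true_p, to see how estimates vary.
rng = random.Random()
print(estimate_p(flip_coin(0.35, 20, rng)))
```

Run the last line a few times and you will typically see a different estimate each time, even though the assumed true_p never changes.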

Point Estimates and Sample Sizes

The value reported above, p = 0.35, is a point estimate of the probability of heads.  Point estimates are a measure of centrality, and indicate the most likely value of the parameter, given the data.  If you look back to the previous post, you should also be reminded of the difference between an estimate and a parameter, and see that this is an estimate.  Now, if we repeated the experiment (flip a coin 20 times, count the total number of heads), we may get different point estimates.  This further shows that the result is not necessarily the parameter value.  

If we wanted to be more confident about our estimate, we could increase our sample size by increasing the number of times we flip the coin.  Many curious minds may ask "why does increasing our sample size increase our confidence about the estimate?" which is a great question.  The details are beyond the scope of this discussion, so I'll simply invoke the Weak Law of Large Numbers, which roughly states that as the sample size increases, the observed mean converges on the actual expected value.  So larger samples tend to produce more reliable (but not necessarily perfect) estimates of parameters.
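A quick way to watch this happen is to fake the coin on a computer.  This is only a sketch: the function name proportion_heads, the seed, and the "true" value TRUE_P = 0.35 are all assumptions I picked for the demonstration, since in a real experiment the true value is exactly what we don't know.

```python
import random

def proportion_heads(true_p, n, rng):
    """Flip a simulated coin n times and return the proportion of heads."""
    heads = sum(rng.random() < true_p for _ in range(n))
    return heads / n

rng = random.Random(2013)  # fixed seed so the run is repeatable
TRUE_P = 0.35              # assumed "real" value, unknown in practice

# Larger samples should tend to land closer to TRUE_P.
for n in (20, 100, 1_000, 100_000):
    est = proportion_heads(TRUE_P, n, rng)
    print(f"n = {n:>7}: estimate = {est:.4f}, error = {abs(est - TRUE_P):.4f}")
```

The error will not shrink perfectly smoothly from one line to the next (it is random, after all), but the large samples should sit much closer to the assumed value than the small ones.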

Simulation or: "How I Learned to Stop Worrying and Love The RNG"

So, we have shown how to estimate a parameter based on observed data from experiments we have performed when we do not understand the underlying mechanism that produces the result.  Now, let's re-examine our 1d6 example from yesterday.  Let's say we wanted to find the probability of rolling a 5 or 6 on any roll.  In this case, we do understand the underlying distribution that produces the results: there is a 1/6 chance of producing each of 1, 2, 3, 4, 5, or 6 on any roll of the die.  We could use our knowledge of expected values to find the parameter value (in this case it would be 1/3), but that would be boring!

Instead, we use what we just learned about the binomial distribution and inferential statistics to perform an experiment.  We roll 1d6, physically, 20 times, and get 8 rolls that came up a 5 or a 6.  Based on what we did above, this would lead us to estimate that there is an 8/20 = .4 chance that we roll a 5 or 6 on a die.  [Note that it would be impossible to arrive at the REAL probability of 1/3 based on this experiment, since no whole number of successes out of 20 rolls equals exactly 1/3].

Now if we were to desire a more reliable estimate, we could continue to roll the die many more times, recording each result and calculating the overall proportion of trials that produced 5s or 6s, which we can interpret as the probability of any roll coming up a 5 or a 6.  However, this method becomes rather tedious, and we have other tools at our disposal to automate this process.

With some code, we can create a program that will randomly select a value from the set {1, 2, 3, 4, 5, 6}, each with 1/6 probability, which is exactly the distribution we are sampling from, and calculate the proportion of results that are 5's or 6's, which we have established can be interpreted as a probability.  This is known as Monte Carlo sampling, and relies on the computer's (pseudo)random number generator to randomly sample from known distributions to estimate parameter values.  By invoking the weak law of large numbers, the results of such a simulation should produce parameter estimates that converge to the actual expected values.  This requires no explicit calculation of expected values, which can become very complex in some situations, and much larger sample sizes can be produced in much less time than in similar physical experiments.  It simply requires that we have a very good understanding of the underlying distributions.

Technically, the computer is unable to produce truly random numbers, but today's pseudo-random number generators are so good that, for our purposes, there is practically no difference.
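Here is a minimal sketch of that 1d6 simulation in Python, using the standard library's random module.  The function name simulate_five_or_six and the sample sizes are my own choices for illustration, not anything special.

```python
import random

def simulate_five_or_six(n_rolls, rng):
    """Roll a simulated d6 n_rolls times and return the proportion
    of rolls that came up 5 or 6."""
    hits = 0
    for _ in range(n_rolls):
        roll = rng.randint(1, 6)  # each face 1..6 is equally likely
        if roll >= 5:
            hits += 1
    return hits / n_rolls

rng = random.Random()
for n_rolls in (20, 1_000, 1_000_000):
    estimate = simulate_five_or_six(n_rolls, rng)
    print(f"{n_rolls:>9} rolls: Pr(5 or 6) is about {estimate:.4f}")
```

With 20 rolls the estimate can wander quite a bit, just like the physical experiment above.  With a million rolls it should sit very close to the calculated value of 1/3, and it takes about a second instead of an afternoon.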


Confidence Intervals, Hypothesis Testing, and Simulation

Typically, the purpose of invoking inferential statistical methods is to estimate parameters that are unknown and cannot be calculated, or to estimate the difference between two or more parameters.  The former of these is typically done by calculating confidence intervals (CI's) from observed data and the latter is done using hypothesis testing.  Really these are two sides of the same coin.  What you need to know is that as the sample size, n, increases, the confidence intervals become narrower (to represent that the parameter estimates are more reliable) and observed differences between parameter estimates are more likely to be found statistically significant, because larger samples can reliably detect smaller differences.

The term "p-value" comes into play at this point, and it is a frequently recognized and frequently poorly understood concept, even by professionals that use statistics on a daily basis.  For the purposes of this discussion, people passingly familiar with this concept need to understand that everything I say about hypothesis testing holds true for p-values as well.

Back to the point! Which is: our ability to hypothesis test for a difference in estimates is dependent, at least in part, on our sample size.  This means that as we are able to make our sample in simulations arbitrarily large, CIs and hypothesis testing become fucking useless: the intervals shrink toward a single point, and even trivially small differences test as significant.  Further, the CIs are used to describe the uncertainty and distribution of the mean of the distribution, and not the distribution itself.
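To put some numbers on how fast the intervals shrink, here is a rough sketch.  I am assuming the common normal-approximation ("Wald") 95% interval for a proportion, which is only one of several ways to build a CI, and the function name wald_interval is made up for this example.  The estimate is held at 0.35 so the only thing changing is n.

```python
import math

def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.
    z = 1.96 corresponds to roughly 95% confidence."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clamp to [0, 1], since a probability cannot fall outside that range.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

# Same point estimate, increasingly large samples.
for n in (20, 100, 10_000, 1_000_000):
    low, high = wald_interval(0.35, n)
    print(f"n = {n:>9}: ({low:.4f}, {high:.4f})  width = {high - low:.4f}")
```

The width is proportional to 1 over the square root of n, so a simulation with a million samples reports an interval so tight that it tells you almost nothing about the spread of the results themselves.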

Enter the Probability Interval

Probability intervals are a concept I was first introduced to in studying Bayesian statistics (seriously, don't worry about it), and are similar to Bayesian credible intervals.  They are typically defined as the narrowest interval that contains XX% of the observations from the entire distribution of observations.  They are derived from the raw observed (or simulated) data, and briefly describe the entire data set, not just the mean (as CI's do).  They become more reliable as sample sizes increase, but do not become substantially narrower as sample size increases.  This makes them ideal for discussing and reporting simulation data.
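As a sketch of the "narrowest interval that contains XX% of the observations" definition, the following Python sorts the data and slides a window across it.  The function name probability_interval and the 2d6 example data are assumptions of mine for illustration; this is one straightforward way to do it, not the only one.

```python
import math
import random

def probability_interval(observations, coverage=0.90):
    """Return (low, high) for the narrowest interval that contains at
    least `coverage` of the observations."""
    data = sorted(observations)
    n = len(data)
    # Number of observations the interval must contain.  The tiny
    # subtraction guards against floating-point round-up (e.g. 9.0000001).
    k = max(1, math.ceil(coverage * n - 1e-9))
    best = None
    # Slide a window of k consecutive sorted values; keep the narrowest.
    for i in range(n - k + 1):
        low, high = data[i], data[i + k - 1]
        if best is None or high - low < best[1] - best[0]:
            best = (low, high)
    return best

# Assumed example data: the sum of 2d6, simulated 10,000 times.
rng = random.Random()
sums = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(10_000)]
print(probability_interval(sums, coverage=0.90))
```

Try it with 10,000 and then 1,000,000 simulated rolls: the interval should stay about the same width, because it describes the spread of the results, not the uncertainty of the mean.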

Some things to remember about PI's:

  • PI's are not centered on the mean, since mean is not used to calculate them in any way
  • PI's are not symmetric around the median or the mode, since distributions may be asymmetric.
  • PI's do not rely on or assume an underlying distribution (CI's typically rely on the normal distribution)
  • PI's may be reported with different %'s, e.g. 90% PI means the PI covers 90% of the observations, and a 95% PI would cover 95% of the observations.
  • PI's are only a synopsis, there is information lost when ONLY a PI is reported.  Full histograms are usually necessary to fully visualize a distribution.
Alright... That's enough for now.  With all the tools I need at least mentioned, even though nobody really cares, I can start talking about what I really want to talk about:

The probability implications of the Edge of the Empire dice system... FINALLY!!!

/endofline

EDIT: Sorry for the delay on this post.  It was sitting at around 90% finished most of the week, but I fell into EotE forum discussions and FTL... Which is AWESOME!!! TRY IT!!!  BUY IT!!!