# Probability Cheat Sheet: Rules, Laws, Concepts, and Examples

Written by: Nathan Rosidi

This probability cheat sheet equips you with knowledge about the concept you can’t live without in the statistics world. Yes, it’s probability!

Probability is one of the fundamental statistics concepts used in data science. It’s an essential part of machine learning, and you should understand it thoroughly before you jump to writing algorithms and building your models.

To help you with that, we prepared this probability cheat sheet with examples.

## What is Probability?

Let's begin our probability cheat sheet by exploring the concept of probability.

Definition: Probability is a mathematical concept for measuring the likelihood of an event occurring. In other words, probability quantifies how likely an event is to occur on a scale from 0 to 1. Zero means impossibility, and 1 indicates certainty. Probability is expressed as a fraction, decimal number, or percentage.

Formula: The probability of an event occurring is calculated like this.

$P = \frac{Number\ of\ Favorable\ Outcomes}{Total\ Number\ of\ Outcomes}$

Example: In flipping a coin, these are the values for calculating the probability of getting a head.

Number of Favorable Outcomes = 1 (there’s one head)
Total Number of Outcomes = 2 (there’s one head and one tail)

The probability is:

$P = \frac{Number\ of\ Favorable\ Outcomes}{Total\ Number\ of\ Outcomes} = \frac{1}{2} = 0.5 = 50\%$

## What Are the Rules of Probability?

Next, we will explore the rules in our probability cheat sheet.

Definition: The probability rules are the basic principles defining the foundation of probability.

#### 1. Addition Rule

Definition (mutually exclusive events): Mutually exclusive events are those that can’t occur at the same time. In that case, the probability of one event or the other occurring is the sum of their individual probabilities.

Formula (mutually exclusive events):

$P(A\cup B) = P(A) + P(B)$

Note:

$P(A\cup B)\ = P(A\ or\ B)$

Example (mutually exclusive events): Consider a single toss of a coin. Two events can occur:

A = getting heads
B = getting tails

We can calculate the probability of getting heads or tails in one coin toss. Let’s first calculate the probability for each event.

$P(head) = P(A) = \frac{Number\ of\ Favorable\ Outcomes}{Total\ Number\ of\ Outcomes} = \frac{1}{2} = 0.5 = 50\%$

$P(tails) = P(B) = \frac{Number\ of\ Favorable\ Outcomes}{Total\ Number\ of\ Outcomes} = \frac{1}{2} = 0.5 = 50\%$

Use these values in the probability formula. The probability of getting heads or tails in one coin toss is 100%.

$P(head\cup tails) = P(A\cup B) = P(A) + P(B) = \frac{1}{2} + \frac{1}{2} = 1 = 100\%$

Definition (mutually non-exclusive events): Two events are mutually non-exclusive if they can occur simultaneously. In that case, the probability of either event occurring is the sum of their individual probabilities minus the probability of both events occurring together.

Formula (mutually non-exclusive events):

$P(A\cup B) = P(A) + P(B) - P(A\cap B)$

Note:

$P(A\cap B) = P(A\ and\ B)$

Example (mutually non-exclusive events): In this question by Meta, you need to calculate the probability of pulling a different color or shape card from a shuffled deck of 52 cards.

We interpret this as drawing a second card whose color or shape (suit) differs from that of the first card. Pulling a card of a different color and pulling a card of a different shape are mutually non-exclusive events. We’ll call them events A and B:

A = selecting a card with a different color
B = selecting a card with a different shape

Let’s calculate the probability of event A. A deck of cards has 52 cards; 26 are black, and 26 are red. If we pull one card out randomly, 26 cards of the other color will still be left, i.e., the number of favorable outcomes. In total, there are 51 cards left, i.e., the total number of outcomes.

$P(A) = \frac{Number\ of\ Favorable\ Outcomes}{Total\ Number\ of\ Outcomes} = \frac{26}{51} = 0.5098 = 50.98\%$

Now, the probability of event B. Before drawing a card, there are 13 cards of each of the four shapes: clubs, diamonds, hearts, and spades. After we draw one card, there are 3 remaining different shapes, each with 13 cards. Again, 51 cards are left after the first draw.

$P(B) = \frac{Number\ of\ Favorable\ Outcomes}{Total\ Number\ of\ Outcomes} = \frac{3*13}{51} = \frac{39}{51} =\frac{13}{17} = 0.7647 = 76.47\%$

Now, we need the probability of pulling a second card of both a different color and a different shape. In other words, the probability of events A and B occurring at the same time. Two shapes (spades and clubs) are black, and two (hearts and diamonds) are red. After the first draw, only two shapes of the other color remain, with 13 cards each. The total number of outcomes is 51.

$P(A\cap B) = \frac{Number\ of\ Favorable\ Outcomes}{Total\ Number\ of\ Outcomes} = \frac{2*13}{51} = \frac{26}{51} = 0.5098 = 50.98\%$

So, to answer the question, we need to plug these values into the mutually non-exclusive events probability formula. The probability of pulling a different color or shape card from a shuffled deck of 52 cards is 76.47%.

$P(A\cup B) = P(A) + P(B) - P(A\cap B) = \frac{26}{51} + \frac{39}{51} - \frac{26}{51} = \frac{39}{51} = \frac{13}{17} = 0.7647 = 76.47\%$
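As a sanity check, the same result can be obtained by brute force. The sketch below is not part of the original question; it assumes the interpretation used above (two cards drawn without replacement, the second compared with the first), and the names `SUIT_COLOR`, `deck`, and `prob` are chosen here purely for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Map each shape (suit) to its color.
SUIT_COLOR = {"clubs": "black", "spades": "black", "hearts": "red", "diamonds": "red"}

# A standard deck: 13 ranks for each of the 4 suits.
deck = [(rank, suit) for suit in SUIT_COLOR for rank in range(1, 14)]

# Every ordered draw of two distinct cards: 52 * 51 outcomes.
draws = list(permutations(deck, 2))


def prob(event):
    """Share of ordered two-card draws for which event(first, second) holds."""
    favorable = sum(1 for first, second in draws if event(first, second))
    return Fraction(favorable, len(draws))


def diff_color(c1, c2):
    return SUIT_COLOR[c1[1]] != SUIT_COLOR[c2[1]]


def diff_suit(c1, c2):
    return c1[1] != c2[1]


p_a = prob(diff_color)
p_b = prob(diff_suit)
p_a_and_b = prob(lambda c1, c2: diff_color(c1, c2) and diff_suit(c1, c2))
p_a_or_b = prob(lambda c1, c2: diff_color(c1, c2) or diff_suit(c1, c2))

print(p_a, p_b, p_a_and_b, p_a_or_b)  # 26/51 13/17 26/51 13/17
```

The enumerated value of P(A or B) matches P(A) + P(B) - P(A and B), which is exactly what the addition rule for mutually non-exclusive events states.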

#### 2. Rule of Complementary Events

Definition: The probability of an event not occurring is equal to one minus the probability of that event occurring. In other words, the sum of the probabilities of an event and its complement (the event not happening) is always one.

Formula:

$P(A') = 1 - P(A)$

A’ = The complement of the event A, i.e., the event A not occurring

Example: Let’s modify this question by Belvedere Trading a little. Sure, we’ll calculate the probability of getting a 6 when rolling a die. But we’ll also use the rule of complementary events to calculate the probability of not getting a 6.

A die has six sides, so the total number of outcomes is 6. We want to get the number 6 when rolling a die, so the number of favorable outcomes is 1.

$P(A) = \frac{Number\ of\ Favorable\ Outcomes}{Total\ Number\ of\ Outcomes} = \frac{1}{6} = 0.1667 = 16.67\%$

So, 16.67% is the probability of getting the number 6. This answers the interview question.

$P(A') = 1 - P(A) = 1 - \frac{1}{6} = \frac{5}{6} = 0.8333 = 83.33\%$

This result means that the probability of not getting a 6 when rolling a die is 83.33%.

#### 3. Rule of Conditional Probability

Definition: The rule of conditional probability explains the probability of an event occurring, given that another event has already occurred.

In other words, it describes how the likelihood of an event might change based on the occurrence of a related event. Under conditional probability, we're interested in the probability of one event (Event A) happening if a related event (Event B) has already taken place.

With this rule, we can adjust our calculations based on the new information provided by the occurrence of Event B.

Formula:

$P(A\ |\ B) = \frac{P(A\ \cap\ B)}{P(B)}$

Note:

$P(A\ |\ B) = P(A\ given\ B)$

Example: This is a question from the Goldman Sachs interview. It gives you its own definitions of certain values that we can use to calculate the probability that Company X goes bankrupt before T.

It’s a conditional probability. We can use the above formula to plug in all the values given to us by the question.

Let’s first define the events.

A = company X goes bankrupt between T and T+dT
B = company X doesn’t go bankrupt until T

The question asks to find the probability that the company does go bankrupt. It’s a complementary event to the event B. So, we want to find P(B’).

The question defines the conditional probability as K*dT. Therefore:

$P(A\ |\ B) = K*dT$

So, the probability of the complementary event is this.

$P(B') = 1-P(B)$

From that, we know this is also true.

$P(B) = 1-P(B')$

Again, this is the conditional probability formula.

$P(A\ |\ B) = \frac{P(A\ \cap\ B)}{P(B)}$

Let’s plug the above values into it.

$K*dT = \frac{P(A\ \cap\ B)}{1-P(B')}$

From there, we can get P(B’). It’s the probability that Company X will go bankrupt before T.

$P(B') = \frac{K*dT - P(A\ \cap\ B)}{K*dT}$
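The last step is plain algebra. As a sketch of the rearrangement, using only the symbols already defined in this example:

$K*dT = \frac{P(A\ \cap\ B)}{1-P(B')} \ \Rightarrow\ 1-P(B') = \frac{P(A\ \cap\ B)}{K*dT} \ \Rightarrow\ P(B') = 1 - \frac{P(A\ \cap\ B)}{K*dT} = \frac{K*dT - P(A\ \cap\ B)}{K*dT}$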

#### 4. Multiplication Rule

Definition (independent events): The multiplication rule determines the probability of two events happening together. If the two events are independent, then the occurrence of one doesn’t influence the occurrence of the other. In that case, the probability of both events happening together is the product of their individual probabilities.

Formula (independent events):

$P(A\cap\ B) = P(A)\cdot P(B)$

Example: Let’s assume that you’re tossing a coin. You want to calculate the probability of getting tails in two consecutive tosses. These events are independent, as getting tails the first time doesn’t change the probability of getting tails the second time.

So the probability is 25%.

$P(A\cap\ B) = P(A)\cdot P(B) = \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{4} = 0.25 = 25\%$

Definition (dependent events): Two events are dependent when the occurrence of one does influence the occurrence of the other event. In that case, the probability of both events happening together is the product of the probability of the first event and the conditional probability of the second event, given that the first event has occurred.

Formula (dependent events):

$P(A\ \cap\ B) = P(A)\cdot P(B\ |\ A)$

Example: Here’s a question that appeared in the Zenefits, Pinterest, and Quora interviews.

We need to calculate the probability that one number will double the other if we pull two random cards numbered from 1 to 100.

First, let’s define the events.

X1 = the number of the first card pulled from the deck
X2 = the number of the second card pulled from the deck

There are two scenarios when one number will double the other.

Scenario 1: The second card is double the value of the first card. For that to happen, X1 has to belong to the set {1, 2, …, 50}, so:

$X_2 = 2*X_1$

Otherwise, X2 can’t be a double of X1.

Scenario 2: The second card is half the value of the first card. This means that X1 had to be an even number, i.e., divisible by 2. In that case,

$X_1 = 2*X_2$

From the two scenarios come the following events.

A = X1 belongs to the set {1, 2, …, 50}
B = X1 is an even number
C = X2 is double the value of X1
D = X2 is half the value of X1

We’ll now calculate the probability for each of the two scenarios separately.

Scenario 1: This scenario involves dependent events A and C. We’ll use the multiplication rule to get the probability the following way. There are 50 favorable outcomes for the A event. Given A, there’s only one favorable outcome for event C, i.e., only one card (of the remaining 99) can be double the value of the first card.

$P(second\ card\ doubles\ the\ first\ card) = P(2*X_1 = X_2) = P(A)\cdot P(C\ |\ A) = \frac{50}{100}\cdot \frac{1}{99} = \frac{50}{9900} = \frac{1}{198}$

Scenario 2: In this scenario, we have events B and D. We’ll apply the same formula as in the previous scenario. Again, there will be 50 favorable outcomes for the event B. Given B, there’s only one card that can be half the value of the first card, i.e., one favorable outcome for event D.

$P(second\ card\ is\ half\ the\ first\ card) = P(X_1 = 2*X_2) = P(B)\cdot P(D\ |\ B) = \frac{50}{100}\cdot \frac{1}{99} = \frac{50}{9900} = \frac{1}{198}$

To answer the question, we need to sum the probabilities of Scenario 1 and Scenario 2 using the addition rule.

$P(one\ card\ doubles\ the\ other\ from\ the\ deck) = P(Scenario\ 1) + P(Scenario\ 2) = \frac{1}{198} + \frac{1}{198} = \frac{2}{198} = \frac{1}{99} = 0.0101 = 1.01\%$

If we pull two random cards numbered from 1 to 100, there’s a 1.01% probability that one card will be double the other.
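The two scenarios can also be checked by listing every possible draw. The following is a minimal sketch, assuming the two cards are drawn without replacement and that order matters, as in the derivation above; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

cards = range(1, 101)

# All ordered draws of two distinct cards: 100 * 99 = 9900 outcomes.
pairs = list(permutations(cards, 2))

# Scenario 1: the second card is double the first.
second_doubles_first = sum(1 for x1, x2 in pairs if x2 == 2 * x1)
# Scenario 2: the second card is half the first.
second_halves_first = sum(1 for x1, x2 in pairs if x1 == 2 * x2)

p_scenario_1 = Fraction(second_doubles_first, len(pairs))
p_scenario_2 = Fraction(second_halves_first, len(pairs))
p_total = p_scenario_1 + p_scenario_2

print(p_scenario_1, p_scenario_2, p_total)  # 1/198 1/198 1/99
```

The two scenarios can't happen in the same draw, so adding their probabilities is a valid use of the addition rule for mutually exclusive events.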

## What Are the Laws of Probability?

Next on our probability cheat sheet, here are the laws of probability you should be familiar with.

Definition: The laws of probability are the more advanced principles derived from the basic rules of probability.

#### Bayes’ Theorem

Definition: Bayes’ theorem is a conditional probability statement that defines the probability of an event happening, given that another event related to the first one has already happened.

Formula: The Bayes’ theorem formula is given below.

$P(A\ |\ B) = \frac{P(B\ |\ A)*P(A)}{P(B)}$

P(A | B) = the posterior probability or the probability of the event A occurring given that the event B has happened
P(B | A) = the likelihood or the probability of the event B occurring given that the event A has happened
P(A) = the prior probability or the initial probability of the event A
P(B) = the marginal likelihood or the total probability of the event B occurring

If the normalizing constant P(B) is not given, you can use this version of Bayes’ theorem, which expands P(B) over the two cases A and not A.

$P(A\ |\ B) = \frac{P(B\ |\ A)*P(A)}{P(B\ |\ A)*P(A) + P(B\ |\ \neg A)*P(\neg A)}$

Example: Here’s an interview question from Zenefits, AXA, and MetLife that requires you to know Bayes’ theorem.

We need to calculate the probability of a stock actually going up when the program predicts it will go up.

Here, P(A) is the prior probability that the stock is going up. With no other information, we assume the stock is equally likely to go up or down, so the probability is 0.5.

P(B | A) is the likelihood that the program predicts the stock will go up when it actually goes up. The program’s accuracy rate is 60% or 0.6; this is our probability in this case.

P(B) is the probability that the program will say the stock will go up. We don’t have this value, so we have to use the modified version of Bayes’ theorem.

$P(A\ |\ B) = \frac{P(B\ |\ A)*P(A)}{P(B\ |\ A)*P(A) + P(B\ |\ \neg A)*P(\neg A)}$

The probability of the program predicting an increase when the stock actually goes down is calculated like this, assuming the program is also right 60% of the time when the stock goes down. Remember the rule of complementary events?

$P(B\ |\ \neg A) = 1 - P(B\ |\ A) = 1 - 0.6 = 0.4$

We apply the same rule to get the probability of event A not happening, i.e., the stock goes down.

$P(\neg A) = 1 - P(A) = 1 - 0.5 =0.5$

Now, we can plug all this into the Bayes’ theorem formula.

$P(A\ |\ B) = \frac{0.6*0.5}{0.6*0.5 + 0.4*0.5} = \frac{0.3}{0.3 + 0.2} = \frac{0.3}{0.5} = 0.6 = 60\%$

In plain English, the probability for the stock to go up when the program says it will go up is 60%.
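The expanded form of Bayes’ theorem translates directly into a few lines of code. This is only an illustrative sketch; the function name `bayes_posterior` and its parameter names are made up for this example.

```python
def bayes_posterior(prior, likelihood, likelihood_given_not):
    """Return P(A | B) from P(A), P(B | A), and P(B | not A)."""
    # Denominator: P(B) expanded over the two cases A and not A.
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Values from the stock example: P(A) = 0.5, P(B | A) = 0.6, P(B | not A) = 0.4.
posterior = bayes_posterior(prior=0.5, likelihood=0.6, likelihood_given_not=0.4)
print(round(posterior, 4))  # 0.6
```

The posterior equals the program’s accuracy here only because the prior is 0.5; with a different prior, the two values would no longer coincide.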

#### Law of Total Probability

Definition: It’s a law that calculates the probability of an event by considering all the ways it can happen based on a partition given by another event. In other words, you sum up the probabilities of B occurring given each event Ai, weighted by the probability of each Ai.

Formula: Translated into a formula, the law of total probability is expressed like this.

$P(B) = \sum_{i=1}^{n}P(B\ |\ A_i) * P(A_i)$

Example: The classic example of total probability is a disease that affects 2% of the population. The test for the disease has the following characteristics:

1. If a person has the disease, the test will correctly identify them as positive 90% of the time.
2. If a person doesn’t have the disease, the test will falsely identify them as positive 3% of the time.

What is the probability that a random person tests positive?

For this, we need to use the law of total probability. There is one event B, and two events A:

1. B = the person tested positive
2. A1 = the person has the disease
3. A2 = the person doesn’t have the disease

The two possible outcomes in case of a positive test are:

1. The person has the disease and tests positive.
2. The person doesn’t have the disease but tests positive.

The values we need for the calculation are:

P(A1) = the probability of having the disease
P(A2) = the probability of not having the disease
P(B|A1) = the probability of testing positive, given that the person has the disease
P(B|A2) = the probability of testing positive, given that the person doesn’t have the disease

The calculation will look like this.

$P(B) = \sum_{i=1}^{n}P(B\ |\ A_i) * P(A_i) \\ \ \\ = P(B\ |\ A_1) * P(A_1) + P(B\ |\ A_2) * P(A_2) \\ \ \\ = 0.9 * 0.02 + 0.03 * (1 - P(A_1)) \\ \ \\ = 0.9 * 0.02 + 0.03 * 0.98 \\ \ \\ = 0.0180 + 0.0294 \\ \ \\ = 0.0474 \\ \ \\ = 4.74\%$

So, the probability that a random person tests positive is 4.74%.
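The weighted sum in the law of total probability can be sketched as a short helper. The function name `total_probability` and the check that the priors sum to 1 are choices made for this illustration, not something prescribed by the example.

```python
def total_probability(conditionals, priors):
    """Sum P(B | A_i) * P(A_i) over a partition A_1, ..., A_n."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("the priors of a partition must sum to 1")
    return sum(c * p for c, p in zip(conditionals, priors))


# A1 = has the disease (2%), A2 = doesn't have the disease (98%).
p_positive = total_probability(conditionals=[0.90, 0.03], priors=[0.02, 0.98])
print(round(p_positive, 4))  # 0.0474
```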

## Probability Distributions

Definition: A probability distribution is a mathematical representation of the probabilities of all possible outcomes for a random variable in a given scenario. In other words, it defines how the likelihood of each possible outcome is distributed across the range of values that the random variable can take.

### Discrete Probability Distributions

Definition: Discrete probability distributions describe the probabilities that the outcome of an event will take one of the values in a discrete set. A discrete variable takes values from a finite or countably infinite set, commonly expressed as whole numbers.

Important probability concepts:

PMF: Discrete probability distributions are defined by the Probability Mass Function (PMF). It assigns a probability to each possible outcome.

CDF: If calculating the probability that the variable will take on a value less than or equal to a given value, you should use the Cumulative Distribution Function (CDF). This is useful also when you want to find the probability of a variable taking on a value from the defined range.

It is given by this formula:

$F_X(x) = P(X \leq x)$

To find the probability that the random variable takes a value from an interval, use this formula.

$P(a < X \leq b) = F_X(b) - F_X(a)$

Mean or Expected Value: The mean is the average value of a random variable. If taking many measurements from the probability distribution, the mean would represent a long-term average or expected value.

Median: The median is the middle value of the dataset arranged in ascending order. In the probability distribution, this means that half of the probability mass lies to the left of the median and half to the right of it.

#### Discrete Uniform Distribution

Definition: The discrete uniform distribution means that every outcome within a sample space is equally likely to occur.

Formulas: The PMF of the discrete uniform distribution is calculated in the following way.

$f(x) = P(X = x) = \frac{1}{n}$

X = the random variable
x = a particular outcome
n = the total number of equally likely outcomes

The CDF formula looks like this.

$F(x) = \begin{cases} 0, &\quad \text{if}\ x < a \\ \frac{x-a+1}{n}, &\quad \text{if}\ a\leq x \leq b \\ 1, &\quad \text{if}\ x > b \end{cases}$

a = the minimum value of the random variable
b = the maximum value of the random variable

The mean or expected value is given by this formula.

$E(X) = \mu =\frac{a+b}{2}$

The median is the same as the mean.

$\tilde{x} =\frac{a+b}{2}$

Curve: The curve of the uniform distribution looks like this.

Example: If you have a six-sided die, then n = 6. So the probability of getting any of the six sides, e.g., 3, is 16.67%.

$P(X = 3) = \frac{1}{n} = \frac{1}{6} = 0.1667 = 16.67\%$
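The PMF and CDF formulas above can be sketched for the die as follows. The helper names are illustrative, and the sketch assumes x is an integer, which is what the CDF formula above implicitly assumes as well.

```python
from fractions import Fraction


def uniform_pmf(x, a, b):
    """P(X = x) for integer outcomes a..b, each equally likely."""
    n = b - a + 1
    return Fraction(1, n) if a <= x <= b else Fraction(0)


def uniform_cdf(x, a, b):
    """P(X <= x) for an integer x."""
    if x < a:
        return Fraction(0)
    if x > b:
        return Fraction(1)
    return Fraction(x - a + 1, b - a + 1)


print(uniform_pmf(3, 1, 6))  # 1/6
print(uniform_cdf(4, 1, 6))  # 2/3
print(Fraction(1 + 6, 2))    # 7/2, the mean (a + b) / 2
```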

#### Binomial Distribution

Definition: It describes the number of successes in a fixed number of independent trials, i.e., Bernoulli trials. There are only two possible outcomes of each trial: success or failure.

Formulas: The binomial distribution has two parameters. The first one is the number of trials, n. The second one is the probability of success in a single trial, p. Here, the p is constant.

Here’s the PMF formula.

$P(X = k) = \binom{n}{k}p^k(1-p)^{n-k}$

P(X = k) = the probability of observing k successes in n trials
$\binom{n}{k}$ = the binomial coefficient; the number of ways to choose k successes from n trials
p = the probability of success on a single trial
(1-p) = the probability of failure on a single trial
n = the total number of trials
k = the number of successes; can take values from 0 to n

Note: The binomial coefficient is calculated the following way.

$\binom{n}{k} = \frac{n!}{k!(n-k)!}$

The CDF is calculated this way.

$F(x) = P(X\leq x) = \Sigma_{k=0}^{x}\binom{n}{k}p^k(1-p)^{n-k}$

To get the probability for an interval, sum the PMF values for each value in the interval or, equivalently, subtract the CDF at the lower bound from the CDF at the upper bound.

The expected value formula is given below.

$E(X) = \mu = n*p$

Regarding the median, there’s no single closed-form formula. If n*p is an integer, then the median is the same as the mean.

$\tilde{x} = n*p$

If it’s not an integer, compute the CDF for increasing values of k. The smallest k whose CDF reaches at least 0.5 is the median; it is always either n*p rounded down or n*p rounded up.

Curve: Here’s the binomial distribution curve shown with the normal distribution.

Example: We can use the binomial distribution formula to solve this question asked by Microsoft, Apple, and IBM.

Given the probability of a bulb being defective, we need to calculate the probability of 2 out of 10 bulbs being defective. Again, this is the formula we’ll use.

$P(X = k) = \binom{n}{k}p^k(1-p)^{n-k}$

Let’s define the values so we can plug them into the formula.

n = the total number of bulbs is 10
k = the number of defective bulbs is 2
p = the probability of a bulb being defective is 0.05
$\binom{n}{k}$ = the binomial coefficient showing the number of ways to choose two defective bulbs from the package of ten

With these values, the probability formula looks like this.

$P(X = 2) = \binom{10}{2}0.05^2(1-0.05)^{10-2}$

Let’s first calculate the binomial coefficient.

$\binom{10}{2} = \frac{10!}{2!(10-2)!} = \frac{10!}{2!*8!} = 45$

We’ll now use this value to get the answer to the question.

$P(X = 2) = 45*0.05^2*0.95^8 = 0.0746 = 7.46\%$

So the probability that two out of ten bulbs are defective is 7.46%.
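The same calculation can be sketched with Python’s standard library, where `math.comb` computes the binomial coefficient. The function names `binomial_pmf` and `binomial_cdf` are illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(x, n, p):
    """P(X <= x): sum of the PMF from 0 up to and including x."""
    return sum(binomial_pmf(k, n, p) for k in range(0, x + 1))


print(comb(10, 2))  # 45
p_two_defective = binomial_pmf(k=2, n=10, p=0.05)
print(round(p_two_defective, 4))  # 0.0746
```

Calling `binomial_cdf(2, 10, 0.05)` would instead give the probability of at most two defective bulbs.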

#### Bernoulli Distribution

Definition: The Bernoulli distribution is a special case of the binomial distribution where the number of trials is 1.

Formulas:

$P(X = x) = p^x(1-p)^{1-x}$

P(X = x) = the probability that the random variable X takes on the value x
p = the probability of success, i.e., X = 1
x = it can be either 0 (failure) or 1 (success)

Since x can have only two values, from this follows that there are two (shorter) formulas for Bernoulli distribution:

1. When x = 1 (success):

$P(X = 1) = p$

2. When x = 0 (failure):

$P(X = 0) = 1 - p$

The CDF takes one of three values, depending on x.

$F(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1-p & \text{if } 0 \leq x < 1 \\ 1 & \text{if } x \geq 1 \end{cases}$

The expected value for Bernoulli distribution is:

$E(X) =\mu = p$

Regarding the median, it depends on p. There are three cases:

1. If p > 0.5, the median is 1.
2. If p < 0.5, the median is 0.
3. If p = 0.5, both 0 and 1 are medians, since each outcome carries exactly half of the probability mass.

Curve: Here’s the distribution curve.

Example: Here’s a simple example from Belvedere Trading. We’ll solve it using the Bernoulli distribution.

We need to calculate the probability of getting a 6 from one roll of a die. For the Bernoulli distribution, this means:

• P(X = 1) = the probability of success, i.e., getting 6
• P(X = 0) = the probability of all other outcomes

We know what the formula is in the case of success. The probability of getting a 6 is 1/6 since there is one favorable outcome out of six.

$P(X = 1) = p =\frac{1}{6} = 0.1667 = 16.67\%$

#### Poisson Distribution

Definition: It’s a distribution that describes the number of events in a fixed interval of time or space if these events occur with a known average rate and independently of the time since the last event.

Formula: Here’s the formula for calculating Poisson’s PMF.

$f(x) = P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}$

P(X = x) = the probability of observing x events
λ = the average rate (or mean) of occurrences in the given interval
e = the base of the natural logarithm ≈ 2.71828
x = non-negative integer (0, 1, 2, …)

The CDF of this distribution is given here.

$F(x) = \sum_{i=0}^{\lfloor x \rfloor} \frac{\lambda^i e^{-\lambda}}{i!}$

i = non-negative integer values

The ‘incomplete square brackets’ around x denote the floor function, which rounds x down to the nearest integer.

The expected value for this distribution is calculated like this.

$E(X) = \mu = \lambda$

The median is approximated using the following formula.

$\tilde x \approx \lfloor \lambda + \frac{1}{3} - \frac{1}{50\lambda}\rfloor$

Curve: Here’s the Poisson distribution curve.

Example: We’ll use the example that appears in the official solution of this problem from Google and Amazon interviews.

We want to calculate the probability of a football team scoring a specific number of goals in a match, given the average number of goals they score per match. If a team has an average of 1.5 goals per match, we want to calculate the probability of them scoring 3 goals in a match.

From this, we know that:

λ = 1.5
x = 3

Plug these values into the Poisson formula, and you get that the probability of the team scoring 3 goals in a match is 12.55%.

$P(X = 3) = \frac{1.5^3 e^{-1.5}}{3!} =\frac{3.375 * 0.22313}{6} = 0.1255 = 12.55\%$
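Here is a minimal sketch of the Poisson PMF and CDF using only the standard library. The function names are illustrative; `lam` stands for λ because `lambda` is a reserved word in Python.

```python
from math import exp, factorial, floor


def poisson_pmf(x, lam):
    """P(X = x) for a non-negative integer x and average rate lam."""
    return lam**x * exp(-lam) / factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x): sum of the PMF from 0 up to floor(x)."""
    return sum(poisson_pmf(i, lam) for i in range(0, floor(x) + 1))


p_three_goals = poisson_pmf(3, 1.5)
print(round(p_three_goals, 4))  # 0.1255
```

`poisson_cdf(3, 1.5)` would give the probability of the team scoring three goals or fewer.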

#### Geometric Distribution

Definition: It represents the number of Bernoulli trials required for success to occur. Theoretically, the trials can go on indefinitely.

Formulas: The PMF formula is given below.

$f(x) = P(X = x) = p(1-p)^{x-1}$

p = the probability of success on any given trial

The CDF formula for the geometric distribution looks like this.

$F(x) = 1 - (1-p)^x$

When calculating the expected value, use the following formula.

$E(X) = \mu = \frac{1}{p}$

You’ll get the median of a geometric distribution with this formula.

$\tilde x = \lceil \frac{-1}{log_2(1-p)}\rceil$

Curve: The geometric distribution curve looks like this.

Example: Let’s solve this interview question by Meta.

We basically need to find the expected number of rolls of two dice needed to get a sum of 5. We’ll also calculate the expected earnings.

The number of possible outcomes with two dice on a single roll is:

$Possible\ outcomes = 6*6 = 36$

There are four ways in which you can get a sum of 5, i.e., win the game:

1. Getting 1 on the first die and 4 on the second die = {1, 4}
2. Getting 2 on the first die and 3 on the second die = {2, 3}
3. Getting 3 on the first die and 2 on the second die = {3, 2}
4. Getting 4 on the first die and 1 on the second die = {4, 1}

From there, the probability of winning in a single roll is:

$p = \frac{Number\ of\ Positive\ Outcomes}{Total\ Number\ of\ Outcomes} = \frac{4}{36} = \frac{1}{9} = 0.1111 = 11.11\%$

The expected number of rolls is:

$E(X) = \frac{1}{p} = \frac{1}{\frac{1}{9}} = 9$

This means that, on average, you can expect to roll the dice 9 times for you to win.

The question doesn’t require this, but let’s calculate the probability of winning for the first time on exactly the ninth roll.

$P(X = 9) = p(1-p)^{x-1} = \frac{1}{9}*(1-\frac{1}{9})^{9-1}= \frac{1}{9}*\left(\frac{8}{9}\right)^8 = 0.0433 = 4.33\%$

Now we can go back to the expected number of rolls and calculate the expected earnings. The income is defined as earnings minus expenses. We will calculate our income after 9 rolls, assuming the win comes on the ninth roll. Each roll costs \$5, and when we win, we win \$10.

$Income = Earnings - Expenses = 10 * (number\ of\ wins) - 5*(number\ of\ rolls) = 10*1 - 5*9 = 10 - 45 = -35$
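The whole example can be reproduced with exact fractions. This is a sketch under the same assumptions as above (a \$5 cost per roll, a \$10 prize, and play stopping at the first win); the variable and function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

# Probability of winning on a single roll, i.e., the two dice sum to 5.
p = Fraction(sum(1 for d1, d2 in rolls if d1 + d2 == 5), len(rolls))


def geometric_pmf(x, p):
    """P(X = x): the first success happens on exactly the x-th trial."""
    return p * (1 - p) ** (x - 1)


def geometric_cdf(x, p):
    """P(X <= x): the first success happens within the first x trials."""
    return 1 - (1 - p) ** x


expected_rolls = 1 / p
income = 10 * 1 - 5 * expected_rolls

print(p)                                  # 1/9
print(expected_rolls)                     # 9
print(round(float(geometric_pmf(9, p)), 4))  # 0.0433
print(income)                             # -35
```

`geometric_cdf(9, p)` would give the probability of winning at some point within the first nine rolls, which is a different quantity from winning on exactly the ninth roll.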

### Exponential Distribution

Definition: It’s a distribution that describes the time between events in a Poisson process, where events occur continuously and independently at a constant average rate. It is usually used to model the time until the event.

Formulas: The probability density function of the exponential distribution is given below.

$f(x;\ \lambda) = \lambda e^{-\lambda x}$

f(x; λ) = the probability density at value x
λ  = the rate parameter, i.e., the average number of events per unit of time

The cumulative distribution function of the exponential distribution is:

$F(x) = P(X \leq x) = 1 - e^{-\lambda x}$

The expected value of the exponential distribution is:

$E(X) = \mu = \frac{1}{\lambda}$

The median for the exponential distribution is calculated like this.

$\tilde x = \frac{ln(2)}{\lambda}$

Curve: Here’s the distribution’s curve.

Example: If the customers arrive at a bar at an average rate of four customers per hour, what is the probability that the next customer will arrive within 15 minutes?

We can use the CDF formula where our x is 15 minutes or 0.25 hours.

$P (X \leq 0.25) = 1 - e^{-4*0.25} = 1 - 2.718282^{-1} = 1 - 0.3679 = 0.6321 = 63.21\%$

The probability that the customer comes into a bar in the next 15 minutes is 63.21%.
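The CDF, mean, and median formulas can be sketched in a few lines. The function name `exponential_cdf` and the variable names are illustrative; time is measured in hours, as in the example.

```python
from math import exp, log


def exponential_cdf(x, rate):
    """P(X <= x) for an exponential distribution with the given rate."""
    return 1 - exp(-rate * x) if x >= 0 else 0.0


rate = 4  # customers per hour

p_within_15_min = exponential_cdf(0.25, rate)
mean_wait = 1 / rate
median_wait = log(2) / rate

print(round(p_within_15_min, 4))  # 0.6321
print(mean_wait)                  # 0.25 hours, i.e., 15 minutes
print(round(median_wait, 4))      # 0.1733 hours
```

The median wait is shorter than the mean wait, which reflects the right skew of the exponential distribution.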

### Chi-Squared Distribution

Definition: It’s the distribution of a sum of the squares of k independent standard normal random variables, where k is the number of degrees of freedom. The chi-squared distribution can be found in statistical inference, especially in hypothesis testing and confidence interval estimation.

Formula: The probability density function of this distribution is calculated like this.

$f(x;\ k) = \frac{1}{2^{\frac{k}{2}}\cdot \Gamma\left( \frac{k}{2}\right)}\cdot x^{\left(\frac{k}{2}-1\right)}\cdot e^{-\frac{x}{2}}$

f(x; k) = the probability density at value x
k = the degrees of freedom which must be a positive integer
$\Gamma\left(\frac{k}{2}\right)$ = the gamma function evaluated at $\frac{k}{2}$

The cumulative distribution function formula looks like this, where γ is the lower incomplete gamma function.

$F(x; \ k) = \frac{\gamma(\frac{k}{2}, \frac{x}{2})}{\Gamma(\frac{k}{2})}$

The expected value is equal to the degrees of freedom, or:

$E(X) = k$

The median is approximated using the following formula.

$\tilde x \approx k(1-\frac{2}{9k})^3$

Curve: Here’s the curve.

Example: Let’s say we have a chi-squared distribution with 4 degrees of freedom. What is the probability that the random variable falls between 3 and 8?

Again, we need to subtract one CDF from another to calculate that.

$P(3 < X\leq 8) = CDF_{\chi^2}(8;\ 4) - CDF_{\chi^2}(3;\ 4)\\ \ \\ = \frac{\gamma(\frac{4}{2},\ \frac{8}{2})}{\Gamma(\frac{4}{2})} - \frac{\gamma(\frac{4}{2},\ \frac{3}{2})}{\Gamma(\frac{4}{2})}\\ \ \\ = \frac{\gamma(2,\ 4)}{\Gamma(2)} - \frac{\gamma(2,\ 1.5)}{\Gamma(2)} \\ \ \\ = \frac{0.9084}{1} - \frac{0.4422}{1} \\ \ \\ = 0.4662 \\ \ \\ = 46.62\%$
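The final probability can be checked without a statistics library. For an even number of degrees of freedom, the chi-squared CDF has a closed form, F(x; k) = 1 - e^(-x/2) · Σ (x/2)^j / j! with j running from 0 to k/2 - 1. The sketch below relies on that identity, so it only covers even k; the function name `chi2_cdf_even_k` is made up for this illustration.

```python
from math import exp, factorial


def chi2_cdf_even_k(x, k):
    """Chi-squared CDF via the closed form that holds only for even k."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this closed form only covers positive even k")
    if x <= 0:
        return 0.0
    half_x = x / 2
    # Finite sum that replaces the incomplete gamma function when k is even.
    tail = exp(-half_x) * sum(half_x**j / factorial(j) for j in range(k // 2))
    return 1 - tail


upper = chi2_cdf_even_k(8, 4)
lower = chi2_cdf_even_k(3, 4)
print(round(upper - lower, 4))  # 0.4662
```

For odd degrees of freedom, a numerical routine for the incomplete gamma function would be needed instead.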