Discrete Probability
Probability is a fundamental concept in mathematics that deals with uncertainty and the likelihood of events occurring. It provides a framework for quantifying and reasoning about uncertainty, which is essential in fields like statistics, computer science, finance, engineering, and more.
This comprehensive lecture will cover the foundational principles of probability, progressing from basic to advanced topics:
- Introduction to Probability
- Basic Probability Definitions
- Rules of Probability
- Conditional Probability
- Independence
- Bayes’ Theorem
- Random Variables
- Expectation (Expected Value)
- Bernoulli Trials and Binomial Distribution
- Advanced Applications
Experiment: A procedure that yields one out of several possible outcomes. It’s an action or process whose outcome is uncertain.
Sample Space ($S$): The set of all possible outcomes of an experiment.
Event: A subset of the sample space. An event occurs if the outcome of the experiment is an element of the event set.
An outcome is a single possible result of an experiment, while an event can consist of one or more outcomes.
- Experiment: Rolling two six-sided dice, one red and one blue.
- Sample Space ($S$):
- All possible ordered pairs $(x, y)$ where $x$ is the outcome on the red die and $y$ is the outcome on the blue die.
- Total outcomes: $6 \times 6 = 36$.
- Event: Getting a sum of 7.
- Event set $E$: $\{(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)\}$.
- Experiment: Drawing a single card from a standard deck of 52 playing cards.
- Sample Space ($S$):
- All 52 unique cards.
- Event: Drawing a heart.
- Event set $E$: All 13 heart cards in the deck.
- Experiment: Drawing a tile from a complete set of Mahjong tiles.
- Sample Space ($S$):
- All 136 tiles in a standard Mahjong set (excluding the optional flower and season tiles).
- Event: Drawing a Dragon tile.
- Event set $E$: The 12 Dragon tiles (4 Red, 4 Green, 4 White).
A probability distribution assigns a probability to each outcome in the sample space.
- Properties:
- $0 \leq P(s) \leq 1$ for every $s \in S$.
- $\sum_{s \in S} P(s) = 1$.
The probability of an event $E$ is the sum of the probabilities of the outcomes in $E$:
\[ P(E) = \sum_{s \in E} P(s) \]
- Definition: A distribution where every outcome in the sample space is equally likely.
- Probability of each outcome: If $|S| = n$, then $P(s) = \frac{1}{n}$ for all $s \in S$.
- Experiment: Rolling a fair six-sided die.
- Sample Space: $S = \{1, 2, 3, 4, 5, 6\}$.
- Probability of each outcome: $P(s) = \frac{1}{6}$.
- Experiment: Randomly selecting a piece from all the pieces used in Xiangqi.
- Sample Space: All 32 pieces (16 red and 16 black pieces).
- Uniform Distribution: Each piece has a probability of $\frac{1}{32}$.
- Definition: A distribution where outcomes have different probabilities.
- Application: Used when certain outcomes are more likely than others.
- Suppose a six-sided die is weighted so that the number 6 comes up twice as often as the other numbers.
- Assigning Probabilities:
- For numbers 1-5: $P(s) = \frac{1}{7}$.
- For number 6: $P(6) = \frac{2}{7}$.
- Verification:
- Total probability: $5 \times \frac{1}{7} + \frac{2}{7} = 1$.
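The weighted-die distribution can be checked with a short Python sketch (the integer weights 1 for faces 1–5 and 2 for face 6 are an assumption encoding "6 comes up twice as often"):

```python
from fractions import Fraction

# Weights: faces 1-5 get weight 1, face 6 gets weight 2 (twice as likely).
weights = {s: 1 for s in range(1, 6)}
weights[6] = 2
total = sum(weights.values())  # 7

# Normalize weights into exact probabilities.
probs = {s: Fraction(w, total) for s, w in weights.items()}

assert probs[1] == Fraction(1, 7)
assert probs[6] == Fraction(2, 7)
assert sum(probs.values()) == 1  # a valid probability distribution
```

Using `Fraction` keeps the arithmetic exact, so the verification that the probabilities sum to 1 is not subject to floating-point rounding.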
Definition: Events $A$ and $B$ are mutually exclusive if $A \cap B = \emptyset$.
Rule:
\[ P(A \cup B) = P(A) + P(B) \]
General Addition Rule (for any two events, not necessarily mutually exclusive):
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
- Problem: What is the probability of drawing a card that is a King or a Heart from a standard deck?
- Solution:
- $P(\text{King}) = \frac{4}{52} = \frac{1}{13}$.
- $P(\text{Heart}) = \frac{13}{52} = \frac{1}{4}$.
- $P(\text{King} \cap \text{Heart}) = \frac{1}{52}$.
- $P(\text{King} \cup \text{Heart}) = \frac{1}{13} + \frac{1}{4} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$.
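The inclusion–exclusion result can be confirmed by enumerating the deck directly (the rank/suit encoding below is a hypothetical representation chosen for the sketch):

```python
from fractions import Fraction
from itertools import product

# Encode a standard deck as (rank, suit) pairs.
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))
assert len(deck) == 52

# Count cards in the union of the two events: King or Heart.
king_or_heart = [c for c in deck if c[0] == 'K' or c[1] == 'hearts']
p = Fraction(len(king_or_heart), len(deck))

assert len(king_or_heart) == 16  # 4 Kings + 13 Hearts - 1 overlap
assert p == Fraction(4, 13)
```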
Definition: The complement of an event $A$, denoted $A^c$, consists of all outcomes not in $A$.
Rule:
\[ P(A^c) = 1 - P(A) \]
- Problem: What is the probability of not rolling a six on a fair die?
- Solution:
- $P(\text{Six}) = \frac{1}{6}$.
- $P(\text{Not Six}) = 1 - \frac{1}{6} = \frac{5}{6}$.
For Three Events:
\[ P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C) \]
- Problem: In a group of people:
- 30 speak English.
- 25 speak Mandarin.
- 20 speak Spanish.
- 10 speak both English and Mandarin.
- 5 speak both English and Spanish.
- 3 speak both Mandarin and Spanish.
- 2 speak all three languages.
- Question: What is the probability that a randomly selected person speaks at least one of the languages?
- Solution:
- By the Inclusion-Exclusion Principle, the number who speak at least one language is $30 + 25 + 20 - 10 - 5 - 3 + 2 = 59$, so the group must contain at least 59 people; suppose $N = 60$.
- Then $P(\text{At least one language}) = \frac{59}{60}$.
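The counting step can be checked directly; note that the inclusion–exclusion total comes to 59, so any stated group size must be at least that large:

```python
# Counts from the problem: singles, pairwise overlaps, triple overlap.
e, m, s = 30, 25, 20
em, es, ms = 10, 5, 3
ems = 2

# Inclusion-Exclusion Principle for three sets.
at_least_one = e + m + s - em - es - ms + ems
assert at_least_one == 59  # the group size must be at least 59
```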
Conditional Probability: The probability of event $A$ given that event $B$ has occurred.
Formula:
\[ P(A|B) = \frac{P(A \cap B)}{P(B)}, \quad \text{provided } P(B) > 0 \]
- Problem: From a standard deck, what is the probability that a drawn card is a Queen, given that it’s a face card?
- Solution:
- $P(\text{Queen}|\text{Face card}) = \frac{P(\text{Queen} \cap \text{Face card})}{P(\text{Face card})}$.
- Total face cards: 12 (Jacks, Queens, Kings of each suit).
- Total Queens: 4.
- $P(\text{Queen}|\text{Face card}) = \frac{4/52}{12/52} = \frac{4}{12} = \frac{1}{3}$.
- Problem: Two dice are rolled. What is the probability that the sum is 8, given that the first die shows a 5?
- Solution:
- Sample space is reduced to outcomes where the first die is 5.
- Possible outcomes: $(5,1), (5,2), (5,3), (5,4), (5,5), (5,6)$.
- $P(\text{Sum is 8}|\text{First die is 5}) = \frac{1}{6}$, since only $(5,3)$ sums to 8.
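The reduced-sample-space argument can be reproduced by enumeration:

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # 36 ordered outcomes
given = [r for r in rolls if r[0] == 5]        # condition: first die shows 5
favorable = [r for r in given if sum(r) == 8]  # sum is 8 within the reduced space

p = Fraction(len(favorable), len(given))
assert len(given) == 6
assert p == Fraction(1, 6)
```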
Two events $A$ and $B$ are independent if the occurrence of one does not affect the probability of the occurrence of the other.
Mathematically:
\[ P(A \cap B) = P(A) \times P(B) \]
- Equivalent Conditions:
- $P(A|B) = P(A)$.
- $P(B|A) = P(B)$.
- Problem: Are the events “the first die shows a 3” and “the second die shows a 5” independent when rolling two dice?
- Solution:
- $P(\text{First die is 3}) = \frac{1}{6}$.
- $P(\text{Second die is 5}) = \frac{1}{6}$.
- $P(\text{First die is 3 and second die is 5}) = \frac{1}{36}$.
- Since $\frac{1}{36} = \frac{1}{6} \times \frac{1}{6}$, the events are independent.
- Problem: Are the events “drawing an Ace” and “drawing a Spade” independent when drawing one card from a deck?
- Solution:
- $P(\text{Ace}) = \frac{4}{52}$.
- $P(\text{Spade}) = \frac{13}{52}$.
- $P(\text{Ace of Spades}) = \frac{1}{52}$.
- Since $\frac{1}{52} \neq \frac{4}{52} \times \frac{13}{52}$, the events are dependent.
Bayes’ Theorem relates the conditional and marginal probabilities of stochastic events.
Formula:
\[ P(A|B) = \frac{P(B|A)P(A)}{P(B)} \]
Starting from the definition of conditional probability:
\[ P(A \cap B) = P(A|B)P(B) = P(B|A)P(A) \]
Rearranging gives Bayes’ Theorem.
- Prior Probability ($P(A)$): Initial probability before new evidence.
- Posterior Probability ($P(A|B)$): Updated probability after considering new evidence $B$.
- Problem: A disease affects 1% of the population. A test detects the disease with 99% accuracy (true positive rate), but has a 5% false positive rate. What is the probability that a person who tests positive actually has the disease?
- Solution:
- Let $D$ be the event of having the disease.
- $P(D) = 0.01$.
- $P(\text{Positive}|D) = 0.99$.
- $P(\text{Positive}|D^c) = 0.05$.
- $P(D|\text{Positive}) = \frac{P(\text{Positive}|D)P(D)}{P(\text{Positive}|D)P(D) + P(\text{Positive}|D^c)P(D^c)}$.
- $P(D|\text{Positive}) = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.05 \times 0.99} = \frac{0.0099}{0.0594} = \frac{1}{6} \approx 0.167$.
- Despite the test's accuracy, a positive result implies only about a 16.7% chance of disease, because the disease is rare.
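The computation is a direct application of the law of total probability followed by Bayes' Theorem; a minimal sketch:

```python
# Given quantities from the problem statement.
p_d = 0.01                 # prevalence P(D)
p_pos_given_d = 0.99       # true positive rate
p_pos_given_not_d = 0.05   # false positive rate

# Law of total probability for P(Positive), then Bayes' theorem.
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)
p_d_given_pos = p_pos_given_d * p_d / p_pos

assert abs(p_d_given_pos - 1/6) < 1e-9  # about 16.7%
```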
- Problem: An email filter uses Bayesian analysis to determine if an email is spam based on certain keywords.
- Solution:
- $P(\text{Spam}|\text{Keywords}) = \frac{P(\text{Keywords}|\text{Spam})P(\text{Spam})}{P(\text{Keywords})}$.
- Helps in updating the belief about the email being spam given the presence of certain keywords.
A random variable is a function that assigns a numerical value to each outcome in the sample space.
- Notation: Random variables are usually denoted by uppercase letters like $X$, $Y$.
- Definition: A random variable that can take on a countable number of values.
Definition: A function that gives the probability that a discrete random variable is exactly equal to some value.
Formula:
\[ P(X = x) = p(x) \]
- Experiment: Roll two fair six-sided dice.
- Random Variable: $X =$ sum of the numbers on the two dice.
- Possible Values: $X$ can take on integer values from 2 to 12.
- PMF: Calculate $P(X = k)$ for each possible sum $k$.
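The PMF of the dice sum can be tabulated by counting over the 36 equally likely ordered outcomes:

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# Tally each possible sum over the 36 equally likely ordered outcomes.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {k: Fraction(c, 36) for k, c in counts.items()}

assert pmf[7] == Fraction(1, 6)   # six outcomes sum to 7
assert pmf[2] == Fraction(1, 36)  # only (1,1) sums to 2
assert sum(pmf.values()) == 1
```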
Experiment: Toss a fair coin 5 times.
Sample Space ($S$): The possible outcomes for each toss are “heads” (H) or “tails” (T). Each toss is independent, so there are $2^5 = 32$ possible outcomes (e.g., HHTHT, TTTTT, etc.).
Random Variable ($X$): Let $X$ be the number of heads in 5 coin tosses. $X$ can take values from 0 to 5, because you could get anywhere from 0 to 5 heads.
Possible Values:
- $X = 0$: No heads (all tails).
- $X = 1$: One head and four tails.
- $X = 2$: Two heads and three tails.
- $\dots$
- $X = 5$: All heads.
Probability Mass Function (PMF): The probability of getting exactly $k$ heads in 5 coin tosses is given by the binomial distribution:
\[ P(X = k) = \binom{5}{k} \left(\frac{1}{2}\right)^k \left(\frac{1}{2}\right)^{5-k} = \binom{5}{k} \left(\frac{1}{2}\right)^5 \]
where $k$ is the number of heads, and $\binom{5}{k}$ is the binomial coefficient representing the number of ways to choose $k$ heads from 5 tosses.
For example:
- $P(X = 0) = \binom{5}{0} \left(\frac{1}{2}\right)^5 = \frac{1}{32}$
- $P(X = 2) = \binom{5}{2} \left(\frac{1}{2}\right)^5 = \frac{10}{32}$
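This PMF is easy to compute exactly with the binomial coefficient from the standard library:

```python
from fractions import Fraction
from math import comb

def pmf_heads(k, n=5):
    """P(X = k) for k heads in n fair coin tosses."""
    return Fraction(comb(n, k), 2**n)

assert pmf_heads(0) == Fraction(1, 32)
assert pmf_heads(2) == Fraction(10, 32)
assert sum(pmf_heads(k) for k in range(6)) == 1  # probabilities sum to 1
```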
The expected value of a random variable $X$ is a measure of the central tendency of its distribution.
Formula:
\[ E[X] = \sum_{x} x \cdot P(X = x) \]
For any two random variables $X$ and $Y$ and real numbers $a$ and $b$:
\[ E[aX + bY] = aE[X] + bE[Y] \]
This holds regardless of whether $X$ and $Y$ are independent.
- Problem: A lottery ticket costs $1. The probability of winning $100 is $\frac{1}{100}$. What is the expected gain?
- Solution:
Random Variable $X$: Net gain from the lottery.
Possible outcomes:
- Win: Gain $99 ($100 prize - $1 cost), with probability $\frac{1}{100}$.
- Lose: Lose $1, with probability $\frac{99}{100}$.
Expected Value:
\[ E[X] = 99 \times \frac{1}{100} + (-1) \times \frac{99}{100} = \frac{99}{100} - \frac{99}{100} = 0 \]
Interpretation: On average, you break even.
- Problem: In Mahjong, a player scores points based on specific tile combinations. Suppose completing a certain hand has a probability of $0.05$ and yields 800 points, while failing yields 0 points. What is the expected score?
- Solution:
- $E[X] = 800 \times 0.05 + 0 \times 0.95 = 40$.
An indicator random variable $I_A$ for an event $A$ is defined as:
\[ I_A = \begin{cases} 1 & \text{if } A \text{ occurs}, \\ 0 & \text{otherwise}. \end{cases} \]
- Expected Value: $E[I_A] = P(A)$.
- Simplifies calculation of expectations involving events.
- Problem: At a party, $n$ guests check their hats. The hats are returned randomly. What is the expected number of guests who get their own hat back?
- Solution:
Let $I_i$ be the indicator variable that guest $i$ gets their own hat.
$P(I_i = 1) = \frac{1}{n}$.
Total expected guests with correct hats:
\[ E\left[ \sum_{i=1}^{n} I_i \right] = \sum_{i=1}^{n} E[I_i] = n \times \frac{1}{n} = 1 \]
Interpretation: On average, one guest gets their own hat back.
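The prediction from linearity of expectation can be sanity-checked with a Monte Carlo simulation (the trial count and seed below are arbitrary choices for the sketch):

```python
import random

def average_matches(n, trials=20_000, seed=42):
    """Monte Carlo estimate of the expected number of guests
    who receive their own hat under a random permutation."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        hats = list(range(n))
        rng.shuffle(hats)
        total += sum(1 for guest, hat in enumerate(hats) if guest == hat)
    return total / trials

est = average_matches(10)
assert abs(est - 1.0) < 0.1  # linearity of expectation predicts exactly 1
```

Notably, the expectation is 1 regardless of $n$, even though the indicator variables $I_i$ are not independent.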
- Definition: An experiment with exactly two possible outcomes: “success” or “failure”.
- Probability of Success: $p$.
- Probability of Failure: $q = 1 - p$.
Definition: The distribution of the number of successes in $n$ independent Bernoulli trials.
Probability Mass Function:
\[ P(X = k) = \binom{n}{k} p^k q^{n - k} \]
Mean and Variance:
- $E[X] = np$.
- $\text{Var}(X) = npq$.
- Problem: A fair coin is tossed 10 times. What is the probability of getting exactly 6 heads?
- Solution:
- $n = 10$, $p = 0.5$.
- $P(X = 6) = \binom{10}{6} (0.5)^6 (0.5)^4 = \binom{10}{6} (0.5)^{10} = \frac{210}{1024} \approx 0.205$.
- Problem: A factory produces bulbs with a 2% defect rate. If 100 bulbs are tested, what is the expected number of defective bulbs?
- Solution:
$n = 100$, $p = 0.02$.
Expected Defects:
\[ E[X] = np = 100 \times 0.02 = 2 \]
- Question: In a group of $n$ people, what is the probability that at least two people share the same birthday?
Complement Approach:
- Calculate the probability $P(\text{No shared birthdays})$.
- Then, $P(\text{At least one shared birthday}) = 1 - P(\text{No shared birthdays})$.
Calculation:
\[ P(\text{No shared birthdays}) = \frac{365}{365} \times \frac{364}{365} \times \cdots \times \frac{365 - n + 1}{365} \]
- Finding Minimal $n$:
- For $P(\text{At least one shared birthday}) > 0.5$, $n = 23$.
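The complement-based product can be evaluated numerically to confirm the threshold at $n = 23$:

```python
def p_shared(n):
    """Probability that at least two of n people share a birthday
    (365 equally likely birthdays assumed)."""
    p_none = 1.0
    for i in range(n):
        p_none *= (365 - i) / 365
    return 1 - p_none

# Find the smallest n with probability above 1/2.
n = 1
while p_shared(n) <= 0.5:
    n += 1

assert n == 23
assert p_shared(22) < 0.5 < p_shared(23)
```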
- Hash Functions: Understanding collisions in hash tables.
- Cryptography: Birthday attacks exploit the mathematics behind the birthday problem.
- Definition: A collision occurs when two inputs produce the same hash output.
Problem: With $n$ inputs and $k$ hash values, what is the probability of at least one collision?
Approximation:
\[ P(\text{At least one collision}) \approx 1 - e^{-n^2 / (2k)} \]
- Problem: Hashing 1000 keys into 10,000 slots.
- Solution:
- Exponent: $\frac{n^2}{2k} = \frac{1000^2}{2 \times 10{,}000} = 50$.
- $P(\text{At least one collision}) \approx 1 - e^{-50} \approx 1$: a collision is virtually certain.
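The approximation shows that with 1000 keys and 10,000 slots a collision is virtually certain, since the exponent is 50:

```python
import math

def p_collision(n, k):
    """Birthday-style approximation: 1 - exp(-n^2 / (2k))."""
    return 1 - math.exp(-n**2 / (2 * k))

p = p_collision(1000, 10_000)  # exponent is 50, so p is essentially 1
assert p > 0.999
```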
- Concept: Update the probability estimate for a hypothesis as more evidence or information becomes available.
Prior Probability ($P(H)$): Initial belief about the hypothesis.
Likelihood ($P(D|H)$): Probability of the data given the hypothesis.
Posterior Probability ($P(H|D)$): Updated belief after observing data.
- Spam Classification:
- Hypothesis: An email is spam.
- Data: Presence of certain words.
- Goal: Compute $P(\text{Spam}|\text{Data})$ to classify emails.
- Experiment: A procedure with uncertain outcomes.
- Sample Space: All possible outcomes.
- Event: A subset of the sample space.
- Addition Rule: For combining probabilities of events.
- Complement Rule: Probability of an event not occurring.
- Conditional Probability: Probability given some condition.
- Independence: Events not affecting each other.
- Bayes’ Theorem: Updating probabilities with new information.
- Random Variables: Assign numerical values to outcomes.
- Expectation: Long-run average value of a random variable.
- Bernoulli Trials: Experiments with two outcomes.
- Binomial Distribution: Number of successes in Bernoulli trials.
- Birthday Problem: Probability of shared birthdays.
- Hashing: Collisions in hash functions.
- Machine Learning: Bayesian inference in model updating.
Conditional Probability with Cards:
- Problem: Two cards are drawn sequentially from a standard deck without replacement. What is the probability that the second card is an Ace, given that the first card is a King?
- Solution:
- $P(\text{Second is Ace}|\text{First is King}) = \frac{4}{51}$.
Independence Verification:
- Problem: Verify if the events “rolling an even number” and “rolling a number greater than 3” are independent when rolling a single die.
- Solution:
- $P(\text{Even}) = \frac{3}{6} = \frac{1}{2}$.
- $P(>3) = \frac{3}{6} = \frac{1}{2}$.
- $P(\text{Even and } >3) = \frac{2}{6} = \frac{1}{3}$.
- Since $\frac{1}{3} \neq \frac{1}{2} \times \frac{1}{2}$, events are not independent.
Expected Value in Games:
- Problem: In a game, you draw a tile that can score you 0, 10, or 50 points with probabilities 0.6, 0.3, and 0.1 respectively. What is your expected score per draw?
- Solution:
- $E[X] = 0 \times 0.6 + 10 \times 0.3 + 50 \times 0.1 = 0 + 3 + 5 = 8$.
Binomial Probability:
- Problem: If the probability of success on a single trial is 0.2, what is the probability of exactly 2 successes in 5 trials?
- Solution:
- $P(X = 2) = \binom{5}{2} (0.2)^2 (0.8)^3 = 10 \times 0.04 \times 0.512 = 0.2048$.
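The arithmetic checks out with `math.comb`:

```python
from math import comb

# P(X = 2) for Binomial(n=5, p=0.2).
p = comb(5, 2) * 0.2**2 * 0.8**3
assert abs(p - 0.2048) < 1e-9
```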
Applying Bayes’ Theorem:
- Problem: A factory has two machines producing widgets. Machine A produces 60% of widgets with a 2% defect rate, and Machine B produces 40% of widgets with a 5% defect rate. If a randomly selected widget is defective, what is the probability it was produced by Machine B?
- Solution:
- $P(\text{Defective}|A) = 0.02$, $P(\text{Defective}|B) = 0.05$.
- $P(A) = 0.6$, $P(B) = 0.4$.
- $P(B|\text{Defective}) = \frac{P(\text{Defective}|B)P(B)}{P(\text{Defective})}$.
- $P(\text{Defective}) = 0.02 \times 0.6 + 0.05 \times 0.4 = 0.012 + 0.020 = 0.032$.
- $P(B|\text{Defective}) = \frac{0.05 \times 0.4}{0.032} = \frac{0.020}{0.032} = 0.625$.
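The two-machine computation is another direct use of total probability plus Bayes' Theorem:

```python
# Production shares and defect rates for the two machines.
p_a, p_b = 0.6, 0.4
p_def_given_a, p_def_given_b = 0.02, 0.05

# Total probability of a defect, then Bayes' theorem for Machine B.
p_def = p_def_given_a * p_a + p_def_given_b * p_b
p_b_given_def = p_def_given_b * p_b / p_def

assert abs(p_def - 0.032) < 1e-9
assert abs(p_b_given_def - 0.625) < 1e-9
```

So although Machine B produces fewer widgets, its higher defect rate makes it the more likely source of a defective one.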
- Outcome: A possible result of an experiment.
- Event: A set of outcomes.
- Sample Space: The set of all possible outcomes.
- Probability: A measure of the likelihood of an event.
- $P(A)$: Probability of event $A$.
- $P(A|B)$: Probability of $A$ given $B$.
- $E[X]$: Expected value of random variable $X$.
- $\binom{n}{k}$: Number of combinations choosing $k$ from $n$.
Probability provides a powerful framework for understanding and quantifying uncertainty in various contexts. From simple experiments like rolling dice to complex applications in machine learning and cryptography, mastering the principles of probability is essential for analyzing and interpreting real-world phenomena.
By studying the definitions, rules, and examples provided in this lecture, you should have a solid foundation to tackle probability problems and appreciate its applications across different fields.