Discrete Probability
Probability is a fundamental concept in mathematics that deals with uncertainty and the likelihood of events occurring. It provides a framework for quantifying and reasoning about uncertainty, which is essential in fields like statistics, computer science, finance, engineering, and more.
This comprehensive lecture will cover the foundational principles of probability, progressing from basic to advanced topics:
- Introduction to Probability
- Basic Probability Definitions
- Rules of Probability
- Conditional Probability
- Independence
- Bayes’ Theorem
- Random Variables
- Expectation (Expected Value)
- Bernoulli Trials and Binomial Distribution
- Advanced Applications
Experiment: A procedure that yields one out of several possible outcomes. It’s an action or process whose outcome is uncertain.
Sample Space ($S$): The set of all possible outcomes of an experiment.
Event: A subset of the sample space. An event occurs if the outcome of the experiment is an element of the event set.
An outcome is a single possible result of an experiment, while an event can consist of one or more outcomes.
- Experiment: Rolling two six-sided dice, one red and one blue.
- Sample Space ($S$):
- All possible ordered pairs $(x, y)$ where $x$ is the outcome on the red die and $y$ is the outcome on the blue die.
- Total outcomes: $6 \times 6 = 36$.
- Event: Getting a sum of 7.
- Event set $E$: $\{(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)\}$.
- Experiment: Drawing a single card from a standard deck of 52 playing cards.
- Sample Space ($S$):
- All 52 unique cards.
- Event: Drawing a heart.
- Event set $E$: All 13 heart cards in the deck.
- Experiment: Drawing a tile from a complete set of Mahjong tiles.
- Sample Space ($S$):
- All 136 tiles in a standard Mahjong set (excluding the optional flower and season tiles).
- Event: Drawing a Dragon tile.
- Event set $E$: The 12 Dragon tiles (4 Red, 4 Green, 4 White).
A probability distribution assigns a probability to each outcome in the sample space.
- Properties:
- $0 \leq P(s) \leq 1$ for every $s \in S$.
- $\sum_{s \in S} P(s) = 1$.
The probability of an event $E$ is the sum of the probabilities of the outcomes in $E$:
\[ P(E) = \sum_{s \in E} P(s) \]
- Definition: A distribution where every outcome in the sample space is equally likely.
- Probability of each outcome: If $|S| = n$, then $P(s) = \frac{1}{n}$ for all $s \in S$.
- Experiment: Rolling a fair six-sided die.
- Sample Space: $S = \{1, 2, 3, 4, 5, 6\}$.
- Probability of each outcome: $P(s) = \frac{1}{6}$.
- Experiment: Randomly selecting a piece from all the pieces used in Xiangqi.
- Sample Space: All 32 pieces (16 red and 16 black pieces).
- Uniform Distribution: Each piece has a probability of $\frac{1}{32}$.
- Definition: A distribution where outcomes have different probabilities.
- Application: Used when certain outcomes are more likely than others.
- Suppose a six-sided die is weighted so that the number 6 comes up twice as often as the other numbers.
- Assigning Probabilities:
- For numbers 1-5: $P(s) = \frac{1}{7}$.
- For number 6: $P(6) = \frac{2}{7}$.
- Verification:
- Total probability: $5 \times \frac{1}{7} + \frac{2}{7} = 1$.
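The weighted-die distribution can be checked with a short Python sketch (the integer weights 1 for faces 1–5 and 2 for face 6 are an assumption encoding "6 comes up twice as often"):

```python
from fractions import Fraction

# Weights: faces 1-5 get weight 1, face 6 gets weight 2 (twice as likely).
weights = {s: 1 for s in range(1, 6)}
weights[6] = 2
total = sum(weights.values())  # 7

# Normalize weights into exact probabilities.
probs = {s: Fraction(w, total) for s, w in weights.items()}

assert probs[1] == Fraction(1, 7)
assert probs[6] == Fraction(2, 7)
assert sum(probs.values()) == 1  # a valid probability distribution
```

Using `Fraction` keeps the arithmetic exact, so the verification that the probabilities sum to 1 is not subject to floating-point rounding.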
Definition: Events $A$ and $B$ are mutually exclusive if $A \cap B = \emptyset$.
Rule:
\[ P(A \cup B) = P(A) + P(B) \]
General Addition Rule (for any two events, not necessarily mutually exclusive):
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
- Problem: What is the probability of drawing a card that is a King or a Heart from a standard deck?
- Solution:
- $P(\text{King}) = \frac{4}{52} = \frac{1}{13}$.
- $P(\text{Heart}) = \frac{13}{52} = \frac{1}{4}$.
- $P(\text{King} \cap \text{Heart}) = \frac{1}{52}$.
- $P(\text{King} \cup \text{Heart}) = \frac{1}{13} + \frac{1}{4} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$.
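The inclusion–exclusion result can be confirmed by enumerating the deck directly (the rank/suit encoding below is a hypothetical representation chosen for the sketch):

```python
from fractions import Fraction
from itertools import product

# Encode a standard deck as (rank, suit) pairs.
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))
assert len(deck) == 52

# Count cards in the union of the two events: King or Heart.
king_or_heart = [c for c in deck if c[0] == 'K' or c[1] == 'hearts']
p = Fraction(len(king_or_heart), len(deck))

assert len(king_or_heart) == 16  # 4 Kings + 13 Hearts - 1 overlap
assert p == Fraction(4, 13)
```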
Definition: The complement of an event $A$, denoted $A^c$, consists of all outcomes not in $A$.
Rule:
\[ P(A^c) = 1 - P(A) \]
- Problem: What is the probability of not rolling a six on a fair die?
- Solution:
- $P(\text{Six}) = \frac{1}{6}$.
- $P(\text{Not Six}) = 1 - \frac{1}{6} = \frac{5}{6}$.
For Three Events:
\[ P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C) \]
- Problem: In a group of people:
- 30 speak English.
- 25 speak Mandarin.
- 20 speak Spanish.
- 10 speak both English and Mandarin.
- 5 speak both English and Spanish.
- 3 speak both Mandarin and Spanish.
- 2 speak all three languages.
- Question: What is the probability that a randomly selected person speaks at least one of the languages?
- Solution:
- By the Inclusion-Exclusion Principle, the number who speak at least one language is $30 + 25 + 20 - 10 - 5 - 3 + 2 = 59$, so the group must contain at least 59 people; suppose $N = 60$.
- Then $P(\text{At least one language}) = \frac{59}{60}$.
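The counting step can be checked directly; note that the inclusion–exclusion total comes to 59, so any stated group size must be at least that large:

```python
# Counts from the problem: singles, pairwise overlaps, triple overlap.
e, m, s = 30, 25, 20
em, es, ms = 10, 5, 3
ems = 2

# Inclusion-Exclusion Principle for three sets.
at_least_one = e + m + s - em - es - ms + ems
assert at_least_one == 59  # the group size must be at least 59
```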
Conditional Probability: The probability of event $A$ given that event $B$ has occurred.
Formula:
\[ P(A|B) = \frac{P(A \cap B)}{P(B)}, \quad \text{provided } P(B) > 0 \]
- Problem: From a standard deck, what is the probability that a drawn card is a Queen, given that it’s a face card?
- Solution:
- $P(\text{Queen}|\text{Face card}) = \frac{P(\text{Queen} \cap \text{Face card})}{P(\text{Face card})}$.
- Total face cards: 12 (Jacks, Queens, Kings of each suit).
- Total Queens: 4.
- $P(\text{Queen}|\text{Face card}) = \frac{4/52}{12/52} = \frac{4}{12} = \frac{1}{3}$.
- Problem: Two dice are rolled. What is the probability that the sum is 8, given that the first die shows a 5?
- Solution:
- Sample space is reduced to outcomes where the first die is 5.
- Possible outcomes: $(5,1), (5,2), (5,3), (5,4), (5,5), (5,6)$.
- $P(\text{Sum is 8}|\text{First die is 5}) = \frac{1}{6}$, since only $(5,3)$ sums to 8.
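The reduced-sample-space argument can be reproduced by enumeration:

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # 36 ordered outcomes
given = [r for r in rolls if r[0] == 5]        # condition: first die shows 5
favorable = [r for r in given if sum(r) == 8]  # sum is 8 within the reduced space

p = Fraction(len(favorable), len(given))
assert len(given) == 6
assert p == Fraction(1, 6)
```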
Two events $A$ and $B$ are independent if the occurrence of one does not affect the probability of the occurrence of the other.
Mathematically:
\[ P(A \cap B) = P(A) \times P(B) \]
- Equivalent Conditions:
- $P(A|B) = P(A)$.
- $P(B|A) = P(B)$.
- Problem: Are the events “the first die shows a 3” and “the second die shows a 5” independent when rolling two dice?
- Solution:
- $P(\text{First die is 3}) = \frac{1}{6}$.
- $P(\text{Second die is 5}) = \frac{1}{6}$.
- $P(\text{First die is 3 and second die is 5}) = \frac{1}{36}$.
- Since $\frac{1}{36} = \frac{1}{6} \times \frac{1}{6}$, the events are independent.
- Problem: Are the events “drawing an Ace” and “drawing a Spade” independent when drawing one card from a deck?
- Solution:
- $P(\text{Ace}) = \frac{4}{52}$.
- $P(\text{Spade}) = \frac{13}{52}$.
- $P(\text{Ace of Spades}) = \frac{1}{52}$.
- Since $\frac{1}{52} \neq \frac{4}{52} \times \frac{13}{52}$, the events are dependent.
Bayes’ Theorem relates the conditional and marginal probabilities of stochastic events.
Formula:
\[ P(A|B) = \frac{P(B|A)P(A)}{P(B)} \]
Starting from the definition of conditional probability:
\[ P(A \cap B) = P(A|B)P(B) = P(B|A)P(A) \]
Rearranging gives Bayes’ Theorem.
- Prior Probability ($P(A)$): Initial probability before new evidence.
- Posterior Probability ($P(A|B)$): Updated probability after considering new evidence $B$.
- Problem: A disease affects 1% of the population. A test detects the disease with 99% accuracy (true positive rate), but has a 5% false positive rate. What is the probability that a person who tests positive actually has the disease?
- Solution:
- Let $D$ be the event of having the disease.
- $P(D) = 0.01$.
- $P(\text{Positive}|D) = 0.99$.
- $P(\text{Positive}|D^c) = 0.05$.
- $P(D|\text{Positive}) = \frac{P(\text{Positive}|D)P(D)}{P(\text{Positive}|D)P(D) + P(\text{Positive}|D^c)P(D^c)}$.
- $P(D|\text{Positive}) = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.05 \times 0.99} = \frac{0.0099}{0.0594} = \frac{1}{6} \approx 0.167$.
- Despite the test's accuracy, a positive result implies only about a 16.7% chance of disease, because the disease is rare.
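The computation is a direct application of the law of total probability followed by Bayes' Theorem; a minimal sketch:

```python
# Given quantities from the problem statement.
p_d = 0.01                 # prevalence P(D)
p_pos_given_d = 0.99       # true positive rate
p_pos_given_not_d = 0.05   # false positive rate

# Law of total probability for P(Positive), then Bayes' theorem.
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)
p_d_given_pos = p_pos_given_d * p_d / p_pos

assert abs(p_d_given_pos - 1/6) < 1e-9  # about 16.7%
```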
- Problem: An email filter uses Bayesian analysis to determine if an email is spam based on certain keywords.
- Solution:
- $P(\text{Spam}|\text{Keywords}) = \frac{P(\text{Keywords}|\text{Spam})P(\text{Spam})}{P(\text{Keywords})}$.
- Helps in updating the belief about the email being spam given the presence of certain keywords.
A random variable is a function that assigns a numerical value to each outcome in the sample space.
- Notation: Random variables are usually denoted by uppercase letters like $X$, $Y$.
- Definition: A random variable that can take on a countable number of values.
Definition: A function that gives the probability that a discrete random variable is exactly equal to some value.
Formula:
\[ P(X = x) = p(x) \]
- Experiment: Roll two fair six-sided dice.
- Random Variable: $X =$ sum of the numbers on the two dice.
- Possible Values: $X$ can take on integer values from 2 to 12.
- PMF: Calculate $P(X = k)$ for each possible sum $k$.
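The PMF of the dice sum can be tabulated by counting over the 36 equally likely ordered outcomes:

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# Tally each possible sum over the 36 equally likely ordered outcomes.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {k: Fraction(c, 36) for k, c in counts.items()}

assert pmf[7] == Fraction(1, 6)   # six outcomes sum to 7
assert pmf[2] == Fraction(1, 36)  # only (1,1) sums to 2
assert sum(pmf.values()) == 1
```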
Experiment: Toss a fair coin 5 times.
Sample Space ($S$): The possible outcomes for each toss are “heads” (H) or “tails” (T). Each toss is independent, so there are $2^5 = 32$ possible outcomes (e.g., HHTHT, TTTTT, etc.).
Random Variable ($X$): Let $X$ be the number of heads in 5 coin tosses. $X$ can take values from 0 to 5, because you could get anywhere from 0 to 5 heads.
Possible Values:
- $X = 0$: No heads (all tails).
- $X = 1$: One head and four tails.
- $X = 2$: Two heads and three tails.
- $\dots$
- $X = 5$: All heads.
Probability Mass Function (PMF): The probability of getting exactly $k$ heads in 5 coin tosses is given by the binomial distribution:
\[ P(X = k) = \binom{5}{k} \left(\frac{1}{2}\right)^k \left(\frac{1}{2}\right)^{5-k} = \binom{5}{k} \left(\frac{1}{2}\right)^5 \]
where $k$ is the number of heads, and $\binom{5}{k}$ is the binomial coefficient representing the number of ways to choose $k$ heads from 5 tosses.
For example:
- $P(X = 0) = \binom{5}{0} \left(\frac{1}{2}\right)^5 = \frac{1}{32}$
- $P(X = 2) = \binom{5}{2} \left(\frac{1}{2}\right)^5 = \frac{10}{32}$
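This PMF is easy to compute exactly with the binomial coefficient from the standard library:

```python
from fractions import Fraction
from math import comb

def pmf_heads(k, n=5):
    """P(X = k) for k heads in n fair coin tosses."""
    return Fraction(comb(n, k), 2**n)

assert pmf_heads(0) == Fraction(1, 32)
assert pmf_heads(2) == Fraction(10, 32)
assert sum(pmf_heads(k) for k in range(6)) == 1  # probabilities sum to 1
```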
The expected value of a random variable $X$ is a measure of the central tendency of its distribution.
Formula:
\[ E[X] = \sum_{x} x \cdot P(X = x) \]
For any two random variables $X$ and $Y$ and real numbers $a$ and $b$:
\[ E[aX + bY] = aE[X] + bE[Y] \]
This holds regardless of whether $X$ and $Y$ are independent.
- Problem: A lottery ticket costs $1. The probability of winning $100 is $\frac{1}{100}$. What is the expected gain?
- Solution:
Random Variable $X$: Net gain from the lottery.
Possible outcomes:
- Win: Gain $99 ($100 prize - $1 cost), with probability $\frac{1}{100}$.
- Lose: Lose $1, with probability $\frac{99}{100}$.
Expected Value:
\[ E[X] = 99 \times \frac{1}{100} + (-1) \times \frac{99}{100} = \frac{99}{100} - \frac{99}{100} = 0 \]
Interpretation: On average, you break even.
- Problem: In Mahjong, a player scores points based on specific tile combinations. Suppose completing a certain hand has a probability of $0.05$ and yields 800 points, while failing yields 0 points. What is the expected score?
- Solution:
- $E[X] = 800 \times 0.05 + 0 \times 0.95 = 40$.
An indicator random variable $I_A$ for an event $A$ is defined as:
\[ I_A = \begin{cases} 1 & \text{if } A \text{ occurs}, \\ 0 & \text{otherwise}. \end{cases} \]
- Expected Value: $E[I_A] = P(A)$.
- Simplifies calculation of expectations involving events.
- Problem: At a party, $n$ guests check their hats. The hats are returned randomly. What is the expected number of guests who get their own hat back?
- Solution:
Let $I_i$ be the indicator variable that guest $i$ gets their own hat.
$P(I_i = 1) = \frac{1}{n}$.
Total expected guests with correct hats:
\[ E\left[ \sum_{i=1}^{n} I_i \right] = \sum_{i=1}^{n} E[I_i] = n \times \frac{1}{n} = 1 \]
Interpretation: On average, one guest gets their own hat back.
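The prediction from linearity of expectation can be sanity-checked with a Monte Carlo simulation (the trial count and seed below are arbitrary choices for the sketch):

```python
import random

def average_matches(n, trials=20_000, seed=42):
    """Monte Carlo estimate of the expected number of guests
    who receive their own hat under a random permutation."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        hats = list(range(n))
        rng.shuffle(hats)
        total += sum(1 for guest, hat in enumerate(hats) if guest == hat)
    return total / trials

est = average_matches(10)
assert abs(est - 1.0) < 0.1  # linearity of expectation predicts exactly 1
```

Notably, the expectation is 1 regardless of $n$, even though the indicator variables $I_i$ are not independent.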
- Definition: An experiment with exactly two possible outcomes: “success” or “failure”.
- Probability of Success: $p$.
- Probability of Failure: $q = 1 - p$.
Definition: The distribution of the number of successes in $n$ independent Bernoulli trials.
Probability Mass Function:
\[ P(X = k) = \binom{n}{k} p^k q^{n - k} \]
Mean and Variance:
- $E[X] = np$.
- $\text{Var}(X) = npq$.
- Problem: A fair coin is tossed 10 times. What is the probability of getting exactly 6 heads?
- Solution:
- $n = 10$, $p = 0.5$.
- $P(X = 6) = \binom{10}{6} (0.5)^6 (0.5)^4 = \binom{10}{6} (0.5)^{10} = \frac{210}{1024} \approx 0.205$.
- Problem: A factory produces bulbs with a 2% defect rate. If 100 bulbs are tested, what is the expected number of defective bulbs?
- Solution:
$n = 100$, $p = 0.02$.
Expected Defects:
\[ E[X] = np = 100 \times 0.02 = 2 \]
- Question: In a group of $n$ people, what is the probability that at least two people share the same birthday?
Complement Approach:
- Calculate the probability $P(\text{No shared birthdays})$.
- Then, $P(\text{At least one shared birthday}) = 1 - P(\text{No shared birthdays})$.
Calculation:
\[ P(\text{No shared birthdays}) = \frac{365}{365} \times \frac{364}{365} \times \cdots \times \frac{365 - n + 1}{365} \]
- Finding Minimal $n$:
- For $P(\text{At least one shared birthday}) > 0.5$, $n = 23$.
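The complement-based product can be evaluated numerically to confirm the threshold at $n = 23$:

```python
def p_shared(n):
    """Probability that at least two of n people share a birthday
    (365 equally likely birthdays assumed)."""
    p_none = 1.0
    for i in range(n):
        p_none *= (365 - i) / 365
    return 1 - p_none

# Find the smallest n with probability above 1/2.
n = 1
while p_shared(n) <= 0.5:
    n += 1

assert n == 23
assert p_shared(22) < 0.5 < p_shared(23)
```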
- Hash Functions: Understanding collisions in hash tables.
- Cryptography: Birthday attacks exploit the mathematics behind the birthday problem.
- Definition: A collision occurs when two inputs produce the same hash output.
Problem: With $n$ inputs and $k$ hash values, what is the probability of at least one collision?
Approximation:
\[ P(\text{At least one collision}) \approx 1 - e^{-n^2 / (2k)} \]
- Problem: Hashing 1000 keys into 10,000 slots.
- Solution:
- Exponent: $\frac{n^2}{2k} = \frac{1000^2}{2 \times 10{,}000} = 50$.
- $P(\text{At least one collision}) \approx 1 - e^{-50} \approx 1$: a collision is virtually certain.
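The approximation shows that with 1000 keys and 10,000 slots a collision is virtually certain, since the exponent is 50:

```python
import math

def p_collision(n, k):
    """Birthday-style approximation: 1 - exp(-n^2 / (2k))."""
    return 1 - math.exp(-n**2 / (2 * k))

p = p_collision(1000, 10_000)  # exponent is 50, so p is essentially 1
assert p > 0.999
```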
- Concept: Update the probability estimate for a hypothesis as more evidence or information becomes available.
Prior Probability ($P(H)$): Initial belief about the hypothesis.
Likelihood ($P(D|H)$): Probability of the data given the hypothesis.
Posterior Probability ($P(H|D)$): Updated belief after observing data.
- Spam Classification:
- Hypothesis: An email is spam.
- Data: Presence of certain words.
- Goal: Compute $P(\text{Spam}|\text{Data})$ to classify emails.
- Experiment: A procedure with uncertain outcomes.
- Sample Space: All possible outcomes.
- Event: A subset of the sample space.
- Addition Rule: For combining probabilities of events.
- Complement Rule: Probability of an event not occurring.
- Conditional Probability: Probability given some condition.
- Independence: Events not affecting each other.
- Bayes’ Theorem: Updating probabilities with new information.
- Random Variables: Assign numerical values to outcomes.
- Expectation: Long-run average value of a random variable.
- Bernoulli Trials: Experiments with two outcomes.
- Binomial Distribution: Number of successes in Bernoulli trials.
- Birthday Problem: Probability of shared birthdays.
- Hashing: Collisions in hash functions.
- Machine Learning: Bayesian inference in model updating.
Conditional Probability with Cards:
- Problem: Two cards are drawn sequentially from a standard deck without replacement. What is the probability that the second card is an Ace, given that the first card is a King?
- Solution:
- $P(\text{Second is Ace}|\text{First is King}) = \frac{4}{51}$.
Independence Verification:
- Problem: Verify if the events “rolling an even number” and “rolling a number greater than 3” are independent when rolling a single die.
- Solution:
- $P(\text{Even}) = \frac{3}{6} = \frac{1}{2}$.
- $P(>3) = \frac{3}{6} = \frac{1}{2}$.
- $P(\text{Even and } >3) = \frac{2}{6} = \frac{1}{3}$.
- Since $\frac{1}{3} \neq \frac{1}{2} \times \frac{1}{2}$, events are not independent.
Expected Value in Games:
- Problem: In a game, you draw a tile that can score you 0, 10, or 50 points with probabilities 0.6, 0.3, and 0.1 respectively. What is your expected score per draw?
- Solution:
- $E[X] = 0 \times 0.6 + 10 \times 0.3 + 50 \times 0.1 = 0 + 3 + 5 = 8$.
Binomial Probability:
- Problem: If the probability of success on a single trial is 0.2, what is the probability of exactly 2 successes in 5 trials?
- Solution:
- $P(X = 2) = \binom{5}{2} (0.2)^2 (0.8)^3 = 10 \times 0.04 \times 0.512 = 0.2048$.
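The arithmetic checks out with `math.comb`:

```python
from math import comb

# P(X = 2) for Binomial(n=5, p=0.2).
p = comb(5, 2) * 0.2**2 * 0.8**3
assert abs(p - 0.2048) < 1e-9
```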
Applying Bayes’ Theorem:
- Problem: A factory has two machines producing widgets. Machine A produces 60% of widgets with a 2% defect rate, and Machine B produces 40% of widgets with a 5% defect rate. If a randomly selected widget is defective, what is the probability it was produced by Machine B?
- Solution:
- $P(\text{Defective}|A) = 0.02$, $P(\text{Defective}|B) = 0.05$.
- $P(A) = 0.6$, $P(B) = 0.4$.
- $P(B|\text{Defective}) = \frac{P(\text{Defective}|B)P(B)}{P(\text{Defective})}$.
- $P(\text{Defective}) = 0.02 \times 0.6 + 0.05 \times 0.4 = 0.012 + 0.020 = 0.032$.
- $P(B|\text{Defective}) = \frac{0.05 \times 0.4}{0.032} = \frac{0.020}{0.032} = 0.625$.
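The two-machine computation is another direct use of total probability plus Bayes' Theorem:

```python
# Production shares and defect rates for the two machines.
p_a, p_b = 0.6, 0.4
p_def_given_a, p_def_given_b = 0.02, 0.05

# Total probability of a defect, then Bayes' theorem for Machine B.
p_def = p_def_given_a * p_a + p_def_given_b * p_b
p_b_given_def = p_def_given_b * p_b / p_def

assert abs(p_def - 0.032) < 1e-9
assert abs(p_b_given_def - 0.625) < 1e-9
```

So although Machine B produces fewer widgets, its higher defect rate makes it the more likely source of a defective one.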
- Outcome: A possible result of an experiment.
- Event: A set of outcomes.
- Sample Space: The set of all possible outcomes.
- Probability: A measure of the likelihood of an event.
- $P(A)$: Probability of event $A$.
- $P(A|B)$: Probability of $A$ given $B$.
- $E[X]$: Expected value of random variable $X$.
- $\binom{n}{k}$: Number of combinations choosing $k$ from $n$.
Probability provides a powerful framework for understanding and quantifying uncertainty in various contexts. From simple experiments like rolling dice to complex applications in machine learning and cryptography, mastering the principles of probability is essential for analyzing and interpreting real-world phenomena.
By studying the definitions, rules, and examples provided in this lecture, you should have a solid foundation to tackle probability problems and appreciate its applications across different fields.