Probability theory is the mathematical framework for quantifying uncertainty and the likelihood of events. It provides a precise language to reason about randomness.
These core terms are the building blocks for analyzing data, building statistical models, interpreting hypothesis tests, and understanding algorithms in machine learning and AI.
This section covers the fundamental vocabulary. We will primarily discuss theoretical probability (based on models like fair coins or dice), but these terms also apply to empirical probability (based on observed data frequencies), which we'll explore later.
Definition: An experiment is any process or action with an uncertain outcome that can be observed and repeated, at least conceptually. Its set of potential outcomes is well-defined.
Purpose: Defines the specific scenario we are analyzing.
Examples:
Flipping a fair coin.
Rolling a standard six-sided die.
Measuring the response time of a web server.
Running an A/B test for a website button (observing one user's action).
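As a rough illustration, the first two experiments above can be simulated in Python. This is only a sketch: the helper names `flip_coin` and `roll_die` are made up for this example, and a seeded pseudo-random generator stands in for the physical coin and die.

```python
import random

def flip_coin(rng):
    """One trial of the coin-flip experiment: returns 'H' or 'T'."""
    return rng.choice(["H", "T"])

def roll_die(rng):
    """One trial of the die-roll experiment: returns an integer from 1 to 6."""
    return rng.randint(1, 6)

# A seeded generator makes the run repeatable; the seed value is arbitrary.
rng = random.Random(0)

# Repeating the experiment means running independent trials.
coin_trials = [flip_coin(rng) for _ in range(5)]
die_trials = [roll_die(rng) for _ in range(5)]

print(coin_trials)
print(die_trials)
```

Each call is one trial: the result is uncertain before the call, but it always comes from a known, well-defined set of possibilities.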
Definition: An outcome is a single, specific possible result of an experiment. Outcomes are mutually exclusive (only one can occur per trial) and collectively exhaustive (together they cover all possibilities).
Definition: A probability function \(P\) assigns a numerical value between 0 and 1 (inclusive) to each event \(E\), where an event is a set of outcomes from the sample space \(S\). The value represents the likelihood of that event occurring, and \(P(E)\) denotes the probability of event \(E\).
Scale:
\(P(E) = 0\): Event \(E\) is impossible.
\(P(E) = 1\): Event \(E\) is certain.
\(0 < P(E) < 1\): Event \(E\) has some chance of occurring.
Basic Rules (Axioms of Probability):
Non-negativity: For any event \(E\), \(P(E) \ge 0\).
Normalization: The probability of the entire sample space \(S\) is 1: \(P(S) = 1\). (Some outcome must occur.)
Additivity for Mutually Exclusive Events: Events are mutually exclusive if they cannot occur at the same time, meaning their intersection is empty (\(A \cap B = \emptyset\)); for example, rolling a 1 and rolling a 6 on a single roll. If events \(A\) and \(B\) are mutually exclusive, then the probability that either \(A\) or \(B\) occurs is \(P(A \cup B) = P(A) + P(B)\).
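Contrast: Rolling an even number (\(A = \{2, 4, 6\}\)) and rolling a number greater than 4 (\(B = \{5, 6\}\)) are not mutually exclusive because the outcome 6 is in both events (\(A \cap B = \{6\}\)). The simple addition rule does not apply directly here: \(P(A) + P(B) = \frac{3}{6} + \frac{2}{6} = \frac{5}{6}\) counts the outcome 6 twice, whereas \(P(A \cup B) = P(\{2, 4, 5, 6\}) = \frac{4}{6}\).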
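The rules above can be checked numerically for a fair six-sided die. The following is a minimal sketch, assuming every outcome has weight 1/6; the function name `prob` and the variable names are chosen for this example only. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Sample space of a standard six-sided die.
S = frozenset(range(1, 7))

def prob(event):
    """P(E) for a fair die: each outcome in S is assumed equally likely."""
    event = frozenset(event)
    if not event <= S:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(S))

# Non-negativity and normalization.
print(prob(set()) >= 0)   # True
print(prob(S))            # 1

# Mutually exclusive events: rolling a 1 and rolling a 6.
one, six = {1}, {6}
print(prob(one | six) == prob(one) + prob(six))  # True

# Overlapping events: even number vs. number greater than 4.
even, gt4 = {2, 4, 6}, {5, 6}
union = prob(even | gt4)         # P({2, 4, 5, 6})
naive = prob(even) + prob(gt4)   # counts the outcome 6 twice
print(union, naive)              # 2/3 5/6
```

In the overlapping case the naive sum exceeds the true probability by exactly \(P(A \cap B) = \frac{1}{6}\), the probability of the shared outcome 6.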
In many simple experiments where all individual outcomes in the sample space are considered equally likely (like a fair coin or a fair die), the theoretical probability of an event E is calculated as:
$$ P(E) = \frac{\text{Number of outcomes in event E}}{\text{Total number of outcomes in the Sample Space S}} $$
Example: For a fair die, \(P(\text{Rolling an even number}) = P(\{2, 4, 6\}) = \frac{3}{6} = 0.5\).
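The counting formula translates almost directly into code. The sketch below assumes equally likely outcomes; `classical_probability` is a hypothetical helper name, not a library function. For comparison it also estimates the same probability from simulated rolls, as a small preview of the empirical view.

```python
import random
from fractions import Fraction

def classical_probability(event, sample_space):
    """Favorable outcomes divided by total outcomes (equally likely case only)."""
    sample_space = set(sample_space)
    favorable = set(event) & sample_space  # ignore anything outside S
    return Fraction(len(favorable), len(sample_space))

die = range(1, 7)
p_even = classical_probability({2, 4, 6}, die)
print(p_even, float(p_even))  # 1/2 0.5

# Empirical estimate from simulated rolls (seed and trial count are arbitrary).
rng = random.Random(42)
n_trials = 10_000
rolls = [rng.randint(1, 6) for _ in range(n_trials)]
freq_even = sum(r % 2 == 0 for r in rolls) / n_trials
print(freq_even)  # close to 0.5, but not exactly equal
```

The theoretical value comes from the model alone, while the observed frequency varies from run to run and only approaches 0.5 as the number of trials grows.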
Primary Textbooks: (e.g., "Probability and Statistics for Engineers and Scientists" by Walpole et al.; "A First Course in Probability" by Sheldon Ross)