Bayesian Probability
Bayesian probability offers a powerful framework for understanding and working with uncertainty in data science. Unlike the frequentist approach, which views probability as the long-run frequency of events, Bayesian probability interprets probability as a degree of belief or certainty about an event. This article explores the Bayesian approach to probability, contrasts it with the frequentist perspective, and demonstrates how Bayesian reasoning can be applied to real-world data science problems.
Bayesian vs. Frequentist Probability
Frequentist Perspective
The frequentist interpretation of probability is the most common approach in classical statistics. It defines probability as the limit of the relative frequency of an event occurring as the number of trials approaches infinity. For example, the probability of flipping a fair coin and getting heads is $0.5$, because in an infinite number of flips, half of them would land on heads.
Key Characteristics of Frequentist Probability:
- Objective: Probability is considered an inherent property of the physical world, independent of observers or prior knowledge.
- Focus on Long-Run Frequencies: Probabilities are derived from the long-run frequency of events.
- Hypothesis Testing: Frequentist methods, such as p-values and confidence intervals, are used to test hypotheses without incorporating prior beliefs or knowledge.
Example: Frequentist Approach to Coin Flipping
Consider a scenario where you flip a coin 100 times, and it lands on heads 60 times. A frequentist would estimate the probability of heads as:

$$\hat{p} = \frac{60}{100} = 0.6$$
This estimate is based purely on the observed frequency of heads in the sample, without considering any prior information about the coin.
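This estimate can be computed directly, along with a standard frequentist 95% confidence interval. The normal approximation used below is a common textbook choice for illustration, not something the article prescribes:

```python
import math

# Frequentist point estimate: the relative frequency of heads in the sample.
heads, flips = 60, 100
p_hat = heads / flips                        # 60 / 100 = 0.6

# 95% confidence interval via the normal approximation to the binomial.
se = math.sqrt(p_hat * (1 - p_hat) / flips)  # standard error of p_hat
z = 1.96                                     # 95% standard-normal quantile
ci_low, ci_high = p_hat - z * se, p_hat + z * se

print(f"p_hat = {p_hat}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

For this sample the interval is roughly (0.504, 0.696). Note that nothing in the calculation depends on prior information about the coin.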
Bayesian Perspective
The Bayesian interpretation of probability, named after Reverend Thomas Bayes, treats probability as a measure of belief or certainty about an event, which can be updated as new evidence is presented. Bayesian probability is fundamentally subjective, depending on the prior beliefs of the observer.
Key Characteristics of Bayesian Probability:
- Subjective: Probability represents a degree of belief, which can vary between individuals based on prior information.
- Bayes’ Theorem: Central to Bayesian reasoning, allowing for the updating of beliefs based on new evidence.
- Incorporation of Prior Knowledge: Bayesian methods integrate prior beliefs with observed data to form a posterior belief.
Example: Bayesian Approach to Coin Flipping
Using the same coin-flipping scenario, a Bayesian would start with a prior belief about the fairness of the coin, say $P(\text{heads}) = 0.5$. After observing 60 heads out of 100 flips, the Bayesian would update this belief using Bayes' Theorem, potentially arriving at a different estimate depending on the strength of the prior belief.
Clarifying the Hypothesis:
Let $\theta$ be the probability of heads. Instead of fixing $\theta$, we treat $\theta$ as a continuous parameter with its own prior distribution. This approach aligns with Bayesian practice by allowing $\theta$ to vary and be updated based on observed data.
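With $\theta$ treated as continuous, one concrete and widely used choice (an assumption here, not mandated by the article) is a Beta prior, which is conjugate to the binomial likelihood: the posterior is again a Beta distribution whose parameters are updated by simple counting.

```python
# Beta-Binomial update: theta, the probability of heads, gets a Beta(a, b)
# prior. After observing `heads` successes in `flips` trials, the posterior
# is Beta(a + heads, b + flips - heads) -- no numerical integration needed.
a, b = 2.0, 2.0          # Beta(2, 2) prior: mildly favors a fair coin (assumed)
heads, flips = 60, 100   # observed data from the running example

a_post = a + heads                 # 62
b_post = b + (flips - heads)       # 42

posterior_mean = a_post / (a_post + b_post)
print(f"posterior mean of theta = {posterior_mean:.4f}")
```

The posterior mean (about 0.596) sits between the prior mean (0.5) and the sample frequency (0.6); a stronger prior would pull the estimate further toward 0.5.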
Bayes' Theorem
Bayes' Theorem is the foundation of Bayesian probability. It describes how to update the probability of a hypothesis based on new evidence.
Formula
Bayes' Theorem is mathematically expressed as:

$$P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}$$
Where:
- $P(H \mid E)$ is the posterior probability: the probability of the hypothesis $H$ given the evidence $E$.
- $P(E \mid H)$ is the likelihood: the probability of the evidence $E$ assuming the hypothesis $H$ is true.
- $P(H)$ is the prior probability: the initial belief about the hypothesis before seeing the evidence.
- $P(E)$ is the marginal likelihood or evidence: the total probability of the evidence under all possible hypotheses.
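The theorem can be checked numerically with two discrete hypotheses about the coin. The biased alternative ($p = 0.7$) and the 50/50 priors below are illustrative assumptions, not part of the article's setup:

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n flips when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Two competing hypotheses about the coin.
prior = {"fair": 0.5, "biased": 0.5}    # P(H), assumed priors
p_heads = {"fair": 0.5, "biased": 0.7}  # per-flip heads probability under each H

# Evidence E: 7 heads in 10 flips. Likelihood P(E | H) for each hypothesis.
likelihood = {h: binom_pmf(7, 10, p) for h, p in p_heads.items()}

# Marginal likelihood P(E): total probability of the evidence.
evidence = sum(likelihood[h] * prior[h] for h in prior)

# Posterior P(H | E) via Bayes' Theorem.
posterior = {h: likelihood[h] * prior[h] / evidence for h in prior}
print(posterior)
```

Seeing 7 heads in 10 flips shifts belief toward the biased hypothesis: $P(\text{fair} \mid E)$ drops from 0.5 to about 0.305, and the two posteriors still sum to 1.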
Example: Bayesian Coin Flipping with Bayes' Theorem
Suppose you initially believe that a coin is fair ($p = 0.5$). After flipping the coin 10 times, it lands on heads 7 times. You want to update your belief about the fairness of the coin using Bayes' Theorem.
Let $H$ be the hypothesis that the probability of heads is $p = 0.5$. The prior probability might be $P(H) = 0.5$, reflecting initial uncertainty, and the likelihood $P(E \mid H)$ is calculated based on the binomial distribution: