🎲 Naive Bayes

Master probabilistic classification, Bayes' theorem, and feature independence assumptions for robust predictions


Naive Bayes Curriculum

10 Core Units · ~50 Key Concepts · 5+ NB Variants · 20+ Practical Examples

Unit 1: Probability Fundamentals

Build a foundation in the probability theory essential for understanding Naive Bayes.

  • Basic probability concepts
  • Conditional probability
  • Joint probability
  • Marginal probability
  • Independence assumption
  • Probability distributions
  • Maximum likelihood estimation
  • Prior and posterior probabilities

Unit 2: Bayes' Theorem

Master the mathematical foundation that powers Naive Bayes classification; a short worked sketch follows the topic list below.

  • Bayes' theorem derivation
  • Prior probability
  • Likelihood function
  • Posterior probability
  • Evidence (marginal likelihood)
  • Bayesian inference
  • Bayesian vs frequentist
  • Real-world applications
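
A minimal worked sketch of the theorem in Python; the spam-filter probabilities below are illustrative assumptions, not estimates from real data:

# Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)
def bayes_posterior(prior, likelihood, evidence):
  """Return the posterior P(H|E) from prior, likelihood, and evidence."""
  return likelihood * prior / evidence

# Illustrative spam-filter numbers (assumed for this example)
p_spam = 0.2               # prior: P(spam)
p_word_given_spam = 0.6    # likelihood: P("free" | spam)
p_word_given_ham = 0.05    # P("free" | not spam)

# Evidence via the law of total probability
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

print(f"P(spam | 'free') = {bayes_posterior(p_spam, p_word_given_spam, p_word):.3f}")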

Unit 3: Naive Independence Assumption

Understand the "naive" assumption and its implications for classification; a brief code sketch follows this list.

  • Feature independence assumption
  • Why "naive"?
  • Conditional independence
  • Real-world violations
  • Impact on performance
  • When assumption works
  • Robustness to violations
  • Computational benefits
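
A minimal sketch of the factorized likelihood under conditional independence; the per-feature probabilities are made up for illustration:

# The "naive" factorization: P(x1, ..., xn | y) ≈ Π_i P(xi | y)
import numpy as np

# Assumed per-feature conditional probabilities for one class (illustrative)
p_feature_given_class = [0.8, 0.3, 0.6, 0.9]

# With the independence assumption, the class-conditional likelihood of
# observing all four feature values together is just the product
likelihood = np.prod(p_feature_given_class)
print(f"P(x | y) under independence = {likelihood:.4f}")

# In practice we sum logs instead of multiplying, to avoid numerical underflow
log_likelihood = np.sum(np.log(p_feature_given_class))
print(f"log P(x | y) = {log_likelihood:.4f}")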

Unit 4: Naive Bayes Classification

Learn how to apply Naive Bayes to classification problems, as sketched in the example after this list.

  • Classification framework
  • Maximum a posteriori (MAP)
  • Decision rule
  • Class prediction
  • Probability estimation
  • Training process
  • Prediction process
  • Multiclass classification
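
A minimal MAP-classification sketch; the priors and per-feature probabilities below are assumed toy parameters, not values learned from data:

# MAP decision rule: y_hat = argmax_y [ log P(y) + sum_i log P(x_i | y) ]
import numpy as np

priors = {"spam": 0.3, "ham": 0.7}
p_feature_given_class = {
  "spam": np.array([0.8, 0.6, 0.1]),   # P(feature_i present | spam)
  "ham":  np.array([0.1, 0.3, 0.7]),   # P(feature_i present | ham)
}

def predict(x):
  """Return the MAP class for a binary feature vector x."""
  x = np.asarray(x)
  scores = {}
  for cls, p in p_feature_given_class.items():
    # Use P(x_i=1|y) where the feature is present and 1 - P(x_i=1|y) where absent
    feature_probs = np.where(x == 1, p, 1 - p)
    scores[cls] = np.log(priors[cls]) + np.log(feature_probs).sum()
  return max(scores, key=scores.get)

print(f"Predicted class: {predict([1, 1, 0])}")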

Unit 5: Gaussian Naive Bayes

Apply Naive Bayes to continuous features using Gaussian distributions; see the sketch after the topic list.

  • Continuous feature handling
  • Gaussian distribution assumption
  • Mean and variance estimation
  • Probability density function
  • Parameter estimation
  • Numerical stability
  • Feature scaling effects
  • Implementation details
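
A from-scratch sketch of the Gaussian likelihood step, assuming a single continuous feature and a tiny made-up dataset:

# Gaussian NB: estimate mean/variance per class, then score new points with the PDF
import numpy as np

def gaussian_pdf(x, mean, var):
  """Gaussian probability density, with a small epsilon for numerical stability."""
  eps = 1e-9
  return np.exp(-(x - mean) ** 2 / (2 * (var + eps))) / np.sqrt(2 * np.pi * (var + eps))

# Assumed toy data: one continuous feature, two classes (illustrative)
x_class0 = np.array([1.0, 1.2, 0.8, 1.1])
x_class1 = np.array([3.0, 2.8, 3.3, 3.1])

params = {
  0: (x_class0.mean(), x_class0.var()),
  1: (x_class1.mean(), x_class1.var()),
}

x_new = 2.9
for cls, (mu, var) in params.items():
  print(f"p(x={x_new} | class {cls}) = {gaussian_pdf(x_new, mu, var):.4f}")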

Unit 6: Multinomial Naive Bayes

Handle discrete count data and text classification with the multinomial distribution; a short scikit-learn sketch follows the list.

  • Discrete count features
  • Multinomial distribution
  • Text classification applications
  • Word count vectors
  • Feature frequency
  • Smoothing techniques
  • Vocabulary handling
  • Document classification
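
A minimal scikit-learn sketch of multinomial NB on word counts; the four-document corpus is invented for illustration:

# Multinomial Naive Bayes on a document-term count matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["win money now", "cheap money offer", "meeting schedule today", "project meeting notes"]
labels = ["spam", "spam", "ham", "ham"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)   # rows = documents, columns = word counts

clf = MultinomialNB(alpha=1.0)       # alpha=1.0 is Laplace smoothing
clf.fit(X, labels)

test = vectorizer.transform(["free money offer"])
print(clf.predict(test))             # expected to lean toward 'spam'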

Unit 7: Bernoulli Naive Bayes

Work with binary features using the Bernoulli distribution for classification; see the example after this list.

  • Binary feature handling
  • Bernoulli distribution
  • Presence/absence features
  • Text classification with binary
  • Feature binarization
  • Parameter estimation
  • Comparison with multinomial
  • Use case selection
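
A minimal sketch using scikit-learn's BernoulliNB on presence/absence features; again, the tiny corpus is illustrative:

# Bernoulli Naive Bayes on binary (word present / absent) features
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

docs = ["free prize inside", "claim your free prize", "lunch at noon", "see you at lunch"]
labels = [1, 1, 0, 0]   # 1 = spam, 0 = ham

# binary=True records word presence/absence rather than counts
vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(docs)

clf = BernoulliNB(alpha=1.0)
clf.fit(X, labels)
print(clf.predict(vectorizer.transform(["free lunch prize"])))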

Unit 8: Smoothing Techniques

Handle zero probabilities and improve generalization with smoothing methods, as illustrated after the topic list.

  • Zero probability problem
  • Laplace (add-one) smoothing
  • Add-k smoothing
  • Lidstone smoothing
  • Good-Turing smoothing
  • Smoothing parameter selection
  • Impact on performance
  • Implementation considerations
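
A small sketch of Laplace (add-one) smoothing for word likelihoods; the counts are assumed for illustration:

# Laplace smoothing turns zero-count words into small nonzero probabilities
word_counts_in_class = {"money": 3, "free": 2, "win": 0, "meeting": 0, "today": 1}
vocab_size = len(word_counts_in_class)
total_words_in_class = sum(word_counts_in_class.values())

alpha = 1.0  # add-one (Laplace) smoothing; alpha < 1 gives Lidstone smoothing

for word, count in word_counts_in_class.items():
  unsmoothed = count / total_words_in_class
  smoothed = (count + alpha) / (total_words_in_class + alpha * vocab_size)
  print(f"P({word!r} | class): unsmoothed = {unsmoothed:.3f}, smoothed = {smoothed:.3f}")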

Unit 9: Text Classification with NB

Master text classification applications, including spam detection and sentiment analysis; a small pipeline sketch follows the list.

  • Text preprocessing for NB
  • Bag-of-words representation
  • Feature extraction
  • Spam email detection
  • Sentiment analysis
  • Document categorization
  • N-gram features
  • Performance optimization
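
A minimal end-to-end sketch using a scikit-learn Pipeline on an invented four-review sentiment corpus:

# Bag-of-words (plus bigrams) feeding a multinomial NB classifier
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

reviews = ["great movie loved it", "terrible plot boring acting",
           "wonderful film great cast", "awful waste of time"]
sentiment = ["pos", "neg", "pos", "neg"]

model = Pipeline([
  ("vectorize", CountVectorizer(ngram_range=(1, 2))),  # unigram + bigram features
  ("classify", MultinomialNB()),
])
model.fit(reviews, sentiment)
print(model.predict(["great acting and a wonderful plot"]))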

Unit 10: Implementation and Optimization

Build efficient Naive Bayes implementations and optimize them for real-world use; two practical details are sketched after the list.

  • From-scratch implementation
  • Scikit-learn usage
  • Numerical stability
  • Log probability computations
  • Memory efficiency
  • Incremental learning
  • Feature selection
  • Model evaluation
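
A short sketch of two of these points, log-space scoring and incremental learning via scikit-learn's partial_fit; the count matrices are made up:

# Log-space: multiplying many small probabilities underflows, so sum logs instead
import numpy as np
from sklearn.naive_bayes import MultinomialNB

probs = np.full(200, 0.01)
print(np.prod(probs))          # underflows to 0.0 in float64
print(np.sum(np.log(probs)))   # finite log-probability, safe to compare

# Incremental learning: partial_fit trains on data that arrives in batches
clf = MultinomialNB()
X_batch1 = np.array([[2, 0, 1], [0, 3, 0]])
clf.partial_fit(X_batch1, np.array([0, 1]), classes=np.array([0, 1]))

X_batch2 = np.array([[1, 1, 0]])
clf.partial_fit(X_batch2, np.array([0]))
print(clf.predict(np.array([[3, 0, 2]])))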

Unit 1: Probability Fundamentals

Build a foundation in the probability theory essential for understanding Naive Bayes.

Basic Probability Concepts

Learn the fundamental concepts of probability theory that underpin Naive Bayes algorithms.

Key terms: sample space, events, probability.
Probability measures the likelihood of an event occurring, ranging from 0 (impossible) to 1 (certain). It forms the mathematical foundation for making predictions under uncertainty.
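
A tiny sketch of these terms using a fair six-sided die:

# Sample space, event, and probability for a fair six-sided die
sample_space = {1, 2, 3, 4, 5, 6}
event_even = {2, 4, 6}               # the event "the roll is even"

p_even = len(event_even) / len(sample_space)
print(f"P(even) = {p_even:.2f}")     # 0.50, between 0 (impossible) and 1 (certain)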

Conditional Probability

Understand how to calculate probabilities when additional information is available.

P(A|B) = P(A ∩ B) / P(B)
"Probability of A given B"
# Conditional probability example
import numpy as np

# Medical diagnosis example
# P(Disease) = 0.01 (1% of population has disease)
# P(Test+|Disease) = 0.95 (95% sensitivity)
# P(Test+|No Disease) = 0.05 (5% false positive rate)

def conditional_probability_example():
  # Prior probabilities
  p_disease = 0.01
  p_no_disease = 0.99
  
  # Likelihoods
  p_test_pos_given_disease = 0.95
  p_test_pos_given_no_disease = 0.05
  
  # Total probability of positive test
  p_test_pos = (p_test_pos_given_disease * p_disease +
                p_test_pos_given_no_disease * p_no_disease)
  
  # Conditional probability: P(Disease|Test+)
  p_disease_given_test_pos = (
    p_test_pos_given_disease * p_disease) / p_test_pos
  
  print(f"P(Disease|Test+) = {p_disease_given_test_pos:.4f}")
  print(f"Only {p_disease_given_test_pos*100:.1f}% chance of disease!")

conditional_probability_example()

Joint Probability

Learn how to calculate the probability of multiple events occurring together.

Joint Probability: P(A ∩ B) - Probability that both A and B occur
For independent events: P(A ∩ B) = P(A) × P(B)
For dependent events: P(A ∩ B) = P(A|B) × P(B)
# Joint probability examples
import numpy as np

def joint_probability_examples():
  print("=== Independent Events ===")
  # Rolling two dice
  p_die1_six = 1/6
  p_die2_six = 1/6
  p_both_six = p_die1_six * p_die2_six
  print(f"P(Both dice show 6) = {p_both_six:.4f}")
  
  print("\\n=== Dependent Events ===")
  # Drawing cards without replacement
  p_first_ace = 4/52
  p_second_ace_given_first = 3/51 # One ace removed
  p_both_aces = p_first_ace * p_second_ace_given_first
  print(f"P(Both cards are aces) = {p_both_aces:.4f}")
  
  print("\\n=== Joint Probability Table ===")
  # Create a joint probability table
  weather = ['Sunny', 'Rainy']
  mood = ['Happy', 'Sad']
  
  # Joint probabilities (must sum to 1)
  joint_probs = {
    ('Sunny', 'Happy'): 0.4,
    ('Sunny', 'Sad'): 0.1,
    ('Rainy', 'Happy'): 0.2,
    ('Rainy', 'Sad'): 0.3
  }
  
  for (w, m), prob in joint_probs.items():
    print(f"P({w}, {m}) = {prob}")

joint_probability_examples()

Independence Assumption

Understand when events are independent and how this simplifies probability calculations.

Key terms: independence, conditional independence, simplification.
# Independence vs dependence in probability
import numpy as np

def demonstrate_independence():
  print("=== Testing Independence ===")
  
  # Example: Weather and coin flip (independent)
  p_sunny = 0.7
  p_heads = 0.5
  p_sunny_and_heads_independent = p_sunny * p_heads
  
  print(f"P(Sunny) = {p_sunny}")
  print(f"P(Heads) = {p_heads}")
  print(f"P(Sunny AND Heads) if independent = {p_sunny_and_heads_independent}")
  
  # Example: Height and weight (dependent)
  print("\\n=== Dependent Example ===")
  p_tall = 0.3
  p_heavy = 0.4
  p_heavy_given_tall = 0.8 # Tall people more likely heavy
  
  p_tall_and_heavy_dependent = p_heavy_given_tall * p_tall
  p_tall_and_heavy_independent = p_tall * p_heavy
  
  print(f"P(Tall AND Heavy) if dependent = {p_tall_and_heavy_dependent:.3f}")
  print(f"P(Tall AND Heavy) if independent = {p_tall_and_heavy_independent:.3f}")
  print(f"Difference = {abs(p_tall_and_heavy_dependent - p_tall_and_heavy_independent):.3f}")
  
  # Conditional independence (key for Naive Bayes)
  print("\\n=== Conditional Independence ===")
  print("Features may be dependent on each other,")
  print("but independent GIVEN the class label")
  print("This is the 'naive' assumption in Naive