Instance-Based Learning Concept
Learn how instance-based learning differs from model-based approaches by storing training examples.
Memory-Based
Non-Parametric
Local
Instance-based learning stores training instances and makes predictions based on similarity to stored examples, rather than building an explicit generalized model during training.
Lazy vs Eager Learning
Understand the fundamental difference between lazy and eager learning approaches.
Lazy Learning: Defers computation until query time (K-NN, Case-Based Reasoning)
Eager Learning: Builds model during training (Decision Trees, SVM, Neural Networks)
# Comparison of lazy vs eager learning
class LazyLearner:
def __init__(self):
self.training_data = []
def fit(self, X, y):
# Just store the data
self.training_data = list(zip(X, y))
print("Training completed: Data stored")
def predict(self, x_query):
# Computation happens here!
# Find nearest neighbors and make prediction
print("Computing prediction using stored data...")
return self._knn_predict(x_query)
class EagerLearner:
def __init__(self):
self.model = None
def fit(self, X, y):
# Build model during training
print("Training: Building model...")
self.model = self._build_model(X, y)
print("Model built and ready")
def predict(self, x_query):
# Quick prediction using pre-built model
return self.model.predict(x_query)
Memory-Based Learning
Explore how instance-based methods use memory to store and retrieve similar cases.
Memory-based learning assumes that similar problems have similar solutions. It relies on storing examples in memory and using similarity measures to retrieve relevant cases for new problems.
import numpy as np
from collections import Counter
class SimpleMemoryBasedLearner:
def __init__(self):
self.memory = [] # Store all training examples
def store_example(self, features, label):
"""Store a training example in memory"""
self.memory.append((features, label))
def retrieve_similar(self, query_features, k=3):
"""Retrieve k most similar examples"""
distances = []
for stored_features, label in self.memory:
# Calculate Euclidean distance
dist = np.linalg.norm(query_features - stored_features)
distances.append((dist, label))
# Sort by distance and return k nearest
distances.sort(key=lambda x: x[0])
return distances[:k]
# Example usage
learner = SimpleMemoryBasedLearner()
# Store some examples
learner.store_example(np.array([1, 2]), "A")
learner.store_example(np.array([2, 3]), "A")
learner.store_example(np.array([8, 9]), "B")
# Query for similar examples
similar = learner.retrieve_similar(np.array([1.5, 2.5]), k=2)
print("Similar examples:", similar)
Local Approximation
Understand how instance-based learning creates local approximations rather than global models.
Local Models
Adaptive
Context-Sensitive
# Local approximation concept
import matplotlib.pyplot as plt
import numpy as np
# Generate sample data
np.random.seed(42)
X = np.linspace(0, 10, 50)
y = np.sin(X) + 0.1 * np.random.randn(50)
def local_approximation(x_query, X_train, y_train, k=5):
"""Create local approximation around query point"""
# Find k nearest neighbors
distances = np.abs(X_train - x_query)
nearest_indices = np.argsort(distances)[:k]
# Use only local data for prediction
local_X = X_train[nearest_indices]
local_y = y_train[nearest_indices]
# Simple average of local neighbors
prediction = np.mean(local_y)
return prediction, local_X, local_y
# Example: Predict at x = 5.0
pred, local_X, local_y = local_approximation(5.0, X, y, k=7)
print(f"Query point: 5.0")
print(f"Local prediction: {pred:.3f}")
print(f"Using {len(local_X)} nearest neighbors")
print(f"Local X values: {local_X}")
print(f"Local y values: {local_y}")
Computational Characteristics
Learn about the computational trade-offs of instance-based learning methods.
Training Time: O(1) - Just store data
Prediction Time: O(n) - Search through all stored instances
Memory Usage: O(n) - Store all training data
import time
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
# Compare training and prediction times
def compare_algorithms(X, y, X_test