
Logistic Regression

Learn the essential classification algorithm that predicts probabilities and categories

📅 Module 3 📊 Beginner


🎯 Welcome to Logistic Regression

Despite its name, Logistic Regression is actually a classification algorithm, not regression! It's one of the most popular algorithms for binary classification (yes/no, spam/not spam, disease/healthy).

In this tutorial, you'll learn how machines predict probabilities and make binary decisions using a brilliant mathematical function called the sigmoid.

📊 Classification vs Regression

| Aspect | Regression | Classification |
|---|---|---|
| Output | Continuous values (any number) | Discrete categories (Class A, B, etc.) |
| Examples | Predicting prices, temperature, distance | Email (spam/not spam), disease (yes/no) |
| Output Range | -∞ to +∞ | Class labels (0 or 1 for binary) |
| Best Metrics | MSE, R² Score | Accuracy, Precision, Recall |

💡 Why "Regression" in the name? Because it builds on linear regression by adding a special mathematical function (sigmoid) to convert continuous outputs to probabilities. Historical naming quirk!

⚙️ How Logistic Regression Works

The Sigmoid Function

The magic of Logistic Regression is the sigmoid function. It takes any number and squashes it into a probability (0 to 1):

σ(z) = 1 / (1 + e^(-z))

Where z = w₁x₁ + w₂x₂ + ... + b (same as linear regression!)

Key properties of sigmoid:

  • Input can be any number (negative or positive)
  • Output is always between 0 and 1
  • Output = probability of class 1 (positive class)
  • S-shaped curve makes smooth transitions

🏥 Example: Predicting if a patient has a disease:

  • Sigmoid output = 0.2 → 20% probability of disease
  • Sigmoid output = 0.9 → 90% probability of disease
  • Decision threshold = 0.5 (if output ≥ 0.5, predict "disease")
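A minimal NumPy sketch ties these ideas together (the z values below are made up for illustration):

import numpy as np

def sigmoid(z):
    """Squash any real number into the (0, 1) range."""
    return 1 / (1 + np.exp(-z))

# Illustrative inputs: large negative, zero, moderately positive
z = np.array([-5.0, 0.0, 2.2])
probs = sigmoid(z)
print(probs)  # ~[0.007, 0.5, 0.9]

# Apply the 0.5 decision threshold to get hard class predictions
print((probs >= 0.5).astype(int))  # [0 1 1]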

Decision Boundary

Logistic Regression creates a decision boundary that separates the classes: points on one side are classified as Class 0, points on the other side as Class 1. The boundary lies exactly where σ(z) = 0.5, which happens when z = w₁x₁ + w₂x₂ + ... + b = 0.

💻 Logistic Regression in Python

# Spam Email Detection using Logistic Regression
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
import numpy as np

# Sample data: Features extracted from emails
# Features: [word_count, link_count, caps_ratio, exclamation_marks]
X_train = np.array([
    [100, 2, 0.05, 0],      # Legitimate
    [500, 15, 0.3, 5],      # Spam
    [150, 1, 0.02, 0],      # Legitimate
    [600, 20, 0.35, 8],     # Spam
    [200, 3, 0.08, 1],      # Legitimate
])

# Labels: 0 = Not spam, 1 = Spam
y_train = np.array([0, 1, 0, 1, 0])

# Create and train the model
model = LogisticRegression()
model.fit(X_train, y_train)

# Make predictions
new_email = np.array([[120, 1, 0.03, 0]])
prediction = model.predict(new_email)
probability = model.predict_proba(new_email)

print(f"Prediction: {'Spam' if prediction[0] == 1 else 'Not Spam'}")
print(f"Probability: {probability[0][1]:.2%} chance of being spam")

# Evaluate model
y_pred = model.predict(X_train)
accuracy = accuracy_score(y_train, y_pred)
print(f"\nAccuracy: {accuracy:.2%}")

💡 predict_proba(): Returns probabilities for both classes. Use predict() for hard predictions (0 or 1).
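Because the model is linear in z, its learned weights are easy to inspect. Continuing from the spam model trained above (coef_ and intercept_ are standard scikit-learn attributes):

# The decision boundary is where w·x + b = 0
print("Weights (w):", model.coef_[0])        # one weight per feature
print("Intercept (b):", model.intercept_[0])

# A positive weight pushes an email toward "spam" (class 1),
# a negative weight toward "not spam" (class 0)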

🔀 Beyond Binary: Multiclass Classification

Logistic Regression naturally handles binary (2-class) problems. For problems with 3+ classes, there are two approaches:

1️⃣ One-vs-Rest (OvR): Train one binary classifier per class (Class A vs. all others, Class B vs. all others, etc.) and predict the class with the highest probability.

2️⃣ Multinomial: Train a single classifier that learns all classes directly using softmax (the multiclass extension of sigmoid).
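To see what softmax does, here is a minimal NumPy sketch (the class scores are made up for illustration):

import numpy as np

def softmax(z):
    """Convert raw class scores into probabilities that sum to 1."""
    exp_z = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return exp_z / exp_z.sum()

scores = np.array([2.0, 1.0, 0.1])  # one raw score per class
print(softmax(scores))              # ~[0.66, 0.24, 0.10]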

# Multiclass example: Iris flower classification
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target

# Logistic Regression automatically handles multiclass
model = LogisticRegression(max_iter=200)
model.fit(X, y)

# Predict class and probabilities for a new flower
new_flower = [[5.1, 3.5, 1.4, 0.2]]  # sepal length/width, petal length/width (cm)
prediction = model.predict(new_flower)
probabilities = model.predict_proba(new_flower)

print(f"Predicted class: {iris.target_names[prediction[0]]}")
for i, prob in enumerate(probabilities[0]):
    print(f"{iris.target_names[i]}: {prob:.2%}")

📈 Key Evaluation Metrics

For classification, we use different metrics than for regression:

| Metric | Formula / Description | When to Use |
|---|---|---|
| Accuracy | % of correct predictions | Balanced datasets |
| Precision | Of predicted positives, how many are correct? | When false positives are costly |
| Recall | Of actual positives, how many did we find? | When false negatives are costly |
| F1-Score | Harmonic mean of precision and recall | Imbalanced datasets |
| AUC-ROC | Area under the ROC curve | Comprehensive performance metric |

🏥 Disease Detection Example:

  • Precision matters: If test says you have disease, how sure are we? (avoid false alarms)
  • Recall matters: Did we catch all actual disease cases? (avoid missing real cases)
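As a minimal sketch, here is how scikit-learn computes these metrics (the labels and predictions below are made up for illustration):

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical ground truth and predictions (1 = disease, 0 = healthy)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

print(f"Accuracy:  {accuracy_score(y_true, y_pred):.2f}")   # 0.62
print(f"Precision: {precision_score(y_true, y_pred):.2f}")  # 0.60
print(f"Recall:    {recall_score(y_true, y_pred):.2f}")     # 0.75
print(f"F1-score:  {f1_score(y_true, y_pred):.2f}")         # 0.67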

✅ Strengths & ❌ Limitations

✅ Strengths:

  • Probabilistic Output: Returns probabilities, not just class labels
  • Fast & Efficient: Trains quickly even on large datasets
  • Interpretable: Feature weights show how much each feature contributes to predictions

❌ Limitations:

  • Linear Boundary Only: Can't handle complex non-linear decision boundaries on its own
  • Requires Feature Scaling: Performance improves when features are normalized
  • Struggles with Complex Data: Tree-based methods often outperform it when feature relationships are highly non-linear

🎯 When to Use Logistic Regression

  • Binary or multiclass classification - Predicting categories, not numbers
  • You need probabilities - Not just class predictions
  • Interpretability matters - You need to explain decisions to others
  • Fast inference required - Real-time predictions needed
  • Classes are roughly linearly separable - The decision boundary is approximately a line or plane in feature space
  • Baseline classifier - Before trying complex algorithms

⚠️ Common Misconception: "Logistic Regression doesn't work with non-linear data." You can add polynomial features or interactions to handle non-linearity!
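As a minimal sketch of that idea (using scikit-learn's make_moons to generate synthetic non-linear data), PolynomialFeatures expands the inputs before the classifier, letting a linear model draw a curved boundary; on this data the polynomial version typically scores noticeably higher:

# Non-linear data that a straight-line boundary can't separate well
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

X, y = make_moons(n_samples=200, noise=0.2, random_state=42)

# Plain Logistic Regression vs. one with added polynomial features
linear_model = LogisticRegression().fit(X, y)
poly_model = make_pipeline(
    PolynomialFeatures(degree=3),      # adds x1², x1·x2, x2³, etc.
    LogisticRegression(max_iter=1000),
).fit(X, y)

print(f"Linear features:     {linear_model.score(X, y):.2%}")
print(f"Polynomial features: {poly_model.score(X, y):.2%}")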

📚 Key Concepts Summary

  • Classification: Predicting discrete categories, not continuous values
  • Binary Classification: Two-class problem (yes/no, spam/not spam)
  • Sigmoid Function: Converts linear output to probability (0-1)
  • Decision Boundary: The line/plane that separates classes
  • Threshold: Usually 0.5 for probability → class conversion
  • Accuracy/Precision/Recall: Different metrics for different needs

📋 Summary

What You've Learned:

  • Logistic Regression is for classification (despite the name!)
  • Uses sigmoid function to convert outputs to probabilities
  • Outputs probabilities and class predictions
  • Works for both binary and multiclass problems
  • Fast, interpretable, but assumes linear separability

What's Next?

In the next tutorial, Decision Trees, we'll learn how to build tree-based models that can handle non-linear relationships and are much more flexible than Logistic Regression. Get ready for a different approach to classification!

🎉 Great Progress! You now know two fundamental algorithms: Linear Regression for continuous predictions and Logistic Regression for categories. These form the foundation of ML mastery!

📝 Knowledge Check

Test your understanding of Logistic Regression!

1. What type of problems is logistic regression used for?

A) Predicting continuous values
B) Binary and multi-class classification
C) Clustering data points
D) Time series forecasting

2. What function does logistic regression use to convert predictions to probabilities?

A) ReLU function
B) Linear function
C) Sigmoid (logistic) function
D) Softmax function

3. What range do probabilities output by logistic regression fall into?

A) 0 to 1
B) -1 to 1
C) -∞ to +∞
D) 0 to 100

4. What is the decision threshold in logistic regression typically set at?

A) 0.0
B) 0.25
C) 0.75
D) 0.5 (can be adjusted)

5. What does regularization in logistic regression help prevent?

A) Underfitting
B) Overfitting by penalizing large coefficients
C) Data imbalance
D) Missing values