Logistic Regression From Scratch — What It Actually Does (Without Skipping the Thinking)
Breaking down Logistic Regression from first principles: why it exists to express confidence in binary outcomes, how the sigmoid function transforms linear scores into probabilities, and a minimal from-scratch implementation.
Why I'm Writing This
I've used Logistic Regression many times through libraries, but I realized that using a model and understanding a model are very different things.
This post is my attempt to explain Logistic Regression from first principles—focusing on why it exists and how it works internally, not just how to call it from sklearn.
The Real Problem Logistic Regression Solves
The problem is not classification.
The real problem is:
How do we express confidence when the outcome is binary?
In many real-world systems (spam detection, fraud detection, medical risk scoring), we don't just want a class label. We want to know how confident the model is.
Linear Regression fails here because:
- It produces unbounded outputs
- Its output cannot be interpreted as a probability
Logistic Regression exists to fix this exact issue.
Why Linear Regression Fails for Classification
Linear Regression predicts a number like:
y = w·x + b
But for classification:
- Predictions like −3 or 2.7 make no sense
- There's no concept of probability
- Squared loss measures error in a way that doesn't fit 0/1 labels
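To make the unbounded-output problem concrete, here is a quick sketch (the weights and inputs below are made up purely for illustration) of a linear score drifting outside the [0, 1] range:

import numpy as np

# Hypothetical weights and inputs, chosen only to show the scale problem
w = np.array([0.5, -0.5])
b = 0.1
inputs = np.array([[8.0, 1.0],   # yields a large positive score
                   [1.0, 9.0]])  # yields a large negative score

scores = inputs @ w + b
print(scores)  # [ 3.6 -3.9] -- neither value can be read as a probability

Scores like 3.6 and −3.9 can still rank examples, but they cannot be interpreted as probabilities.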
So instead of predicting a value, we need to predict a belief.
The Core Idea Behind Logistic Regression
Logistic Regression keeps the linear model, but changes how we interpret its output.
Instead of saying:
"This number is the prediction"
We say:
"This number represents confidence before converting it into probability"
That linear score is then passed through a sigmoid function, which converts any real number into a value between 0 and 1.
This final output represents:
The probability that the input belongs to class 1
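Concretely, the standard sigmoid function is:
sigmoid(z) = 1 / (1 + e^(−z))
A large positive score maps close to 1, a large negative score maps close to 0, and a score of exactly 0 maps to 0.5.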
Minimal Logistic Regression (From Scratch)
Below is a minimal forward pass of Logistic Regression implemented from scratch:
import numpy as np

def sigmoid(z):
    # Squash any real-valued score into the (0, 1) range
    return 1 / (1 + np.exp(-z))

def logistic_forward(x, w, b):
    # Linear score first, then map it to a probability
    z = np.dot(w, x) + b
    p = sigmoid(z)
    return p

x = np.array([2.0, 3.0])   # input features
w = np.array([0.5, -0.5])  # weights
b = 0.1                    # bias

probability = logistic_forward(x, w, b)
print(probability)
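For the numbers above, the linear score is z = (0.5)(2.0) + (−0.5)(3.0) + 0.1 = −0.4, and sigmoid(−0.4) ≈ 0.401, so the script prints a probability of roughly 0.40 for class 1.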
This small piece of code captures the entire model logic:
- A linear score
- A probability mapping
- No magic
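To turn that probability into a class label, a threshold is applied on top; 0.5 is the common default, though it can be tuned:

predicted_class = int(probability >= 0.5)  # 1 when the model leans toward class 1
print(predicted_class)  # prints 0 here, since 0.40 < 0.5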
Why Cross-Entropy Loss Is Used
Since the model outputs a probability, using squared error is incorrect.
Instead, Logistic Regression uses log loss, which:
- Rewards confident correct predictions
- Heavily penalizes confident wrong predictions
This makes the model statistically sound for binary outcomes.
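As a rough sketch of those two properties (using the standard binary cross-entropy formula, with a throwaway log_loss helper written just for this post), here is how the numbers behave:

import numpy as np

def log_loss(y_true, p):
    # Binary cross-entropy for a single prediction:
    # -[y*log(p) + (1 - y)*log(1 - p)]
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

print(log_loss(1, 0.9))  # confident and correct -> about 0.105
print(log_loss(1, 0.1))  # confident and wrong   -> about 2.303
print(log_loss(1, 0.5))  # unsure                -> about 0.693

The jump from roughly 0.105 to roughly 2.303 is the "heavily penalizes confident wrong predictions" behaviour in action.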
Where This Shows Up in Real Systems
Even today, Logistic Regression is widely used in:
- Credit risk modeling
- Medical diagnosis systems
- Baseline classifiers for large ML pipelines
- Calibration layers in deep learning systems
Understanding it deeply goes a long way toward understanding modern ML models, many of which end in exactly this kind of linear-score-plus-sigmoid structure.