AW Dev Rethought

🌟 The best way to predict the future is to invent it - Alan Kay

⚡️ Saturday ML Spark – 🎯 Threshold Tuning


Description:

Most classification models return probabilities, but the final decision — whether something is classified as 0 or 1 — depends on a threshold. By default, this threshold is set to 0.5. But in real-world systems, this default is often not optimal.

In this project, we explore how to tune the classification threshold to better align model predictions with real-world needs.


Understanding the Problem

A classifier outputs probabilities like:

  • 0.2 → likely negative
  • 0.8 → likely positive

To convert these into labels, we apply a rule:

If probability ≥ 0.5 → classify as positive

However, this default cutoff does not always match the costs of the errors involved.

For example:

  • In fraud detection → missing fraud is costly → prefer high recall
  • In spam detection → false alarms are costly → prefer high precision

The threshold directly controls this trade-off.
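To make the trade-off concrete, here is a tiny sketch with hypothetical probabilities (the values are illustrative only):

```python
import numpy as np

# Hypothetical predicted probabilities for five cases
probs = np.array([0.15, 0.35, 0.55, 0.65, 0.90])

# Default cutoff: three cases flagged positive
default = (probs >= 0.5).astype(int)   # [0, 0, 1, 1, 1]

# Lower cutoff (recall-oriented, fraud-style): one more case flagged
low = (probs >= 0.3).astype(int)       # [0, 1, 1, 1, 1]

# Higher cutoff (precision-oriented, spam-style): only the clearest case
high = (probs >= 0.7).astype(int)      # [0, 0, 0, 0, 1]

print(default.sum(), low.sum(), high.sum())  # 3 4 1
```

Nothing about the model changes between these three lines; only the decision rule does.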


What Is Threshold Tuning?

Threshold tuning is the process of selecting a cutoff value that determines how probabilities are converted into class labels.

Instead of using:

y_pred = model.predict(X_test)

we use probabilities:

y_probs = model.predict_proba(X_test)[:, 1]

and apply custom thresholds.


1. Applying the Default Threshold

We first evaluate the model using the default threshold of 0.5.

y_pred_default = (y_probs >= 0.5).astype(int)

This gives us a baseline for comparison.


2. Evaluating Multiple Thresholds

We test different threshold values and observe how metrics change.

thresholds = np.arange(0.1, 0.91, 0.1)

for threshold in thresholds:
    y_pred = (y_probs >= threshold).astype(int)

For each threshold, we compute:

  • Precision
  • Recall
  • F1 Score
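Instead of looping over a hand-picked grid, scikit-learn's `precision_recall_curve` returns precision and recall at every distinct score cutoff in one call. A small sketch with hypothetical labels and scores standing in for `y_test` and `y_probs`:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical true labels and predicted scores (illustrative only)
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.3, 0.4, 0.45, 0.6, 0.7, 0.2, 0.9])

# Precision and recall at every candidate cutoff in one call
precision, recall, thresholds = precision_recall_curve(y_true, y_score)

# F1 at each cutoff; precision/recall carry one extra trailing element
# (precision=1, recall=0), so slice it off to align with thresholds
f1 = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)

print(float(thresholds[np.argmax(f1)]))  # → 0.4
```

This is handy for fine-grained tuning, since it considers every cutoff the data supports rather than a coarse step of 0.1.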

3. Understanding the Trade-Off

Changing the threshold affects model behaviour:

  • Lower threshold → more positives → higher recall, lower precision
  • Higher threshold → fewer positives → higher precision, lower recall

There is no single “best” threshold — it depends on the problem.
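One common way to encode "it depends on the problem" is to assign explicit costs to the two error types and pick the cutoff that minimises total cost. A minimal sketch, assuming hypothetical costs where a missed positive (FN) hurts ten times more than a false alarm (FP):

```python
import numpy as np

# Hypothetical error costs: a false negative is 10x worse than a false positive
COST_FP, COST_FN = 1.0, 10.0

# Illustrative labels and predicted probabilities
y_true = np.array([0, 1, 0, 1, 1, 0, 0, 1])
y_probs = np.array([0.22, 0.41, 0.12, 0.8, 0.55, 0.32, 0.05, 0.7])

def total_cost(threshold):
    y_pred = (y_probs >= threshold).astype(int)
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false alarms
    fn = np.sum((y_pred == 0) & (y_true == 1))  # missed positives
    return COST_FP * fp + COST_FN * fn

thresholds = np.arange(0.1, 0.91, 0.1)
costs = [total_cost(t) for t in thresholds]
best = thresholds[int(np.argmin(costs))]

print(round(float(best), 2))  # → 0.4
```

Flipping the cost ratio (making false positives expensive instead) pushes the chosen threshold up rather than down, which is exactly the precision-vs-recall trade-off described above.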


4. Visualising Threshold Impact

Plotting metrics across thresholds makes the trade-off clearer.

plt.plot(thresholds, precision)
plt.plot(thresholds, recall)
plt.plot(thresholds, f1)

This helps identify the point where the desired balance is achieved.


Why Threshold Tuning Matters

Threshold tuning allows us to:

  • align model decisions with business goals
  • control false positives and false negatives
  • improve practical usefulness of models
  • move beyond generic accuracy metrics

It is widely used in:

  • fraud detection
  • medical diagnosis
  • recommendation systems
  • lead scoring

Key Takeaways

  1. Classification models output probabilities, not final decisions.
  2. The default threshold of 0.5 is not always optimal.
  3. Lower thresholds increase recall but reduce precision.
  4. Higher thresholds increase precision but reduce recall.
  5. Threshold tuning aligns ML models with real-world objectives.

Conclusion

Threshold tuning is a simple yet powerful technique that transforms how classification models are used in practice. Instead of relying on default settings, we can tailor model behaviour to match real-world priorities and constraints.

This makes threshold tuning an essential concept in the Saturday ML Spark ⚡️ – Advanced & Practical series.


Code Snippet:

# 📦 Import Required Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix


# 🧩 Load Dataset
data = load_breast_cancer()

X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target


# ✂️ Split Data
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.3,
    random_state=42,
    stratify=y
)


# 🤖 Train Model
model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)


# 📊 Generate Predicted Probabilities
y_probs = model.predict_proba(X_test)[:, 1]


# 🎯 Default Threshold (0.5)
y_pred_default = (y_probs >= 0.5).astype(int)

print("=== Default Threshold = 0.5 ===")
print("Precision:", precision_score(y_test, y_pred_default))
print("Recall:", recall_score(y_test, y_pred_default))
print("F1 Score:", f1_score(y_test, y_pred_default))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_default))


# 🔁 Evaluate Multiple Thresholds
thresholds = np.arange(0.1, 0.91, 0.1)

results = []

for threshold in thresholds:
    y_pred = (y_probs >= threshold).astype(int)

    results.append({
        "threshold": threshold,
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred)
    })

results_df = pd.DataFrame(results)

print("\n=== Threshold Tuning Results ===")
print(results_df)


# 📈 Plot Metrics vs Threshold
plt.figure(figsize=(8, 5))

plt.plot(results_df["threshold"], results_df["precision"], marker="o", label="Precision")
plt.plot(results_df["threshold"], results_df["recall"], marker="o", label="Recall")
plt.plot(results_df["threshold"], results_df["f1"], marker="o", label="F1 Score")

plt.xlabel("Threshold")
plt.ylabel("Score")
plt.title("Threshold Tuning for Classification")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()


# ✅ Select Best Threshold (based on F1 Score)
best_threshold = results_df.loc[results_df["f1"].idxmax(), "threshold"]

print("\nBest Threshold based on F1 Score:", best_threshold)


# 🔍 Predictions using Best Threshold
y_pred_best = (y_probs >= best_threshold).astype(int)

print("\n=== Using Best Threshold ===")
print("Precision:", precision_score(y_test, y_pred_best))
print("Recall:", recall_score(y_test, y_pred_best))
print("F1 Score:", f1_score(y_test, y_pred_best))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_best))
