Understanding threshold logic

Thresholds in binary classification models balance precision and recall, which affects model performance. Adjust the threshold based on business needs and costs.

Written by Ori Sagi
Updated over a week ago

What is a threshold?

In binary classification models, entities are assigned a probability score between 0 and 1, indicating their likelihood of belonging to the positive class. The threshold separates the positive and negative classes by determining the probability score above which an observation is classified as positive and below which it is classified as negative.
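As a minimal sketch of the idea above, the snippet below turns probability scores into class labels for a given threshold. The function name `classify` and the sample scores are illustrative, and treating a score exactly equal to the threshold as positive is an assumption, not something this article specifies.

```python
def classify(scores, threshold):
    """Label each probability score as 1 (positive) or 0 (negative)."""
    # Assumption: a score exactly equal to the threshold counts as positive.
    return [1 if score >= threshold else 0 for score in scores]


# Hypothetical probability scores for five entities.
scores = [0.05, 0.30, 0.50, 0.72, 0.91]

print(classify(scores, 0.5))  # [0, 0, 1, 1, 1]
print(classify(scores, 0.8))  # [0, 0, 0, 0, 1]
```

The same scores produce different labels under each threshold, which is why the threshold can be chosen after training without retraining the model.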

You can set the threshold after the model is trained.

Pecan allows you to configure the threshold in the model’s dashboard to ensure the model’s output fits business needs.

How does adjusting the threshold impact the results?

First, we need to understand two basic terms:

  • Precision - the proportion of correctly predicted positive entities out of all entities predicted as positive. For example, if a model predicts that 100 users will churn and 80 of them actually churn, the precision is 80/100 or 0.8.

  • Recall - the proportion of correctly predicted positive entities out of all actual positive entities in the test dataset. For example, if there are 150 actual churned users in the test dataset and the model correctly predicts 80 of them, the recall is 80/150 or 0.53.

Adjusting the threshold shifts the balance between precision and recall, two key metrics for evaluating model performance. Pecan sets the default threshold by balancing precision and recall, aiming for the best combined result across both metrics (as measured by the F1 score).
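To make the arithmetic concrete, here is a small sketch that reproduces the churn example above and adds the F1 score, using its standard definition as the harmonic mean of precision and recall. The function and parameter names are illustrative, and the sketch assumes all counts are greater than zero.

```python
def precision_recall_f1(true_positives, predicted_positives, actual_positives):
    """Compute precision, recall, and F1 from raw counts (all assumed > 0)."""
    precision = true_positives / predicted_positives
    recall = true_positives / actual_positives
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Churn example: 100 predicted churners, 80 of them correct, 150 actual churners.
precision, recall, f1 = precision_recall_f1(80, 100, 150)
print(round(precision, 2), round(recall, 2), round(f1, 2))  # 0.8 0.53 0.64
```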

The higher the threshold, the higher the precision tends to be. When the threshold is relatively high, only entities that the model scores with a high degree of certainty are classified as positive, which generally results in higher precision.

The lower the threshold, the higher the recall (detection). When the threshold is relatively low, more entities are classified as positive, so the coverage of the positive class is larger, resulting in higher recall (detection).
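The trade-off can be seen on a toy dataset. The sketch below uses made-up scores and labels (not Pecan output) and computes precision and recall at three thresholds. As above, it assumes a score equal to the threshold counts as positive.

```python
def precision_recall_at(scores, labels, threshold):
    """Return (precision, recall) when scores >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical scores with their true labels (1 = positive, 0 = negative).
scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

for threshold in (0.3, 0.5, 0.75):
    precision, recall = precision_recall_at(scores, labels, threshold)
    print(threshold, precision, recall)
```

On this toy data, raising the threshold from 0.3 to 0.75 lifts precision from 0.625 to 1.0 while recall drops from 1.0 to 0.6. Real data will not always move this smoothly, but the direction is typical.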

How to choose a threshold?

The choice of threshold is not set in stone and is typically a business decision.
It usually follows from the action you intend to take on the model’s predictions, and it comes down to whether the focus should be on precision or on recall (detection).

To make this determination, various factors need to be considered:

  • Cost of false-positive prediction (when the model predicts a "1" but the actual outcome is "0"):

    • If the cost is high, we will set a high threshold.

    • If the cost is relatively low, there is less concern about producing false positives, and a lower threshold can be selected.

  • Cost of false-negative prediction (when the model predicts a "0" but the actual outcome is "1"):

    • If the cost is high, we would want to set a lower threshold to have higher coverage and avoid false negatives.

    • Conversely, if the cost is low, a higher threshold can be chosen.
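One way to turn the cost considerations above into a number is to assign a cost to each false positive and each false negative, then pick the candidate threshold with the lowest total cost. This is only an illustrative sketch: the data, the cost values, and the function names are assumptions, and it is not a description of how Pecan selects thresholds.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total cost of the errors made at a given threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp * cost_fp + fn * cost_fn


def cheapest_threshold(scores, labels, candidates, cost_fp, cost_fn):
    """Return the candidate threshold with the lowest total error cost."""
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn),
    )


# Hypothetical scores with their true labels (1 = positive, 0 = negative).
scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
candidates = [0.3, 0.5, 0.75]

# Expensive false positives favor a high threshold.
print(cheapest_threshold(scores, labels, candidates, cost_fp=10, cost_fn=1))  # 0.75
# Expensive false negatives favor a low threshold.
print(cheapest_threshold(scores, labels, candidates, cost_fp=1, cost_fn=10))  # 0.3
```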

Precision over recall (high threshold) | when the price of acting is high

Say you create a Churn model that predicts which customers are likely to churn within 60 days. You plan to offer a discount coupon to customers who are predicted to churn, but your company has limited resources, and you can only offer the discount to 3,000 customers every month.

To ensure that the coupon is only offered to customers who intend to churn, you would prioritize precision over recall by setting a relatively high threshold. This approach reduces the number of false positives, which are customers who are incorrectly flagged as at risk of churn, enabling you to use your resources more efficiently and avoid offering unnecessary discounts.
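When the constraint is a fixed capacity, as with the 3,000 monthly coupons, one simple approach is to rank customers by score and use the score of the last customer who fits in the budget as the threshold. The sketch below uses a capacity of 3 on ten made-up scores to keep the example small. The function name is hypothetical, and it assumes a capacity of at least 1.

```python
def threshold_for_capacity(scores, capacity):
    """Return the lowest score among the `capacity` highest-scoring entities."""
    ranked = sorted(scores, reverse=True)
    if capacity >= len(ranked):
        # Everyone fits within the budget, so the lowest score is the cutoff.
        return ranked[-1]
    return ranked[capacity - 1]


# Hypothetical churn scores for ten customers.
scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9, 0.95]

threshold = threshold_for_capacity(scores, capacity=3)
flagged = [s for s in scores if s >= threshold]
print(threshold, len(flagged))  # 0.8 3
```

If several customers share the cutoff score, more than `capacity` customers can end up at or above the threshold, so ties need a separate rule.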

Recall over precision (low threshold) | aiming for high coverage of the entire target population

Suppose you work in a health organization that uses a model to predict whether patients are at risk of developing a particular type of cancer. The model helps identify patients who should undergo a diagnostic test to avoid a missed diagnosis.

Given the severe cost of missing a diagnosis, you would prioritize detection over precision by setting a lower threshold. This approach minimizes false negatives, which are patients who are incorrectly flagged as not at risk of developing cancer, and ensures that those who need the test receive it. However, as the diagnostic test becomes more expensive, it may become necessary to prioritize precision over detection, which would impact the placement of your threshold.
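When recall is the priority, a common approach is to fix a minimum acceptable recall and take the highest threshold that still reaches it, so that precision is sacrificed no more than necessary. The sketch below illustrates this on made-up data. The function name and the target values are assumptions, and it assumes at least one actual positive and a target recall between 0 (exclusive) and 1.

```python
def highest_threshold_for_recall(scores, labels, target_recall):
    """Return the highest threshold whose recall is at least `target_recall`."""
    # Scores of the actual positives, highest first.
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    total = len(positives)
    for captured in range(1, total + 1):
        # Using the score of the k-th positive as the threshold captures k positives.
        if captured / total >= target_recall:
            return positives[captured - 1]
    return positives[-1]


# Hypothetical risk scores with their true labels (1 = at risk, 0 = not at risk).
scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

print(highest_threshold_for_recall(scores, labels, 0.8))  # 0.6
print(highest_threshold_for_recall(scores, labels, 1.0))  # 0.4
```

Demanding full recall pushes the threshold down to the lowest-scoring actual positive, which also lets in more false positives. That is the trade-off described above when the diagnostic test becomes more expensive.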

Understanding the threshold graph

The threshold graph demonstrates the distribution of the two classes along the probability score. In general, this graph can indicate how well the model distinguishes between the classes.

In the first example, we can see a Retention model that separated the classes well, giving low scores to the negative class and high scores to the positive class.

In the second example, we can see a classification model that did not separate the classes very well, producing a relatively similar distribution of scores for both classes.
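The idea behind such a graph can be approximated by counting how many scores from each class fall into each score bin. The sketch below does this for two made-up models, one that separates the classes well and one that does not. The function name, bin count, and data are illustrative, and this is not how the Pecan dashboard is implemented.

```python
def score_histogram(scores, labels, bins=5):
    """Count scores per equal-width bin on [0, 1], separately for each class."""
    counts = {0: [0] * bins, 1: [0] * bins}
    for score, label in zip(scores, labels):
        # A score of exactly 1.0 is placed in the last bin.
        index = min(int(score * bins), bins - 1)
        counts[label][index] += 1
    return counts


labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Hypothetical well-separated model: negatives score low, positives score high.
good = [0.05, 0.1, 0.15, 0.25, 0.3, 0.7, 0.75, 0.85, 0.9, 0.95]
print(score_histogram(good, labels))
# {0: [3, 2, 0, 0, 0], 1: [0, 0, 0, 2, 3]}

# Hypothetical poorly separated model: both classes cluster around 0.5.
poor = [0.35, 0.45, 0.5, 0.55, 0.65, 0.3, 0.45, 0.5, 0.55, 0.7]
print(score_histogram(poor, labels))
# {0: [0, 1, 3, 1, 0], 1: [0, 1, 3, 1, 0]}
```

When the two per-class counts barely overlap, almost any threshold between the two groups works well. When they overlap heavily, every threshold trades many false positives against many false negatives.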

How can I improve my model for the threshold I chose?

Pecan allows you to optimize your model according to a specific threshold under “advanced settings” in the Blueprint screen.
Read more here: Optimization metrics for binary models
