Binary vs. multiclass vs. regression models
Written by Ori Sagi
Updated over a week ago

Binary classification models

Binary models classify inputs into two mutually exclusive groups: “A” and “B” (or “yes” and “no”, “0” and “1”, etc.).

This means that everyone in the population must be grouped into either Group A or Group B, and never into both at the same time.

This would be typical for a model that predicts churn. Let’s say we’re predicting churn for an online gaming platform. Every player on the platform is in, or will fall into, either Group A (“churn”) or Group B (“no churn”). No other possibilities exist in between.

The machine-learning algorithm learns from the historical data that’s fed into it and assigns each user a probability that they will churn within a defined future time period. If the probability is above a certain threshold (let's say 90%), the user is labeled as “churn” (Group A). If not, they are labeled as “no churn” (Group B).
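As a minimal sketch of this thresholding step, the snippet below turns churn probabilities into labels. The function name `label_churn`, the player IDs, and the scores are hypothetical and only illustrate the idea; this is not Pecan's implementation.

```python
def label_churn(probabilities, threshold=0.9):
    """Label each user "churn" if their probability is above the threshold."""
    return {
        user: "churn" if p > threshold else "no churn"
        for user, p in probabilities.items()
    }


# Hypothetical churn probabilities for three players.
scores = {"player_1": 0.97, "player_2": 0.42, "player_3": 0.90}
labels = label_churn(scores)
print(labels)
```

Because the rule is "above the threshold", a player scored at exactly 0.90 is labeled “no churn” in this sketch.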

In Pecan, it is up to you to decide which probability threshold to use. Once the model is trained, you can change the threshold interactively on our dashboard and see your model's metrics change accordingly.
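To see why the metrics move with the threshold, here is a rough sketch that recomputes precision and recall at two thresholds. Precision and recall are chosen here only as familiar examples of classification metrics; the function name and the data are made up, and nothing below reflects how the Pecan dashboard computes its figures.

```python
def precision_recall(probabilities, actual, threshold):
    """Return (precision, recall) when labeling "churn" above the threshold."""
    predicted = [p > threshold for p in probabilities]
    pairs = list(zip(predicted, actual))
    tp = sum(1 for pred, act in pairs if pred and act)      # correctly flagged churners
    fp = sum(1 for pred, act in pairs if pred and not act)  # flagged, but did not churn
    fn = sum(1 for pred, act in pairs if not pred and act)  # churned, but not flagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical predicted probabilities and what actually happened.
probs = [0.95, 0.80, 0.60, 0.30, 0.10]
churned = [True, True, False, True, False]

for threshold in (0.9, 0.5):
    print(threshold, precision_recall(probs, churned, threshold))
```

With this toy data, lowering the threshold from 0.9 to 0.5 catches more of the real churners (higher recall) at the cost of one false alarm (lower precision).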

Pecan also automatically detects whether your model is a binary classification model.

Multiclass models

Multiclass classification problems are very similar to binary classification, but here inputs are classified into three or more mutually exclusive groups: A, B, C, D, and so on.

For example, imagine you want to segment customers on your e-commerce platform based on their first two minutes of behavior. Are they likely to become a Tier 1 customer, who makes large and frequent purchases; a Tier 2 customer, who buys occasionally; or a Tier 3 customer, who buys only during sales? Once customers are properly classified, each tier can be given a different treatment in order to increase conversions.

Another example is a binary classification problem with an added undefined state: a customer is either a churner (Group A), a non-churner (Group B), or undefined (Group C).

Here, the probability of a customer belonging to each tier or group is calculated, and the most likely one is selected.
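The selection step can be sketched as picking the class with the highest probability. The function name `most_likely_class` and the probabilities below are hypothetical, for illustration only.

```python
def most_likely_class(class_probabilities):
    """Return the class whose predicted probability is highest."""
    return max(class_probabilities, key=class_probabilities.get)


# Hypothetical per-tier probabilities for one customer.
customer = {"Tier 1": 0.15, "Tier 2": 0.60, "Tier 3": 0.25}
tier = most_likely_class(customer)
print(tier)
```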

In Pecan, you can interactively change the probability threshold for each class and see the model's metrics change accordingly.

Pecan also automatically detects whether your model is a multiclass classification model.

Regression models

Regression problems are quantitative: the outcome is a number rather than a label.

For example, imagine we want to predict the lifetime value of a customer on a gaming platform. How much money, let's say in US dollars, is each user likely to spend in the game? Here, the goal is to quantify the value, not to classify users into groups.

Because the outcome is a number, the metrics and algorithms used for regression problems differ from those used for classification.
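To illustrate how regression metrics differ, the sketch below computes mean absolute error (MAE) and root mean squared error (RMSE), two widely used regression metrics. They are shown as general examples, not as a statement of which metrics Pecan reports, and the lifetime-value figures are invented.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, in the target's own units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Like MAE, but penalizes large errors more heavily."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical actual vs. predicted lifetime value in US dollars.
actual_ltv = [120.0, 40.0, 0.0, 310.0]
predicted_ltv = [100.0, 55.0, 10.0, 300.0]

mae = mean_absolute_error(actual_ltv, predicted_ltv)
rmse = root_mean_squared_error(actual_ltv, predicted_ltv)
print(mae, rmse)
```

Unlike precision or recall, both values are expressed in dollars: they measure how far off the predictions are rather than how often a label is right.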

Pecan automatically detects whether your model is a regression model.

How can you decide which type of model is relevant?

Will someone, or something, be either X or not X?

If so, this is a binary classification model.

Will someone, or something, be either A, B, C or D?

If so, this is a multiclass model.

What will the numeric value be for someone or something?

This would be a regression model.
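The three questions above can be approximated in code by looking at the values you want to predict. This is a rough heuristic written for illustration; the function `suggest_model_type` is hypothetical and is not how Pecan detects the model type. In particular, it assumes that class labels are stored as text and that numeric targets with more than two distinct values are quantities.

```python
from numbers import Number


def suggest_model_type(target_values):
    """Guess a model type from example target values (illustrative heuristic)."""
    distinct = set(target_values)
    if len(distinct) < 2:
        raise ValueError("need at least two distinct target values")
    if len(distinct) == 2:
        # X or not X
        return "binary classification"
    # Treat booleans as labels, not numbers.
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in distinct
    )
    if is_numeric:
        # A numeric value to estimate
        return "regression"
    # A, B, C or D
    return "multiclass classification"


print(suggest_model_type(["churn", "no churn", "churn"]))
print(suggest_model_type(["Tier 1", "Tier 2", "Tier 3"]))
print(suggest_model_type([12.5, 0.0, 310.0]))
```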
