Understanding Pecan’s Benchmarks

Benchmarks evaluate ML models by comparing them to simple rule-based models, which helps you understand model performance and communicate its value to stakeholders.

Written by Ori Sagi
Updated over a week ago

In machine learning, benchmarks play a crucial role in measuring the performance of various models. Benchmarks are used to evaluate the effectiveness of algorithms, compare different models, and measure progress.

The benchmark approach has three benefits:

  1. It provides a simple reference point for comparison.
    By comparing the performance of an AI model against a rule-based benchmark, we can estimate the extent of the lift provided by the AI model.

  2. It helps to identify areas where the AI model is underperforming.
    If the AI model cannot outperform the rule-based benchmark, this may indicate limitations in the model, or that more data is needed to improve its performance.

  3. It provides a way to communicate the value of AI to non-technical stakeholders.
    By presenting the lift provided by the AI model compared to the rule-based benchmark, we can help stakeholders understand the potential impact of AI on their business.

How are benchmarks calculated in Pecan?

Binary classification models

To assess the quality of binary classification models (two label categories, for example churn vs. non-churn), two benchmarks are used:

  • The rule-based logic is a simple single-variable model built from a column in the data that is strongly correlated with the Target. It serves as a reference point to confirm that the model has sufficient predictive power and can outperform the benchmark. If a single column achieves better results than a complex model that uses robust algorithms and multiple variables, there is a problem with the model that requires further investigation.

  • The random guess, on the other hand, assumes no logic and predicts the positive class based on its rate in the data. For example, if the Churn rate is 10%, randomly selecting 100 entities would yield about 10 Churn entities on average, resulting in roughly 10% Precision. The random guess is also vital because it shows how much better the model is than a situation where no logic at all is used to detect the positive class.
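
The random-guess comparison can be illustrated with a minimal sketch. The data, the function names `precision` and `lift`, and the definition of lift as a simple ratio of precisions are assumptions made for illustration; they are not taken from Pecan's implementation.

```python
import random

def precision(picked_indices, labels):
    """Share of the picked entities whose true label is positive (1)."""
    if not picked_indices:
        return 0.0
    hits = sum(labels[i] for i in picked_indices)
    return hits / len(picked_indices)

def lift(model_precision, baseline_precision):
    """How many times better the model is than the baseline (assumed definition)."""
    return model_precision / baseline_precision

# Hypothetical data: 1,000 entities with a 10% churn rate.
labels = [1] * 100 + [0] * 900
rng = random.Random(0)
rng.shuffle(labels)

# Random guess: select 100 entities with no logic at all.
random_pick = rng.sample(range(len(labels)), 100)
random_precision = precision(random_pick, labels)

# On average, a random guess achieves precision equal to the positive rate.
expected_random_precision = sum(labels) / len(labels)

print(f"observed random precision: {random_precision:.2f}")
print(f"expected random precision: {expected_random_precision:.2f}")
```

Under these assumptions, a hypothetical model with 40% Precision on its selected entities would show a lift of about 4 over the random guess, computed as `lift(0.40, expected_random_precision)`.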

Regression models

To assess the quality of regression models (numeric label, for example predicting users' LTV), one benchmark is used per model, evaluated on different ranges of values in the data: low, medium, and high.

The benchmark in regression models is a rule-based logic based on a single column from the data with a high correlation to the Target.

How is the rule-based model calculated?

The rule-based model is created the same way in Pecan for both binary classification and regression models. The steps are described below.

Step 1: Choosing the variable

The first step in creating a rule-based benchmark is selecting the variable that will be used to build the rule-based model. This variable is chosen based on its correlation to the Target.

Why is correlation significant?
Correlation measures the strength of the relationship between two variables. By selecting the variable with the highest correlation to the label, we choose the variable with the strongest association with the outcome we are trying to predict.
This increases the chances that the benchmark will be a useful predictor of the label.
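
Variable selection by correlation can be sketched as follows. This is an illustration only: the column names and data are hypothetical, and both the use of Pearson correlation and ranking by absolute value are assumptions, since the article does not specify which correlation measure Pecan uses.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)

def choose_benchmark_column(columns, target):
    """Return the column name with the strongest absolute correlation to the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical columns and churn label (1 = churned).
columns = {
    "days_since_last_login": [1, 2, 30, 45, 3, 60, 2, 50],
    "support_tickets": [0, 3, 1, 2, 0, 1, 4, 0],
}
churned = [0, 0, 1, 1, 0, 1, 0, 1]

best, scores = choose_benchmark_column(columns, churned)
print(best, scores)
```

Ranking by absolute value treats a strong negative correlation as equally informative as a strong positive one, which is a design choice of this sketch.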

Step 2: Generating a rule based on groups of the chosen variable

Once the variable has been selected, the next step is to generate a rule based on groups of the selected variable. This involves dividing the chosen variable into groups and assigning a value to each group based on its relationship to the label.

For example, if the chosen variable is age, we may divide age into groups such as 0-18, 19-30, 31-50, and 51 and above. We would then assign a value to each group based on its relationship to the label. If the label is binary (e.g., 0 or 1), we may assign a value of 0 to groups with a low incidence of the label and a value of 1 to groups with a high incidence of the label.

Once the groups have been assigned values, we can use them to generate a rule-based model. This model will assign a predicted value to each observation based on the group to which it belongs. For example, if an observation belongs to the 31-50 age group, the model will predict a value based on the value assigned to that group.
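
The age example above can be sketched in code for a binary label. The data is hypothetical, and the rule used here (assign 1 to a group whose positive rate is above the overall rate, otherwise 0) is an assumption for illustration; the article does not state the exact threshold Pecan applies.

```python
from bisect import bisect_left

# Upper bounds of the groups 0-18, 19-30, and 31-50; older ages fall into "51+".
AGE_EDGES = [18, 30, 50]
GROUP_NAMES = ["0-18", "19-30", "31-50", "51+"]

def age_group(age):
    """Map an age to its group name."""
    return GROUP_NAMES[bisect_left(AGE_EDGES, age)]

def fit_rule(ages, labels):
    """Assign 0 or 1 to each group based on its positive rate (assumed threshold)."""
    overall_rate = sum(labels) / len(labels)
    totals = {g: 0 for g in GROUP_NAMES}
    positives = {g: 0 for g in GROUP_NAMES}
    for age, label in zip(ages, labels):
        group = age_group(age)
        totals[group] += 1
        positives[group] += label
    rule = {}
    for group in GROUP_NAMES:
        # Empty groups fall back to the overall rate and therefore get 0.
        rate = positives[group] / totals[group] if totals[group] else overall_rate
        rule[group] = 1 if rate > overall_rate else 0
    return rule

def predict(rule, age):
    """Predict the label for one observation from the group it belongs to."""
    return rule[age_group(age)]

# Hypothetical observations.
ages = [16, 17, 22, 25, 28, 35, 40, 45, 55, 60, 65, 70]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1]

rule = fit_rule(ages, labels)
print(rule)
print(predict(rule, 40))
```

For a regression label, a comparable sketch could assign each group the mean label value of its members instead of 0 or 1; that variant is likewise an assumption rather than a description of Pecan's method.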

In summary, creating a benchmark in Pecan involves selecting the variable with the highest correlation to the label and building a rule-based model from groups of that variable. The benchmark model provides a simple reference point for comparison and can be used to evaluate the performance of AI models. By using benchmarks, we can measure the effectiveness of AI models and identify areas where they can be improved.
