Regression dashboards - overview
Written by Linor Ben-El

Once your Pecan model is fully trained, you can view its performance in an interactive dashboard. Pecan’s dashboard provides statistical information and tools so you can assess the accuracy of your model, tailor it to your business needs, and understand the importance of different features in your data.

This article provides an overview of navigating and interpreting the metrics in a dashboard for a regression model.

To view the dashboard for any model you have created:

  1. Click the “Predictive Flows” tab at the top of the screen

  2. Click on a flow with a trained model

Examining predictions on the Test Set

It's important to remember that the metrics displayed in your dashboard are calculated on your Test Set: the final 10% of the training data that Pecan automatically sets aside during the training process. Because the model never sees this data while learning, it serves as fresh data against which the model's predictive performance is evaluated.
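
To illustrate the idea, here is a minimal Python sketch of a 10% chronological holdout using scikit-learn. The data is synthetic, and Pecan performs this split for you automatically; this is only meant to show the concept.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Tiny synthetic example: 100 rows of features and a numeric target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([1.5, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=100)

# Hold out the final 10% of rows as a test set, mirroring the split
# described above (Pecan performs this step automatically).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, shuffle=False
)
print(len(X_train), len(X_test))  # 90 10
```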

Below is a breakdown of each dashboard component for a regression model, listed in order of appearance.

Performance metrics

The regression dashboard highlights key metrics that summarize the performance of the model.

For an explanation of each of these metrics, please take a look at the Model performance metrics for regression models article.
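
As a quick illustration, here is a minimal sketch of how common regression metrics such as MAE, RMSE, and R² are typically computed in Python. The values are made up, and the exact set of metrics shown in your dashboard may differ.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Small illustrative arrays of actual and predicted values.
y_true = np.array([3.0, 0.0, 2.5, 7.0, 4.2])
y_pred = np.array([2.8, 0.3, 2.9, 6.4, 4.0])

mae = mean_absolute_error(y_true, y_pred)           # average absolute error
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # penalizes large errors
r2 = r2_score(y_true, y_pred)                       # share of variance explained

print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```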

Predicted vs. actual vs. benchmark

This graph illustrates a comparative analysis between the model's predicted values, the actual dataset values, and the estimates of a benchmark model.

The X-axis is based on the actual values in the dataset, divided into 100 percentiles. This arrangement lets you see the model's performance across the low, medium, and high percentiles of the data.

The graph offers two display modes: 'zeros included' and 'zeros excluded'. Excluding zeros can be particularly advantageous when dealing with models where a large proportion of the data consists of zeros (such as LTV). This facilitates the exploration of model performance over non-zero observations.
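
To make the mechanics concrete, here is a rough sketch of how such a percentile view could be built with pandas, including an optional zero-exclusion filter. The data is synthetic, and this is an assumption about the general approach, not Pecan's exact implementation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({"actual": rng.exponential(scale=50, size=5_000)})
df.loc[rng.random(5_000) < 0.4, "actual"] = 0.0  # many zeros, as in LTV data
df["predicted"] = df["actual"] * 0.9 + rng.normal(scale=5, size=5_000)

def percentile_view(data: pd.DataFrame, include_zeros: bool = True) -> pd.DataFrame:
    """Mean predicted vs. actual per percentile bin of the actual values."""
    if not include_zeros:
        data = data[data["actual"] > 0]
    # Ranking first avoids duplicate bin edges when many values are equal.
    bins = pd.qcut(data["actual"].rank(method="first"), q=100, labels=False)
    return data.groupby(bins)[["actual", "predicted"]].mean()

print(percentile_view(df, include_zeros=False).head())
```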

Accuracy among different groups

Pecan lets you examine the model's performance across different ranges of values in the data. Entities are split into three groups based on their actual label: low, medium, and high. For each group, the following details are provided:

  • Values: range of the actual values of entities in this group.

  • Group total value: the sum of the actual values of all entities in this group.

  • Group size: the proportion of the data that falls into this group.

  • Mean error: The average error between model predictions and actual values.

  • Benchmark error: The error of a simple rule-based model.

The group boundaries are configurable and can be edited to suit your needs, as in the sketch below.
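
Here is a rough sketch of how such a grouping could be computed with pandas. The group boundaries are hypothetical, and the benchmark used here (always predicting the overall mean) is only an assumption standing in for a simple rule-based model.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame({"actual": rng.gamma(2.0, 25.0, size=2_000)})
df["predicted"] = df["actual"] + rng.normal(scale=8, size=2_000)

# Hypothetical, editable group boundaries (the dashboard lets you adjust these).
limits = [0, 25, 75, np.inf]
df["group"] = pd.cut(df["actual"], bins=limits,
                     labels=["low", "medium", "high"], include_lowest=True)

# A naive benchmark that always predicts the overall mean of the actuals.
benchmark = df["actual"].mean()

summary = df.groupby("group", observed=True).apply(
    lambda g: pd.Series({
        "values": f"{g['actual'].min():.1f}-{g['actual'].max():.1f}",
        "group_total_value": g["actual"].sum(),
        "group_size": len(g) / len(df),
        "mean_error": (g["predicted"] - g["actual"]).abs().mean(),
        "benchmark_error": (benchmark - g["actual"]).abs().mean(),
    })
)
print(summary)
```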

Predicted and actual vs. benchmark over time

This graph shows how your model performed over a period of time by comparing your predictions against actual values and against benchmark predictions.

This graph helps you spot trends, recurring patterns, noise in the data, and unusual points that warrant a closer look.

Pressing the 'Sum' button shows the total predicted and actual values across all entities. For instance, in a model predicting spend over time, it compares the total predicted spend with the total actual spend for each day.

The 'Average' button shows the average predicted and actual values across all entities. In the same spend model, it compares the average predicted spend with the average actual spend per entity for each day.

You can also change the graph to show data by day, week, or month.
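
As an illustration, the sketch below reproduces the 'Sum' and 'Average' views with pandas, along with a change of granularity from daily to weekly. The data is synthetic, and the mechanics are an assumption, not Pecan's implementation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
# 90 days, 20 entities per day, with predicted values near the actuals.
dates = pd.date_range("2024-01-01", periods=90, freq="D").repeat(20)
df = pd.DataFrame({"date": dates,
                   "actual": rng.exponential(10, size=len(dates))})
df["predicted"] = df["actual"] * rng.normal(1.0, 0.1, size=len(df))

# 'Sum' view: total predicted vs. actual spend per day across all entities.
daily_sum = df.groupby("date")[["actual", "predicted"]].sum()

# 'Average' view: mean predicted vs. actual spend per entity per day.
daily_avg = df.groupby("date")[["actual", "predicted"]].mean()

# Changing granularity, e.g. weekly instead of daily.
weekly_sum = daily_sum.resample("W").sum()
print(daily_sum.head(), weekly_sum.head(), sep="\n\n")
```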

Column Importance

When your model is trained, it uses the columns from the Attribute tables to find common patterns and similarities in the Target population. The model assigns different weights to the columns according to the impact they had on predicting the Target.

The importance of each column is calculated by summing the importance of all the AI aggregations (also known as features) that were extracted from the column.
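
Conceptually, the calculation looks like the following sketch. The feature names and importances are hypothetical, with each feature encoded as "column.aggregation" so it can be traced back to its source column.

```python
from collections import defaultdict

# Hypothetical per-feature importances; each AI aggregation (feature) is
# derived from a source column, encoded here as a "column.aggregation" name.
feature_importance = {
    "purchase_amount.sum_30d": 0.22,
    "purchase_amount.mean_90d": 0.13,
    "sessions.count_7d": 0.18,
    "sessions.count_30d": 0.09,
    "country.most_frequent": 0.05,
}

# Column importance = sum of the importances of all features extracted from it.
column_importance = defaultdict(float)
for feature, importance in feature_importance.items():
    column = feature.split(".", 1)[0]
    column_importance[column] += importance

print(dict(column_importance))
# {'purchase_amount': 0.35, 'sessions': 0.27, 'country': 0.05}
```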

For a comprehensive explanation of the widget and how to interpret it, see Understanding Column importance.

Model Output

Located on a separate tab, this table displays a sample of 1,000 predictions from your dataset, including:

  • EntityID & Marker

  • Actual value

  • Predicted value

  • Error (predicted value - actual value)

  • The 10 features that contributed most to the prediction (shown when you click a row).

You can download the full output table to a spreadsheet by clicking Save as CSV.
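
As an illustration, here is a minimal sketch that builds such a table, derives the error column, and writes it out as a CSV, much like the 'Save as CSV' export. The entity IDs and values are synthetic.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
output = pd.DataFrame({
    "entity_id": np.arange(1, 1_001),  # hypothetical entity identifiers
    "actual": rng.exponential(20, size=1_000),
})
output["predicted"] = output["actual"] + rng.normal(scale=3, size=1_000)

# Error is defined as predicted minus actual, matching the table above.
output["error"] = output["predicted"] - output["actual"]

# Equivalent of the dashboard's "Save as CSV" export.
output.to_csv("model_output.csv", index=False)
print(output.head())
```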

For more details, see this article: Understanding Explainability & Prediction Details.
