Once your Pecan model is fully trained, you can view its performance in an interactive dashboard.

Pecan’s dashboard provides statistical information and tools so you can understand the accuracy of your model, tailor it to your business needs, monitor predictions over time, and understand the importance of different features in your data.

This article provides an overview of how to navigate and interpret the metrics in a dashboard for a regression model. For a more extensive explanation of each metric, see Model performance metrics for regression models.

To view the dashboard for any model you have created: log into Pecan, click the “Models” tab at the top of the screen, and then click “Models” in the left-side navigation.

Note that the metrics displayed in your dashboard will be for your Test Set, which is the final 10% of training data that Pecan automatically sets aside during the training process. This set serves as fresh data against which the model’s predictions are compared in order to evaluate its predictive performance.
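To make the split concrete, here is a minimal, illustrative sketch of a chronological 90/10 holdout in pandas. It is not Pecan’s internal implementation, and the column names (entity_id, prediction_date, actual_value) are assumptions:

```python
import pandas as pd

# Hypothetical training data, ordered by date (column names are assumptions).
df = pd.DataFrame({
    "entity_id": range(1000),
    "prediction_date": pd.date_range("2023-01-01", periods=1000, freq="D"),
    "actual_value": range(1000),
}).sort_values("prediction_date")

# Hold out the final 10% of rows as the Test Set; the first 90% is the Train Set.
split_idx = int(len(df) * 0.9)
train_set = df.iloc[:split_idx]
test_set = df.iloc[split_idx:]  # "fresh" data the model has not trained on

print(len(train_set), len(test_set))  # 900 100
```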

Below is a breakdown of each dashboard component for a regression model, listed in order of appearance:

“Technical details” button

Click the “Technical details” button to view basic information about your model and when it will next run. For a regression model, the box below will pop up:

Two bar graphs display the distribution of results in your Train Set and Test Set (the set used to test the model once it’s been trained and validated). If you notice a discrepancy between them, this may indicate an issue with your model; for example, differing performance between the two sets may indicate that the Train Set is not representative of your entire dataset.
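If you want to sanity-check this outside the dashboard, a rough comparison of the two target distributions might look like the sketch below; the actual_value column name is an assumption, not Pecan’s output:

```python
import pandas as pd

# Hypothetical Train/Test target values (column name is an assumption).
train_set = pd.DataFrame({"actual_value": [10, 12, 11, 13, 9, 14, 12, 11, 10, 13]})
test_set = pd.DataFrame({"actual_value": [40, 45, 42]})

def summarize(name: str, values: pd.Series) -> None:
    """Print a few summary statistics so the two distributions can be compared."""
    print(f"{name}: mean={values.mean():.2f}  median={values.median():.2f}  "
          f"std={values.std():.2f}")

summarize("Train Set", train_set["actual_value"])
summarize("Test Set ", test_set["actual_value"])
# A large gap between the two summaries suggests the Train Set may not be
# representative of the full dataset.
```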

“Code” button

Click here to see the code behind your model. The box that pops up will show:

  • the queries that were created for it (“input_query” panel)

  • how these queries were parsed by Pecan to make them machine-readable (“Level_one_query” panel)

  • statistics from the columns that exist in your dataset (“Analyzer_json” panel)

“Use model” button

Click here once you are satisfied with the model and are ready to start generating predictions. You will then select how often you want the model to run.

Date filter

Next to the threshold buttons is a date filter. Clicking it will open a pop-up box, where you can define which part of the dataset will be represented in the dashboard.

Naturally, the prediction results and performance metrics will also be affected by the date range you choose.

  • All to date: include all data up until the End Date (defined as either a “fixed” date or a “moving” date [e.g. 2 weeks ago])

  • All from now: include all data beyond the Start Date (defined as either a “fixed” date or a “moving” date [e.g. 2 days from now])

  • Test Set period: include data from the Test Set (the final 10% of training data, which is set aside and then used to test the model against fresh data)

  • Custom: include data from a custom date range

If you want the selected range to always be displayed by default, select “Save as default” and click “Set range”.
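To illustrate the difference between a “fixed” and a “moving” date, here is a small sketch of how a moving End Date such as “2 weeks ago” could be resolved and applied as a filter. This is not Pecan’s implementation, and the column names are assumptions:

```python
from datetime import date

import pandas as pd

# Hypothetical prediction data (column names are assumptions).
df = pd.DataFrame({
    "prediction_date": pd.date_range("2024-01-01", periods=60, freq="D"),
    "predicted_value": range(60),
})

# A "fixed" date is an explicit calendar date.
fixed_end = pd.Timestamp("2024-02-01")

# A "moving" date is resolved relative to today each time it is evaluated,
# e.g. "2 weeks ago".
moving_end = pd.Timestamp(date.today()) - pd.Timedelta(weeks=2)

# "All to date": keep every row up to and including the chosen End Date.
all_to_date = df[df["prediction_date"] <= fixed_end]
print(len(all_to_date), "rows up to", fixed_end.date())
```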

Performance metrics

The regression dashboard highlights metrics that represent the performance of the model. The panel appears as follows:

This panel will display different metrics depending on the type of regression model being run. At least two of the metrics below will be included:

  • Mean Absolute Percentage Error (MAPE)

    • Expresses prediction accuracy as a measure of error. The lower the score, the better the model. Read more

    • Included for non-LTV regression models

  • Weighted MAPE (WMAPE)

    • Like MAPE, but grants more influence to high-value entities. Read more

    • Included for LTV (Lifetime Value) models

  • Weighted MPE

    • Like WMAPE, but indicates the direction of the mean percentage error. Read more

    • Included for LTV (Lifetime Value) models

  • Explained Variance (R²)

    • Quantifies how well the model fits the data; expresses the percentage of variance in the actual values that is accounted for by the model. The higher the score, the better the model. Read more

    • Included for non-LTV regression models

  • Median Absolute Percentage Error

    • Like MAPE, but uses the median – instead of mean – absolute percentage error. Read more

    • May be included for non-LTV regression models

For an explanation of each of these metrics, see Model performance metrics for regression models.
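For orientation only, the general formulas behind these metrics can be sketched in a few lines of NumPy. Exact definitions, weighting, and sign conventions may differ from Pecan’s, so treat the metrics article as the authoritative reference:

```python
import numpy as np

# Toy actual and predicted values for illustration.
actual = np.array([100.0, 250.0, 40.0, 10.0])
predicted = np.array([110.0, 230.0, 55.0, 8.0])

# Mean Absolute Percentage Error: average of |error| / |actual|, as a percentage.
mape = np.mean(np.abs((actual - predicted) / actual)) * 100

# Weighted MAPE: total absolute error divided by total actual value,
# so high-value entities carry more weight.
wmape = np.sum(np.abs(actual - predicted)) / np.sum(np.abs(actual)) * 100

# Weighted MPE: same weighting, but errors keep their sign, so the result
# shows the direction of the error (the sign convention is an assumption).
wmpe = np.sum(actual - predicted) / np.sum(np.abs(actual)) * 100

# Explained variance (R²-style): share of the variance in the actual values
# that the model accounts for.
r2 = 1 - np.sum((actual - predicted) ** 2) / np.sum((actual - actual.mean()) ** 2)

# Median Absolute Percentage Error: like MAPE, but using the median,
# so a few extreme errors matter less.
medape = np.median(np.abs((actual - predicted) / actual)) * 100

print(f"MAPE={mape:.1f}%  WMAPE={wmape:.1f}%  WMPE={wmpe:.1f}%  "
      f"R2={r2:.3f}  MedAPE={medape:.1f}%")
```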

“Predicted and actual - Over Time” graph

This graph shows how your model performed over a period of time (as defined in the date filter) by comparing your predictions against actual values.

As shown below, the dotted blue line shows your predictions, and the solid blue line shows the actual values.

This visualization can help shed light on overall trends, seasonal cycles, noise in the data, and spikes or abnormalities that require investigation.

  • When the Total button is selected, the graph will display the sum of predicted values and the sum of actual values for all entities.

    For example: in a Lifetime Value (LTV) model, you'll see the total dollar value of all predictions on any given day, compared to the total amount of dollars actually spent on that day.

  • When the Average button is selected, the graph will display the average predicted value and the average actual value for all entities.

    For example: in a Lifetime Value (LTV) model, you’ll see the average dollar value predicted to be spent per entity on any given day, compared to the average amount actually spent per entity on that day.
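As a rough illustration of the difference between Total and Average, here is how the two aggregations might be computed from per-entity predictions; the column names are assumptions, not Pecan’s schema:

```python
import pandas as pd

# Hypothetical per-entity predictions and actuals (column names are assumptions).
df = pd.DataFrame({
    "prediction_date": pd.to_datetime(
        ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02", "2024-01-02"]),
    "predicted_value": [12.0, 8.0, 20.0, 5.0, 15.0],
    "actual_value": [10.0, 9.0, 25.0, 4.0, 11.0],
})

# "Total": the sum of predicted and actual values across all entities per day.
totals = df.groupby("prediction_date")[["predicted_value", "actual_value"]].sum()

# "Average": the mean predicted and actual value per entity per day.
averages = df.groupby("prediction_date")[["predicted_value", "actual_value"]].mean()

print(totals)
print(averages)
```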

Set Breakdowns

On the left side of the panel, clicking the Set Breakdowns button enables you to filter results for certain aggregation levels. In other words, you can segment model performance based on attributes like country code, media source, campaign name, etc.

In the box that pops up, you can select the attributes you’d like to segment performance by. Each checkbox on the left represents a column in your dataset, and the checkboxes on the right represent the different values that occur in that column.

Let’s say you selected “country_code” – now, when you return to the dashboard, you would be able to view model performance for entities that had the values “US” and “CA” in the “country_code” column.

Similarly, if you had selected “bonus_type”, you would be able to segment the data based on entities who had received a particular bonus type.
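Conceptually, a breakdown is similar to grouping predictions by an attribute and computing a performance metric per group. The sketch below illustrates the idea with a hypothetical country_code column and WMAPE; it is not how Pecan computes its breakdowns:

```python
import pandas as pd

# Hypothetical predictions with a segmentation attribute (names are assumptions).
df = pd.DataFrame({
    "country_code": ["US", "US", "CA", "CA", "US"],
    "predicted_value": [12.0, 8.0, 20.0, 5.0, 15.0],
    "actual_value": [10.0, 9.0, 25.0, 4.0, 11.0],
})

def wmape(group: pd.DataFrame) -> float:
    """Weighted MAPE within one segment, as a percentage."""
    abs_error = (group["predicted_value"] - group["actual_value"]).abs().sum()
    return abs_error / group["actual_value"].abs().sum() * 100

# One performance figure per breakdown value, in the spirit of the
# dashboard's segmentation.
per_country = df.groupby("country_code")[["predicted_value", "actual_value"]].apply(wmape)
print(per_country)
```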

“Predicted and actual - Plot” graph

This scatterplot, which is illustrated below, shows how close a model’s predicted values are to the actual values. Data is displayed only for the date range selected in the date filter.

Each blue dot represents an entity; hover over a dot to view its precise values. You can also click and drag to zoom in on specific parts of the graph.

When the Predicted vs Actual button is selected, you’ll view both the predicted and actual value for each entity.

Here are a couple of examples to help you interpret the graph:

  • As you can see above, on the right side of the graph, there was a prediction of $19.71 for one entity – and the actual value turned out to be $66.54.

  • Meanwhile, there were hundreds of entities with actual values around $4, and the predictions for those entities tended to range between approximately $1.50 and $7. If you were to zoom in, you would see each individual entity and its specific values.

    • Below is a zoomed-in section of a plot for “Predicted vs. Actual” values:

When the Residuals button is selected, you’ll view the “residual” for each entity: the distance between the predicted value and the actual value.

This plot shows how far off each prediction was, and it also makes it easy to spot outliers.

As you can see, the residual generally increases as the absolute value of actual results increases (since greater values become harder to predict).

Note that another way to visualize the residual would be to view the distance between the predicted value and the regression line for the model, which is not shown in the dashboard.
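For intuition, a residual is simply the difference between the actual and predicted value for each entity. Below is a minimal sketch with hypothetical numbers echoing the example above; the sign convention (actual minus predicted) is an assumption:

```python
import numpy as np

# Hypothetical values; the sign convention may differ from the dashboard's.
actual = np.array([2.1, 4.0, 3.8, 66.54, 5.2])
predicted = np.array([1.9, 4.5, 3.0, 19.71, 6.0])

residuals = actual - predicted
print(residuals)  # approximately [ 0.2  -0.5   0.8  46.83 -0.8 ]

# The entity with the largest absolute residual is the kind of outlier that
# stands out on the Residuals plot.
largest = int(np.argmax(np.abs(residuals)))
print("largest residual:", residuals[largest], "at index", largest)
```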

Feature Importance Widget

This widget communicates the importance of the features that exist in your data, or, in other words, how strongly they contribute to your model’s predictions.

Listed on the left side of the widget are the top 20 features that contribute the most to your predictions, as determined during initial model training.

Clicking on each feature will load a Feature Importance Graph (a.k.a. Partial Dependency Plot) on the right side of the widget. This graph shows the effect of each feature on your model’s predictions.

For a comprehensive explanation of the widget and how to interpret everything in it, see The Feature Importance Widget.
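For intuition on what a Partial Dependency Plot shows, here is a minimal, hand-rolled sketch on toy data: one feature is swept across a grid while the others stay fixed, and the model’s average prediction is recorded at each point. This is illustrative only and not Pecan’s implementation:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Toy data with two hypothetical features; the first one drives the target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 2))
y = 3 * X[:, 0] + rng.normal(0, 1, size=500)

model = GradientBoostingRegressor().fit(X, y)

# Manual partial-dependence curve for feature 0: sweep it across a grid of
# values, hold everything else at its observed values, and average the
# model's predictions at each grid point.
grid = np.linspace(0, 10, 25)
pd_curve = []
for value in grid:
    X_mod = X.copy()
    X_mod[:, 0] = value
    pd_curve.append(model.predict(X_mod).mean())

# pd_curve traces how the average prediction changes as the feature changes,
# which is what a partial dependency plot visualizes.
print(list(zip(grid.round(1)[:5], np.round(pd_curve, 2)[:5])))
```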

Preview Table

Located at the bottom of the dashboard, this table displays the first 100 predictions in your dataset, including:

  • the customer ID

  • the prediction itself (the numeric value predicted for that entity)

  • the top 10 contributing features, according to their SHAP values.

The round icons are designed to help you identify, quickly and easily, which features contribute most strongly to each prediction, and in what direction:

  • An “up” arrow indicates that a feature pushes the prediction toward a higher value, while a “down” arrow indicates that it pushes the prediction toward a lower value.

  • The darker the purple of the icon, the stronger the effect of that feature.

You can download the full output table to a spreadsheet by clicking Save as CSV.
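To give a sense of how such a “top contributors” view can be derived from SHAP values, here is an illustrative sketch for a single prediction; the feature names and values are hypothetical, and this is not Pecan’s output format:

```python
import numpy as np

# Hypothetical SHAP values for a single prediction (one value per feature);
# positive values push the prediction higher, negative values push it lower.
feature_names = ["days_since_signup", "total_purchases", "country_US",
                 "sessions_last_week", "avg_order_value"]
shap_values = np.array([0.8, -2.3, 0.1, 1.7, -0.4])

# Rank features by the magnitude of their contribution, strongest first.
order = np.argsort(-np.abs(shap_values))
for rank, idx in enumerate(order, start=1):
    direction = "up" if shap_values[idx] > 0 else "down"
    print(f"{rank}. {feature_names[idx]:<20} {direction:<4} ({shap_values[idx]:+.2f})")
```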
