Once your Pecan model is fully trained, you can view its performance in an interactive dashboard.
Pecan’s dashboard provides statistical information and tools to help you understand how accurate your model is, tailor it to your business needs, monitor predictions over time, and discover the importance of different columns in your data.
It’s important to remember that the metrics displayed in your dashboard are for your Test Set, which is the latest 10% of the training data that Pecan automatically sets aside during the training process. Pecan treats this data as “fresh data”: it asks your model to generate predictions for it and then compares those predictions against the actual outcomes in order to evaluate the predictive performance of your model.
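As a rough illustration of how such a chronological holdout works, here is a minimal sketch in pandas. The DataFrame, file-free setup, and column names ("event_date", "churned") are hypothetical and not part of Pecan's actual pipeline.

```python
import pandas as pd

# Hypothetical training data; "event_date" and "churned" are illustrative
# column names, not Pecan's actual schema.
df = pd.DataFrame({
    "event_date": pd.date_range("2023-01-01", periods=100, freq="D"),
    "churned": [0, 1] * 50,
})

df = df.sort_values("event_date")
split_index = int(len(df) * 0.9)       # the latest 10% is set aside

train_set = df.iloc[:split_index]      # used to train the model
test_set = df.iloc[split_index:]       # "fresh data" used only for evaluation

print(len(train_set), len(test_set))   # 90 and 10 rows
```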
Below is a breakdown of each dashboard component for a binary model, listed in order of appearance.
Head metric & Quality comparison test
The first widget displays the model's Precision and compares it to both a benchmark model and a random guess:
The benchmark is a simple rule-based model built from a single column in the data that has a strong correlation with the Target column. It serves as a reference point to confirm that the model has sufficient predictive power to outperform this simple rule.
The random guess, on the other hand, assumes no logic at all and predicts the positive class at its base rate in the data. For example, if the Churn rate is 10%, randomly flagging 100 entities would yield about 10 churned entities, resulting in 10% Precision. The random guess is important because it shows how much better the model performs compared to a situation where no predictive logic is used to detect the positive class.
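To make the random-guess baseline concrete, here is a small illustrative sketch; the 10% churn rate and entity counts follow the example above and are not values taken from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(42)

churn_rate = 0.10                        # 10% of entities are positive ('1')
actuals = rng.random(10_000) < churn_rate

# A random guess flags entities as positive with no logic at all,
# at the same rate as the positive class.
random_flags = rng.random(10_000) < churn_rate

# Precision = correctly flagged positives / all flagged positives.
precision = (actuals & random_flags).sum() / random_flags.sum()
print(f"Random-guess precision: {precision:.2%}")   # roughly the 10% base rate
```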
By default, the head metric is Precision, but if your priority is Recall, you can modify it in the "Advanced Performance Details" section. Once changed, the model will be evaluated against a benchmark and a random guess based on the Recall metric.
Advanced Performance Details
Precision & Recall
This widget presents both the Precision and Recall metrics of the model and shows how many entities from the test set were used to evaluate it. You can also change the head metric to either Precision or Recall within this section.
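For reference, this is how Precision and Recall can be computed from test-set labels and predictions using scikit-learn; the labels below are made up for illustration and do not come from Pecan.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical test-set labels and model predictions (0 = negative, 1 = positive).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

# Precision: of the entities predicted positive, how many actually are positive.
print("Precision:", precision_score(y_true, y_pred))
# Recall: of the actual positives, how many the model found.
print("Recall:", recall_score(y_true, y_pred))
```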
Threshold Selection
The threshold is a parameter in machine learning that determines the cutoff point for classifying predicted outcomes as positive or negative. It can be adjusted after training the model to meet specific business needs. Pecan allows you to adjust the threshold for optimal model performance.
By changing the threshold, the proportions of predicted positive and negative outcomes can be altered, impacting overall model performance.
It's important to note that adjusting the threshold doesn't change the model itself; it only changes the cutoff applied to each prediction's probability score when assigning it to one of the two classes. Pecan sets a default threshold based on the optimal balance of precision and recall to provide the best results.
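The sketch below illustrates this idea with synthetic probability scores: the model's scores stay fixed, and only the cutoff changes, shifting the balance between Precision and Recall. The numbers are invented for illustration and are not Pecan's defaults.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical probability scores from an already-trained model; the model is
# untouched, only the cutoff applied to its scores changes.
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1, 0, 1])
scores = np.array([0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.65, 0.80, 0.30, 0.90])

for threshold in (0.3, 0.5, 0.7):
    y_pred = (scores >= threshold).astype(int)
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

Lowering the threshold flags more entities as positive, which typically raises Recall at the cost of Precision; raising it does the opposite.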
The graph illustrates the distribution of probability scores for entities in negative and positive classes. A clear separation between the classes indicates that the model can effectively distinguish between them and assign low probability scores to the negative class and high probability scores to the positive class.
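As a rough sketch of what a well-separated graph represents, the snippet below plots synthetic probability scores for the two classes; it is not Pecan's actual chart or data.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic probability scores, split by the actual class of each entity.
rng = np.random.default_rng(0)
negative_scores = rng.beta(2, 8, size=1000)   # negatives clustered at low scores
positive_scores = rng.beta(8, 2, size=1000)   # positives clustered at high scores

plt.hist(negative_scores, bins=30, alpha=0.6, label="Actual negatives ('0')")
plt.hist(positive_scores, bins=30, alpha=0.6, label="Actual positives ('1')")
plt.xlabel("Predicted probability score")
plt.ylabel("Number of entities")
plt.legend()
plt.show()
```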
Confusion Matrix
A confusion matrix is like a report card for our model. It tells us how well the model did in predicting the two classes. The confusion matrix has two main sections: Predicted as Negative and Predicted as Positive. How entities are divided between the two is determined by the threshold.
Within these two sections are four elements, determined by the scores the model assigns to the entities:
True '1': This is when the model correctly predicts that something is positive.
False '1': This is when the model incorrectly predicts that something is positive.
True '0': This is when the model correctly predicts that something is negative.
False '0': This is when the model incorrectly predicts that something is negative.
The confusion matrix shows us how many times our model made each of these types of predictions. By looking at the confusion matrix, we can see how well our model is doing and whether it is making more mistakes with false positives or false negatives.
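As an illustrative example, the snippet below derives these four elements from a small set of made-up test-set labels and predictions using scikit-learn's confusion_matrix; the values are not from a real model.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical test-set labels and model predictions after applying the threshold.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels, ravel() returns the counts in this order:
# true '0', false '1', false '0', true '1'
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"True '1'  (correct positives):   {tp}")
print(f"False '1' (incorrect positives): {fp}")
print(f"True '0'  (correct negatives):   {tn}")
print(f"False '0' (missed positives):    {fn}")
```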