Generally, predictive models are designed to maximize prediction accuracy and minimize prediction error. But with Pecan, you can optimize the model-training process to create a model that best serves your business needs.
This is accomplished by selecting an optimization metric for your model. When you do so, you instruct Pecan to optimize the model's performance according to that metric, which in turn determines how much weight is given to each type of prediction error.
Once you choose an optimization metric, you can also optimize the model on the basis of the metric’s performance for a specific data aggregation and/or segment of the population.
Let’s illustrate this feature with an example…
A typical example
Say you’d like to reduce churn for your app by offering an incentive to the 1,000 users who are most likely to churn within the next month. To do this, you train a model to predict the likelihood of each user to churn within the next 30 days.
You configure the model’s threshold so your Precision Rate is 85% and the Detection Rate is 70%. This means that 85% of the users the model flags as likely to churn actually do churn, and that the model detects 70% of all the users who actually churn.
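To make the two rates concrete, here is a minimal sketch of how they are computed from actual and predicted labels. The function name and the toy data are illustrative assumptions, not part of Pecan's product or API.

```python
def precision_and_detection_rate(actual, predicted):
    """Return (precision, detection_rate) for binary labels, where 1 = churn."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)

    flagged = true_pos + false_pos      # everyone the model predicted would churn
    churned = true_pos + false_neg      # everyone who actually churned

    precision = true_pos / flagged if flagged else 0.0
    detection_rate = true_pos / churned if churned else 0.0
    return precision, detection_rate


# Hypothetical toy data: 8 users, 4 of whom actually churned.
actual = [1, 1, 1, 0, 0, 1, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]

precision, detection_rate = precision_and_detection_rate(actual, predicted)
print(f"Precision Rate: {precision:.2f}, Detection Rate: {detection_rate:.2f}")
```

Here the model flags three users, two of whom churn (precision of about 0.67), and it catches two of the four actual churners (detection rate of 0.50).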
The question is: does it make sense to optimize these metrics for your entire dataset when you’re only able to perform a business treatment on 1,000 users? In this case, you’re better off focusing on the accuracy of those top 1,000 churn predictions, even at the expense of all the rest. In other words, the cost of False Positives is relatively high.
Therefore, you would want to optimize the model to achieve the highest possible Precision Rate, but only for the top 1,000 users (those most likely to churn). This will enable you to predict for them with a high degree of confidence, and thus justify directing your limited resources towards them – even if it increases prediction error for all other users.
So, in this case, when configuring your Pecan model, you would select “Precision for Top 1,000 entities” as your optimization metric.
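As a rough illustration of what a "precision for top K entities" metric measures, the sketch below ranks users by predicted churn score and computes precision over only the K highest-scoring ones. The function name, the scores, and the small K are assumptions chosen for readability; this is not Pecan's implementation.

```python
def precision_at_top_k(scores, actual, k):
    """Precision among the k entities with the highest predicted scores."""
    # Rank entities from most to least likely to churn.
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    top_k = ranked[:k]
    if not top_k:
        return 0.0
    # Share of the top-k entities that actually churned (label 1).
    return sum(label for _, label in top_k) / len(top_k)


# Hypothetical churn scores and outcomes for six users.
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.20]
actual = [1, 1, 0, 1, 0, 0]

top3 = precision_at_top_k(scores, actual, k=3)
print(f"Precision for top 3 entities: {top3:.2f}")
```

Errors made outside the top K do not affect this metric at all, which is why it suits a treatment that can only reach a fixed number of users.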
Deciding which optimization metric to use
Which metric you choose for your model will depend on three key factors:
Your business use case (e.g. churn, conversion), as expressed in your predictive question
The cost of taking action on the basis of predictions (a.k.a. the business treatment)
The cost of failing to take action on the basis of predictions (in the case of False Negatives)
As an example, imagine you want to train a model to predict customer churn. These are two paths of reasoning (among several) for selecting a particular optimization metric:
The cost of taking action to prevent churn is relatively high. If you would like to minimize the error of taking action for customers who were predicted to churn but actually did not, you would want to optimize for a high Precision Rate. This will lead to fewer False Positive predictions.
The cost of not taking action to prevent churn (in cases where it occurs) is relatively high. If you would like to minimize the error of failing to take action for customers who were not predicted to churn but ended up doing so, you would want to optimize for a high Detection Rate. This will lead to fewer False Negative predictions.
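The two paths of reasoning above can be sketched as a simple cost comparison. A strict threshold produces fewer False Positives, while a lenient one produces fewer False Negatives, and which is cheaper depends on the cost you attach to each error. All names, scores, thresholds, and costs below are hypothetical values for illustration only.

```python
def count_errors(scores, actual, threshold):
    """Return (false_positives, false_negatives) at a given score threshold."""
    false_pos = sum(1 for s, a in zip(scores, actual) if s >= threshold and a == 0)
    false_neg = sum(1 for s, a in zip(scores, actual) if s < threshold and a == 1)
    return false_pos, false_neg


def total_cost(errors, cost_per_false_pos, cost_per_false_neg):
    """Combine error counts with the business cost of each error type."""
    false_pos, false_neg = errors
    return false_pos * cost_per_false_pos + false_neg * cost_per_false_neg


# Hypothetical churn scores and outcomes for six customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
actual = [1, 0, 1, 0, 1, 0]

strict = count_errors(scores, actual, threshold=0.75)   # favors precision
lenient = count_errors(scores, actual, threshold=0.35)  # favors detection

# Path 1: taking action is expensive, so False Positives hurt most.
print(total_cost(strict, 50, 10), total_cost(lenient, 50, 10))
# Path 2: missing a churner is expensive, so False Negatives hurt most.
print(total_cost(strict, 10, 50), total_cost(lenient, 10, 50))
```

With these numbers the strict threshold is cheaper when action is costly, and the lenient threshold is cheaper when missed churn is costly.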
For a brief description of each metric and when to use them, see here for binary models and here for regression models.
How to select an optimization metric
Once you’ve created a notebook for your model, you can select and configure an optimization metric for training purposes. For step-by-step instructions and to learn about each metric, see the following articles:
Note: Pecan does not currently support optimizing a model itself. Instead, multiple models are trained in parallel, and Pecan chooses the best-performing model based on your desired settings.
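Conceptually, choosing among several trained models by a metric might look like the sketch below. This is only an assumed illustration of the idea described in the note, not Pecan's actual selection logic; the model names, scores, and helper functions are all hypothetical.

```python
def precision_at_top_k(scores, actual, k):
    """Precision among the k entities with the highest predicted scores."""
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    top_k = ranked[:k]
    return sum(label for _, label in top_k) / len(top_k) if top_k else 0.0


def pick_best_model(candidate_scores, actual, metric):
    """Return the name of the candidate whose scores maximize the metric."""
    return max(candidate_scores, key=lambda name: metric(candidate_scores[name], actual))


# Hypothetical outcomes and the scores two candidate models assign to six users.
actual = [1, 0, 1, 0, 1, 0]
candidate_scores = {
    "model_a": [0.9, 0.8, 0.7, 0.2, 0.1, 0.3],
    "model_b": [0.9, 0.2, 0.8, 0.1, 0.3, 0.7],
}

best = pick_best_model(
    candidate_scores,
    actual,
    metric=lambda s, a: precision_at_top_k(s, a, k=2),
)
print(f"Best model by precision for top 2 entities: {best}")
```

Changing the metric passed to the selection step can change which candidate wins, even though the candidates themselves are unchanged.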
If you have any questions about selecting an optimization metric for your model, feel free to reach out to a Pecan expert or your customer success manager.