Pecan

Have you ever studied for a test and memorized everything instead of understanding it? 

If the test contains the exact questions you've memorized, you'll get an A, but with new questions you've never seen before... let's say you won't be at the top of the class.

That's what happens when a model overfits: it has memorized the answers instead of actually learning how to solve the problem.

Overfitting occurs when your model's predictions correspond too closely (or even exactly) to the entities in the training data set. As a result, the model predicts poorly for new, unseen entities.

You can think of a binary model as a line that separates two groups (e.g. churn and non-churn) and a regression model as a line that tries to pass through all the right points. When a model overfits, it looks too good to be true: the line follows every training point, including the noise.

When a model underfits, it oversimplifies the problem, so it performs poorly even on the training data.
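To make the "line through the points" picture concrete, here is a minimal sketch on synthetic data (not Pecan's modeling code). It fits polynomials of increasing degree to ten noisy points using NumPy; the data, the degrees, and the helper names are all assumptions chosen for illustration. A low degree underfits, while a degree high enough to pass through every training point overfits.

```python
import numpy as np
from numpy.polynomial import Polynomial


def make_data(n, rng):
    """Noisy samples of a smooth curve (synthetic, for illustration only)."""
    x = np.sort(rng.uniform(0.0, 1.0, n))
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y


def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((model(x) - y) ** 2))


def fit_and_score(degree, train, test):
    """Fit on the training points, then score on both training and test points."""
    model = Polynomial.fit(train[0], train[1], degree)
    return mse(model, *train), mse(model, *test)


rng = np.random.default_rng(0)
train = make_data(10, rng)   # few entities to learn from
test = make_data(200, rng)   # new, unseen entities

# Degree 1 is too simple, degree 9 can pass through all 10 training points.
scores = {d: fit_and_score(d, train, test) for d in (1, 3, 9)}
for degree, (train_mse, test_mse) in scores.items():
    print(f"degree {degree}: train MSE={train_mse:.4f}, test MSE={test_mse:.4f}")
```

The degree-9 fit should show a training error close to zero alongside a much larger test error, which is the "too good to be true" pattern described above.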

1. Too many attributes: Too many columns of data (“<a href="https://help.pecan.ai/en/articles/6454371-understanding-eta-entity-target-attribute#UnderstandingETA(Entity,Target,Attribute)-WhatisanAttributetable?">Attributes</a>”) may cause the model to try and use all of them to make predictions, even if some are not useful for the task. This can make the model overly complex and cause it to fit the training data too closely.
2. Not enough entities: If there aren't enough entities for the model to learn from, it might overfit by learning the entities too well and not being able to generalize to new data.
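Both causes can be reproduced on a toy problem. The sketch below is an illustration under stated assumptions, not a description of how Pecan trains models: it uses plain least squares from NumPy on synthetic data where only the first attribute actually drives the target and every other column is noise. The function name and the chosen sizes are hypothetical.

```python
import numpy as np


def train_test_mse(n_train, n_attributes, rng, n_test=1000):
    """Fit least squares on synthetic data and return (train MSE, test MSE).

    Only the first attribute influences the target; the rest are pure noise.
    """
    def sample(n):
        X = rng.normal(size=(n, n_attributes))
        y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=n)
        return X, y

    X_train, y_train = sample(n_train)
    X_test, y_test = sample(n_test)
    # With fewer rows than columns, lstsq returns the minimum-norm solution,
    # which reproduces the training targets almost exactly.
    weights, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    train_mse = float(np.mean((X_train @ weights - y_train) ** 2))
    test_mse = float(np.mean((X_test @ weights - y_test) ** 2))
    return train_mse, test_mse


rng = np.random.default_rng(42)
few_entities = train_test_mse(n_train=20, n_attributes=30, rng=rng)
many_entities = train_test_mse(n_train=2000, n_attributes=30, rng=rng)
few_attributes = train_test_mse(n_train=20, n_attributes=2, rng=rng)

for label, (train_mse, test_mse) in [
    ("20 entities, 30 attributes", few_entities),
    ("2000 entities, 30 attributes", many_entities),
    ("20 entities, 2 attributes", few_attributes),
]:
    print(f"{label}: train MSE={train_mse:.4f}, test MSE={test_mse:.4f}")
```

With 30 attributes and only 20 entities the model can match the training data almost perfectly while doing far worse on new data. Adding entities or removing attributes narrows that gap.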

At Pecan, we have a unique way of detecting overfitting. We compare performance across the training, validation, and test sets. If the gap exceeds 10%, we raise an alert, which is available in the model's dashboard. This approach helps us ensure that the model's predictions are not just accurate on historical data but will also remain robust for future, unseen data.
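As a rough illustration of this kind of check, here is a minimal sketch. It is not Pecan's implementation: the article does not say which metric is compared or how the gap is computed, so this version assumes a higher-is-better score (such as AUC or precision) and measures the relative drop from the training score to the weaker of the two held-out scores. The function names and the relative-gap definition are assumptions.

```python
def overfit_gap(train_score, validation_score, test_score):
    """Relative drop from the training score to the weakest held-out score.

    Assumes higher scores are better and that train_score is positive.
    """
    if train_score <= 0:
        raise ValueError("train_score must be positive")
    worst_holdout = min(validation_score, test_score)
    return (train_score - worst_holdout) / train_score


def overfit_alert(train_score, validation_score, test_score, threshold=0.10):
    """Return True when the gap exceeds the threshold (10% by default)."""
    return overfit_gap(train_score, validation_score, test_score) > threshold


# Hypothetical scores: strong on training data, noticeably weaker on held-out data.
print(overfit_alert(train_score=0.95, validation_score=0.80, test_score=0.78))
# Hypothetical scores: similar performance across all three sets.
print(overfit_alert(train_score=0.90, validation_score=0.88, test_score=0.87))
```

An absolute difference in score would be an equally reasonable reading of "gap"; the relative version is used here only to keep the example concrete.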

If you suspect overfitting, we recommend taking two steps:

1. Reduce the number of <a href="https://help.pecan.ai/en/articles/6454371-understanding-eta-entity-target-attribute#UnderstandingETA(Entity,Target,Attribute)-WhatisanAttributetable?">attribute columns</a> fed into your model. This helps you achieve a leaner model that is more generalizable to future data. Generally, you will want to remove an attribute column if:
 1. The attribute is unlikely to be causally related to your predictions.
 2. The attribute causes leakage, meaning it reflects information in your training set that won’t be available for future data.
 3. The attribute’s impact on your model (as reflected by “Feature Importance”) seems unusual.
2. Add more entities. Providing the model with more entities to learn from can increase the variety of patterns and behaviors it can learn and increase its predictive power for future data.
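If you export feature importance values, the second and third criteria above can be partly screened for in a few lines. The sketch below is only a starting point for a manual review, not a Pecan feature: the attribute names are hypothetical, and treating "more than half of the total importance" as unusual is an arbitrary assumption. The first criterion (causal relevance) still requires knowledge of your business.

```python
def attributes_to_review(importances, known_leaky=(), dominance_threshold=0.5):
    """Return attribute names worth a manual look before retraining.

    importances: dict mapping attribute name -> non-negative importance score.
    known_leaky: names you already know won't be available for future data.
    dominance_threshold: share of total importance treated as unusual (assumed).
    """
    total = sum(importances.values())
    # Start with attributes already known to leak information.
    flagged = set(known_leaky) & set(importances)
    if total > 0:
        for name, score in importances.items():
            # One attribute dominating the model is often a sign of leakage.
            if score / total > dominance_threshold:
                flagged.add(name)
    return sorted(flagged)


# Hypothetical importance values for a churn model.
importances = {
    "days_since_signup": 0.05,
    "cancellation_reason": 0.75,
    "plan_type": 0.12,
    "refund_issued": 0.08,
}
review = attributes_to_review(importances, known_leaky=["refund_issued"])
print(review)
```

Here `cancellation_reason` is flagged because it dominates the model, and `refund_issued` because it was listed as unavailable at prediction time.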

In the event that you suspect your model might be overfitting, or if you're seeking guidance on enhancing its performance, please feel free to reach out to the Pecan team. As part of our commitment to your success, we're always prepared to provide in-depth assistance, share further insights, and help you navigate potential challenges.

Overfitting is when a model memorizes training data instead of learning patterns. Resolve it by reducing attributes and adding more data.

What is overfitting?
