Model building is an iterative process that often requires several rounds of adjustment and refinement. For this reason, we recommend keeping the following points in mind when you start your first notebook.
Choose a Simple Predictive Question
To familiarize yourself with the platform and get your process up to speed, it’s best to start with a simple predictive question:
1. Define Your Target Simply
Choosing a clear target that's easily identifiable in your data will help achieve results in your first model. For example, if your target is customer churn, there should be a clear indication in your data that a particular customer is no longer purchasing, or that they cancelled their subscription.
2. Start with an Event Trigger, Not a Recurring Prediction
Recurring predictions can introduce additional complexity, depending on the use case and how often predictions are made. We generally recommend starting with an event-triggered approach rather than sampling daily or weekly.
For example, a sign-up, a transaction, or another event can trigger a prediction, whereas recurring sampling generates predictions on a fixed schedule, such as daily, weekly, or monthly.
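To make the difference concrete, here is a minimal Python sketch rather than anything Pecan-specific. The `subscriptions` records, the `churn_label` rule, and both helper functions are hypothetical and exist only for illustration. It labels churn from an explicit cancellation date and then compares how many prediction rows each approach produces.

```python
from datetime import date, timedelta

# Hypothetical records: (customer_id, signup_date, cancelled_at or None).
subscriptions = [
    ("c1", date(2024, 1, 5), None),
    ("c2", date(2024, 1, 9), date(2024, 3, 2)),
]


def churn_label(cancelled_at):
    """Target: 1 if the data shows an explicit cancellation, otherwise 0."""
    return 1 if cancelled_at is not None else 0


def event_triggered_rows(records):
    """One prediction row per customer, dated at the sign-up event."""
    return [(cid, signup, churn_label(cancelled)) for cid, signup, cancelled in records]


def recurring_dates(records, end, step_days=7):
    """One prediction date per customer per sampling step, from sign-up until `end`."""
    rows = []
    for cid, signup, _cancelled in records:
        current = signup
        while current <= end:
            rows.append((cid, current))
            current += timedelta(days=step_days)
    return rows


triggered = event_triggered_rows(subscriptions)
weekly = recurring_dates(subscriptions, end=date(2024, 2, 29))

print(len(triggered))  # 2 rows: one per sign-up
print(len(weekly))     # 16 rows: eight weekly dates for each customer
```

With recurring sampling, each of those rows would also need its own label relative to its sampling date, which is part of the added complexity.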
Generate Queries Step by Step
When using the AI assistant to generate SQL queries, build them one step at a time. The chat will walk you through each stage and offer extra details to help you better understand the process.
Import Only the Relevant Data for Your Model
We recommend connecting Pecan directly to your data source using our connectors. Once you’ve connected your data, begin by importing only the tables and columns that are directly relevant to your model. Focus on the columns needed to address your predictive question: those that identify the entity, the target, and the relevant dates.
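The selection idea can be sketched in a few lines. This is only an illustration using Python's built-in `sqlite3` module, not how Pecan's connectors work. The `orders` table, its columns, and the `RELEVANT_COLUMNS` list are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical wide source table; most columns are irrelevant to the question.
conn.execute(
    """
    CREATE TABLE orders (
        order_id        TEXT,
        customer_id     TEXT,  -- identifies the entity
        order_date      TEXT,  -- relevant date
        amount          REAL,  -- used to derive the target
        warehouse_code  TEXT,
        internal_notes  TEXT,
        last_synced_at  TEXT
    )
    """
)

# Name the needed columns explicitly instead of using SELECT *.
RELEVANT_COLUMNS = ["customer_id", "order_date", "amount"]
query = f"SELECT {', '.join(RELEVANT_COLUMNS)} FROM orders"

cursor = conn.execute(query)
imported_columns = [column[0] for column in cursor.description]
print(imported_columns)  # ['customer_id', 'order_date', 'amount']
```

Listing columns explicitly keeps the first model small and makes it obvious what each later addition contributes.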
Start Simple: Work with a Small Number of Attributes
Start by working with a small number of attributes from one table in one attribute query to simplify the initial model and ensure it’s manageable.
Once you’ve established a solid foundation, you can gradually expand the model by duplicating your notebook and adding more attributes and attribute queries (manually or using the “Generate attribute cell” button).
This incremental approach helps maintain focus and makes it easier to identify potential issues, such as data leakage, early on.
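One common source of leakage is an attribute computed from data recorded after the prediction date. The sketch below shows a small attribute query that guards against this, run with Python's built-in `sqlite3` module. The `entity` and `transactions` tables, the `marker_date` column, and the attribute names are hypothetical, and the SQL dialect in your environment may differ from SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entity (customer_id TEXT, marker_date TEXT);
    CREATE TABLE transactions (customer_id TEXT, tx_date TEXT, amount REAL);

    INSERT INTO entity VALUES ('c1', '2024-03-01'), ('c2', '2024-03-01');
    INSERT INTO transactions VALUES
        ('c1', '2024-01-10', 20.0),
        ('c1', '2024-02-15', 30.0),
        ('c1', '2024-03-05', 99.0),  -- after the prediction date: must be excluded
        ('c2', '2024-02-20', 10.0);
    """
)

# Two attributes from a single table, restricted to rows before the prediction date.
attribute_sql = """
SELECT e.customer_id,
       COUNT(t.tx_date)           AS tx_count,
       COALESCE(SUM(t.amount), 0) AS tx_total
FROM entity AS e
LEFT JOIN transactions AS t
       ON t.customer_id = e.customer_id
      AND t.tx_date < e.marker_date
GROUP BY e.customer_id
ORDER BY e.customer_id
"""

attributes = conn.execute(attribute_sql).fetchall()
print(attributes)  # [('c1', 2, 50.0), ('c2', 1, 10.0)]
```

The 99.0 transaction dated after the prediction date is left out of the totals for `c1`. With only two attributes in play, a slip like a missing date filter is much easier to spot than in a query that builds dozens of attributes at once.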