Understanding the "Pecan Number" concept

Pecan's "Pecan Number" uniquely identifies each transactional activity, offering granular data processing for better predictive accuracy

Written by Linor Ben-El
Updated over a week ago

Pecan's predictive model offers a unique approach to data structuring, allowing you to utilize various tables, including crucial transactional tables, to feed into your machine-learning models.
However, when dealing with extensive transactional data for a single entity, the challenge lies in differentiating and processing individual transactions. This is where the concept of the "Pecan Number" emerges in Pecan's framework. This unique identifier lends granularity to Pecan's data processing mechanism, providing an effective way to handle extensive transactional data. This article aims to clarify the role of the Pecan Number within the Pecan model, its interplay with transactional tables, and its impact on enhancing the model's predictive accuracy.

Where should a Pecan Number be used?

To contextualize the role of the Pecan Number, it's crucial to understand the broader framework of the Pecan predictive model. Pecan operates by constructing Entity, Target, and Attribute (ETA) tables using SQL queries, which are then transformed, enriched, and flattened by Pecan's platform into a comprehensive AI-ready dataset. The purpose is to align the data with the predictive question and enable machine-learning algorithms to recognize patterns and make predictions.

  1. Entity Table: Defines who the model will predict for, and when.

  2. Target Table: Defines what we want to predict.

  3. Attribute Table: Defines which variables the model will use to generate predictions, such as transaction history, demographic information, etc. The Pecan Number is used here, in attribute tables that are transactional (i.e., where there may be multiple events per customer).

The Role of the Pecan Number

As your model ingests data, it attempts to create predictions for the set of past entities and compare those predictions against actual occurrences. Each entity is defined by a unique pairing of a customer ID and a marker date.

In the architecture of Pecan's model, multiple transactions may occur for a single entity (a particular customer at a specific point in time). This is where the Pecan Number comes into play: each transaction associated with an entity is assigned a distinct Pecan Number, so that it can be accounted for individually in the predictive model.

This means that even if a customer makes multiple transactions within the time frame tied to the marker date, those transactions are not lumped together but are considered separately, enriching the dataset and refining the predictive model.

To demonstrate the concept, let's look at the historical transactions of customer A_11:

Applying the Pecan Number to the transactions table numbers the customer's historical transactions in order:
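As a rough illustration of this numbering, the sketch below uses Python's built-in sqlite3 module rather than Pecan itself. The transactions table, its columns (transaction_date, amount), and the three rows for customer A_11 are all made up for the example. The ordering (most recent transaction first) is also an assumption, chosen to match the ORDER BY ... DESC expression discussed later in this article.

```python
import sqlite3

# Hypothetical data: the table name, columns, and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (customer_id TEXT, transaction_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("A_11", "2023-01-05", 20.0),
        ("A_11", "2023-03-17", 35.5),
        ("A_11", "2023-02-09", 12.0),
    ],
)

# Number each customer's transactions, most recent first (assumed ordering).
numbered = conn.execute(
    """
    SELECT customer_id,
           transaction_date,
           amount,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id
               ORDER BY transaction_date DESC
           ) AS pecan_number
    FROM transactions
    ORDER BY customer_id, pecan_number
    """
).fetchall()

for row in numbered:
    print(row)
```

With this sample data, the most recent transaction (2023-03-17) gets number 1 and the oldest (2023-01-05) gets number 3, regardless of the order in which the rows were inserted.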

Significance of Pecan Number

The assignment of unique Pecan Numbers to transactions of a single entity provides the Pecan model with a more granular perspective. This facilitates a precise and nuanced understanding of each customer's behavior over time, rather than a vague, aggregated view. This level of detail is what allows for high-resolution predictive analytics.

By distinguishing between each transaction and feeding these distinct elements into the model, Pecan enables a sophisticated analysis of a customer's past behavior. It builds a comprehensive view of each customer's actions over time, allowing the model to draw on this rich history when making its predictions about future behavior.


Let's look at this Attribute query in the Retention-Demo model.

The query joins the Entity and demodata.campaigns tables where customer_id matches entity_id, and keeps only campaigns whose date_sent falls within the 360 days before the Marker.

The Pecan Number, produced by the ROW_NUMBER() OVER (PARTITION BY Entity.entity_id, Entity.Marker ORDER BY a.date_sent DESC) expression, gives each transaction a sequential number within its entity_id and Marker, ordered by descending date_sent. The most recent transaction therefore receives the number 1.

As a result, every transaction gets a distinct Pecan Number within its entity. This enables Pecan to treat multiple transactions per entity separately, drawing insights from each user's history and leading to sharper predictions.
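The query described above can be approximated with the following sketch. It is not the actual Retention-Demo query: it runs on SQLite through Python's sqlite3 module, so the date arithmetic (date(..., '-360 days')) is SQLite syntax and may differ from the SQL dialect used in Pecan. The campaign_name column, the sample rows, and the choice to exclude campaigns sent on the Marker date itself are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the table can be called demodata.campaigns.
conn.execute("ATTACH DATABASE ':memory:' AS demodata")
conn.execute("CREATE TABLE Entity (entity_id TEXT, marker TEXT)")
conn.execute(
    "CREATE TABLE demodata.campaigns "
    "(customer_id TEXT, date_sent TEXT, campaign_name TEXT)"
)

# Hypothetical entities: customer A_11 appears with two different marker dates.
conn.executemany(
    "INSERT INTO Entity VALUES (?, ?)",
    [("A_11", "2023-03-01"), ("A_11", "2023-06-01"), ("B_22", "2023-06-01")],
)
# Hypothetical campaigns, including rows outside the 360-day window.
conn.executemany(
    "INSERT INTO demodata.campaigns VALUES (?, ?, ?)",
    [
        ("A_11", "2023-05-20", "spring_sale"),
        ("A_11", "2023-02-10", "winter_promo"),
        ("A_11", "2021-12-01", "old_campaign"),   # more than 360 days before both markers
        ("A_11", "2023-06-10", "summer_teaser"),  # sent after both markers
        ("B_22", "2023-04-15", "spring_sale"),
    ],
)

rows = conn.execute(
    """
    SELECT Entity.entity_id,
           Entity.marker,
           a.date_sent,
           a.campaign_name,
           ROW_NUMBER() OVER (
               PARTITION BY Entity.entity_id, Entity.marker
               ORDER BY a.date_sent DESC
           ) AS pecan_number
    FROM Entity
    JOIN demodata.campaigns AS a
      ON a.customer_id = Entity.entity_id
    WHERE a.date_sent < Entity.marker                      -- strictly before the marker (assumed)
      AND a.date_sent >= date(Entity.marker, '-360 days')  -- within the preceding 360 days
    ORDER BY Entity.entity_id, Entity.marker, pecan_number
    """
).fetchall()

for row in rows:
    print(row)
```

In this sample, the numbering restarts for each entity_id and marker pair: customer A_11 with the 2023-06-01 marker has two campaigns numbered 1 and 2, while the same customer with the 2023-03-01 marker has a single campaign numbered 1.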

Before the query, the demodata.campaigns attribute table looked like this:

After running the query, the resulting table would include the Pecan Number:


The Pecan Number is a simple yet ingenious mechanism that enhances the accuracy and efficiency of the Pecan predictive model. By identifying individual transactions within the behavior of each entity, it allows for an enriched and detailed understanding of customer behavior, leading to more accurate and insightful predictions.
