Once you train a model that answers your predictive question, you can start connecting fresh data to Pecan to create up-to-date ongoing predictions.

What does it mean to predict with a machine-learning model?

When you predict with a model (also called deploying a model, or making a model go “live”), you can:

- Generate predictions for newly provided entities
- Download a CSV with the predictions
- Send (or export) the predictions to a destination of your choice, such as your data warehouse or your marketing attribution system (Pecan offers a variety of destinations)
- Schedule automatic generation of predictions on a recurring basis

Things to do before you can schedule predictions

- Make sure your model uses a "live" data connection (i.e., not a static CSV) as its data source.
- If you want to send your predictions to your data warehouse, you must have a “write” connection ready to receive them. Follow this guide to create one if you don’t have it set up yet. If you prefer to download a CSV, you can skip this step.

How to set up and get your predictions

1. Get to the Predict tab

Go to the predictive flow you want to predict with and click the “Predict” tab at the top of the screen.

2. The Predict view

The Predict view loads with the predictions agent on the left side and the prediction notebook and predictions tabs on the right side. The agent can help and guide you through the process of setting up the predictions.

3. Review the Queries

Pecan automatically creates this notebook and keeps it aligned with the core set and attributes defined when you trained your model.

Core Set Query: This query creates a table of the entities you want to get predictions for. It is mostly identical to the "sampled" cell in your predictive notebook (cell named sampled_users, sampled_customers, etc.). The difference is usually the timeframe (e.g., taking users from all timeframes instead of only the last two years).

Attribute Queries: Attribute queries provide feature data for the model to make predictions. They take the entities from the core set and join them to other tables to pull relevant historical information.

Note that the output columns and their types must match exactly what was used in training - this ensures the model sees the same features it learned from.
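To make the relationship between the two kinds of query concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for your data warehouse. The table and column names (customers, transactions, tx_date, amount) and the fixed prediction date are hypothetical, and the queries Pecan generates for your flow will look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT, signup_date TEXT);
    CREATE TABLE transactions (customer_id TEXT, tx_date TEXT, amount REAL);
    INSERT INTO customers VALUES ('c1', '2024-01-10'), ('c2', '2024-02-05');
    INSERT INTO transactions VALUES
        ('c1', '2024-01-15', 20.0),
        ('c1', '2024-02-01', 35.0),
        ('c2', '2024-02-10', 12.5),
        ('c3', '2024-02-11', 99.0);  -- c3 is not in the core set
""")

# Core set (hypothetical): one row per entity to predict for, with an ID and a date.
conn.execute("""
    CREATE TABLE core_set AS
    SELECT customer_id AS entity_id, '2024-03-01' AS marker
    FROM customers
""")

# Attribute query (hypothetical): start from the core set and join to another
# table, keeping only history from before the prediction date.
attribute_rows = conn.execute("""
    SELECT cs.entity_id, cs.marker, t.tx_date, t.amount
    FROM core_set AS cs
    JOIN transactions AS t
      ON t.customer_id = cs.entity_id
     AND t.tx_date < cs.marker
    ORDER BY cs.entity_id, t.tx_date
""").fetchall()

for row in attribute_rows:
    print(row)
```

Because the join starts from the core set, transactions for entities outside it (c3 here) never reach the model.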

4. Run Validations

Before you can predict, the notebook must pass dataset validations.

Validations check that:

- Every column and data type matches the training set.
- The core set contains no duplicate rows, so that no record is predicted twice.
- The ID and Date columns in the core set are free of null values.
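If you want to sanity-check your data before running the validations, the three checks above can be approximated locally. The sketch below is not Pecan's implementation. It assumes rows are plain dictionaries, that the ID and Date columns are named entity_id and marker, and that a "duplicate" means the same ID and date appearing more than once.

```python
from collections import Counter


def validate_core_set(rows, expected_schema, id_col="entity_id", date_col="marker"):
    """Return a list of human-readable failures (empty if every check passes)."""
    failures = []

    # Check 1: columns and value types match the (assumed) training schema.
    for i, row in enumerate(rows):
        if set(row) != set(expected_schema):
            failures.append(f"row {i}: columns {sorted(row)} differ from training")
            continue
        for col, expected_type in expected_schema.items():
            value = row[col]
            if value is not None and not isinstance(value, expected_type):
                failures.append(f"row {i}: {col} is not {expected_type.__name__}")

    # Check 2: no (ID, date) pair appears more than once.
    counts = Counter((row.get(id_col), row.get(date_col)) for row in rows)
    for key, n in counts.items():
        if n > 1:
            failures.append(f"duplicate core set row {key} appears {n} times")

    # Check 3: ID and date are never null.
    for i, row in enumerate(rows):
        for col in (id_col, date_col):
            if row.get(col) is None:
                failures.append(f"row {i}: {col} is null")

    return failures


schema = {"entity_id": str, "marker": str}
rows = [
    {"entity_id": "u1", "marker": "2024-03-01"},
    {"entity_id": "u1", "marker": "2024-03-01"},  # duplicate of the first row
    {"entity_id": "u2", "marker": None},          # null date
]
failures = validate_core_set(rows, schema)
for failure in failures:
    print(failure)
```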

If any cells are out of date (e.g., you’ve edited the core set), they must be re‑run before validation.

If any validations fail, click “How to fix it” next to a failure to see the exact issue; the agent can help you correct it.

5. Configure Prediction Run

Once validations pass, click the Predict button at the upper right of the notebook to set the prediction configuration.

- Schedule: one‑time (now) or recurring (daily, weekly, monthly, or a custom frequency defined by a cron expression).
- Exclude entities that already have predictions: optional checkbox to avoid duplicating predictions from previous runs.
- Writeback: optional; sends predictions to a destination table in your data warehouse. You’ll need to choose a write connector and table path, and you can customize the writeback query.

Note: To select a destination for your predictions, you need to create a “write” type connection. Each connection is either “read” or “write”, so two connections are required even if you read from and write to the same data source. Once you create a “write” connection, you can select it from the dropdown.
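For the custom schedule option, a few example cron expressions may help. The sketch below assumes the standard five-field cron layout (minute, hour, day of month, month, day of week). The exact dialect and time zone Pecan uses are not stated here, so verify an expression before relying on it. The describe_cron helper is purely illustrative.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Split a five-field cron expression into named fields."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


# Example expressions and their meaning under standard cron semantics.
examples = {
    "0 6 * * *": "every day at 06:00",
    "0 6 * * 1": "every Monday at 06:00",
    "30 2 1 * *": "on the 1st of every month at 02:30",
    "0 */6 * * *": "every 6 hours, on the hour",
}

for expression, meaning in examples.items():
    print(f"{expression!r}: {meaning} -> {describe_cron(expression)}")
```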

Writeback Query (optional): Using SQL, you can manipulate the predictions output table, most commonly to rename columns.
The default columns are:
- entity_id: The identifier you used to identify your entities (user ID, SKU, etc).
- Marker: The time for which the prediction was given
- 0 : The probability that the wanted result WILL NOT happen, or the predicted value in regression models
- 1 : The probability that the wanted result WILL happen (for classification models only)

Remember that ` (backtick) and ' (single quote) mean different things in this query. For example, `entity_id` refers to the entity_id column in the RESULTS table, whereas 'entity_id' produces the literal string “entity_id”.
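The difference between the two quote styles is easy to see by running a small query. The sketch below uses Python's built-in sqlite3 module, which happens to accept backtick-quoted identifiers, as a stand-in. Your warehouse's SQL dialect may differ, and the output column names (customer_id, prediction_date, churn_probability) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A one-row stand-in for the predictions output, using the default column names.
conn.execute('CREATE TABLE RESULTS ("entity_id" TEXT, "Marker" TEXT, "0" REAL, "1" REAL)')
conn.execute("INSERT INTO RESULTS VALUES ('u1', '2024-03-01', 0.2, 0.8)")

writeback_query = """
    SELECT
        `entity_id` AS customer_id,        -- backticks: the column's value
        'entity_id' AS literal_text,       -- single quotes: a fixed string
        `Marker`    AS prediction_date,
        `1`         AS churn_probability
    FROM RESULTS
"""

cursor = conn.execute(writeback_query)
columns = [description[0] for description in cursor.description]
row = cursor.fetchone()

print(columns)
print(row)
```

The backtick-quoted `entity_id` returns the stored ID (u1), while the single-quoted version returns the text entity_id for every row. That is rarely what you want in a writeback table.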

6. Execute Predictions

After configuration, click the Predict button at the bottom of the configuration dialogue.

The run will process the core set and attributes, generate predictions, and store them in the results table.

You can monitor run status, view results, and download them.

7. Reviewing your predictions

Your prediction batches are listed in the predictions tab on the right side of the screen.

Selecting a specific prediction batch name expands the detailed view, where you can evaluate core metrics like the average, lowest, and highest probabilities, and access a full table showing individual results for each entity.
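If you download a batch and want to reproduce these summary metrics yourself, a minimal sketch could look like the following. It assumes a classification model whose rows carry the default 0 and 1 probability columns; the sample values are made up for illustration.

```python
from statistics import mean

# Hypothetical batch: one row per entity, with the default probability columns.
batch = [
    {"entity_id": "u1", "0": 0.20, "1": 0.80},
    {"entity_id": "u2", "0": 0.65, "1": 0.35},
    {"entity_id": "u3", "0": 0.95, "1": 0.05},
]

# Column "1" holds the probability that the outcome WILL happen.
probabilities = [row["1"] for row in batch]
summary = {
    "average": mean(probabilities),
    "lowest": min(probabilities),
    "highest": max(probabilities),
}
print(summary)
```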

If you configured the predictions to be sent back to your data warehouse, you will find them there.

If you can't see them, check the logs in the history tab of your 'write' connection to see whether Pecan had trouble writing your predictions to your data warehouse, and why.

a. Review the queries & data

You can review the queries and data behind each prediction batch by clicking the ⠇ menu next to the batch and selecting the view queries & data option.

This view presents your data and queries, serving as a helpful resource for troubleshooting and refining your predictive flows.

b. Downloading prediction results

You can download any prediction batch as a CSV file from this screen by clicking the ⠇ menu next to the batch and selecting Download as CSV.

Please note that once a prediction run is scheduled, the associated notebook is locked and cannot be modified.

How does your model get data to produce predictions?

Each time your predictive model executes, it automatically imports fresh data from the tables the model uses. This up-to-date data serves as the basis for identifying new entities and their corresponding attributes, which are crucial for generating accurate and current predictions.

Pecan can either re-import the entire table or fetch only new rows, depending on your table settings (read more here).

This process ensures that your model's insights are based on the most recent and relevant information, allowing for timely and informed decision-making.
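The two import modes mentioned above can be contrasted with a small sketch. How Pecan actually detects new rows is not described here; the version below assumes a hypothetical updated_at column holding ISO-formatted dates and uses the latest imported value as a watermark.

```python
def full_reimport(source_rows):
    """Replace the local copy with everything currently in the source table."""
    return list(source_rows)


def incremental_import(local_rows, source_rows, updated_col="updated_at"):
    """Append only the source rows newer than the latest row already imported."""
    # ISO-formatted date strings sort chronologically, so plain comparison works.
    watermark = max((row[updated_col] for row in local_rows), default="")
    new_rows = [row for row in source_rows if row[updated_col] > watermark]
    return local_rows + new_rows


source = [
    {"id": 1, "updated_at": "2024-03-01"},
    {"id": 2, "updated_at": "2024-03-02"},
    {"id": 3, "updated_at": "2024-03-03"},  # arrived after the last import
]
already_imported = source[:2]

refreshed = incremental_import(already_imported, source)
print([row["id"] for row in refreshed])
print(len(full_reimport(source)))
```

A watermark like this is cheap but misses rows that were changed in place without a newer timestamp, which is one reason a full re-import can be the safer setting for tables that are edited rather than appended to.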

Using Your Model To Schedule Automated Prediction Cycles