BigQuery is a serverless, highly scalable, SQL-based cloud data warehouse designed for business agility.
You'll need to provide certain details and credentials so Pecan can access the relevant data in Google BigQuery. If you need help with the below steps, be sure to consult with your internal IT or DevOps team.
Prerequisite steps
First, make sure you whitelist these Pecan IP addresses:
54.246.108.184
54.216.8.168
54.73.234.235Before adding a BigQuery connection, you’ll need to create a Pecan role with Read and Write permissions so Pecan can access your data service. Here’s how:
Log in to your Google Cloud account.
Create a Pecan role with relevant permissions for BigQuery:
Select your project at the top-left corner of the screen.
Click IAM & Admin > Roles > CREATE ROLE.
Create a role, using Title: “pecan_role” and ID: “pecanRole”.
If you would like to transfer data from your BigQuery dataset to Pecan, add the following permissions to this role:
bigquery.datasets.get bigquery.readsessions.create bigquery.readsessions.getData bigquery.readsessions.update bigquery.tables.get bigquery.tables.getData bigquery.tables.list bigquery.jobs.create
If you would like to send predictions from Pecan to your BigQuery dataset, add the following permissions to this role:
bigquery.tables.get bigquery.jobs.create bigquery.tables.create bigquery.tables.update bigquery.tables.updateData
Note: you can define a role that contains all the permissions above (both for transferring data and sending predictions) to serve both functions.
Click ADD > CREATE.
Next, create a Google Cloud service account and grant it access to your Pecan project:
Click IAM & Admin > Service Accounts > CREATE SERVICE ACCOUNT.
Enter a service account name of your choosing, and click CREATE AND CONTINUE.
In the “Select a role” field, select the “pecan_role” role you created in Step A above.
Click DONE.
Then download the JSON file for this service account, and use the information within it to complete the relevant fields in Pecan (Step 2). This is how it will look:
{ "type": "service_account", "project_id": <PROJECT_IDS>, "private_key_id": "ABCDEFGHI", "private_key": "-----BEGIN PRIVATE KEY----- <PRIVATE_KEY> -----END PRIVATE KEY-----", "client_email": "google-adminsdk-pxixy@somethinggooglerelated.iam.gserviceaccount.com", "client_id": "123456789", "auth_uri": "https://accounts.google.com/o/oauth2/auth", "token_uri": "https://oauth2.googleapis.com/token", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/google-adminsdk-pxixy%40somethinggooglerelated.iam.gserviceaccount.com" }
Next, create an additional Pecan role with a different set of permissions for your service account.
Click IAM & Admin > Roles > CREATE ROLE.
Create a role with the title “pecan_role_for_empty_dataset” and ID: “pecanRoleForEmptyDataset”.
Add the following permissions to your Pecan role:
bigquery.tables.create bigquery.tables.delete bigquery.tables.get bigquery.tables.update bigquery.tables.updateData
Click ADD > CREATE.
Lastly, create an empty dataset.
Still, within Google Cloud, click Big Query > SQL Workspace.
Select your relevant project at the top of the screen, and click Create dataset (you may give it a name like “pecan_temporary_dataset”).
Once it’s created, click Open > Sharing > Permissions > ADD PRINCIPAL.
In the “New principals” field, enter the name of the service account you created in Step B above.
In the “Select a role” field, select the name of the role you created in Step D above (“pecan_role_for_empty_dataset”).
Click SAVE.
How to configure a Google BigQuery connection
Log in to Pecan, select the “Connections” tab, and click Add connection.
Select “Google BigQuery” and complete the following fields:
Connection name – This is how you’ll identify the connection when creating and working with models on the platform. Names should be unique and reflect the data source and what’s stored in it. Valid characters include letters, numbers, and underscores. Connection names can’t be changed once created. Example: “bigquery_paid_downloads_fall_2021”
Connection type – Select "Read" if the connection will be used to import data into Pecan. Select "Write" if it will be used to export predictions from Pecan to Google BigQuery.
Projects - the Google project name in which you created the role and dataset.
Credentials – This information is contained in the JSON file that was exported in Step 1C above. The file should contain your Project ID, Private Key ID, Private Key, Client Email, Client ID, Port, and Client x509 certificate URL.
Temporary dataset - This should be the name you created for the empty dataset in Step 2.e.ii above (“pecan_temporary_dataset” being the example).
Now, click Test connection to make sure everything is working correctly. Then click Create connection to complete the setup. (For more information, see Testing and creating a data connection.)