Snowflake
Snowflake is a cloud-based data warehouse provided as Software-as-a-Service (SaaS). It offers all the features of a modern data warehouse, including scalability, ease of use, secure access to your data, accelerated analytics capabilities, and much more.
RudderStack allows you to configure Snowflake as a destination to dump your event data seamlessly.
Please check our Warehouse Schemas guide to know how events are mapped to the tables in Snowflake.
Find the open-source transformer code for this destination in our GitHub repo.
Snowflake User Setup
To enable RudderStack access, please make sure you have an ACCOUNTADMIN, or an account that has MANAGE GRANTS.
The following sections illustrate how to create a virtual warehouse, a database, a role, and an user in Snowflake:
Creating a Virtual Warehouse
Create a X-Small warehouseby following the wizard on Snowflake's website as shown in the screenshot below:
You can set your data warehouse size as per your future data volume.
Alternatively, you can also use SQL to create a warehouse, as shown:
CREATE WAREHOUSE "RUDDER_WAREHOUSE" WITH WAREHOUSE_SIZE = 'XSMALL' WAREHOUSE_TYPE = 'STANDARD' AUTO_SUSPEND = 600 AUTO_RESUME = TRUE;It is recommended that you set AUTO_SUSPEND to ~10 mins, and enable AUTO_RESUME to avoid any extra costs.
Creating a Database
Please create a new database to avoid conflicts with your existing data, as RudderStack creates its own tables.
The following screenshot demonstrates the Create Database option in Snowflake.
Alternatively, you can also use SQL to create a database, as shown:
CREATE DATABASE "RUDDER_EVENTS";Creating a Role for RudderStack
Please execute the following SQL commands to create a new role with the required permissions to load your data into the warehouse:
- Create a new role
CREATE ROLE "RUDDER";- Grant access to the virtual warehouse
GRANT USAGE ON WAREHOUSE "RUDDER_WAREHOUSE" TO ROLE "RUDDER";- Grant access to the database
GRANT USAGE ON DATABASE "RUDDER_EVENTS" TO ROLE "RUDDER";GRANT CREATE SCHEMA ON DATABASE "RUDDER_EVENTS" TO ROLE "RUDDER";GRANT ALL ON ALL SCHEMAS IN DATABASE "RUDDER_EVENTS" TO ROLE "RUDDER";Creating a User
Finally, please create a user to connect RudderStack to the previously created Snowflake warehouse, as shown:
Alternatively, you can use SQL to create a user in Snowflake, as shown:
CREATE USER "RUDDER_USER" MUST_CHANGE_PASSWORD = FALSE DEFAULT_ROLE = "RUDDER" PASSWORD = "strong_unique_password";GRANT ROLE "RUDDER" TO USER "RUDDER_USER";Configuring Snowflake in RudderStack
In order to enable dumping data to Snowflake, you will first need to add it as a destination to the source from which you are sending event data. Once the destination is enabled, events from RudderStack will start to flow to Snowflake.
To do so, please follow these steps:
- Choose a source to which you would like to add Snowflake as a destination. You can also simply create a destination and connect it to a source later.
Please follow our Adding a Source and Destination guide to know how to add a source in RudderStack.
- After choosing a source, select Snowflake from the list of destinations.
- Give your destination a name and then click on Next. You should then see the following screen:
Connection Settings for configuring Snowflake as a Destination
Add the required credentials in the Connection Settings as described below:
- Account - This is the account ID of your warehouse. Account ID is part of the Snowflake URL. The following examples illustrate the slight differences in the account ID for various cloud providers.
Account ID sample Snowflake URL Snowflake cloud provider qya56091.us-east-1 https://qya56091.us-east-1.snowflakecomputing.comAmazon Web Services (AWS) rx18795.east-us-2.azure https://rx18795.east-us-2.azure.snowflakecomputing.comMicrosoft Azure (Azure) ah76025.us-central1.gcp https://ah76025.us-central1.gcp.snowflakecomputing.comGoogle Cloud Platform (GCP) ** - Database - The name of the database, as created in the Creating a Database section
- Warehouse - The name of the warehouse, as created in the Creating a Virtual Warehouse section
- User - This is the username, as created in Creating a User section.
- Password - This is the password, as created in Creating a User section.
The following settings are applicable if you are using an Amazon S3 bucket for object storage:
- S3 bucket name - This is your unique S3 bucket name.
- AWS Access Key ID - This can be obtained from the AWS Console.
- AWS Secret Access Key - This can be obtained from AWS Console. Please refer to the Setting Up Amazon S3 section for more details.
Please grant the following permissions to the IAM user:
"Effect": "Allow","Action": [ "s3:GetObject", "s3:PutObject", "s3:PutObjectAcl", "s3:ListBucket"]Configuring Snowflake Integration
If you have Amazon Web Services (AWS) as your cloud provider and want to leverage S3 as your object storage, you will need to follow a few more steps to configure your Snowflake destination with a snowflake integration.Detailed instructions can be found here
Configuring snowflake integration with AWS
- Create a policy in AWS:
- Create a role and attach the above policy in AWS: - Create role of type Another AWS account - Enter your AWS account ID and enable Require External ID - For external ID enter a dummy ID such as 0000. We will modify this later on. - And attach the policy created in Step 1. - Give your role a name and keep the role ARN handy for the next step.
- Create cloud storage integration in Snowflake
- Retrieve the AWS IAM user for your Snowflake account as shown:
- Grant IAM user permission to access bucket objects in AWS Choose the role created in Step 2 and edit the trust relationship with the following JSON.
- Grant integration access to role in Snowflake:
Replace <bucket> and <prefix> with your values in the JSON below. And create the policy with a name of your choice.
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:PutObject", "s3:GetObject", "s3:GetObjectVersion", "s3:DeleteObject", "s3:DeleteObjectVersion" ], "Resource": "arn:aws:s3:::<bucket>/<prefix>/*" }, { "Effect": "Allow", "Action": "s3:ListBucket", "Resource": "arn:aws:s3:::<bucket>", "Condition": { "StringLike": { "s3:prefix": ["<prefix>/*"] } } } ]}CREATE STORAGE INTEGRATION <integration_name> TYPE = EXTERNAL_STAGE STORAGE_PROVIDER = S3 ENABLED = TRUE STORAGE_AWS_ROLE_ARN = '<iam_role>' STORAGE_ALLOWED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/') [ STORAGE_BLOCKED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/') ]<integration_name> with an integration name of your choice.<iam_role> with the role ARN from the previous step.Record the values: <integration_name>.
DESC INTEGRATION <integration_name>;<integration_name> - The integration name used in the previous step.Record the values - STORAGE_AWS_ROLE_ARN and STORAGE_AWS_EXTERNAL_ID .
{ "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "AWS": "<snowflake_user_arn>" }, "Action": "sts:AssumeRole", "Condition": { "StringEquals": { "sts:ExternalId": "<snowflake_external_id>" } } } ]}<snowflake_user_arn> is the STORAGE_AWS_ROLE_ARN in the previous step.<snowflake_external_id> is the STORAGE_AWS_EXTERNAL_ID in the previous step.grant usage on integration <integration_name> to role <sf_role><integration_name> is the integration created in the Step 3.<sf_role> is the role in Snowflake you want to grant access to.Detailed instructions can be found here.
Configuring snowflake integration with Azure
- Create a storage account in Azure.
Create a container in the storage account created above by navigating to <storage_account> - Storage Explorer - Blob Containers - Create a Blob Container.
- Create cloud storage integration in Snowflake:CREATE STORAGE INTEGRATION <integration_name>TYPE = EXTERNAL_STAGESTORAGE_PROVIDER = AZUREENABLED = TRUEAZURE_TENANT_ID = '<tenant_id>'STORAGE_ALLOWED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/')[ STORAGE_BLOCKED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/') ]
<tenant_id>- Get your tenant id by navigating to Azure Active Directory - Properties. The tenant ID is displayed in the Directory ID field.<account> - <storage_account> - Access keys - <storageAccountName> - Grant Snowflake access to the storage locations
- Navigate to the URL in the
AZURE_CONSENT_URLobtained in the previous step and accept. Grant snowflake access to the container created in Step 2: - Navigate to
Azure Services - Storage Accounts and select the storage account created in Step 1. - Add role : Navigate toAccess Control (IAM) - Add Role Assignmentand select either Storage Blob Data Reader(Read Access) orStorage Blob Data Contributor (Read and Write Access) - AddAssign Access : Add Service Principal as the type of security principal to assign the role to. And search forAZURE_MULTI_TENANT_APP_NAMEobtained in step 4.- Grant integration access to role in Snowflake:
Replace <integration_name> with the integration name created in the previous step.
DESC INTEGRATION <integration_name>;AZURE_CONSENT_URL and AZURE_MULTI_TENANT_APP_NAME .grant usage on integration <integration_name> to role <sf_role>;<integration_name> is the integration created in the Step 4.<sf_role> is the role in Snowflake you want to grant access to.Detailed instructions can be found here.
Configuring Snowflake Integration with Google Cloud Platform
- Create a Cloud Storage integration in Snowflake.
<integration_name>is the name of the new integration being created<bucket>is the name of the Cloud Storage bucket that you createdabove<path>is an optional path that can be used to provide granular control over objects in the bucket- Retrieve the Cloud Storage Service Account for your Snowflake account. The following
DESCRIBEcommand will retrieve the ID for the Cloud Storage service account that was created for your Snowflake account. - Where
<integration_name>is the name of the integration you specified above in step 1. - The output will be a table that has a property called
STORAGE_GCP_SERVICE_ACCOUNT. Retrieve that property value. - The value that should be retrieved will have the following format:
service-account-id@<unique_string>.iam.gserviceaccount.com - Grant the Service Account Permissions to Access Bucket Objects
- Create a custom IAM role that has the permissions required to access the bucket and get objects.
- Assigning the Custom Role to the Cloud Storage Service Account
- Log into the Google Cloud Platform Console as a project editor.
- From the home dashboard, choose Cloud Storage » Browser:
- Select the checkbox of the bucket you would like to configure for access.
- Click SHOW INFO PANEL in the upper-right corner. The information panel for the bucket will slide out from the right.
- In the Add members field, search for the service account name from the DESCRIBE INTEGRATION output in Step 2: Retrieve the Cloud Storage Service Account for your Snowflake Account (in this topic).
- From the Select a role dropdown, select Storage » Custom »
<role>, where<role>is the custom Cloud Storage role you created in Creating a Custom IAM Role (in this topic). - Click the Add button. The service account name is added to the Storage Object Viewer role dropdown in the information panel.
- Grant usage to an external (i.e. Cloud Storage) stage that references the integration you created.GRANT USAGE ON INTEGRATION <integration_name> TO ROLE "RUDDER";
CREATE STORAGE INTEGRATION <integration_name> TYPE = EXTERNAL_STAGE STORAGE_PROVIDER = GCS ENABLED = TRUE STORAGE_ALLOWED_LOCATIONS = ('gcs://<bucket>/<path>/', 'gcs://<bucket>/<path>/')DESC STORAGE INTEGRATION <integration_name>;1. Log into the Google Cloud Platform Console as a project editor 2. From the home dashboard, choose IAM & admin » Roles. 3. Click Create Role. 4. Enter a name, and description for the custom role. 5. Click Add Permissions. 6. Filter the list of permissions, and add the following from the list below: | Permission Property Name | | :----------------------- | | <code class="inline-code">storage.buckets.get</code> | | <code class="inline-code">storage.objects.get</code> | | <code class="inline-code">storage.object.list</code> | | <code class="inline-code">storage.objects.create</code> | 7. Click Create"RUDDER"is the name of the role that you created above here<integration_name>is the name of the integration that you set above in step 1
Setting Up the Network Access
You will need to whitelist the RudderStack IPs to enable network access to it.
The IPs to be whitelisted are : 3.216.35.97, 34.198.90.241 , 54.147.40.62 , 23.20.96.9, and 18.214.35.254.
Contact Us
If you come across any issues while configuring Snowflake with RudderStack, please feel free to contact us or start a conversation on our Slack channel. We will be happy to help you.