As technology moves further into a service-oriented (read: Cloud) world, people and enterprises alike need to take securing their assets seriously. Security has always been one of the first topics covered before purchasing a cloud service, which is why good cloud vendors typically spend more time securing their products than any customer would want to or be able to. That said, there are often opportunities to configure the service so that it is even more secure than the out-of-the-box offering.
In my current engagement, I have been partnering with a Red Pill Analytics client to move from a legacy business intelligence system to a hybrid cloud/on-premises technology stack that, in part, includes Attunity Replicate and Snowflake Data Warehouse on AWS. As part of the project, we have been tasked by Information Security with hardening the architecture. “Hardening” is a bit of an ambiguous term that loosely translates to: configure all components to be as secure as possible. While there are many controls that the team has worked through, this blog post will focus on one in particular: Deny network access to Snowflake and AWS S3 by default, allow access by exception.
Securing Access to Snowflake
The first step in satisfying the requirement listed above is to restrict Snowflake network access. Snowflake makes it incredibly easy to control which IP addresses can access the instance. Simply navigate to Account > Policies and create a new network policy. A policy can include IP ranges in CIDR notation, which makes it easy to whitelist traffic coming from various subnets on the company network. Do this before loading any data if at all possible. A new network policy takes only a few minutes to set up and can always be adjusted later to add or remove IP addresses as needed.
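The same policy can also be created with a SQL statement instead of the UI. The following is a minimal sketch, not the approach used on the project: it validates a list of CIDR ranges and builds a `CREATE NETWORK POLICY` statement. The helper name `network_policy_sql`, the policy name `corp_only`, and the address ranges (reserved documentation ranges) are all hypothetical, and the sketch assumes IPv4 ranges only.

```python
import ipaddress


def network_policy_sql(policy_name, allowed_cidrs):
    """Build a CREATE NETWORK POLICY statement from a list of IPv4 CIDR ranges."""
    if not policy_name.isidentifier():
        raise ValueError(f"unsafe policy name: {policy_name!r}")
    if not allowed_cidrs:
        raise ValueError("at least one CIDR range is required")

    validated = []
    for cidr in allowed_cidrs:
        # strict=True rejects ranges with host bits set, e.g. 192.0.2.5/24,
        # which usually indicates a typo in the range.
        network = ipaddress.IPv4Network(cidr, strict=True)
        validated.append(str(network))

    ip_list = ", ".join(f"'{n}'" for n in validated)
    return f"CREATE NETWORK POLICY {policy_name} ALLOWED_IP_LIST = ({ip_list});"


# Hypothetical policy name and placeholder ranges, for illustration only.
print(network_policy_sql("corp_only", ["192.0.2.0/24", "198.51.100.0/25"]))
```

Validating the ranges before running the statement helps catch a mistyped subnet, which could otherwise lock legitimate users out or admit more addresses than intended.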
Setting the (External) Stage
In our case, Attunity is running on-premises within the client network while Snowflake is a cloud-only data warehouse. A current (October 2018) prerequisite for connecting Attunity to Snowflake is that the customer provide an S3 bucket to stage data files; in Snowflake, this is known as an external stage. When Attunity tasks are run, files are continuously shipped to S3 and subsequently copied into Snowflake. As the InfoSec requirement above dictates, access to the S3 bucket must be restricted at the network level so that only Attunity and Snowflake can reach it.
Restricting S3 Access
We know the on-premises IP addresses that Attunity traffic will originate from, so that part is easy; Snowflake, however, is not as obvious. Snowflake is constantly spinning compute (EC2) instances up and down, which means that, other than knowing the IP addresses are somewhere within AWS ranges, it is a bit of a moving target. Fortunately, Snowflake traffic can also be identified by the AWS Virtual Private Cloud (VPC) from which it originated. VPC endpoint IDs are not public information; however, the owner of the VPC can share the identifier with whomever they wish. In this case, Snowflake support can provide a customer with the appropriate ID. Taking the resulting information over to S3, bucket policies can include a combination of VPC endpoint IDs and IP addresses as described here. Note: We had little luck using StringNotLike for IP addresses as mentioned in the article, but substituting NotIpAddress worked just fine.
Putting together the IP addresses and Snowflake’s VPC endpoint ID, the bucket policy ends up looking like this:
<code>{
  "Version": "2012-10-17",
  "Id": "",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::",
      "Condition": {
        "StringNotLike": {
          "aws:SourceVpce": "vpce-"
        },
        "NotIpAddress": {
          "aws:SourceIp": [
            "0.0.0.0/0",
            "1.1.1.1/1"
          ]
        }
      }
    }
  ]
}</code>
Notice that the effect is to deny all traffic except the listed IP addresses and VPC Endpoint ID. Simply copy and paste the JSON into the Bucket Policy on the Permissions tab.
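To make the condition logic concrete: a Deny statement applies only when every condition operator in it matches, so a request is denied only if it comes from neither the allowed VPC endpoint nor an allowed IP range. The sketch below is a simplified local model of that evaluation and not AWS's actual policy engine. The function name `is_denied`, the endpoint ID `vpce-0example`, and the address ranges are hypothetical placeholders, and `fnmatch` only approximates the wildcard matching of `StringNotLike`.

```python
import fnmatch
import ipaddress


def is_denied(source_ip, source_vpce, allowed_vpce_pattern, allowed_cidrs):
    """Return True when the Deny statement would apply to a request.

    source_vpce is None for requests that do not arrive through a VPC endpoint.
    """
    # StringNotLike: true when the endpoint ID is absent or does not match.
    vpce_not_like = source_vpce is None or not fnmatch.fnmatchcase(
        source_vpce, allowed_vpce_pattern
    )

    # NotIpAddress: true when the source IP is outside every allowed range.
    address = ipaddress.ip_address(source_ip)
    ip_not_in_ranges = not any(
        address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs
    )

    # Both operators must match for the Deny to take effect.
    return vpce_not_like and ip_not_in_ranges


# Hypothetical endpoint ID and placeholder (documentation) address ranges.
ALLOWED_VPCE = "vpce-0example"
ALLOWED_CIDRS = ["192.0.2.0/24"]

print(is_denied("203.0.113.9", None, ALLOWED_VPCE, ALLOWED_CIDRS))     # unknown IP, no endpoint
print(is_denied("192.0.2.17", None, ALLOWED_VPCE, ALLOWED_CIDRS))      # on-premises range
print(is_denied("10.1.2.3", ALLOWED_VPCE, ALLOWED_VPCE, ALLOWED_CIDRS))  # allowed endpoint
```

Walking a few sample requests through a model like this before saving the policy is a cheap way to confirm that neither Attunity nor Snowflake will be locked out.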
Once saved, a quick test using Postman to send a GET request to the bucket from an unauthorized IP address now returns 403 Forbidden, indicating the bucket policy is working as expected.
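The same check can be scripted rather than run by hand in Postman. This is a minimal sketch using only the Python standard library; the function name `get_status` and the bucket URL in the usage comment are hypothetical, and the request has to be sent from a machine outside the allowed networks for a 403 to be the expected result.

```python
import urllib.error
import urllib.request


def get_status(url, opener=urllib.request.urlopen):
    """Send a GET request and return the HTTP status code, including error codes."""
    try:
        with opener(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the status code
        # is available on the exception.
        return err.code


# Example usage (hypothetical bucket name), run from an unauthorized network:
# status = get_status("https://example-bucket.s3.amazonaws.com/")
# print("blocked" if status == 403 else f"unexpected status {status}")
```

The `opener` parameter exists only so the function can be exercised without network access, for example by passing a stub that raises an `HTTPError`.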
Conclusion
It is important to control not only who is accessing your applications but also where they are accessing from. Following the instructions above, Snowflake and S3 can be configured to only allow traffic from trusted networks.
For additional information specific to Snowflake security, check out the documentation for items such as federated authentication and SSO, multi-factor authentication, AWS PrivateLink, AWS Direct Connect, and more. A categorized summary of Snowflake security features can be found here.
Need help?
Red Pill Analytics is a Snowflake Solutions Partner experienced not only in working with Snowflake and Attunity technically but also advising organizations on overall data strategy. From proof-of-concept to implementation to training your users, we can help. If you are interested in guidance while working with Snowflake, Attunity, or any of your data projects, feel free to reach out to us any time on our website or find us on Twitter, Facebook, and LinkedIn.