
Serverless Egress Control: How to Secure your Azure Databricks Serverless Outbound Network Traffic

In this post we will look at the new Databricks Serverless Egress Control feature, and how to use it to secure your serverless outbound network traffic.

Introduction

Ever since Databricks Serverless became Generally Available (GA) for Notebooks and Delta Live Tables (DLT) last year, it has been a game changer! Serverless has many benefits, including not having to worry about right-sizing and scaling your clusters, and almost instant start-up times. But as with all new and shiny things, it can be easy to momentarily forget about the “boring” stuff: security.

When serverless was first released, there was no way for this compute to securely communicate with your data if it was stored in a privately networked storage account with public network access disabled. This automatically ruled out using serverless for the majority of production workloads (or anyone in financial services!). Databricks then introduced NCCs: Network Connectivity Configurations. When first released, NCCs essentially facilitated managed private endpoints for your storage account. You create them at the Databricks Account level - I recommend one per environment - and add private endpoint rules to them for specific storage accounts. These private endpoint rules create a managed private endpoint connection against your storage account which you can approve. This facilitates secure network connectivity between the serverless compute and your storage account. They have very recently been extended to support some exciting new functionality, so stay tuned for another blog on that!

Network Connectivity Configurations (NCCs).
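If you prefer to script this rather than click through the Account Console, the account-level REST API exposes NCCs too. Here is a minimal Python sketch, assuming the preview API shapes at the time of writing; the account ID, token, and storage resource ID are placeholders:

```python
import requests

# Placeholders - substitute your own account ID, AAD token, and resource ID.
BASE = "https://accounts.azuredatabricks.net/api/2.0/accounts/<ACCOUNT_ID>"
HEADERS = {"Authorization": "Bearer <TOKEN>"}

# 1. Create an NCC for the region (one per environment, as recommended above).
ncc = requests.post(
    f"{BASE}/network-connectivity-configs",
    headers=HEADERS,
    json={"name": "ncc-dev", "region": "uksouth"},
).json()

# 2. Add a private endpoint rule for a specific storage account's dfs endpoint.
rule = requests.post(
    f"{BASE}/network-connectivity-configs/"
    f"{ncc['network_connectivity_config_id']}/private-endpoint-rules",
    headers=HEADERS,
    json={
        "resource_id": "/subscriptions/<SUB_ID>/resourceGroups/<RG>/providers/"
                       "Microsoft.Storage/storageAccounts/stgdbxdemo1",
        "group_id": "dfs",
    },
).json()

# The managed private endpoint connection then appears on the storage
# account, pending your approval.
print(rule)
```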

This is a great feature, but as one financial services client asked me recently:

“What about outbound network traffic?”

After having a little dig around in the Account Console, I stumbled upon a brand new Public Preview feature which helps with just that.

Databricks Serverless Egress Control

At the end of January 2025, Databricks introduced the concept of “Serverless Egress Control” (which simply means controlling serverless outbound network traffic) in Public Preview. Along with this came the Network Policy feature. Let’s have a look at how you can use this feature to secure your serverless outbound network traffic.

Databricks Network Policies

Navigating to the Databricks Account Console, under Cloud Resources you will now find Network Policies next to Network Connectivity Configurations.

Network Policies.

A default-policy, which allows full outbound network access, is applied to all workspaces in your account. It continues to be applied to any workspaces that do not have a custom network policy attached. Let’s create a new network policy to restrict this.

Create Network Policy.

We’ve given our new network policy a name, and when selecting Restricted access to specific destinations a whole host of options appear.

Egress Rules

The first section allows us to specify a set of Egress Rules. These are essentially firewall rules that apply to outbound traffic. There are two subsections here: Allowed Domains and Storage Account.

Allowed Domains

Allowed Domains is where we specify the destination domains that we want to allow outbound network traffic to reach. Let’s add a destination. Here I have added the FQDN (Fully Qualified Domain Name) pypi.org, an online Python package repository.

Add an allowed destination.

Interestingly, a dropdown is provided; however, it is greyed out on the DNS_NAME option. This might suggest that support for IP addresses and CIDR ranges is coming soon.

I also added the domain files.pythonhosted.org, which is an additional domain required for downloading Python packages from PyPI.

Note: At time of starting this blog, the dropdown value was called FQDN. Shortly before publishing, it had been changed to DNS_NAME.

Storage Account

My original assumption was that this is where we specify the names of storage accounts we need to be able to access from serverless compute. There is a note to let you know that any Unity Catalog destinations are automatically allowed and therefore do not need to be added here. So if you already have your lake set up as an External Location, for example, it will automatically be accessible using serverless compute.

However, when testing this out, I experienced some unexpected behaviour. I already have stgdbxdemo1 added as an External Location, so I created an additional Storage Account in my resource group called stgdbxdemo2, which I am adding here. You can choose from blob or dfs as the Storage Service; I chose dfs.

Add an allowed storage account.

I expected this to give me network access to the dfs endpoint of the stgdbxdemo2 storage account, but it didn’t. I will discuss this feature in the Testing section below.

Policy Enforcement Mode

Now that we have added some rules, we need to look at how those rules are applied.

Policy Enforcement Mode.

What’s interesting here is the option for a dry run mode. This mode does not block traffic to disallowed destinations; instead, it captures the policy violations in logs, which can be found in the network_outbound system table. Dry run mode is available for Databricks SQL and AI model serving workloads only. For all other products, the network policy is enforced.
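If you do use dry run mode, you could browse the violations with something like the following (a minimal sketch; I’m assuming the logs surface under the system catalog, and the exact table path and schema may differ):

```python
# Browse recent outbound network policy violations captured in dry run mode.
# The table path is an assumption based on the docs at the time of writing.
df = spark.sql("SELECT * FROM system.access.outbound_network LIMIT 20")
display(df)
```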

I’m going to select Enforced for all products, because I want to test it out!

Attach to Workspace

Now I have created my policy.

New Restricted Network Policy.

However, it is currently not being applied to any workspaces. Let’s attach it to my demo workspace.

In the Account Console, find your workspace under Workspaces, then under Network Policy select update network policy.

Attach Network Policy to Workspace.

Select the relevant policy you want to apply; I have selected my demo-network-policy. Note the Outbound network access property changes from FULL_ACCESS to RESTRICTED_ACCESS.

Network Policy Updated.

Testing

Now that we have applied our network policy, let’s test it out.

Testing Successful PyPi Access

Firstly, I’m going to use the requests library to try to access the pypi.org domain by making a GET request to a package’s JSON endpoint. We added all the necessary domains for downloading packages from PyPI, so we’re expecting this to be allowed.
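The call looks something like this (the package name is arbitrary; any project on PyPI would do):

```python
import requests

# Hit PyPI's JSON API for a package - we expect HTTP 200 now that
# pypi.org is on the allow list.
resp = requests.get("https://pypi.org/pypi/requests/json", timeout=10)
print(resp.status_code)
print(resp.json()["info"]["name"])
```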

Successful connection to pypi.org.

It works! Let’s also test using pip. I’m specifying the index URL for clarity, but pip uses PyPI by default anyway.
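In a notebook cell, that looks roughly like this (the package choice is arbitrary; %pip is the Databricks notebook magic for pip):

```
%pip install beautifulsoup4 --index-url https://pypi.org/simple
```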

Successful package download with pip.

This works too!

Testing Blocked Access

Now let’s try to access a domain which we didn’t specify in our network policy. I’m going to try catfacts.ninja - a very important and useful domain!
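A sketch of the failing call (the /fact path is an assumption; the exact exception depends on how the policy drops the traffic, so I’m catching broadly):

```python
import requests

# catfacts.ninja is not on the allow list, so we expect this to fail.
try:
    resp = requests.get("https://catfacts.ninja/fact", timeout=10)
    print(resp.json())
except requests.exceptions.RequestException as e:
    print(f"Blocked: {e}")
```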

Blocked Access to Cat Facts.

Let’s add it to our Network Policy.

Add Cat Facts to Allowed Destinations.

It’s important to note that you will need to wait for the DNS to update before any newly added domains become accessible. If we do an nslookup on this domain straight after adding it to our policy, it will still be unresolvable.
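You can run the same check from a notebook if you prefer (a lightweight Python stand-in for nslookup):

```python
import socket

# DNS for a newly allowed domain takes a few minutes to propagate.
try:
    print(socket.gethostbyname("catfacts.ninja"))
except socket.gaierror as e:
    print(f"Not resolvable yet: {e}")
```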

nslookup against catfacts.ninja.

After 5 minutes or so, the domain becomes resolvable.

Successful nslookup against catfacts.ninja.

Now let’s try to get our cat fact again.

Successful Cat Fact Ninja Request.

Testing Storage access

I attempted to send a GET request to a specific CSV blob that I had written to my stgdbxdemo2 storage account, using its dfs endpoint. I was expecting this to work, since I had added stgdbxdemo2 to my network policy as an allowed storage account for the dfs service. However, the connection did not go through.
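The test looked something like this (the container and file names are illustrative; an unauthenticated GET would normally just return an HTTP error code, but the point here is whether a connection is made at all):

```python
import requests

# stgdbxdemo2's dfs endpoint - the container and path are placeholders.
url = "https://stgdbxdemo2.dfs.core.windows.net/demo/data.csv"
try:
    resp = requests.get(url, timeout=10)
    print(resp.status_code)  # any HTTP response means the network path works
except requests.exceptions.RequestException as e:
    print(f"Connection blocked: {e}")
```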

Blocked Access to Storage Account dfs endpoint.

I ran an nslookup, and the domain was not resolvable. I then added the dfs endpoint FQDN directly as an egress rule to my network policy, to test that the endpoint was definitely reachable.

Adding dfs FQDN directly.

This resulted in a successful nslookup and a successful get request.

Successful domain name resolution.
Successful connection to dfs endpoint.

At this point, I still had the storage account dfs service added as an allowed storage account destination as well. I removed this, and the connection still worked, suggesting that the allowed storage account destination has no impact.

I also experimented with changing the service type of my Allowed Storage rule between dfs and blob, to see if I could enable some protocol limiting. Again, however, this had no impact on the connections I was able to make.

Limitations to Consider

Obviously this feature is still in Public Preview, so some of the functionality feels somewhat limited, but hopefully that will change when it becomes GA. Some things to consider…

Allowed Storage

Don’t rely on Allowed Storage rules for network connectivity to storage accounts. It seems that the only way to access storage accounts is to add the FQDN of the storage account endpoint directly as an egress rule. I would expect this to be fixed before the feature goes GA.

Bulk Import

At the time of writing, it looks like there is no way to bulk import a set of rules, either via the UI or via the Databricks API. This could be a bit of a pain if you have a lot of rules to add. Hopefully this is a feature that is added going forwards.

IP Address and CIDR Range Support

As previously mentioned, only FQDN destinations are currently supported; IP addresses and CIDR ranges are not. This will likely be sufficient for connecting to most public destinations, but it will restrict connectivity to gateway IPs, for example.

Ownership and Management of Process

As far as I can tell, there is no integration between Databricks Network Policies and Azure Network Security Groups (NSGs) or Azure Firewall. This means that you would need to manage your network policies separately from your Azure network security. This raises the question of who is responsible for managing these policies: the Databricks Admin or the Azure Network Admin? Having a defined business process here is going to be crucial for maintaining security standards. It also means you will need to significantly restrict who has access to the Databricks Account Console, since this would give them the ability to open up outbound network access.

Conclusion

In summary, Databricks Network Policies are a great new feature that allows you to restrict outbound network access for your serverless workloads. This is a great step forward in enabling the use of serverless compute for production workloads, but there are still some limitations to consider. I would expect this feature to be extended and improved upon before it becomes Generally Available, so keep an eye out for updates!
