
How can I integrate data sources that are secured via private endpoints into Fabric? How do I deal with Azure Data Lakes behind a firewall? This blog post shows the options that Fabric offers natively.

Securing incoming data traffic

Network security needs to cover both inbound and outbound data traffic.

Inbound data traffic describes access to Fabric itself (e.g. opening app.fabric.microsoft.com). Fabric can be integrated into a network via private links. This makes Fabric available exclusively via the internal network; access from the public Internet is blocked.

Two tenant settings in the admin portal are used for this purpose: Azure Private Link and Block Public Internet Access.

Private links enable services to be delivered across private virtual networks (VNets) without the need to connect them via peering.

"Block Public Internet Access" blocks all access via the public Internet. It is important to ensure that a private link is configured correctly before this setting is enabled. Otherwise there is a risk of locking oneself out of Fabric.

Restrictions due to the two tenant settings

The following features for different areas can be restricted by activating the tenant settings:

Subject area: Governance

Private Link activation:

  • OneLake regional endpoint is not supported (used to comply with data-residency requirements).
  • Microsoft Purview Information Protection is not supported. As a result, the Sensitivity button is greyed out in Power BI, label information is unavailable, and decryption of .pbix files fails.

Subject area: Migration

Private Link activation:

  • Workspaces cannot be migrated to capacities in another region.
  • Tenant migrations are not supported.
  • Fabric trials are not supported.

Subject area: Data Engineering

Private Link activation:

  • To use Spark jobs, the tenant's home region must be one where Fabric Data Engineering is supported (regardless of the capacity region).
  • Loading data into the lakehouse via pipeline is possible. If a data warehouse is involved, no support is currently available.

Activation of Block Public Internet Access setting:

  • Visual queries in the Warehouse are not supported.

Subject area: Data Engineering IoT Focus

Private Link activation:

  • Shortcuts for Eventhouses are not possible.
  • Eventhouses cannot be invoked from data pipelines.
  • Reading data via queuing, as well as data connectors that rely on queuing, is not supported.
  • Querying an Eventhouse via T-SQL is not supported.

Activation of Block Public Internet Access setting:

  • The Eventstream feature cannot be used.

Subject area: Data Analysis

Private Link activation:

  • The "Publish to Web" function is not available.
  • Reports cannot be exported to PDF or PowerPoint.
  • The on-premises data gateway does not work; a workaround is necessary.
  • If both settings are enabled, updates to the database for the modern usage metrics report fail.

Activation of Block Public Internet Access setting:

  • A Power BI semantic model, Datamart or Dataflow Gen1 cannot connect to a Power BI semantic model or Dataflow as a data source.
  • E-mail subscriptions for dashboards and areas are no longer possible (these regularly send brief component overviews to the addressees).

Subject area: User Experience

Private Link activation:

  • Deployment of designs and external images to customize the Fabric portal is not supported.

Subject area: Service Limits

Private Link activation:

  • Up to 450 capacities are supported.

Table 1: Effects of the two settings on different subject areas 

Outbound traffic for destinations in Azure with network security

For outbound data traffic, access from Fabric to external data sources is of central importance. Different scenarios are possible here:

Scenario 1: Azure Storage / Data Lake Gen 2 behind firewall

Figure 3: Trusted workspace access to Azure Data Lake Gen 2 (own illustration)

It is possible to whitelist all Fabric workspaces of the tenant or individual ones. Though all the tenant's workspaces can be specified, this is not recommended because the feature might be discontinued in future.

Individual workspaces can also be specified via the ARM template of the storage account / data lake:

"resourceAccessRules": [
    {
        "tenantId": "df96360b-9e69-4951-92da-f418a97a85eb",
        "resourceId": "/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/Fabric/providers/Microsoft.Fabric/workspaces/b2788a72-eef5-4258-a609-9b1c3e454624"
    }
]

In the resourceId, only "b2788a72-eef5-4258-a609-9b1c3e454624" needs to be replaced with your own workspace ID. The ID can be found in the Fabric portal: when the workspace is opened, it appears as part of the URL.

https://app.fabric.microsoft.com/groups/b2788a72-eef5-4258-a609-9b1c3e454624/list?experience=data-factory
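The two steps above (reading the workspace ID from the URL and placing it into the resource access rule) can be sketched in a few lines of Python. This is only an illustration: the function names workspace_id_from_url and resource_access_rule are hypothetical, and the all-zero subscription ID and the resource group name "Fabric" are simply copied from the snippet above.

```python
import json
import re
import uuid
from urllib.parse import urlparse

# Values copied from the ARM snippet in this article (assumed to stay fixed).
PLACEHOLDER_SUBSCRIPTION = "00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP = "Fabric"


def workspace_id_from_url(url: str) -> str:
    """Return the workspace GUID that follows /groups/ in a Fabric portal URL."""
    match = re.search(r"/groups/([^/]+)", urlparse(url).path)
    if match is None:
        raise ValueError(f"no /groups/<id> segment in {url!r}")
    # uuid.UUID raises ValueError if the segment is not a well-formed GUID.
    return str(uuid.UUID(match.group(1)))


def resource_access_rule(tenant_id: str, workspace_id: str) -> dict:
    """Build one entry for the storage account's resourceAccessRules list."""
    return {
        # strip() guards against stray whitespace around a pasted tenant ID.
        "tenantId": tenant_id.strip(),
        "resourceId": (
            f"/subscriptions/{PLACEHOLDER_SUBSCRIPTION}"
            f"/resourcegroups/{RESOURCE_GROUP}"
            f"/providers/Microsoft.Fabric/workspaces/{workspace_id}"
        ),
    }


url = (
    "https://app.fabric.microsoft.com/groups/"
    "b2788a72-eef5-4258-a609-9b1c3e454624/list?experience=data-factory"
)
rule = resource_access_rule(
    " df96360b-9e69-4951-92da-f418a97a85eb", workspace_id_from_url(url)
)
print(json.dumps({"resourceAccessRules": [rule]}, indent=2))
```

The printed JSON fragment can then be pasted into the ARM template of the storage account. Validating the GUID up front helps catch copy-and-paste mistakes before the template is deployed.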

In addition to the network, identity is a further security parameter. The appropriate permissions therefore still have to be set, unless anonymous blob access is activated (recommended for only a few use cases, since authentication is then bypassed). Two access models are available: access-control lists (ACLs) and role-based access control (RBAC) assignments. For RBAC assignments, keep in mind that there are dedicated roles for data lakes and storage accounts (e.g. Storage Blob Data Reader).

Restrictions at a glance:

  • Currently only supported by Azure Data Lake Gen 2
  • Whitelisting at the individual instance level requires ARM template know-how
  • F-capacity needed

Scenario 2: Fabric accesses Azure PaaS resources integrated into the network via private endpoints

Here, it is possible to work in Fabric via managed private endpoints and managed VNets.

The managed private endpoint must then be approved in the Azure portal.

After that, the data sources can be accessed via the private endpoints from notebooks and Spark job definitions. Access via pipelines is currently not possible, however. Spark notebooks require compute power, for which a cluster of virtual machines is created in the background. By default, the machines share a standard network. In this scenario, an isolated, managed network is created during the first run.
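As a rough illustration of what such a notebook access could look like, the following Python sketch builds the ABFS URI of a data lake path and hands it to Spark. The helper names abfss_uri and read_parquet, as well as the account, container and path values, are hypothetical. The sketch also assumes that the notebook provides a Spark session and that the identity running the notebook has the necessary permissions on the data lake.

```python
def abfss_uri(account: str, container: str, path: str = "") -> str:
    """Build the ABFS URI for a path in an Azure Data Lake Gen 2 container."""
    if not account or not container:
        raise ValueError("account and container are required")
    # A leading slash in the path would produce a double slash in the URI.
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def read_parquet(spark, account: str, container: str, path: str):
    """Read Parquet files through the Spark session passed in by the caller."""
    return spark.read.parquet(abfss_uri(account, container, path))


# Hypothetical storage account, container and folder, for illustration only.
print(abfss_uri("mydatalake", "raw", "/sales/2024"))
```

In a notebook, the call would be read_parquet(spark, "mydatalake", "raw", "sales/2024"), where spark is the session the notebook provides. Nothing in the URI refers to the private endpoint itself; the traffic is routed through it by the managed network.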

Restrictions at a glance:

  • Spark jobs on on-demand clusters take longer to become available than jobs on default pools.

  • Fabric data-engineering workloads must be supported in both the tenant region and the capacity region. This may result in restrictions for Switzerland West.

  • F-capacity needed.

  • OneLake shortcuts that point to a data lake behind private endpoints are not supported.

Another scenario mentioned here for the sake of completeness is connectivity to on-premises data sources. The on-premises data gateway can be used to access these data sources directly.

Ultimately, the much-discussed private-endpoint feature is only relevant if a data source located in Azure is to be connected. In this case, be aware that the data must be loaded via notebooks or Spark job definitions; access via pipelines is not possible.

Fabric itself or the capacity cannot be integrated into a network via a simple private endpoint. A more complex setup is needed for this purpose. A private link must be used, and the tenant settings for this link and/or the Block Public Internet Access setting must be activated.

However, companies that want to avoid network integration can also employ other security measures, as described at the beginning of the article. In the end, Fabric security means more than the use of private endpoints.

Do you need help with your Fabric setup using private endpoints? Do you want to know if this feature is really necessary for your environment?

Talk to us and learn more about b.telligent's «Way to Fabric».
