How can I integrate data sources that are secured via private endpoints into Fabric? How do I deal with Azure Data Lakes behind a firewall? This blog post shows the possibilities that Fabric offers natively.
Securing Incoming Data Traffic
Considerations of network security need to cover incoming as well as outgoing data transfer.
Inbound data traffic describes access to Fabric itself (e.g. opening app.fabric.microsoft.com). It is possible to integrate Fabric into a network via private links. This makes Fabric available exclusively via the internal network. Access from the Internet is blocked.
Two parameters are used in the admin portal for this purpose: Azure Private Link and Block Public Internet Access.
Private links enable services to be delivered across private virtual networks (VNets) without the need to connect them via peering.
"Block Public Internet Access" deactivates all traffic via the Internet. It is important to ensure that a private link is configured correctly before the setting is enabled. Otherwise there is a risk of locking oneself out of Fabric.
Limitations by subject area when activating Private Link and/or the Block Public Internet Access setting:
Governance
OneLake Regional Endpoint is not supported (used to comply with data residency).
Microsoft Purview Information Protection is not supported, causing the Sensitivity button to be greyed out in Power BI, label information to be unavailable, and decryption of .pbix to fail.
Migration
Inability to migrate workspaces to capacities in another region.
Tenant migrations are not supported.
Fabric trials are not supported.
Data Engineering
Tenant must be in the home region where Fabric data engineering is supported (regardless of capacity region) to allow use of Spark jobs.
Visual Queries in the Warehouse
If data are to be loaded into the lakehouse via pipeline, implementation is possible. If a data warehouse is involved, no support is currently available.
Data Engineering IoT Focus
Shortcuts for Eventhouses are not possible.
Event Stream feature is not usable
Inability to invoke Eventhouses from data pipelines.
Reading data via queuing, as well as data connectors that rely on queuing, is not supported.
Querying an Eventhouse via T-SQL is not supported.
Data Analysis
"Publish to Web" function is not available.
Power BI semantic model, Datamart or Dataflow Gen1 cannot connect to a Power BI semantic model or Dataflow as a data source.
Reports cannot be exported to PDF or PowerPoint.
E-mail subscriptions for dashboards and areas are no longer possible (brief component overviews are regularly sent to addressees).
On-premises Data Gateway does not work; a workaround is necessary here.
If both parameters are enabled, updates to the database for the modern usage metrics report fail.
User Experience
Deployment of designs and external images to customize the Fabric portal is not possible.
Service Limits
450 capacities supported
Outbound Traffic for Destinations in Azure With Network Security
For outbound data traffic, access from Fabric to external data sources is of central importance. Different scenarios are possible here:
Scenario 1: Azure Storage / Data Lake Gen 2 behind firewall
It is possible to whitelist all Fabric workspaces of the tenant or individual ones. Though all the tenant's workspaces can be specified, this is not recommended because the feature might be discontinued in future.
Individual workspaces can also be specified via the ARM template of the storage account / data lake:
In the template, only "b2788a72-eef5-4258-a609-9b1c3e454624" needs to be replaced with your own workspace ID. The ID can be found in the Fabric portal: when the workspace is open, it is shown as part of the URL.
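As a rough sketch of what such a workspace rule might look like, the Python below extracts the workspace ID from a portal URL and builds a resource access rule for the storage account's network ACLs. Several details are assumptions rather than something stated in this post: the `/groups/<workspace-id>` URL pattern, the `resourceAccessRules` property layout, the all-zero subscription ID and the `Fabric` resource group name in the resource ID. The function names and the tenant ID placeholder are hypothetical. Check the result against the current storage account ARM schema before deploying it.

```python
import json
import re

# Pattern for a GUID such as b2788a72-eef5-4258-a609-9b1c3e454624.
GUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"


def workspace_id_from_url(url: str) -> str:
    """Extract the workspace ID from a Fabric portal URL.

    Assumes the ID follows a '/groups/' path segment.
    """
    match = re.search(rf"/groups/({GUID})", url)
    if match is None:
        raise ValueError(f"no workspace ID found in {url!r}")
    return match.group(1).lower()


def resource_access_rule(tenant_id: str, workspace_id: str) -> dict:
    """Build one resource access rule for a single Fabric workspace.

    The subscription ID and resource group in the resource ID are
    assumed placeholders, not values of a real subscription.
    """
    return {
        "tenantId": tenant_id,
        "resourceId": (
            "/subscriptions/00000000-0000-0000-0000-000000000000"
            "/resourcegroups/Fabric/providers/Microsoft.Fabric"
            f"/workspaces/{workspace_id}"
        ),
    }


def network_acls_fragment(tenant_id: str, workspace_ids: list) -> str:
    """Render the rules as a JSON fragment for the storage account template."""
    rules = [resource_access_rule(tenant_id, w) for w in workspace_ids]
    return json.dumps({"resourceAccessRules": rules}, indent=2)
```

For example, `network_acls_fragment("<your-tenant-id>", [workspace_id_from_url(url)])` yields a fragment that could be placed under the storage account's `networkAcls` property, with one rule per whitelisted workspace.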
In addition to the network, identity also serves as a security parameter. Therefore, the appropriate permissions still have to be set. Two access models are available here: access-control lists (ACLs) and role-based access-control (RBAC) assignments. For RBAC assignments, it is important to keep in mind that there are dedicated roles for data lakes and storage accounts (e.g. Storage Blob Data Reader).
Restrictions at a glance:
Currently only supported by Azure Data Lake Gen 2
Whitelisting at the individual instance level requires ARM template know-how
F-capacity needed.
Scenario 2: Azure PaaS resources integrated into the network via private endpoints are to be accessed by Fabric.
Note on identity as a security parameter: this does not apply if anonymous blob access is activated (recommended only for a few use cases, as authentication is thus bypassed).
Here, it is possible to work in Fabric via managed private endpoints and managed VNets. The managed private endpoint must then be approved in the Azure portal.
After that, it is possible to access the data sources with notebooks and Spark job definitions via the private endpoints. Access via pipelines is currently not possible, however. Spark notebooks require CPU power, for which a cluster of virtual machines is created in the background. By default, the machines share a standard network. In this scenario, an isolated, managed network is created during the first run.
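To illustrate the notebook route, the sketch below builds the `abfss://` address of a path in a Data Lake Gen 2 account, which a Spark notebook can then read. The helper name and the account, container and path values are hypothetical. The sketch assumes that the managed private endpoint has already been approved and that the executing identity has the required permissions on the data lake.

```python
def abfss_uri(account: str, container: str, path: str = "") -> str:
    """Build the ABFS URI of a path in an Azure Data Lake Gen 2 account."""
    return (
        f"abfss://{container}@{account}.dfs.core.windows.net/"
        f"{path.lstrip('/')}"
    )
```

In a Fabric notebook, where a `spark` session is already available, the data could then be loaded with something like `spark.read.parquet(abfss_uri("mydatalake", "raw", "sales/2024"))`. The first run may take noticeably longer, because the isolated, managed network and the cluster have to be created first.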
Restrictions at a glance:
Spark jobs run on on-demand clusters, which are less readily available (slower to start) than the default pools.
Fabric data-engineering workloads must be supported for tenant and capacity region: This may result in restrictions for Switzerland West.
F-capacity needed.
OneLake shortcuts that point to a data lake behind private endpoints are not supported.
Another scenario mentioned here for the sake of completeness is connectivity to on-premises data sources. The on-premises data gateway can be used to access these data sources directly.
Ultimately, the much-discussed private-endpoint feature is only relevant if a data source located in Azure is to be connected. In this case, one should be aware that the data must be loaded via notebooks / Spark job definitions. Access via pipelines is not possible.
Fabric itself or the capacity cannot be integrated into a network via a simple private endpoint. A more complex setup is needed for this purpose. A private link must be used, and the tenant settings for this link and/or the Block Public Internet Access setting must be activated.
However, companies wanting to avoid network integration can also employ other security measures, as described at the beginning of the article. In the end, Fabric security means more than the use of private endpoints.