Preparing to run the Power BI collector

Setting up pre-requisites for running the collector

Make sure that the machine on which you run the collector meets the following hardware and software requirements.

Table 1.

| Category | Item | Requirement |
|---|---|---|
| Hardware | RAM | 8 GB |
| Hardware | CPU | 2 GHz processor |
| Software | Docker | Click here to get Docker. |
| Software | Java Runtime Environment | OpenJDK 17 is supported and available here. |
| data.world specific objects | Dataset | You must have a ddw-catalogs (or other) dataset set up to hold your catalog files when you are done running the collector. |
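The software requirements above can be checked before the first run. The following is a minimal sketch, not part of the collector: the function names are hypothetical, and it assumes that "supported" means the `java` on the PATH reports major version 17.

```python
import re
import shutil
import subprocess
from typing import Dict, Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the Java major version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    major = int(match.group(1))
    # Legacy scheme: Java 8 and earlier report themselves as "1.8.0_xxx".
    if major == 1 and match.group(2):
        return int(match.group(2))
    return major


def check_prerequisites() -> Dict[str, bool]:
    """Report whether Docker and a Java 17 runtime are found on the PATH."""
    results = {"docker": shutil.which("docker") is not None, "java17": False}
    java = shutil.which("java")
    if java:
        # `java -version` writes its banner to stderr rather than stdout.
        proc = subprocess.run([java, "-version"], capture_output=True, text=True)
        results["java17"] = parse_java_major(proc.stderr + proc.stdout) == 17
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

The check only confirms that the tools are installed; it does not verify RAM, CPU, or the existence of the data.world dataset.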



Setting up access for cataloging Power BI resources

Important things to note

  • When you authenticate with a service principal, Power BI does not include user workspaces.

  • Dataflows require the user or service principal to be added to the workspace with at least Contributor access. When authenticating with username and password, the app registration must have at least the Dataflow.Read.All permission in API permissions.

  • Only administrators of the tenant can grant admin consent.

Authentication types supported

There are two separate ways to authenticate to Power BI:

  • Service principal

  • User and password

This section will walk you through the process for both authentication types.

STEP 1: Registering your application

To register a new application:

  1. Go to the Azure Portal.

  2. Select Azure Active Directory.

  3. Click the App Registrations option in the left sidebar.

  4. Click New Registration and enter the following information:

    1. Application Name: DataDotWorldPowerBIApplication

    2. Supported account types: Accounts in this organizational directory only

  5. Click Register to complete the registration.

STEP 2: Creating Client secret and getting the Client ID

To create a Client Secret:

  1. Go to the Azure Portal.

  2. On the application page, select Certificates and Secrets.

  3. Click on Secret and add a description.

  4. Set the expiration to Never.

  5. Click on Create, and copy the secret value. You will use this value while setting the parameters for the collector.

To get the Client ID from the Azure portal:

  1. Click on the Overview tab in the left sidebar of the application home page.

  2. Copy the Client ID from the Essentials section. You will use this value while setting the parameters for the collector.
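To illustrate what the Client ID and Client secret are used for, here is a minimal sketch of an OAuth 2.0 client-credentials token request against the Microsoft identity platform. It is not the collector's own code, and the function names are hypothetical. The tenant ID argument is the value obtained in STEP 6, and the scope shown is the Power BI service's `.default` scope.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
POWER_BI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id: str, client_id: str,
                        client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": POWER_BI_SCOPE,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL.format(tenant_id=tenant_id),
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_access_token(tenant_id: str, client_id: str, client_secret: str) -> str:
    """Send the request and return the access token from the JSON response."""
    request = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

A failed call to `fetch_access_token` (for example, an HTTP 401 response) usually points to a wrong or expired secret, which is a quick way to validate the copied values before configuring the collector.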

STEP 3: Setting up permissions for username & password authentication

Important

Perform this task only if you are using username and password authentication.

To add permissions:

  1. Click on API Permissions, and select Add Permission.

  2. Search for the Microsoft Graph and select the following permissions:

    • Application permission: Application.Read.All

    • Delegated permission: User.Read (assigned by default)

  3. Search for the Power BI service, and click on Delegated permissions. Select the following permissions:

    • App.Read.All

    • Dashboard.Read.All

    • Dataflow.Read.All

    • Dataset.Read.All

    • Report.Read.All

    • Tenant.Read.All

    • Workspace.Read.All

  4. Click the Grant admin consent button, which is located next to the Add permission button. This allows the data.world collector to run as a daemon without asking the user for permission on every collector run.

Note

Only administrators of the tenant can grant admin consent.
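One way to confirm that the delegated Power BI permissions listed above were actually granted is to inspect the `scp` claim of an access token obtained for the user. The sketch below is a diagnostic aid only, under two assumptions: that the access token is a JWT whose payload can be decoded, and that granted delegated scopes appear space-separated in `scp`. It does not verify the token signature, and the function name is hypothetical.

```python
import base64
import json

# Delegated Power BI service permissions listed in this step.
REQUIRED_SCOPES = {
    "App.Read.All",
    "Dashboard.Read.All",
    "Dataflow.Read.All",
    "Dataset.Read.All",
    "Report.Read.All",
    "Tenant.Read.All",
    "Workspace.Read.All",
}


def missing_scopes(access_token: str) -> set:
    """Return the required scopes that are absent from the token's scp claim."""
    payload_b64 = access_token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    granted = set(claims.get("scp", "").split())
    return REQUIRED_SCOPES - granted
```

An empty result means every permission in the list is present; any names returned are permissions still to be added or consented to.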

STEP 4: Setting up REST API for service principals

Important

Perform this task only if you are using the service principal for authentication.

If you are using service principal as your authentication type, ensure that you enable service principals to use the Power BI APIs. For detailed instructions for doing this task, please see this documentation.
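Once service principals are allowed to use the Power BI APIs, a simple smoke test is to list the workspaces visible to the service principal through the Power BI REST API (`GET /v1.0/myorg/groups`). The following is a minimal sketch with hypothetical function names. It assumes you already hold an access token for the service principal.

```python
import json
import urllib.request

GROUPS_URL = "https://api.powerbi.com/v1.0/myorg/groups"


def build_list_workspaces_request(access_token: str) -> urllib.request.Request:
    """Build (but do not send) a request that lists visible workspaces."""
    return urllib.request.Request(
        GROUPS_URL,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def list_workspaces(access_token: str) -> list:
    """Send the request and return (id, name) pairs for each workspace."""
    request = build_list_workspaces_request(access_token)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return [(ws["id"], ws["name"]) for ws in body.get("value", [])]
```

An empty list typically means the service principal has not yet been added to any workspace, while an HTTP 401 or 403 error suggests the tenant setting has not taken effect.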

STEP 5: Setting up metadata scanning

Set up metadata scanning to enable access to the detailed data source information (like tables and columns) provided by Power BI through the read-only admin APIs.

Before metadata scanning can be run over an organization's Power BI workspaces, it must be set up by a Power BI administrator. Setting up metadata scanning involves two steps:

  • If using service principals for authentication, enabling service principal authentication for read-only admin APIs.

  • Enabling tenant settings for detailed dataset metadata scanning.

For details about doing this task, please see this documentation.
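For orientation, metadata scanning is exposed through the Power BI admin "scanner" endpoints, starting with `POST /v1.0/myorg/admin/workspaces/getInfo`, which accepts at most 100 workspace IDs per call. The sketch below shows how such requests could be built in batches. Whether the collector calls these endpoints in exactly this way is an assumption, and the function name is hypothetical.

```python
import json
import urllib.parse
import urllib.request

ADMIN_BASE = "https://api.powerbi.com/v1.0/myorg/admin/workspaces"
MAX_WORKSPACES_PER_SCAN = 100  # limit imposed by the getInfo endpoint


def build_scan_requests(workspace_ids: list, access_token: str) -> list:
    """Build one getInfo request per batch of up to 100 workspace IDs."""
    # These flags ask for lineage plus table- and column-level detail.
    query = urllib.parse.urlencode({
        "lineage": "True",
        "datasourceDetails": "True",
        "datasetSchema": "True",
        "datasetExpressions": "True",
    })
    requests = []
    for start in range(0, len(workspace_ids), MAX_WORKSPACES_PER_SCAN):
        batch = workspace_ids[start:start + MAX_WORKSPACES_PER_SCAN]
        requests.append(urllib.request.Request(
            f"{ADMIN_BASE}/getInfo?{query}",
            data=json.dumps({"workspaces": batch}).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {access_token}",
                "Content-Type": "application/json",
            },
            method="POST",
        ))
    return requests
```

Each getInfo call returns a scan ID; the scan is then polled at `scanStatus/{scanId}` and read from `scanResult/{scanId}`. If the tenant settings described above are not enabled, the schema details are missing from the result.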

STEP 6: Getting the Tenant ID

  1. To find the tenant ID, click the question mark in the Power BI app and then choose About Power BI.

  2. The tenant ID can be found at the end of the Tenant URL. You will use this value while setting the parameters for the collector.
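Copying the ID by hand is error-prone, so the extraction can be sketched in code. This example assumes the Tenant URL carries the tenant ID in a `ctid` query parameter (for example, `https://app.powerbi.com/home?ctid=<tenant-id>`) and that the ID is a GUID. The function name is hypothetical.

```python
import uuid
from urllib.parse import parse_qs, urlparse


def tenant_id_from_url(tenant_url: str) -> str:
    """Return the tenant ID (a GUID) from the ctid parameter of a Tenant URL."""
    query = parse_qs(urlparse(tenant_url).query)
    candidates = query.get("ctid", [])
    if not candidates:
        raise ValueError("no ctid parameter found in the tenant URL")
    # uuid.UUID raises ValueError if the value is not a well-formed GUID,
    # and str() normalizes it to lowercase hyphenated form.
    return str(uuid.UUID(candidates[0]))
```

The function raises `ValueError` for a URL without a `ctid` parameter or with a malformed ID, which catches truncated copies early.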