About the Power BI Service collector

Power BI is a collection of software services, apps, and connectors that work together to turn your unrelated sources of data into coherent, visually immersive, and interactive insights. Your data might be an Excel spreadsheet, or a collection of cloud-based and on-premises hybrid data warehouses. Power BI lets you easily connect to your data sources, visualize and discover what's important, and share that with anyone or everyone you want.

Use this collector to harvest metadata from Power BI. Users can then:

  • Discover Power BI reports and dashboards across your enterprise’s Power BI workspaces.

  • Perform impact analysis to understand how changes to upstream data sources impact Power BI reports.

Important things to note:

  • This collector harvests only from the Power BI Service. It does not harvest Power BI Desktop (.pbix) files unless they have been uploaded to the Power BI Service.

  • This collector does not harvest from Power BI Report Server. There is a separate collector available for Power BI Report Server.

Important

The Power BI Service collector can be run in the cloud or on-premises using Docker or JAR files.

Note

The latest version of the collector is 2.247. Release notes for this version and all previous versions are available in the collector release notes.

What is cataloged

The collector catalogs the following information.

Table 1.

| Object | Information collected |
|---|---|
| Workspaces | Title, Description |
| Apps | Title, Description |
| Power BI Measures | Title, Description, Is hidden, Expression |
| Reports | Title, Reports type, External URL, Embed URL, Preview Image, Created date, Last modified, Created by, Last modified by |
| Report Pages | Title. Note: Report pages within Apps cannot be cataloged when using Service Principal authentication due to restrictions in the Power BI APIs. |
| Dashboards | Title, External URL, Embed URL |
| Dashboard tiles | Title, Embed URL |
| Data Sources | Title, Data source type, Connection Details (kind and path) |
| Datasets | Title, External URL, Description, Created date, Created by |
| Dataflows | Title, Last modified, Description, Created by |
| Power BI Tables (Datasets and Dataflows) | Title, Is hidden, Description, Source expression |
| Power BI Columns | Title, Data type, Column type, Is hidden, Expression |
| Tabular file | File path, File name |
| File directory | Directory path |
| Database | Title, Type, Identifier, Server, Port |
| Database Schema | Title |
| Database Table | Title |
| Database Column | Title, Type |
| Table | Title, Description |
| Column | Title, Type |



Relationships between objects

By default, the data.world catalog includes catalog pages for the resource types below. Each catalog page has relationships to other related resource types. The catalog presentation and relationships are fully configurable, so the following table lists only the default configuration.

Table 2.

| Resource page | Relationship |
|---|---|
| App | Report, Dashboard, Workspace |
| Power BI Column | Power BI Table |
| Data source | Dataset, Dataflow, Tabular Data source (Database, Tabular File) |
| Tile | Dashboard, Report, Dataset |
| Dashboard | Tile, Workspace |
| Dashboard Tile | Associated Dataset |
| Dataset | Dashboard Tile, Report |
| Report | Tile, Workspace, Report pages, Dataset, Report. Note: In Power BI, App reports and the original reports in the associated workspace are considered as two different reports with unique report IDs. We catalog the relationship between these two reports. |
| Report Pages | Report |
| Dataset | Tile, Workspace, Report, Table, Data source, Dataset, Dataflow |
| Workspace | Report, Dataset, Dataflow, Dashboard, App |
| Dataflow | Workspace, Table, Data source, Dataflow |
| Power BI Table | Dataset, Dataflow, Power BI Column, Power BI Measure |
| Power BI Measure | Power BI Table |
| Tabular Data source (Database, Tabular File) | Data source |



Lineage for Power BI

The following lineage information is collected by the Power BI collector.

Table 3.

| Object | Lineage available |
|---|---|
| Dashboard Tile | Associated Dataset |
| Dataset | Associated Dataflow, Dataset |
| Dataflow | Dataflow |
| Power BI Column | Associated columns that the column sources its data from or calculates its values from. Notes: (1) The collector can harvest lineage from Power BI expressions that use parameters in place of the database server name, schema name, database table, or database name. (2) Table-level lineage, column-level lineage, and catalog relationships are not available between tables/columns (fields) and reports/report pages through the Power BI API. |
| Power BI table | Associated tables that the table sources its data from. Note: The collector uses Power BI expressions returned by the APIs to parse the lineage to the source columns/tables. |
| Power BI Measure | Associated columns that the measure sources its data from |
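
To make the column and measure lineage above more concrete, the following is a minimal sketch of how column references might be pulled out of a calculated column or measure expression. It is an illustration only, not the collector's parser: the regular expression, the `column_references` function, the `default_table` parameter, and the sample expression are all hypothetical, and the pattern covers only simple `Table[Column]` and `'Quoted Table'[Column]` references. A bare `[Name]` is treated as a column here even though it could refer to a measure.

```python
import re

# Optional table qualifier (quoted or bare) followed by a bracketed name,
# e.g. Sales[Amount], 'Sales Orders'[Net-Amount], or a bare [Amount].
# Hypothetical pattern for illustration; real expressions have more syntax.
REFERENCE = re.compile(
    r"(?:'(?P<quoted>[^']+)'|(?P<bare>[A-Za-z0-9_]+))?\[(?P<column>[^\]]+)\]"
)


def column_references(expression, default_table=None):
    """Return unique (table, column) pairs in order of first appearance."""
    refs = []
    for match in REFERENCE.finditer(expression):
        # Fall back to the owning table when the reference is unqualified.
        table = match.group("quoted") or match.group("bare") or default_table
        pair = (table, match.group("column"))
        if pair not in refs:
            refs.append(pair)
    return refs


# Hypothetical measure expression with quoted, bare, and unqualified references.
expr = "SUM('Sales Orders'[Net-Amount]) - SUM(Returns[Refund_Total]) + [Tax]"
for table, column in column_references(expr, default_table="Sales Orders"):
    print(f"{table} -> {column}")
```

Each resulting pair corresponds to one upstream column that the calculated column or measure would depend on in this simplified model.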



Supported cross-system lineage

The currently supported data sources for cross-system lineage are:

  • Oracle

  • Denodo

  • Snowflake

  • SQL Server

  • PostgreSQL

  • Redshift

  • Databricks

  • CSV documents

    Important

    While other data sources are not formally supported, running the collector for those sources may still enable you to view cross-system lineage between Power BI and these sources.

Supported transformations and expressions for harvesting lineage metadata

Note

Any table operations or transformations not listed in the following table as supported or unsupported are ignored.

This section captures supported transformations, source expressions, calculated columns, and measure expressions when harvesting lineage metadata.

Table 4.

| Category | Supported/Unsupported objects |
|---|---|
| Supported Parameterized Expressions | The collector parses source expressions that use parameters in place of the following elements: full Source value, server/host value, warehouse value, database name, schema name, table name, and SQL expressions that incorporate parameters. |
| Supported data functions | Csv.Document, Excel.Workbook, File.Contents, Folder.Contents, Folder.Files, Json.Document, Odbc.DataSource, Odbc.InferOptions, Odbc.Query, Xml.Document, Web.Contents, Web.Headers, Web.BrowserContents, AmazonRedshift.Database, Sql.Database, Sql.Databases, Snowflake.Databases, PostgreSQL.Database, Databricks.Catalogs, Oracle.Database, Denodo.Contents, Databricks.Query |
| Supported table functions | Table.AddColumn, Table.AddIndexColumn, Table.RenameColumns, Table.NestedJoin, Table.ExpandTableColumn, Table.SplitColumn, Table.DuplicateColumn, Table.CombineColumns |
| Unsupported table operations | Note: Contact data.world support if you have any expressions that use the following unsupported table operations. Table.Pivot, Table.PromoteHeaders, Table.DemoteHeaders, Table.PrefixColumns, Table.TransformColumnNames, Table.Unpivot, Table.UnpivotOtherColumns, Table.AddFuzzyClusterColumn, Table.AddJoinColumn, Table.AggregateTableColumn, Table.Combine, Table.CombineColumnsToRecord, Table.ExpandRecordColumn, Table.Join, Table.Transpose |
| Supported dataflow functions | PowerPlatform.Dataflows, PowerBI.Dataflows |
| Supported value functions | Value.NativeQuery |
| Supported calculated columns | Lineage from calculated column expressions that reference columns with or without table references. Column and table names containing alphanumeric characters, spaces, hyphens, and underscores are supported. |
| Supported measures | Lineage from measure expressions that reference columns or tables whose names contain alphanumeric characters, spaces, hyphens, underscores, or surrounding quotes. |
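
As a rough way to check an expression against the lists in Table 4 before running the collector, the sketch below sorts the functions called in a source expression into supported, unsupported, and ignored (not listed). The function names are copied from the table; everything else (`classify_functions`, the regular expression, and the sample expression with its `ServerName` and `DatabaseName` parameters) is hypothetical and does not reflect how the collector itself parses expressions.

```python
import re

SUPPORTED = {
    # Data functions
    "Csv.Document", "Excel.Workbook", "File.Contents", "Folder.Contents",
    "Folder.Files", "Json.Document", "Odbc.DataSource", "Odbc.InferOptions",
    "Odbc.Query", "Xml.Document", "Web.Contents", "Web.Headers",
    "Web.BrowserContents", "AmazonRedshift.Database", "Sql.Database",
    "Sql.Databases", "Snowflake.Databases", "PostgreSQL.Database",
    "Databricks.Catalogs", "Oracle.Database", "Denodo.Contents",
    "Databricks.Query",
    # Table functions
    "Table.AddColumn", "Table.AddIndexColumn", "Table.RenameColumns",
    "Table.NestedJoin", "Table.ExpandTableColumn", "Table.SplitColumn",
    "Table.DuplicateColumn", "Table.CombineColumns",
    # Dataflow and value functions
    "PowerPlatform.Dataflows", "PowerBI.Dataflows", "Value.NativeQuery",
}

UNSUPPORTED = {
    "Table.Pivot", "Table.PromoteHeaders", "Table.DemoteHeaders",
    "Table.PrefixColumns", "Table.TransformColumnNames", "Table.Unpivot",
    "Table.UnpivotOtherColumns", "Table.AddFuzzyClusterColumn",
    "Table.AddJoinColumn", "Table.AggregateTableColumn", "Table.Combine",
    "Table.CombineColumnsToRecord", "Table.ExpandRecordColumn", "Table.Join",
    "Table.Transpose",
}

# A dotted name such as Sql.Database immediately followed by "(".
FUNCTION_CALL = re.compile(r"\b([A-Z][A-Za-z]*\.[A-Z][A-Za-z]*)\s*\(")


def classify_functions(m_expression):
    """Group the function calls in an expression by their Table 4 status."""
    result = {"supported": [], "unsupported": [], "ignored": []}
    for name in FUNCTION_CALL.findall(m_expression):
        if name in SUPPORTED:
            bucket = "supported"
        elif name in UNSUPPORTED:
            bucket = "unsupported"
        else:
            bucket = "ignored"  # not listed in Table 4
        if name not in result[bucket]:
            result[bucket].append(name)
    return result


# Hypothetical source expression; ServerName and DatabaseName are parameters.
sample = """
let
    Source = Sql.Database(ServerName, DatabaseName),
    Orders = Source{[Schema="dbo", Item="orders"]}[Data],
    Renamed = Table.RenameColumns(Orders, {{"amt", "Amount"}}),
    Promoted = Table.PromoteHeaders(Renamed),
    Sorted = Table.Sort(Promoted, {"Amount"})
in
    Sorted
"""
print(classify_functions(sample))
```

Anything reported as unsupported is a candidate for the support request mentioned in the table, while ignored functions are simply skipped during lineage harvesting.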



Version supported

  • The collector supports Power BI Cloud API v1.0.

Authentication supported

The collector supports two ways to authenticate to Power BI:

  • Service principal

  • Username and password

The collector will harvest metadata for all Power BI apps and workspaces to which the supplied account has access.
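
For readers who want to confirm what a given service principal can see before running the collector, here is a minimal sketch of the two HTTP requests involved: an OAuth 2.0 client-credentials token request to Microsoft's identity platform, and a call to the Power BI REST API v1.0 endpoint that lists workspaces. It only builds the requests and does not send them. The helper names and the tenant, client, and secret values are placeholders, and this is an assumption about a typical manual check, not a description of the collector's internals. The username and password option is not shown.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Scope that requests the permissions granted to the app for the Power BI API.
POWER_BI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": POWER_BI_SCOPE,
    }).encode("utf-8")
    return Request(url, data=body, method="POST")


def build_workspaces_request(access_token):
    """Build (but do not send) a request listing visible workspaces."""
    return Request(
        "https://api.powerbi.com/v1.0/myorg/groups",
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Placeholder values; substitute real credentials in a secure way.
token_request = build_token_request("my-tenant-id", "my-client-id", "my-secret")
print(token_request.get_method(), token_request.full_url)

workspaces_request = build_workspaces_request("<access token from the response>")
print(workspaces_request.get_method(), workspaces_request.full_url)
```

Passing either request to `urllib.request.urlopen` would send it. The workspaces returned for the token give a rough indication of what the account has access to.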