About the Power BI Service collector
Power BI is a collection of software services, apps, and connectors that work together to turn your unrelated sources of data into coherent, visually immersive, and interactive insights. Your data might be an Excel spreadsheet, or a collection of cloud-based and on-premises hybrid data warehouses. Power BI lets you easily connect to your data sources, visualize and discover what's important, and share that with anyone or everyone you want.
Use this collector to harvest metadata from Power BI. Users can then:
Discover Power BI reports and dashboards across your enterprise’s Power BI workspaces.
Perform impact analysis to understand how changes to upstream data sources impact Power BI reports.
Important things to note:
This collector harvests only from the Power BI Service. It does not harvest Power BI Desktop (.pbix) files unless they have been uploaded to the Power BI Service.
This collector does not harvest from Power BI Report Server. There is a separate collector available for Power BI Report Server.
Important
The Power BI Service collector can be run in the cloud or on-premises using Docker or JAR files.
Note
The latest version of the collector is 2.255. For details on this version and all previous versions, see the collector release notes.
What is cataloged
The collector catalogs the following information.
Object | Information collected |
---|---|
Workspaces | Title, Description |
Apps | Title, Description |
Power BI Measures | Title, Description, Is hidden, Expression |
Reports | Title, Report type, External URL, Embed URL, Preview image (not supported for paginated report types), Created date, Last modified, Created by, Last modified by, Description |
Report Pages | Title. Note: Report pages within Apps cannot be cataloged when using Service Principal authentication due to restrictions in the Power BI APIs. |
Dashboards | Title, External URL, Embed URL |
Dashboard tiles | Title, Embed URL |
Data Sources | Title, Data source type, Connection Details (kind and path) |
Semantic model | Title, External URL, Description, Created date, Created by |
Dataflows | Title, Last modified, Description, Created by |
Power BI Tables (Semantic model and Dataflows) | Title, Is hidden, Description, Source expression |
Power BI Columns | Title, Data type, Column type, Is hidden, Expression |
Tabular file | File path, File name |
File directory | Directory path |
Database | Title, Type, Identifier, Server, Port |
Database Schema | Title |
Database Table | Title |
Database Column | Title, Type |
Table | Title, Description |
Column | Title, Type |
Relationships between objects
By default, the data.world catalog includes catalog pages for the resource types below. Each catalog page has relationships to other related resource types. The catalog presentation and relationships are fully configurable, so the table below lists only the default configuration.
Resource page | Relationship |
---|---|
App | Report, Dashboard, Workspace |
Power BI Column | Power BI Table |
Data source | Semantic model, Dataflow, Tabular Data source (Database, Tabular File) |
Tile | Dashboard, Report, Semantic model |
Dashboard | Tile, Workspace |
Dashboard Tile | Associated Semantic model |
Semantic model | Dashboard Tile, Report |
Report | Tile, Workspace, Report pages (not applicable for paginated report types), Semantic model (not applicable for paginated report types), Report Note: In Power BI, App reports and the original reports in the associated workspace are considered as two different reports with unique report IDs. We catalog the relationship between these two reports. |
Report Pages | Report (not applicable for paginated report types) |
Semantic model | Tile, Workspace, Report, Table, Data source, Semantic model, Dataflow |
Workspace | Report, Semantic model, Dataflow, Dashboard, App |
Dataflow | Workspace, Table, Data source, Dataflow |
Power BI Table | Semantic model, Dataflow, Power BI Column, Power BI Measure |
Power BI Measure | Power BI Table |
Tabular Data source (Database, Tabular File) | Data source |
Lineage for Power BI
The following lineage information is collected by the Power BI collector.
Object | Lineage available |
---|---|
Dashboard Tile | Associated Semantic model |
Semantic model | Associated Dataflow, Semantic model |
Dataflow | Dataflow |
Power BI Column | Associated columns that the column sources its data from or calculates its values from. |
Power BI Table | Associated tables that the table sources its data from. Note: The collector parses the Power BI expressions returned by the APIs to derive lineage to the source columns and tables. |
Power BI Measure | Associated columns that the measure sources its data from. |
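The note above says lineage is derived by parsing the expressions that the Power BI APIs return. The following is a minimal sketch of that idea, not the collector's actual parser: it scans a Power Query (M) source expression for a few of the data-access functions named in this document and pulls out their quoted arguments. The helper `find_sources`, the `SOURCE_FUNCTIONS` mapping, and the sample expression are all hypothetical and exist only for illustration.

```python
import re

# Illustrative subset only: M data-access functions mapped to the source
# system they connect to. This mapping is an assumption for the example,
# not the collector's internal configuration.
SOURCE_FUNCTIONS = {
    "Sql.Database": "SQL Server",
    "Snowflake.Databases": "Snowflake",
    "PostgreSQL.Database": "PostgreSQL",
    "Oracle.Database": "Oracle",
    "AmazonRedshift.Database": "Redshift",
}

# Captures a dotted function name and the raw text between its parentheses.
CALL_RE = re.compile(r'([A-Za-z]+\.[A-Za-z]+)\(([^)]*)\)')
# Captures the contents of each double-quoted string literal.
STRING_RE = re.compile(r'"([^"]*)"')


def find_sources(m_expression):
    """Return (source system, [quoted arguments]) for each known source call."""
    results = []
    for name, args in CALL_RE.findall(m_expression):
        if name in SOURCE_FUNCTIONS:
            results.append((SOURCE_FUNCTIONS[name], STRING_RE.findall(args)))
    return results


# Hypothetical source expression for a table loaded from SQL Server.
expr = (
    'let Source = Sql.Database("sales-srv", "SalesDb"), '
    'Orders = Source{[Schema="dbo",Item="Orders"]}[Data] in Orders'
)
print(find_sources(expr))
```

A regular expression is enough for this flat example; nested calls or parameterized arguments would need a real M parser.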
Supported cross-system lineage
The currently supported data sources for cross-system lineage are:
Oracle
Denodo
Snowflake
SQL Server
PostgreSQL
Redshift
Databricks
CSV documents
Important
While other data sources are not formally supported, running the collector for those sources may still enable you to view cross-system lineage between Power BI and these sources.
Supported transformations and expressions for harvesting lineage metadata
Note
Any table operations or transformations not listed in the following table as supported or unsupported are ignored.
This section captures supported transformations, source expressions, calculated columns, and measure expressions when harvesting lineage metadata.
Category | Supported/Unsupported objects |
---|---|
Supported Parameterized Expressions | The collector parses source expressions that use parameters in place of the following elements of the expressions: full Source value, server/host value, warehouse value, database name, schema name, table name, SQL expressions which incorporate parameters into them |
Supported source expressions | Csv.Document, Excel.Workbook, File.Contents, Folder.Contents, Folder.Files, Json.Document, Odbc.DataSource, Odbc.InferOptions, Odbc.Query, Xml.Document, Web.Contents, Web.Headers, Web.BrowserContents, AmazonRedshift.Database, Sql.Database, Sql.Databases, Snowflake.Databases, PostgreSQL.Database, Databricks.Catalogs, Oracle.Database, Denodo.Contents, Databricks.Query |
Supported table operations | Table.AddColumn, Table.AddIndexColumn, Table.RenameColumns, Table.NestedJoin, Table.ExpandTableColumn, Table.SplitColumn, Table.DuplicateColumn, Table.CombineColumns |
Unsupported table operations | Note: Contact data.world support if you have any expressions that use the following unsupported table operations. Table.Pivot, Table.PromoteHeaders, Table.DemoteHeaders, Table.PrefixColumns, Table.TransformColumnNames, Table.Unpivot, Table.UnpivotOtherColumns, Table.AddFuzzyClusterColumn, Table.AddJoinColumn, Table.AggregateTableColumn, Table.Combine, Table.CombineColumnsToRecord, Table.ExpandRecordColumn, Table.Join, Table.Transpose |
Supported dataflow functions | |
Supported calculated columns | Lineage from calculated column expressions that reference columns with or without a table reference. Column and table names containing alphanumeric characters, spaces, hyphens, and underscores are supported. |
Supported measures | Lineage from measure expressions that reference columns or tables. Column and table names containing alphanumeric characters, spaces, hyphens, underscores, and surrounding quotes are supported. |
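To make the naming rules for calculated columns and measures concrete, here is a small sketch that extracts column references from a DAX expression. It is an illustration under stated assumptions, not the collector's implementation: the pattern `REF_RE`, the helper `column_refs`, and the sample measure are hypothetical. It handles unquoted table names, single-quoted table names (which may contain spaces and hyphens), and references with no table qualifier.

```python
import re

# Optional table qualifier ('Quoted Name' or Unquoted_Name) followed by [Column].
REF_RE = re.compile(r"(?:'([^']+)'|([A-Za-z0-9_]+))?\[([^\]]+)\]")


def column_refs(expression):
    """Return (table or None, column) for each bracketed reference found."""
    refs = []
    for match in REF_RE.finditer(expression):
        table = match.group(1) or match.group(2)  # None when unqualified
        refs.append((table, match.group(3)))
    return refs


# Hypothetical measure expression mixing the three reference styles.
measure = "SUM('Sales Data'[Net-Amount]) / COUNT(Orders[order_id]) + [Discount]"
for table, column in column_refs(measure):
    print(table, column)
```

An unqualified reference such as `[Discount]` may name a measure rather than a column in DAX, so a real parser would have to resolve it against the model; this sketch simply reports it with no table.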
Version supported
The collector supports Power BI Cloud API v1.0.
Authentication supported
There are two ways to authenticate to Power BI:
Service principal
User and password
The collector will harvest metadata for all Power BI apps and workspaces to which the supplied account has access.
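For orientation, the sketch below shows what service principal authentication against the Power BI REST API v1.0 typically looks like, using the standard Microsoft Entra client-credentials flow. It is an assumption that the collector does something similar internally; the source does not describe its implementation. The helper names `build_token_request` and `build_workspaces_request` and all credential values are hypothetical placeholders. The functions only construct the requests and do not send them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Scope used when requesting app-only tokens for the Power BI REST API.
POWER_BI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": POWER_BI_SCOPE,
    }).encode("ascii")
    return Request(url, data=body, method="POST")


def build_workspaces_request(access_token):
    """Build a request listing the workspaces visible to the token's identity."""
    return Request(
        "https://api.powerbi.com/v1.0/myorg/groups",
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Placeholder values; real ones come from your app registration.
token_req = build_token_request("my-tenant-id", "my-client-id", "my-secret")
print(token_req.get_method(), token_req.full_url)
```

The workspace listing returns only what the authenticated identity can see, which matches the statement above that the collector harvests only the apps and workspaces the supplied account can access.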