About the Tableau collector

Warning

This collector is in public preview. It has passed our standard testing, but it is not yet widely adopted. You might encounter unforeseen edge cases in your environment. data.world is committed to promptly addressing any issues with public preview collectors. If you face any problems, please report them through your Customer Success Director, implementation team, or support team for assistance.

Note

The latest version of the Collector is 2.294. To view the release notes for this version and all previous versions, please go here.

Use this collector to:

Discover Tableau objects (such as workbooks and dashboards) in your Tableau Online or Tableau Server instance
Perform impact analysis to understand how changes to upstream data sources affect Tableau objects

Tableau version supported

The collector supports Tableau Cloud and Tableau Server. The specific versions supported are Tableau API versions 3.10 and above on Tableau Server v2022.1.
The collector is expected to support current versions of Tableau Online and Tableau Server. If you have any questions or encounter problems, please contact data.world Support.
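
To check an instance against the minimum API version stated above, compare the version parts numerically. As a decimal number, "3.10" would read as lower than "3.9". The following is a minimal sketch, not part of the collector; the function names are hypothetical, and it assumes you already know the API version your Tableau instance reports.

```python
from typing import Tuple

MIN_API_VERSION = "3.10"  # minimum Tableau API version stated in this section


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted version string such as '3.10' into integer parts."""
    return tuple(int(part) for part in version.strip().split("."))


def api_version_supported(version: str, minimum: str = MIN_API_VERSION) -> bool:
    """Return True if `version` is at or above `minimum`, compared part by part."""
    return parse_version(version) >= parse_version(minimum)


print(api_version_supported("3.9"))   # False: 9 is lower than 10
print(api_version_supported("3.15"))  # True
```

Comparing tuples of integers avoids the string and float pitfalls of dotted version numbers.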

Authentication supported

The Tableau collector supports the following methods for authentication:

What is cataloged

The collector catalogs the following information.

Table 1.

Object	Information cataloged
Databases	Name, Identifier, Description, Database Connection Type
Database Schemas	Name, Identifier
Database tables	Name, Identifier
Database columns	Name, Identifier
Tableau Databases	Name, Identifier, Connection Type
Tableau Database tables	Name, Identifier, Connection Type
Tableau Database columns	Name, Identifier
Projects	Name, Description
Workbooks	Name, Description, Creator Email, Creator Name, Creator Tableau User, Preview Image, and Workbook URL. Note: Unpublished workbooks are not harvested because the Tableau REST APIs do not return unpublished objects.
Dashboards	Name, Creator Email, Creator Name, Creator Tableau User, Preview Image, Dashboard URL, Number of Favorites, and Number of Views. Note: Unpublished dashboards are not harvested because the Tableau REST APIs do not return unpublished objects.
Views	Name, Creator Email, Creator Name, Creator Tableau User, Number of Views, Number of Favorites, Preview Image, and View URL. Note: Unpublished views are not harvested because the Tableau REST APIs do not return unpublished objects.
Datasource fields	Name, Identifier, Description
Calculated fields	Name, Identifier, Description, Calculation Formula, Category, Role, Type
Group fields	Name, Identifier, Description, Category, Role, Type
Bin fields	Name, Identifier, Description, Category, Role, Type, Bin Size
Column fields	Name, Identifier, Description, Category, Role, Type
Metrics	Name, Identifier, Description, Creator Email, Creator Name, Creator Tableau User, Metric URL
Custom SQL tables	Name, Identifier, Description, SQL Query
Embedded data sources	Name, Identifier, Last refresh date
Published data sources	Name, Identifier, Description, Last refresh date

Relationships between objects

By default, the harvested metadata includes catalog pages for the following resource types. Each catalog page has a relationship to the other related resource types. If the metadata presentation for this data source has been customized with the help of the data.world Solutions team, you may see other resource pages and relationships.

Table 2.

Resource page	Relationship
Databases	Schemas contained within the database; Tables contained within the database
Database Schemas	Tables contained within the schema
Database tables	Schema containing database table; Database containing the database table; Views using the database table as a data source; Column fields using the database table as a data source
Database columns	Tables containing the database columns; Column fields referencing the database columns
Tableau Databases	Tableau Database tables contained within the Tableau Database
Tableau Database tables	Tableau Column contained within the Tableau Database table
Tableau Database columns	Tableau Column Field referencing the Tableau Database Column
Projects	Views contained within the project; Workbooks contained within the project; Dashboards contained within the project; Subprojects contained within the project
Workbooks	Projects that contain a workbook; Data sources embedded within the workbook; Views contained within the workbook; Custom SQL Tables embedded within the workbook
Dashboards	Fields used by dashboard; Custom SQL tables used by dashboard; Projects containing dashboard; Tables used by dashboard; Workbooks containing dashboard; Views embedded in dashboard
Views	Fields used by view; Custom SQL tables used by view; Projects containing view; Database Tables used by view; Workbooks containing view; Dashboards which embed the view; Metrics presenting the view
Datasource fields	Views that use the datasource field; Dashboards that use the datasource field; Data source fields containing the datasource field
Calculated fields	Views that use the calculated field; Dashboards that use the calculated field; Data sources containing the calculated field
Group fields	Data sources containing group field
Bin fields	Data sources containing bin field
Column fields	Data Source containing the column field; Views that use the column field; Dashboards that use the column field
Custom SQL tables	Views using Custom SQL table; Dashboards using Custom SQL table; Workbooks using Custom SQL table
Embedded data sources	Fields contained within the embedded data source; Workbook containing embedded data source
Published data sources	Fields contained within the published data source

Lineage for Tableau

The collector does not support harvesting cross-system lineage when Tableau reports connect to a source system using ODBC connections.

Table 3.

Object	Lineage available
Database columns and tables	Fields that use database columns and tables
Projects	Databases, Database schemas, Database Tables, Database Columns, Workbooks, Views, Dashboards, custom SQL tables, and Data sources that projects contain
Dashboards	Fields and tables that dashboards source their data from
Views	Fields and tables that views source their data from
Fields	Columns, tables, and other fields that a field sources its data from
Tableau Database tables	Tableau Databases containing the Tableau Database table
Tableau Database columns	Fields that reference the Tableau Database column, Tableau Database tables containing the Tableau Database column
Published data sources	Embedded data sources that were derived from published data source
Embedded data sources	Database tables and Database columns that the Embedded data source uses data from.
Custom SQL Tables	Database tables and Database columns that the Custom SQL Table uses data from. Note: Lineage is not created between Custom SQL Tables and columns or tables that are not available in the API.
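
The lineage in the table above is what makes impact analysis possible: starting from an upstream object, you follow the lineage edges to find every Tableau object that depends on it. The following is an illustrative sketch only. The edge data and the function name are hypothetical and do not reflect how the collector or the catalog stores lineage.

```python
from collections import deque
from typing import Dict, List

# Hypothetical lineage edges: upstream object -> objects that source data from it.
LINEAGE: Dict[str, List[str]] = {
    "Database table: orders": ["Embedded data source: Orders"],
    "Embedded data source: Orders": ["Column field: amount", "View: Orders by month"],
    "Column field: amount": ["Dashboard: Sales"],
    "View: Orders by month": ["Dashboard: Sales"],
}


def downstream(start: str, edges: Dict[str, List[str]]) -> List[str]:
    """Return every object reachable from `start`, in breadth-first order."""
    seen = {start}
    found: List[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:  # visit each object once, even if reached twice
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


for name in downstream("Database table: orders", LINEAGE):
    print(name)
```

In this made-up example, a change to the orders table would affect the embedded data source, the column field, the view, and the dashboard that embeds them.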

Supported cross-system lineage

The currently supported data sources for cross-system lineage are:

Postgres
Snowflake
BigQuery
Redshift

Important

While other data sources are not formally supported, running the collector for those sources may still enable you to view cross-system lineage between Tableau and these sources.

Important things to note about improving the performance of collector runs

Depending on the size of your Tableau instance, you may want to exclude or include specific resources from your catalog.

Exclude object types: Use the --tableau-exclude parameter to exclude certain object types from harvesting. The supported object types are: View, Dashboard, Database, PublishedDataSource, EmbeddedDataSource, CalculatedField, ColumnField, BinField, GroupField, DatasourceField, CustomSQLTable, Metric.
Filter to a specific Tableau site: Use the --tableau-site parameter to harvest from a specific site only.
Filter to specific Tableau projects: Use the --tableau-project parameter to harvest only from the specified Tableau projects. Use the parameter multiple times for multiple projects.
Filter out specific Tableau projects: Use the --tableau-exclude-project parameter to skip the specified Tableau projects. Use the parameter multiple times for multiple projects.
GraphQL page size: Use the --tableau-graphql-page-size parameter to adjust the GraphQL page size. The maximum page size is 1000.
Increase Docker resources: If you run into out-of-memory errors, increase the memory on the machine running the collector, increase the Java heap size when running the collector as a JAR file, or use the filtering options above.
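
The filtering parameters above can be assembled and validated before a run. The following is a minimal sketch, not an official tool. The function name is hypothetical, and it makes two assumptions that this section does not confirm: each value is passed as a separate token after its parameter, and --tableau-exclude is repeated once per object type. Check the collector's parameter reference before relying on either.

```python
from typing import List, Optional, Sequence

# Object types listed in this section as valid for --tableau-exclude.
SUPPORTED_EXCLUDE_TYPES = {
    "View", "Dashboard", "Database", "PublishedDataSource", "EmbeddedDataSource",
    "CalculatedField", "ColumnField", "BinField", "GroupField", "DatasourceField",
    "CustomSQLTable", "Metric",
}
MAX_GRAPHQL_PAGE_SIZE = 1000  # maximum stated in this section


def build_tableau_filter_args(
    site: Optional[str] = None,
    projects: Sequence[str] = (),
    exclude_projects: Sequence[str] = (),
    exclude_types: Sequence[str] = (),
    graphql_page_size: Optional[int] = None,
) -> List[str]:
    """Build the Tableau filtering arguments as a list of command-line tokens."""
    args: List[str] = []
    if site:
        args += ["--tableau-site", site]
    for project in projects:  # repeated once per project, as described above
        args += ["--tableau-project", project]
    for project in exclude_projects:
        args += ["--tableau-exclude-project", project]
    unknown = sorted(set(exclude_types) - SUPPORTED_EXCLUDE_TYPES)
    if unknown:
        raise ValueError(f"Unsupported object types: {', '.join(unknown)}")
    for object_type in exclude_types:  # assumption: one flag per object type
        args += ["--tableau-exclude", object_type]
    if graphql_page_size is not None:
        if not 1 <= graphql_page_size <= MAX_GRAPHQL_PAGE_SIZE:
            raise ValueError(f"Page size must be 1 to {MAX_GRAPHQL_PAGE_SIZE}")
        args += ["--tableau-graphql-page-size", str(graphql_page_size)]
    return args


print(build_tableau_filter_args(
    site="analytics",
    projects=["Sales", "Finance"],
    exclude_types=["Metric"],
    graphql_page_size=500,
))
```

Validating the object types and page size up front surfaces a typo before a long harvest starts. The site and project names here are placeholders.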
