What is collected
The catalog collectors pull only metadata from the source; they do not collect the data itself. For databases, the information gathered includes the number of tables and columns, the names of the tables and columns, key information, and the data types used. This is the kind of information data analysts need to understand a source.
Depending on your data source, additional metadata might be cataloged. What is cataloged also changes as we release new versions of the collectors. Detailed information about what is collected, where we have it, is in the documentation for each collector.
JDBC data sources
When the DWCC is run against a JDBC data source, the following metadata is collected:
database name
connection information
schema name
table and view names by schema
column names
column data types
column length
column precision (as appropriate)
table and column descriptions (if they exist)
Primary and foreign key information is also collected by the DWCC, but it is not currently displayed in the platform.
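To make the idea of metadata-only collection concrete, here is a minimal sketch in Python. It is not the DWCC implementation, and it uses Python's built-in sqlite3 module as a stand-in for a JDBC source. The function name collect_metadata and the shape of the returned dictionary are assumptions made for illustration. The sketch reads table names, column names, declared data types, and key information from the database's own catalog, and never selects a row from the tables themselves.

```python
import sqlite3


def collect_metadata(conn):
    """Hypothetical helper: gather table and column metadata only.

    No user rows are read; everything comes from SQLite's catalog
    (sqlite_master) and its PRAGMA introspection statements.
    """
    catalog = {}
    objects = conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    ).fetchall()
    for name, kind in objects:
        if name.startswith("sqlite_"):
            continue  # skip SQLite's internal bookkeeping tables
        columns = []
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _cid, col, col_type, _notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{name}")'
        ):
            columns.append(
                {"name": col, "type": col_type, "primary_key": pk > 0}
            )
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        foreign_keys = [
            {"column": row[3], "references": f"{row[2]}.{row[4]}"}
            for row in conn.execute(f'PRAGMA foreign_key_list("{name}")')
        ]
        catalog[name] = {
            "kind": kind,
            "columns": columns,
            "foreign_keys": foreign_keys,
        }
    return catalog


# Small in-memory database used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name VARCHAR(50));
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total DECIMAL(10,2)
    );
    """
)
catalog = collect_metadata(conn)
for table, info in catalog.items():
    print(table, [c["name"] for c in info["columns"]], info["foreign_keys"])
```

A real JDBC source exposes the same kind of information through its driver's metadata interface rather than through SQLite pragmas, so the details of each lookup will differ by database.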
JDBC sources include:
Databricks
DB2
Denodo
Dremio
Hive
Infor ION
MySQL
Oracle
PostgreSQL
Presto
Redshift
Snowflake
SQL Anywhere
SQL Server*
Note
* For MS SQL Server, table and column descriptions are not cataloged, even if they exist.
Tableau
When the DWCC is run against Tableau Server, the following metadata is collected:
Workbook name
Dashboard name
Dashboard title
Project that contains the dashboard
Non-dashboard views
Number of dashboard views
Tags for objects that have them
Relationships between views/dashboards and workbooks
Number of dashboard favorites