What is cataloged

The catalog collectors pull only metadata from the source; they do not collect any of the data itself. For databases, the metadata gathered includes the number of tables and columns, the names of the tables and columns, key information, and the data types used, all of which is useful to data analysts.

Depending on your data source, additional metadata might be cataloged. What is cataloged also changes as we release new versions of the collectors. Detailed information about what is collected, to the extent we have it, is in the documentation for each collector.

JDBC data sources

When the DWCC is run against a JDBC data source, the following metadata is collected:

  • database name

  • connection information

  • schema name

  • table and view names by schema

  • column names

  • column datatypes

  • column length

  • column precision (as appropriate)

  • table and column descriptions (if they exist)

Primary and foreign key information is also collected by the DWCC, but it is not currently displayed in the platform.
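
To make the difference between metadata and data concrete, the sketch below reads table names, column names, datatypes, and primary key flags from a database without reading any rows. It is only an illustration and not how the DWCC is implemented: it uses Python's built-in sqlite3 module instead of JDBC, and the collect_metadata function and the customers table are hypothetical names chosen for the example.

```python
import sqlite3


def collect_metadata(conn):
    """Return table/view and column metadata without reading any row data.

    Hypothetical helper for illustration; not part of the DWCC.
    """
    catalog = {}
    # sqlite_master lists schema objects; internal sqlite_* tables are skipped.
    objects = conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for name, kind in objects:
        # Quote the identifier so unusual table names are handled safely.
        quoted = '"' + name.replace('"', '""') + '"'
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        catalog[name] = {
            "kind": kind,
            "columns": [
                {"name": col[1], "datatype": col[2], "primary_key": bool(col[5])}
                for col in columns
            ],
        }
    return catalog


# Assumed sample schema with one row of data that must not be collected.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name VARCHAR(50))")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")

catalog = collect_metadata(conn)
print(catalog)
```

The resulting catalog describes the customers table and its two columns, but the row value stored in the table never appears in it.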

JDBC sources include:

  • Athena

  • Databricks

  • DB2

  • Denodo

  • Dremio

  • Hive

  • Infor ION

  • MySQL*

  • Oracle

  • PostgreSQL

  • Presto

  • Redshift

  • Snowflake

  • SQL Anywhere

  • SQL Server

Note

* For MS SQL Server, table and column descriptions are not cataloged, even if they exist.

Tableau

When the DWCC is run against Tableau Server, the following metadata is collected:

  • Workbook name

  • Dashboard name

  • Dashboard title

  • Project a dashboard is in

  • Non-dashboard views

  • Number of dashboard views

  • Tags for objects that have them

  • Relationships between views/dashboards and workbooks

  • Number of dashboard favorites