About the Snowflake collector
Use this collector to directly harvest metadata from Snowflake data warehouse including tabular objects, policies, and tags, as well as functions and stored procedures from Snowpark. You can also harvest column statistics and table query counts.
Note
The latest version of the Collector is 2.159. To view the release notes for this version and all previous versions, please go here.
What is cataloged
The collector catalogs the following information.
Object | Information cataloged |
---|---|
Columns | Name, Description, Data Type, Is Nullable, Default Value, Key type (Primary, foreign) |
Table | Name, Description, Primary key, Schema |
Views | Name, Description, SQL definition |
Schema | Identifier |
Database | Type, Name, Identifier, Server, Port, Environment, JDBC URL |
User-defined functions (this feature is a part of Snowpark) | Name, Description, Owner, Function Type, Signature, Definition, Return Type, Language, Packages |
Stored Procedures (this feature is a part of Snowpark) | Name, Description, Owner, Signature, Stored Procedure Type, Definition, Return Type, Language, Packages |
Cataloging Snowflake Tags and Policies
Following additional information is cataloged when you run the collector with the --policy-collection and --tag-collection parameters.
Object | Information cataloged |
---|---|
Columns | Snowflake masking policies |
Snowflake Tags | Name |
Snowflake Tag Values | Name, Value |
Snowflake Masking Policies | Name, Date created, Snowflake tags, Owner, Type, Policy body, Schema, Database |
Snowflake Row access policies | Name, Date created, Snowflake tags, Owner, Type, Policy body, Schema, Database |
Cataloging Snowflake table query counts
Following additional information is cataloged when you run the collector with the --table-usage-collection and --table-usage-lookback-days parameters.
Object | Information cataloged |
---|---|
Table | Query Count |
Relationships between objects
By default, the harvested metadata includes catalog pages for the following resource types. Each catalog page has a relationship to the other related resource types. If the metadata presentation for this data source has been customized with the help of the data.world Solutions team, you may see other resource pages and relationships.
Resource page | Relationship |
---|---|
Snowflake Row access policies | Tables |
Snowflake Masking Policies | Columns, Tag value |
Table | Columns, Tag value |
Snowflake Tags | Tag value |
Columns | Table, Tag Value |
Lineage for Snowflake
The following lineage information is collected by the Snowflake collector.
Note
Note that the collector traces these relationships from a View’s columns to ultimate source Table columns across SQL expressions and subqueries.
Object | Lineage available |
---|---|
Column in View | The collector identifies the associated column in an upstream view or table:
|
User-defined function | A Function and:
A View and:
|
Stored Procedure | The collector identifies:
|
Important details about snowflake tag and policies
The Snowflake collector harvests Snowflake object tags, Snowflake tag-based masking policies, Snowflake masking policies, and Snowflake row access policies. This information enhances the data governance experience to discover all tags and policies applied across their Snowflake tables and columns in a data catalog
In the following example you can see how a Snowflake Tag-Based Masking Policy applied to sensitive data columns: routing numbers, bank name, and bank account number. In this view, you can also see the associated tag (Classification:confidential), and technical details about the policy, like the Policy Body which explains how the Policy works.

Feature | Details |
---|---|
Snowflake object tags | Tags enable data stewards to track sensitive data for compliance, discovery, protection, and resource usage use cases through either a centralized or decentralized data governance management approach. For details, see Snowflake documentation. |
Snowflake masking policies | Masking policies define what data is masked and not shown to users allowing authorized users to access sensitive data at query runtime. For details, see the Snowflake documentation. |
Snowflake tag based masking policies | When a masking policy is applied to a Snowflake tag, the columns with the same tag are automatically protected based on the conditions set in the masking policy. For details, see the Snowflake documentation. |
Snowflake row access policies | Access policies are applied to rows of data, that is, who can see the rows of data, and masking policies define what data is masked and not shown to users. For details about this see the Snowflake documentation. |
Table query count | For each table, the Snowflake collector computes percentile popular and a query count based on the user access history table in Snowflake. This information can then be used to assign popularity ranking to Snowflake tables in data.world. For details about this see the Snowflake documentation. |
When this data is harvested and is displayed in data.world, you see the following information in your catalog: Snowflake masking policies, Snowflake row access policies, Snowflake tag values, and Snowflake tags.

When you browse to a table harvested from Snowflake, you will notice that the appropriate policies are applied to the tables. In the following example, the policy and tag values are applied to the Order table.

Users can click through to the tag values and policies to see their details.
Note
Note that users can also use the Search feature in the application to locate all these resources easily.

Authentication supported
The collector supports the following authentication methods:
Username and password authentication. For details, see the Snowflake documentation.
Username and Key pair authentication. For details, see the Snowflake documentation.