Skip to main content

About the Snowflake collector

Use this collector to directly harvest metadata from Snowflake data warehouse including tabular objects, policies, and tags, as well as functions and stored procedures from Snowpark. You can also harvest column statistics and table query counts.

Note

The latest version of the Collector is 2.159. To view the release notes for this version and all previous versions, please go here.

What is cataloged

The collector catalogs the following information.

Table 1.

Object

Information cataloged

Columns

Name, Description, Data Type, Is Nullable, Default Value, Key type (Primary, foreign)

Table

Name, Description, Primary key, Schema

Views

Name, Description, SQL definition

Schema

Identifier

Database

Type, Name, Identifier, Server, Port, Environment, JDBC URL

User-defined functions

(this feature is a part of Snowpark)

Name, Description, Owner, Function Type, Signature, Definition, Return Type, Language, Packages

Stored Procedures

(this feature is a part of Snowpark)

Name, Description, Owner, Signature, Stored Procedure Type, Definition, Return Type, Language, Packages



Cataloging Snowflake Tags and Policies

Following additional information is cataloged when you run the collector with the --policy-collection and --tag-collection parameters.

Table 3.

Object

Information cataloged

Columns

Snowflake masking policies

Snowflake Tags

Name

Snowflake Tag Values

Name, Value

Snowflake Masking Policies

Name, Date created, Snowflake tags, Owner, Type, Policy body, Schema, Database

Snowflake Row access policies

Name, Date created, Snowflake tags, Owner, Type, Policy body, Schema, Database



Cataloging Snowflake table query counts

Following additional information is cataloged when you run the collector with the --table-usage-collection and --table-usage-lookback-days parameters.

Table 4.

Object

Information cataloged

Table

Query Count



Relationships between objects

By default, the harvested metadata includes catalog pages for the following resource types. Each catalog page has a relationship to the other related resource types. If the metadata presentation for this data source has been customized with the help of the data.world Solutions team, you may see other resource pages and relationships.

Table 5.

Resource page

Relationship

Snowflake Row access policies

Tables

Snowflake Masking Policies

Columns, Tag value

Table

Columns, Tag value

Snowflake Tags

Tag value

Columns

Table, Tag Value



Lineage for Snowflake

The following lineage information is collected by the Snowflake collector.

Note

Note that the collector traces these relationships from a View’s columns to ultimate source Table columns across SQL expressions and subqueries.

Table 6.

Object

Lineage available

Column in View

The collector identifies the associated column in an upstream view or table:

  • Where the data is sourced from

  • That sort the rows via ORDER BY

  • That filter the rows via WHERE/HAVING

  • That aggregate the rows via GROUP BY

User-defined function

A Function and:

  • Views referenced in Function

  • Tables referenced in Function

  • Functions referenced in Function

A View and:

  • Function referencing View

Stored Procedure

The collector identifies:

  • The associated column in an upstream view or table

    • Where the data is sourced from

    • That sort the rows via ORDER BY

    • That filter the rows via WHERE/HAVING

    • That aggregate the rows via GROUP BY

  • The downstream table that has its tables updated



Important details about snowflake tag and policies

The Snowflake collector harvests Snowflake object tags, Snowflake tag-based masking policies, Snowflake masking policies, and Snowflake row access policies. This information enhances the data governance experience to discover all tags and policies applied across their Snowflake tables and columns in a data catalog

In the following example you can see how a Snowflake Tag-Based Masking Policy applied to sensitive data columns: routing numbers, bank name, and bank account number. In this view, you can also see the associated tag (Classification:confidential), and technical details about the policy, like the Policy Body which explains how the Policy works.

policy_info.png
Table 7.

Feature

Details

Snowflake object tags

Tags enable data stewards to track sensitive data for compliance, discovery, protection, and resource usage use cases through either a centralized or decentralized data governance management approach. For details, see Snowflake documentation.

Snowflake masking policies

Masking policies define what data is masked and not shown to users allowing authorized users to access sensitive data at query runtime. For details, see the Snowflake documentation.

Snowflake tag based masking policies

When a masking policy is applied to a Snowflake tag, the columns with the same tag are automatically protected based on the conditions set in the masking policy. For details, see the Snowflake documentation.

Snowflake row access policies

Access policies are applied to rows of data, that is, who can see the rows of data, and masking policies define what data is masked and not shown to users. For details about this see the Snowflake documentation.

Table query count

For each table, the Snowflake collector computes percentile popular and a query count based on the user access history table in Snowflake. This information can then be used to assign popularity ranking to Snowflake tables in data.world. For details about this see the Snowflake documentation.



When this data is harvested and is displayed in data.world, you see the following information in your catalog: Snowflake masking policies, Snowflake row access policies, Snowflake tag values, and Snowflake tags.

snowflake_harvested_info.png

When you browse to a table harvested from Snowflake, you will notice that the appropriate policies are applied to the tables. In the following example, the policy and tag values are applied to the Order table.

snowflake_harvested_info02.png

Users can click through to the tag values and policies to see their details.

Note

Note that users can also use the Search feature in the application to locate all these resources easily.

snowflake_harvested_info03.png

Authentication supported

The collector supports the following authentication methods: