Troubleshooting Snowflake collector issues

Collector runtime and troubleshooting

The catalog collector may take anywhere from several seconds to many minutes to run, depending on the size and complexity of the system being crawled.

  • If the catalog collector runs without issues, you should see no output on the terminal, but a new file matching *.dwec.ttl should appear in the directory you specified for the output.

  • If there was an issue connecting or running the catalog collector, there will be either a stack trace or a *.log file. Either can be sent to support for investigation if the errors are not clear.

A list of common issues and problems encountered when running the collectors is available here.

Issue 1: Access denied error occurs

The following error message is observed in the error logs: WARN: Access to the snowflake.account_usage database/schema is denied. To harvest Snowflake tag, policy, and/or table usage information into the catalog, use a role with adequate permissions (current role is X).

  • Cause: To harvest policies, tags, and table usage, the role that the collector uses must have read access to the snowflake database. The role currently in use does not have permissions to the snowflake database.

  • Solution: Run the recommended statement to grant permissions to the snowflake database.
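As an illustration of what such a grant can look like, the sketch below uses Snowflake's standard mechanism for giving a role read access to the shared SNOWFLAKE database. It is an assumption about the statement intended here, not a quote from the collector; if the collector's documentation or log output recommends a different statement, prefer that one. <collector role> is a placeholder for the role the collector connects with.

```sql
-- Run as a role allowed to grant on the shared database (typically ACCOUNTADMIN).
USE ROLE ACCOUNTADMIN;

-- <collector role> is a placeholder, not a real role name.
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE <collector role>;

-- Confirm the grant is in place.
SHOW GRANTS TO ROLE <collector role>;
```

After the grant, rerun the collector and check that the warning no longer appears in the logs.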

Issue 2: Some tables, views, or other database objects are not harvested

  • Description: The account and role used to run the collector do not have permissions to a specific table.

  • Solution: Run the following query to check that the account and the role used for the collector have permissions to the specific table.

    -- List the privileges granted on the table and the roles that hold them.
    SELECT grantee, privilege_type
    FROM <database name>.information_schema.table_privileges
    WHERE table_name = '<table name>';
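If the check shows that the collector's role is missing a privilege, the gap can usually be closed with ordinary Snowflake grants. The statements below are a minimal sketch under that assumption; <collector role> and <schema name> are placeholders introduced for illustration, and the privileges your collector actually needs may be broader than SELECT on a single table.

```sql
-- List everything currently granted to the role the collector uses.
SHOW GRANTS TO ROLE <collector role>;

-- A role needs USAGE on the containing database and schema
-- before a SELECT grant on the table has any effect.
GRANT USAGE ON DATABASE <database name> TO ROLE <collector role>;
GRANT USAGE ON SCHEMA <database name>.<schema name> TO ROLE <collector role>;
GRANT SELECT ON TABLE <database name>.<schema name>.<table name> TO ROLE <collector role>;
```

Note that Snowflake stores unquoted identifiers in uppercase, so the table name in the privilege check normally needs to be written in uppercase to match.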

Issue 3: Collector takes a long time to complete the run

  • Description: The collector takes a long time to complete the run.

  • Possible solutions:

    1. Increase the warehouse size in Snowflake.

    2. Reduce the target sample size (--target-sample-size parameter) for column statistics collection.

    3. Instead of running the collector for all schemas, specify individual schemas (--schema parameter) to harvest from.
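For the first option, resizing is done on the Snowflake side rather than in the collector. A minimal sketch, assuming the collector runs on a dedicated warehouse whose name is substituted for the <warehouse name> placeholder, and that your role is allowed to modify it:

```sql
-- Check the current size of the warehouse the collector uses.
SHOW WAREHOUSES LIKE '<warehouse name>';

-- Move up one size, for example from SMALL to MEDIUM.
ALTER WAREHOUSE <warehouse name> SET WAREHOUSE_SIZE = 'MEDIUM';
```

Larger warehouses consume more credits per hour, so it may be worth trying the sample-size and schema options first and resizing only if the run is still too slow.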