Troubleshooting Snowflake collector issues
Collector runtime and troubleshooting
The catalog collector may take anywhere from several seconds to many minutes to run, depending on the size and complexity of the system being crawled.
If the catalog collector runs without issues, you should see no output on the terminal, and a new file matching *.dwec.ttl should appear in the directory you specified for the output.
If there was an issue connecting to the source or running the catalog collector, there will be either a stack trace or a *.log file. Either can be sent to support for investigation if the errors are not clear.
A list of common issues and problems encountered when running the collectors is available here.
Issue 1: Access denied error occurs
The following error message is observed in the error logs: WARN: Access to the snowflake.account_usage database/schema is denied. To harvest Snowflake tag, policy, and/or table usage information into the catalog, use a role with adequate permissions (current role is X).
Cause: To harvest policies, tags, and table usage, the role that the collector uses must have READ access to the ACCOUNT_USAGE schema in the SNOWFLAKE database. The role the collector uses does not have permissions to the Snowflake database.
Solution: Run the recommended statement to grant permissions to the Snowflake database.
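The exact statement is not reproduced here. As a minimal sketch, assuming the standard Snowflake approach of granting imported privileges on the shared SNOWFLAKE database, it could look like the following. <role name> is a placeholder for the role the collector uses, and the statement is assumed to be run by ACCOUNTADMIN.

```sql
-- Assumption: run as ACCOUNTADMIN (or another role allowed to grant on the SNOWFLAKE database).
USE ROLE ACCOUNTADMIN;

-- Gives the collector's role read access to the SNOWFLAKE database, including ACCOUNT_USAGE.
-- <role name> is a placeholder for the role the collector uses.
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE <role name>;
```

After the grant, rerun the collector with the same role and confirm that the warning no longer appears in the logs.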
Issue 2: Some tables, views, or other database objects are not harvested
Description: The account and role used to run the collector do not have permissions on a specific table.
Solution: Run the following query to check that the account and role used for the collector have permissions on the specific table.
SELECT privilege_type FROM <database name>.information_schema.table_privileges WHERE table_name = '<table name>';
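The query above returns privileges for every grantee it can see. A variation that narrows the result to the collector's role, followed by a hedged example of granting a missing privilege, might look like this. <schema name> and <role name> are placeholders that do not appear in the original query, and SELECT is assumed to be the privilege the collector needs.

```sql
-- Show only the privileges held by the collector's role on the table.
-- Unquoted identifiers are stored in uppercase, so <table name> usually needs to be uppercase here.
SELECT grantee, privilege_type
FROM <database name>.information_schema.table_privileges
WHERE table_name = '<table name>'
  AND grantee = '<role name>';

-- If no row is returned, grant the privilege (assumption: SELECT is sufficient).
GRANT SELECT ON TABLE <database name>.<schema name>.<table name> TO ROLE <role name>;
```

The role also needs USAGE on the containing database and schema before it can access the table.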
Issue 3: Collector takes a long time to complete the run
Description: The collector takes a long time to complete the run.
Possible solutions:
Increase the warehouse size in Snowflake.
Reduce the target sample size (--target-sample-size parameter) for column statistics collection.
Instead of running the collector for all schemas, specify individual schemas (--schema parameter) to harvest from.
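For the first option, a minimal sketch of checking and resizing the warehouse could look like the following. <warehouse name> is a placeholder for the warehouse the collector uses, and 'LARGE' is only an example size, not a recommendation.

```sql
-- Check the current size of the warehouse the collector uses.
SHOW WAREHOUSES LIKE '<warehouse name>';

-- Resize the warehouse; 'LARGE' is an example value.
ALTER WAREHOUSE <warehouse name> SET WAREHOUSE_SIZE = 'LARGE';
```

Larger warehouses consume more credits per hour, so consider resizing back down once the collector run completes.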
Issue 4: Recently created or modified Snowflake objects are not appearing in the catalog
Description: You are using incremental metadata collection with the Snowflake Cloud Collector, but some newly created or modified objects are not showing up in the catalog after a run.
Cause: Snowflake logs metadata changes to the account_usage schema with an approximate latency of up to 3 hours. The collector uses this schema to detect changes, so any objects updated or created shortly before a run may not be logged in time to be included. This issue only affects collectors configured with incremental collection mode.
Solution: Allow at least 3 hours between the time changes are made in Snowflake and when the collector is scheduled to run. This ensures changes are captured in account_usage and picked up by the next incremental run. To avoid missing updates, do not schedule incremental runs at intervals shorter than 3 hours. For more details, refer to the Snowflake documentation and the incremental collection guidance.
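To get a rough sense of how current account_usage is before scheduling a run, a query along these lines may help. This is a sketch under assumptions: the TABLES view is used as a representative example, and the views the collector actually reads are not specified here.

```sql
-- Assumption: TABLES is used as a representative ACCOUNT_USAGE view.
-- Compares the most recent logged table change with the current time.
SELECT MAX(last_altered)   AS latest_logged_change,
       CURRENT_TIMESTAMP() AS checked_at
FROM snowflake.account_usage.tables;
```

If latest_logged_change is earlier than the time of your own change, that change has probably not been logged yet and is unlikely to be picked up by an incremental run started now.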