Community docs

Ways to run a collector

There are currently four ways to catalog your metadata. They are listed here in order of ease of use:

  • Connection Manager - A simple interface for creating connections that can be used both to catalog metadata and to access remote data from a virtual connection. You can find out about the Connection Manager here.

  • Create a configuration file (config.yaml) - This is a new option that stores all the information needed to catalog your data sources. It is especially valuable if you have multiple data sources to catalog, because you don't need to run a separate script or CLI command for each one. (DOC IN PROGRESS).

  • Create a configuration script - This option is very similar to the config.yaml option. We have a quick start with instructions to create and run a script here. The instructions are for Snowflake, but you can find the parameters you will need for your particular data source in the parameters section of the configuration document for that data source. The sources we currently collect, and links to their configurations, can be found below.

  • Run the collector through a CLI - This option makes regular, repeating runs of the collector laborious and time-consuming, because the commands must be re-entered for each run. Detailed instructions for getting the collectors, configuring them, and running them are available through the links below.
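
To illustrate why a single configuration is easier to maintain than re-entering commands by hand, here is a minimal Python sketch. It is not the product's real syntax: the `CONFIG` layout, the `run-collector` executable name, and the `--type`, `--name`, and `--output` flags are all hypothetical placeholders invented for this example. The real parameters for each data source are in its configuration document.

```python
import shlex

# Hypothetical stand-in for the contents of a parsed config.yaml.
# Every key and value here is illustrative, not the product's real schema.
CONFIG = {
    "sources": [
        {"name": "warehouse", "type": "snowflake", "output": "/tmp/catalog/warehouse"},
        {"name": "reporting", "type": "postgres", "output": "/tmp/catalog/reporting"},
    ]
}


def build_commands(config, executable="run-collector"):
    """Return one argument list per data source described in config.

    "run-collector" and the flags below are placeholders, not real CLI syntax.
    """
    commands = []
    for source in config["sources"]:
        commands.append([
            executable,
            "--type", source["type"],
            "--name", source["name"],
            "--output", source["output"],
        ])
    return commands


if __name__ == "__main__":
    for argv in build_commands(CONFIG):
        # Print rather than execute, because the command itself is hypothetical.
        print(shlex.join(argv))
```

Adding a data source in this sketch means adding one entry to the configuration, whereas the CLI approach means typing another full command on every run.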