
Collector FAQ

How many collectors do you need?

A single DWCC can catalog as many data sources as you have. For each source, change the name of the catalog source and the other parameters on the command line.

What is needed to run a catalog collector?

Because the collectors are shipped as Docker images, you need to have Docker installed on your local machine. If you can't use Docker, a Java version of the collectors is also available. For more information about Docker, see https://docs.docker.com/get-docker/.

The computer running the catalog collector should have network access to the data source.

The user running the catalog collector must have read access to the data source.

For many data sources you will also need the JDBC driver for that data source installed on the local machine. DWCC assumes the driver .jar file is in the ../jdbcdrivers directory.

Finally, a minimum of 2 GB of memory and a 2 GHz processor is required for all sources. Certain data sources (like BigQuery) may have additional requirements.
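
Several of the requirements above can be checked before a collector run. The following is a minimal Python sketch, not part of DWCC itself. The function names are hypothetical, the ../jdbcdrivers path is resolved relative to the current working directory (an assumption), and the host and port passed to can_reach are whatever your data source listens on. It does not check memory, processor speed, or read access.

```python
import shutil
import socket
from pathlib import Path


def docker_available() -> bool:
    """Return True if a `docker` executable is found on the PATH."""
    return shutil.which("docker") is not None


def find_jdbc_drivers(driver_dir: str = "../jdbcdrivers") -> list:
    """Return the sorted names of .jar files in driver_dir.

    Returns an empty list if the directory does not exist.
    """
    path = Path(driver_dir)
    if not path.is_dir():
        return []
    return sorted(p.name for p in path.glob("*.jar"))


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, calling can_reach("db.example.com", 5432) with your own host and port in place of these placeholders gives a quick indication of whether the machine has network access to the data source. A successful TCP connection does not prove that the user has read access.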

What operating system does the collector Docker image use?

Debian Buster (the development codename for Debian 10).