How to catalog metadata

Metadata is cataloged using either:

  • The Connection Manager - data.world's GUI for connections, currently available for selected data sources. It is a simple interface for creating connections that can be used both to catalog metadata and to access remote data through a virtual connection. Here is an example of the Connection Manager interface:

    [Image: Connection Manager interface for Snowflake]
  • DWCC - a data.world program created to catalog metadata. DWCC runs either from a Docker container or from a .jar file. We recommend running DWCC inside a Docker container. If you cannot use Docker, we can provide you with a .jar file containing the correct DWCC version for your implementation. See our DWCC FAQ for more information about DWCC.

Where to get a DWCC collector

The DWCC collectors are distributed as images on Dockerhub. If you run DWCC from Docker, the run command first looks for the image locally; if the image is not found, it is downloaded from Dockerhub automatically:

[Image: DWCC and the CLI]

If you are running DWCC from a .jar file, you will get the correct file from customer support.
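For the Docker case, the look-locally-then-download behavior can also be made explicit before a run. The following is a minimal sketch, assuming Docker is installed and on the PATH; it uses only the standard `docker image inspect` and `docker pull` commands, and the helper names (`image_is_local`, `pull_command`, `ensure_image`) are hypothetical, not part of DWCC.

```python
import subprocess
from typing import List


def image_is_local(image: str) -> bool:
    """Return True if `docker image inspect` finds the image locally.

    Assumes the `docker` executable is available on the PATH.
    """
    result = subprocess.run(
        ["docker", "image", "inspect", image],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def pull_command(image: str) -> List[str]:
    """Build the argument list for `docker pull <image>`."""
    return ["docker", "pull", image]


def ensure_image(image: str) -> None:
    """Pull the image from the registry only if it is not already local."""
    if not image_is_local(image):
        subprocess.run(pull_command(image), check=True)


# Example (not executed here, since it needs a running Docker daemon):
# ensure_image("datadotworld/dwcc:2.36")
```

Pulling ahead of time is optional, since the run command downloads a missing image on its own. It can still be useful for confirming that the version name is spelled correctly before starting a longer cataloging run.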

If you are unsure which version of a DWCC collector to use, the most current releases of the collectors are always listed in the Catalog collector change log. However, if you don't know the complete version name, or if you would like to see a list of the DWCC collector versions, you can go to our Dockerhub repositories. There are two repositories, one for released versions and one for release candidate versions:

  • datadotworld/dwcc - Contains all of the officially released versions of the DWCC.

  • datadotworld/dwcc-rc - Contains the "release candidate" versions. Release candidates are test versions; they are neither officially supported nor officially released. They are primarily used for quick customer fixes until the official release comes out.

Caution

Do not use the versions named Latest from either repository. Specify only numeric releases (e.g., dwcc:2.36).

Warning

Do not use a release candidate (rc) version of the DWCC unless you have been explicitly directed to do so by your customer success or support representative.
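The two rules above can be enforced in a wrapper script before any Docker command is issued. The sketch below is illustrative only: the function `check_image` and its `allow_rc` parameter are hypothetical, and the tag patterns are assumptions inferred from the examples in this document (dotted numbers such as 2.36 for releases, and the 2.37-rc-0003 shape for release candidates).

```python
import re

RELEASE_REPO = "datadotworld/dwcc"
RC_REPO = "datadotworld/dwcc-rc"

# Assumed tag shapes, based only on the examples shown in this document.
_NUMERIC_TAG = re.compile(r"^\d+(\.\d+)+$")
_RC_TAG = re.compile(r"^\d+(\.\d+)+-rc-\d{4}$")


def check_image(image: str, allow_rc: bool = False) -> str:
    """Return the image name unchanged if it follows the rules, else raise."""
    repo, sep, tag = image.partition(":")
    if not sep or not tag:
        raise ValueError("no version given; specify a numeric release")
    if tag.lower() == "latest":
        raise ValueError("do not use the version named Latest")
    if repo == RELEASE_REPO:
        if not _NUMERIC_TAG.match(tag):
            raise ValueError(f"not a numeric release: {tag}")
        return image
    if repo == RC_REPO:
        if not allow_rc:
            raise ValueError("release candidates need explicit direction from support")
        if not _RC_TAG.match(tag):
            raise ValueError(f"unexpected release candidate name: {tag}")
        return image
    raise ValueError(f"unknown repository: {repo}")


print(check_image("datadotworld/dwcc:2.36"))  # prints datadotworld/dwcc:2.36
```

Keeping `allow_rc` off by default means a release candidate can only be selected deliberately, for example after a support representative has asked for it.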

The name you specify on the CLI should match exactly the version name on Dockerhub. For example:

  • The name of the DWCC collector version 2.36 is datadotworld/dwcc:2.36

  • The name of the third release candidate of DWCC collector version 2.37 is datadotworld/dwcc-rc:2.37-rc-0003 (RC numbers are padded to four digits).
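The naming pattern in these two examples can be reproduced with a small helper, which avoids padding mistakes when typing RC names by hand. This is a sketch that assumes every version follows the same pattern as the two examples above; the function names `release_image` and `rc_image` are hypothetical.

```python
def release_image(version: str) -> str:
    """Build the name of an officially released collector image."""
    return f"datadotworld/dwcc:{version}"


def rc_image(version: str, rc_number: int) -> str:
    """Build the name of a release candidate image.

    The RC number is zero-padded to four digits, as in 2.37-rc-0003.
    """
    if not 0 <= rc_number <= 9999:
        raise ValueError("RC number must fit in four digits")
    return f"datadotworld/dwcc-rc:{version}-rc-{rc_number:04d}"


print(release_image("2.36"))  # prints datadotworld/dwcc:2.36
print(rc_image("2.37", 3))    # prints datadotworld/dwcc-rc:2.37-rc-0003
```

The result should still be compared against the version name listed on Dockerhub, since the name given on the CLI has to match it exactly.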