Viewing metadata collectors summary page
The metadata collectors summary page provides access to the following two things:
A list of collector configurations generated using the Collector wizard. This table shows only the configurations generated after this feature was launched (September 20, 2023).
A list of collector runs feeding data into an organization's Catalog. Collector runs appear on this page when the metadata harvested by the collector is successfully uploaded (synced) to the ddw-catalogs dataset.
Managing configuration details for collectors
On the Organization profile page, go to the Settings tab > Metadata collectors section.
Click the Add a collector button to launch the Collector wizard, which helps you generate the YAML file or CLI command for running a collector. You can find the details for running the wizard for each collector on the respective collector pages.
Once you have run the wizard and saved the configuration, the configuration is listed on the page even if the collector has not run yet. You can view the following details: the name of the saved configuration, the data source, the destination collection, and the last execution time and status.
To edit a configuration, open the Three dot menu and click the Edit configuration button. Run through the wizard and edit the configuration as needed.
Important things to note:
The Collector wizard updates automatically when new versions are released. If the collector command has been updated since you last saved your configuration, you will see a message letting you know about the change.
If you do not wish to edit the configuration but just want to view the instructions for running the collector, open the Three dot menu and click the View run instructions button. Run through the wizard to view the instructions and download the YAML file or copy the CLI command.
Important things to note:
The Collector wizard updates automatically when new versions are released. If the collector command has been updated since you last saved your configuration, you will see a message letting you know about the change.
Changes made to collectors between versions sometimes introduce new parameters. Check the collector release notes to see whether you should rerun the Collector wizard to use these new parameters.
To delete a configuration, open the Three dot menu and click the Delete configuration button.
Important things to note:
Deleting a configuration does not affect the resources that were collected from previous runs.
All future runs of the collector continue without interruption. Deleting a configuration has no effect on the collector runs.
Managing catalog metadata sources
Use this section of the page to see the sources powering your catalog as well as the status of the most recent run. Here you will find a list of all the collector runs feeding data into an organization's Catalog.
This page lists the collectors run using the Connection Manager and the collectors that are run on-premise.
Collector runs appear on this page when the metadata harvested by the collector is successfully uploaded to the ddw-catalogs dataset.
The page displays all collector runs for collector version 2.60 and higher.
If the collector configuration is deleted from the Configured collectors section, the Catalog metadata sources section will continue to show the runs for the collector as long as the metadata harvested by the collector is successfully uploaded to the ddw-catalogs dataset.
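The listing rules above can be summarized in a short sketch. This is only an illustration of the stated conditions, not the product's implementation: the CollectorRun record, its field names, and the function names are hypothetical, and the sketch assumes that version strings such as "2.60" compare numerically part by part.

```python
from dataclasses import dataclass


@dataclass
class CollectorRun:
    """Hypothetical record of one collector run (not a data.world API type)."""
    collector_version: str               # e.g. "2.60"
    uploaded_to_catalog: bool            # metadata reached the ddw-catalogs dataset
    configuration_deleted: bool = False  # saved wizard configuration was removed


# Assumption: "2.60" means major version 2, minor version 60.
MIN_LISTED_VERSION = (2, 60)


def parse_version(version: str) -> tuple:
    """Turn "2.60" into (2, 60) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def appears_in_catalog_metadata_sources(run: CollectorRun) -> bool:
    """Apply the listing rules described above to a single run."""
    new_enough = parse_version(run.collector_version) >= MIN_LISTED_VERSION
    # configuration_deleted is intentionally ignored: deleting a saved
    # configuration does not hide runs whose upload succeeded.
    return run.uploaded_to_catalog and new_enough
```

For example, a run from a sufficiently recent collector whose upload succeeded stays listed even after its configuration is deleted, while a run whose upload failed is never listed.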
To view the metadata collectors summary page:
On the Organization profile page, go to the Settings tab.
Go to the Metadata collectors section. The Catalog metadata sources table shows the following information for each run of the collector:
Data source: What the collector was run against.
Target: The specific target from which metadata is collected. For example, the name of the database or schema for database sources, or the name of the project, workbook, and so on for non-database sources.
Run location: The House icon indicates that the collector is run on-premise. The Cloud icon indicates that the collector is run using the Connection Manager.
Destination: Collection where collected metadata resources appear. Click the collection name to navigate to the collection page. If you do not have access to the collection, you may see a 404 error page.
Status: Hover over the status to see the exact completion time, run time, and number of resources collected.
Complete (green color): Collector run successful, without exceptions.
Complete (yellow color): Collector run successful, with exceptions. Click View messages to see the list of errors. Exceptions are not uncommon and likely just mean that the collector had an issue accessing a particular table.
Failed: Collector run unsuccessful. Click View messages to see the list of errors.
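As a rough illustration of how the three statuses relate, the sketch below maps a run outcome to the status shown in the table. The RunStatus enum, the classify_run function, and its parameters are hypothetical names chosen for this example; they are not part of any data.world API, and the page itself may derive the status differently.

```python
from enum import Enum


class RunStatus(Enum):
    """Statuses as described above; the enum itself is hypothetical."""
    COMPLETE = "Complete (green)"
    COMPLETE_WITH_EXCEPTIONS = "Complete (yellow)"
    FAILED = "Failed"


def classify_run(succeeded: bool, exception_count: int) -> RunStatus:
    """Map a run outcome to the status described in the list above."""
    if not succeeded:
        # Unsuccessful run: View messages lists the errors.
        return RunStatus.FAILED
    if exception_count > 0:
        # Successful run that still hit exceptions, for example a table
        # the collector could not access.
        return RunStatus.COMPLETE_WITH_EXCEPTIONS
    return RunStatus.COMPLETE
```

A run that succeeds but reports one or more exceptions is classified as Complete (yellow), which matches the note that exceptions alone do not make a run fail.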