Automating collector runs
Important
This topic applies only to on-premises collector runs.
Consistently harvesting metadata from your data sources is critical to maintaining a modern data ecosystem. Tools such as Amazon ECS, Azure Pipelines, and CircleCI can automate this task, each with its own features and capabilities. Running collectors on a regular schedule keeps the harvested metadata fresh and up to date for your analysis.
Although many tools can meet this need, this documentation focuses on three of them: Azure Pipelines, Amazon ECS, and CircleCI.
Azure Pipelines allows for frequent and regular collector runs, ensuring data harvested from your sources is current and ready for analysis at any time.
Amazon Elastic Container Service (Amazon ECS) allows you to automate your collector runs in an AWS environment.
Likewise, CircleCI can be configured to perform collector runs on the schedule you choose.
If you are running the collectors on a Windows machine using JAR files, instructions are available for configuring the Windows Credential Manager and for setting up scripts that run the collectors automatically on a schedule.
Using these tools, you can build a self-updating system that gathers metadata automatically, saving effort and increasing efficiency.
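Whichever scheduler you choose, the job it triggers is typically a small wrapper around the collector command. The following is a minimal sketch of such a wrapper, not the documented setup for any of the tools above. The function name `run_collector` and its `retries` and `delay_seconds` parameters are hypothetical, and the command passed in is a placeholder for your actual collector invocation.

```python
import subprocess
import sys
import time
from typing import Sequence


def run_collector(command: Sequence[str], retries: int = 2,
                  delay_seconds: float = 0.0) -> int:
    """Run a collector command, retrying when it exits with a non-zero code.

    `command` is whatever invocation starts your collector (hypothetical
    here; substitute the real command). Returns the exit code of the last
    attempt, where 0 means the run succeeded.
    """
    attempts = retries + 1
    exit_code = 1
    for attempt in range(1, attempts + 1):
        # Run the command and wait for it to finish.
        result = subprocess.run(list(command))
        exit_code = result.returncode
        if exit_code == 0:
            break
        print(f"attempt {attempt}/{attempts} failed with exit code {exit_code}",
              file=sys.stderr)
        if attempt < attempts:
            # Pause before the next attempt.
            time.sleep(delay_seconds)
    return exit_code
```

A scheduled job could call `sys.exit(run_collector([...]))` with the real collector command, so the scheduler sees a non-zero exit code and can flag the run as failed.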