About the Azure Data Lake Storage Gen2 collector

Use this collector to harvest metadata directly from your Azure Data Lake Storage Gen2 instance, including its storage accounts, containers, and files.


The latest version of the collector is 2.159. For details on this version and all previous versions, see the release notes.

Authentication supported

Authenticate to Azure Data Lake Storage Gen2 using a service principal.
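
The collector's own credential parameters are not listed in this section. As general background, a service principal is typically identified by a tenant ID, a client ID, and a client secret (or certificate), and it obtains an access token through the OAuth 2.0 client-credentials flow. The Python sketch below only builds such a token request and does not send it. The function name build_token_request and the placeholder values are hypothetical, and this is not necessarily how the collector authenticates internally.

```python
from typing import Tuple
from urllib.parse import urlencode

# Scope that requests a token for Azure Storage resources.
STORAGE_SCOPE = "https://storage.azure.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Tuple[str, str]:
    """Return the token endpoint URL and the form-encoded body for a
    client-credentials request. Nothing is sent over the network."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": STORAGE_SCOPE,
    })
    return url, body


if __name__ == "__main__":
    # Placeholder values for illustration only, not real credentials.
    url, body = build_token_request("my-tenant-id", "my-client-id", "my-secret")
    print(url)
```

The three values passed in are the ones you would normally collect when registering the service principal, so it helps to have them at hand before configuring the collector.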

What is cataloged

The collector catalogs the following information from Azure Data Lake Storage Gen2.

Table 1. Information collected

Storage Account: Name, Description, Last Modified, Resource Group name, Region Name, Creation Time, Subscription ID, Account Status, Account Kind, Access Control, Access Tier, Provisioning State, Tags

Container: Name, Description, Server, Last Modified, Metadata, Subscription ID, Entity Tag, Public Access, Access Control

Blob: Name, Description, File URL, File Path, Blob Type, Content Length, Creation Time, Last Modified, Metadata, Subscription ID, Entity Tag, Access Control

Relationships between objects

By default, the catalog includes pages for the resource types listed below, and each page has relationships to related resource types. The catalog presentation and relationships are fully configurable, so this table describes only the default configuration.

Table 2.

Resource page: Storage Account

  • Relationship to Containers contained within Storage Account

Resource page: Container

  • Relationship to Blobs contained within Container

  • Relationship to Storage Account containing Container

Resource page: Blob

  • Relationship to Container containing Blob
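
The default relationships above form a simple two-way hierarchy. A minimal Python sketch of that structure follows. The class and method names are hypothetical and chosen only for illustration; they are not the collector's or the catalog's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Blob:
    name: str
    # Back-reference: the Container containing this Blob.
    container: Optional["Container"] = field(default=None, repr=False, compare=False)


@dataclass
class Container:
    name: str
    # Back-reference: the Storage Account containing this Container.
    storage_account: Optional["StorageAccount"] = field(default=None, repr=False, compare=False)
    # Forward reference: Blobs contained within this Container.
    blobs: List[Blob] = field(default_factory=list)

    def add_blob(self, blob: Blob) -> None:
        blob.container = self
        self.blobs.append(blob)


@dataclass
class StorageAccount:
    name: str
    # Forward reference: Containers contained within this Storage Account.
    containers: List[Container] = field(default_factory=list)

    def add_container(self, container: Container) -> None:
        container.storage_account = self
        self.containers.append(container)


if __name__ == "__main__":
    account = StorageAccount("examplestorage")
    raw = Container("raw")
    account.add_container(raw)
    raw.add_blob(Blob("2024/events.csv"))
    print(account)
```

Each object can be reached from its parent and can reach its parent in turn, which mirrors how a Container page links both down to its Blobs and up to its Storage Account.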

Important things to note about maximum resource limits

  • By default, the collector harvests metadata from Azure Data Lake Storage Gen2 for Storage Accounts that contain up to 10,000 objects each. If a Storage Account contains more than 10,000 objects, set the --max-resource-limit parameter to a higher value, up to a maximum of 10 million. If the contents of a Storage Account exceed the configured limit, the collector skips that Storage Account and logs a warning message for it.
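
The limit rule can be sketched in Python as follows. This is an illustration of the behavior described above, not the collector's code: the function select_accounts, the logger name, and the choice to raise ValueError for a limit above 10 million are all assumptions.

```python
import logging
from typing import Dict, List

DEFAULT_MAX_RESOURCE_LIMIT = 10_000       # default stated above
MAX_ALLOWED_RESOURCE_LIMIT = 10_000_000   # upper bound stated above

logger = logging.getLogger("adls_limit_sketch")  # hypothetical logger name


def select_accounts(object_counts: Dict[str, int],
                    max_resource_limit: int = DEFAULT_MAX_RESOURCE_LIMIT) -> List[str]:
    """Return the Storage Accounts that would be harvested.

    object_counts maps a Storage Account name to its number of objects.
    Accounts whose count exceeds the limit are skipped with a warning.
    """
    # Assumption: a limit above the documented maximum is rejected outright.
    if max_resource_limit > MAX_ALLOWED_RESOURCE_LIMIT:
        raise ValueError("max_resource_limit cannot exceed 10 million")

    harvested = []
    for name, count in object_counts.items():
        if count > max_resource_limit:
            logger.warning(
                "Skipping Storage Account %s: %d objects exceed the limit of %d",
                name, count, max_resource_limit,
            )
            continue
        harvested.append(name)
    return harvested


if __name__ == "__main__":
    counts = {"small": 500, "big": 25_000}
    print(select_accounts(counts))                             # default limit
    print(select_accounts(counts, max_resource_limit=50_000))  # raised limit
```

With the default limit, only the account holding 500 objects is kept. Raising the limit to 50,000 keeps both, which corresponds to passing a larger value to --max-resource-limit.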