About the Dremio collector
Use this collector to harvest metadata for Dremio tables and columns across your enterprise systems and make it searchable and discoverable in data.world.
Important
The Dremio collector can be run in the cloud or on-premises using Docker or JAR files.
Note
The latest version of the collector is 2.316. To view the release notes for this version and all previous versions, see the release notes.
Dremio version supported
The collector supports Dremio version 26.0.
Authentication supported
The collector supports username/password authentication to Dremio.
What is cataloged
The collector catalogs the following information.
| Object | Information cataloged |
|---|---|
| Columns | Name, Description, JDBC type, Column type, Is nullable, Default value, Key type (Primary, Foreign), Column size, Column index |
| Table | Name, Description, Primary key, Schema |
| Table Index | Index cardinality, Column name, Index type, Index name, Is non-unique, Column ordinal position, Pages, Column sort sequence |
| Views | Name, Description |
| Schema | Identifier, Name |
| Database | Type, Name, Identifier, Server, Port, Environment, JDBC URL |
Relationships between objects
By default, the harvested metadata includes catalog pages for the following resource types. Each catalog page has a relationship to the other related resource types. If the metadata presentation for this data source has been customized with the help of the data.world Solutions team, you may see other resource pages and relationships.
| Resource page | Relationship |
|---|---|
| Table | Columns, Table Indexes |
| Columns | Table, Table Indexes |
| Table Indexes | Columns |
| Schema | Database that contains Schema, Table that is part of Schema |
| Database | Schema contained in Database |
Lineage for Dremio
Lineage collection requires the Dremio API server (--api-server) parameter. The collector obtains inter-table lineage from the Dremio built-in catalog graph via the REST API, and writes lineage relationships for files and datasets represented as tables in the graph.
Without the Dremio API server (--api-server) parameter, lineage metadata is not collected. In that case, the collector harvests only the basic metadata available through JDBC, such as tables, columns, schemas, and views.
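To illustrate how lineage can be derived from a catalog graph, the following Python sketch logs in with a username and password, requests the graph for one dataset, and converts the response into upstream/downstream pairs. It is not the collector's implementation. It assumes a Dremio Software deployment that exposes the `/apiv2/login` and `/api/v3/catalog/{id}/graph` REST endpoints and a graph response containing `parents` and `children` lists whose entries carry a `path` array. The function names (`login`, `fetch_graph`, `lineage_edges`) are hypothetical helpers introduced for this example.

```python
import json
import urllib.parse
import urllib.request


def login(api_server, username, password):
    """Exchange a username/password for a token (assumed endpoint: /apiv2/login)."""
    body = json.dumps({"userName": username, "password": password}).encode("utf-8")
    request = urllib.request.Request(
        f"{api_server}/apiv2/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["token"]


def fetch_graph(api_server, token, dataset_id):
    """Fetch the catalog graph for one dataset (assumed endpoint: /api/v3/catalog/{id}/graph)."""
    encoded_id = urllib.parse.quote(dataset_id, safe="")
    request = urllib.request.Request(
        f"{api_server}/api/v3/catalog/{encoded_id}/graph",
        headers={"Authorization": f"_dremio{token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def lineage_edges(dataset_path, graph):
    """Convert one graph response into (upstream, downstream) name pairs.

    Assumes each entry in graph["parents"] and graph["children"] has a
    "path" list such as ["space", "folder", "table"].
    """
    target = ".".join(dataset_path)
    edges = []
    for parent in graph.get("parents", []):
        edges.append((".".join(parent["path"]), target))
    for child in graph.get("children", []):
        edges.append((target, ".".join(child["path"])))
    return edges
```

Only `lineage_edges` is pure; `login` and `fetch_graph` need a reachable Dremio API server, which corresponds to the value passed in the --api-server parameter. Keeping the parsing step separate from the network calls makes it possible to check the edge logic against a saved graph response without a live server.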