Catalog collector release notes
Important
Stay updated on collector releases! To keep up with the latest updates and enhancements to data.world collectors, subscribe to the RSS feed from your favorite RSS reader.
Release version 2.328
Details about the release
Item | Details |
|---|---|
Release version | 2.328 |
Release date | 15 June, 2026 |
Docker image ID |
|
Jar file |
|
Bug fixes
Qlik Replicate collector: Fixed an issue where some properties and relationships were not appearing in the catalog, ensuring complete metadata capture.
Dremio collector: Fixed metadata inconsistencies for Virtual Datasets, improving catalog accuracy and asset linking.
Tableau collector: Fixed lineage capture for Dremio data sources, ensuring complete data flow visibility.
Release version 2.327
Details about the release
Item | Details |
|---|---|
Release version | 2.327 |
Release date | 29 May, 2026 |
Docker image ID |
|
Jar file |
|
Bug fixes
Tableau collector: Fixed incorrect classification of tables as schemas when custom SQL uses unqualified table references, improving metadata accuracy.
Release version 2.326
Details about the release
Item | Details |
|---|---|
Release version | 2.326 |
Release date | 21 May, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
Microsoft SQL Server collector: Added warning message when the distributor server cannot reach the publisher database, improving troubleshooting for replication configurations.
Sensitive Data Classification: Added support for ServiceNow Vault Discovery, enabling classification of sensitive data stored in ServiceNow Vault fields.
Microsoft Fabric collector: The collector now harvests variable libraries, expanding metadata coverage for Fabric workspaces.
Bug fixes
ADLS collector: Corrected descriptions for Azure Blob Storage and ADLS Gen2 harvesting options, improving configuration clarity.
Dremio collector: Fixed Object not found error when harvesting column statistics, ensuring complete metadata collection.
Release version 2.325
Details about the release
Item | Details |
|---|---|
Release version | 2.325 |
Release date | 12 May, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
ServiceNow collector:
Added option (--skip-glide) to skip Glide table harvesting for more targeted metadata collection.
The collector now supports Platform Analytics, with extended capabilities for comprehensive analytics metadata harvesting.
Enhanced field metadata now includes column size and nullable properties, providing more detailed data profiling.
Databricks collector: Added option (--governance-metadata-collection) to harvest Unity Catalog governance metadata, enabling enhanced data governance visibility for Databricks environments.
Microsoft Fabric collector: Added support for Oracle table datasets, expanding metadata coverage for Fabric workspaces.
Bug fixes
Power BI Service and Microsoft Fabric collectors: Fixed an issue where large formulas in measures and calculated columns were not being cataloged, ensuring complete metadata capture for complex calculations.
Azure Data Factory collector: Fixed an issue where certain pipeline variable structures caused collection failures, improving collector stability.
Release version 2.324
Details about the release
Item | Details |
|---|---|
Release version | 2.324 |
Release date | 22 April, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
AWS Glue collector: Support added for harvesting metadata from multiple AWS regions in a single run, improving efficiency for multi-region environments. The collector now also harvests table parameters (advanced properties), expanding metadata coverage.
ServiceNow collector: The collector now harvests Vault classification metadata, providing enhanced data governance visibility.
Bug fixes
BigQuery collector:
Fixed acceptance of invalid credential formats, ensuring proper authentication validation.
Resolved an issue that caused crashes when the datasets parameter was not specified.
Fixed collection failures due to dependency conflicts.
Databricks collector: Fixed Service Principal authentication failures, ensuring reliable authentication.
Dremio collector: Fixed an issue where out-of-scope schemas were being collected during API calls, improving collection accuracy.
Tableau collector:
Improved error messaging for API base URL configuration issues, making troubleshooting easier.
Fixed collection failures for tenants with deeply nested composite fields.
Oracle collector:
Fixed missing default values when using DBA access.
Resolved collection failures on wrapped (encrypted) package bodies, ensuring complete metadata capture.
Snowflake, Redshift, Databricks, Denodo, Dremio, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Fixed errors when encountering Infinity float values in numeric columns, preventing data type errors.
Release version 2.323
Details about the release
Item | Details |
|---|---|
Release version | 2323 |
Release date | 10 April, 2026 |
Docker image ID |
|
Jar file |
|
Bug fixes
Microsoft Fabric collector: Fixed an issue where Direct Lake tables displayed null schema names when the source schema was not explicitly defined, improving metadata accuracy.
Release version 2.322
Details about the release
Item | Details |
|---|---|
Release version | 2.322 |
Release date | 9 April, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
Hive Metastore collector: New collector added for Hive Metastore, enabling direct metadata harvesting from the Hive Metastore database (PostgreSQL, Oracle, MySQL/MariaDB, or SQL Server) without requiring HiveServer2. The collector harvests databases, schemas, tables, columns, Hive-specific table properties, and basic view lineage. This provides a lightweight alternative for environments where HiveServer2 is unavailable or when only structural metadata is needed.
This collector replaces the existing Hive and Hive Metastore collectors.
Snowflake, Redshift, Databricks, Denodo, Dremio, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, Microsoft SQL Server, and SAP HANA collectors: Added sample data preview capability, enabling the catalog to display sample of table data. When Sensitive Data Classification is enabled, sensitive columns are automatically masked in the preview, ensuring secure data visibility.
Bug fixes
Snowflake, Redshift, Databricks, Denodo, Dremio, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Reduced unnecessary warning messages during SQL lineage processing, improving log clarity.
Release version 2.321
Details about the release
Item | Details |
|---|---|
Release version | 2.321 |
Release date | 6 April, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
BigQuery collector: The collector now harvests primary and foreign keys, expanding metadata coverage for BigQuery table relationships.
Bug fixes
All collectors: Fixed an issue where boolean false values in YAML configuration files were not properly recognized.
Snowflake, Redshift, Databricks, Denodo, Dremio, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Resolved memory issues when processing columns with very large string values, improving collector stability.
Tableau collector: Improved handling of recursive field queries, enhancing collection reliability for complex Tableau workbooks.
Release version 2.320
Details about the release
Item | Details |
|---|---|
Release version | 2.320 |
Release date | 24 March, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
Qlik Replicate collector: New collector added for Qlik Replicate, enabling metadata harvesting from Qlik's data replication platform.
Bug fixes
Snowflake, Redshift, Databricks, Denodo, Dremio, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Fixed column statistics metrics to ensure consistent data type handling across database sources.
Release version 2.319
Details about the release
Item | Details |
|---|---|
Release version | 2.319 |
Release date | 20 March, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
Microsoft SQL Server collector: The collector now collects usage metrics, providing insights into table and query access patterns.
BigQuery collector: Support added for harvesting multiple BigQuery projects in a single run, improving efficiency for multi-project environments.
Bug fixes
Qlik Talend Data Integration collector: Resolved an issue that caused collector failures in certain configurations.
ServiceNow collector: Improved metadata structure for tables and fields, ensuring better catalog integration.
Release version 2.318
Details about the release
Item | Details |
|---|---|
Release version | 2.318 |
Release date | 16 March, 2026 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags
|
Jar file |
|
New features and changes
Airflow collector: Support added for for Airflow 3.x, enabling metadata collection from the latest Airflow version.
ServiceNow collector: The collector now harvests data interfaces, expanding metadata coverage.
Snowflake collector: Added support for semantic views, providing metadata capture for Snowflake's business-friendly data representations.
SQL Server Integration Services (SSIS) collector: Added lineage support for tables and SQL statements with variable references, improving lineage accuracy for dynamic SSIS workflows.
Bug fixes
Databricks collector: Fixed regex used for parsing metric views, resolving parsing errors for certain metric view configurations.
Microsoft Fabric collector: Fixed an issue that occurred when property values were not present, improving collector stability.
Power BI Service and Microsoft Fabric collectors: Performance improvement while harvesting calculation groups, reducing collection time for Power BI models with calculation groups.
Microsoft SQL Server collector: Addressed issues with sample data size, ensuring consistent sample data collection across different table sizes.
ServiceNow collector: Fixed CatalogingEventReporter when off the main thread, resolving threading issues during collection.
Release version 2.317
Details about the release
Item | Details |
|---|---|
Release version | 2.317 |
Release date | 10 March, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
Databricks collector:
Harvest information on Databricks Lakeflow, enabling complete metadata capture for Lakeflow workflows and pipelines.
Harvest Unity Catalog metastore metadata, expanding coverage of Databricks governance and catalog structures.
Snowflake collector: Added Tasks harvesting capability, providing complete visibility into Snowflake automated processes and dependencies.
Microsoft Fabric and Power BI Service collectors: Support for harvesting Paginated Power BI Reports, including preview images, extending metadata coverage for Power BI assets.
Microsoft Fabric collector: Added page size configuration and improved logging for better performance monitoring and troubleshooting.
Bug fixes
SQL Server Integration Services (SSIS) collector:
Removed stored procedure dependency cataloging to improve accuracy and reduce false positive dependencies.
Fixed missing lineage for SQL statements, ensuring complete data flow visibility across SSIS packages.
Tableau collector: Fixed null pointer exception (NPE) in field harvesting.
Release version 2.316
Details about the release
Item | Details |
|---|---|
Release version | 2.316 |
Release date | 26 February, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
Tableau collector: Added an option to exclude hidden fields within published data sources.
Microsoft Fabric collector: Added an option to exclude internal Delta table files from Lakehouse file cataloging, reducing noise in harvested metadata.
Snowflake collector: Now logs warnings for unsupported stage references in SQL, improving transparency during lineage harvesting.
Athena collector: Added include/exclude options for databases, giving users more control over collection scope.
Snowflake, Redshift, Databricks, Denodo, Dremio, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Reorganized database index harvesting for improved structure and consistency.
Bug fixes
Power BI and Microsoft Fabric collectors: Fixed the relationship between semantic models/dataflows and data sources to properly represent uses relationships, improving lineage accuracy.
Tableau collector: Corrected embedsView relationships for unpublished views in published dashboards, ensuring complete dashboard-to-view associations.
Release version 2.315
Details about the release
Item | Details |
|---|---|
Release version | 2.315 |
Release date | 18 February, 2026 |
Docker image ID |
|
Jar file |
|
Bug fixes
OpenAPI collector: Added handling to prevent a null pointer exception when expected content is missing from an OpenAPI specification.
Tableau collector: Ensured that published dashboards correctly include embedsView relationships, even when they contain unpublished sheets.
Release version 2.314
Details about the release
Item | Details |
|---|---|
Release version | 2.314 |
Release date | 11 February, 2026 |
Docker image ID |
|
Jar file |
|
Bug fixes
ServiceNow collector: Fixed an issue in the generation of IRIs for database tables referenced by other resources, ensuring consistent and accurate identification of those tables in the catalog.
Release version 2.313
Details about the release
Item | Details |
|---|---|
Release version | 2.313 |
Release date | 6 February, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
Microsoft Fabric collector: Added support for harvesting EventStream resources, enabling lineage tracking for Fabric’s real-time streaming data pipelines.
Bug fixes
dbt collector: Corrected resource identifier generation for tables to match BigQuery collector format, ensuring proper lineage connections between dbt models and their underlying BigQuery tables.
OpenAPI collector: Added support for OpenAPI 3.1’s updated nullable type syntax, ensuring schemas using the latest specification version are parsed correctly.
SSIS collector: Fixed parsing of SSIS dataflow tasks to handle a wider variety of schema and table name patterns in SQL queries.
Release version 2.312
Details about the release
Item | Details |
|---|---|
Release version | 2.312 |
Release date | 3 February, 2026 |
Docker image ID |
|
Jar file |
|
New features and changes
Redshift collector: Added support for harvesting lineage from stored procedures, improving visibility into procedural data transformations.
Bug fixes
ServiceNow collector: Improved pagination handling to ensure results are fully harvested when total counts do not equal 100, preventing missing resources in larger result sets.
Tableau collector: Fixed a defect affecting sheet-to-dashboard associations, ensuring relationships are harvested accurately.
Release notes for previous versions
Go here to access release notes for previous version.