Catalog collector release notes
Important
Stay updated on collector releases! To keep up with the latest updates and enhancements to data.world collectors, subscribe to the RSS feed from your favorite RSS reader.
Release version 2.319
Details about the release
| Item | Details |
|---|---|
| Release version | 2.319 |
| Release date | 20 March, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Microsoft SQL Server collector: The collector now collects usage metrics, providing insights into table and query access patterns.
BigQuery collector: Support added for harvesting multiple BigQuery projects in a single run, improving efficiency for multi-project environments.
Bug fixes
Qlik Talend Data Integration collector: Resolved an issue that caused collector failures in certain configurations.
ServiceNow collector: Improved metadata structure for tables and fields, ensuring better catalog integration.
Release version 2.318
Details about the release
| Item | Details |
|---|---|
| Release version | 2.318 |
| Release date | 16 March, 2026 |
| Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
| Jar file | |
New features and changes
Airflow collector: Support added for Airflow 3.x, enabling metadata collection from the latest Airflow version.
ServiceNow collector: The collector now harvests data interfaces, expanding metadata coverage.
Snowflake collector: Added support for semantic views, providing metadata capture for Snowflake's business-friendly data representations.
SQL Server Integration Services (SSIS) collector: Added lineage support for tables and SQL statements with variable references, improving lineage accuracy for dynamic SSIS workflows.
Bug fixes
Databricks collector: Fixed regex used for parsing metric views, resolving parsing errors for certain metric view configurations.
Microsoft Fabric collector: Fixed an issue that occurred when property values were not present, improving collector stability.
Power BI Service and Microsoft Fabric collectors: Improved performance when harvesting calculation groups, reducing collection time for Power BI models with calculation groups.
Microsoft SQL Server collector: Addressed issues with sample data size, ensuring consistent sample data collection across different table sizes.
ServiceNow collector: Fixed CatalogingEventReporter behavior when it is invoked off the main thread, resolving threading issues during collection.
Release version 2.317
Details about the release
| Item | Details |
|---|---|
| Release version | 2.317 |
| Release date | 10 March, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Databricks collector:
Now harvests information on Databricks Lakeflow, enabling complete metadata capture for Lakeflow workflows and pipelines.
Now harvests Unity Catalog metastore metadata, expanding coverage of Databricks governance and catalog structures.
Snowflake collector: Added Tasks harvesting capability, providing complete visibility into Snowflake automated processes and dependencies.
Microsoft Fabric and Power BI Service collectors: Support for harvesting Paginated Power BI Reports, including preview images, extending metadata coverage for Power BI assets.
Microsoft Fabric collector: Added page size configuration and improved logging for better performance monitoring and troubleshooting.
Bug fixes
SQL Server Integration Services (SSIS) collector:
Removed stored procedure dependency cataloging to improve accuracy and reduce false positive dependencies.
Fixed missing lineage for SQL statements, ensuring complete data flow visibility across SSIS packages.
Tableau collector: Fixed null pointer exception (NPE) in field harvesting.
Release version 2.316
Details about the release
| Item | Details |
|---|---|
| Release version | 2.316 |
| Release date | 26 February, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Tableau collector: Added an option to exclude hidden fields within published data sources.
Microsoft Fabric collector: Added an option to exclude internal Delta table files from Lakehouse file cataloging, reducing noise in harvested metadata.
Snowflake collector: Now logs warnings for unsupported stage references in SQL, improving transparency during lineage harvesting.
Athena collector: Added include/exclude options for databases, giving users more control over collection scope.
Snowflake, Redshift, Databricks, Denodo, Dremio, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Reorganized database index harvesting for improved structure and consistency.
Bug fixes
Power BI and Microsoft Fabric collectors: Fixed the relationship between semantic models/dataflows and data sources to properly represent uses relationships, improving lineage accuracy.
Tableau collector: Corrected embedsView relationships for unpublished views in published dashboards, ensuring complete dashboard-to-view associations.
Release version 2.315
Details about the release
| Item | Details |
|---|---|
| Release version | 2.315 |
| Release date | 18 February, 2026 |
| Docker image ID | |
| Jar file | |
Bug fixes
OpenAPI collector: Added handling to prevent a null pointer exception when expected content is missing from an OpenAPI specification.
Tableau collector: Ensured that published dashboards correctly include embedsView relationships, even when they contain unpublished sheets.
Release version 2.314
Details about the release
| Item | Details |
|---|---|
| Release version | 2.314 |
| Release date | 11 February, 2026 |
| Docker image ID | |
| Jar file | |
Bug fixes
ServiceNow collector: Fixed an issue in the generation of IRIs for database tables referenced by other resources, ensuring consistent and accurate identification of those tables in the catalog.
Release version 2.313
Details about the release
| Item | Details |
|---|---|
| Release version | 2.313 |
| Release date | 6 February, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Microsoft Fabric collector: Added support for harvesting EventStream resources, enabling lineage tracking for Fabric’s real-time streaming data pipelines.
Bug fixes
dbt collector: Corrected resource identifier generation for tables to match BigQuery collector format, ensuring proper lineage connections between dbt models and their underlying BigQuery tables.
OpenAPI collector: Added support for OpenAPI 3.1’s updated nullable type syntax, ensuring schemas using the latest specification version are parsed correctly.
SSIS collector: Fixed parsing of SSIS dataflow tasks to handle a wider variety of schema and table name patterns in SQL queries.
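For context on the OpenAPI fix above: OpenAPI 3.0 marks a nullable schema with `nullable: true`, whereas OpenAPI 3.1 follows JSON Schema and lists `"null"` in the `type` array. The sketch below shows one way a parser might normalize both forms. The helper name `split_nullable` is hypothetical, and this is an illustration rather than the collector's actual code.

```python
from typing import Any, Optional


def split_nullable(schema: dict[str, Any]) -> tuple[Optional[str], bool]:
    """Return (base_type, is_nullable) for an OpenAPI 3.0 or 3.1 schema object.

    Hypothetical helper for illustration only; not part of the collector.
    """
    declared = schema.get("type")

    # OpenAPI 3.1 style: "type" may be a list such as ["string", "null"].
    if isinstance(declared, list):
        non_null = [t for t in declared if t != "null"]
        base = non_null[0] if non_null else None
        return base, "null" in declared

    # OpenAPI 3.1 also allows the bare type "null".
    if declared == "null":
        return None, True

    # OpenAPI 3.0 style: a single type plus an optional "nullable: true".
    return declared, bool(schema.get("nullable", False))
```

Both `{"type": ["string", "null"]}` and `{"type": "string", "nullable": True}` normalize to the same pair, so downstream code does not need to care which specification version a schema was written against.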
Release version 2.312
Details about the release
| Item | Details |
|---|---|
| Release version | 2.312 |
| Release date | 3 February, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Redshift collector: Added support for harvesting lineage from stored procedures, improving visibility into procedural data transformations.
Bug fixes
ServiceNow collector: Improved pagination handling to ensure results are fully harvested when total counts do not equal 100, preventing missing resources in larger result sets.
Tableau collector: Fixed a defect affecting sheet-to-dashboard associations, ensuring relationships are harvested accurately.
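The ServiceNow pagination fix above concerns result sets that do not line up neatly with the page size. As a generic illustration (not the collector's code), the sketch below pages through results until a page comes back short, instead of relying on a reported total. `fetch_page` is a hypothetical callable standing in for an API request, and the default page size of 100 is an assumption based on the figure mentioned in the note.

```python
from typing import Callable, Iterator

# Hypothetical signature: fetch_page(offset, limit) -> list of records.
FetchPage = Callable[[int, int], list[dict]]


def iter_all_records(fetch_page: FetchPage, page_size: int = 100) -> Iterator[dict]:
    """Yield every record, requesting pages until one comes back short."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        # A short or empty page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size


def make_fake_source(total: int) -> FetchPage:
    """Build an in-memory stand-in for a paginated API (illustration only)."""
    records = [{"id": i} for i in range(total)]

    def fetch(offset: int, limit: int) -> list[dict]:
        return records[offset:offset + limit]

    return fetch
```

With this termination rule, a source holding 250 records is read in three requests (100, 100, 50), and a source holding exactly 200 records costs one extra request that returns an empty page.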
Release version 2.311
Details about the release
| Item | Details |
|---|---|
| Release version | 2.311 |
| Release date | 30 January, 2026 |
| Docker image ID | |
| Jar file | |
Bug fixes
Snowflake and Databricks collectors: Fixed relationship mapping so that database tables are correctly associated with their parent database, ensuring accurate hierarchy and navigation in the catalog.
Release version 2.310
Details about the release
| Item | Details |
|---|---|
| Release version | 2.310 |
| Release date | 29 January, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Microsoft Fabric collector: Added support for harvesting Environments in Microsoft Fabric, expanding visibility into Fabric workspace configurations.
Oracle collector: Added support for harvesting lineage from CREATE TABLE statements by querying historical tables, improving lineage coverage for table creation workflows.
Bug fixes
OpenAPI collector: Fixed an issue where restrictive server configurations caused specification requests to be rejected.
Microsoft SQL Server collector: Resolved an issue where SQL preprocessing of TOP statements did not work correctly in some cases.
Microsoft Fabric collector: Fixed issues with Lakehouse Delta tables to ensure they are cataloged under the correct parent folder and to prevent duplicate resources from being created.
Release version 2.309
Details about the release
| Item | Details |
|---|---|
| Release version | 2.309 |
| Release date | 14 January, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
SSIS collector: Added support for harvesting variables and their references, improving lineage and package-level context.
Microsoft SQL Server collector: Now harvests table descriptions from extended properties, enriching table-level documentation.
Databricks collector: Renamed the CLI option --http-path to --compute-resources-url for clearer configuration of compute connectivity.
Snowflake, Redshift, Databricks, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Added support for excluding system functions across all JDBC collectors, helping reduce noise in harvested metadata.
Microsoft Fabric collector:
Added support for listing tables for schema-enabled Lakehouses using the OneLake Tables API, and harvesting columns for those tables via the same API.
Added support for schema-enabled Fabric Lakehouses in Pipelines, improving pipeline lineage coverage.
Added support for SQL Server database tables and stored procedures as sources/sinks in Fabric Data Pipelines, expanding supported pipeline endpoints.
Sensitive data classification feature: Improved sensitive data classification for collectors. On-premise collectors now support Microsoft Presidio classification services in addition to Private AI.
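Because the Databricks collector option `--http-path` was renamed to `--compute-resources-url` in this release, existing run scripts that pass the old name need updating. The sketch below rewrites an argument list accordingly. It assumes the value itself is passed unchanged under the new name, which is worth confirming against the collector documentation; the function name `migrate_args` is hypothetical.

```python
# Old option name -> new option name, as stated in the release note above.
RENAMED_OPTIONS = {"--http-path": "--compute-resources-url"}


def migrate_args(args: list[str]) -> list[str]:
    """Return a copy of args with renamed options replaced.

    Handles both "--opt value" and "--opt=value" spellings, in case
    existing scripts use either form.
    """
    migrated = []
    for arg in args:
        name, sep, value = arg.partition("=")
        if name in RENAMED_OPTIONS:
            migrated.append(RENAMED_OPTIONS[name] + sep + value)
        else:
            migrated.append(arg)
    return migrated
```

For example, `migrate_args(["--http-path", "<value>"])` returns `["--compute-resources-url", "<value>"]`, and all other arguments pass through untouched.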
Bug fixes
Tableau collector: Fixed embedsView relationships for unpublished Tableau views and dashboards, ensuring these relationships are harvested correctly.
Azure Data Lake Storage Gen2 collector: Fixed an error that occurred when harvesting a non-hierarchical storage account, improving compatibility and stability.
Release version 2.308
Details about the release
| Item | Details |
|---|---|
| Release version | 2.308 |
| Release date | 5 January, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Snowflake collector: Added lineage resolution support for QUALIFY clauses, LATERAL FLATTEN constructs, and subqueries in previously unsupported contexts, including WHERE, JOIN ON, and ORDER BY.
Oracle collector: The collector now harvests public synonyms, expanding visibility into shared database objects.
SSIS collector: Added support for harvesting stored procedure references, improving lineage completeness.
Bug fixes
Tableau collector:
The collector now harvests component fields and their relationships to constituent fields.
Ensured GraphQL filtering by project name is followed by filtering by project ID, preventing harvesting of unintended projects.
Snowflake collector: Fixed an issue where some SQL statements took an excessive amount of time to parse during lineage harvesting.
SQL Server collector: Prevented attempts to harvest table size metadata for views that do not support this operation.
Databricks collector: Added a safeguard to prevent SQL execution for excluded databases during lineage collection.
Oracle collector: Fixed harvesting of statistic indexes, ensuring they are correctly captured.
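The Tableau project-filtering fix above presumably exists because a project name alone may match more than one project. A minimal sketch of the two-step filter follows. The record shape and the field names `projectName` and `projectId` are assumptions for illustration, not the collector's actual data model.

```python
def filter_by_project(items: list[dict], project_name: str, project_id: str) -> list[dict]:
    """Keep only items in the intended project.

    Step 1 narrows by project name (the coarse filter);
    step 2 confirms the project ID, removing same-named projects.
    """
    by_name = [i for i in items if i.get("projectName") == project_name]
    return [i for i in by_name if i.get("projectId") == project_id]


# Hypothetical sample data: two projects share the name "Sales".
SAMPLE_ITEMS = [
    {"name": "Q1 dashboard", "projectName": "Sales", "projectId": "p-1"},
    {"name": "Archive", "projectName": "Sales", "projectId": "p-2"},
    {"name": "Budget", "projectName": "Finance", "projectId": "p-3"},
]
```

Filtering `SAMPLE_ITEMS` for the name `"Sales"` and the ID `"p-1"` keeps only the first record; the name filter alone would also have kept the item from project `p-2`.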
Release version 2.307
Details about the release
| Item | Details |
|---|---|
| Release version | 2.307 |
| Release date | 18 December, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
ServiceNow collector: Introduced a new collector that harvests metadata for scoped applications, tables, fields, views, and data fabric tables from a ServiceNow instance.
Dremio collector: The API server (--api-server) option is no longer required. When it is not provided, the collector does not harvest the API-based lineage and dataset information.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Improved memory usage during the harvesting of column statistics, enhancing stability for large datasets.
Tableau collector:
Added hostname mapping configuration support, enabling more flexible environment and network setups.
Added a new option, --tableau-exclude-unpublished-views, to exclude unpublished views (sheets and dashboards) from harvesting.
Bug fixes
Snowflake collector: Updated SQL parsing behavior to ensure parsing terminates correctly when a timeout occurs, and made the timeout value configurable to better handle complex or long-running queries.
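To illustrate the idea behind the Snowflake fix above (parsing that reliably stops once a configurable timeout elapses), here is a generic sketch of a cooperative deadline check. It is not the collector's implementation: the names `parse_with_deadline` and `ParseTimeout` are hypothetical, and `step` stands in for whatever unit of parsing work is performed.

```python
import time
from typing import Callable, Iterable


class ParseTimeout(Exception):
    """Raised when parsing exceeds its time budget."""


def parse_with_deadline(
    tokens: Iterable[str],
    timeout_seconds: float,
    step: Callable[[str], str],
) -> list[str]:
    """Process tokens one by one, stopping once the deadline has passed."""
    deadline = time.monotonic() + timeout_seconds
    results = []
    for token in tokens:
        # Check the clock before each unit of work so the loop
        # terminates soon after the budget is exhausted.
        if time.monotonic() > deadline:
            raise ParseTimeout(f"parsing exceeded {timeout_seconds} seconds")
        results.append(step(token))
    return results
```

Checking the deadline inside the loop means the worker stops itself, rather than being abandoned by a caller that has given up waiting. The trade-off is that a single long-running `step` can still overshoot the budget.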
Release version 2.306
Details about the release
| Item | Details |
|---|---|
| Release version | 2.306 |
| Release date | 10 December, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
Salesforce collector: Added support for grouping and aggregation columns, expanding metadata coverage for analytical queries.
Databricks collector: Added support for harvesting jobs from external workspaces, enabling visibility into cross-environment job metadata.
Azure Data Factory collector: Now harvests SQL queries used in activities, providing richer lineage and operational context.
Dremio collector:
Migrated to the Arrow Flight JDBC driver in place of the legacy Dremio driver, improving performance and compatibility.
Updated the --graphApiServer parameter to --api-server for clarity and consistency, and added a new --use-tls parameter to support secure connections.
Bug fixes
Microsoft Fabric collector: Fixed an issue where some files were not being harvested, ensuring more complete metadata collection.
QlikSense collector: Updated logic so that the API key is no longer written to the TTL file.
Microsoft Fabric collector: Fixed a modeling issue where Lakehouse tables were incorrectly displayed as folders, ensuring proper representation of table structures.
Release version 2.305
Details about the release
| Item | Details |
|---|---|
| Release version | 2.305 |
| Release date | 25 November, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
Microsoft SQL Server collector: The collector now harvests column comments from the MS_Description extended property, improving metadata detail and context.
SQL Server Integration Services (SSIS) collector: Added support for lineage when Oracle is used as a source or target, expanding cross-platform lineage coverage.
Bug fixes
Azure Data Factory collector: Fixed an issue where lineage relationships were missing when a Snowflake credential had an assigned default role, ensuring complete lineage capture.
Release version 2.304
Details about the release
| Item | Details |
|---|---|
| Release version | 2.304 |
| Release date | 20 November, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
OpenAPI collector: Added a new configuration option --environment-qualifier. When provided, this option ensures that harvested resources are unique and distinct for that environment qualifier, improving multi-environment cataloging.
Power BI collector: The collector now harvests calculation groups and calculation items, expanding metadata coverage for Power BI semantic models.
Bug fixes
dbt core and dbt cloud collectors: Fixed an issue where tags were represented as strings instead of relationships. Tags are now properly modeled as relationships in the catalog graph.
Power BI collector: Fixed an issue where descriptions in parameter definitions were incorrectly appended to parameter values.
Release version 2.303
Important
Published versions of collectors are available as a Docker image and a JAR file.
Details about the release
| Item | Details |
|---|---|
| Release version | 2.303 |
| Release date | 13 November, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
Databricks collector: Added support for harvesting metric views, expanding metadata coverage within Databricks. A new optional parameter, --enable-metric-views, is now available to enable this feature.
Power BI collector: Added support for Power BI–BigQuery column lineage.
Bug fixes
Informatica CDI collector: Fixed an issue that caused incomplete harvesting in large CDI instances by ensuring the collector continues harvesting beyond the Informatica CDI API’s default pagination limit.
dbt Cloud and dbt core collectors: Corrected tag representation so that tags are now modeled as resources rather than strings, improving metadata consistency and usability.
Release version 2.302
Details about the release
| Item | Details |
|---|---|
| Release version | 2.302 |
| Release date | 5 November, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
OpenAPI collector: Added handling for schema references and array item references, improving completeness of OpenAPI metadata harvesting.
Power BI collector: Now harvests endorsement details for reports, datasets, and dataflows, providing better visibility into trusted and certified content.
Databricks collector: Added support for harvesting materialized views, expanding metadata coverage within Databricks.
SQL Server collector: Now harvests database synonyms, improving understanding of object relationships and dependencies.
Tableau collector: Updated logic to associate published data sources with the project that contains them, improving metadata organization and context.
SSRS collector: Now harvests the external URL for linked reports, providing direct traceability to linked content.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Updated lineage logic to represent INSERT stored procedures as the triggering agent in lineage relationships, improving lineage accuracy.
Bug fixes
Informatica collector: Fixed an issue that limited harvesting to a maximum of 4,200 mapping tasks. The collector now processes all available mappings.
Databricks collector: Fixed handling of non-alphanumeric characters in identifiers, which previously caused invalid identifiers to be created.
dbt core and dbt cloud collectors: Fixed a case mismatch issue when coining database IRIs for Snowflake accounts, ensuring consistent IRI generation.
SSIS collector: Corrected package lookup logic to prioritize the Package GUID, using the Version GUID only when multiple versions share the same Package GUID.
Talend collector: Fixed a classification issue where Talend Jobs were incorrectly identified as activities instead of agents.
Snowflake collector: Implemented a workaround for a Snowflake JDBC driver defect that was triggered by unexpected datatypes, preventing runtime errors.
SQL Server collector: The collector now harvests table size for tables in non-dbo schemas, ensuring complete table metadata coverage.
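The SSIS package lookup rule described above (match on the Package GUID first, and consult the Version GUID only when several versions share that Package GUID) can be sketched as follows. The `Package` record and the `find_package` function are hypothetical names for illustration, not the collector's actual types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Package:
    # Hypothetical record shape for illustration.
    name: str
    package_guid: str
    version_guid: str


def find_package(
    packages: list[Package],
    package_guid: str,
    version_guid: Optional[str] = None,
) -> Optional[Package]:
    """Look up a package, prioritizing the Package GUID."""
    candidates = [p for p in packages if p.package_guid == package_guid]

    # Zero or one match: the Package GUID alone decides the outcome.
    if len(candidates) <= 1:
        return candidates[0] if candidates else None

    # Several versions share the Package GUID: use the Version GUID to choose.
    for candidate in candidates:
        if candidate.version_guid == version_guid:
            return candidate
    return None
```

Under this rule, a reference whose Version GUID is stale still resolves as long as only one package carries the matching Package GUID.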
Release notes for previous versions
Go here to access release notes for previous versions.