Catalog collector release notes

Important

Published versions of collectors are available as a docker image and a JAR file.

Release version 2.289

Details about the release

Table 1.

Item

Details

Release version

2.289

Release date

4 July, 2025

Docker image ID

Jar file



New features and changes

  • A new collector, the AWS Lake Formation collector, is now available in public preview.

  • Microsoft Fabric collector: Now catalogs metadata for Eventhouses and their associated KQL databases, expanding visibility into the Microsoft Fabric ecosystem.

  • dbt Core and dbt Cloud collectors: Added support for harvesting semantic model metadata, enriching the data model layer within your catalog.

  • Power BI collector: Introduced support for an additional Databricks source MQuery function type in lineage resolution, improving coverage and accuracy of Power BI lineage.

  • SSIS collector: Enhanced debug-level logging to support better root cause analysis for missing catalog resources, aiding in troubleshooting and diagnostics.

Bug fixes

  • Alteryx collector: Fixed an issue where the workflow description was not correctly captured from the user-provided meta info section.

  • AWS Glue collector: Resolved a null pointer exception that could occur when the Glue Data Catalog tables are empty, improving stability.

  • Snowflake collector: Improved data type standardization by stripping parenthesized size/length values (for example, VARCHAR(255) → VARCHAR) for cleaner and more consistent metadata.
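The Snowflake data type standardization described above can be pictured with a small sketch. This is only an illustration of the idea, not the collector's implementation: the function name and the regular expression are hypothetical, and treating two-part suffixes such as (38, 0) the same way as single sizes is an assumption.

```python
import re

# Matches a trailing parenthesized size/length/precision suffix,
# for example "(255)" or "(38, 0)". The pattern is an assumption.
_SIZE_SUFFIX = re.compile(r"\s*\(\s*\d+(?:\s*,\s*\d+)*\s*\)\s*$")


def strip_type_size(data_type: str) -> str:
    """Return the type name without a trailing parenthesized size, if any."""
    return _SIZE_SUFFIX.sub("", data_type.strip())


print(strip_type_size("VARCHAR(255)"))   # VARCHAR
print(strip_type_size("NUMBER(38, 0)"))  # NUMBER
print(strip_type_size("TIMESTAMP_NTZ"))  # TIMESTAMP_NTZ
```

Types without a size suffix pass through unchanged, so the function can be applied to every column type.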

Release version 2.288

Important

This release was for internal improvements and has no customer-impacting changes.

Release version 2.287

Details about the release

Table 2.

Item

Details

Release version

2.287

Release date

17 June, 2025

Docker image ID

  • Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

    • amd64: sha256:b72a8d8e27570cde13ab517cf956faede2bb00f2bc278aba3b4131c4a7a993fe

    • arm64: sha256:448bb310bc92d52e9d73c404301aeaa65d98e51e2961ad67701c1d94207a22ba

Jar file



New features and changes

  • Tableau collector: Added view-based filtering for bin and group fields to align with how calculated fields are handled.

  • Salesforce collector: Now harvests all reports and dashboards, not just recently viewed ones.

  • Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, and SAP HANA collectors: Column decimal digits are now written only for appropriately typed columns.

Bug fixes

  • Tableau collector: Fixed a null pointer exception in column lineage processing for Custom SQL tables.

  • OpenAPI collector: Resolved errors caused by malformed spec files that previously triggered null pointer exceptions.

Release version 2.286

Details about the release

Table 3.

Item

Details

Release version

2.286

Release date

11 June, 2025

Docker image ID

Jar file



New features and changes

  • Logging framework (all collectors): Introduced minor updates to the logging framework. As a result, users may notice slightly different log messages compared to previous versions.

  • Snowflake collector: Upgraded the embedded Snowflake JDBC driver to version 3.34.2, addressing potential exceptions and improving stability.

  • Azure Data Factory collector: Now catalogs parameters used in parameterized linked services, along with the relationship between each linked service and its data source, providing deeper lineage visibility.

  • Power BI collector: Added support for parsing lineage from certain SQL statement types without requiring database credentials, making it easier to extract lineage in more restricted environments.

Bug fixes

  • Tableau collector: Prevented exceptions that could occur when harvesting table-view relationships, particularly when table information is missing from the Tableau GraphQL API.

  • Power BI collector: Fixed an issue with Denodo sources that use custom SQL, improving support for a wider range of Power BI source types.

  • SQL Server Integration Services (SSIS) collector: Resolved an exception that occurred during the harvesting of column information, enhancing reliability in metadata extraction.

Release version 2.285

Details about the release

Table 4.

Item

Details

Release version

2.285

Release date

3 June, 2025

Docker image ID

Jar file



New features and changes

  • Tableau collector: Now determines the associated project for a Custom SQL Table based on its workbook rather than its datasource, improving accuracy in project assignments.

  • Power BI collector: Added support for Oracle Autonomous Database as a source, expanding connectivity and metadata coverage within Power BI environments.

Release version 2.284

Details about the release

Table 5.

Item

Details

Release version

2.284

Release date

29 May, 2025

Docker image ID

Jar file



New features and changes

  • All collectors (on-premise version): Updated the logging behavior so that each new collector run will truncate existing log files written to the local filesystem, rather than appending to them. This means logs from previous runs will be overwritten during subsequent runs.

    This change does not affect uploaded log files—those remain intact.

  • SSRS collector: Added support for harvesting metadata using the SOAP API, enabling compatibility with older SSRS versions that do not support the REST API.

  • Microsoft Fabric collector: Added support for harvesting lineage from custom SQL queries used in Fabric Lakehouse and Warehouse sources.
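The change to local log files in this release, described in the first item above, is the difference between opening a file in truncate mode and append mode. The sketch below illustrates that behavior only. The file name and the helper function are hypothetical and do not reflect how the collector writes its logs.

```python
import tempfile
from pathlib import Path


def write_run_log(log_path: Path, lines: list[str], truncate: bool = True) -> None:
    """Write one run's log lines, truncating ("w") or appending ("a")."""
    mode = "w" if truncate else "a"
    with log_path.open(mode, encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "collector.log"  # hypothetical file name
    write_run_log(log_file, ["run 1: started", "run 1: finished"])
    write_run_log(log_file, ["run 2: started"])  # overwrites the first run
    remaining = log_file.read_text(encoding="utf-8").splitlines()

print(remaining)  # ['run 2: started']
```

If local logs from earlier runs matter, one option is to copy the log file elsewhere before starting the next run.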

Bug fixes

  • Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, and SAP HANA collectors: Resolved an issue that caused exceptions when harvesting index metadata for tables lacking a defined type.

  • SQL Server collector: Fixed an error in how table names were qualified when calling the sp_spaceused stored procedure—ensuring accurate harvesting of table size metadata.

  • QlikSense collector: Prevented exceptions that occurred when the created-by or modified-by user information was missing from collected resources.

  • Databricks collector: Expanded lineage harvesting to include relationships between objects across different schemas, rather than only within the same schema.

Release version 2.283

Details about the release

Table 6.

Item

Details

Release version

2.283

Release date

21 May, 2025

Docker image ID

Jar file



New features and changes

  • Tableau collector: Now harvests the last refresh date for Tableau data sources, where available, providing better visibility into data currency and freshness.

Bug fixes

  • Tableau collector: Fixed an issue that was preventing the harvesting of certain lineage relationships between Column Fields and their underlying database columns.

Release version 2.282

Details about the release

Table 7.

Item

Details

Release version

2.282

Release date

20 May, 2025

Docker image ID

Jar file



Bug fixes

  • SSIS collector: Fixed an issue that caused a stack overflow error when cycles were present in execution flows.
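A stack overflow on cyclic input is typical of recursive graph traversal that does not track visited nodes. The sketch below shows one common remedy, an iterative walk with a visited set. It is not the SSIS collector's code, and the flow structure and task names are hypothetical.

```python
from collections import deque


def reachable_tasks(flow: dict[str, list[str]], start: str) -> list[str]:
    """Visit every task reachable from start exactly once, even with cycles."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        task = queue.popleft()
        order.append(task)
        for successor in flow.get(task, []):
            if successor not in seen:  # already-queued tasks are skipped, which breaks cycles
                seen.add(successor)
                queue.append(successor)
    return order


# Hypothetical execution flow in which "load" loops back to "extract".
flow = {"extract": ["transform"], "transform": ["load"], "load": ["extract"]}
print(reachable_tasks(flow, "extract"))  # ['extract', 'transform', 'load']
```

Because the walk uses an explicit queue instead of recursion, its depth is not limited by the call stack.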

Release version 2.281

Details about the release

Table 8.

Item

Details

Release version

2.281

Release date

16 May, 2025

Docker image ID

Jar file



Bug fixes

  • Tableau collector:

    • Resolved an issue where re-authentication failed after API token timeouts, improving reliability in long-running sessions.

    • Fixed a bug in SQL parsing for Custom SQL Tables when referenced tables were missing or not found.

  • Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, and SAP HANA collectors: Corrected the calculation of sampling percentages for column statistics, which were previously computed incorrectly in some cases.

  • PostgreSQL and Redshift collectors: Added support for the ~~* operator in view definition SQL. In PostgreSQL, ~~* is the operator form of the ILIKE keyword (case-insensitive LIKE).

  • SSIS collector: Improved logging and resolved a NullPointerException caused by missing elements in execution XML files.
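For the ~~* operator mentioned in the PostgreSQL and Redshift item above: PostgreSQL documents ~~ as equivalent to LIKE and ~~* as corresponding to ILIKE, with !~~ and !~~* as the negated forms. The sketch below shows one way to rewrite these operators into keywords before parsing. The function is hypothetical and deliberately naive: it would also rewrite operators that appear inside string literals, which a real SQL parser would not do.

```python
import re

# PostgreSQL pattern-matching operators and their keyword equivalents.
_OPERATOR_KEYWORDS = {
    "!~~*": "NOT ILIKE",
    "!~~": "NOT LIKE",
    "~~*": "ILIKE",
    "~~": "LIKE",
}

# Longest operators come first so that "~~*" is not read as "~~" followed by "*".
_OPERATOR_PATTERN = re.compile(r"\s*(!~~\*|!~~|~~\*|~~)\s*")


def normalize_like_operators(sql: str) -> str:
    """Replace operator-form LIKE/ILIKE with the keyword form."""
    return _OPERATOR_PATTERN.sub(
        lambda match: f" {_OPERATOR_KEYWORDS[match.group(1)]} ", sql
    )


view_sql = "SELECT id FROM customers WHERE name ~~* 'acme%'"
print(normalize_like_operators(view_sql))
# SELECT id FROM customers WHERE name ILIKE 'acme%'
```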

Release version 2.280

Details about the release

Table 9.

Item

Details

Release version

2.280

Release date

14 May, 2025

Docker image ID

  • Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

    • amd64: sha256:91010120613ffa3effef509870398a5880a146ef243d22a3da3d99b9889ae619

    • arm64: sha256:3a0dc63787adf5ae110dab484ef231b74ff7ec65a56a4417264ac6f81368087d

Jar file



New features and changes

  • Power BI collector: Improved performance by handling Power BI API retry instructions more efficiently.

  • Tableau collector: Added enhanced logging to monitor collector progress and track cataloged resources.

  • Athena collector:

    • Now harvests the S3 path for database tables, enabling lineage from Athena to S3 buckets and objects.

    • Also collects associated AWS tags, enriching metadata coverage.
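The Power BI item above mentions API retry instructions. The notes do not say what form those take, so the sketch below assumes the common pattern of an HTTP 429 response carrying a Retry-After header expressed in seconds. The function name and the fake responses are hypothetical, and no real Power BI endpoint is called.

```python
import time
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str]]  # (status code, headers)


def call_with_retry_after(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); while it returns 429, wait as instructed and try again."""
    status, headers = send()
    attempts = 1
    while status == 429 and attempts < max_attempts:
        # Assumption: Retry-After is given in seconds; default to 1 if absent.
        sleep(float(headers.get("Retry-After", "1")))
        status, headers = send()
        attempts += 1
    return status, headers


# Simulated API: throttled twice, then successful.
responses = iter([(429, {"Retry-After": "2"}), (429, {"Retry-After": "1"}), (200, {})])
waits = []
result = call_with_retry_after(lambda: next(responses), sleep=waits.append)
print(result, waits)  # (200, {}) [2.0, 1.0]
```

Waiting for exactly the interval the server requests avoids both retrying too early and backing off longer than necessary.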

Bug fixes

  • ADF collector: Fixed an issue where unparseable JSON in certain parameter values caused collector failures.

  • BigQuery collector: Correctly handles cases where the BigQuery API does not return table details, preventing errors.

  • Power BI collector: Now supports automated migration of resource IRIs (identifiers) from earlier collector versions to maintain continuity.

Release version 2.279

Details about the release

Table 10.

Item

Details

Release version

2.279

Release date

9 May, 2025

Docker image ID

  • Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

    • amd64: sha256:95f85d9bc3449c1b4571ee1a48d2e544d162231d6b5aaeb4b06bf2b730d2258d

    • arm64: sha256:6304b7bc8077c2156ba0033fd535938d4ecee6e8d7b5c4d0b0c370f6fb2dbdbb

Jar file



New features and changes

  • QlikSense collector: Implemented a rate limiter to improve performance and prevent API errors during metadata harvesting.

  • AWS S3 collector: Enhanced performance by applying regex filters before fetching objects within buckets, reducing unnecessary API calls.
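The S3 change above amounts to filtering first and listing second. The sketch below illustrates that ordering under the assumption that the regex filter applies to bucket names. The function names and the in-memory store are hypothetical stand-ins, and no AWS API is called.

```python
import re
from typing import Callable, Dict, Iterable, List


def harvest_objects(
    bucket_names: Iterable[str],
    include_pattern: str,
    list_objects: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """List objects only for buckets whose full name matches include_pattern."""
    matcher = re.compile(include_pattern)
    harvested = {}
    for name in bucket_names:
        if not matcher.fullmatch(name):
            continue  # filtered out before any per-bucket listing call is made
        harvested[name] = list_objects(name)
    return harvested


# In-memory stand-in for an object store, with a counter for listing calls.
fake_store = {"sales-prod": ["a.csv", "b.csv"], "sales-dev": ["c.csv"], "tmp-scratch": ["x.bin"]}
calls = []


def fake_list_objects(bucket: str) -> List[str]:
    calls.append(bucket)
    return fake_store[bucket]


result = harvest_objects(fake_store, r"sales-.*", fake_list_objects)
print(result)  # {'sales-prod': ['a.csv', 'b.csv'], 'sales-dev': ['c.csv']}
print(calls)   # ['sales-prod', 'sales-dev']
```

The call log shows that the excluded bucket never triggers a listing request, which is where the savings come from.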

Bug fixes

  • Tableau collector: Fixed a gap in harvesting of lineage relationships between Tableau column fields and database columns.

  • Microsoft Fabric collector: Resolved an issue where some Data Pipelines failed to deserialize due to varying data types based on configuration.

  • SSIS collector: Fixed an exception caused by missing or empty elements in SSIS executable descriptor XML files.

  • Athena collector: Updated harvesting of databases to accommodate environments with more than 100 databases.
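The Athena fix above suggests a listing call that returns results in pages. The notes do not state the cause, so the sketch below simply assumes a page limit of 100 and a continuation token, and shows how following the token retrieves every database. The fetch function and token format are hypothetical, and no AWS API is called.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]  # (database names, next token or None)


def iter_all_databases(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield database names from every page until no next token is returned."""
    token = None
    while True:
        names, token = fetch_page(token)
        yield from names
        if token is None:
            break


# Simulated catalog with more databases than fit in one page.
ALL_DATABASES = [f"db_{i:03d}" for i in range(250)]
PAGE_SIZE = 100  # assumed page limit


def fake_fetch_page(token: Optional[str]) -> Page:
    start = int(token) if token else 0
    end = start + PAGE_SIZE
    next_token = str(end) if end < len(ALL_DATABASES) else None
    return ALL_DATABASES[start:end], next_token


databases = list(iter_all_databases(fake_fetch_page))
print(len(databases))  # 250
```

Reading only the first page would stop at 100 names, which matches the symptom described in the fix.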

Release version 2.278

Details about the release

Table 11.

Item

Details

Release version

2.278

Release date

30 April, 2025

Docker image ID

Jar file



Bug fixes

  • Microsoft Fabric collector:

    • Fixed an issue with parameters and variables in data pipelines to ensure correct cataloging of parameter and variable names.

    • Resolved a problem where catalog records for pipeline activities and runs were not being created.

    • Corrected the Is Active property for activities, which was previously always set to false.

Release version 2.277

Details about the release

Table 12.

Item

Details

Release version

2.277

Release date

29 April, 2025

Docker image ID

Jar file



New features and changes

  • Microsoft Fabric collector:

    • Harvests notebook definitions along with endorsement details, expanding the collected metadata. A new parameter Disable harvesting notebook definition (--disable-notebook-definition) is introduced for this.

    • Harvests data pipeline activities and run information, providing deeper visibility into pipeline executions.

  • dbt Core and dbt Cloud collectors: dbt sources are now associated with database schemas (instead of tables) through a non-lineage relationship, improving metadata accuracy.

  • Tableau collector: Enhanced log tracing around column field lineage to improve troubleshooting and debugging clarity.

Release version 2.276

Details about the release

Table 13.

Item

Details

Release version

2.276

Release date

24 April, 2025

Docker image ID

Jar file



New features and changes

  • Databricks collector: Now captures table lineage to corresponding AWS S3 objects.

  • Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, SQL Server collectors: Added support for harvesting metadata about table indexes, providing deeper insight into database structure and performance optimization.

Bug fixes

  • Tableau collector: Improved handling of missing connection types on Tableau Server to prevent ingestion errors and ensure smoother metadata extraction.

Release version 2.275

Details about the release

Table 14.

Item

Details

Release version

2.275

Release date

22 April, 2025

Docker image ID

Jar file



New features and changes

  • dbt collector: Transitioned from using the manifest to the catalog to retrieve database identifiers. This resolves issues with quoted identifiers (for example, Snowflake).

  • Athena, AWS Database Migration Service (DMS), Amazon DynamoDB, AWS Glue, Amazon QuickSight, Amazon S3 collectors: The collectors now support multiple authentication methods, offering greater flexibility and compatibility. The following new parameters are introduced for the new authentication methods:

    • Explicitly supplied static credentials authentication using AWS Access Key ID (--aws-access-key-id) and AWS Secret Access Key (--aws-secret-access-key).

  • AWS Database Migration Service (DMS) collector: Introduced support for S3 as both a source and target endpoint.

Bug fixes

  • Tableau collector: Implemented a nonNull filter for relatedFields, addressing errors and improving stability in the Tableau collector.

Release version 2.274

Details about the release

Table 15.

Item

Details

Release version

2.274

Release date

17 April, 2025

Docker image ID

Jar file



Bug fixes

  • Databricks collector: Fixed an issue where table lineage could not be resolved when information was incomplete.

  • Tableau collector: Corrected processing of column fields to resolve subtle errors in field-to-database lineage.

  • Azure Data Factory collector: Improved handling of missing table details when collecting lineage.

  • All collectors: Fixed logging issues where some messages were not being written to the log file as expected.

Release version 2.273

Details about the release

Table 16.

Item

Details

Release version

2.273

Release date

12 April, 2025

Docker image ID

Jar file



Bug fixes

  • Monte Carlo collector: Fixed alignment issues in generated table information from Monte Carlo Monitor by properly escaping pipes ('|') in monitor names.
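In a pipe-delimited table, an unescaped pipe inside a cell is read as a column separator, which shifts every later cell. The sketch below illustrates the escaping described above, assuming a Markdown-style table. The helper names and the monitor name are hypothetical.

```python
from typing import Iterable


def escape_cell(value: str) -> str:
    """Escape pipe characters so they are not read as column separators."""
    return value.replace("|", "\\|")


def table_row(cells: Iterable[str]) -> str:
    """Build one pipe-delimited table row from already-unescaped cell values."""
    return "| " + " | ".join(escape_cell(cell) for cell in cells) + " |"


print(table_row(["orders | daily freshness", "passing"]))
# | orders \| daily freshness | passing |
```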

Release version 2.272

Details about the release

Table 17.

Item

Details

Release version

2.272

Release date

11 April, 2025

Docker image ID

Jar file



New features and changes

  • A new collector, the SAP HANA collector, is now available in public preview.

  • SSIS collector: Now captures the connections between packages through the control flow, improving visibility of package relationships.

  • Databricks collector: The collector now harvests lineage only within the current schema. A new option Harvest entire lineage (--harvest-entire-lineage) is added to enable harvesting lineage from external schemas.

  • Microsoft Fabric collector: Enhanced to support additional syntaxes for Semantic Model table connections to Fabric warehouse/lakehouse resources, broadening compatibility and connectivity.

Bug fixes

  • QlikSense collector: Made performance improvements to increase the efficiency and speed of the collector.

  • Microsoft Fabric collector: Fixed issues with Lakehouses and SQL endpoints to ensure database resources are associated with the correct resource, enhancing accuracy.

Release version 2.271

Details about the release

Table 18.

Item

Details

Release version

2.271

Release date

3 April, 2025

Docker image ID

Jar file



Bug fixes

  • Tableau collector: Fixed an issue to ensure proper handling of sites and projects with large quantities of column and data source fields.

  • Databricks collector: Resolved a parsing issue in view queries where aliases starting with a number caused failures.

Release version 2.270

Details about the release

Table 19.

Item

Details

Release version

2.270

Release date

31 March, 2025

Docker image ID

  • Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

    • amd64: sha256:4ef7ce6fbf19c6ca9be1836d9e7ad11f8846480d037808cf746aff772f5fe9c0

    • arm64: sha256:2f0ac52d9891790444061c718d0b47b808b50d443d1571ec9dbd1426b57b3ffe

Jar file



New features and changes

  • The following two new collectors are now available in public preview.

  • Power BI collector: The collector now harvests column descriptions for Power BI columns.

  • Databricks collector: The collector now supports OAuth service principal authentication for Databricks. Two new parameters, Service principal client ID (--client-id) and Service principal client secret (--client-secret), are introduced for this.

Bug fixes

  • Tableau collector: Fixed an issue with the display of lineage between Tableau Fields and Database Columns, ensuring accurate representation of data relationships.

Release version 2.269

Important

This release was for internal improvements and has no customer-impacting changes.

Details about the release

Table 20.

Item

Details

Release version

2.269

Release date

25 March, 2025

Docker image ID

  • Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

    • amd64: sha256:bf16a4096768878ce956b3d215be58a02a5eec0d422f00ae2605a4d632161269

    • arm64: sha256:402330521664791a905a9d4987b8463a75c844e1de179709f67ac41f98ecb25b

Jar file



Release version 2.268

Warning

Collector versions 2.264 through 2.267 have been deprecated. If you are using these versions, please update to version 2.268 as soon as possible.

Details about the release

Table 21.

Item

Details

Release version

2.268

Release date

24 March, 2025

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • amd64: sha256:71d1c7d7af7e1883ca1285cffc96e7259aeabadc42830511ff014a645de9f5d2

  • arm64: sha256:7912116980fa3e8b1e42e875fef1a57f16888e844afb16fdffdab2539c63d8e3

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.268/dwcc-2.268.zip

  • Sha256: ecd223fac5f28a9a575e9bb1bee71f6a4a2cce46f44830d71c1ee32d3355fed0
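The published Sha256 value can be used to confirm that a downloaded JAR archive is intact. The sketch below shows one way to do that with Python's standard library. The helper names and the local file location are assumptions, and the expected value is the one listed above for version 2.268.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case and spaces."""
    return sha256_of_file(path) == expected_hex.strip().lower()


archive = Path("dwcc-2.268.zip")  # assumed local download location
expected = "ecd223fac5f28a9a575e9bb1bee71f6a4a2cce46f44830d71c1ee32d3355fed0"
if archive.exists():
    ok = matches_published_checksum(archive, expected)
    print("checksum OK" if ok else "checksum MISMATCH")
```

Reading in chunks keeps memory use flat regardless of the archive size.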



New features and changes

  • Snowflake collector: The collector now harvests system tags.

  • SSIS collector: The collector now supports the inclusion or exclusion of specific databases or servers from being harvested. Four new parameters are introduced to use these features: --include-database, --exclude-database, --include-server, --exclude-server.

Bug fixes

  • All collectors: Resolved an issue where collectors created new collections with a new ID, leading to duplicate collections in the catalog.

  • QlikSense collector: Fixed an issue where missing user information in Qlik Sense resulted in an exception trace in the log file.

  • Azure Data Factory collector: Added a log message to indicate when the Dataset API response lacks sufficient information, such as schema and table details, to construct lineage.

Release version 2.267 (deprecated)

Warning

This collector version is deprecated. Please use version 2.268 or higher to receive the latest collector updates.

Details about the release

Table 22.

Item

Details

Release version

2.267

Release date

17 March, 2025

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • amd64: sha256:1a57e39be32b58a8e287a5429491461987fe9108628612fa41ddacdb3d493661

  • arm64: sha256:f38814914fa6330d9c696c38b00e6424b92eb2f540a214bef9233a2ba500f172

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.267/dwcc-2.267.zip

  • Sha256: f5fffc04e80a8a81908ac94b54efd520a0ac9d4324f99e9aa389c1229133b9d9



New features and changes

  • Tableau collector: The way we represent ownership information for workbooks, views, and metrics in the Tableau catalog has been updated. The owner is now represented using the kos:hasOwner property.

    Important

    The former approach using kos:createdBy will continue to be supported during a transition period, but it is deprecated and will be phased out in a future release. This change only impacts users who have written SPARQL queries or exported content using RDF properties. If this applies to you, update your queries to use kos:hasOwner.

Bug fixes

  • Alteryx collector: Increased API call read timeout and improved error handling by capturing and logging processing exceptions.

Release version 2.266 (deprecated)

Warning

This collector version is deprecated. Please use version 2.268 or higher to receive the latest collector updates.

Details about the release

Table 23.

Item

Details

Release version

2.266

Release date

10 March, 2025

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • amd64: sha256:390cd67a5a71c60669e9389310badb17f0a624868c0abd77fe5c8c963acd93f5

  • arm64: sha256:7151d3cd2794676bcd3e05b2ef9260ba2c1d1b78ca9c1ff347312f7178bd11da

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.266/dwcc-2.266.zip

  • Sha256: faa3bdd9ebaf07a1328316d32b474f0d098b4842289d873c4f3985dc51bf1f9a



Bug fixes

  • Confluent collector: The collector now correctly handles cases where consumer member assignments are missing a topic description.

Release version 2.265 (deprecated)

Warning

This collector version is deprecated. Please use version 2.268 or higher to receive the latest collector updates.

Details about the release

Table 24.

Item

Details

Release version

2.265

Release date

10 March, 2025

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • amd64: sha256:9bed576a51eb315ee2b1399d11e0b9fd8d8957792cd577d94cf531fe5d27459f

  • arm64: sha256:d33e1f591f104fea033bf4c8d154c66e08db86131828d815416aedf4de59044e

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.265/dwcc-2.265.zip

  • Sha256: a97b9770a3ca23b8a6974cb6aebd660edc6ea78675bc40fea8d584d81456b33c



Bug fixes

  • Tableau collector: The collector now accurately harvests hidden dashboards. Previously, the feature was limited to hidden Views, which included both Sheets and Dashboards but classified them all as Sheets. With this update, the collector distinguishes between Sheets and Dashboards, assigning the correct type to each entity. This ensures a more accurate representation of hidden Views in Tableau.

Release version 2.264 (deprecated)

Warning

This collector version is deprecated. Please use version 2.268 or higher to receive the latest collector updates.

Details about the release

Table 25.

Item

Details

Release version

2.264

Release date

7 March, 2025

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • amd64: sha256:6e716b6fd9e635a2a54af5743507d932a7aaf003090b06bb5091c782e4bd39e3

  • arm64: sha256:43ecd32f0fe955eab774011ac8aef6c1fd3024087b4c3f810314dc62ce35d8a8

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.264/dwcc-2.264.zip

  • Sha256: 236d5981adbc919337689d1c175ac82c13c315e4b1aeefb6e7ffa476223d8c42



New features and changes

  • A new collector, the AWS Database Migration Service (DMS) collector, is now available in public preview.

  • Tableau collector: Improved command guidance, documentation, and warnings about the required format of the Tableau API URL option.

  • Alteryx collector: The collector now harvests nested workflow nodes and catalogs their relationship with the workflow.

  • Azure Data Factory collector: Made improvements to Azure Data Factory lineage by enhancing the harvesting of lineage from parameterized dataset references. The collector now also harvests both downstream and upstream resources.

  • Oracle collector: Added a new parameter, --autonomous-db-connection-string, for supplying the connection string for an Autonomous Database.

Release version 2.263

Details about the release

Table 26.

Item

Details

Release version

2.263

Release date

3 March, 2025

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • amd64: sha256:40f9a92cd0715afe2d3cf422c6990251ba72c7b540233d47d269a7c2105ebc8c

  • arm64: sha256:3b361d434add509aaeba3abefc44434fcf0d07ed77abdae8f95ae023d8b8f98e

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.263/dwcc-2.263.zip

  • Sha256: 726d229923640d4f50d035cad0b2f09f6de32bda0799468310fa7526456f5762



New features and changes

  • Tableau collector: The collector now harvests relationships between SQL tables and workbooks, enhancing data connectivity and visualization.

  • AWS Glue collector: Enhanced the collector to harvest lineage from Glue Data Catalog tables to their underlying S3 objects and to gather more metadata for tables. The enhanced collector is available with the command catalog-aws-glue, while the legacy collector remains available as catalog-aws-glue-legacy or catalog-awsglue for compatibility. Please work with your Customer Success Director to plan your transition to the new collector version soon.

    Note that the AWS Glue collector is only available as an on-premise solution, not as a cloud collector.

Bug fixes

  • Tableau collector:

    • Resolved an error in harvesting Custom SQL Tables.

    • Fixed an issue with filtering projects by name or ID.

Release notes for previous versions