Catalog collector release notes

Important

Published versions of collectors are available as a Docker image and a JAR file.

Release version 2.251

Details about the release

Table 1.

Item

Details

Release version

2.251

Release date

December 12, 2024

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • arm64: sha256:53b6ad3101a4dac888f230813161fad176edc5779ea54535991cf926704c265b

  • amd64: sha256:beffef2f05d41abc0bf60e5db8b5d80973ca144400c2c2bed621b12625b8c870

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.251/dwcc-2.251.zip

  • Sha256: d0cdb4eb89bede779fa0c5285b2af620523b8cfe503361cad23027ed6e542ede
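
The published Sha256 values can be used to check a downloaded archive before running it. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not part of the collector or any data.world tooling.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: Path, published: str) -> bool:
    """Compare a file's digest with a published value.

    Case and an optional "sha256:" prefix in the published value are ignored.
    """
    expected = published.strip().lower().removeprefix("sha256:")
    return sha256_of_file(path) == expected
```

For example, after downloading dwcc-2.251.zip, calling matches_published_checksum(Path("dwcc-2.251.zip"), "d0cdb4eb89bede779fa0c5285b2af620523b8cfe503361cad23027ed6e542ede") should return True if the file is intact. The sketch requires Python 3.9 or later because of str.removeprefix.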



New features and changes

  • Snowflake collector: Enhanced support for harvesting lineage from view select statements that include QUALIFY and TRY_CAST constructs.

  • Redshift collector: The collector is now able to harvest definitions (DDL) for stored procedures and functions.

  • All collectors: Improved collector logging so that local and uploaded log file names are more consistent. Updated log file naming conventions to ensure uniqueness and to preserve log files from previous runs when uploading to data.world.

  • Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Enhanced support for harvesting lineage from view select statements containing a mix of qualified and unqualified table names.

  • dbt Core and dbt Cloud collectors: The relationship between DbtSource and its abstracted tables or views is now defined as representsDataSource. This change better reflects the semantics of dbt sources, and improves visualization of lineage relationships in Eureka Explorer.

  • Tableau collector: Released an updated collector for Tableau, featuring improved detection of lineage relationships to database objects along with stability and performance enhancements. While the new collector will co-exist with the legacy version, we encourage transitioning to the new version to take advantage of these enhancements.
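
To illustrate what a mix of qualified and unqualified table names means for lineage, the sketch below resolves each table reference in a view to a full database.schema.table name. It is an illustration only, not the collectors' implementation: the function name and the defaulting rules are assumptions, and quoted identifiers that contain dots are not handled.

```python
def qualify_table_name(name: str, default_database: str, default_schema: str) -> str:
    """Expand a table reference to database.schema.table.

    Assumption: missing parts are filled in from the database and schema
    of the view that contains the reference.
    """
    parts = name.split(".")
    if len(parts) == 1:
        # Unqualified: "orders"
        return f"{default_database}.{default_schema}.{parts[0]}"
    if len(parts) == 2:
        # Schema-qualified: "sales.orders"
        return f"{default_database}.{parts[0]}.{parts[1]}"
    if len(parts) == 3:
        # Fully qualified: "prod.sales.orders"
        return name
    raise ValueError(f"Unsupported table reference: {name!r}")


# A view that selects from both "orders" and "prod.sales.customers"
# yields two fully qualified lineage sources.
sources = [
    qualify_table_name(ref, "prod", "sales")
    for ref in ("orders", "prod.sales.customers")
]
```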

Release version 2.250

Details about the release

Table 2.

Item

Details

Release version

2.250

Release date

December 7, 2024

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • arm64: ce6b7554c41b9e1a5c3dbaac8670882945a81e6e8770e3d71d730cb607d64baf

  • amd64: 8196a7fcde7d60b9c4970d6662921fc6dd5135a35b4685093f80c2208167b697

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.250/dwcc-2.250.zip

  • Sha256: d002e0aed3d57dee0a7ec8b289389eeb67af3ca44214d3f2e1fdd4f2864e6528



New features and changes

  • Snowflake collector: Added support for additional SQL syntaxes when parsing statements that include the QUALIFY keyword.

  • Alteryx collector: Added support for honoring the max job limit option when set to zero.

Bug fixes

  • Snowflake collector: Resolved an issue that caused duplicate schema processing and duplicate CatalogCuration resources during incremental collection.

Release version 2.249

Details about the release

Table 3.

Item

Details

Release version

2.249

Release date

November 27, 2024

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • arm64: sha256:1291bc01e591438370cc19146d5a43798a831ebc961fa3675dd674efaa8357a7

  • amd64: sha256:9ac5378476c0901b41a55f9d1a2d68e568b2882958191f3961dfe4ee3a93c27b

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.249/dwcc-2.249.zip

  • Sha256: 02d3927a6aea2441256dc8bd2f1066dc3770c3fa60ebaaaedb581c917d1a87e8



New features and changes

  • Snowflake collector:

    • The collector now harvests dependencies from tables and views.

    • Made performance enhancements for incremental metadata collections.

  • PostgreSQL collector: Added support for AWS IAM Authentication tokens for databases hosted on AWS.

  • Power BI Service and Power BI Gov collectors: Updated to accommodate Microsoft’s transition from dataset to semantic model in all catalog resources emitted by the collector.

Bug fixes

  • Redshift collector: Corrected handling of lineage harvesting from SQL statements using ARRAY literals.

  • Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Improved handling of lineage harvesting from SQL statements containing KEYS as a column name.

  • Snowflake collector: Fixed issues in harvesting lineage from SQL statements using QUALIFY and COPY GRANTS keywords.

  • Power BI Service and Power BI Gov collectors: Introduced various improvements to the datasource template.

Release version 2.248

Details about the release

Table 4.

Item

Details

Release version

2.248

Release date

November 19, 2024

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • arm64: sha256:96d5eaa7b377b459ba0f7c017d061c12df565cfa851b5f6b8024e98506d0b1c4

  • amd64: sha256:635a46171728ebc41c2a32a29996191c089aa39baa030efc72cd66bb5fef0bc3

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.248/dwcc-2.248.zip

  • Sha256: ad1e2adaced3304b1f1e860dcb8b9414ceea97ed75a43fca682933db9f376880



New features and changes

  • Denodo collector: Improved column fetching by processing one view or table at a time if an issue occurs when retrieving all columns at once.

  • Fivetran collector: Added support for Salesforce as a source.

  • Salesforce collector: Added support for harvesting metadata for summary (roll-up) fields.
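
The Denodo change above follows a common pattern: attempt one bulk request, and fall back to per-item requests only if the bulk request fails. The sketch below shows that pattern in general terms. The function and parameter names are hypothetical, and recording per-item failures instead of aborting is an assumption, not documented collector behavior.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Columns = Dict[str, List[str]]


def fetch_columns_with_fallback(
    tables: Sequence[str],
    fetch_all: Callable[[Sequence[str]], Columns],
    fetch_one: Callable[[str], List[str]],
) -> Tuple[Columns, List[str]]:
    """Try one bulk request; on failure, retry one table or view at a time."""
    try:
        return dict(fetch_all(tables)), []
    except Exception:
        pass  # Bulk retrieval failed; fall through to per-item retrieval.

    columns: Columns = {}
    failed: List[str] = []
    for table in tables:
        try:
            columns[table] = fetch_one(table)
        except Exception:
            # Assumption: note the failing item and continue with the rest.
            failed.append(table)
    return columns, failed
```

With this shape, a single problematic view no longer prevents columns from being collected for every other view and table.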

Bug fixes

  • Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Resolved conflicts that occurred when multiple commands were configured by clearing the lineage resolution cache between runs.

Release version 2.247

Details about the release

Table 5.

Item

Details

Release version

2.247

Release date

November 14, 2024

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • arm64: 24525b02a7d7eb7230d441affc7cd0b4fb23fe4f6135bf55b86960761949ca60

  • amd64: 621f9d5b2db19b1b03201cb3df75ae49f45797a9dac26d0aa999061d5878b191

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.247/dwcc-2.247.zip

  • Sha256: dc0b72f8053022b9973f43ce0c01e35fb0f444e4a2a21d36d4fe1597078a1c98



New features and changes

  • SQL Server collector: Added support for Active Directory Service Principal and Entra ID authentication.

  • Amazon S3 collector: Enhanced object filtering based on configuration options to include or exclude object names before checking the maximum resources limit.

  • AWS Glue collector: Added functionality to catalog partitioned columns.
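
The Amazon S3 change above is about ordering: include and exclude filters are applied first, and only the objects that pass them count toward the maximum resources limit. The sketch below illustrates that ordering. The function name and the use of glob-style patterns are assumptions made for the example; the collector's actual option names and pattern syntax are not shown here.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence


def select_objects(
    names: Iterable[str],
    include: Sequence[str],
    exclude: Sequence[str],
    max_resources: int,
) -> List[str]:
    """Filter object names first, then apply the limit to what remains."""
    selected: List[str] = []
    for name in names:
        # An empty include list means "include everything".
        if include and not any(fnmatchcase(name, pattern) for pattern in include):
            continue
        if any(fnmatchcase(name, pattern) for pattern in exclude):
            continue
        # Only objects that passed both filters count toward the limit.
        if len(selected) >= max_resources:
            break
        selected.append(name)
    return selected


names = ["tmp/a.csv", "data/b.csv", "data/c.json", "data/d.csv"]
print(select_objects(names, ["data/*"], ["*.json"], 2))
# prints ['data/b.csv', 'data/d.csv']
```

If the limit were checked before filtering, the first two names would use up the budget and only data/b.csv would be selected.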

Bug fixes

  • Power BI Service and Power BI Gov collectors: Resolved an issue with the CombineColumns transform step that was causing warnings in some cases due to misalignment in lineage to source columns.

  • dbt Core collector: Improved handling of situations where the profiles.yml file is mistakenly specified as a directory.

Release version 2.246

Details about the release

Table 6.

Item

Details

Release version

2.246

Release date

November 7, 2024

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • arm64: sha256:668965157334c401364646dd200b6c30d8373cd26ca095e046e36421816d9e80

  • amd64: sha256:3ad5c152a6196925ee87450b54f02232797e5e2bbf21d2232067283186c9ad0a

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.246/dwcc-2.246.zip

  • Sha256: 642c91d5eab913cdda14350d0a70dc3ae098ea5bfeca9e755ae393c15f251a43



New features and changes

  • Salesforce collector: The security token configuration option is no longer required, as there are scenarios where authentication to the Salesforce API does not require it.

Bug fixes

  • Power BI Service and Power BI Gov collectors: Fixed a problem that occurred when replacing parameters in custom SQL if the parameter name contained a special character.

  • dbt Cloud collector: Fixed an issue in which the user-specified Snowflake account override was being ignored by the collector.
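
Parameter names with special characters are a frequent source of bugs when a name is used inside a regular expression without escaping. The release notes do not state the root cause of the Power BI fix above, so the sketch below only illustrates the general technique. The {name} placeholder syntax and the function name are hypothetical.

```python
import re
from typing import Mapping


def replace_parameters(sql: str, parameters: Mapping[str, str]) -> str:
    """Replace {name} placeholders, treating each name as literal text."""
    for name, value in parameters.items():
        # re.escape keeps characters such as "(", "+", or "." from being
        # interpreted as regular expression syntax.
        pattern = re.compile(r"\{" + re.escape(name) + r"\}")
        # A function replacement inserts the value verbatim, so backslashes
        # in the value are not treated as group references.
        sql = pattern.sub(lambda _match, value=value: value, sql)
    return sql


query = "SELECT * FROM sales WHERE region = '{Region(EU+US)}'"
print(replace_parameters(query, {"Region(EU+US)": "EMEA"}))
# prints SELECT * FROM sales WHERE region = 'EMEA'
```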

Release version 2.245

Details about the release

Table 7.

Item

Details

Release version

2.245

Release date

November 3, 2024

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • arm64: sha256:79ce38d27226e8a1967c642c16e368ac5e32fa28f7fee9d7720aabe0915e5406

  • amd64: sha256:d7a98dd219fb150fc373132c081d7a7a2adcd3a56a2b291e8e105c9d82ade8ba

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.245/dwcc-2.245.zip

  • Sha256: 119aa2eeae911beb952914240bcfbb4c7f71a45d5e45f7fe63fbad87b4a89cc7



New features and changes

  • Reltio collector: The collector now supports client_credentials authentication, in addition to the existing user and password authentication.

Release version 2.244

Details about the release

Table 8.

Item

Details

Release version

2.244

Release date

October 31, 2024

Docker image ID

Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

  • arm64: 3d6659738cf9c50a1f96c1939c200d46eb2706d2e9f433b45136aa09a2cca583

  • amd64: 78b139870c93b1f6fa1ea9626e69f6dff3aa7d6bd055be5afaa35b223cb56254

Jar file

Link to download the JAR file: https://releases.data.world/dwcc/2.244/dwcc-2.244.zip

  • Sha256: fd205d477baf0bd987f7c4efa03825f11e2fed09cc49fcabb47d1f853665deaf



New features and changes

  • Denodo collector: The collector now harvests primary and foreign keys by schema to improve performance.

  • Databricks collector: Introduced a new parameter, --exclude-system-functions, which allows the collector to exclude built-in Databricks system functions from harvesting.

  • Power BI Service and Power BI Gov collectors: The collectors now catalog the following additional properties:

    • Dataset: Created date, Created by

    • Dataflow: Created by

    • Report: Created date, Last modified, Created by, Last modified by

  • Oracle collector: Modified the --linked-host option to require the database name for a linked database, as there was no reliable way to query this through the link.

Bug fixes

  • Power BI Service and Power BI Gov collectors: Fixed an issue with parsing parameters in SQL to account for the use of Text.From().

Release version 2.243

Details about the release

Table 9.

Item

Details

Release version

2.243

Release date

October 25, 2024

Docker image ID

Jar file



New features and changes

  • The following two new collectors are now available in private preview. If you would like access to these collectors, please contact your Customer Success Director.

  • Azure Data Lake Storage Gen2 collector: Enabled harvesting of Azure blob storage. Two new parameters are introduced as a result of this change: --disable-adls-gen2 and --disable-azure-blob-storage.

  • Power BI collector:

    • The collector now supports parameters within SQL queries.

    • Added support for Table.DuplicateColumn and Table.CombineColumns table transformations.

  • ADLS (Azure Data Lake Storage) Gen2 collector: The collector now supports harvesting Azure blob storage.

  • Redshift collector: Added support for harvesting materialized views.

  • Databricks collector: The collector now skips extended table metadata for the system database to ensure stable runs.

Bug fixes

  • Monte Carlo collector: Resolved a null pointer error caused by unexpected null values.

Release version 2.242

Details about the release

Table 10.

Item

Details

Release version

2.242

Release date

October 19, 2024

Docker image ID

  • Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

    • arm64: sha256:b0085f078adbd5b9173e5a2c1ca795a227a6a83ea5d86e4553832b98ca225f7c

    • amd64: sha256:6b080dfbe1b8697e0b8da79a09f9f1c92e660fe1b4a7a5f971fe7ea4207e6ec2

Jar file



New features and changes

  • Power BI collector: The collector now supports parsing custom SQL queries for Databricks sources.

Bug fixes

  • Oracle collector:

    • Fixed an issue with lineage relationships for synonyms that refer to tables in a different schema.

    • Fixed an issue with determining the database name for cross-system or cross-database lineage.

    • The collector now properly resolves lineage for views that use database links.

  • Databricks collector:

    • Fixed an issue with table names containing non-alphanumeric characters.

    • Resolved an error caused by references to non-existent tables in SQL.

  • SQL Server Integration Services (SSIS) collector: Fixed an issue with missing file connection strings.

Release version 2.240

Details about the release

Table 11.

Item

Details

Release version

2.240

Release date

October 15, 2024

Docker image ID

Jar file



New features and changes

  • SQL Server collector: Added support for parsing WITH TIES and TRY_CAST statements.

  • Power BI collector: Added an optional starter datasource YAML file required for harvesting lineage. If the YAML file is missing or incomplete, a template will be generated in the collector logs for easy reference.

Bug fixes

  • All collectors: Fixed an issue with detecting the existence of configured datasets.

Release version 2.239

Details about the release

Table 12.

Item

Details

Release version

2.239

Release date

October 3, 2024

Docker image ID

Jar file



New features and changes

  • SQL Server collector now catalogs:

    • Dependencies between database resources.

    • Lineage for stored procedures.

    • Minimal available lineage for views when the SQL query fails to parse.

  • Power BI collector: Improved data source titles for better specificity.

  • Salesforce collector: Optimized the process of fetching authentication tokens to reduce the number of Salesforce API calls.

Bug fixes

  • Sigma collector: Fixed an issue with API structure changes due to pagination limits.

  • Redshift collector: Resolved an issue where the collector incorrectly reported that the provided credential did not have access to the database tables being harvested.

Release version 2.238

Details about the release

Table 13.

Item

Details

Release version

2.238

Release date

September 25, 2024

Docker image ID

Jar file



New features and changes

  • Snowflake collector:

    • Added options to harvest all tags and policies within Snowflake, regardless of the specified database.

    • The collector now performs normal collection when incremental mode is requested with incompatible options.

  • Denodo collector:

    • The collector now preprocesses view definitions to remove derived view descriptions.

    • Added a fallback when SQL parsing fails.

  • SQL Server collector: Added support for lineage in stored procedures that do not use BEGIN/END statements in the procedure body.

  • dbt Cloud collector: The collector now accommodates large numeric values for account, project, and run identifiers.

Bug fixes

  • Redshift collector: Fixed an issue with view parsing for lineage when no schema binding is present.

Release version 2.237

Details about the release

Table 14.

Item

Details

Release version

2.237

Release date

September 18, 2024

Docker image ID

Jar file



New features and changes

Bug fixes

  • Power BI Report Server (PBIRS) collector: Fixed an issue that occurred when a resource does not have a parent folder.

Release version 2.236

Details about the release

Table 15.

Item

Details

Release version

2.236

Release date

September 18, 2024

Docker image ID

Jar file



New features and changes

  • Oracle collector: The collector now supports cataloging lineage when database links are used in view, stored procedure, and function definitions.

Bug fixes

  • Snowflake collector: The collector now properly handles comments in view SQL definitions during lineage parsing.

  • Sigma collector: Resolved an error caused by changes in the API response structure when the number of records exceeds the default pagination limit.
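
Errors like the Sigma one above typically appear when a client reads only the first page of a paginated response. The sketch below shows a generic loop that follows a continuation token until no pages remain. The "entries" and "nextPage" field names and the fetch_page callable are assumptions for illustration and are not taken from the collector or from the Sigma API reference.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iterate_records(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Any]:
    """Yield records from every page, following the continuation token."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        # Tolerate a page that carries no records at all.
        yield from page.get("entries", [])
        token = page.get("nextPage")
        if not token:
            break  # No continuation token: this was the last page.
```

Here fetch_page stands in for whatever function performs the HTTP request; it receives None for the first page and the previous page's token afterwards.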

Release version 2.235

Details about the release

Table 16.

Item

Details

Release version

2.235

Release date

September 10, 2024

Docker image ID

Jar file



New features and changes

  • Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: The collectors now support harvesting decimal digits metadata for columns.

Release version 2.234

Details about the release

Table 17.

Item

Details

Release version

2.234

Release date

September 6, 2024

Docker image ID

  • Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags

    • arm64: sha256:aba0a42c93f9ed4dc73f4290f4d1c917f6f2d2ad92cb21bc94fcbf6b18fc5af4

    • amd64: sha256:bbfb88b4665d0b1a72b54f600e829a16e104db9e4640b1b74bbb90d970a98601

Jar file



Bug fixes

  • Generic JDBC, Denodo, MySQL, SQL Server collectors: Resolved arithmetic overflow error when converting expressions to data type bigint.

  • SSIS collector: Added database location information when creating a database asset.

Release notes for previous versions