Catalog collector release notes
Important
Published versions of collectors are available as a Docker image and a JAR file.
Release version 2.251
Details about the release
Item | Details |
---|---|
Release version | 2.251 |
Release date | December 12, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.251/dwcc-2.251.zip |
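The JAR download links in these notes share a visible layout, with the release version appearing twice in the URL. A minimal Python sketch of that layout follows. The function name is illustrative, and the assumption that a given release follows the same layout rests only on the links listed on this page.

```python
# URL layout copied from the JAR download links listed in these release notes.
JAR_URL_TEMPLATE = "https://releases.data.world/dwcc/{version}/dwcc-{version}.zip"


def jar_download_url(version: str) -> str:
    """Return the JAR archive URL for a collector release such as "2.251".

    Assumes the release uses the same URL layout as the links on this page.
    """
    return JAR_URL_TEMPLATE.format(version=version)


print(jar_download_url("2.251"))
```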
New features and changes
Snowflake collector: Enhanced support for harvesting lineage from view select statements that include QUALIFY and TRY_CAST constructs.
Redshift collector: The collector is now able to harvest definitions (DDL) for stored procedures and functions.
All collectors: Improved collector logging so that local and uploaded log file names are consistent. Updated the log file naming conventions to ensure uniqueness and to preserve log files from previous runs when uploading to data.world.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Enhanced support for harvesting lineage from view select statements containing a mix of qualified and unqualified table names.
dbt Core and dbt Cloud collectors: The relationship between DbtSource and its abstracted tables or views is now defined as representsDataSource. This change better reflects the semantics of dbt sources, and improves visualization of lineage relationships in Eureka Explorer.
Tableau collector: Released an updated collector for Tableau, featuring improved detection of lineage relationships to database objects along with stability and performance enhancements. The new collector will co-exist with the legacy version, but we encourage transitioning to the new version to take advantage of these enhancements.
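To illustrate what a mix of qualified and unqualified table names means for lineage, the sketch below expands each table reference to a three-part name by filling in missing parts from a default database and schema. It is a conceptual illustration only, not the collectors' implementation. The helper name and the sample identifiers are hypothetical.

```python
def qualify_table_name(name: str, default_database: str, default_schema: str) -> str:
    """Expand a table reference to database.schema.table form.

    Hypothetical helper: missing leading parts are filled from the defaults.
    Quoted identifiers that contain dots are not handled in this sketch.
    """
    parts = name.split(".")
    if len(parts) == 1:
        # Unqualified: table only.
        return f"{default_database}.{default_schema}.{parts[0]}"
    if len(parts) == 2:
        # Partially qualified: schema.table.
        return f"{default_database}.{parts[0]}.{parts[1]}"
    # Already fully qualified.
    return name


# A view that selects from one fully qualified and one unqualified table.
references = ["SALES.PUBLIC.ORDERS", "CUSTOMERS"]
resolved = [qualify_table_name(ref, "SALES", "PUBLIC") for ref in references]
print(resolved)
```

With both references resolved to the same form, the lineage for each can be attached to a single unambiguous table.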
Release version 2.250
Details about the release
Item | Details |
---|---|
Release version | 2.250 |
Release date | December 7, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.250/dwcc-2.250.zip |
New features and changes
Snowflake collector: Added support for additional SQL syntaxes when parsing statements that include the QUALIFY keyword.
Alteryx collector: Added support for honoring the max job limit option when set to zero.
Bug fixes
Snowflake collector: Resolved an issue that caused duplicate schema processing and duplicate CatalogCuration resources during incremental collection.
Release version 2.249
Details about the release
Item | Details |
---|---|
Release version | 2.249 |
Release date | November 27, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.249/dwcc-2.249.zip |
New features and changes
Snowflake collector:
The collector now harvests dependencies from tables and views.
Made performance enhancements for incremental metadata collections.
PostgreSQL collector: Added support for AWS IAM Authentication tokens for databases hosted on AWS.
Power BI Service and Power BI Gov collectors: Updated to accommodate Microsoft’s transition from dataset to semantic model in all catalog resources emitted by the collector.
Bug fixes
Redshift collector: Corrected handling of lineage harvesting from SQL statements using ARRAY literals.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Improved handling of lineage harvesting from SQL statements containing KEYS as a column name.
Snowflake collector: Fixed issues in harvesting lineage from SQL statements using QUALIFY and COPY GRANTS keywords.
Power BI Service and Power BI Gov collectors: Introduced various improvements to the datasource template.
Release version 2.248
Details about the release
Item | Details |
---|---|
Release version | 2.248 |
Release date | November 19, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.248/dwcc-2.248.zip |
New features and changes
Denodo collector: Improved column fetching by processing one view or table at a time if an issue occurs when retrieving all columns at once.
Fivetran collector: Added support for Salesforce as a source.
Salesforce collector: Added support for harvesting metadata for summary (roll-up) fields.
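The Denodo column-fetching change above describes a bulk-then-individual fallback. A minimal sketch of that pattern follows, assuming hypothetical `fetch_all` and `fetch_one` callables that stand in for the real metadata queries. Skipping a table that still fails individually is an assumption of this sketch, not documented collector behavior.

```python
from typing import Callable, Dict, List

Columns = Dict[str, List[str]]


def fetch_columns_with_fallback(
    tables: List[str],
    fetch_all: Callable[[List[str]], Columns],
    fetch_one: Callable[[str], List[str]],
) -> Columns:
    """Try one bulk column query; on failure, retry one table at a time."""
    try:
        return fetch_all(tables)
    except Exception:
        # Bulk retrieval failed, so fall back to per-table retrieval.
        # Assumption: a table that still fails is skipped, not fatal.
        columns: Columns = {}
        for table in tables:
            try:
                columns[table] = fetch_one(table)
            except Exception:
                continue
        return columns


# Hypothetical stand-ins for real metadata queries.
def failing_bulk_fetch(tables: List[str]) -> Columns:
    raise RuntimeError("bulk column query failed")


def single_fetch(table: str) -> List[str]:
    return {"orders": ["id", "total"], "customers": ["id", "name"]}[table]


columns = fetch_columns_with_fallback(
    ["orders", "customers"], failing_bulk_fetch, single_fetch
)
print(columns)
```

The point of the pattern is that a single problematic view or table no longer prevents column metadata from being collected for the rest.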
Bug fixes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Resolved an issue by clearing lineage resolution cache between runs to avoid conflicts when multiple commands are configured.
Release version 2.247
Details about the release
Item | Details |
---|---|
Release version | 2.247 |
Release date | November 14, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.247/dwcc-2.247.zip |
New features and changes
SQL Server collector: Added support for Active Directory Service Principal and Entra ID authentication.
Amazon S3 collector: Enhanced object filtering based on configuration options to include or exclude object names before checking the maximum resources limit.
AWS Glue collector: Added functionality to catalog partitioned columns.
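The Amazon S3 change above concerns the order of two steps: include and exclude filters run first, and the maximum resources limit is checked afterwards. The sketch below shows why the order matters. It assumes regular-expression filters and simply truncates at the limit. Both are assumptions of this example, since the notes do not state the filter syntax or what the collector does when the limit is exceeded, and the function and parameter names are hypothetical.

```python
import re
from typing import List, Optional


def select_objects(
    names: List[str],
    include: Optional[str] = None,
    exclude: Optional[str] = None,
    max_resources: Optional[int] = None,
) -> List[str]:
    """Filter object names first, then apply the resource limit."""
    selected = []
    for name in names:
        if include is not None and not re.search(include, name):
            continue
        if exclude is not None and re.search(exclude, name):
            continue
        selected.append(name)
    # The limit is applied only to names that survived the filters.
    if max_resources is not None:
        selected = selected[:max_resources]
    return selected


names = ["logs/a.tmp", "data/a.csv", "data/b.csv", "logs/b.tmp"]
print(select_objects(names, exclude=r"\.tmp$", max_resources=2))
```

If the limit were applied before filtering, the first two names would be taken and only `data/a.csv` would remain after the exclude filter, so excluded objects would consume part of the limit.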
Bug fixes
Power BI Service and Power BI Gov collectors: Resolved an issue with the CombineColumns transform step that was causing warnings in some cases due to misalignment in lineage to source columns.
dbt Core collector: Improved handling of situations where the profiles.yml file is mistakenly specified as a directory.
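One plausible way to handle a profiles.yml location that was given as a directory is to look for the file inside that directory. The sketch below shows that check with `pathlib`. The notes do not say whether the collector resolves the file this way or only reports a clearer error, so treat the behavior and the function name as assumptions.

```python
from pathlib import Path


def resolve_profiles_file(path_str: str) -> Path:
    """Return the path to profiles.yml, accepting a file or its directory.

    Hypothetical helper: if the given path is an existing directory,
    the file is assumed to be named profiles.yml inside it.
    """
    path = Path(path_str)
    if path.is_dir():
        return path / "profiles.yml"
    return path


# The current directory exists, so the file name is appended to it.
print(resolve_profiles_file("."))
```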
Release version 2.246
Details about the release
Item | Details |
---|---|
Release version | 2.246 |
Release date | November 7, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.246/dwcc-2.246.zip |
New features and changes
Salesforce collector: The security token configuration option is no longer required, as there are scenarios where authentication to the Salesforce API does not require it.
Bug fixes
Power BI Service and Power BI Gov collectors: Fixed a problem that occurred when replacing parameters in custom SQL if the parameter name contained a special character.
dbt Cloud collector: Fixed an issue in which the user-specified Snowflake account override was being ignored by the collector.
Release version 2.245
Details about the release
Item | Details |
---|---|
Release version | 2.245 |
Release date | November 3, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.245/dwcc-2.245.zip |
New features and changes
Reltio collector: The collector now supports client_credentials authentication, in addition to the existing user and password authentication.
Release version 2.244
Details about the release
Item | Details |
---|---|
Release version | 2.244 |
Release date | October 31, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.244/dwcc-2.244.zip |
New features and changes
Denodo collector: The collector now harvests primary and foreign keys by schema to improve performance.
Databricks collector: Introduced a new parameter, --exclude-system-functions, which lets the collector exclude built-in Databricks system functions from harvesting.
Power BI Service and Power BI Gov collectors: The collectors now catalog the following additional properties:
Dataset: Created date, Created by
Dataflow: Created by
Report: Created date, Last modified, Created by, Last modified by
Oracle collector: Modified the --linked-host option to require the database name for a linked database, as there was no reliable way to query this through the link.
Bug fixes
Power BI Service and Power BI Gov collectors: Fixed an issue with parsing parameters in SQL to account for the use of Text.From().
Release version 2.243
Details about the release
Item | Details |
---|---|
Release version | 2.243 |
Release date | October 25, 2024 |
Docker image ID | |
Jar file | |
New features and changes
The following two new collectors are now available in private preview. If you would like access to these collectors, please contact your Customer Success Director.
Azure Data Lake Storage Gen2 collector: Enabled harvesting of Azure blob storage. Two new parameters are introduced as a result of this change: --disable-adls-gen2 and --disable-azure-blob-storage.
PowerBI collector:
The collector now supports parameters within SQL queries.
Added support for Table.DuplicateColumn and Table.CombineColumns table transformations.
ADLS (Azure Data Lake Storage) Gen2 collector: The collector now supports harvesting Azure blob storage.
Redshift collector: Added support for harvesting materialized views.
Databricks collector: The collector now skips collection of extended table metadata for the system database to ensure stable runs.
Bug fixes
Monte Carlo collector: Resolved a null pointer error caused by unexpected null values.
Release version 2.242
Details about the release
Item | Details |
---|---|
Release version | 2.242 |
Release date | October 19, 2024 |
Docker image ID | |
Jar file | |
New features and changes
PowerBI collector: The collector now supports parsing custom SQL queries for Databricks sources.
Bug fixes
Oracle collector:
Fixed an issue with lineage relationships for synonyms that refer to tables in a different schema.
Fixed an issue with determining the database name for cross-system or cross-database lineage.
The collector now properly resolves lineage for views that use database links.
Databricks collector:
Fixed an issue with table names containing non-alphanumeric characters.
Resolved an error caused by references to non-existent tables in SQL.
SQL Server Integration Services (SSIS) collector: Fixed an issue with missing file connection strings.
Release version 2.240
Details about the release
Item | Details |
---|---|
Release version | 2.240 |
Release date | October 15, 2024 |
Docker image ID | |
Jar file | |
New features and changes
SQL Server collector: Added support for parsing WITH TIES and TRY_CAST statements.
PowerBI collector: Added an optional starter datasource YAML file required for harvesting lineage. If the YAML file is missing or incomplete, a template will be generated in the collector logs for easy reference.
Bug fixes
All collectors: Fixed an issue with detecting the existence of configured datasets.
Release version 2.239
Details about the release
Item | Details |
---|---|
Release version | 2.239 |
Release date | October 3, 2024 |
Docker image ID | |
Jar file | |
New features and changes
SQL Server collector now catalogs:
Dependencies between database resources.
Lineage for stored procedures.
Minimal available lineage for views when the SQL query fails to parse.
PowerBI collector: Improved data source titles for better specificity.
Salesforce collector: Optimized the process of fetching authentication tokens to reduce the number of Salesforce API calls.
Bug fixes
Sigma collector: Fixed an issue with API structure changes due to pagination limits.
Redshift collector: Resolved an issue where the collector incorrectly reported that the provided credential did not have access to the database tables being harvested.
Release version 2.238
Details about the release
Item | Details |
---|---|
Release version | 2.238 |
Release date | September 25, 2024 |
Docker image ID | |
Jar file | |
New features and changes
Snowflake collector:
Added options to harvest all tags and policies within Snowflake, regardless of the specified database.
The collector now performs normal collection when incremental mode is requested with incompatible options.
Denodo collector:
The collector now preprocesses view definitions to remove derived view descriptions.
Added a fallback when SQL parsing fails.
SQL Server collector: Added support for lineage in stored procedures that do not use BEGIN/END statements in the procedure body.
dbt Cloud collector: The collector now accommodates large numeric values for account, project, and run identifiers.
Bug fixes
Redshift collector: Fixed an issue with view parsing for lineage when no schema binding is present.
Release version 2.237
Details about the release
Item | Details |
---|---|
Release version | 2.237 |
Release date | September 18, 2024 |
Docker image ID | |
Jar file | |
New features and changes
The following two new collectors are now available in private preview. If you would like access to these collectors, please contact your Customer Success Director.
The Amazon QuickSight collector is now generally available to all customers.
Bug fixes
Power BI Report Server (PBIRS) collector: Fixed an issue that occurred when a collector resource does not have a parent folder.
Release version 2.236
Details about the release
Item | Details |
---|---|
Release version | 2.236 |
Release date | September 18, 2024 |
Docker image ID | |
Jar file | |
New features and changes
Oracle collector: The collector now supports cataloging lineage when database links are used in view, stored procedure, and function definitions.
Bug fixes
Snowflake collector: The collector now properly handles comments in view SQL definitions during lineage parsing.
Sigma collector: Resolved an error caused by changes in the API response structure when the number of records exceeds the default pagination limit.
Release version 2.235
Details about the release
Item | Details |
---|---|
Release version | 2.235 |
Release date | September 10, 2024 |
Docker image ID | |
Jar file | |
New features and changes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: The collectors now support harvesting decimal digits metadata for columns.
Release version 2.234
Details about the release
Item | Details |
---|---|
Release version | 2.234 |
Release date | September 6, 2024 |
Docker image ID | |
Jar file | |
Bug fixes
Generic JDBC, Denodo, MySQL, SQL Server collectors: Resolved an arithmetic overflow error when converting expressions to data type bigint.
SSIS collector: Added database location information when creating a database asset.
Release notes for previous versions
Go here to access release notes for previous versions.