Release notes for previous versions
Release version 2.270
Details about the release
Item | Details |
---|---|
Release version | 2.270 |
Release date | 31 March, 2025 |
Docker image ID | |
Jar file | |
New features and changes
The following two new collectors are now available in public preview.
Power BI collector: The collector now harvests column descriptions for Power BI columns.
Databricks collector: The collector now supports OAuth service principal authentication for Databricks. Two new parameters, Service principal client ID (--client-id) and Service principal client secret (--client-secret), are introduced for this.
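As a rough illustration of how the two new parameters might be supplied, the sketch below reads the service principal credentials from environment variables so that the secret is not hard-coded in a script. Only --client-id and --client-secret come from these notes; the helper name and the environment variable names are hypothetical, and the rest of the collector command is intentionally left out.

```python
import os

def databricks_oauth_args(env=None):
    """Return the OAuth service principal options as a list of CLI arguments."""
    env = os.environ if env is None else env
    # Hypothetical variable names, chosen only for this sketch.
    client_id = env.get("DATABRICKS_CLIENT_ID")
    client_secret = env.get("DATABRICKS_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise ValueError("both the client ID and the client secret must be set")
    return ["--client-id", client_id, "--client-secret", client_secret]
```

The returned list can be appended to whatever argument list you already pass to the collector.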
Bug fixes
Tableau collector: Fixed an issue with the display of lineage between Tableau Fields and Database Columns, ensuring accurate representation of data relationships.
Release version 2.269
Important
This release was for internal improvements and has no customer impacting changes.
Details about the release
Item | Details |
---|---|
Release version | 2.269 |
Release date | 25 March, 2025 |
Docker image ID | |
Jar file | |
Release version 2.268
Warning
Collector versions 2.264 through 2.267 have been deprecated. If you are using these versions, please update to version 2.268 as soon as possible.
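If you track collector versions in scripts, a check along the following lines can flag the deprecated range named in the warning above. This is a minimal sketch: the helper names are hypothetical, and the download URL pattern is inferred from the Jar file links listed in these notes, so it may not hold for every release.

```python
# Versions named as deprecated in the warning above.
DEPRECATED_MINORS = range(264, 268)  # 2.264 through 2.267 inclusive
MINIMUM_RECOMMENDED = "2.268"

def parse_version(version):
    """Split a 'major.minor' version string into a tuple of integers."""
    major, minor = version.split(".")
    return int(major), int(minor)

def is_deprecated(version):
    """Return True when the version falls in the deprecated 2.264-2.267 range."""
    major, minor = parse_version(version)
    return major == 2 and minor in DEPRECATED_MINORS

def jar_url(version):
    """Build a JAR download URL using the pattern seen in these notes (assumed)."""
    return f"https://releases.data.world/dwcc/{version}/dwcc-{version}.zip"
```

For example, a deployment script could call `is_deprecated` on the configured version and, when it returns True, point the operator at `jar_url(MINIMUM_RECOMMENDED)`.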
Details about the release
Item | Details |
---|---|
Release version | 2.268 |
Release date | 24 March, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.268/dwcc-2.268.zip |
New features and changes
Snowflake Collector: The collector now harvests system tags.
SSIS collector: The collector now supports including or excluding specific databases or servers from harvesting. Four new parameters are introduced for this: --include-database, --exclude-database, --include-server, --exclude-server.
Bug fixes
All collectors: Resolved an issue where collectors created new collections with a new ID, leading to duplicate collections in the catalog.
Qlik Sense collector: Fixed an issue where missing user information in Qlik Sense resulted in an exception trace in the logfile.
Azure Data Factory collector: Added a log message to indicate when the Dataset API response lacks sufficient information, such as schema and table details, to construct lineage.
Release version 2.267 (deprecated)
Warning
This collector version is deprecated. Please use version 2.268 or higher to receive the latest collector updates.
Details about the release
Item | Details |
---|---|
Release version | 2.267 |
Release date | 17 March, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.267/dwcc-2.267.zip |
New features and changes
Tableau collector: The way we represent ownership information for workbooks, views, and metrics in the Tableau catalog has been updated. The owner is now represented using the kos:hasOwner property.
Important
The former approach using kos:createdBy will continue to be supported during a transition period, but it is deprecated and will be phased out in a future release. This change only impacts users who have written SPARQL queries or exported content using RDF properties. Update your queries to reflect this change.
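To help locate queries affected by the ownership change, the sketch below scans SPARQL query text for the prefixed name kos:createdBy and proposes a version that uses kos:hasOwner instead. It is only a starting point under stated assumptions: the helper names are hypothetical, it handles the prefixed form only (not a full IRI), and kos:createdBy may still be the right property for resources other than Tableau workbooks, views, and metrics, so review each suggested change before saving it.

```python
import re

# Matches the prefixed name kos:createdBy as a whole token.
CREATED_BY = re.compile(r"\bkos:createdBy\b")

def uses_deprecated_owner(query):
    """Return True if the query text references kos:createdBy."""
    return bool(CREATED_BY.search(query))

def migrate_owner_property(query):
    """Return the query text with kos:createdBy replaced by kos:hasOwner."""
    return CREATED_BY.sub("kos:hasOwner", query)
```

For instance, `migrate_owner_property("?wb kos:createdBy ?owner .")` yields `"?wb kos:hasOwner ?owner ."`.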
Bug fixes
Alteryx collector: Increased API call read timeout and improved error handling by capturing and logging processing exceptions.
Release version 2.266 (deprecated)
Warning
This collector version is deprecated. Please use version 2.268 or higher to receive the latest collector updates.
Details about the release
Item | Details |
---|---|
Release version | 2.266 |
Release date | 10 March, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.266/dwcc-2.266.zip |
Bug fixes
Confluent collectors: The collector now correctly handles cases where consumer member assignments are missing a topic description.
Release version 2.265 (deprecated)
Warning
This collector version is deprecated. Please use version 2.268 or higher to receive the latest collector updates.
Details about the release
Item | Details |
---|---|
Release version | 2.265 |
Release date | 10 March, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.265/dwcc-2.265.zip |
Bug fixes
Tableau collector: The collector now accurately harvests hidden dashboards. Previously, the feature was limited to hidden Views, which included both Sheets and Dashboards but classified them all as Sheets. With this update, the collector distinguishes between Sheets and Dashboards, assigning the correct type to each entity. This ensures a more accurate representation of hidden Views in Tableau.
Release version 2.264 (deprecated)
Warning
This collector version is deprecated. Please use version 2.268 or higher to receive the latest collector updates.
Details about the release
Item | Details |
---|---|
Release version | 2.264 |
Release date | 7 March, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.264/dwcc-2.264.zip |
New features and changes
A new collector, the AWS Database Migration Service (DMS) collector, is now available in public preview.
Tableau collector: Improved command guidance, documentation, and warnings about the required format of the Tableau API URL option.
Alteryx collector: The collector now harvests nested workflow nodes and catalogs their relationship with the workflow.
Azure Data Factory collector: Made improvements to Azure Data Factory lineage by enhancing the harvesting of lineage from parameterized dataset references. The collector now also harvests both downstream and upstream resources.
Oracle collector: Added a new parameter, --autonomous-db-connection-string, for specifying the connection string for an autonomous DB.
Release version 2.263
Details about the release
Item | Details |
---|---|
Release version | 2.263 |
Release date | 3 March, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.263/dwcc-2.263.zip |
New features and changes
Tableau collector: The collector now harvests relationships between SQL tables and workbooks, enhancing data connectivity and visualization.
AWS Glue collector: Enhanced the collector to harvest lineage from Glue Data Catalog tables to their underlying S3 objects and gather more metadata for tables. The enhanced collector is available with the command catalog-aws-glue, while the legacy collector remains available as catalog-aws-glue-legacy or catalog-awsglue for compatibility. Please coordinate with your Customer Success Director to plan a smooth transition to the new collector.
Note that the AWS Glue collector is only available as an on-premise solution, not as a cloud collector.
Bug fixes
Tableau collector:
Resolved an error in harvesting Custom SQL Tables.
Fixed an issue with filtering projects by name or ID.
Release version 2.262
Details about the release
Item | Details |
---|---|
Release version | 2.262 |
Release date | 26 February, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.262/dwcc-2.262.zip |
Bug fixes
Tableau collector:
Fixed an issue with URL encoding.
The collector now properly handles server errors in GraphQL pagination.
Databricks collector: Fixed an issue where a null pointer exception occurred while harvesting tags from Databricks.
Release version 2.261
Details about the release
Item | Details |
---|---|
Release version | 2.261 |
Release date | 19 February, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.261/dwcc-2.261.zip |
Bug fixes
Databricks collector: Fixed an exception that occurred when the table type did not match any known types.
Fivetran collector: Updated to use new APIs for retrieving column lineage due to changes in Fivetran API.
Important
Update your collector configurations to the latest version to seamlessly view column lineage without disruptions.
Release version 2.260
Details about the release
Item | Details |
---|---|
Release version | 2.260 |
Release date | 13 February, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.260/dwcc-2.260.zip |
New features and changes
Snowflake collector: Added support for parsing SQL that utilizes IDENTIFIER() function calls when passing exact table names.
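To make the IDENTIFIER() case concrete, the sketch below shows one simple way a SQL pre-processing step could resolve IDENTIFIER('...') calls that wrap a literal table name. It is an illustration only, not the collector's actual parser: the function name is hypothetical, and it deliberately ignores the session-variable and bind-variable forms that Snowflake also accepts.

```python
import re

# IDENTIFIER('db.schema.table') with a single-quoted literal argument.
IDENTIFIER_CALL = re.compile(r"\bIDENTIFIER\(\s*'([^']+)'\s*\)", re.IGNORECASE)

def inline_identifier_calls(sql):
    """Replace IDENTIFIER('name') with the bare name so lineage can see the table."""
    return IDENTIFIER_CALL.sub(lambda match: match.group(1), sql)
```

After this step, a statement such as `SELECT * FROM IDENTIFIER('mydb.myschema.orders')` reads as a plain reference to `mydb.myschema.orders`.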
Bug fixes
Tableau collector (Preview): Added a null check in custom SQL Table logic to prevent errors.
SQL Server collector: Reduced excessive log and warning messages during dependency collection to streamline output.
Databricks collector: Improved error handling for connection issues with the Databricks host.
Release version 2.259
Details about the release
Item | Details |
---|---|
Release version | 2.259 |
Release date | 6 February, 2025 |
Docker image ID | |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.259/dwcc-2.259.zip |
New features and changes
Databricks collector: Added support for harvesting resources with the browse privilege.
Tableau collector: Now supports harvesting lineage between Custom SQL tables and their upstream tables.
Power BI and Power BI Gov collectors: The collectors now handle calculated tables as a new type and catalog table-level lineage to source tables and columns.
Bug fixes
Monte Carlo collector: Updated the collector to remove reaction type from incidents, as it has been deprecated in Monte Carlo GraphQL responses.
Salesforce collector: Added a null check to prevent exceptions when last modified by information is missing.
Release version 2.258
Details about the release
Item | Details |
---|---|
Release version | 2.258 |
Release date | 30 January, 2025 |
Docker image ID | |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.258/dwcc-2.258.zip |
New features and changes
The following two new collectors are now available in public preview.
Bug fixes
Reltio collector: Resolved an issue that caused errors when cataloging containment relationships for certain attribute containers.
Power BI and Power BI Gov collectors: Stopped cataloging JDBC types for columns in Power BI, as this task is best handled by the database collector. Power BI previously attempted to infer JDBC types based on its column types, which is now corrected.
Release version 2.257
Details about the release
Item | Details |
---|---|
Release version | 2.257 |
Release date | 27 January, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.257/dwcc-2.257.zip |
New features and changes
Power BI and Power BI Gov collectors:
Added the ability to include database names in the datasources mapping file, allowing datasource credentials to be restricted to specific databases.
Introduced a flag for Power BI tables to indicate when data is loaded manually.
Tableau collector: Enhanced the collector to catalog unpublished views, expanding visibility into Tableau assets.
Bug fixes
Snowflake collector: Enhanced the incremental collection process to prevent the deletion of specific database column resources, ensuring data integrity and continuity.
Monte Carlo collector: Updated to catalog Monte Carlo warehouse IDs instead of hostnames due to API changes.
Databricks collector: Fixed an issue where the --include-information-schema option produced an incorrect warning.
Denodo collector: Resolved an issue with the internal cleanup function that was unable to remove PRIMARY KEY statements from SQL.
Release version 2.256
Details about the release
Item | Details |
---|---|
Release version | 2.256 |
Release date | 17 January, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.256/dwcc-2.256.zip |
Bug fixes
Power BI Service and Power BI Gov collectors: Resolved an issue with handling parameters when they are referenced using the @ symbol.
Release version 2.255
Details about the release
Item | Details |
---|---|
Release version | 2.255 |
Release date | 15 January, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.255/dwcc-2.255.zip |
New features and changes
Databricks collector: Added support for harvesting SQL queries and their associated lineage. A new parameter, Page size for harvesting queries (--query-pagination-limit), is introduced for this.
Power BI Service and Power BI Gov collectors: Added support for jdbcProperties in the datasources.yaml configuration for connecting to databases when resolving lineage.
Bug fixes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Database authentication issues are now reported as errors.
Oracle collector:
Corrected the command description for the autonomous database parameter.
Fixed an issue with removing comments containing special symbols.
Redshift collector: The collector now correctly harvests distinct utility functions and procedures.
Databricks collector: Resolved an issue where column names were not recognized due to case sensitivity mismatches.
Release version 2.254
Details about the release
Item | Details |
---|---|
Release version | 2.254 |
Release date | 7 January, 2025 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.254/dwcc-2.254.zip |
New features and changes
Oracle Collector: Now supports harvesting from Oracle Autonomous Database. A new parameter --autonomous-db is introduced for this.
Tableau collector: Now supports harvesting of personal space workbooks. A new parameter --tableau-catalog-personal-space-workbooks is introduced for this.
Bug fixes
Power BI collector: Resolved an issue with parameter replacements in table source expressions when a parameter name is the same as the name of the table it defines.
SQL Server collector: Ensured encryption is enabled for connections when the encrypt JDBC property is configured.
Alteryx collector: Fixed an issue encountered while fetching workflow details when the user does not have permission on the workflow.
Release version 2.253
Details about the release
Item | Details |
---|---|
Release version | 2.253 |
Release date | December 23, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.253/dwcc-2.253.zip |
New features and changes
Power BI Service and Power BI Gov collectors:
Added support for the default behavior in Power BI SQL Server sources when no schema is specified in Custom SQL queries.
Added support for cataloging report descriptions.
Bug fixes
Snowflake, Redshift, Databricks, Oracle, PostgreSQL, Db2, Netezza, SQL Server collectors: Improved handling of lineage harvesting from SQL statements that contain quoted dashes, which were previously misinterpreted as comments.
Oracle collector: Corrected the handling of dependency harvesting to focus only on requested schemas.
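The quoted-dashes fix above concerns a classic parsing pitfall: a `--` sequence starts a comment only when it appears outside a quoted string or quoted identifier. The sketch below illustrates the general technique of tracking quote state while stripping line comments. It is not the collectors' implementation, the function name is hypothetical, and it does not handle block comments or dialect-specific quoting.

```python
def strip_line_comments(sql):
    """Remove `--` comments, ignoring dashes inside quoted strings or identifiers."""
    out = []
    quote = None  # the quote character currently open, if any
    i = 0
    while i < len(sql):
        ch = sql[i]
        if quote:
            # Inside quotes: copy everything, and close on the matching quote.
            out.append(ch)
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            out.append(ch)
        elif sql.startswith("--", i):
            # A real comment: skip to the end of the line, keeping the newline.
            end = sql.find("\n", i)
            i = len(sql) if end == -1 else end
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

A doubled quote used as an escape (for example `'it''s'`) closes and immediately reopens the quoted state, so the text in between is still treated as quoted.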
Release version 2.252
Details about the release
Item | Details |
---|---|
Release version | 2.252 |
Release date | December 20, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.252/dwcc-2.252.zip |
New features and changes
Snowflake, Oracle, and SQL Server collectors: Modified to query dependencies across the entire database at once, rather than by individual schema, to reduce the number of queries executed.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Introduced a new parameter, --exclude-schema, to exclude specific schemas from the collection process.
Databricks collector: Added support for harvesting notebook content and lineage.
Release version 2.251
Details about the release
Item | Details |
---|---|
Release version | 2.251 |
Release date | December 12, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.251/dwcc-2.251.zip |
New features and changes
Snowflake collector: Enhanced support for harvesting lineage from view select statements that include QUALIFY and TRY_CAST constructs.
Redshift collector: The collector is now able to harvest definitions (DDL) for stored procedures and functions.
All collectors: Improved collector logging for greater consistency between local and uploaded log file names. Updated log file naming conventions to ensure uniqueness and to preserve log files from previous runs when uploading to data.world.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Enhanced support for harvesting lineage from view select statements containing a mix of qualified and unqualified table names.
dbt Core and dbt Cloud collectors: The relationship between DbtSource and its abstracted tables or views is now defined as representsDataSource. This change better reflects the semantics of dbt sources, and improves visualization of lineage relationships in Eureka Explorer.
Tableau Collector: Released an updated collector for Tableau, featuring improved detection of lineage relationships to database objects along with stability and performance enhancements. While the new collector will co-exist with the legacy version, we encourage transitioning to the new version to take advantage of these enhancements.
Release version 2.250
Details about the release
Item | Details |
---|---|
Release version | 2.250 |
Release date | December 7, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.250/dwcc-2.250.zip |
New features and changes
Snowflake collector: Added support for additional SQL syntaxes when parsing statements that include the QUALIFY keyword.
Alteryx collector: Added support for honoring the max job limit option when set to zero.
Bug fixes
Snowflake collector: Resolved an issue that caused duplicate schema processing and duplicate CatalogCuration resources during incremental collection.
Release version 2.249
Details about the release
Item | Details |
---|---|
Release version | 2.249 |
Release date | November 27, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.249/dwcc-2.249.zip |
New features and changes
Snowflake collector:
The collector now harvests dependencies from tables and views.
Made performance enhancements for incremental metadata collections.
PostgreSQL collector: Added support for AWS IAM Authentication tokens for databases hosted on AWS.
Power BI Service and Power BI Gov collectors: Updated to accommodate Microsoft’s transition from dataset to semantic model in all catalog resources emitted by the collector.
Bug fixes
Redshift collector: Corrected handling of lineage harvesting from SQL statements using ARRAY literals.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Improved handling of lineage harvesting from SQL statements containing KEYS as a column name.
Snowflake collector: Fixed issues in harvesting lineage from SQL statements using QUALIFY and COPY GRANTS keywords.
Power BI Service and Power BI Gov collectors: Introduced various improvements to the datasource template.
Release version 2.248
Details about the release
Item | Details |
---|---|
Release version | 2.248 |
Release date | November 19, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.248/dwcc-2.248.zip |
New features and changes
Denodo collector: Improved column fetching by processing one view or table at a time if an issue occurs when retrieving all columns at once.
Fivetran collector: Added support for Salesforce as a source.
Salesforce collector: Added support for harvesting metadata for summary (roll-up) fields.
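The Denodo change above follows a common resilience pattern: try the cheap bulk request first, and fall back to one object at a time only if the bulk request fails. The sketch below shows that pattern in isolation. The function and callback names are hypothetical, and recording a failed object as None is an assumption made for this example, not a description of the collector's behavior.

```python
def fetch_columns(names, fetch_all, fetch_one):
    """Fetch columns in bulk, falling back to one view or table at a time on failure.

    fetch_all(names) returns a dict of name -> columns; fetch_one(name) returns
    the columns for a single object. Both callbacks are supplied by the caller.
    """
    try:
        return fetch_all(names)
    except Exception:
        columns = {}
        for name in names:
            try:
                columns[name] = fetch_one(name)
            except Exception:
                # Record the failure for this object and continue with the rest.
                columns[name] = None
        return columns
```

With this shape, a single problematic view no longer prevents columns from being retrieved for every other view or table in the batch.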
Bug fixes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Resolved an issue by clearing lineage resolution cache between runs to avoid conflicts when multiple commands are configured.
Release version 2.247
Details about the release
Item | Details |
---|---|
Release version | 2.247 |
Release date | November 14, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.247/dwcc-2.247.zip |
New features and changes
SQL Server collector: Added support for Active Directory Service Principal and Entra ID authentications.
Amazon S3 collector: Enhanced object filtering based on configuration options to include or exclude object names before checking the maximum resources limit.
AWS Glue Collector: Added functionality to catalog partitioned columns.
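The Amazon S3 change above is about ordering: include and exclude filters are applied first, and only the objects that survive are counted against the maximum resources limit. The sketch below illustrates that ordering. It is a simplified model under stated assumptions: the function name is hypothetical, glob-style patterns are assumed for the filters, and what the collector does when the limit is exceeded is not covered here.

```python
from fnmatch import fnmatchcase

def filter_then_check_limit(names, include=(), exclude=(), max_resources=None):
    """Apply include/exclude patterns first, then compare the result to the limit.

    Returns the kept object names and whether their count is within the limit.
    """
    kept = []
    for name in names:
        if include and not any(fnmatchcase(name, p) for p in include):
            continue  # not selected by any include pattern
        if any(fnmatchcase(name, p) for p in exclude):
            continue  # explicitly excluded
        kept.append(name)
    within_limit = max_resources is None or len(kept) <= max_resources
    return kept, within_limit
```

In this model, a bucket with four objects and a limit of two stays within the limit as long as the filters leave two or fewer objects, which would not be the case if the limit were checked before filtering.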
Bug fixes
Power BI Service and Power BI Gov collectors: Resolved an issue with the CombineColumns transform step that was causing warnings in some cases due to misalignment in lineage to source columns.
dbt Core collector: Improved handling of situations where the profiles.yml file is mistakenly specified as a directory.
Release version 2.246
Details about the release
Item | Details |
---|---|
Release version | 2.246 |
Release date | November 7, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.246/dwcc-2.246.zip |
New features and changes
Salesforce collector: The security token configuration option is no longer required, as there are scenarios where authentication to the Salesforce API does not require it.
Bug fixes
Power BI Service and Power BI Gov collectors: Fixed a problem that occurred when replacing parameters in custom SQL if the parameter name contained a special character.
dbt Cloud collector: Fixed an issue in which the user-specified Snowflake account override was being ignored by the collector.
Release version 2.245
Details about the release
Item | Details |
---|---|
Release version | 2.245 |
Release date | November 3, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.245/dwcc-2.245.zip |
New features and changes
Reltio collector: The collector now supports client_credentials authentication, in addition to the existing user and password authentication.
Release version 2.244
Details about the release
Item | Details |
---|---|
Release version | 2.244 |
Release date | October 31, 2024 |
Docker image ID | Link to download the Docker image: https://hub.docker.com/r/datadotworld/dwcc/tags |
Jar file | Link to download the JAR file: https://releases.data.world/dwcc/2.244/dwcc-2.244.zip |
New features and changes
Denodo collector: The collector now harvests primary and foreign keys by schema to improve performance.
Databricks collector: Introduced a new parameter, --exclude-system-functions, which allows the collector to exclude built-in Databricks system functions from harvesting.
Power BI Service and Power BI Gov collectors: The collectors now catalog the following additional properties:
Dataset: Created date, Created by
Dataflow: Created by
Report: Created date, Last modified, Created by, Last modified by
Oracle collector: Modified the --linked-host option to require the database name for a linked database, as there was no reliable way to query this through the link.
Bug fixes
Power BI Service and Power BI Gov collectors: Fixed an issue with parsing parameters in SQL to account for the use of Text.From().
Release version 2.243
Details about the release
Item | Details |
---|---|
Release version | 2.243 |
Release date | October 25, 2024 |
Docker image ID | |
Jar file | |
New features and changes
The following two new collectors are now available in private preview. If you would like access to this collector, please contact your Customer Success Director.
Azure Data Lake Storage Gen2 collector: Enabled harvesting of Azure blob storage. Two new parameters are introduced as a result of this change: --disable-adls-gen2, --disable-azure-blob-storage
Power BI collector:
The collector now supports parameters within SQL queries.
Added support for Table.DuplicateColumn and Table.CombineColumns table transformations.
ADLS (Azure Data Lake Storage) Gen2 collector: The collector now supports harvesting Azure blob storage.
Redshift collector: Added support for harvesting materialized views.
Databricks collector: The collector now skips collection of extended table metadata for the system database to ensure stable runs.
Bug fixes
Monte Carlo collector: Resolved a null pointer error caused by unexpected null values.
Release version 2.242
Details about the release
Item | Details |
---|---|
Release version | 2.242 |
Release date | October 19, 2024 |
Docker image ID | |
Jar file | |
New features and changes
Power BI collector: The collector now supports parsing custom SQL queries for Databricks sources.
Bug fixes
Oracle collector:
Fixed an issue with lineage relationships for synonyms that refer to tables in a different schema.
Fixed an issue with determining the database name for cross-system or cross-database lineage.
The collector now properly resolves lineage for views that use database links.
Databricks collector:
Fixed an issue with table names containing non-alphanumeric characters.
Resolved an error caused by references to non-existent tables in SQL.
SQL Server Integration Services (SSIS) collector: Fixed an issue with missing file connection strings.
Release version 2.240
Details about the release
Item | Details |
---|---|
Release version | 2.240 |
Release date | October 15, 2024 |
Docker image ID | |
Jar file | |
New features and changes
SQL Server collector: Added support for parsing WITH TIES and TRY_CAST statements.
Power BI collector: Added an optional starter datasource YAML file required for harvesting lineage. If the YAML file is missing or incomplete, a template will be generated in the collector logs for easy reference.
Bug fixes
All collectors: Fixed an issue with detecting the existence of configured datasets.
Release version 2.239
Details about the release
Item | Details |
---|---|
Release version | 2.239 |
Release date | October 3, 2024 |
Docker image ID | |
Jar file | |
New features and changes
SQL Server collector now catalogs:
Dependencies between database resources.
Lineage for stored procedures.
Minimal available lineage for views when the SQL query fails to parse.
Power BI collector: Improved data source titles for better specificity.
Salesforce collector: Optimized the process of fetching authentication tokens to reduce the number of Salesforce API calls.
Bug fixes
Sigma collector: Fixed an issue with API structure changes due to pagination limits.
Redshift collector: Resolved an issue where the collector incorrectly reported that the provided credential did not have access to the database tables being harvested.
Release version 2.238
Details about the release
Item | Details |
---|---|
Release version | 2.238 |
Release date | September 25, 2024 |
Docker image ID | |
Jar file | |
New features and changes
Snowflake collector:
Added options to harvest all tags and policies within Snowflake, regardless of the specified database.
The collector now performs normal collection when incremental mode is requested with incompatible options.
Denodo collector:
The collector now preprocesses view definitions to remove derived view descriptions.
Added a fallback when SQL parsing fails.
SQL Server collector: Added support for lineage in stored procedures that do not use BEGIN/END statements in the procedure body.
dbt Cloud collector: The collector now accommodates large numeric values for account, project, and run identifiers.
Bug fixes
Redshift collector: Fixed an issue with view parsing for lineage when no schema binding is present.
Release version 2.237
Details about the release
Item | Details |
---|---|
Release version | 2.237 |
Release date | September 18, 2024 |
Docker image ID | |
Jar file | |
New features and changes
The following two new collectors are now available in private preview. If you would like access to this collector, please contact your Customer Success Director.
The Amazon QuickSight collector is now generally available to all customers.
Bug fixes
Power BI Report Server (PBIRS) collector: Fixed an issue that occurred when a collector resource does not have a parent folder.
Release version 2.236
Details about the release
Item | Details |
---|---|
Release version | 2.236 |
Release date | September 18, 2024 |
Docker image ID | |
Jar file | |
New features and changes
Oracle collector: The collector now supports cataloging lineage when database links are used in view, stored procedure, and function definitions.
Bug fixes
Snowflake collector: The collector now properly handles comments in view SQL definition during lineage parsing.
Sigma collector: Resolved an error caused by changes in the API response structure when the number of records exceeds the default pagination limit.
Release version 2.235
Details about the release
Item | Details |
---|---|
Release version | 2.235 |
Release date | September 10, 2024 |
Docker image ID | |
Jar file | |
New features and changes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: The collectors now support harvesting decimal digits metadata for columns.
Release version 2.234
Details about the release
Item | Details |
---|---|
Release version | 2.234 |
Release date | September 6, 2024 |
Docker image ID | |
Jar file | |
Bug fixes
Generic JDBC, Denodo, MySQL, SQL Server collectors: Resolved an arithmetic overflow error when converting expressions to data type bigint.
SSIS collector: Added database location information when creating a database asset.
Release version 2.233
Details about the release
Item | Details |
---|---|
Release version | 2.233 |
Release date | August 30, 2024 |
Docker image ID | |
Jar file | |
New features and changes
Tableau collector: The collector now supports filtering descendant projects.
Bug fixes
Salesforce collector: The collector now properly handles null responses from the Salesforce API.
Snowflake collector: Fixed an issue where column nodes were not being copied to the interleaved Snowflake graph.
Tableau collector: Fixed an issue with missing owner users for various resources.
Release version 2.232
Details about the release
Item | Details |
---|---|
Release version | 2.232 |
Release date | August 27, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI and Power BI Gov collectors: Added new relationships from app report to workspace report.
Bug fixes
Salesforce collector: Implemented a fix to avoid exceptions when custom object field metadata contains null values.
Denodo collector: Resolved an issue with harvesting view SQL.
Release version 2.231
Details about the release
Item | Details |
---|---|
Release version | 2.231 |
Release date | August 14, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI Service and Power BI Gov collectors:
The collectors now support parsing parameters, measures and column expressions when expression parsing is enabled.
Added functionality to get the correct case for database, schema, table, and column names when database credentials are provided for a source.
Power BI Gov collector: The collector now supports harvesting of user workspaces using the --include-user-workspace parameter.
Databricks collector: The collector now supports harvesting data from ADLS (Azure Data Lake Storage) Gen2 external location.
dbt Core collector: Added support for dbt projects using the dbt-sqlserver adapter.
Release version 2.230
Details about the release
Item | Details |
---|---|
Release version | 2.230 |
Release date | August 6, 2024 |
Docker image ID |
|
Jar file |
|
Bug fixes
SQL Server Reporting Services (SSRS) and Power BI Report Server (PBIRS) collectors: Fixed an issue that occurred while fetching information on data sources for a report.
Release version 2.228
Details about the release
Item | Details |
---|---|
Release version | 2.228 |
Release date | August 5, 2024 |
Docker image ID |
|
Jar file |
|
Bug fixes
SQL Server Reporting Services (SSRS) and Power BI Report Server (PBIRS) collectors: Fixed an issue that caused API requests to return an HTTP 404 error.
Release version 2.227
Details about the release
Item | Details |
---|---|
Release version | 2.227 |
Release date | July 31, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
SQL Server collector: If an error occurs while fetching columns from the database by schema, the collector now attempts to fetch columns by table instead.
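The fallback described for the SQL Server collector can be pictured with a minimal sketch. The function and callback names below are hypothetical and are not part of the collector; the sketch only illustrates the general pattern of trying a schema-wide fetch first and fetching table by table when that fails.

```python
from typing import Callable, Dict, Iterable, List


def fetch_columns_with_fallback(
    schema: str,
    tables: Iterable[str],
    fetch_by_schema: Callable[[str], Dict[str, List[str]]],
    fetch_by_table: Callable[[str, str], List[str]],
) -> Dict[str, List[str]]:
    """Try one schema-wide fetch first; if it fails, fetch each table separately."""
    try:
        return fetch_by_schema(schema)
    except Exception:
        # Assumed fallback: slower, but avoids the failing schema-wide query.
        return {table: fetch_by_table(schema, table) for table in tables}
```

How the collector decides which errors trigger the fallback is not stated in the release note; this sketch falls back on any exception.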
Bug fixes
SQL Server Reporting Services (SSRS) and Power BI Report Server (PBIRS) collectors: Fixed an issue when collector resources were not returned by the API.
Release version 2.226
Details about the release
Item | Details |
---|---|
Release version | 2.226 |
Release date | July 31, 2024 |
Docker image ID |
|
Jar file |
|
Bug fixes
SQL Server Reporting Services (SSRS) and Power BI Report Server (PBIRS) collectors: Fixed an issue where the collector would terminate abnormally if the SSRS API returned no data under certain circumstances.
Power BI Service and Power BI Gov collectors: The collectors now correctly handle case mismatches in source column names when resolving SQL statements for lineage.
Release version 2.225
Details about the release
Item | Details |
---|---|
Release version | 2.225 |
Release date | July 30, 2024 |
Docker image ID |
|
Jar file |
|
Bug fixes
SQL Server Reporting Services (SSRS) and Power BI Report Server (PBIRS) collectors: Fixed an issue that caused the collector to terminate unexpectedly when encountering Linked Reports with names containing non-alphanumeric characters.
Release version 2.224
Details about the release
Item | Details |
---|---|
Release version | 2.224 |
Release date | July 30, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Oracle collector: Enabled caching for primary keys and foreign keys, and reduced the number of queries used to gather table and column extended metadata, resulting in improved collector run time.
SQL Server Reporting Services (SSRS) and Power BI Report Server (PBIRS) collectors: Item path is now harvested for report, data source, and dataset titles.
Bug fixes
SQL Server Reporting Services (SSRS) and Power BI Report Server (PBIRS) collectors: Resolved an issue with NTLM Authentication.
Release version 2.223
Details about the release
Item | Details |
---|---|
Release version | 2.223 |
Release date | July 29, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI Service and Power BI Gov collectors: The collectors now support TNS connection strings in lineage parsing for Oracle sources if HOST and SID are specified. For example, (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=SERVER_NAME)(PORT=1521))(CONNECT_DATA=(SID=KOSTEST))).
SQL Server Reporting Services (SSRS) and Power BI Report Server (PBIRS) collectors: The collectors now support authentication using NTLM.
Amazon S3 collector: The collector now harvests objects that begin with a prefix.
Salesforce collector: The collector now harvests metadata for Objects, Fields, Dashboards, and Reports. It also supports OAuth authentication instead of Basic authentication. You must complete the new prerequisite tasks to use OAuth authentication.
Tableau collector: Enhanced resiliency for Tableau GraphQL query execution.
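To illustrate the TNS connection string support noted above for the Power BI collectors, the following is a minimal sketch of how HOST and SID could be pulled out of a descriptor. The function name and the regular expressions are assumptions made for illustration; they are not the collector's actual parsing logic.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns: match "(HOST=value)" and "(SID=value)" case-insensitively.
_HOST_RE = re.compile(r"\(\s*HOST\s*=\s*([^)\s]+)\s*\)", re.IGNORECASE)
_SID_RE = re.compile(r"\(\s*SID\s*=\s*([^)\s]+)\s*\)", re.IGNORECASE)


def parse_tns(descriptor: str) -> Optional[Tuple[str, str]]:
    """Return (host, sid) when both are present in the descriptor, else None."""
    host = _HOST_RE.search(descriptor)
    sid = _SID_RE.search(descriptor)
    if host is None or sid is None:
        return None
    return host.group(1), sid.group(1)
```

Returning None when either value is missing mirrors the condition in the note that both HOST and SID must be specified.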
Bug fixes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors:
The collectors now properly handle SQL parsing for lineage, ensuring that newline characters (\r) do not disrupt parsing.
Fixed an issue with the usage of variable names in stored procedures.
Power BI Service and Power BI Gov collectors: Fixed an issue with handling parameters that are defined in the tables section of Semantic Models, allowing for successful parsing of source information for tables using those parameters.
Snowflake collector: The collector now appropriately handles date time parsing for the timestamp NTZ format.
Release version 2.222
Details about the release
Item | Details |
---|---|
Release version | 2.222 |
Release date | July 23, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI Service and Power BI Gov collectors: The collectors now support Denodo sources in Power BI column-level lineage parsing.
Denodo collector: The collector now harvests column-level lineage.
SQL Server Integration Services (SSIS) collector: Added a new --jdbc-property parameter. This allows you to provide authentication details for NTLM Authentication type.
dbt Core and dbt Cloud collectors: The collectors now harvest model columns from catalog.json and manifest.json database objects.
Bug fixes
Power BI collector: The collector now properly handles scenarios where renaming columns in Power BI resulted in duplicate columns in source tables.
Azure Data Factory collector: The collector now properly performs Date transformation when the time zone is not available as ZoneID.
Azure Data Lake Storage Gen2 collector:
Updated the collector to remove redundant permission-related relationships.
Fixed an issue with IRI creation for collector resources by using the correct terms.
Release version 2.221
Details about the release
Item | Details |
---|---|
Release version | 2.221 |
Release date | July 15, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI Gov collector: The collector now harvests preview images for Power BI reports. Add the new parameter --image-collection to your command/YAML file to use this new feature.
Bug fixes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors:
Fixed an issue where placing a comment directly after a keyword without a space was sometimes causing parsing issues.
Fixed an issue with parsing CREATE VIEW statements where parentheses were being incorrectly removed during the SQL pre-processing.
Proper error messages are now logged when users run the collectors with the --dry-run option without specifying a single database or with multiple databases.
Snowflake collector: Resolved an issue where the collector was cataloging an incorrect database when the user had a default namespace set in Snowflake.
Databricks collector: Fixed an issue where uploads of collector output files were failing due to spaces in IRIs.
QuickSight collector: Fixed an AwsAccountId null error that occurred while listing resources using pagination and prevented all of the specified resources from being cataloged.
Azure Data Factory collector:
Resolved an issue with truncated paginated results.
Fixed an issue with the title of global parameters by correctly using the parameter name.
Release version 2.220
Details about the release
Item | Details |
---|---|
Release version | 2.220 |
Release date | July 10, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Oracle collector: Added support for lineage when the SELECT statement contains synonyms. This enhancement fixes lineage tracking between Oracle and Power BI when synonyms are used.
Power BI collector: The collector now harvests preview images for Power BI reports. Add the new parameter --image-collection to your command/YAML file to use this new feature.
Bug fixes
Power BI and Power BI Gov collectors:
Fixed an issue with parameter value replacement in expressions when the parameter contains a $ symbol.
Fixed an issue where Power BI reports failed to process when the page name is null.
Release version 2.219
Details about the release
Item | Details |
---|---|
Release version | 2.219 |
Release date | July 8, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
All collectors: Each catalog resource in the catalog output file now contains information about the collector that harvested the resource. This information is available only in the catalog file and can be used in SPARQL automations.
Bug fixes
Power BI and Power BI Gov collectors: The collectors now properly handle scenarios when they run into API request limits. A new parameter Disable max requests wait (--disable-max-requests-wait) is added for handling these scenarios.
Azure Data Lake Storage Gen2 collector: Resolved an issue where certain ACL information missing in the Azure Data Lake Storage API response caused errors in the collector.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Harvesting of column-level lineage from views now supports view definitions containing unaliased subselects.
Release version 2.218
Details about the release
Item | Details |
---|---|
Release version | 2.218 |
Release date | July 1, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI and Power BI Gov collectors: The collectors now support parsing data source expressions for Power BI tables where the source connection information is defined as a parameter. This means that if Power BI users specify data source connection information in a parameter and use that parameter in place of the source in the expression, the collectors will correctly parse and resolve the expression/lineage.
Oracle collector: The collector now harvests from DBA_ views if the credential used to execute the collector lacks permissions for information schema views.
dbt Core collector: The collector now harvests database objects and intra-database lineage from dbt projects and artifacts that use Azure Synapse as a backend.
All collectors: Collectors now verify that the user-requested upload location exists with proper permissions before execution and issue a warning if a problem exists.
Databricks collector: The collector no longer supports Databricks-managed password authentication. If you used this method of authentication, you must change the authentication to personal access token. For details, see "Preparing Databricks for collectors".
Bug fixes
SQL Server collector:
Fixed an issue where large values for column statistics produced an arithmetic overflow.
Resolved a problem where view definitions that include the TOP() expression were not properly handled when harvesting column-level lineage for views.
Power BI and Power BI Gov collectors: Fixed an issue where logging operations were causing an exception if certain Power BI objects were null.
Tableau collector: Fixed an issue where certain Tableau projects were not fully cataloged.
Release version 2.216
Details about the release
Important
This release was for internal improvements and has no customer impacting changes.
Item | Details |
---|---|
Release version | 2.216 |
Release date | June 26, 2024 |
Docker image ID |
|
Jar file |
|
Release version 2.215
Details about the release
Item | Details |
---|---|
Release version | 2.215 |
Release date | June 26, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI Gov collector:
The collector now supports harvesting of all workspaces and apps using the --all-workspaces-and-apps parameter.
Added the ability to disable lineage harvesting using the --disable-expression-lineage parameter.
Release version 2.214
Important
This release was for internal improvements and has no customer impacting changes.
Details about the release
Item | Details |
---|---|
Release version | 2.214 |
Release date | June 25, 2024 |
Docker image ID |
|
Jar file |
|
Release version 2.213
Details about the release
Item | Details |
---|---|
Release version | 2.213 |
Release date | June 25, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Azure Data Factory collector: The collector now harvests expressions for table names, schema names, and file names.
A new collector for SQL Server Integration Services (SSIS) is now available in private preview. If you would like access to this collector, please contact your Customer Success Director.
Bug fixes
Power BI and Power BI Gov collectors: The collectors now correctly harvest lineage for column types.
Release version 2.212
Details about the release
Item | Details |
---|---|
Release version | 2.212 |
Release date | June 21, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Snowflake collector: The collector now harvests metadata for functions and stored procedures from the snowflake.account_usage views when the metadata is unavailable from the information_schema of the database.
Power BI and Power BI Gov collectors now catalog:
Dataset table expression
Description for the workspace, app, and dataset
Bug fixes
ADF collector: Fixed an issue with datetime parse errors while harvesting triggers.
Release version 2.211
Details about the release
Item | Details |
---|---|
Release version | 2.211 |
Release date | June 15, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI and Power BI Gov collectors: The collectors now support lineage for Oracle database objects.
Bug fixes
Power BI and Power BI Gov collectors: Resolved an issue with collecting child resources for apps when using service principal authentication.
Snowflake and Oracle collectors: Fixed an issue where function lineage was harvested even when users enabled the Disable lineage collection (--disable-lineage-collection) option.
Oracle collector: Fixed an issue with harvesting database columns of LONG type.
Release version 2.210
Details about the release
Item | Details |
---|---|
Release version | 2.210 |
Release date | June 7, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI and Power BI Gov collectors:
Added a new feature that provides support to parse SQL statements within table expressions, enabling column-level lineage harvesting. To use this feature, you need to use the --datasource-mapping-file to specify the credentials. These credentials allow the collector to link lineage to the database sources.
The collector now harvests measures.
Databricks collector: The collector now harvests table and column tags by schema.
Bug fixes
Snowflake collector: Fixed an issue where the collector was unable to harvest lineage if the SQL statement included a dash in the column aliases.
Snowflake, Teradata, Netezza collectors: Fixed an issue that occurred because of insufficient information while harvesting agent resources for functions and procedures.
SQL Server collector: Fixed an issue that occurred while parsing view queries where columns have dashes in their names.
Release version 2.209
Details about the release
Item | Details |
---|---|
Release version | 2.209 |
Release date | June 2, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Databricks collector: The collector now harvests table and column lineage from system tables. To use this feature, you need to set new permissions for the collector.
Bug fixes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors: Resolved a problem concerning column statistics when an aggregate statistic has a zero value.
Tableau collector: Resolved an issue to correctly associate lineage with the appropriate parent project.
Sigma collector: Resolved an issue which occurred when a dataset referred to in the lineage was not available among the harvested datasets.
Snowflake collector: Fixed an issue associated with external URLs containing special characters.
Release version 2.208
Details about the release
Item | Details |
---|---|
Release version | 2.208 |
Release date | 24 May, 2024 |
Docker image ID |
|
Jar file |
|
Bug fixes
Snowflake collector:
Resolved the issue that arose from Snowflake not returning function metadata.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors:
Addressed the issue encountered during the harvesting of column statistics when the result set contained non-integer values.
Release version 2.207
Details about the release
Item | Details |
---|---|
Release version | 2.207 |
Release date | 21 May, 2024 |
Docker image ID |
|
Jar file |
|
Bug fixes
BigQuery collector: The collector is updated to generate catalog records for BigQuery Label instances. This allows them to be visible on the resource pages in the application.
Sigma collector: Resolved an issue that could result in an exception when the Sigma APIs failed to return a table path for a connection.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors:
Enhanced error log statements by adding fully qualified table names when certain tables or columns in the database cannot be located during lineage resolution.
Release version 2.206
Details about the release
Item | Details |
---|---|
Release version | 2.206 |
Release date | 17 May, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Sigma collector:
A new --pagination-limit parameter is now available for the collector. You can use this parameter to set the page size for the Sigma API response. The maximum value you can set is 1000. If you do not specify a value, the default page size is 25.
The collector is optimized to enhance the efficiency of lineage harvesting.
Snowflake collector: The collector now harvests extended metadata for tables, views, and materialized views.
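The bounds described above for the Sigma collector's --pagination-limit parameter (default 25, maximum 1000) can be expressed as a small validation sketch. The function name, the lower bound of 1, and the choice to reject out-of-range values rather than clamp them are assumptions for illustration, not documented collector behavior.

```python
from typing import Optional

DEFAULT_PAGE_SIZE = 25  # default stated in the release note
MAX_PAGE_SIZE = 1000    # maximum stated in the release note


def resolve_page_size(requested: Optional[int] = None) -> int:
    """Return the page size to request, enforcing the documented bounds."""
    if requested is None:
        return DEFAULT_PAGE_SIZE
    # Assumption: values below 1 or above the maximum are rejected, not clamped.
    if not 1 <= requested <= MAX_PAGE_SIZE:
        raise ValueError(f"page size must be between 1 and {MAX_PAGE_SIZE}")
    return requested
```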
Bug fixes
SQL Server collector: Incorporated additional debug logging for when the collector fails to harvest extended metadata.
Oracle collector:
The collector is now able to handle column names with single quotes in them.
Fixed an issue with synonyms being harvested in the wrong schema.
Release version 2.205
Details about the release
Item | Details |
---|---|
Release version | 2.205 |
Release date | 17 May, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Power BI and Power BI Gov collectors: The ODBC data sources YAML file (datasources.yml) is updated to allow user-specified aliases for the database location (host). This ensures that resources are accurately linked across collectors.
Snowflake collector: Added support for harvesting the SQL definition and External URL (Snowsight) of materialized views.
Bug fixes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors:
The collectors are optimized to load JDBC drivers more efficiently, thereby reducing memory usage.
Release version 2.204
Details about the release
Item | Details |
---|---|
Release version | 2.204 |
Release date | 10 May, 2024 |
Docker image ID |
|
Jar file |
|
Bug fixes
SQL Server collector: The collector now uses a consistent case when a collation is set.
dbt Core and dbt Cloud collectors: The collectors now correctly handle scenarios that previously caused an exception while harvesting lineage.
Sigma collector: Resolved scenarios that previously prevented the collector from running properly.
Release version 2.203
Details about the release
Item | Details |
---|---|
Release version | 2.203 |
Release date | 8 May, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
dbt Core collector:
Now supports multiple run_results.json files in a single collector run. Add the new parameter --run-results-directory to your command/YAML file to use this new feature.
Now comes with enhancements that optimize the harvesting of column-level lineage for dbt models.
dbt Cloud collector: Now comes with enhancements that optimize the harvesting of column-level lineage for dbt models.
Bug fixes
Sigma collector: The collector now properly deserializes objects from the Sigma API.
Power BI and Power BI Gov collectors: The collectors now properly obtain the server name and port from Power BI data source parameters.
Release version 2.202
Details about the release
Item | Details |
---|---|
Release version | 2.202 |
Release date | 7 May, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors:
Optimized view query parsing to improve the processing time for large SQL statements.
Optimized the querying of metadata during view lineage harvesting.
Oracle Collector: Added a new --oracle-jdbc-timezone-as-region parameter. This allows you to decide if the Oracle JDBC connection timezone should utilize the JVM's default timezone.
Bug fixes
AWS Glue collector: Improved the log messages that are recorded when the harvesting of job lineage fails.
Release version 2.201
Details about the release
Item | Details |
---|---|
Release version | 2.201 |
Release date | 2 May, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Oracle and SQL Server collectors: The collectors now catalog column-level lineage when functions and stored procedures contain sub-selects.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server collectors:
Performance optimizations improve the overall runtime of the collectors.
A new parameter --disable-extended-metadata is now available that allows you to skip harvesting of extended metadata for resource types such as database, schema, table, columns, functions, stored procedures, user defined types, and synonyms. Basic metadata for these resource types will still be harvested.
Power BI and Power BI Gov collectors now catalog:
Relationships between Power BI apps and workspaces
Apps with associated workspace IDs (when service principal authentication is used)
Bug fixes
Teradata collector: The collector now properly harvests lineage metadata from views with SQL statements containing REPLACE RECURSIVE VIEW and LOCK ROW ACCESS.
Oracle collector: The collector now properly harvests lineage metadata from views with COLLECT.
All collectors: The collectors now properly handle config file options that start with option flags.
Release version 2.200
Details about the release
Item | Details |
---|---|
Release version | 2.200 |
Release date | 19 April, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
All collectors: Users now have the option to define a custom output file name for the collector catalog during run time. To do this, use the --output-name parameter. The system automatically adds .dwec.ttl to the end of the provided file name.
Note
If you are updating the file name for an already configured collector, make sure to check and modify any existing SPARQL queries that explicitly mention existing collector output files.
Oracle Collector: The collector now harvests Oracle package bodies and Oracle package specifications.
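The --output-name behavior and the accompanying note about SPARQL queries can be sketched as follows. Both helper names are hypothetical, and the sketch assumes the suffix is always appended to whatever name is provided; the release note does not say how a name that already ends in .dwec.ttl is treated.

```python
from typing import Dict, List

CATALOG_SUFFIX = ".dwec.ttl"  # suffix named in the release note


def catalog_file_name(output_name: str) -> str:
    """Build the catalog file name by appending the suffix to the given name."""
    return f"{output_name}{CATALOG_SUFFIX}"


def queries_mentioning(old_file: str, queries: Dict[str, str]) -> List[str]:
    """List the names of saved queries whose text mentions the old file name."""
    return [name for name, text in queries.items() if old_file in text]
```

A check like queries_mentioning could help locate SPARQL queries that need updating after a collector's output file is renamed.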
Bug fixes
SQL Server collector: Fixed an error that occurred when harvesting column statistics.
Power BI and Power BI Gov collectors: Resolved an issue that was causing errors during the parsing of expressions that used the Table.RenameColumns Power Query table function in certain cases.
Snowflake collector: The collector now properly harvests tags that are defined in a different schema than the schemas specified for the collector.
The following collectors are updated to harvest lineage accurately for group by, order by, where, and having SQL expressions. Prior to this update, the relationships were incorrectly directed.
PostgreSQL, Databricks, Derby, Netezza, Oracle, Redshift, Snowflake, SQL Server, Teradata collectors
Release version 2.199
Details about the release
Item | Details |
---|---|
Release version | 2.199 |
Release date | 11 April, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
A new collector is now available for Amazon Managed Streaming for Kafka.
Oracle collector: The collector now harvests lineage from views, stored procedures, and functions.
Snowflake collector: The collector now harvests Streamlit apps.
The following collectors now support harvesting from multiple databases specified by users. This means you can provide the --database parameter multiple times while running the collector.
Databricks, PostgreSQL, SQL Server, Db2, Redshift, Denodo, Oracle, MySQL, Snowflake, Teradata
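As a rough illustration of a parameter that can be provided multiple times, the sketch below uses Python's argparse with the append action. This is a hypothetical stand-in and not the collector's own command-line parser; only the --database parameter name comes from the release note.

```python
import argparse
from typing import List


def parse_databases(argv: List[str]) -> List[str]:
    """Collect every --database value from the argument list, in order."""
    parser = argparse.ArgumentParser(prog="collector-sketch")  # hypothetical name
    # action="append" gathers each occurrence of the flag into one list.
    parser.add_argument("--database", action="append", default=[])
    return parser.parse_args(argv).database
```

For instance, calling parse_databases with two --database arguments yields a list containing both database names.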
Bug fixes
Power BI and Power BI Gov collectors: Resolved an issue that was caused by parsing expand column expressions.
dbt Cloud collector: The collector now properly harvests metadata of dbt Cloud artifacts when the target database is not Snowflake. Note that the collector harvests metadata only from the dbt Cloud artifacts and does not connect to any unsupported target database to obtain database lineage metadata.
Snowflake collector: The collector now harvests policies associated with cataloged database objects, regardless of the database in which the policies reside.
Release version 2.198
Details about the release
Item | Details |
---|---|
Release version | 2.198 |
Release date | 9 April, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Oracle collector: The collector now harvests synonyms.
Athena collector: Starting with release 2.198, data.world no longer packages the Athena JDBC driver with the Athena collector. You can continue to use releases earlier than 2.198 as-is, but when you update the collector to version 2.198 or higher, you must download and mount the driver for the collector and update the collector command to include the driver path.
Release version 2.197
Details about the release
Important
This release was for internal improvements and has no customer impacting changes.
Item | Details |
---|---|
Release version | 2.197 |
Release date | 5 April, 2024 |
Docker image ID |
|
Jar file |
|
Release version 2.196
Details about the release
Item | Details |
---|---|
Release version | 2.196 |
Release date | 2 April, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Log files for collectors: The collector log files for each collector run now have unique names. This allows logs to be written to separate files when running multiple collector instances.
Reltio collector: Survivorship groups and mappings are now recognized as primary entities with catalog records.
Snowflake collector: The collector now harvests tags associated with database objects in the user-specified database, regardless of the database in which the tag resides.
Bug fixes
Teradata collector: Fixed an issue that was blocking column harvesting due to invalid column references in Views.
Azure Data Factory collector: Fixed an issue preventing successful file uploads to data.world.
Release version 2.195
Details about the release
Item | Details |
---|---|
Release version | 2.195 |
Release date | 25 March, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Databricks collector: The collector now harvests tags for Databases, Schemas, Tables, and Columns.
Bug fixes
Power BI Service and Power BI Gov collectors: The collectors now correctly harvest skipped data sources during metadata scans.
Azure Data Lake Storage Gen2 collector: The collector is updated to refresh API authorization requests per ADLS requirements to avoid session expiration.
Azure Data Factory collector: Fixed an issue to accommodate varying data returned from the Azure Data Factory API.
Release version 2.194
Details about the release
Item | Details |
---|---|
Release version | 2.194 |
Release date | 21 March, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
The Power BI Service and Power BI Gov collectors now support harvesting lineage from ODBC data source types. A new parameter --datasource-mapping-file can be used to provide the information required for harvesting lineage relationships when the data source uses an ODBC connection in Power BI.
Bug fixes
The Amazon S3 collector now continues to harvest objects in the bucket when a 403 error is encountered.
The Azure Data Lake Storage Gen2 collector properly handles the scenario involving special characters in the blob name.
The Azure Data Factory collector properly handles a scenario that causes the collector to stop due to the format of information returned from the Azure Data Factory APIs.
The BigQuery collector properly handles a scenario when a table is in a different database from the one being harvested.
Release version 2.193
Details about the release
Item | Details |
---|---|
Release version | 2.193 |
Release date | 15 March, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
The following two new collectors are now available in Private Preview. Please contact your Customer Success Director to get access to these collectors:
Bug fixes
The Azure Data Factory collector is updated to correctly handle a situation that previously caused the collector to stop, due to the format of the information returned from the ADF APIs.
Release version 2.192
Details about the release
Item | Details |
---|---|
Release version | 2.192 |
Release date | 12 March, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Amazon S3 collector: The collector now offers the options, --include-object and --exclude-object. These options allow you to select which objects should be included or excluded from the harvesting process.
Databricks collector: The collector now harvests Databricks tags for database, schema, table, view, and column as key-value pairs. The collector also harvests tags for clusters and jobs, replacing the existing ClusterTag and JobTag resource types.
Release version 2.191
Details about the release
Item | Details |
---|---|
Release version | 2.191 |
Release date | 7 March, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
All collectors: The --dry-run option is now available for all collectors. This option allows you to do a test run of a collector to validate that it can authenticate to the specified source system. If specified, the collector does not harvest any metadata; it only checks the connection parameters provided by the user and reports success or failure at connecting.
Bug fixes
Teradata collector: The collector is updated to correctly parse view SQL syntax for extracting lineage metadata. It also now includes improved logging of any errors encountered during lineage harvesting.
BigQuery collector: The collector now properly handles fully qualified table names that include dashes (-).
Release version 2.190
Details about the release
Item | Details |
---|---|
Release version | 2.190 |
Release date | 5 March, 2024 |
Docker image ID |
|
Jar file |
|
New features and changes
Snowflake, Teradata, and Netezza collectors: In the harvested metadata, the owners of resources are now correctly referenced as owner objects. Previously, they were referenced as string text.
Bug fixes
The Teradata collector now correctly manages variations in database cases within SQL statements while gathering lineage metadata.
Release version 2.189
Details about the release
Item | Details |
---|---|
Release version | 2.189 |
Release date | 24 February, 2024 |
Docker image ID |
|
JAR file |
|
New features and changes
The Tableau collector now captures all sub-projects when you specify certain projects to catalog. Additionally, it enables users to exclude specific projects using the --tableau-exclude-project parameter. Any sub-projects under an excluded project are also automatically excluded.
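The way inclusion and exclusion cascade to sub-projects can be sketched in Python as follows. The project names, the parent_of mapping, and both functions are hypothetical and exist only for this illustration; the sketch shows the rule stated above and is not the collector's implementation.

```python
def is_under(project, roots, parent_of):
    """Return True if project is one of roots or a descendant of one."""
    while project is not None:
        if project in roots:
            return True
        project = parent_of.get(project)
    return False


def projects_to_catalog(parent_of, include, exclude):
    """Keep projects under an included project unless under an excluded one."""
    return sorted(
        p for p in parent_of
        if is_under(p, include, parent_of) and not is_under(p, exclude, parent_of)
    )


# Hypothetical project tree: each project maps to its parent (None = top level).
parent_of = {
    "Finance": None,
    "Finance/Reports": "Finance",
    "Finance/Drafts": "Finance",
    "Finance/Drafts/Old": "Finance/Drafts",
    "HR": None,
}
print(projects_to_catalog(parent_of, {"Finance"}, {"Finance/Drafts"}))
```

Here Finance/Drafts/Old is dropped even though it was never named, because its parent is excluded.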
Release version 2.188
Details about the release
Item | Details |
---|---|
Release version | 2.188 |
Release date | 23 February, 2024 |
Docker image ID |
|
JAR file |
|
New features and changes
The Information Schema Catalog Collector now collects descriptions from both tables and columns, if they are present in the source.
The Snowflake collector now harvests comments from Snowflake databases, schemas, and views (as resource description).
The Teradata collector has been enhanced to better parse view SQL definitions that use specific Teradata syntax elements, particularly when extracting lineage from views.
Bug fixes
BigQuery collector:
Fixed issues with handling identifiers with hyphens (-).
Fixed issues with harvesting lineage when a view refers to columns in a separate database.
Release version 2.187
Details about the release
Item | Details |
---|---|
Release version | 2.187 |
Release date | 20 February, 2024 |
Docker image ID |
|
JAR file |
|
New features and changes
Netezza collector: A new and improved collector is now available for Netezza.
Oracle collector: The collector now harvests definitions for views, functions, and stored procedures.
Release version 2.186
Details about the release
Item | Details |
---|---|
Release version | 2.186 |
Release date | 14 February, 2024 |
Docker image ID |
|
JAR file |
|
New features and changes
The following collectors now harvest all databases in a single collector run when the --database parameter is not specified.
The collectors also support a new parameter --exclude-database to exclude specific databases from metadata collection:
Databricks
DB2
MySQL
Oracle
PostgreSQL
Redshift
SQL Server
Snowflake
Teradata
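The selection rule described above (harvest everything when --database is omitted, then drop anything named by --exclude-database) can be sketched in Python as below. The function, its arguments, and the sample database names are hypothetical; how the collector behaves when both parameters are supplied together is an assumption of this sketch, not something these notes state.

```python
def databases_to_harvest(available, database=None, exclude_database=()):
    """Pick databases for one collector run under the assumed rules."""
    # No --database value: start from every database the source exposes.
    candidates = list(database) if database else list(available)
    excluded = set(exclude_database)
    # Assumption: exclusions are applied last, whichever way candidates were chosen.
    return [name for name in candidates if name not in excluded]


available = ["SALES", "MARKETING", "SANDBOX"]
print(databases_to_harvest(available, exclude_database=["SANDBOX"]))
```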
Bug fixes
Databricks collector: The collector properly handles malformed task responses.
Power BI collector: The collector properly handles harvesting lineage relationships from Power BI data sources when parameters are used in place of the Snowflake Warehouse value.
For the following collectors, the behavior of the --include-information-schema option is changed. Now, if you use this option in the command without the --all-schemas option, the system will generate a warning to alert you about the missing parameter.
Databricks
DB2
Oracle
PostgreSQL
Redshift
SQL Server
Snowflake
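The warning behavior for --include-information-schema can be sketched with Python's argparse and warnings modules. This is an illustration only: it assumes both options are simple on/off flags, and the parser name and warning text are invented for the example rather than taken from the collector.

```python
import argparse
import warnings


def parse_schema_options(argv):
    """Parse the two schema flags and warn about the incomplete combination."""
    parser = argparse.ArgumentParser(prog="collector-sketch")  # hypothetical name
    parser.add_argument("--all-schemas", action="store_true")
    parser.add_argument("--include-information-schema", action="store_true")
    args = parser.parse_args(argv)
    if args.include_information_schema and not args.all_schemas:
        # Warn rather than fail, matching the behavior described above.
        warnings.warn("--include-information-schema was given without --all-schemas")
    return args


args = parse_schema_options(["--all-schemas", "--include-information-schema"])
print(args.all_schemas, args.include_information_schema)
```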
Release version 2.185
Details about the release
Item | Details |
---|---|
Release version | 2.185 |
Release date | 9 February, 2024 |
Docker image ID |
|
JAR file |
|
Bug fixes
Fixed an issue that was causing database collectors to run into an error state.
Release version 2.184
Details about the release
Item | Details |
---|---|
Release version | 2.184 |
Release date | 7 February, 2024 |
Docker image ID |
|
JAR file |
|
Bug fixes
Azure Data Lake Storage Gen2 collector: Fixed an issue that previously prevented the collector from running successfully on machines using an amd64 processor.
Microsoft SQL Server collector: The collector now properly harvests views from Azure Synapse Analytics.
Release version 2.183
Details about the release
Item | Details |
---|---|
Release version | 2.183 |
Release date | 1 February, 2024 |
Docker image ID |
|
JAR file |
|
Bug fixes
Tableau collector: The collector is updated to properly harvest usage data in newer versions of Tableau Server.
Azure Data Lake Storage Gen2 Collector: Fixed an authentication issue in the collector that resulted in failures to initialize a channel.
Snowflake collector: The collector now properly harvests lineage between function and source table if the source table is in the cataloged schema.
Release version 2.182
Details about the release
Item | Details |
---|---|
Release version | 2.182 |
Release date | 30 January, 2024 |
Docker image ID |
|
JAR file |
|
New features and changes
All collectors: In addition to being available as Docker Images, collectors are now also accessible as JAR files. Follow these instructions to run collectors using JAR files.
The following collectors now harvest all versions of overloaded function and stored procedure resources, each as its own resource:
Db2
MS SQL Server
Netezza
Oracle
PostgreSQL
Redshift
Snowflake
Teradata
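To see why overloaded functions need separate resources, consider two functions that share a name but differ in parameters. The Python sketch below builds a distinct key per overload from the name plus the parameter types. The key format and the sample functions are hypothetical; these notes do not describe the identifier scheme the collectors actually use.

```python
def resource_key(schema, name, param_types):
    """Build a hypothetical identifier that distinguishes overloads."""
    return f"{schema}.{name}({', '.join(param_types)})"


# Two overloads of the same function name in the same schema.
functions = [
    ("public", "area", ["integer"]),
    ("public", "area", ["integer", "integer"]),
]

# Keying on the name alone would collapse both overloads into one entry;
# including the parameter types keeps one resource per overload.
resources = {resource_key(*f): f for f in functions}
for key in resources:
    print(key)
```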
Bug fixes
Teradata and MySQL collectors: The following schema options have been removed for these collectors: --all-schemas, --include-information-schema, and --schema.
Release version 2.181
Details about the release
Item | Details |
---|---|
Release version | 2.181 |
Release date | 22 January, 2024 |
Docker image ID |
|
New features and changes
The Snowflake collector now harvests Data Metric Functions, their associations to tables and observed metrics.
Release version 2.180
Details about the release
Item | Details |
---|---|
Release version | 2.180 |
Release date | 17 January, 2024 |
Docker image ID |
|
New features and changes
The Snowflake collector now harvests allowed tag values from Snowflake.
Bug fixes
The Oracle collector now properly harvests column descriptions from Oracle Data Dictionary tables.
Release version 2.179
Details about the release
Item | Details |
---|---|
Release version | 2.179 |
Release date | 10 January, 2024 |
Docker image ID |
|
New features and changes
The latest tag for Docker images has been removed and will not be available going forward.
What does this change mean for users using the latest tag?
If you were using the latest tag, you can continue to use the image with the latest tag. However, we recommend that all users update their docker run command to use an explicit version.
If you make a change to your local Docker environment (such as removing the latest image), your collector run will no longer work. You will need to update the run command to use a specific version. You can open a support ticket for assistance with updating the command.
The Athena, Snowflake, SQL Server, and DB2 collectors now harvest basic metadata for materialized views (name and, if available, description).
The Postgres collector now harvests materialized views with name, description, view SQL definition (DDL), and column-level lineage.
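Regarding the removal of the latest tag, a small check like the Python sketch below can help you find run commands that still rely on it. The helper is hypothetical and handles only the common image reference shapes (name, name:tag, and name@digest); the version numbers in the sample references are examples.

```python
def uses_explicit_version(image):
    """Return True if an image reference pins a specific tag or digest."""
    if "@" in image:
        # A digest reference (name@sha256:...) always identifies one image.
        return True
    # Only a colon after the last slash separates a tag; an earlier colon
    # would belong to a registry host and port.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # No tag given, so Docker would assume latest.
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


for reference in ("datadotworld/dwcc:2.179", "datadotworld/dwcc:latest", "datadotworld/dwcc"):
    print(reference, uses_explicit_version(reference))
```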
Bug fixes
All collectors: Environment variables referenced in collector config (YAML) files can now have values containing backslashes and dollar signs.
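Values containing backslashes and dollar signs are easy to mangle during substitution, because many replacement functions give those characters special meaning. The Python sketch below shows one way to insert such values literally. The ${NAME} placeholder syntax, the variable name DW_PASSWORD, and the function are assumptions for this illustration and do not describe the collector's implementation.

```python
import re

# Assumed placeholder syntax for this sketch: ${NAME}
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def expand_env(text, env):
    """Replace ${NAME} placeholders with values from env, inserted verbatim.

    Raises KeyError if a referenced variable is missing from env.
    """
    # A function replacement is used so the value is copied as-is; a plain
    # replacement string would make re.sub interpret backslash escapes.
    return _PLACEHOLDER.sub(lambda match: env[match.group(1)], text)


config_line = "password: ${DW_PASSWORD}"
print(expand_env(config_line, {"DW_PASSWORD": r"p\a$$w0rd"}))
```

The substituted value is not scanned again for placeholders, so dollar signs inside it are left untouched.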
Release version 2.178
Details about the release
Item | Details |
---|---|
Release version | 2.178 |
Release date | 5 January, 2024 |
Docker image ID |
|
New features and changes
The Snowflake collector now harvests the External URL for Snowsight for tables and views.
The dbt Cloud collector now includes the --dbt-cloud-host option to enable interaction with dbt static access URLs.
Bug fixes
Databricks collector: Addressed an issue related to correctly forming IRIs for tables under certain circumstances. This was previously causing duplicate tables and databases to be cataloged and non-existent tables to be referenced by columns.
The Tableau collector now properly handles a scenario when the Tableau instance has no databases defined.
Release notes for versions released before 2024
Go here to access release notes for versions released before 2024.