Catalog collector release notes
Important
Stay updated on collector releases! To keep up with the latest updates and enhancements to data.world collectors, subscribe to the RSS feed from your favorite RSS reader.
Release version 2.313
Details about the release
| Item | Details |
|---|---|
| Release version | 2.313 |
| Release date | 6 February, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Microsoft Fabric collector: Added support for harvesting EventStream resources, enabling lineage tracking for Fabric’s real-time streaming data pipelines.
Bug fixes
dbt collector: Corrected resource identifier generation for tables to match BigQuery collector format, ensuring proper lineage connections between dbt models and their underlying BigQuery tables.
OpenAPI collector: Added support for OpenAPI 3.1’s updated nullable type syntax, ensuring schemas using the latest specification version are parsed correctly.
SSIS collector: Fixed parsing of SSIS dataflow tasks to handle a wider variety of schema and table name patterns in SQL queries.
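For context on the OpenAPI fix above: OpenAPI 3.0 marks a nullable schema with `nullable: true`, while OpenAPI 3.1 follows JSON Schema and lists `"null"` among the values of `type`. The following is a minimal sketch of reading both forms, not the collector's actual implementation; the helper name `split_nullable` is hypothetical.

```python
from typing import Any


def split_nullable(schema: dict[str, Any]) -> tuple[list[str], bool]:
    """Return (non-null types, is_nullable) for an OpenAPI 3.0 or 3.1 schema.

    Hypothetical helper for illustration only.
    """
    raw = schema.get("type")
    # OpenAPI 3.1 allows "type" to be a list; 3.0 only allows a single string.
    types = list(raw) if isinstance(raw, list) else ([raw] if raw else [])
    # Nullable if "null" is listed (3.1) or the "nullable" keyword is set (3.0).
    nullable = "null" in types or schema.get("nullable") is True
    return [t for t in types if t != "null"], nullable


print(split_nullable({"type": "string", "nullable": True}))  # 3.0 style
print(split_nullable({"type": ["string", "null"]}))          # 3.1 style
```

Both calls describe the same thing, a nullable string, so a parser that only understands the 3.0 keyword would treat the second schema incorrectly.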
Release version 2.312
Details about the release
| Item | Details |
|---|---|
| Release version | 2.312 |
| Release date | 3 February, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Redshift collector: Added support for harvesting lineage from stored procedures, improving visibility into procedural data transformations.
Bug fixes
ServiceNow collector: Improved pagination handling to ensure results are fully harvested when total counts do not equal 100, preventing missing resources in larger result sets.
Tableau collector: Fixed a defect affecting sheet-to-dashboard associations, ensuring relationships are harvested accurately.
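The ServiceNow pagination fix above is not described in detail. One common way to avoid dropping results is to keep requesting pages until the server returns an empty page, instead of deriving the number of pages from a total count. The sketch below illustrates that pattern only; `fetch_page`, its offset/limit parameters, and the page size of 100 are hypothetical stand-ins, not the ServiceNow API or the collector's code.

```python
from typing import Callable, Iterator

Page = list[dict]


def harvest_all(fetch_page: Callable[[int, int], Page], page_size: int = 100) -> Iterator[dict]:
    """Yield every record, paging until an empty page is returned."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            # An empty page is the only stop signal; short pages are still consumed.
            break
        yield from page
        offset += page_size


# Stand-in data source with 250 records (hypothetical).
records = [{"id": i} for i in range(250)]


def fake_fetch(offset: int, limit: int) -> Page:
    return records[offset:offset + limit]


print(len(list(harvest_all(fake_fetch))))  # prints 250
```

Stopping on an empty page costs one extra request, but it behaves the same whether the total is an exact multiple of the page size or not.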
Release version 2.311
Details about the release
| Item | Details |
|---|---|
| Release version | 2.311 |
| Release date | 30 January, 2026 |
| Docker image ID | |
| Jar file | |
Bug fixes
Snowflake and Databricks collectors: Fixed relationship mapping so that database tables are correctly associated with their parent database, ensuring accurate hierarchy and navigation in the catalog.
Release version 2.310
Details about the release
| Item | Details |
|---|---|
| Release version | 2.310 |
| Release date | 29 January, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Microsoft Fabric collector: Added support for harvesting Environments in Microsoft Fabric, expanding visibility into Fabric workspace configurations.
Oracle collector: Added support for harvesting lineage from CREATE TABLE statements by querying historical tables, improving lineage coverage for table creation workflows.
Bug fixes
OpenAPI collector: Fixed an issue where restrictive server configurations caused specification requests to be rejected.
Microsoft SQL Server collector: Resolved an issue where SQL preprocessing of TOP statements did not work correctly in some cases.
Microsoft Fabric collector: Fixed issues with Lakehouse Delta tables to ensure they are cataloged under the correct parent folder and to prevent duplicate resources from being created.
Release version 2.309
Details about the release
| Item | Details |
|---|---|
| Release version | 2.309 |
| Release date | 14 January, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
SSIS collector: Added support for harvesting variables and their references, improving lineage and package-level context.
Microsoft SQL Server collector: Now harvests table descriptions from extended properties, enriching table-level documentation.
Databricks collector: Renamed the CLI option --http-path to --compute-resources-url for clearer configuration of compute connectivity.
Snowflake, Redshift, Databricks, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Added support for excluding system functions across all JDBC collectors, helping reduce noise in harvested metadata.
Microsoft Fabric collector:
Added support for listing tables for schema-enabled Lakehouses using the OneLake Tables API, and harvesting columns for those tables via the same API.
Added support for schema-enabled Fabric Lakehouses in Pipelines, improving pipeline lineage coverage.
Added support for SQL Server database tables and stored procedures as sources/sinks in Fabric Data Pipelines, expanding supported pipeline endpoints.
Sensitive data classification feature: Improved sensitive data classification for collectors. On-premise collectors now support Microsoft Presidio classification services in addition to Private AI.
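Scripts that still pass the old Databricks option name will need updating after the rename from `--http-path` to `--compute-resources-url` noted above. The following is a rough sketch of rewriting an argument list before invoking the collector. The helper `rename_flags` is hypothetical and not part of the collector, and the option value in the example is a placeholder.

```python
# Option rename taken from the release note above.
RENAMED_FLAGS = {"--http-path": "--compute-resources-url"}


def rename_flags(args: list[str], renames: dict[str, str] = RENAMED_FLAGS) -> list[str]:
    """Return a copy of args with renamed options replaced.

    Handles both "--flag value" and "--flag=value" spellings; whether the
    collector accepts the second form is not confirmed here.
    """
    updated = []
    for arg in args:
        flag, sep, value = arg.partition("=")
        # Unknown flags and plain values pass through unchanged.
        updated.append(renames.get(flag, flag) + sep + value)
    return updated


old_args = ["--http-path", "/placeholder/path"]
print(rename_flags(old_args))  # ['--compute-resources-url', '/placeholder/path']
```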
Bug fixes
Tableau collector: Fixed embedsView relationships for unpublished Tableau views and dashboards, ensuring these relationships are harvested correctly.
Azure Data Lake Storage Gen2 collector: Fixed an error that occurred when harvesting a non-hierarchical storage account, improving compatibility and stability.
Release version 2.308
Details about the release
| Item | Details |
|---|---|
| Release version | 2.308 |
| Release date | 5 January, 2026 |
| Docker image ID | |
| Jar file | |
New features and changes
Snowflake collector: Added lineage resolution support for QUALIFY clauses, LATERAL FLATTEN constructs, and subqueries in previously unsupported contexts, including WHERE, JOIN ON, and ORDER BY.
Oracle collector: The collector now harvests public synonyms, expanding visibility into shared database objects.
SSIS collector: Added support for harvesting stored procedure references, improving lineage completeness.
Bug fixes
Tableau collector:
The collector now harvests component fields and their relationships to constituent fields.
Ensured GraphQL filtering by project name is followed by filtering by project ID, preventing harvesting of unintended projects.
Snowflake collector: Fixed an issue where some SQL statements took an excessive amount of time to parse during lineage harvesting.
SQL Server collector: Prevented attempts to harvest table size metadata for views that do not support this operation.
Databricks collector: Added a safeguard to prevent SQL execution for excluded databases during lineage collection.
Oracle collector: Fixed harvesting of statistic indexes, ensuring they are correctly captured.
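The Tableau project-filtering fix above follows a general pattern: a project name may not be unique, so results matched by name are narrowed again by project ID. The sketch below shows that two-step filter on in-memory records. The field names `projectName` and `projectId` and the sample data are assumptions for illustration, not the Tableau API response format or the collector's code.

```python
def filter_by_project(items: list[dict], project_name: str, project_id: str) -> list[dict]:
    """Keep items whose project matches by name and then by unique ID."""
    # Step 1: coarse filter by name (may match several projects).
    by_name = [item for item in items if item["projectName"] == project_name]
    # Step 2: narrow to the single intended project by its ID.
    return [item for item in by_name if item["projectId"] == project_id]


# Hypothetical sample: two different projects share the name "Sales".
workbooks = [
    {"name": "Pipeline", "projectName": "Sales", "projectId": "p-1"},
    {"name": "Forecast", "projectName": "Sales", "projectId": "p-2"},
    {"name": "Headcount", "projectName": "HR", "projectId": "p-3"},
]

print([w["name"] for w in filter_by_project(workbooks, "Sales", "p-1")])  # ['Pipeline']
```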
Release version 2.307
Details about the release
| Item | Details |
|---|---|
| Release version | 2.307 |
| Release date | 18 December, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
ServiceNow collector: Introduced a new collector that harvests metadata for scoped applications, tables, fields, views, and data fabric tables from a ServiceNow instance.
Dremio collector: The API server (--api-server) option is no longer required. When it is not provided, the collector does not harvest the API-based lineage and dataset information.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Improved memory usage during the harvesting of column statistics, enhancing stability for large datasets.
Tableau collector:
Added hostname mapping configuration support, enabling more flexible environment and network setups.
Added a new option, --tableau-exclude-unpublished-views, to exclude unpublished views (sheets and dashboards) from harvesting.
Bug fixes
Snowflake collector: Updated SQL parsing behavior to ensure parsing terminates correctly when a timeout occurs, and made the timeout value configurable to better handle complex or long-running queries.
Release version 2.306
Details about the release
| Item | Details |
|---|---|
| Release version | 2.306 |
| Release date | 10 December, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
Salesforce collector: Added support for grouping and aggregation columns, expanding metadata coverage for analytical queries.
Databricks collector: Added support for harvesting jobs from external workspaces, enabling visibility into cross-environment job metadata.
Azure Data Factory collector: Now harvests SQL queries used in activities, providing richer lineage and operational context.
Dremio collector:
Migrated to the Arrow Flight JDBC driver in place of the legacy Dremio driver, improving performance and compatibility.
Renamed the --graphApiServer parameter to --api-server for clarity and consistency, and added a new --use-tls parameter to support secure connections.
Bug fixes
Microsoft Fabric collector: Fixed an issue where some files were not being harvested, ensuring more complete metadata collection.
Qlik Sense collector: Updated logic so that the API key is no longer written to the TTL file.
Microsoft Fabric collector: Fixed a modeling issue where Lakehouse tables were incorrectly displayed as folders, ensuring proper representation of table structures.
Release version 2.305
Details about the release
| Item | Details |
|---|---|
| Release version | 2.305 |
| Release date | 25 November, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
Microsoft SQL Server collector: The collector now harvests column comments from the MS_Description extended property, improving metadata detail and context.
SQL Server Integration Services (SSIS) collector: Added support for lineage when Oracle is used as a source or target, expanding cross-platform lineage coverage.
Bug fixes
Azure Data Factory collector: Fixed an issue where lineage relationships were missing when a Snowflake credential had an assigned default role, ensuring complete lineage capture.
Release version 2.304
Details about the release
| Item | Details |
|---|---|
| Release version | 2.304 |
| Release date | 20 November, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
OpenAPI collector: Added a new configuration option --environment-qualifier. When provided, this option ensures that harvested resources are unique and distinct for that environment qualifier, improving multi-environment cataloging.
Power BI collector: The collector now harvests calculation groups and calculation items, expanding metadata coverage for Power BI semantic models.
Bug fixes
dbt core and dbt cloud collectors: Fixed an issue where tags were represented as strings instead of relationships. Tags are now properly modeled as relationships in the catalog graph.
Power BI collector: Fixed an issue where descriptions in parameter definitions were incorrectly appended to parameter values.
Release version 2.303
Important
Published versions of collectors are available as a Docker image and a JAR file.
Details about the release
| Item | Details |
|---|---|
| Release version | 2.303 |
| Release date | 13 November, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
Databricks collector: Added support for harvesting metric views, expanding metadata coverage within Databricks. A new optional parameter, --enable-metric-views, is now available to enable this feature.
Power BI collector: Added support for Power BI–BigQuery column lineage.
Bug fixes
Informatica CDI collector: Fixed an issue that caused incomplete harvesting in large CDI instances by ensuring the collector continues harvesting beyond the Informatica CDI API’s default pagination limit.
dbt Cloud and dbt core collectors: Corrected tag representation so that tags are now modeled as resources rather than strings, improving metadata consistency and usability.
Release version 2.302
Important
Published versions of collectors are available as a Docker image and a JAR file.
Details about the release
| Item | Details |
|---|---|
| Release version | 2.302 |
| Release date | 5 November, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
OpenAPI collector: Added handling for schema references and array item references, improving completeness of OpenAPI metadata harvesting.
Power BI collector: Now harvests endorsement details for reports, datasets, and dataflows, providing better visibility into trusted and certified content.
Databricks collector: Added support for harvesting materialized views, expanding metadata coverage within Databricks.
SQL Server collector: Now harvests database synonyms, improving understanding of object relationships and dependencies.
Tableau collector: Updated logic to associate published data sources with the project that contains them, improving metadata organization and context.
SSRS collector: Now harvests the external URL for linked reports, providing direct traceability to linked content.
Snowflake, Redshift, Databricks, Denodo, Oracle, PostgreSQL, Teradata, MySQL, Db2, Netezza, SQL Server, SAP HANA collectors: Updated lineage logic to represent INSERT stored procedures as the triggering agent in lineage relationships, improving lineage accuracy.
Bug fixes
Informatica collector: Fixed an issue that limited harvesting to a maximum of 4,200 mapping tasks. The collector now processes all available mappings.
Databricks collector: Fixed handling of non-alphanumeric characters in identifiers, resolving issues with the creation of invalid identifiers.
dbt core and dbt cloud collectors: Fixed a case mismatch issue when coining database IRIs for Snowflake accounts, ensuring consistent IRI generation.
SSIS collector: Corrected package lookup logic to prioritize the Package GUID, using the Version GUID only when multiple versions share the same Package GUID.
Talend collector: Fixed a classification issue where Talend Jobs were incorrectly identified as activities instead of agents.
Snowflake collector: Implemented a workaround for a Snowflake JDBC driver defect that was triggered by unexpected datatypes, preventing runtime errors.
SQL Server collector: The collector now harvests table size for tables in non-dbo schemas, ensuring complete table metadata coverage.
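The SSIS package lookup rule described above (match on Package GUID first, fall back to Version GUID only when several versions share that Package GUID) can be sketched as follows. This is an illustration under stated assumptions, not the collector's implementation: the `Package` class, the `find_package` function, and the choice to return `None` when no version matches are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Package:
    name: str
    package_guid: str
    version_guid: str


def find_package(
    packages: list[Package], package_guid: str, version_guid: Optional[str] = None
) -> Optional[Package]:
    """Look up a package by Package GUID, using Version GUID only as a tie-breaker."""
    matches = [p for p in packages if p.package_guid == package_guid]
    if len(matches) <= 1:
        # Zero or one candidate: the Version GUID is not consulted at all.
        return matches[0] if matches else None
    # Several versions share the Package GUID: disambiguate by Version GUID.
    for candidate in matches:
        if candidate.version_guid == version_guid:
            return candidate
    return None  # Assumption: an ambiguous lookup with no version match yields nothing.


# Hypothetical sample data with shortened GUID placeholders.
packages = [
    Package("Load.dtsx", "pkg-A", "ver-1"),
    Package("Load.dtsx", "pkg-A", "ver-2"),
    Package("Export.dtsx", "pkg-B", "ver-9"),
]

print(find_package(packages, "pkg-B", "ver-unknown").name)  # Export.dtsx
print(find_package(packages, "pkg-A", "ver-2").version_guid)  # ver-2
```

In the first call the Version GUID does not match anything, yet the lookup still succeeds because only one package carries that Package GUID.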
Release version 2.301
Important
Published versions of collectors are available as a Docker image and a JAR file.
Details about the release
| Item | Details |
|---|---|
| Release version | 2.301 |
| Release date | 17 October, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
Qlik Sense collector: Added lineage between Qlik Sense apps and datasets, providing greater visibility into data flow and dependencies.
Marquez collector: Added column-level lineage, offering more granular insight into data transformations and relationships.
Bug fixes
Databricks collector: Fixed the creation of invalid agent IRIs, ensuring accurate and consistent identifier generation.
Qlik Sense collector: Resolved a memory leak related to closing WebSocket connections, improving stability and performance.
AWS Glue collector: Added handling for large object size properties, preventing errors during metadata harvesting for large datasets.
Release version 2.300
Important
Published versions of collectors are available as a Docker image and a JAR file.
Details about the release
| Item | Details |
|---|---|
| Release version | 2.300 |
| Release date | 9 October, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
Marquez collector: The collector now harvests job run history. A new optional parameter, --job-run-count, is now available to enable this feature.
Bug fixes
Tableau collector: Fixed an issue where relationships between sheets and fields were not being harvested under certain conditions, ensuring more complete lineage capture.
Release version 2.299
Important
Published versions of collectors are available as a Docker image and a JAR file.
Details about the release
| Item | Details |
|---|---|
| Release version | 2.299 |
| Release date | 2 October, 2025 |
| Docker image ID | |
| Jar file | |
New features and changes
PostgreSQL collector: Now harvests all view definitions, regardless of user select privileges, ensuring more complete metadata capture.
OpenAPI collector:
Changed hasParameter to be a containment property, improving accuracy of relationships.
Updated titles for OpenAPI resources to provide clearer labeling in the catalog.
Databricks collector:
Added support for Volume resources and harvesting lineage across tables, jobs, and notebooks.
Improved performance when harvesting lineage, reducing runtime for large environments.
dbt Core and dbt Cloud collectors: Now extract the host from the database-server option when it is provided as a URL, ensuring correct connection details.
Tableau collector:
Added support for JWT tokens for authentication, expanding login options.
Added support for Databricks database servers as a data source.
Now harvests combinedFields, improving completeness of metadata.
Marquez collector: Added support for AWS S3, enabling metadata harvesting for S3 datasets through Marquez.
Oracle collector: Now harvests Oracle roles, extending coverage of user and access metadata.
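The dbt change above, extracting the host when the database-server option is given as a URL, can be illustrated with Python's standard `urllib.parse` module. This is a minimal sketch of the idea, not the collector's code; the function name `host_from_database_server` and the example values are hypothetical.

```python
from urllib.parse import urlsplit


def host_from_database_server(value: str) -> str:
    """Return the host part if value looks like a URL, else value unchanged."""
    if "://" in value:
        host = urlsplit(value).hostname  # drops scheme, port, path, credentials
        if host:
            return host
    return value


print(host_from_database_server("https://db.example.com:5432/analytics"))  # db.example.com
print(host_from_database_server("db.example.com"))                         # db.example.com
```

Both spellings resolve to the same host, so connection details stay consistent regardless of how the option was supplied.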
Release notes for previous versions
Go here to access release notes for previous versions.