About the Qlik Talend Data Integration collector
Warning
This collector is in public preview. It has passed our standard testing, but it is not yet widely adopted. You might encounter unforeseen edge cases in your environment. data.world is committed to promptly addressing any issues with public preview collectors. If you face any problems, please report them through your Customer Success Director, implementation team, or support team for assistance.
Use this collector to harvest lineage within Talend jobs. Users run jobs from Talend Studio, which generates job files (.properties and .item files). You must provide the collector with the location of the Talend workspace that contains these job files. The collector scans the workspace location, parsing each job's .properties file to harvest job properties and its .item file to harvest lineage. Run the collector on the same system where the Talend workspace with the job files is located.
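Before running the collector, it can help to confirm that the workspace actually contains both file types for each job. The following is a minimal Python sketch of such a check, not the collector's own implementation. It assumes that a job's .properties and .item files share the same base name and sit in the same folder. The function names `find_job_files` and `incomplete_jobs` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

# Assumption: a job's two files share a base name and differ only in extension.
JOB_SUFFIXES = {".properties", ".item"}


def find_job_files(workspace):
    """Group .properties and .item files under `workspace` by shared base name.

    Returns a dict that maps each job's workspace-relative path (without the
    extension) to a dict of {"properties": Path, "item": Path} for the files
    that were found.
    """
    root = Path(workspace)
    jobs = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in JOB_SUFFIXES:
            # Drop only the final extension, so "Job_0.1.item" -> "Job_0.1".
            key = path.with_suffix("").relative_to(root).as_posix()
            jobs.setdefault(key, {})[path.suffix.lstrip(".")] = path
    return jobs


def incomplete_jobs(jobs):
    """Return the names of jobs that are missing one of the two files."""
    return sorted(
        name for name, files in jobs.items()
        if set(files) != {"properties", "item"}
    )
```

Calling `incomplete_jobs(find_job_files("/path/to/workspace"))` with your own workspace path lists any job that has only one of the two files, which may be worth resolving before a harvest.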
Important
The Talend collector can be run on-premises using Docker or JAR files.
Note
The latest version of the collector is 2.258. For details on this version and all previous versions, see the release notes.
What is cataloged?
The collector catalogs the following information.
Object | Information cataloged |
---|---|
Job | |
Relationships between objects
By default, the harvested metadata includes catalog pages for the following resource types. Each catalog page has a relationship to the other related resource types. If the metadata presentation for this data source has been customized with the help of the data.world Solutions team, you may see other resource pages and relationships.
Resource page | Relationship |
---|---|
Job | |
Database Table | |
Amazon S3 files | |
Files in local system | |
Lineage for Talend
The collector catalogs the following lineage information.
Object | Lineage available |
---|---|
Database (Relational database) | |
Amazon S3 | |
Files in the local file system | |