About data lineage and Eureka Explorer

data.world University!

Check out our Intro to Lineage video!

Introduction to Data Lineage

Data lineage traces the full lifecycle of data: where it originates, how it is transformed, and which systems, processes, and applications ultimately use it. Organizations rely on data lineage for:

  • Ensuring data quality and accuracy

  • Assisting with compliance and auditing

  • Facilitating troubleshooting when issues arise

  • Supporting data governance and collaboration

Lineage in data.world

In data.world, lineage is captured through the platform’s catalog collectors. These collectors can uniquely identify each data resource, even when that resource is referenced by multiple data sources. Our Knowledge Graph Foundation builds on this capability by stitching together information from each collector to create a robust representation of data lineage.

sample_lineage.png
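
The stitching idea described above can be sketched in a few lines. This is only an illustration, not data.world's implementation: the function `stitch_lineage`, the edge format, and the resource identifiers are all hypothetical. It assumes each collector reports lineage as (upstream, downstream) pairs and that both collectors use the same unique identifier for a shared resource.

```python
from collections import defaultdict


def stitch_lineage(collector_outputs):
    """Merge lineage edges reported by several collectors into one graph.

    Each collector output is a list of (upstream_id, downstream_id) pairs.
    The ids are assumed to be globally unique, so a resource seen by two
    collectors collapses into a single node.
    """
    graph = defaultdict(set)
    for edges in collector_outputs:
        for upstream, downstream in edges:
            graph[upstream].add(downstream)
    # Sort the targets so the result is deterministic.
    return {node: sorted(targets) for node, targets in graph.items()}


# Hypothetical identifiers: both collectors refer to the same warehouse table.
warehouse_edges = [("snowflake://sales.raw_orders", "snowflake://sales.orders")]
bi_edges = [("snowflake://sales.orders", "tableau://dashboards/revenue")]

stitched = stitch_lineage([warehouse_edges, bi_edges])
```

Because the shared table carries one identifier in both inputs, the merged graph connects the raw table to the dashboard through it, which is what makes cross-collector lineage possible in this sketch.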

Eureka Explorer: Overview, features, and benefits

Eureka™ Explorer is a visual map of your data and its relationships, powered by a knowledge graph. It combines automated collection of lineage metadata with an easy-to-use interface on top of your data catalog, so data teams can quickly browse the lineage relationships of data resources and answer root-cause, impact, and compliance questions in just a few clicks.

Features

  • Aggregated Summary and Interactive Graph: Explorer lineage delivers a general preview of data flow and a fully interactive graph of technical lineage, including table, column, and query-level details. This makes it a comprehensive tool for understanding data connections, transformations, and dependencies.

  • Simplified Browsing: Lets teams explore lineage relationships on a single screen, with the richness of the data catalog built in, so they spend less time searching for answers.

  • Automatic Collection for Enterprise Customers: Lineage is available to all Enterprise customers and is collected and displayed automatically whenever a supported collector is run. Refer to Supported sources for detailed information on sources with automated lineage collection.

Lineage Benefits

  • Building Trust in Data: Provides essential context that helps business analysts and decision-makers confirm that the data is reliable. Users can examine upstream sources to verify how data was derived, judge its trustworthiness, and resolve issues such as broken dashboards or disrupted data flows.

  • Risk and Impact Analysis: Shows data producers and engineers how a data change affects downstream users, which helps them communicate the change and troubleshoot effectively.
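
Both benefits above come down to walking a lineage graph: upstream for root-cause questions, downstream for impact questions. The following is a minimal sketch of that traversal, assuming lineage is available as a plain list of (upstream, downstream) pairs. The function `reachable` and the resource names are hypothetical and are not part of any data.world API.

```python
from collections import deque


def reachable(edges, start, direction="downstream"):
    """Return every resource reachable from `start` along lineage edges.

    direction="downstream" answers "what is affected if this changes?"
    direction="upstream" answers "where did this data come from?"
    """
    neighbors = {}
    for upstream, downstream in edges:
        if direction == "downstream":
            src, dst = upstream, downstream
        else:
            src, dst = downstream, upstream
        neighbors.setdefault(src, []).append(dst)

    seen = set()
    queue = deque([start])
    while queue:  # breadth-first walk; `seen` guards against cycles
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical lineage edges for illustration only.
EDGES = [
    ("raw_orders", "orders"),
    ("orders", "revenue_dashboard"),
    ("orders", "churn_model"),
    ("customers", "churn_model"),
]

impact = reachable(EDGES, "orders")                                  # downstream
root_causes = reachable(EDGES, "churn_model", direction="upstream")  # upstream
```

Here `impact` holds the resources that depend on `orders`, and `root_causes` holds everything that feeds `churn_model`, directly or indirectly.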

Cross-System Lineage Relationships

Lineage relationships between two objects from different source systems can be captured when the collectors for both sources harvest those relationships. The exception is when the source metadata is missing or is insufficient to identify both resources unambiguously.

Eureka Explorer in data.world offers an intuitive, user-friendly, and visually appealing approach to tracking and exploring data lineage. It supports organizations in maintaining a clear and comprehensive understanding of data pathways across their data infrastructure, enhancing data governance and operational efficiency.

Access needed for viewing Explorer lineage

  • Users who have View access to a resource can see the Explorer lineage on the Resource page and interact with it in fullscreen mode.

  • If the lineage graph contains resources that a user does not have permission to see, the user sees an indication that there is upstream or downstream content they cannot view.

    lineage_limited_view01.png
    lineage_limited_view02.png
    lineage_limited_view03.png
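
The limited-view behavior described above can be illustrated with a short sketch. It is an assumption about how such filtering could work, not a description of the product's internals: the function `visible_lineage`, the edge format, and the resource names are hypothetical.

```python
def visible_lineage(edges, viewable):
    """Filter lineage edges down to what one user is allowed to see.

    Returns the fully visible edges plus two sets of visible resources:
    those with hidden upstream content and those with hidden downstream
    content, so a UI could show an indicator next to them.
    """
    shown = []
    hidden_upstream = set()
    hidden_downstream = set()
    for upstream, downstream in edges:
        if upstream in viewable and downstream in viewable:
            shown.append((upstream, downstream))
        elif downstream in viewable:
            # The source of this edge is hidden from the user.
            hidden_upstream.add(downstream)
        elif upstream in viewable:
            # The target of this edge is hidden from the user.
            hidden_downstream.add(upstream)
    return shown, hidden_upstream, hidden_downstream


# Hypothetical example: the user cannot view "raw_orders".
edges = [("raw_orders", "orders"), ("orders", "revenue_dashboard")]
shown, hidden_up, hidden_down = visible_lineage(
    edges, viewable={"orders", "revenue_dashboard"}
)
```

In this example the user sees the edge from `orders` to `revenue_dashboard`, and `orders` is flagged as having upstream content the user cannot view.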

Customizing the Lineage pages

You can customize the lineage pages to highlight key metadata about a resource by adding custom fields to the sidebar of the pages. For details on the out-of-the-box fields available in the sidebar and instructions on adding custom fields, please see this documentation.

sample_sidebar.png