
An overview of metadata modeling

Metadata is data that provides detailed information about other data. It encompasses tables, columns, glossary terms, glossary types, and similar entities gathered from source systems to describe the underlying data.

Metadata models serve as frameworks to organize, classify, and structure metadata, ensuring its usability and consistency. These models help others within the organization understand, discover, and use the data.

Metadata modeling is the process of organizing and defining this information in a way that allows it to be easily understood and used. It is essentially a blueprint for how data and its descriptions are structured. A metadata model provides the rules and guidelines on how we document and describe our data so that others, and even future systems, can use it effectively.

Importance of metadata modeling

  • Enhancement of discoverability: Makes datasets easier to search for and access, which matters most on platforms that hold a large number of datasets.

  • Quality assurance: Ensures accuracy and consistency of data throughout the organization.

  • Support for governance: Enforces data policies and regulations, providing connections to various governing policies within the organization.

  • Elevation of data literacy: Makes data understandable and accessible across teams, enabling them to comprehend and employ data effectively.

  • Enabling advanced analytics: Provides contextual information crucial for accurate data analysis and paves the way for sophisticated machine learning models.

Components of the metadata model

  • Resources: Entities or objects being described (for example, tables, columns, dashboards, and glossary types).

  • Attributes: Characteristics or properties assigned to resources (e.g., data owner, technical owner). You can attach many different attributes to a resource to describe it, including ones that indicate its certification status and whether it is accessible for use.

  • Relationships: Connections illustrating how resources relate to one another. For example, tables contain columns, tables are part of databases, and glossary terms describe policies that apply to resources. Together, these relationships tell the story of the metadata model.
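The three components above can be sketched in a few lines of plain Python. This is only an illustrative model, not the data.world API: the class names `Resource` and `Relationship`, the helper `related`, and the sample names (`orders`, `sales_db`, `sales-team`) are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Resource:
    """An entity being described, such as a table or a column (hypothetical class)."""
    name: str
    resource_type: str  # e.g. "table", "column", "database"
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Relationship:
    """A directed, named connection between two resources (hypothetical class)."""
    source: str
    kind: str  # e.g. "contains", "part_of"
    target: str


# Resources, with attributes attached to the table.
orders = Resource(
    "orders",
    "table",
    {"data_owner": "sales-team", "technical_owner": "data-eng"},
)
order_id = Resource("orders.order_id", "column")
sales_db = Resource("sales_db", "database")

# Relationships: the table contains a column and is part of a database.
relationships = [
    Relationship("orders", "contains", "orders.order_id"),
    Relationship("orders", "part_of", "sales_db"),
]


def related(name: str, kind: str) -> list[str]:
    """Return the targets linked from `name` by relationships of the given kind."""
    return [r.target for r in relationships if r.source == name and r.kind == kind]
```

Keeping relationships as separate records, rather than nesting columns inside tables, lets the same resource take part in several kinds of connection at once.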

    metadata_model.png

Steps to create a metadata model

  • Step 1: Define objectives: Clarify what you aim to achieve with metadata. Determine if the goal is to enhance searchability, improve data quality, or address other specific challenges.

  • Step 2: Identify key resources: Pinpoint the essential data resources, such as tables, columns, policies, and reports, that need to be described in your metadata model.

  • Step 3: Review and modify resource types and fields: Examine existing resource types and fields. Decide which fields are necessary and modify or remove any that aren't essential.

  • Step 4: Add custom types and fields: Introduce additional metadata fields for current resources and create custom resource types as needed.

  • Step 5: Enhance glossary types and fields: Extend and customize glossary types and fields to support the unique needs of your organization’s metadata.

  • Step 6: Evaluate and expand collections: Analyze current collections and consider additional metadata fields or custom collection types to enrich the model.

  • Step 7: Establish relationships: Define and document how various data resources are interrelated. For example, one data source might feed into a dashboard, or multiple tables might combine to create a report.

  • Step 8: Review and expand relationships: Reassess existing relationships and introduce custom relationships between resource types as necessary.

  • Step 9: Define rules and constraints: Set guidelines to maintain data quality. This could include creating required fields, implementing rules through exception reports, or managing access controls for resources.

  • Step 10: Implement the metadata model: Use data.world to implement the model and to document and manage metadata. Create a model that is usable and accessible across your organization.
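As a rough illustration of Step 9, the sketch below checks resources against a set of required fields and collects the violations into a simple exception report. It is plain Python written under assumptions, not data.world functionality: the `REQUIRED_FIELDS` mapping, the `exception_report` function, and the sample catalog entries are all hypothetical.

```python
# Hypothetical rule set: which attributes each resource type must carry.
REQUIRED_FIELDS = {
    "table": ["data_owner", "technical_owner"],
    "dashboard": ["data_owner"],
}


def exception_report(resources):
    """List every resource that is missing one or more required fields."""
    report = []
    for res in resources:
        required = REQUIRED_FIELDS.get(res["type"], [])
        attributes = res.get("attributes", {})
        # A field counts as missing if it is absent or empty.
        missing = [f for f in required if not attributes.get(f)]
        if missing:
            report.append({"name": res["name"], "type": res["type"], "missing": missing})
    return report


# Hypothetical catalog contents used only to exercise the rule.
catalog = [
    {"name": "orders", "type": "table",
     "attributes": {"data_owner": "sales-team", "technical_owner": "data-eng"}},
    {"name": "customers", "type": "table",
     "attributes": {"data_owner": "crm-team"}},
    {"name": "revenue_overview", "type": "dashboard", "attributes": {}},
    {"name": "orders.order_id", "type": "column", "attributes": {}},
]

exceptions = exception_report(catalog)
for entry in exceptions:
    print(f"{entry['name']} ({entry['type']}): missing {', '.join(entry['missing'])}")
```

Resource types with no entry in the rule set, such as the column here, pass unchecked, so rules can be introduced one resource type at a time.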

In data.world installations with the Catalog Toolkit, all metadata configurations are managed within a special organization called the catalog configuration.

ctk_catalog_config_organization.png
ctk_catalog_config_organization02.png