Overview of data.world
data.world is a modern data catalog that helps enterprises activate metadata, enforce governance, and drive collaboration across teams. Powered by a knowledge graph, the platform brings clarity and context to complex data environments by linking technical metadata, business terms, lineage, and usage insights.
Why enterprises choose data.world:
Scalable and intelligent catalog for managing sprawling data ecosystems
Graph-driven navigation of relationships, lineage, and meaning
Business glossary and collections to align data with human understanding
Curation and enrichment workflows that balance agility with governance
Full-stack integrations and enterprise controls for flexible deployment
This page provides an enterprise-focused overview of data.world for architects, stewards, and platform admins.
Modern data catalog
At the heart of data.world is its metadata-first, graph-powered catalog. It unifies data assets—whether from cloud warehouses, BI tools, or transformation layers—into a single pane of glass enriched with custom metadata, tags, glossary terms, classifications, and ownership.
Flexible metadata model (technical + business + custom)
Asset-level documentation and annotation
Smart faceted search with semantic context via the Knowledge Graph
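As a rough illustration of the flexible metadata model, the sketch below keeps technical, business, and custom metadata as separate layers on one asset record and filters on business facets. The `CatalogAsset` class, its field names, and `facet_search` are hypothetical and are not data.world's schema or API. The filter is a plain exact-match lookup and does not attempt the semantic context that the Knowledge Graph adds.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogAsset:
    """Hypothetical catalog record; not data.world's actual schema."""
    name: str
    # Technical metadata, as it might be harvested from a source system.
    technical: dict = field(default_factory=dict)
    # Business metadata, as it might be added by stewards.
    business: dict = field(default_factory=dict)
    # Organization-specific fields.
    custom: dict = field(default_factory=dict)
    tags: set = field(default_factory=set)


def facet_search(assets, **facets):
    """Return assets whose business metadata matches every requested facet."""
    return [
        asset for asset in assets
        if all(asset.business.get(key) == value for key, value in facets.items())
    ]


orders = CatalogAsset(
    name="orders",
    technical={"type": "table", "source": "BigQuery"},
    business={"owner": "finance-team", "classification": "internal"},
    custom={"retention_days": 365},
    tags={"revenue"},
)
sessions = CatalogAsset(
    name="web_sessions",
    technical={"type": "table", "source": "Databricks"},
    business={"owner": "marketing-team", "classification": "internal"},
)

finance_assets = facet_search([orders, sessions], owner="finance-team")
print([asset.name for asset in finance_assets])
```

Keeping the three layers separate means harvested technical metadata can be refreshed without overwriting what stewards have added.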
Collections: curate and organize meaningfully
Collections are curated groupings of assets that allow teams to organize data in a business-relevant way.
Use them to:
Group assets by domain (for example, Finance, Customer 360, Marketing Analytics)
Assign stewards, apply tags, and enrich with descriptions
Improve discoverability and governance by aligning data with organizational units or initiatives
Collections act as living catalogs within the broader catalog, empowering stewardship and focus.
Curation and enrichment workflows
Curation in data.world combines human-in-the-loop enrichment with automation to ensure assets are trustworthy and business-ready.
Assign data stewards and curators
Annotate resources with glossary terms, classifications, tags, and purpose
Mark assets as certified, deprecated, or in-review
Track changes over time and measure completeness
This allows your organization to scale governance while enabling agility in how data is discovered and used.
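One way to think about measuring completeness is as the share of required metadata fields that are filled in for an asset. The sketch below assumes a hypothetical list of required fields and a hypothetical status enum that mirrors the certified, deprecated, and in-review states. It does not reflect how data.world computes completeness internally.

```python
from enum import Enum


class CurationStatus(Enum):
    """Hypothetical status values mirroring the states named above."""
    CERTIFIED = "certified"
    DEPRECATED = "deprecated"
    IN_REVIEW = "in-review"


# Assumed set of fields a steward is expected to fill in.
REQUIRED_FIELDS = ("description", "steward", "glossary_terms", "classification")


def completeness(metadata: dict) -> float:
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for name in REQUIRED_FIELDS if metadata.get(name))
    return filled / len(REQUIRED_FIELDS)


asset = {
    "description": "Daily order facts",
    "steward": "a.rivera",
    "glossary_terms": [],  # Empty, so it counts as missing.
    "classification": "internal",
    "status": CurationStatus.IN_REVIEW,
}

score = completeness(asset)
print(f"{asset['status'].value}: {score:.0%} complete")
```

A score like this can be tracked over time per collection to show where enrichment work is still needed.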
Business glossary
The business glossary is your organization’s semantic backbone. It creates a shared language that bridges technical data with business understanding.
Define and manage key terms (for example, ARR, Churn Rate, Active Customer)
Link glossary terms to custom resources, tables, metrics, and dashboards
Build term hierarchies, synonyms, and relationships
Assign term owners and track usage across assets
Through its integration with the knowledge graph, glossary terms enhance search and context across the platform.
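To make the idea of synonyms and term-to-asset links concrete, the following sketch resolves a search phrase to its canonical glossary term and lists the assets linked to it. The `GlossaryTerm` class, the asset identifiers, and the owner names are all hypothetical examples, not data.world objects.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GlossaryTerm:
    """Hypothetical glossary entry with an owner, a parent term, and synonyms."""
    name: str
    definition: str
    owner: str
    parent: Optional[str] = None
    synonyms: set = field(default_factory=set)
    linked_assets: set = field(default_factory=set)


def build_lookup(terms):
    """Map every term name and synonym (lowercased) to its canonical term."""
    lookup = {}
    for term in terms:
        for label in {term.name, *term.synonyms}:
            lookup[label.lower()] = term
    return lookup


arr = GlossaryTerm(
    name="ARR",
    definition="Annual recurring revenue",
    owner="finance-team",
    parent="Revenue Metrics",
    synonyms={"Annual Recurring Revenue"},
    linked_assets={"finance.subscriptions", "dash.revenue_overview"},
)
churn = GlossaryTerm(
    name="Churn Rate",
    definition="Share of customers lost in a period",
    owner="customer-success",
    synonyms={"Customer Churn"},
    linked_assets={"finance.subscriptions"},
)

lookup = build_lookup([arr, churn])
match = lookup["annual recurring revenue"]
print(match.name, sorted(match.linked_assets))
```

Resolving synonyms to one canonical term is what lets a search for either phrase surface the same tables and dashboards.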
Data source connectivity
data.world integrates with your existing stack through native connectors and ingestion pipelines. This allows automated metadata harvesting from supported data sources:
Data storage and query systems, such as Databricks, Amazon S3, and BigQuery
Business intelligence systems, such as Power BI, Tableau, and Microsoft Fabric
ETL/ELT systems, such as dbt Core, dbt Cloud, and Fivetran
Knowledge Graph
The Knowledge Graph links assets, people, glossary terms, and systems to create a rich semantic layer that makes discovery, understanding, and trust possible at scale.
Navigate relationships visually
Automatically enrich metadata with context
Enable lineage tracing and impact analysis
Unlike a flat catalog, where each asset is an isolated record, the graph connects assets to related terms, people, and systems, which supports semantic search and contextual insights.
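Impact analysis over a graph amounts to walking lineage edges outward from the asset that is about to change. The sketch below is a minimal breadth-first traversal over a hypothetical in-memory lineage map. The asset names and the `impacted_assets` helper are illustrative assumptions and do not represent data.world's lineage model or API.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that read from it.
DOWNSTREAM = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["marts.revenue", "marts.customer_360"],
    "marts.revenue": ["dash.revenue_overview"],
    "marts.customer_360": [],
    "dash.revenue_overview": [],
}


def impacted_assets(graph, start):
    """Breadth-first walk returning every asset downstream of `start`."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:  # Guard against revisiting shared or cyclic nodes.
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impacted = impacted_assets(DOWNSTREAM, "staging.orders")
print(impacted)
```

Running the same walk over a reversed edge map would trace upstream sources instead of downstream consumers.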
Lineage and governance
Trust and compliance are critical. data.world offers out-of-the-box and custom lineage tracking, alongside governance signals like:
Stewardship assignments
Certification states
Change tracking and metadata completeness
Lineage diagrams that trace upstream/downstream flows
Combined with collections, glossary, and curation workflows, governance becomes visible, actionable, and collaborative.
Enterprise-ready platform
data.world is built to support large-scale organizations and complex security requirements:
SSO / SAML support and granular role-based access control
SOC 2 Type 2 compliance and full audit logs
Rich API and SDK for integration, automation, and orchestration
Whether you’re managing internal policies or supporting regulated industries, data.world is designed to scale with confidence.
Getting started for enterprise teams
Create collections to structure your environment, then configure roles, access, and security controls.
Connect to your data sources to start ingesting metadata.
Define glossary terms and associate them with your catalog resources.
Use curation workflows to assign stewards, enrich metadata, and apply governance policies.
Leverage the knowledge graph for discovery, impact analysis, and continuous improvement.