How our data connections interact with your data
data.world offers three primary methods for integrating with data from your environment, catering to both security and functionality requirements: Metadata collection, Virtual query capability, and Data extract and import capability.
For most security and compliance requirements, Metadata collection and Virtual query capability together are sufficient. In this configuration, data.world does not store your actual data on our platform, only the metadata. If you do not need the Data extract and import capability, you can disable it to further enhance security.
Metadata collection
The Metadata collection feature captures detailed metadata about your data and transmits it securely.
Enhanced Data Governance: Optionally, the Technical Lineage Server can be enabled for improved data tracing and governance, providing a complete view of data flow and changes.
Information Collected:
Source System Descriptive Information: Comprehensive details about the source systems.
Schema Information: Metadata including tables and columns.
Object-Oriented Descriptive Information: Titles and creation dates of dashboards or reports.
Transmission and Security: All metadata is securely transmitted over HTTPS, with optional support for custom SSL certificates. This non-sensitive information can be managed by data stewards, governance, and security professionals within your organization for enhanced curation and access control.
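As an illustration of the transmission model described above, the following minimal sketch prepares a metadata-only payload for upload over HTTPS, with optional verification against a custom certificate. It uses only the Python standard library and is not data.world's collector code. The endpoint URL, the payload shape, and the reading of "custom SSL certificate" as a custom CA bundle file are all assumptions made for the example.

```python
import json
import ssl
import urllib.request
from typing import Optional


def build_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a TLS context that verifies the server certificate.

    ca_file is a hypothetical path to a custom CA bundle (PEM). When it is
    None, the system's default trust store is used.
    """
    return ssl.create_default_context(cafile=ca_file)


def build_metadata_request(url: str, metadata: dict) -> urllib.request.Request:
    """Prepare (but do not send) an HTTPS POST that carries metadata only."""
    if not url.lower().startswith("https://"):
        raise ValueError("metadata must be sent over HTTPS")
    body = json.dumps(metadata).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and payload, for illustration only.
request = build_metadata_request(
    "https://catalog.example.com/metadata",
    {"table": "orders", "columns": ["id", "created_at"]},
)
context = build_tls_context()  # pass ca_file="/path/to/ca.pem" for a custom CA
# urllib.request.urlopen(request, context=context) would perform the upload.
```

Note that the payload holds table and column names only, never row values, which mirrors the metadata-only model described in this section.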
Virtual query capability
The Virtual query capability allows you to query your data in place, eliminating the need to move it physically onto the data.world platform.
Operational insight: Provides insights directly from your environment, ensuring data remains securely within your infrastructure.
Security and performance:
Data is accessed through a temporary, short-lived connection where only the query results are transmitted and not persisted.
Default 5-minute query timeout ensures efficient performance and security.
Query audit logs are available for monitoring and transparency.
Deployment: Implemented via a virtual appliance, potentially paired with a hardware appliance or a reverse SSH tunnel service. This setup, powered by Trustgrid technology, enables secure, outbound-only encrypted connections.
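The query timeout and audit logging behavior described above can be sketched in a few lines. This is only an illustration of the pattern, using Python's built-in sqlite3 module as a stand-in for a source database; it does not show how data.world or the virtual appliance implements it. The function name run_query, the audit entry fields, and the in-memory list used as an audit log are hypothetical.

```python
import sqlite3
import time

DEFAULT_TIMEOUT_S = 300  # 5 minutes, matching the default described above


def run_query(conn, sql, audit_log, timeout_s=DEFAULT_TIMEOUT_S):
    """Run sql against conn, aborting it once timeout_s has elapsed.

    Rows are returned to the caller and not stored anywhere; only an audit
    entry (query text, status, duration) is recorded.
    """
    start = time.monotonic()
    deadline = start + timeout_s
    # SQLite invokes the handler every 1000 VM instructions; a non-zero
    # return value aborts the statement that is currently running.
    conn.set_progress_handler(lambda: int(time.monotonic() > deadline), 1000)
    status = "ok"
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.DatabaseError as exc:
        if time.monotonic() > deadline:
            status = "timeout"
            raise TimeoutError(f"query exceeded {timeout_s} s") from exc
        status = "error"
        raise
    finally:
        conn.set_progress_handler(None, 1000)  # remove the handler
        audit_log.append(
            {"sql": sql, "status": status, "seconds": time.monotonic() - start}
        )


# Usage: the connection is opened for the query and closed afterwards.
audit_log = []
conn = sqlite3.connect(":memory:")
rows = run_query(conn, "SELECT 1 + 1", audit_log)
conn.close()
```

Every call appends exactly one audit entry, whether the query succeeds, fails, or times out, so the log stays complete even when the result itself is discarded.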
Data extract and import capability
The Data extract and import capability physically moves data into the data.world platform when necessary.
Performance Optimization: Extracted/imported data benefits from enhanced query performance, especially for datasets under 3GB. Larger datasets should be divided or accessed through virtualization for efficiency.
Security and Control: All data stored is encrypted. Customers can opt to disable this capability to ensure no data is persisted on data.world, aligning with organizational compliance and security preferences.
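The sizing guidance above amounts to simple arithmetic, sketched here as a small helper. The function names are hypothetical, and the sketch assumes that "3 GB" means 3 × 10^9 bytes and that "under 3 GB" is a strict limit; adjust the constant if your deployment measures size differently.

```python
LIMIT_BYTES = 3 * 1000**3  # assumption: "3 GB" means 3 * 10^9 bytes


def parts_needed(size_bytes: int, limit_bytes: int = LIMIT_BYTES) -> int:
    """Smallest number of parts such that every part stays under limit_bytes."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    max_part = limit_bytes - 1  # largest size that is still strictly under the limit
    return max(1, -(-size_bytes // max_part))  # ceiling division, at least one part


def plan(size_bytes: int) -> str:
    """Suggest how to bring a dataset of the given size into the platform."""
    if size_bytes < LIMIT_BYTES:
        return "extract as a single dataset"
    return f"split into {parts_needed(size_bytes)} parts or use a virtual query"


GB = 1000**3
print(plan(1 * GB))  # extract as a single dataset
print(plan(7 * GB))  # split into 3 parts or use a virtual query
```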