Security

Security is a paramount concern for enterprise customers evaluating new systems. The data.world service is not just cloud-first but also security-first. We’ve designed it from the ground up to support a unique combination of internal and external compliance needs. The following articles document how we keep both your data and your connections safe, and how we enable you to do the same.

Understanding permissions

Permissions on a dataset or project are initially set when the resource is created. If an organization is set as the owner, the permission options are:

  • No one

  • Everyone in the organization

  • Public to the data.world community

[Image: New_dataset_permissions_org.png]

Note

One safeguard against users accidentally publishing enterprise data to the wider community is our standard enterprise team publication configuration: by default, ‘Create public datasets’ is turned off for Enterprise customers.

Owners of datasets and projects can invite specific users to contribute, or approve incoming requests from users who want to contribute. Either way, the owner controls what each contributor can do by granting one of three permission levels:

  • View only

  • View + edit

  • View + edit + manage

Datasets have another layer of access permission: they can be flagged as Discoverable. This kind of access is covered in the section Discoverable datasets.

Here's what each permission level allows a contributor to do:

View only: primarily used for private datasets and projects, this allows the user to simply view the dataset or project. As part of that, the contributor can:

  • Download any of the files.

  • Query the data and export results.

  • View and comment in either public or private discussions.

  • Create new discussion topics.

View + edit: in addition to the view-only permissions, the contributor can:

  • Make edits to descriptions and summaries.

  • Add and remove tags.

  • Add and remove files.

  • Replace files by uploading new versions with the same name.

  • Modify file and column descriptions.

  • Modify license type.

  • Switch the dataset or project between open and private.

  • Publish queries for others to use.

View + edit + manage: the contributor has full admin control over the dataset or project. In addition to the view + edit permissions, they can:

  • Delete the dataset or project.

  • Add, remove, and modify contributors.
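Because each level includes everything the level below it allows, the three levels can be modeled as an ordered scale. The following is a minimal sketch of that idea only, not data.world's implementation; the `Level` enum, the action names, and the `can` helper are hypothetical names chosen for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical ordering of the three contributor levels."""
    VIEW = 1      # View only
    EDIT = 2      # View + edit
    MANAGE = 3    # View + edit + manage


# Minimum level needed for a few of the actions listed above.
# The action names are illustrative, not real API identifiers.
REQUIRED = {
    "download_files": Level.VIEW,
    "query_data": Level.VIEW,
    "comment": Level.VIEW,
    "edit_description": Level.EDIT,
    "add_remove_files": Level.EDIT,
    "change_license": Level.EDIT,
    "publish_query": Level.EDIT,
    "delete_resource": Level.MANAGE,
    "manage_contributors": Level.MANAGE,
}


def can(level: Level, action: str) -> bool:
    """Return True if a contributor at `level` may perform `action`."""
    return level >= REQUIRED[action]
```

With this model, a check such as `can(Level.EDIT, "delete_resource")` is false, while any action allowed at View only is also allowed at the two higher levels.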

How link sharing works

One of the powerful features of our platform is that results from queries in a project can be reused or embedded through shareable links. These links are not discoverable.

When a link to the results of a query is created, it is encoded with the token information of the user who originally ran the query. Every subsequent run from that link also uses the original user's permissions and token. As a further safeguard, even with the link, access is limited to the specific results of that query. Finally, in VPC deployments, share URLs expire after 12 hours.
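The general pattern described here (a link bound to one user, scoped to one query, and optionally expiring) can be sketched with a signed token. This is an illustrative sketch only and makes no claim about how data.world encodes its links; the secret, the payload fields, and both function names are assumptions. The 12-hour lifetime mirrors the VPC expiry mentioned above.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"        # hypothetical server-side signing key
TTL_SECONDS = 12 * 60 * 60     # 12 hours, as for VPC share URLs


def make_link_token(user_id: str, query_id: str, now: int) -> str:
    """Bind a link to the user who ran the query and to that one query."""
    payload = json.dumps(
        {"user": user_id, "query": query_id, "exp": now + TTL_SECONDS},
        sort_keys=True,
    ).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig


def check_link_token(token: str, query_id: str, now: int):
    """Return the original user id, or None if the link is invalid,
    expired, or being used for a different query."""
    encoded, _, sig = token.partition(".")
    try:
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:
        return None
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(payload)
    if claims["query"] != query_id or now >= claims["exp"]:
        return None
    return claims["user"]
```

The scope check on `query` is what keeps a valid link from being reused to reach anything other than the results it was created for.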

Integrations and security

Integrations in data.world are primarily used either to bring data in for querying or for metadata analysis, or to take data out and work with it in a third-party application. Security for both types is comprehensive. In structure, integrations are stored as datasets in data.world.

Integration access

At the core, every integration uses some form of user token to ensure that users can access only the data they are entitled to. For integrations that download data into third-party applications, this token can be created via an OAuth flow, or the user may copy it from their advanced settings page and paste it into the application.

For database integrations, where we connect out to the system, we use the credentials entered by the data engineer. For Athena, the data engineer must also configure their AWS instance to allow us to connect.

Integration visibility and permissions

The integrations available to users are presented on an integrations web page:

[Image: Integrations_page.png]

Organizations in the multi-tenant environment have access to all the publicly available integrations as well as any private integrations that they have specifically created for their organization. Access to private integrations can be set by an organization admin.

Organizations currently using a VPC have implementations that, by default, come with no integrations on their integrations page. This configuration lets them set visibility and access very flexibly. All permission levels and access can be set within the platform by the organization's data administrators.

Security best practices

There are several best practices you can follow to improve the security of your data and manage access to it on data.world.

Use organization-owned connections

The Connection Manager on your organization page restricts connection management to organization administrators. All database and dataset connections are audited and reportable.

Never share keys or tokens

Some third-party applications may require an API token or key to work with data.world. If you have such a key or token, or one for data.world's metadata catalog collector, never share it with anyone else. These tokens run as your user with your permission levels. Every user who needs an API token should have their own, for both security and accountability.
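One common way to avoid sharing a token by accident is to keep it out of scripts and read it from the environment at run time, so each user supplies their own. The sketch below assumes the token is stored in an environment variable named `DW_AUTH_TOKEN` and sent as a Bearer token in the `Authorization` header; both are assumptions made for illustration, so check the API documentation for the integration you are using.

```python
import os


def auth_headers(env_var: str = "DW_AUTH_TOKEN") -> dict:
    """Build request headers from a per-user token kept in the environment.

    The variable name and the Bearer scheme are assumptions for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail loudly rather than falling back to a shared or hard-coded token.
        raise RuntimeError(f"Set {env_var} to your own API token")
    return {"Authorization": f"Bearer {token}"}
```

Because the token never appears in the source file, the script can be committed or passed to a colleague without exposing anyone's credentials.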

Provide masked/limited file previews on discoverable datasets

When evaluating data, users often need to see not only the column names and other descriptive metadata but also some example rows. Applying masking or limits to those sample rows lets you provide them in a way that respects sensitive data and compliance needs.
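As a rough illustration of what a masked, limited preview could look like, the sketch below caps the number of sample rows and hides most of each value in columns marked as sensitive. It is not the platform's masking feature; `mask_value`, `preview`, and the column names are hypothetical.

```python
def mask_value(value, keep: int = 2) -> str:
    """Keep the first `keep` characters and replace the rest with '*'."""
    text = str(value)
    if len(text) <= keep:
        return "*" * len(text)
    return text[:keep] + "*" * (len(text) - keep)


def preview(rows, sensitive, limit: int = 3):
    """Return at most `limit` rows, masking the columns named in `sensitive`."""
    sample = []
    for row in rows[:limit]:
        sample.append(
            {col: mask_value(val) if col in sensitive else val
             for col, val in row.items()}
        )
    return sample


rows = [
    {"name": "Alice", "city": "Austin"},
    {"name": "Bob", "city": "Boston"},
    {"name": "Carol", "city": "Chicago"},
    {"name": "Dan", "city": "Denver"},
]
masked = preview(rows, sensitive={"name"}, limit=2)
```

Here `masked` holds two rows, with the `name` column partially hidden and the `city` column shown as-is, which gives an evaluator the shape of the data without the sensitive values.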