
About data virtualization

When you connect a database to data.world using the Connection manager or an integration from the Integration Gallery, your data continues to live at its source location and is not stored in data.world. This configuration is frequently referred to as data virtualization.

Tip

The Connection manager is the best way to create a live connection that will be owned by an organization. If you need a connection that you will own personally, you will need to create it from the Integration Gallery.

One of the benefits of data virtualization is that it allows you to view and query data on data.world that would otherwise exceed the dataset size limits. It also ensures that you have access to your most current data without needing to worry about scheduling synchronizations or the processing time it would take to import or refresh the data.

When you query a live table using data.world, our system will translate your query from our native SQL dialect into the SQL dialect of the target system. That system will then execute the query on its own hardware and return the results to data.world for display. Another benefit of virtualization is that it makes managing permissions and access to the data easier.
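The query flow described above can be illustrated with a small sketch. This is not data.world's implementation: the function names `translate_to_source_dialect` and `run_virtualized_query` are hypothetical, an in-memory SQLite database stands in for the remote source system, and the "translation" handles only one dialect difference (backtick-quoted identifiers rewritten to double quotes).

```python
import sqlite3


def translate_to_source_dialect(query: str) -> str:
    """Toy dialect translation (hypothetical): backticks -> double quotes.

    A real translator parses the query; this naive replace would also
    alter backticks that appear inside string literals.
    """
    return query.replace("`", '"')


def run_virtualized_query(source: sqlite3.Connection, query: str) -> list:
    """Translate the query, run it on the source system, return the rows."""
    translated = translate_to_source_dialect(query)
    # The source database does the work; only result rows travel back.
    return source.execute(translated).fetchall()


# Stand-in for the remote source database (assumption: SQLite in memory).
source_db = sqlite3.connect(":memory:")
source_db.execute('CREATE TABLE "order items" (sku TEXT, qty INTEGER)')
source_db.executemany(
    'INSERT INTO "order items" VALUES (?, ?)',
    [("A-1", 2), ("B-7", 5)],
)

rows = run_virtualized_query(
    source_db, "SELECT sku, qty FROM `order items` WHERE qty > 3"
)
print(rows)
```

The point of the sketch is the division of labor: the table never leaves the source database, and only the rows that match the query are returned for display.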

Notice

Cloud database providers frequently charge either by the amount of time that queries run on their systems or by the total amount of data scanned during the query. If this describes your database service, then executing queries against live tables in data.world will also incur charges on those systems.
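As a rough illustration of the two billing models mentioned above, the following sketch estimates what a single query might cost. The prices and query sizes are made-up placeholders, not rates from any real provider, and the function names are hypothetical; check your provider's pricing for actual figures.

```python
def estimate_scan_cost(bytes_scanned: int, price_per_tib: float) -> float:
    """Cost under a pay-per-data-scanned model (price is per TiB scanned)."""
    tib_scanned = bytes_scanned / 2**40
    return tib_scanned * price_per_tib


def estimate_runtime_cost(seconds: float, price_per_hour: float) -> float:
    """Cost under a pay-per-compute-time model (price is per hour of runtime)."""
    return seconds / 3600 * price_per_hour


# Placeholder inputs (assumptions): a query scanning 250 GiB at 5.00 per TiB,
# and a query running for 90 seconds at 2.00 per hour of compute.
scan_cost = estimate_scan_cost(bytes_scanned=250 * 2**30, price_per_tib=5.00)
runtime_cost = estimate_runtime_cost(seconds=90, price_per_hour=2.00)

print(f"scan-based estimate:    {scan_cost:.2f}")
print(f"runtime-based estimate: {runtime_cost:.2f}")
```

Because every query against a live table runs on the source system, these per-query amounts add up with each execution, including repeated runs of the same query.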