Databricks Delta Lake connection

Configuring a connection to Databricks Delta Lake

To set up a connection:

  1. On the Organization profile page, go to the Settings tab > Connection manager section.

  2. Click the Add connection button.

  3. In the Add an organization-level connection window, select Databricks Delta Lake.

  4. In the Add a new Databricks Delta Lake connection window, set the following:

    • Host/IP: The URL of the source system.

    • Port (optional): Specify a port only if you connect through a non-default port.

    • HTTP Path: The HTTP path for your Databricks Delta Lake connection. You can obtain it from the Databricks admin console.

    • Database (optional): Specify the database here, or choose it later when you use the connection in a dataset while setting up data extraction or data virtualization.

    • Connection username: The username for connecting to the source system. This user needs at least view permissions on the data you want to extract or virtualize.

    • Personal access token: The access token for the connection user.

  5. Click the Test Databricks Delta Lake configuration button to test your configuration. Always test a connection before saving it to confirm that the application can successfully connect to the source system.

  6. Click Configure to save your configuration.

  7. The saved configuration is added to the list of Organization-level connections, where you can edit or delete it. A Create task option also appears for each connection, but it does not apply to data virtualization or data extraction and should be ignored. The option was originally intended for metadata collection, and that use has since been deprecated. To configure metadata collection, use the Metadata collectors configuration available in the product.

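
Before entering values in the connection window, it can help to check them locally. The following is a minimal sketch, not part of the product: the class `DatabricksConnectionSettings` and the function `find_problems` are hypothetical names, and the checks (required fields are non-empty, port within the valid TCP range) are assumptions based only on the field descriptions above. The Test configuration button remains the authoritative check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatabricksConnectionSettings:
    """Hypothetical container mirroring the form fields (not a product API)."""
    host: str
    http_path: str
    username: str
    personal_access_token: str
    port: Optional[int] = None       # optional in the form
    database: Optional[str] = None   # optional in the form


def find_problems(settings: DatabricksConnectionSettings) -> List[str]:
    """Return human-readable problems; an empty list means nothing obvious is wrong."""
    problems = []
    required = {
        "Host/IP": settings.host,
        "HTTP Path": settings.http_path,
        "Connection username": settings.username,
        "Personal access token": settings.personal_access_token,
    }
    # Every required field must contain something other than whitespace.
    for label, value in required.items():
        if not value or not value.strip():
            problems.append(f"{label} is required")
    # Port is optional, but if given it must be a valid TCP port number.
    if settings.port is not None and not (1 <= settings.port <= 65535):
        problems.append("Port must be between 1 and 65535")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = DatabricksConnectionSettings(
        host="example.cloud.databricks.com",
        http_path="/sql/1.0/warehouses/abc123",
        username="analyst@example.com",
        personal_access_token="REPLACE_WITH_TOKEN",
    )
    print(find_problems(example) or "No obvious problems found")
```

A clean result only means the values are well-formed. It does not confirm that the host is reachable or that the token is valid.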

Configuring advanced SSH Tunnel connection options

For enhanced security, you can configure an SSH tunnel as part of your connection setup. The tunnel routes traffic between your database server and data.world over an encrypted SSH connection.
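
To illustrate the general mechanism only, the sketch below builds a standard OpenSSH local port-forwarding command. It is not how the application configures its tunnel internally, which this page does not describe. The function name `build_ssh_tunnel_command` and all host names, user names, and ports are hypothetical. The `-N` (no remote command), `-L` (local forward), and `-p` (port) flags are standard OpenSSH client options.

```python
import shlex
from typing import List


def build_ssh_tunnel_command(
    bastion_user: str,
    bastion_host: str,
    target_host: str,
    target_port: int,
    local_port: int,
    bastion_port: int = 22,
) -> List[str]:
    """Build an argv list for an OpenSSH local port forward.

    Traffic sent to localhost:local_port is carried over SSH to the
    bastion host, which forwards it to target_host:target_port.
    """
    forward = f"{local_port}:{target_host}:{target_port}"
    return [
        "ssh",
        "-N",                      # open the tunnel only; run no remote command
        "-L", forward,             # local_port:target_host:target_port
        "-p", str(bastion_port),   # SSH port on the bastion host
        f"{bastion_user}@{bastion_host}",
    ]


if __name__ == "__main__":
    # Hypothetical hosts for illustration only.
    command = build_ssh_tunnel_command(
        bastion_user="tunnel_user",
        bastion_host="bastion.example.com",
        target_host="db.internal.example.com",
        target_port=443,
        local_port=8443,
    )
    print(shlex.join(command))
```

The function returns the command as a list rather than a single string, so it can be passed to `subprocess.run` without shell quoting issues.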

Editing connections

To edit a connection:

  1. Locate the connection you want to edit.

  2. Click the three-dot menu and select Edit connection.

  3. In the Edit connection window, make the desired changes. Note that when you edit a connection, you must re-enter its sensitive information, for example, the Personal access token.

  4. After modifying the connection details, click the Test configuration button to recheck the setup and confirm that it works as expected.

Deleting connections

Deleting a connection causes all dependent resources, including virtualized connections and queries, to lose access to their data source. While the resources persist, they cannot access or retrieve data without a functioning connection.

For example, if you delete the connection, an error message appears for any Insight that relies on it.

To delete a connection:

  1. Locate the connection you want to delete.

  2. Click the three-dot menu and select Delete connection.

  3. Confirm the deletion. A deleted connection cannot be restored. To use the source system again, you must set up a new connection.