Skip to main content

Preparing to run the Microsoft SQL Server collector

Setting up pre-requisites for running the collector

Make sure that the machine from where you are running the collector meets the following hardware and software requirements.

Table 1.

Item

Requirement

Hardware (for on-premise runs only)

Note: The following specs are based upon running one collector process at a time. Please adjust the hardware if you are running multiple collectors at the same time.

RAM

8 GB

CPU

2 Ghz processor

Software (for on-premise runs only)

Docker

Click here to get Docker.

data.world specific objects (for both cloud and on-premise runs)

Dataset

You must have a ddw-catalogs dataset set up to hold your catalog files when you are done running the collector.

If you are using Catalog Toolkit , follow these instructions to prepare the datasets for collectors.

Network connection

Allowlist IPs and domains

Follow these instructions to configure your network.



Setting up permissions

Setting up permissions for username and password authentication

The user you are using to run the collector needs at least SELECT ON DATABASE permission to access the metadata. Users need additional VIEW DEFINITION permission to harvest column-level lineage from views.

To set up permissions for username and password authentication:

  1. Create a new login <loginName> with password <password>.

    CREATE LOGIN '<loginName>' WITH PASSWORD = '<password>'; 
  2. Create a new user.

    CREATE USER <user> FOR LOGIN <loginName>;
  3. Grant SELECT ON DATABASE permissions to harvest catalog resources such as tables, views, and columns.

    GRANT SELECT ON DATABASE :: <databaseName> TO <user>;
  4. Grant VIEW DEFINITION permissions to harvest column-level lineage from views.

    GRANT VIEW DEFINITION ON DATABASE :: <databaseName> TO <user>;
  5. Grant execute sp_spaceused permissions to harvest the size of a table.

Setting up permissions for NTLM authentication

The computer running the collector must be attached to the Active Directory domain. The user you are using to run the collector needs at least SELECT ON DATABASE permission to access the metadata. Users need additional VIEW DEFINITION permission to harvest column-level lineage from views.

To set up permissions for NTLM authentication:

Important

The computer running the collector must be attached to the Active Directory domain.

  1. Create a service account that you want to use to run the collector.

  2. Grant SELECT ON DATABASE permissions to harvest catalog resources such as tables, views, and columns.

    GRANT SELECT ON DATABASE :: <databaseName> TO <user>;
  3. Grant VIEW DEFINITION permissions to harvest column-level lineage from views.

    GRANT VIEW DEFINITION ON DATABASE :: <databaseName> TO <user>;
  4. To gather optional table size information, give the user the public role. This allows the collector to run the sp_spaceused stored procedure. For more details, see the execute sp_spaceused documentation.

    Note

    Make sure that when you setup the collector (on-premise, cloud), you set the following two JDBC properties to use the NTLM authentication - integratedSecurity=true, authenticationScheme=NTLM

Obtaining the server information for Azure Synapse

To find the fully qualified server name:

  1. Navigate to the Azure portal.

  2. Navigate to the Azure Synapse workspace you want to connect to.

  3. Click on Overview.

  4. The full server name is listed as Dedicated SQL endpoint. For example <synapse-workspace>.sql.azuresynapse.net