Preparing to run the Microsoft SQL Server collector

Setting up pre-requisites for running the collector

Make sure that the machine from where you are running the collector meets the following hardware and software requirements.

Table 1.

Item	Requirement
Hardware (for on-premise runs only)	Note: These specifications assume one collector process running at a time. Adjust the hardware if you run multiple collectors at the same time.
RAM	8 GB
CPU	2 GHz processor
Software (for on-premise runs only)	Docker or Java Runtime Environment
Docker	Click here to get Docker.
Java Runtime Environment	OpenJDK 17 is supported and available here.
data.world specific objects (for both cloud and on-premise runs)
Dataset	You must have a ddw-catalogs dataset set up to hold your catalog files when you are done running the collector. If you are using Catalog Toolkit, follow these instructions to prepare the datasets for collectors.
Network connection
Allowlist IPs and domains	Follow these instructions to configure your network. Use these tools to check network connections before running the collector.

Setting up permissions

The collector supports the following authentication methods for Azure Synapse Analytics and Microsoft SQL Server: username/password authentication, NTLM authentication, and Microsoft Entra authentication methods such as Active Directory service principal and Active Directory password authentication.

Setting up permissions for username and password authentication

The user you are using to run the collector needs at least SELECT ON DATABASE permission to access the metadata. Users need additional VIEW DEFINITION permission to harvest column-level lineage from views.

Important

The SQL statements presented here are intended as helpful examples. There are other valid ways to configure permissions in SQL Server, and the exact commands may vary depending on your environment and requirements.

To set up permissions for username and password authentication:

Create a new login <loginName> with password <password>.

CREATE LOGIN <loginName> WITH PASSWORD = '<password>';

Create a new user.

CREATE USER <user> FOR LOGIN <loginName>;

Grant SELECT ON DATABASE permissions to harvest catalog resources such as tables, views, and columns.
```
GRANT SELECT ON DATABASE :: <databaseName> TO <user>;
```
Grant VIEW DEFINITION permissions to harvest column-level lineage from views.
```
GRANT VIEW DEFINITION ON DATABASE :: <databaseName> TO <user>;
```
Optionally, grant permission to execute the sp_spaceused stored procedure to harvest table sizes.
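
The permissions from the steps above can be checked before the first collector run. The following is a minimal sketch, not part of the collector itself. It assumes it is run in the target database by an administrator who is allowed to impersonate <user>. The names <schemaName> and <tableName> are placeholders introduced here for illustration.

```sql
-- Run in the database to be cataloged, as an administrator.
-- Temporarily switch the execution context to the collector user.
EXECUTE AS USER = '<user>';

-- Each column returns 1 when the permission is effectively held, 0 otherwise.
SELECT
    HAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'SELECT')          AS can_select,
    HAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'VIEW DEFINITION') AS can_view_definition;

-- Optional: confirm that table size information is readable.
-- <schemaName> and <tableName> are placeholders for any existing table.
EXEC sp_spaceused N'<schemaName>.<tableName>';

-- Return to the administrator's own execution context.
REVERT;
```

If either column returns 0, or the sp_spaceused call fails with a permission error, revisit the corresponding grant before running the collector.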

Setting up permissions for NTLM authentication

The computer running the collector must be attached to the Active Directory domain. The user you are using to run the collector needs at least SELECT ON DATABASE permission to access the metadata. Users need additional VIEW DEFINITION permission to harvest column-level lineage from views.

To set up permissions for NTLM authentication:

Important

The computer running the collector must be attached to the Active Directory domain.

Create a service account that you want to use to run the collector.
Grant SELECT ON DATABASE permissions to harvest catalog resources such as tables, views, and columns.
```
GRANT SELECT ON DATABASE :: <databaseName> TO <user>;
```
Grant VIEW DEFINITION permissions to harvest column-level lineage from views.
```
GRANT VIEW DEFINITION ON DATABASE :: <databaseName> TO <user>;
```
To gather optional table size information, give the user the public role. This allows the collector to run the sp_spaceused stored procedure. For more details, see the execute sp_spaceused documentation.
Note
Make sure that when you set up the collector (on-premise or cloud), you set the following two JDBC properties to use NTLM authentication: integratedSecurity=true and authenticationScheme=NTLM.
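
The service account step has no SQL of its own in this section. One possible way to map a domain service account to a database user is sketched below, assuming a hypothetical account named <DOMAIN>\<serviceAccount>. The resulting database user is what <user> stands for in the GRANT statements of this section. Your environment may manage logins differently, for example through Windows groups.

```sql
-- Assumption: <DOMAIN>\<serviceAccount> is a placeholder for the domain service account.
-- Create a server login for the Windows (domain) account.
CREATE LOGIN [<DOMAIN>\<serviceAccount>] FROM WINDOWS;

-- Create the matching user in the database to be cataloged.
USE <databaseName>;
CREATE USER [<DOMAIN>\<serviceAccount>] FOR LOGIN [<DOMAIN>\<serviceAccount>];

-- Optional check while the collector is connected: the auth_scheme column
-- shows which scheme the session negotiated (for example NTLM or KERBEROS).
-- Querying these views requires the VIEW SERVER STATE permission.
SELECT s.login_name, c.auth_scheme
FROM sys.dm_exec_connections AS c
JOIN sys.dm_exec_sessions AS s
    ON s.session_id = c.session_id
WHERE s.login_name = N'<DOMAIN>\<serviceAccount>';
```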

Setting up permissions for harvesting agent jobs

To harvest SQL Server Agent jobs, the user running the collector must be assigned the SQLAgentReaderRole in the msdb system database.

Important

This operation must be performed by a user with the sysadmin role.

To grant the necessary permissions:

Run the following statements:

USE msdb;
EXEC sp_addrolemember 'SQLAgentReaderRole', '<username>';
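
Microsoft documents sp_addrolemember as deprecated in favor of ALTER ROLE. An equivalent sketch using ALTER ROLE follows. It assumes that <loginName> is the server login behind <username> and that the user does not yet exist in msdb; skip the CREATE USER statement if it already does.

```sql
USE msdb;

-- A principal must exist as a user in msdb before it can join a role there.
-- Assumption: <loginName> is the server login used by the collector.
CREATE USER <username> FOR LOGIN <loginName>;

-- Same effect as sp_addrolemember, using the current syntax.
ALTER ROLE SQLAgentReaderRole ADD MEMBER <username>;

-- Returns 1 when <username> is a member of the role, 0 when it is not.
SELECT IS_ROLEMEMBER('SQLAgentReaderRole', '<username>') AS is_agent_reader;
```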

Setting up access for using Service Principal authentication

STEP 1: Register a new application:

Go to the Azure Portal.
Go to the App Registrations service.
Click New Registration and enter the following information:
- Application Name: Set as DataDotWorldSQLServerApplication.
- Supported account types: Select Accounts in this organizational directory only.
Click Register to complete the registration.

STEP 2: Create a client secret

On the new application page you created, select Certificates and Secrets.
Under the Client secrets tab, click the New client secret button.
Add a Description.
Set the expiration for the client secret.
Click Add, and copy the secret value.

STEP 3: Grant Service Principal access to SQL Server

In the Azure portal, go to the SQL Server instance you want to catalog.
Click on Access control (IAM).
Click on Add Role Assignment.
Select Reader under the Job function roles tab. Click Next.
In the next page, make sure that Assign access to User, group or service principal is selected.
Click on Select members. Search for DataDotWorldSQLServerApplication that was registered earlier. Click Select.
Finally, click on Review + assign.

STEP 4: Grant Service Principal access to SQL Server database

Run the following SQL command:

CREATE USER [<service-principal>] FROM EXTERNAL PROVIDER;
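
Creating the database user does not by itself grant access to metadata. This section does not state which permissions the service principal needs. Assuming the database-level permissions described for username and password authentication apply here as well, the grants might look like this sketch:

```sql
-- Assumption: the service principal needs the same permissions as a
-- username/password user. <service-principal> is the registered application name.
GRANT SELECT ON DATABASE :: <databaseName> TO [<service-principal>];

-- Only needed to harvest column-level lineage from views.
GRANT VIEW DEFINITION ON DATABASE :: <databaseName> TO [<service-principal>];
```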

Setting up access for active directory password authentication

If the user running the collector is not the Microsoft Entra administrator, add the user to the database if this has not already been done.

To set up permissions for active directory password authentication:

Add the user by running the following SQL command:

CREATE USER [<user-or-service-principal>] FROM EXTERNAL PROVIDER;
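
To confirm that the user was added, the database principals catalog view can be queried. This is a small sketch, assuming it is run in the same database as the CREATE USER statement:

```sql
-- One row is expected if the external user exists in this database.
-- authentication_type_desc indicates how the principal authenticates.
SELECT name, type_desc, authentication_type_desc
FROM sys.database_principals
WHERE name = N'<user-or-service-principal>';
```

An empty result means the user has not been created in this database yet.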

Obtaining the server information for Azure Synapse

To find the fully qualified server name:

Navigate to the Azure portal.
Navigate to the Azure Synapse workspace you want to connect to.
Click on Overview.
The full server name is listed as Dedicated SQL endpoint, for example, <synapse-workspace>.sql.azuresynapse.net.
