Skip to main content

Preparing to run the PostgreSQL collector

Setting up pre-requisites for running the collector

Make sure that the machine from where you are running the collector meets the following hardware and software requirements.

Table 1.

Item

Requirement

Hardware (for on-premise runs only)

Note: The following specs are based upon running one collector process at a time. Please adjust the hardware if you are running multiple collectors at the same time.

RAM

8 GB

CPU

2 Ghz processor

Software (for on-premise runs only)

Docker

Click here to get Docker.

data.world specific objects (for both cloud and on-premise runs)

Dataset

You must have a ddw-catalogs dataset set up to hold your catalog files when you are done running the collector.

If you are using Catalog Toolkit , follow these instructions to prepare the datasets for collectors.

Network connection

Allowlist IPs and domains



Setting up PostresSQL

Setting up permissions

Run the following SQL statements to set up a new user with appropriate permissions to harvest from PostgreSQL

  1. Create a new role.

    CREATE ROLE ddw_role;
  2. Run the following SQL statement to grant permissions to the new role to harvest metadata:

    GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO ddw_role;
    GRANT SELECT ON ALL TABLES IN SCHEMA "<schemaName>" TO ddw_role;
    GRANT SELECT ON TABLE pg_namespace TO ddw_role;
    GRANT SELECT ON TABLE pg_class TO ddw_role;
    GRANT SELECT ON TABLE pg_description TO ddw_role;
    GRANT SELECT ON TABLE pg_attribute TO ddw_role;
    GRANT SELECT ON TABLE pg_type TO ddw_role;
    GRANT SELECT ON TABLE pg_attrdef TO ddw_role;
    GRANT SELECT ON TABLE pg_index TO ddw_role;
    GRANT SELECT ON TABLE pg_proc TO ddw_role;
    GRANT SELECT ON TABLE pg_database TO ddw_role;
    GRANT SELECT ON TABLE pg_matviews TO ddw_role;
  3. If you want to harvest column statistics metadata, run the following SQL statement to grant permissions to the new role for each database and schema.

    GRANT USAGE ON SCHEMA "<schemaName>" TO ddw_role;
    GRANT SELECT ON ALL TABLES IN SCHEMA "<schemaName>" TO ddw_role;
  4. Create a new user.

    CREATE USER ddw_user PASSWORD "<password>";
    GRANT ddw_role TO ddw_user;

Setting up IAM authentication

Important

This is available only for on-premise collectors.

  • If you plan to use IAM authentication, follow the AWS documentation to set it up. This is an optional task.