Preparing to run the AWS Glue collector

Note

The latest version of the Collector is 2.159. Release notes for this version and all previous versions are available on the release notes page.

Setting up prerequisites for running the collector

Make sure that the machine on which you run the collector meets the following hardware and software requirements.

Table 1.

Item                          Requirement

Hardware
  RAM                         8 GB
  CPU                         2 GHz processor

Software
  Docker                      Required. Get Docker from the Docker website.
  Java Runtime Environment    OpenJDK 17 is supported.

data.world specific objects
  Dataset                     You must have a ddw-catalogs (or other) dataset set up to hold your catalog files when you are done running the collector.
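
The software rows of the table can be checked before a first run. The following is a minimal sketch in Python, not part of the collector itself: the helper names parse_java_major and check_prerequisites are hypothetical, and treating any major version other than 17 as a problem is an assumption (the table only states that OpenJDK 17 is supported).

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_392" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def check_prerequisites():
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if shutil.which("docker") is None:
        problems.append("docker not found on PATH")
    java = shutil.which("java")
    if java is None:
        problems.append("java not found on PATH")
    else:
        # `java -version` normally writes its banner to stderr.
        result = subprocess.run([java, "-version"], capture_output=True, text=True)
        major = parse_java_major(result.stderr + result.stdout)
        # Assumption: flag anything other than 17, the version the table names.
        if major != 17:
            problems.append(f"expected Java 17, found {major}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

The script only confirms that the docker and java executables are on the PATH and reports the Java major version. It does not verify RAM, CPU speed, or the data.world dataset.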



Preparing AWS Glue for the collector

  • You need an AWS credentials file for authentication. The file contains the user profile that determines which AWS account's instance is cataloged. Typically the file (AWS_CREDENTIALS_FILE) is at [user’s home directory]/.aws/credentials. See the AWS documentation on configuration and credential file settings for information on setting up this file.
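
The AWS credentials file is an INI-style file with one section per profile, so it can be inspected before the collector is run. The sketch below uses Python's standard configparser module; the function name missing_keys is hypothetical, and the assumption that a profile needs aws_access_key_id and aws_secret_access_key reflects the common long-term credential layout rather than anything the collector documents (profiles using temporary credentials also carry an aws_session_token).

```python
import configparser

# Assumption: the keys a long-term credential profile is expected to define.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_keys(credentials_path, profile="default"):
    """Return the required keys that the given profile does not define."""
    parser = configparser.ConfigParser()
    # ConfigParser.read() skips unreadable files and returns the list it parsed.
    if not parser.read(str(credentials_path)):
        raise FileNotFoundError(f"cannot read {credentials_path}")
    if not parser.has_section(profile):
        raise KeyError(f"profile [{profile}] not found in {credentials_path}")
    return [key for key in REQUIRED_KEYS if not parser.has_option(profile, key)]
```

For example, missing_keys(pathlib.Path.home() / ".aws" / "credentials", "default") returns an empty list when the default profile defines both keys. The check is purely local: it does not contact AWS or confirm that the credentials are valid.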