Troubleshooting Amazon S3 collector issues

Collector runtime and troubleshooting

The catalog collector may take anywhere from several seconds to many minutes to run, depending on the size and complexity of the system being crawled.

  • If the catalog collector runs without issues, you should see no output on the terminal, but a new file matching *.dwec.ttl should appear in the directory you specified for the output.

  • If there was an issue connecting to the source or running the catalog collector, there will be either a stack trace or a *.log file. If the errors are not clear, you can send the stack trace or log file to support to investigate.
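The two outcomes above can be checked with a short script. This is a minimal sketch, not part of the collector: the function name summarize_collector_run is hypothetical, and it assumes that any *.log file is written to the same directory as the *.dwec.ttl output, which may not match your setup.

```python
from pathlib import Path


def summarize_collector_run(output_dir):
    """List the catalog and log files found in output_dir.

    Assumption: log files land in the same directory as the
    *.dwec.ttl output. Adjust the path if yours go elsewhere.
    """
    directory = Path(output_dir)
    # A successful run should leave at least one *.dwec.ttl file.
    catalog_files = sorted(p.name for p in directory.glob("*.dwec.ttl"))
    # A failed run may leave a *.log file to send to support.
    log_files = sorted(p.name for p in directory.glob("*.log"))
    return {"catalog_files": catalog_files, "log_files": log_files}
```

Calling summarize_collector_run on your output directory returns the matching file names. An empty catalog_files list suggests the run did not complete. Files left over from earlier runs will also match, so compare timestamps if the directory is reused.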

Common issues and problems encountered when running the collector are listed below.

Issue 1: An access error occurs while running the collector

  • Cause: The account used to authenticate to Amazon S3 does not have permissions to read buckets or objects.

  • Solution: Follow the instructions to set the user and permissions for the collector.
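Before re-running the collector, it can help to confirm that the account can actually read buckets and objects. The sketch below is an illustration under assumptions, not the collector's own check: probe_read_access is a hypothetical helper, it expects an S3 client such as the one returned by boto3.client("s3") (boto3 is assumed to be installed separately), and it probes only the two read operations named in the cause above. The collector may require other permissions as well.

```python
def probe_read_access(s3_client, bucket):
    """Try two read-only S3 calls and report the outcome of each.

    s3_client is assumed to behave like boto3.client("s3").
    Returns a dict mapping probe name to "ok" or an error code.
    """
    probes = {
        # Can the account list buckets at all?
        "list_buckets": lambda: s3_client.list_buckets(),
        # Can it list at least one object in the given bucket?
        "list_objects": lambda: s3_client.list_objects_v2(Bucket=bucket, MaxKeys=1),
    }
    results = {}
    for name, call in probes.items():
        try:
            call()
            results[name] = "ok"
        except Exception as exc:
            # botocore's ClientError carries the AWS error code in
            # exc.response["Error"]["Code"]; fall back to the class name.
            error = getattr(exc, "response", {}).get("Error", {})
            results[name] = error.get("Code", type(exc).__name__)
    return results
```

A result such as {"list_buckets": "ok", "list_objects": "AccessDenied"} would point to missing object-level read permissions on that bucket.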

Issue 2: An invalid access token error occurs while running the collector

  • Cause: The access key for the AWS account is expired or is incorrect.

  • Solution: Delete the ~/.aws/credentials file, then repeat the steps to obtain the access key and set up the credentials file.
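After recreating the credentials file, a quick structural check can catch typos before the collector is run again. This is a minimal sketch with a hypothetical helper name, missing_credential_keys. It assumes the standard INI layout of ~/.aws/credentials, with a profile section such as [default] holding aws_access_key_id and aws_secret_access_key. It only confirms that the keys are present; it cannot tell whether a key is expired or incorrect.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(path, profile="default"):
    """Return the required keys that are absent or empty for a profile.

    A missing file or missing profile section reports every key as missing.
    """
    # Disable interpolation so special characters in key values are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a nonexistent file is silently skipped
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]
```

For example, missing_credential_keys(Path.home() / ".aws" / "credentials") (with Path imported from pathlib) returns an empty list when both keys are set for the default profile.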

Issue 3: Resources from certain S3 buckets are not cataloged

  • Cause: This issue generally happens when the bucket contains more than 10,000 resources, or more than the limit set in the --max-resources parameter.

  • Solution: Check whether the --max-resources parameter is set and, if so, what value is configured for it.
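To see which buckets are likely affected, you can compare object counts against the limit. The sketch below rests on assumptions: count_objects and buckets_over_limit are hypothetical helpers, the client is expected to behave like boto3.client("s3"), and it treats each S3 object as one "resource", which may not be exactly how the collector counts.

```python
def count_objects(s3_client, bucket):
    """Count the objects in a bucket by paging through list_objects_v2."""
    paginator = s3_client.get_paginator("list_objects_v2")
    # Each page reports how many keys it holds in "KeyCount".
    return sum(page.get("KeyCount", 0) for page in paginator.paginate(Bucket=bucket))


def buckets_over_limit(resource_counts, max_resources=10_000):
    """Return the names of buckets whose count exceeds max_resources.

    resource_counts maps bucket name to object count. The default of
    10,000 mirrors the figure in the cause above; pass the value you
    use for --max-resources if it is set.
    """
    return sorted(
        name for name, count in resource_counts.items()
        if count > max_resources
    )
```

For instance, buckets_over_limit({"logs": 25000, "reports": 120}) returns ["logs"], which marks that bucket as the one whose resources may be cut off. Counting objects in very large buckets issues many list requests, so expect it to take a while.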