Troubleshooting Amazon S3 collector issues
Collector runtime and troubleshooting
The catalog collector can take anywhere from several seconds to many minutes to run, depending on the size and complexity of the system being crawled.
If the catalog collector runs without issues, you should see no output on the terminal, and a new file matching *.dwec.ttl should appear in the directory you specified for the output.
If there was an issue connecting to the source or running the catalog collector, there will be either a stack trace or a *.log file. If the errors are not clear, send the stack trace or log file to support for investigation.
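The check described above can be scripted. The following is a minimal sketch, not part of the collector: check_collector_output is a hypothetical helper name, and it assumes that any *.log file is written to the same directory as the *.dwec.ttl output, which may not match your setup.

```shell
# Hypothetical helper: summarize what a collector run left behind.
# Assumption: any *.log file is written next to the *.dwec.ttl output.
check_collector_output() {
  out_dir="${1:-.}"

  # Count catalog files and log files directly inside the output directory.
  ttl_count=$(find "$out_dir" -maxdepth 1 -type f -name '*.dwec.ttl' | wc -l | tr -d ' ')
  log_count=$(find "$out_dir" -maxdepth 1 -type f -name '*.log' | wc -l | tr -d ' ')

  echo "catalog files: $ttl_count, log files: $log_count"

  # Succeed only when at least one catalog file was produced.
  [ "$ttl_count" -gt 0 ]
}
```

Call it with the output directory, for example check_collector_output /path/to/output. A file left over from an earlier run also counts, so clear or check the directory before relying on the result.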
A list of common issues and problems encountered when running the collectors is available here.
Issue 1: An access error occurs while running the collector
Cause: The account used to authenticate to Amazon S3 does not have permissions to read buckets or objects.
Solution: Follow the instructions to set the user and permissions for the collector.
Issue 2: An invalid access token error occurs while running the collector
Cause: The access key for the AWS account has expired or is incorrect.
Solution: Delete the ~/.aws/credentials file, then repeat the steps to obtain an access key and set up the credentials file.
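As a rough sketch of that reset, the two functions below remove the credentials file and then confirm that the new key is accepted. The function names are hypothetical. Using aws configure to rewrite the file is one possible approach and is an assumption here, since the setup steps for the collector may prescribe a different method. AWS_SHARED_CREDENTIALS_FILE is the standard AWS CLI variable for a non-default credentials file location.

```shell
# Hypothetical helper: remove the shared credentials file.
# Defaults to ~/.aws/credentials unless a path or
# AWS_SHARED_CREDENTIALS_FILE is given.
remove_credentials_file() {
  cred_file="${1:-${AWS_SHARED_CREDENTIALS_FILE:-$HOME/.aws/credentials}}"
  if [ -f "$cred_file" ]; then
    rm -- "$cred_file"
    echo "Removed $cred_file"
  else
    echo "No credentials file at $cred_file"
  fi
}

# Hypothetical helper: confirm that the active access key is accepted.
# Fails if the key is invalid or expired.
verify_credentials() {
  aws sts get-caller-identity
}
```

A typical sequence would be remove_credentials_file, then aws configure to enter the new access key, then verify_credentials before re-running the collector.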
Issue 3: Resources from certain S3 buckets are not cataloged
Cause: This issue generally happens when the bucket contains more than 10,000 resources, or more than the value set in the --max-resources parameter.
Solution: Check whether the --max-resources parameter is set and, if so, what value is configured for it. If the bucket holds more resources than that value, raise the value so that it covers the bucket.
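To compare a bucket against the limit, you can count its objects with the AWS CLI. The sketch below assumes that each S3 object counts as one resource, which the collector may define differently. check_bucket_against_limit is a hypothetical helper that reads the output of aws s3 ls with the --recursive and --summarize options and compares the reported total with a limit (10,000 if none is given).

```shell
# Hypothetical helper: read `aws s3 ls --recursive --summarize` output
# on stdin and compare the object total with a limit.
# Assumption: one S3 object corresponds to one cataloged resource.
check_bucket_against_limit() {
  limit="${1:-10000}"

  # The summary ends with a line such as "Total Objects: 12000".
  total=$(awk -F': *' '/Total Objects:/ { print $2 }')

  if [ -z "$total" ]; then
    echo "unknown: no 'Total Objects' line found"
    return 2
  fi
  if [ "$total" -gt "$limit" ]; then
    echo "over: $total objects exceed the limit of $limit"
    return 1
  fi
  echo "ok: $total objects are within the limit of $limit"
}
```

Example: aws s3 ls s3://<bucket> --recursive --summarize | check_bucket_against_limit 10000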
Issue 4: No buckets or objects are cataloged
Cause: The AWS credentials that the collector runs with do not have the permissions needed to list buckets and objects.
Solution:
Confirm that the AWS credentials have the proper permissions to read objects and buckets. You can confirm this with the AWS CLI, using the following commands:
To list buckets: aws s3 ls
To list objects in a bucket: aws s3 ls s3://<bucket>
If the buckets or objects do not appear, review the required permissions and make sure they are set properly.
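The two commands above can be combined into a single check. This is a minimal sketch under stated assumptions: check_s3_access is a hypothetical helper name, and it treats an empty listing the same as a failed one, so it also reports a failure for a bucket that is genuinely empty.

```shell
# Hypothetical helper: run both listing commands and report the first
# one that fails or returns nothing.
check_s3_access() {
  bucket="$1"

  # List buckets: aws s3 ls
  buckets=$(aws s3 ls 2>/dev/null) || { echo "FAIL: cannot list buckets"; return 1; }
  [ -n "$buckets" ] || { echo "FAIL: no buckets visible"; return 1; }

  # List objects in the bucket: aws s3 ls s3://<bucket>
  objects=$(aws s3 ls "s3://$bucket" 2>/dev/null) || { echo "FAIL: cannot list objects in $bucket"; return 1; }
  [ -n "$objects" ] || { echo "FAIL: no objects visible in $bucket"; return 1; }

  echo "OK: buckets and objects are visible"
}
```

Run it as check_s3_access <bucket>, using the same credentials that the collector runs with.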