(Glue Crawler): unable to pass sqs_queue_arn to S3 target for S3 Event notification crawler
What is the problem?
For the CRAWL_EVENT_MODE recrawl_behavior of the recrawl_policy, there is currently no way to pass the sqs_queue_arn into s3Targets. CloudFormation throws an error without it: "Event queue ARN is required when event crawler is selected."
Reproduction Steps
Create a Glue crawler with the recrawl policy set as follows:
recrawl_policy = glue.CfnCrawler.RecrawlPolicyProperty(recrawl_behavior='CRAWL_EVENT_MODE')
What did you expect to happen?
There should be an "event_queue_arn" parameter in CfnCrawler.S3TargetProperty that accepts the SQS ARN. At the moment there is none.
What actually happened?
CloudFormation throws an error when the SQS ARN is missing: "Event queue ARN is required when event crawler is selected."
CDK CLI Version
1.134.0
Framework Version
No response
Node.js Version
v14.17.6
OS
windows
Language
Python
Language Version
3.9.7
Other information
No response
Issue Analytics
- State:
- Created 2 years ago
- Comments:7 (3 by maintainers)
Top GitHub Comments
Looks like CFN support just landed.
https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/issues/947
Hi @kaizen3031593, thank you for the reply. I only provided the snippet of the recrawl_policy config that I used. The full code looks like this:
crawler_raw = glue.CfnCrawler(
    self,
    'Raw-Crawler',
    name='Raw-Crawler',
    role=glue_crawler_role.role_arn,
    database_name=database_name,
    targets={
        's3Targets': [{"path": f"s3://{bucket_name}/"}]
    },
    configuration="{\"Version\":1.0,\"Grouping\":{\"TableGroupingPolicy\":\"CombineCompatibleSchemas\",\"TableLevelConfiguration\":2},\"CrawlerOutput\": {\"Partitions\": {\"AddOrUpdateBehavior\": \"InheritFromTable\"}}}",
    schema_change_policy=schema_change_policy,
    recrawl_policy=glue.CfnCrawler.RecrawlPolicyProperty(recrawl_behavior='CRAWL_EVENT_MODE')
)
After doing more research, I found that CloudFormation does not currently support the sqs_queue_arn parameter for the s3_target, so there is no automated way to point the crawler at the SQS queue where the S3 notifications are stored. So this is not a CDK issue per se, but a feature not yet implemented in CloudFormation.
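
For readers arriving after the CloudFormation support mentioned in the comments landed, here is a minimal sketch of how the targets dict from the code above might be extended. It assumes a CDK/CloudFormation version whose S3 target exposes EventQueueArn (and optionally DlqEventQueueArn), written here in the same camelCase dict style as the original snippet. The helper name build_event_mode_targets, the bucket name, and the queue ARN are hypothetical and only for illustration.

```python
from typing import Any, Dict, Optional


def build_event_mode_targets(
    bucket_name: str,
    event_queue_arn: str,
    dlq_event_queue_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a `targets` dict for glue.CfnCrawler in CRAWL_EVENT_MODE.

    Assumption: the installed CDK/CloudFormation version supports the
    `eventQueueArn` / `dlqEventQueueArn` keys on an S3 target.
    """
    # Fail early instead of waiting for the CloudFormation deployment error.
    if not event_queue_arn.startswith("arn:"):
        raise ValueError("event_queue_arn must be an SQS queue ARN")

    s3_target: Dict[str, Any] = {
        "path": f"s3://{bucket_name}/",
        "eventQueueArn": event_queue_arn,
    }
    # The dead-letter queue is optional, so only add it when provided.
    if dlq_event_queue_arn is not None:
        s3_target["dlqEventQueueArn"] = dlq_event_queue_arn

    return {"s3Targets": [s3_target]}


# Hypothetical bucket name and queue ARN, for illustration only.
targets = build_event_mode_targets(
    "my-raw-bucket",
    "arn:aws:sqs:eu-west-1:123456789012:raw-bucket-events",
)
```

The resulting dict would be passed as targets=targets to glue.CfnCrawler, alongside the same recrawl_policy with recrawl_behavior='CRAWL_EVENT_MODE'.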