Random GetBucketLocation Failures Since Upgrading to v0.9.19
Describe the bug
While testing v0.9.18 we encountered AccessDenied errors on the GetBucketLocation API while policies were trying to write to a central S3 logging bucket in one region. After realizing this was a known issue with a fix slated for v0.9.19, we held off on upgrading from v0.9.14. When v0.9.19 was released we tested it in our PreProd environment, and the issue appeared to be resolved.
Two nights ago we moved forward with the upgrade, and since then we have been receiving random failures. We have several dozen policies running against an Organization of 500+ accounts across 16 regions. Pull-type policies run hourly and daily, and this is where we are seeing the random errors. Event-based policies that run in Lambda do not appear to be affected, or at least we haven't identified any occurrences yet.
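For context, the failing call is the region lookup Custodian performs against the output bucket before writing policy records to it (see the log excerpt further down). A minimal boto3 sketch of that lookup, with a placeholder bucket name and not Custodian's actual code, reproduces the same condition when the calling role lacks s3:GetBucketLocation on the cross-account bucket:

import boto3
from botocore.exceptions import ClientError

# Sketch only: "central-logging-bucket" stands in for our real bucket name.
s3 = boto3.client("s3")
try:
    resp = s3.get_bucket_location(Bucket="central-logging-bucket")
    # S3 reports us-east-1 as a null LocationConstraint.
    region = resp.get("LocationConstraint") or "us-east-1"
    print("output bucket region:", region)
except ClientError as err:
    if err.response["Error"]["Code"] == "AccessDenied":
        # The failure mode described above: the assumed role in the member
        # account lacks s3:GetBucketLocation on the central logging bucket.
        print("AccessDenied calling GetBucketLocation")
    else:
        raise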
What did you expect to happen?
The policies would run successfully and not receive AccessDenied failures for the GetBucketLocation API.
Cloud Provider
Amazon Web Services (AWS)
Cloud Custodian version and dependency information
Custodian: 0.9.19
Python: 3.9.10 (main, Jan 15 2022, 11:48:04)
[Clang 13.0.0 (clang-1300.0.29.3)]
Platform: posix.uname_result(sysname='Darwin', nodename='PL1USCLT001MAC.local', release='21.6.0', version='Darwin Kernel Version 21.6.0: Mon Aug 22 20:17:10 PDT 2022; root:xnu-8020.140.49~2/RELEASE_X86_64', machine='x86_64')
Using venv: True
Docker: False
Installed:
PyYAML==6.0
Pygments==2.13.0
argcomplete==2.0.0
attrs==22.1.0
aws-xray-sdk==2.10.0
bleach==5.0.1
boto3==1.24.87
botocore==1.27.87
c7n==0.9.19
cachetools==5.2.0
certifi==2022.9.24
charset-normalizer==2.1.1
click==8.1.3
colorama==0.4.5
coverage==6.5.0
docutils==0.17.1
execnet==1.9.0
flake8==3.9.2
freezegun==1.2.2
google-api-core==2.10.1
google-api-python-client==2.64.0
google-auth==2.12.0
google-auth-httplib2==0.1.0
google-cloud-appengine-logging==1.1.5
google-cloud-audit-log==0.2.4
google-cloud-core==2.3.2
google-cloud-logging==3.2.4
google-cloud-monitoring==2.11.2
google-cloud-storage==1.44.0
google-crc32c==1.5.0
google-resumable-media==2.4.0
googleapis-common-protos==1.56.4
grpc-google-iam-v1==0.12.4
grpcio==1.49.1
grpcio-status==1.49.1
httplib2==0.20.4
idna==3.4
importlib-metadata==4.13.0
importlib-resources==5.9.0
iniconfig==1.1.1
jaraco.classes==3.2.3
jmespath==1.0.1
jsonpatch==1.32
jsonpointer==2.3
jsonschema==4.16.0
keyring==23.9.3
mccabe==0.6.1
mock==4.0.3
more-itertools==8.14.0
multidict==6.0.2
packaging==21.3
pkginfo==1.8.3
pkgutil-resolve-name==1.3.10
placebo==0.9.0
pluggy==1.0.0
portalocker==2.5.1
proto-plus==1.22.1
protobuf==4.21.7
psutil==5.9.2
py==1.11.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycodestyle==2.7.0
pyflakes==2.3.1
pygments==2.13.0
pyparsing==3.0.9
pyrsistent==0.18.1
pytest==7.1.3
pytest-cov==3.0.0
pytest-forked==1.4.0
pytest-recording==0.12.1
pytest-sugar==0.9.5
pytest-terraform==0.6.4
pytest-xdist==2.5.0
python-dateutil==2.8.2
pyyaml==6.0
ratelimiter==1.2.0.post0
readme-renderer==37.2
requests==2.28.1
requests-toolbelt==0.9.1
retrying==1.3.3
rfc3986==2.0.0
rsa==4.9
s3transfer==0.6.0
six==1.16.0
tabulate==0.8.10
termcolor==2.0.1
tomli==2.0.1
tqdm==4.64.1
twine==3.8.0
typing-extensions==4.3.0
uritemplate==4.1.1
urllib3==1.26.12
vcrpy==4.2.1
webencodings==0.5.1
wrapt==1.14.1
yarl==1.8.1
zipp==3.8.1
Policy
The failure is not tied to a specific policy.
Relevant log/traceback output
2022-10-12 13:20:03,574: c7n_org:ERROR Exception running policy:ec2-optin-start-instances-off-hours-periodic account:XXXXXX region:us-east-2 error:unable to determine a region for output bucket XXX-XXX-XXX: An error occurred (AccessDenied) when calling the GetBucketLocation operation: Access Denied
Extra information or context
No response
Top GitHub Comments
Unfortunately, I hit the following error while running with the updated module:
2022-10-25 15:56:31,337: custodian.aws:WARNING unable to determine output bucket region with HTTP HEAD request: HTTP Error 503: Slow Down
I received a few of these errors in the logs.
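For anyone following along: the 503 suggests the fallback resolves the bucket region from the x-amz-bucket-region header of a plain HTTP HEAD request and is getting throttled. A rough sketch of that approach with backoff on Slow Down, assuming this is how the fallback works and not quoting the actual c7n implementation:

import time
import urllib.error
import urllib.request

def head_bucket_region(bucket, retries=4):
    # Hypothetical helper: resolve a bucket's region from the
    # x-amz-bucket-region header of an anonymous HEAD request.
    url = "https://%s.s3.amazonaws.com" % bucket
    for attempt in range(retries):
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.headers.get("x-amz-bucket-region")
        except urllib.error.HTTPError as err:
            # S3 includes the region header even on 403/404 responses.
            region = err.headers.get("x-amz-bucket-region")
            if region:
                return region
            if err.code == 503:
                # "Slow Down", as seen in the log above; back off and retry.
                time.sleep(2 ** attempt)
                continue
            raise
    return None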
Sounds great. Unless you see any issues with this, I would like to take the updated aws.py module with your changes and place it into our runtime environment. Considering I too am unable to reproduce the issue manually, this is the best way we can test the efficacy of the enhancement.
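In case it helps anyone testing the same way: assuming the module in question is c7n/resources/aws.py, the installed copy to overwrite inside the venv can be located with a snippet like this (hypothetical, adjust to your layout):

import c7n.resources.aws as aws_module

# Print the path of the installed module so the patched aws.py can be
# dropped over it for testing (assumes a standard pip install in the venv).
print(aws_module.__file__)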