
Inconsistent configuration of S3Result vs S3 Storage with custom S3 backend

See original GitHub issue

Description

The following S3 storage configuration works as expected when connecting to Minio:

import os
from prefect.environments.storage import S3

storage = S3(
    bucket="flows",
    aws_access_key_id=os.getenv("MY_MINIO_ID"),
    aws_secret_access_key=os.getenv("MY_MINIO_KEY"),
    client_options=dict(endpoint_url=os.getenv("MINIO_ENDPOINT")),
)

But when I configure S3Result the same way:

result = S3Result(
    bucket="results",
    boto3_kwargs=dict(
        aws_access_key_id=os.getenv("MY_MINIO_ID"),
        aws_secret_access_key=os.getenv("MY_MINIO_KEY"),
        client_options=dict(endpoint_url=os.getenv("MINIO_ENDPOINT")),
    ),
)

I receive the following error in the UI:

prefect-agent_1      | [2020-06-09 07:59:36] DEBUG - prefect.S3Result | Starting to upload result to 2020/6/9/5a96e8ef-70d1-4d47-a495-15b0afa99169.prefect_result...
prefect-agent_1      | [2020-06-09 07:59:36] ERROR - prefect.CloudTaskRunner | Unexpected error: TypeError("client() got multiple values for keyword argument 'aws_access_key_id'")
prefect-agent_1      | Traceback (most recent call last):
prefect-agent_1      |   File "/usr/local/lib/python3.7/site-packages/prefect/engine/runner.py", line 48, in inner
prefect-agent_1      |     new_state = method(self, state, *args, **kwargs)
prefect-agent_1      |   File "/usr/local/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 986, in get_task_run_state
prefect-agent_1      |     result = self.result.write(value, filename="output", **prefect.context)
prefect-agent_1      |   File "/usr/local/lib/python3.7/site-packages/prefect/engine/results/s3_result.py", line 103, in write
prefect-agent_1      |     self.client.upload_fileobj(stream, Bucket=self.bucket, Key=new.location)
prefect-agent_1      |   File "/usr/local/lib/python3.7/site-packages/prefect/engine/results/s3_result.py", line 60, in client
prefect-agent_1      |     self.initialize_client()
prefect-agent_1      |   File "/usr/local/lib/python3.7/site-packages/prefect/engine/results/s3_result.py", line 49, in initialize_client
prefect-agent_1      |     "s3", credentials=None, use_session=True, **self.boto3_kwargs
prefect-agent_1      |   File "/usr/local/lib/python3.7/site-packages/prefect/utilities/aws.py", line 49, in get_boto_client
prefect-agent_1      |     **kwargs
prefect-agent_1      | TypeError: client() got multiple values for keyword argument 'aws_access_key_id'
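
Judging from the traceback, prefect.utilities.aws.get_boto_client already passes aws_access_key_id to the boto3 client call explicitly (resolved from its credentials argument or from Prefect context) and then also forwards everything in boto3_kwargs, so the same keyword arrives twice. A minimal sketch of the collision, using simplified stand-ins rather than Prefect's actual code:

def client(resource, aws_access_key_id=None, **kwargs):
    # Stand-in for boto3.client; only the signature matters here.
    return resource, aws_access_key_id, kwargs

def get_boto_client(resource, credentials=None, **kwargs):
    # Stand-in for prefect.utilities.aws.get_boto_client: it resolves
    # credentials itself and passes them explicitly...
    key_id = credentials["ACCESS_KEY"] if credentials else None
    # ...while also forwarding the caller's kwargs verbatim.
    return client(resource, aws_access_key_id=key_id, **kwargs)

# boto3_kwargs contained aws_access_key_id, so it lands in **kwargs and
# collides with the explicit keyword above:
get_boto_client("s3", aws_access_key_id="minio-id")
# TypeError: client() got multiple values for keyword argument 'aws_access_key_id'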

Expected Behavior

I expect to be able to configure S3Result by passing boto3 arguments directly (just as S3 storage allows) and to have consistent behavior across these two interfaces.
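
In the meantime, a possible workaround (my assumption, not something confirmed in this issue) is to let boto3 resolve the credentials from its standard AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables and to pass only the endpoint through boto3_kwargs, so no credential keyword is forwarded twice. Note that endpoint_url is a real boto3 client() argument, while the nested client_options dict above mirrors the S3 storage API and would likely be rejected by boto3 even without the collision:

import os
from prefect.engine.results.s3_result import S3Result

# Assumption: the Minio credentials are exported under boto3's standard
# names, so boto3's default credential chain finds them and Prefect never
# has to pass aws_access_key_id explicitly.
os.environ["AWS_ACCESS_KEY_ID"] = os.environ["MY_MINIO_ID"]
os.environ["AWS_SECRET_ACCESS_KEY"] = os.environ["MY_MINIO_KEY"]

result = S3Result(
    bucket="results",
    # endpoint_url goes straight through to boto3's client() call and does
    # not collide with the credential keywords.
    boto3_kwargs=dict(endpoint_url=os.getenv("MINIO_ENDPOINT")),
)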

Reproduction

Sample flow I’m using:

import os
from prefect import task, Flow
from prefect.engine.results.s3_result import S3Result
from prefect.environments.storage import Docker

@task
def add(x, y=1):
    """
    The only task we use so far here ;-)
    """
    return x + y

def create_flow():
    """
    Create the flow
    """
    result = S3Result(
        bucket="results",
        boto3_kwargs=dict(
            aws_access_key_id=os.getenv("MY_MINIO_ID"),
            aws_secret_access_key=os.getenv("MY_MINIO_KEY"),
            client_options=dict(endpoint_url=os.getenv("MINIO_ENDPOINT")),
        ),
    )

    with Flow("Sample Flow", result=result) as flow:
        first_result = add(1, y=2)
        second_result = add(x=first_result, y=100)
    
    storage = Docker()
    storage.add_flow(flow)
    flow.storage = storage

    return flow
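
For a quick local check without the Docker agent (again my own addition, not part of the original report), checkpointing can be forced on so that flow.run() actually calls S3Result.write. Prefect reads its config at import time, so the variable has to be set before prefect is imported, or exported in the shell instead:

import os

# Must happen before `import prefect` anywhere in the process; results
# are only written when checkpointing is enabled.
os.environ["PREFECT__FLOWS__CHECKPOINTING"] = "true"

from sample_flow import create_flow  # hypothetical module holding the flow above

if __name__ == "__main__":
    create_flow().run()  # should fail in S3Result.write with the same TypeError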

Environment

Any additional information about your environment

  • OSX
  • Docker Compose
  • Docker Agent

Optionally run prefect diagnostics from the command line and paste the information here

root@9675cb8de5d4:/opt/packages# prefect diagnostics
{
  "config_overrides": {},
  "env_vars": [
    "PREFECT__LOGGING__LEVEL",
    "PREFECT__SERVER__HOST",
    "PREFECT__BACKEND"
  ],
  "system_information": {
    "platform": "Linux-4.19.76-linuxkit-x86_64-with-debian-10.4",
    "prefect_version": "0.11.5",
    "python_version": "3.7.7"
  }
}

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 18

Top GitHub Comments

1 reaction
joshmeek commented, Jun 9, 2020

@oleksandr Yeah that could be it. Removing that and also adding -U to the pip install should install it from the branch.

0 reactions
joshmeek commented, Jun 16, 2020

Thanks for the follow up! Will keep this issue in mind when doing #2714

