Spark-UI docker container: setting AWS security credentials throws an exception
After building the Docker image and attempting to start it (the environment variables are exported):
docker run -it -e SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=$LOG_DIR -Dspark.hadoop.fs.s3a.access.key=$AWS_ACCESS_KEY_ID -Dspark.hadoop.fs.s3a.secret.key=$AWS_SECRET_ACCESS_KEY -Dfs.s3n.awsAccessKeyId=$AWS_ACCESS_KEY_ID -Dfs.s3n.awsSecretAccessKey=$AWS_SECRET_ACCESS_KEY" -p 18080:18080 glue/sparkui:latest "/opt/spark/bin/spark-class org.apache.spark.deploy.history.HistoryServer"
I get:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/10/15 11:02:47 INFO HistoryServer: Started daemon with process name: 1@f3fe1ed456cf
20/10/15 11:02:47 INFO SignalUtils: Registered signal handler for TERM
20/10/15 11:02:47 INFO SignalUtils: Registered signal handler for HUP
20/10/15 11:02:47 INFO SignalUtils: Registered signal handler for INT
20/10/15 11:02:47 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/10/15 11:02:47 INFO SecurityManager: Changing view acls to: root
20/10/15 11:02:47 INFO SecurityManager: Changing modify acls to: root
20/10/15 11:02:47 INFO SecurityManager: Changing view acls groups to:
20/10/15 11:02:47 INFO SecurityManager: Changing modify acls groups to:
20/10/15 11:02:47 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
20/10/15 11:02:48 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions
20/10/15 11:02:48 WARN FileSystem: S3FileSystem is deprecated and will be removed in future releases. Use NativeS3FileSystem or S3AFileSystem instead.
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:280)
at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified by setting the fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey properties (respectively).
at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:74)
at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.initialize(Jets3tFileSystemStore.java:94)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
at com.sun.proxy.$Proxy5.initialize(Unknown Source)
at org.apache.hadoop.fs.s3.S3FileSystem.initialize(S3FileSystem.java:111)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2812)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2849)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2831)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:117)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:86)
... 6 more
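Note on the stack trace: the failure comes from org.apache.hadoop.fs.s3.S3FileSystem, the legacy s3:// connector, which only reads fs.s3.awsAccessKeyId / fs.s3.awsSecretAccessKey and ignores the fs.s3a.* and fs.s3n.* keys passed on the command line. The actual value of $LOG_DIR is not shown in the issue, so the sketch below uses a placeholder bucket to illustrate how the URI scheme selects the connector and, with it, which credential properties apply:

# The scheme of the history log directory decides which Hadoop S3 connector is used,
# and each connector has its own credential properties:
#   s3://...  -> org.apache.hadoop.fs.s3.S3FileSystem             -> fs.s3.awsAccessKeyId / fs.s3.awsSecretAccessKey
#   s3n://... -> org.apache.hadoop.fs.s3native.NativeS3FileSystem -> fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey
#   s3a://... -> org.apache.hadoop.fs.s3a.S3AFileSystem           -> fs.s3a.access.key / fs.s3a.secret.key
# Placeholder bucket and prefix (the real $LOG_DIR is not shown in the issue):
export LOG_DIR="s3a://my-spark-logs-bucket/spark-events/"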
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Seeing the same error. I fixed it by adding AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to the container environment directly. However, I think the issue is that it is picking the wrong credentials provider. In other words, the flag -Dspark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider seems to have no effect.
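A minimal sketch of the workaround described in the comment above, reusing the image, port, and entrypoint from the original command (the bucket in $LOG_DIR is still a placeholder); the S3A default credential provider chain picks the keys up from the container environment:

# Workaround sketch: pass the credentials as plain environment variables so the
# S3A connector's default credential provider chain can find them.
docker run -it \
  -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  -e SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=$LOG_DIR" \
  -p 18080:18080 glue/sparkui:latest \
  "/opt/spark/bin/spark-class org.apache.spark.deploy.history.HistoryServer"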
fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey are not the configuration properties for S3A; they belong to the legacy S3/S3N connectors. They are also Hadoop parameters, so if you want to use them you need to prefix each name with spark.hadoop. (i.e. spark.hadoop.fs.s3.awsAccessKeyId and spark.hadoop.fs.s3.awsSecretAccessKey). By the way, when you use s3a:// as the scheme of LOG_DIR, you won't need to configure S3N credentials at all. If you still see the issue with S3A, can you paste the entire command you used (with credentials masked) and also the value of $LOG_DIR?
Reference: https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Authentication_properties
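A minimal sketch of the configuration suggested in the comment above, assuming an s3a:// log directory (placeholder bucket) and long-lived keys; with temporary credentials you would additionally set fs.s3a.session.token and the TemporaryAWSCredentialsProvider:

# Sketch of the suggested fix: point the history server at an s3a:// log directory
# and pass only the S3A credential keys, prefixed with spark.hadoop. so Spark
# forwards them into the Hadoop configuration.
LOG_DIR="s3a://my-spark-logs-bucket/spark-events/"
SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=$LOG_DIR"
SPARK_HISTORY_OPTS="$SPARK_HISTORY_OPTS -Dspark.hadoop.fs.s3a.access.key=$AWS_ACCESS_KEY_ID"
SPARK_HISTORY_OPTS="$SPARK_HISTORY_OPTS -Dspark.hadoop.fs.s3a.secret.key=$AWS_SECRET_ACCESS_KEY"
docker run -it -e SPARK_HISTORY_OPTS="$SPARK_HISTORY_OPTS" -p 18080:18080 glue/sparkui:latest \
  "/opt/spark/bin/spark-class org.apache.spark.deploy.history.HistoryServer"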