
Spark-UI docker container: setting AWS security credentials throws

See original GitHub issue

After building the docker image and attempting to start it (env vars are exported):

docker run -it \
  -e SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=$LOG_DIR -Dspark.hadoop.fs.s3a.access.key=$AWS_ACCESS_KEY_ID -Dspark.hadoop.fs.s3a.secret.key=$AWS_SECRET_ACCESS_KEY -Dfs.s3n.awsAccessKeyId=$AWS_ACCESS_KEY_ID -Dfs.s3n.awsSecretAccessKey=$AWS_SECRET_ACCESS_KEY" \
  -p 18080:18080 \
  glue/sparkui:latest \
  "/opt/spark/bin/spark-class org.apache.spark.deploy.history.HistoryServer"

I get:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/10/15 11:02:47 INFO HistoryServer: Started daemon with process name: 1@f3fe1ed456cf
20/10/15 11:02:47 INFO SignalUtils: Registered signal handler for TERM
20/10/15 11:02:47 INFO SignalUtils: Registered signal handler for HUP
20/10/15 11:02:47 INFO SignalUtils: Registered signal handler for INT
20/10/15 11:02:47 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/10/15 11:02:47 INFO SecurityManager: Changing view acls to: root
20/10/15 11:02:47 INFO SecurityManager: Changing modify acls to: root
20/10/15 11:02:47 INFO SecurityManager: Changing view acls groups to: 
20/10/15 11:02:47 INFO SecurityManager: Changing modify acls groups to: 
20/10/15 11:02:47 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
20/10/15 11:02:48 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions
20/10/15 11:02:48 WARN FileSystem: S3FileSystem is deprecated and will be removed in future releases. Use NativeS3FileSystem or S3AFileSystem instead.
Exception in thread "main" java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:280)
	at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified by setting the fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey properties (respectively).
	at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:74)
	at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.initialize(Jets3tFileSystemStore.java:94)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
	at com.sun.proxy.$Proxy5.initialize(Unknown Source)
	at org.apache.hadoop.fs.s3.S3FileSystem.initialize(S3FileSystem.java:111)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2812)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2849)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2831)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
	at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:117)
	at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:86)
	... 6 more

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

2 reactions
mc-rca commented, Oct 23, 2020

Seeing the same error. I fixed it by adding AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to the container environment directly. However, I think the underlying issue is that the wrong credentials provider is being picked up; in other words, the flag -Dspark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider seems to have no effect.
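The workaround described above might look like the following. This is a sketch based on the original report (image name glue/sparkui:latest and LOG_DIR come from the issue); it passes the credentials into the container environment, where the AWS SDK's default credential chain can pick them up, instead of relying only on the -D flags:

```shell
# Sketch: pass AWS credentials as container environment variables directly,
# in addition to pointing the History Server at the log directory.
docker run -it \
  -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  -e SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=$LOG_DIR" \
  -p 18080:18080 \
  glue/sparkui:latest \
  "/opt/spark/bin/spark-class org.apache.spark.deploy.history.HistoryServer"
```

If you use temporary credentials, you would also pass AWS_SESSION_TOKEN the same way.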

1 reaction
moomindani commented, Dec 25, 2020

fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey are not configuration properties for S3A; they are for S3N. They are also Hadoop-level parameters, so if you want to set them through Spark, you need to prepend spark.hadoop. to each configuration name (spark.hadoop.fs.s3.awsAccessKeyId and spark.hadoop.fs.s3.awsSecretAccessKey).

BTW, when you use s3a:// as the prefix of LOG_DIR, you won’t need to configure S3N credentials at all. If you still see the issue with S3A, can you paste the entire command you used (with credentials masked), and also the value of $LOG_DIR?

Reference: https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Authentication_properties
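The spark.hadoop. prefixing described in the comment above can be sketched as a small helper (the function name hadoop_opt is mine, not from the thread; the property names are the ones discussed in the issue):

```shell
# Sketch: turn a Hadoop configuration property into a JVM -D flag.
# Spark strips the "spark.hadoop." prefix and forwards the rest to the
# Hadoop Configuration, which is how these settings reach the S3 connectors.
hadoop_opt() {
  printf -- '-Dspark.hadoop.%s=%s' "$1" "$2"
}

# Example usage with a placeholder key (never inline real credentials):
hadoop_opt fs.s3.awsAccessKeyId "AKIAEXAMPLE"
```

A SPARK_HISTORY_OPTS value would then be the space-separated concatenation of such flags plus -Dspark.history.fs.logDirectory=$LOG_DIR.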


Top Results From Across the Web

Launching the Spark history server - AWS Glue
Use AWS CloudFormation or Docker to launch the Spark history server and ... On the Configure stack options page, to use the current...

Troubleshoot problems with viewing the Spark UI for AWS ...
Confirm that the AWS credentials (access key and secret key) are valid. If you want to use temporary credentials, then you must use...

Troubleshooting errors with Docker commands when using ...
In some cases, running a Docker command against Amazon ECR may result in an error message. Some common error messages and potential solutions...

Troubleshooting errors in AWS Glue
Check the subnet ID and VPC ID in the message to help you diagnose the issue. Check that you have an Amazon S3...

Develop and test AWS Glue version 3.0 jobs locally using a ...
Solution overview · Prerequisites · Configure AWS credentials · Pull the image from Docker Hub · Run the container · Conclusion · Appendix...
