Setup dr-elephant with EMR
I have compiled and deployed dr-elephant on EMR. It started successfully, and I can tunnel to port 8080 from my local machine.
Compiled version: default dr-elephant configuration. EMR and Spark versions: EMR 5.0.0, Spark 2.0.0.
However, I do not see any job information in dr-elephant. I ran a simple wordcount program, and nothing showed up for it either.
Kerberos is disabled on my EMR cluster.
Elephant.conf
port=8080
#Database configuration
db_url=localhost
db_name=drelephant
db_user=hadoop
db_password=hadoop
jvm_args="-Devolutionplugin=enabled -DapplyEvolutions.default=true -Djava.net.preferIPv4Stack=true -mem 1024 -J-Xloggc:$project_root…/logs/elephant/dr-gc.date +'%Y%m%d%H%M' -J-XX:+PrintGCDetails"
metrics=true
The Dr. Elephant dashboard shows the following:
Hello there, I’ve been busy! I looked through 0 jobs today. About 0 of them could use some tuning. About 0 of them need some serious attention!
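One way to narrow down a "0 jobs" dashboard is to check whether YARN itself reports any finished applications in the recent window. The sketch below queries the ResourceManager REST endpoint `/ws/v1/cluster/apps`. The host `localhost`, port `8088`, and the one-hour window are assumptions for illustration, and the helper names `finished_apps_url` and `parse_apps` are hypothetical. It is only a rough diagnostic, not part of dr-elephant.

```python
import json
import time
import urllib.parse
import urllib.request


def finished_apps_url(rm_host, rm_port=8088, window_ms=3_600_000, now_ms=None):
    """Build a ResourceManager URL listing apps that finished in the last window."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    query = urllib.parse.urlencode({
        "states": "FINISHED",
        "finishedTimeBegin": now_ms - window_ms,
        "finishedTimeEnd": now_ms,
    })
    return f"http://{rm_host}:{rm_port}/ws/v1/cluster/apps?{query}"


def parse_apps(payload):
    """Extract (id, applicationType, finalStatus) tuples from the RM response.

    The RM returns {"apps": null} when nothing matches, so guard against None.
    """
    apps = (payload.get("apps") or {}).get("app") or []
    return [(a.get("id"), a.get("applicationType"), a.get("finalStatus")) for a in apps]


if __name__ == "__main__":
    # Assumption: run on the master node, where the RM web service is reachable.
    url = finished_apps_url("localhost")
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    for app_id, app_type, status in parse_apps(payload):
        print(app_id, app_type, status)
```

If the wordcount job appears in this list but not in the dashboard, the problem is more likely on the fetcher side than in job discovery.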
Issue Analytics
- Created 6 years ago
- Comments:16
Top GitHub Comments
My Spark fetcher looks something like the block below. Our logs are stored on the master node at /var/log/… It seems that the fetcher is able to read the logs once the job has finished, but now I am running into the issue that Spark 2.x is not directly supported: https://github.com/linkedin/dr-elephant/issues/389
<fetcher>
  <applicationtype>spark</applicationtype>
  <classname>com.linkedin.drelephant.spark.fetchers.SparkFetcher</classname>
  <params>
    <use_rest_for_eventlogs>true</use_rest_for_eventlogs>
    <should_process_logs_locally>false</should_process_logs_locally>
  </params>
</fetcher>
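To double-check that a fetcher entry like the one above is well-formed and carries the parameters you expect, something along these lines can help. It is a minimal sketch using only the Python standard library. The enclosing `<fetchers>` root element and the helper name `summarize_fetchers` are assumptions for illustration, and the XML is inlined here rather than read from a real config file.

```python
import xml.etree.ElementTree as ET

# Assumption: fetcher entries sit under a <fetchers> root element.
FETCHER_XML = """
<fetchers>
  <fetcher>
    <applicationtype>spark</applicationtype>
    <classname>com.linkedin.drelephant.spark.fetchers.SparkFetcher</classname>
    <params>
      <use_rest_for_eventlogs>true</use_rest_for_eventlogs>
      <should_process_logs_locally>false</should_process_logs_locally>
    </params>
  </fetcher>
</fetchers>
"""


def summarize_fetchers(xml_text):
    """Return {applicationtype: {"classname": str, "params": dict}} for each fetcher."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    summary = {}
    for fetcher in root.iter("fetcher"):
        app_type = fetcher.findtext("applicationtype", default="").strip()
        params = {}
        params_node = fetcher.find("params")
        if params_node is not None:
            for child in params_node:
                params[child.tag] = (child.text or "").strip()
        summary[app_type] = {
            "classname": fetcher.findtext("classname", default="").strip(),
            "params": params,
        }
    return summary


if __name__ == "__main__":
    for app_type, info in summarize_fetchers(FETCHER_XML).items():
        print(app_type, info["classname"], info["params"])
```

A malformed file fails at `ET.fromstring`, and a missing or misspelled parameter shows up as an absent key in the printed dictionary.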
I think you can try true or false for should_process_logs_locally; that didn't seem to change anything for me.
Hi @dmateusp, I got my dr-elephant up and running.