
Issue with Spark 1.5.1: es-hadoop "Connection error (check network and/or proxy settings)- all nodes failed"

See original GitHub issue

Hi, I compiled Spark 1.5.1 with Hive and SparkR with the following command:

mvn -Pyarn -Phive -Phive-thriftserver -PsparkR -DskipTests -X clean package

After the installation, I added the file “hive-site.xml” to Spark’s conf directory (this is not a hard copy, it’s a symbolic link). I also created a new directory in Spark containing the elasticsearch-hadoop connector ($SPARK_HOME/elastic/jar/elasticsearch-hadoop-2.1.1.jar).

In order to start the spark-shell with the connector, I’m using the following command:

spark-shell -v --jars /opt/application/Spark/current/elastic/jar/elasticsearch-hadoop-2.1.1.jar --name elasticsearch-hadoop --master yarn-client --conf spark.es.net.ssl=true --conf spark.es.net.http.auth.user=username --conf spark.es.net.http.auth.pass=password --conf spark.es.nodes=loadbalancer.fr --conf spark.es.port=9200 --conf spark.es.field.read.empty.as.null=true --conf spark.es.resource=index/type
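
For reference, the same connector settings can be applied programmatically instead of through --conf flags. This is only a sketch of how I understand it: es-hadoop reads the es.* keys from the SparkConf, and the spark. prefix used above exists only so spark-submit accepts the options (the connector strips it when reading them):

import org.apache.spark.{SparkConf, SparkContext}

// Same settings as the --conf flags above, set directly on the SparkConf
// (host, credentials and resource are the placeholders from this report).
val conf = new SparkConf()
  .setAppName("elasticsearch-hadoop")
  .set("es.nodes", "loadbalancer.fr")
  .set("es.port", "9200")
  .set("es.net.ssl", "true")
  .set("es.net.http.auth.user", "username")
  .set("es.net.http.auth.pass", "password")
  .set("es.field.read.empty.as.null", "true")
  .set("es.resource", "index/type")
val sc = new SparkContext(conf)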

Once Spark has fully loaded and the SparkContext has been created, I can successfully run:

scala> import org.elasticsearch.spark._
import org.elasticsearch.spark._

scala> val RDD = sc.esRDD("index/type/", "?q=Ge*")
RDD: org.apache.spark.rdd.RDD[(String, scala.collection.Map[String,AnyRef])] = ScalaEsRDD[0] at RDD at AbstractEsRDD.scala:17

scala> RDD.count
res1: Long = 39

The result returned by “RDD.count” is correct and matches the one I get from the equivalent curl request:

curl -XGET -k "https://loadbalancer.fr:9200/orange/bank/_search?q=Ge*&pretty" -u username:password

Nevertheless, if I try to retrieve the document sources through the es-hadoop connector, I get the following error:

scala> RDD.collect().foreach(println)
[Stage 0:>                                                          (0 + 2) / 5]
15/11/04 09:43:17 ERROR TaskSetManager: Task 1 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 9, node07.fr): org.apache.spark.util.TaskCompletionListenerException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[node01.fr:9200, loadbalancer.fr:9200, node02.fr:9200, node03.fr:9200, node04.fr:9200, node05.fr:9200, node06.fr:9200, node07.fr:9200]]
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:90)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1822)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1835)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1848)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1919)
        at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:905)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
        at org.apache.spark.rdd.RDD.collect(RDD.scala:904)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:21)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:28)
        at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
        at $iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
        at $iwC$$iwC$$iwC.<init>(<console>:34)
        at $iwC$$iwC.<init>(<console>:36)
        at $iwC.<init>(<console>:38)
        at <init>(<console>:40)
        at .<init>(<console>:44)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
        at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
        at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.util.TaskCompletionListenerException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[node01.fr:9200, loadbalancer.fr:9200, node02.fr:9200, node03.fr:9200, node04.fr:9200, node05.fr:9200, node06.fr:9200, node07.fr:9200]]
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:90)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

This issue also happens when “hive-site.xml” is deleted, and I can reproduce it 100% of the time. The only way to retrieve the document sources is to deactivate SSL and the Search Guard authentication, which is obviously not an acceptable option.
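
A less drastic experiment than disabling security is to pin the connector to the declared address instead of the data-node addresses it discovers. The following is only a sketch: es.nodes.discovery is available in es-hadoop 2.1.x, whereas es.nodes.wan.only (the setting that eventually resolved this for a commenter below) only appeared in later releases, and neither has been verified against this exact setup:

import org.elasticsearch.spark._

// Per-query overrides, merged on top of the SparkConf settings.
val rdd = sc.esRDD(
  "index/type",
  "?q=Ge*",
  Map(
    "es.nodes.discovery" -> "false",   // do not expand es.nodes with discovered data nodes
    "es.nodes"           -> "loadbalancer.fr"
  )
)
rdd.collect().foreach(println)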

Also important: there is nothing related to this error in the logs, neither on the load balancer nor on the nodes.
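
To see more than the summary exception, the connector’s REST layer can be put into TRACE logging. A driver-side sketch (the logger names are the usual es-hadoop packages, and executors log separately, so their log4j.properties would need the same level for the failing tasks to show anything):

import org.apache.log4j.{Level, Logger}

// Log every HTTP request/response made by es-hadoop on the driver.
Logger.getLogger("org.elasticsearch.hadoop.rest").setLevel(Level.TRACE)
Logger.getLogger("org.elasticsearch.hadoop.rest.commonshttp").setLevel(Level.TRACE)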

Note: I also tried to retrieve the sources with the latest beta connector, and the result is the same.

Here are the specs of my environment:

  • Red Hat ES 6.7 x86_64
  • Spark 1.5.1, Scala 2.10.4, Java 1.7.0_85, Hive 1.2.1
  • Global authentication done through Kerberos
  • Elasticsearch 1.6.2, Lucene 4.10.4
  • SSL activated and authentication done through Search Guard

Issue Analytics

  • State: closed
  • Created: 8 years ago
  • Comments: 16 (5 by maintainers)

Top GitHub Comments

2 reactions
vaibhavtupe commented, Nov 22, 2016

@jbaiera @yogeshdarji99, I found the problem: I had not specified the property “es.nodes.wan.only = true”, my bad. Setting this property solved my problem.
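
For anyone landing here, a minimal sketch of that setting: on the command line it becomes --conf spark.es.nodes.wan.only=true, and with it enabled the connector talks only to the addresses listed in es.nodes rather than the node addresses it discovers, which is exactly what breaks behind a load balancer or a Docker bridge.

import org.apache.spark.SparkConf

// Restrict es-hadoop to the declared endpoint(s); discovered node addresses are ignored.
val conf = new SparkConf()
  .set("es.nodes", "loadbalancer.fr")   // or whatever host/IP is reachable from the executors
  .set("es.port", "9200")
  .set("es.nodes.wan.only", "true")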

0 reactions
vaibhavtupe commented, Nov 21, 2016

Hi @jbaiera / @yogeshdarji99, I am also facing the same issue while running in a container. I have a simple application that uses Spark SQL to process data and store it in Elasticsearch.

Elasticsearch version: 2.4.1
Maven dependency added in the project:

<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-spark_2.10</artifactId>
  <version>2.4.0</version>
</dependency>

Project properties:

ENV elasticsearch.nodes 10.222.171.57
ENV elasticsearch.port 31921

Error:

10.202.151.57:31921]
2016-11-21 23:01:18.989  INFO 1 --- [launch worker-0] o.e.spark.sql.EsDataFrameWriter          : Writing to [testAnalyticsApp/proxy_data]
2016-11-21 23:03:26.258  INFO 1 --- [launch worker-0] o.a.c.httpclient.HttpMethodDirector      : I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
2016-11-21 23:03:26.259  INFO 1 --- [launch worker-0] o.a.c.httpclient.HttpMethodDirector      : Retrying request
2016-11-21 23:05:33.618  INFO 1 --- [launch worker-0] o.a.c.httpclient.HttpMethodDirector      : I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
2016-11-21 23:05:33.619  INFO 1 --- [launch worker-0] o.a.c.httpclient.HttpMethodDirector      : Retrying request
2016-11-21 23:07:40.978  INFO 1 --- [launch worker-0] o.a.c.httpclient.HttpMethodDirector      : I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
2016-11-21 23:07:40.979  INFO 1 --- [launch worker-0] o.a.c.httpclient.HttpMethodDirector      : Retrying request
2016-11-21 23:09:48.338 ERROR 1 --- [launch worker-0] o.e.hadoop.rest.NetworkClient            : Node [172.17.0.5:9200] failed (Connection timed out); no other nodes left - aborting...
2016-11-21 23:09:48.345 ERROR 1 --- [launch worker-0] org.apache.spark.executor.Executor       : Exception in task 0.0 in stage 0.0 (TID 0)

org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[172.17.0.5:9200]]
        at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:142) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:434) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:442) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:518) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.hadoop.rest.RestClient.touch(RestClient.java:524) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.hadoop.rest.RestRepository.touch(RestRepository.java:491) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:412) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:400) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:59) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:59) ~[elasticsearch-spark_2.10-2.4.0.jar!/:2.4.0]
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) ~[spark-core_2.10-1.6.1.jar!/:1.6.1]
        at org.apache.spark.scheduler.Task.run(Task.scala:89) ~[spark-core_2.10-1.6.1.jar!/:1.6.1]
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) ~[spark-core_2.10-1.6.1.jar!/:1.6.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
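
The trace shows the write failing while the connector checks the target index against 172.17.0.5:9200, which looks like a container-internal address published by the Elasticsearch node rather than the externally reachable 10.x address and node port from the properties above. A hedged sketch of the write path with the settings passed per call (the input DataFrame and the way it is loaded are placeholders; the saveToEs overload is from elasticsearch-spark_2.10 2.4.0):

import org.apache.spark.sql.SQLContext
import org.elasticsearch.spark.sql._

def writeProxyData(sqlContext: SQLContext): Unit = {
  val df = sqlContext.read.json("/tmp/proxy_data.json")   // placeholder input
  df.saveToEs(
    "testAnalyticsApp/proxy_data",
    Map(
      "es.nodes"          -> "10.222.171.57",
      "es.port"           -> "31921",
      "es.nodes.wan.only" -> "true"   // keeps the connector off the 172.17.x.x bridge address
    )
  )
}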
Read more comments on GitHub >

