NullPointerException when running the most basic example (HBaseSource) on HDP-2.5 Sandbox
Hello, I am experiencing a NullPointerException when running the basic HBaseSource example on the HDP-2.5 Sandbox.
I built an assembly, and here is my submit command:
/usr/hdp/current/spark-client/bin/spark-submit --driver-memory 1024m --class org.apache.spark.sql.execution.datasources.hbase.examples.HBaseSource --master yarn --deploy-mode client --executor-memory 512m --num-executors 4 --files /usr/hdp/current/hbase-master/conf/hbase-site.xml /root/affinytix/tunnel/affinytix-test-tunnel-assembly-1.0.0-SNAPSHOT.jar |& tee /tmp/test-kafka-sparkstreaming.log
And here is the stack trace (it seems to occur while saving, i.e. the first phase of the demo, which populates the table):
16/10/02 08:39:37 INFO ZooKeeperRegistry: ClusterId read in ZooKeeper is null
Exception in thread "main" java.lang.RuntimeException: java.lang.NullPointerException
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
	at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
	at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
	at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
	at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
	at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:193)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:89)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.isTableAvailable(ConnectionManager.java:985)
	at org.apache.hadoop.hbase.client.HBaseAdmin.isTableAvailable(HBaseAdmin.java:1399)
	at org.apache.spark.sql.execution.datasources.hbase.HBaseRelation.createTable(HBaseRelation.scala:87)
	at org.apache.spark.sql.execution.datasources.hbase.DefaultSource.createRelation(HBaseRelation.scala:58)
	at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
	at org.apache.spark.sql.execution.datasources.hbase.examples.HBaseSource$.main(HBaseSource.scala:90)
	at org.apache.spark.sql.execution.datasources.hbase.examples.HBaseSource.main(HBaseSource.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.getMetaReplicaNodes(ZooKeeperWatcher.java:395)
	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:553)
	at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1185)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1152)
	at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:151)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
	... 24 more
Any idea about the cause?
I have looked at the NullPointerException reports in earlier issues, but I am already passing hbase-site.xml and running in YARN client mode as suggested there, so I don't really understand what is wrong.
Thanks for helping
Issue Analytics
- Created 7 years ago
- Comments: 7 (3 by maintainers)
@gypsysunny Could you try copying hbase-site.xml from /usr/hdp/current/hbase-client/conf/ to /usr/hdp/current/spark-client/conf/?
Hello,
It works now. After searching the web and trying different configurations, I found two separate problems:
1 - Pointing to hbase-site.xml with the spark-submit option --files does not work (for a reason I cannot figure out at the moment):
/usr/hdp/current/spark-client/bin/spark-submit --driver-memory 1024m --class org.apache.spark.sql.execution.datasources.hbase.examples.HBaseSource --master yarn --deploy-mode client --executor-memory 512m --files /usr/hdp/current/hbase-client/conf/hbase-site.xml --num-executors 4 /root/affinytix/tunnel/affinytix-test-shc-assembly-1.0.0-g_14.0.1-no-xml.jar |& tee /tmp/test-kafka-sparkstreaming.log
=> hbase-site.xml must be on the application classpath from the start (for instance by placing it in the src/main/resources directory, so it ends up inside the assembly jar).
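For what it's worth, a quick way to verify this before debugging further is to look the file up on the classpath at runtime. This is just a sketch with names of my own (`ClasspathCheck`, `locate` are not part of shc):

```scala
// Sketch: check that hbase-site.xml was really bundled on the classpath
// (e.g. copied from src/main/resources into the assembly jar). If it is
// missing, the HBase client silently falls back to default settings.
object ClasspathCheck {
  // Returns the URL of a classpath resource, or None if it is not visible.
  def locate(resource: String): Option[java.net.URL] =
    Option(getClass.getResource(resource))

  def main(args: Array[String]): Unit =
    locate("/hbase-site.xml") match {
      case Some(url) => println(s"hbase-site.xml found at: $url")
      case None      => sys.error("hbase-site.xml is NOT on the classpath")
    }
}
```

Running this from the same jar you pass to spark-submit tells you immediately whether the configuration file actually made it into the assembly.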
2 - Besides this, the Guava version matters: using a version above 14.0.1 in your project, which overrides the one the shc dependency expects, leads to the stack trace in my earlier post. In an SBT project you can force the version with:
dependencyOverrides += "com.google.guava" % "guava" % "14.0.1"
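As an alternative to pinning Guava, relocating it inside the application jar avoids the clash altogether. A build.sbt sketch, assuming the sbt-assembly plugin is enabled (the `my.shaded.guava` prefix is an arbitrary name of my own):

```scala
// build.sbt fragment (sketch, requires the sbt-assembly plugin):
// rename the application's Guava packages so they can no longer eclipse
// the Guava version that shc and the HBase client were built against.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "my.shaded.guava.@1").inAll
)
```

With shading, the connector keeps seeing its own Guava 14.0.1 classes while your code uses the renamed copy, so no dependencyOverrides is needed.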
So it would be nice to add a warning about the Guava version. Given that the current Guava release is 19 (and will soon be even higher), it conflicts easily with any downstream project that includes this connector.
@weiqingy Any idea why --files /usr/hdp/current/hbase-client/conf/hbase-site.xml is not working?
Many thanks