hdfsBuilderConnect class not found when loading the datasets into HDFS
Environment:
- Python version: 3.6
- Spark version: 2.4.4
- TensorFlow version: 1.14
- TensorFlowOnSpark version: master
- Cluster version: Hadoop 2.8.5
I am running the Hadoop/Spark installation on AWS EMR at the moment.
Describe the bug:
I am trying to run the mnist example and I am having an issue during the data prep step, which uses the tensorflow_datasets package. In my code, mnist_data_setup.py loads the data into HDFS rather than the local file system, as seen below:
```python
import tensorflow_datasets as tfds

mnist, info = tfds.load('mnist', with_info=True, data_dir='hdfs://default/user/hadoop/tensorflow_datas')
```
Perhaps the exception (shown below) does not pertain to TensorFlowOnSpark directly, but I wanted to see if @leewyang can provide some advice/assistance here. I appreciate your time.
Logs:
I am receiving the following when running the Spark application:
```
loadFileSystems error:
(unable to get stack trace for java.lang.NoClassDefFoundError exception: ExceptionUtils::getStackTrace error.)
hdfsBuilderConnect(forceNewInstance=0, nn=default, port=0, kerbTicketCachePath=(NULL), userName=(NULL)) error:
```
Spark Submit Command Line:
I have tried several variations, including providing LD_LIBRARY_PATH to the executor environment (a sketch of that variant is shown after the command below).
```sh
${SPARK_HOME}/bin/spark-submit --deploy-mode cluster \
  --queue default --num-executors 4 \
  --conf spark.executorEnv.CLASSPATH=$(hadoop classpath --glob) \
  --executor-memory 4G --archives mnist/mnist.zip#mnist \
  --jars hdfs:///user/${USER}/tensorflow-hadoop-1.10.0.jar,hdfs:///user/${USER}/spark-tensorflow-connector_2.11-1.10.0.jar \
  TensorFlowOnSpark/examples/mnist/mnist_data_setup.py \
  --output cluster --format tfr
```
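For reference, Spark's `spark.executorEnv.<NAME>` conf is the standard way to set an environment variable on the executors, so the LD_LIBRARY_PATH variant looked roughly like the sketch below (the JVM and native-library paths are illustrative assumptions, typical for a Java 8 install; adjust for your JDK and Hadoop layout):

```sh
# Illustrative variant: spark.executorEnv.<NAME> sets <NAME> in each
# executor's environment. The server directory below is where libjvm.so
# typically lives on Java 8; ${HADOOP_HOME}/lib/native holds libhdfs.so.
${SPARK_HOME}/bin/spark-submit --deploy-mode cluster \
  --queue default --num-executors 4 \
  --conf spark.executorEnv.CLASSPATH=$(hadoop classpath --glob) \
  --conf spark.executorEnv.LD_LIBRARY_PATH=${JAVA_HOME}/jre/lib/amd64/server:${HADOOP_HOME}/lib/native \
  --executor-memory 4G --archives mnist/mnist.zip#mnist \
  --jars hdfs:///user/${USER}/tensorflow-hadoop-1.10.0.jar,hdfs:///user/${USER}/spark-tensorflow-connector_2.11-1.10.0.jar \
  TensorFlowOnSpark/examples/mnist/mnist_data_setup.py \
  --output cluster --format tfr
```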
I have run `hadoop classpath --glob` and verified that the full list of jars is present on both the master and slave nodes.
The weird part is that when I run the same Python snippet in the pyspark shell (after setting up CLASSPATH), it runs perfectly fine:
```python
import tensorflow_datasets as tfds

mnist, info = tfds.load('mnist', with_info=True, data_dir='hdfs://default/user/hadoop/tensorflow_datas')
```
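For completeness, the shell-side setup before launching pyspark was along these lines (a sketch; it assumes the hadoop binary is on the PATH):

```sh
# Expand the Hadoop classpath globs so libhdfs can locate the client jars,
# then launch the shell with CLASSPATH already set.
export CLASSPATH=$(hadoop classpath --glob)
${SPARK_HOME}/bin/pyspark
```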
Is there a known limitation on the length of values that can be passed via spark-submit?
Additionally, see a related issue here.
Top GitHub Comments
You can try setting the CLASSPATH variable at the top of your map_fn with code like this:
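A minimal sketch of this approach, assuming the hadoop binary is on each executor's PATH:

```python
def map_fn(args, ctx):
    import os
    import subprocess

    # Prepend the expanded Hadoop classpath so that libhdfs can find the
    # HDFS client jars at runtime inside this executor. The --glob flag
    # expands the wildcard entries that libhdfs cannot resolve itself.
    classpath = os.environ.get("CLASSPATH", "")
    hadoop_classpath = subprocess.check_output(
        ["hadoop", "classpath", "--glob"]).decode("utf-8").strip()
    os.environ["CLASSPATH"] = classpath + os.pathsep + hadoop_classpath

    # ... the rest of the TensorFlow code follows ...
```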
Hi @leewyang,
Thank you, you have solved the issue! If I find another way to retain the classpath across the executors based on what we pass to spark-submit, I will post back here.
I will close this issue.