question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

HDFS command used is hardcoded and not compatible with newer versions

See original GitHub issue

hdfs fs -<command> is deprecated, however this command is hardcoded here: https://github.com/iterative/dvc/blob/832b834b65ec2665ece0e8d1cf756f8ed460441b/dvc/remote/hdfs.py#L37

I am running hadoop 2.7 on which hdfs fs is not supported. This stops me from being able to deploy a dvc pipeline, unless i work around it by mapping every hdfs fs to hdfs dfs command.

DVC: 0.50.1 Installation: apt-get Platform: NAME=“Ubuntu” VERSION=“18.04.2 LTS (Bionic Beaver)” ID=ubuntu ID_LIKE=debian PRETTY_NAME=“Ubuntu 18.04.2 LTS” VERSION_ID=“18.04” HOME_URL=“https://www.ubuntu.com/” SUPPORT_URL=“https://help.ubuntu.com/” BUG_REPORT_URL=“https://bugs.launchpad.net/ubuntu/” PRIVACY_POLICY_URL=“https://www.ubuntu.com/legal/terms-and-policies/privacy-policy” VERSION_CODENAME=bionic UBUNTU_CODENAME=bionic Hadoop version: 2.7.0

I was able to fix the issue by reverting my hadoop installation to v. 1.2.1.

But this is really annoying for users that use an hadoop version that does not support hadoop fs.

Issue Analytics

  • State:closed
  • Created 4 years ago
  • Comments:6 (3 by maintainers)

github_iconTop GitHub Comments

1reaction
falcowinklercommented, Jun 29, 2019

it doesn’t show a deprecation warning and i believe hadoop fs is available in hadoop 1.x while hdfs dfs is not. therefore i would advise, we keep the command as it is

1reaction
efiopcommented, Jun 29, 2019

For the record: related to https://github.com/iterative/dvc/issues/1629 , but it is much easier to adjust the command for now than re-implement HDFS driver with webhdfs.

Read more comments on GitHub >

github_iconTop Results From Across the Web

Apache Hadoop 3.3.4 – HDFS Commands Guide
Run a filesystem command on the file system supported in Hadoop. The various COMMAND_OPTIONS can be found at File System Shell Guide.
Read more >
Is there an equivalent to `pwd` in hdfs? - hadoop
hdfs dfs -pwd does not exist because there is no "working directory" concept in HDFS when you run commands from command line.
Read more >
Installing and Running Hadoop and Spark on Windows
Even though newer versions of Hadoop and Spark are currently available, there is a bug with Hadoop 3.2.1 on Windows that causes installation ......
Read more >
Incompatible Changes in CDH 6.3.1 - Cloudera Documentation
HBASE-18883: Updated our Curator version to 4.0 - Users who experience classpath issues due to version conflicts are recommended to use either ...
Read more >
HDFS and Spark FAQ - Christo Wilson
New Users. In order to use Spark, you need to have a home directory on HDFS. Just because you can login to Achtung,...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found