Cannot use mapreduce under HA mode

See original GitHub issue

Alluxio Version:

2.6.0

Describe the bug

I read the documentation, but it does not work when fs.defaultFS is set to an HA-mode URI. I have tried several ways of writing the URI, but none of them work.
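For reference, the configuration in question looks roughly like this. This is a sketch, not the exact file from the reproduction repo: the HA URI and hostnames are taken from the log output below, and the AbstractFileSystem mapping is the standard one from the Alluxio documentation.

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- comma-separated multi-master authority for Alluxio HA -->
    <value>alluxio://master1:19998,master2:19998,master3:19998/</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.alluxio.impl</name>
    <value>alluxio.hadoop.AlluxioFileSystem</value>
  </property>
</configuration>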

To Reproduce

I created a minimal reproduction repository; just run:

➜  alluxio-ha-mapreduce git:(master) ✗ docker-compose up -d master1 master2 master3 worker
Creating alluxio-ha-mapreduce_master3_1 ... done
Creating alluxio-ha-mapreduce_master2_1 ... done
Creating alluxio-ha-mapreduce_worker_1  ... done
Creating alluxio-ha-mapreduce_master1_1 ... done
➜  alluxio-ha-mapreduce git:(master) ✗ docker exec -it alluxio-ha-mapreduce_worker_1  alluxio fs chmod -R 777 /
Changed permission of / to 777
➜  alluxio-ha-mapreduce git:(master) ✗ docker-compose up -d nodemanager resourcemanager
Creating resourcemanager ... done
Creating nodemanager     ... done
➜  alluxio-ha-mapreduce git:(master) ✗ docker exec resourcemanager hadoop jar /opt/hadoop-3.2.1/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.2.1-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 4KB -resFile /tmp/DFSIO-write.out

2021-08-03 02:41:48,675 INFO fs.TestDFSIO: TestDFSIO.1.8
2021-08-03 02:41:48,676 INFO fs.TestDFSIO: nrFiles = 10
2021-08-03 02:41:48,676 INFO fs.TestDFSIO: nrBytes (MB) = 0.00390625
2021-08-03 02:41:48,676 INFO fs.TestDFSIO: bufferSize = 1000000
2021-08-03 02:41:48,676 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
2021-08-03 02:41:48,962 INFO hadoop.AbstractFileSystem: Creating Alluxio configuration from Hadoop configuration {alluxio.master.rpc.addresses=master1:19998,master2:19998,master3:19998}, uri configuration {alluxio.zookeeper.address=null, alluxio.zookeeper.enabled=false, alluxio.master.rpc.addresses=master1:19998,master2:19998,master3:19998}
2021-08-03 02:41:49,007 INFO hadoop.AbstractFileSystem: Initializing filesystem with connect details master1:19998,master2:19998,master3:19998
2021-08-03 02:41:49,066 INFO network.TieredIdentityFactory: Initialized tiered identity TieredIdentity(node=58a0416639be, rack=null)
2021-08-03 02:41:49,075 INFO fs.TestDFSIO: creating control file: 4096 bytes, 10 files
2021-08-03 02:41:49,229 INFO network.NettyUtils: EPOLL_MODE is available
2021-08-03 02:41:49,835 WARN hadoop.AbstractFileSystem: delete failed: alluxio.exception.FileDoesNotExistException: Path "/benchmarks/TestDFSIO/io_control" does not exist.
2021-08-03 02:41:50,732 INFO fs.TestDFSIO: created control files for: 10 files
2021-08-03 02:41:50,742 WARN hadoop.AbstractFileSystem: delete failed: alluxio.exception.FileDoesNotExistException: Path "/benchmarks/TestDFSIO/io_data" does not exist.
2021-08-03 02:41:50,753 WARN hadoop.AbstractFileSystem: delete failed: alluxio.exception.FileDoesNotExistException: Path "/benchmarks/TestDFSIO/io_write" does not exist.
2021-08-03 02:41:50,838 INFO client.RMProxy: Connecting to ResourceManager at resourcemanager/172.17.0.6:8032
2021-08-03 02:41:50,990 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.YarnClientProtocolProvider due to error:
java.lang.RuntimeException: java.net.URISyntaxException: Malformed IPv6 address at index 11: alluxio://[master1:19998,master2:19998,master3:19998]/
        at org.apache.hadoop.fs.AbstractFileSystem.newInstance(AbstractFileSystem.java:141)
        at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:173)
        at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:258)
        at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:336)
        at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:333)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:333)
        at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:459)
        at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:485)
        at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:178)
        at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:162)
        at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:152)
        at org.apache.hadoop.mapred.YarnClientProtocolProvider.create(YarnClientProtocolProvider.java:34)
        at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:130)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:109)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:102)
        at org.apache.hadoop.mapred.JobClient.init(JobClient.java:475)
        at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:454)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:872)
        at org.apache.hadoop.fs.TestDFSIO.runIOTest(TestDFSIO.java:476)
        at org.apache.hadoop.fs.TestDFSIO.writeTest(TestDFSIO.java:455)
        at org.apache.hadoop.fs.TestDFSIO.run(TestDFSIO.java:872)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.hadoop.fs.TestDFSIO.main(TestDFSIO.java:743)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:139)
        at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:147)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.net.URISyntaxException: Malformed IPv6 address at index 11: alluxio://[master1:19998,master2:19998,master3:19998]/
        at java.net.URI$Parser.fail(URI.java:2848)
        at java.net.URI$Parser.parseIPv6Reference(URI.java:3469)
        at java.net.URI$Parser.parseServer(URI.java:3219)
        at java.net.URI$Parser.parseAuthority(URI.java:3155)
        at java.net.URI$Parser.parseHierarchical(URI.java:3097)
        at java.net.URI$Parser.parse(URI.java:3053)
        at java.net.URI.<init>(URI.java:673)
        at java.net.URI.<init>(URI.java:774)
        at org.apache.hadoop.fs.AbstractFileSystem.getUri(AbstractFileSystem.java:330)
        at org.apache.hadoop.fs.AbstractFileSystem.<init>(AbstractFileSystem.java:274)
        at org.apache.hadoop.fs.DelegateToFileSystem.<init>(DelegateToFileSystem.java:49)
        at alluxio.hadoop.AlluxioFileSystem.<init>(AlluxioFileSystem.java:50)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.fs.AbstractFileSystem.newInstance(AbstractFileSystem.java:135)
        ... 40 more
java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
        at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:116)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:109)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:102)
        at org.apache.hadoop.mapred.JobClient.init(JobClient.java:475)
        at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:454)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:872)
        at org.apache.hadoop.fs.TestDFSIO.runIOTest(TestDFSIO.java:476)
        at org.apache.hadoop.fs.TestDFSIO.writeTest(TestDFSIO.java:455)
        at org.apache.hadoop.fs.TestDFSIO.run(TestDFSIO.java:872)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.hadoop.fs.TestDFSIO.main(TestDFSIO.java:743)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:139)
        at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:147)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
        Suppressed: java.io.IOException: Failed to use org.apache.hadoop.mapred.YarnClientProtocolProvider due to error:
                at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:148)
                ... 25 more
        Caused by: java.lang.RuntimeException: java.net.URISyntaxException: Malformed IPv6 address at index 11: alluxio://[master1:19998,master2:19998,master3:19998]/
                at org.apache.hadoop.fs.AbstractFileSystem.newInstance(AbstractFileSystem.java:141)
                at org.apache.hadoop.fs.AbstractFileSystem.createFileSystem(AbstractFileSystem.java:173)
                at org.apache.hadoop.fs.AbstractFileSystem.get(AbstractFileSystem.java:258)
                at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:336)
                at org.apache.hadoop.fs.FileContext$2.run(FileContext.java:333)
                at java.security.AccessController.doPrivileged(Native Method)
                at javax.security.auth.Subject.doAs(Subject.java:422)
                at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
                at org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:333)
                at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:459)
                at org.apache.hadoop.fs.FileContext.getFileContext(FileContext.java:485)
                at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:178)
                at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:162)
                at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:152)
                at org.apache.hadoop.mapred.YarnClientProtocolProvider.create(YarnClientProtocolProvider.java:34)
                at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:130)
                ... 25 more
        Caused by: java.net.URISyntaxException: Malformed IPv6 address at index 11: alluxio://[master1:19998,master2:19998,master3:19998]/
                at java.net.URI$Parser.fail(URI.java:2848)
                at java.net.URI$Parser.parseIPv6Reference(URI.java:3469)
                at java.net.URI$Parser.parseServer(URI.java:3219)
                at java.net.URI$Parser.parseAuthority(URI.java:3155)
                at java.net.URI$Parser.parseHierarchical(URI.java:3097)
                at java.net.URI$Parser.parse(URI.java:3053)
                at java.net.URI.<init>(URI.java:673)
                at java.net.URI.<init>(URI.java:774)
                at org.apache.hadoop.fs.AbstractFileSystem.getUri(AbstractFileSystem.java:330)
                at org.apache.hadoop.fs.AbstractFileSystem.<init>(AbstractFileSystem.java:274)
                at org.apache.hadoop.fs.DelegateToFileSystem.<init>(DelegateToFileSystem.java:49)
                at alluxio.hadoop.AlluxioFileSystem.<init>(AlluxioFileSystem.java:50)
                at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
                at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
                at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
                at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
                at org.apache.hadoop.fs.AbstractFileSystem.newInstance(AbstractFileSystem.java:135)
                ... 40 more
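
For what it's worth, the root failure is reproducible with java.net.URI alone, outside Hadoop and Alluxio. The bracketed string below is taken verbatim from the trace; java.net.URI's multi-argument constructors wrap any host containing ':' in square brackets on the assumption that it is an IPv6 literal, which appears to be how AbstractFileSystem.getUri produced this form. A minimal sketch:

import java.net.URI;
import java.net.URISyntaxException;

public class HaUriRepro {
    public static void main(String[] args) {
        // Exact string from the stack trace: the comma-separated
        // multi-master authority, bracketed as if it were an IPv6 literal.
        String s = "alluxio://[master1:19998,master2:19998,master3:19998]/";
        try {
            new URI(s);
        } catch (URISyntaxException e) {
            // Prints: Malformed IPv6 address at index 11: alluxio://[...]/
            System.out.println(e.getMessage());
        }
    }
}

In other words, the parse error comes from the JDK's URI grammar rather than from any Alluxio server misconfiguration, which is consistent with this being addressed client-side (see #14021 below).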

Expected behavior

The MapReduce job should be submitted successfully.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

2 reactions
ZhuTopher commented, Sep 27, 2021

The bot closed this issue because #14021 was merged, but if you still run into similar connectivity issues with MapReduce and HA Alluxio after those changes, please re-open this issue. Thank you @qian0817!

1 reaction
ZhuTopher commented, Sep 16, 2021

EDIT: I misunderstood the docker-compose.yml in your repo and thought you had set HDFS as your UFS, so please disregard that part of the following post.


@qian0817 I am in the process of trying to get a successful configuration based on your docker-compose.yml file. There is a possibility of some networking conflicts in Docker with the YARN images that I am running into, but I will update this ticket when I reach a conclusion.

If you would like to try this configuration and see if you run into similar issues, use the following docker-compose.yml:

  • Since you are using the default network created by docker-compose up -d, make sure this file is in the same root directory as the rest of your code.
version: "3"
​
services:
  alluxio-master-1:
    image: alluxio/alluxio:2.6.0
    container_name: "alluxio-master-1"
    restart: always
    ports:
      - "19998:19998"
      - "19999:19999"
    environment:
      ALLUXIO_JAVA_OPTS: >
        -Dalluxio.master.hostname=alluxio-master-1
        -Dalluxio.master.embedded.journal.addresses=alluxio-master-1:19200,alluxio-master-2:19200,alluxio-master-3:19200
        -Dalluxio.master.rpc.addresses=alluxio-master-1:19998,alluxio-master-2:19998,alluxio-master-3:19998
        -Dalluxio.worker.hostname=alluxio-worker
        -Dalluxio.master.mount.table.root.ufs=hdfs://namenode:9000/
        -Dalluxio.master.mount.table.root.option.alluxio.underfs.hdfs.configuration=/etc/hadoop/conf/core-site.xml:/etc/hadoop/conf/hdfs-site.xml
        -Dalluxio.security.authorization.permission.enabled=false
      SERVICE_PRECONDITION: "namenode:9000"
    volumes:
      - ./core-site.xml:/etc/hadoop/conf/core-site.xml
      - ./hdfs-site.xml:/etc/hadoop/conf/hdfs-site.xml
    command: master

  alluxio-master-2:
    image: alluxio/alluxio:2.6.0
    container_name: "alluxio-master-2"
    restart: always
    environment:
      ALLUXIO_JAVA_OPTS: >
        -Dalluxio.master.hostname=alluxio-master-2
        -Dalluxio.master.embedded.journal.addresses=alluxio-master-1:19200,alluxio-master-2:19200,alluxio-master-3:19200
        -Dalluxio.master.rpc.addresses=alluxio-master-1:19998,alluxio-master-2:19998,alluxio-master-3:19998
        -Dalluxio.worker.hostname=alluxio-worker
        -Dalluxio.master.mount.table.root.ufs=hdfs://namenode:9000/
        -Dalluxio.master.mount.table.root.option.alluxio.underfs.hdfs.configuration=/etc/hadoop/conf/core-site.xml:/etc/hadoop/conf/hdfs-site.xml
        -Dalluxio.security.authorization.permission.enabled=false
      SERVICE_PRECONDITION: "namenode:9000"
    volumes:
      - ./core-site.xml:/etc/hadoop/conf/core-site.xml
      - ./hdfs-site.xml:/etc/hadoop/conf/hdfs-site.xml
    command: master

  alluxio-master-3:
    image: alluxio/alluxio:2.6.0
    container_name: "alluxio-master-3"
    restart: always
    environment:
      ALLUXIO_JAVA_OPTS: >
        -Dalluxio.master.hostname=alluxio-master-3
        -Dalluxio.master.embedded.journal.addresses=alluxio-master-1:19200,alluxio-master-2:19200,alluxio-master-3:19200
        -Dalluxio.master.rpc.addresses=alluxio-master-1:19998,alluxio-master-2:19998,alluxio-master-3:19998
        -Dalluxio.worker.hostname=alluxio-worker
        -Dalluxio.master.mount.table.root.ufs=hdfs://namenode:9000/
        -Dalluxio.master.mount.table.root.option.alluxio.underfs.hdfs.configuration=/etc/hadoop/conf/core-site.xml:/etc/hadoop/conf/hdfs-site.xml
        -Dalluxio.security.authorization.permission.enabled=false
      SERVICE_PRECONDITION: "namenode:9000"
    volumes:
      - ./core-site.xml:/etc/hadoop/conf/core-site.xml
      - ./hdfs-site.xml:/etc/hadoop/conf/hdfs-site.xml
    command: master

  alluxio-worker:
    image: alluxio/alluxio:2.6.0
    container_name: "alluxio-worker"
    restart: always
    ports:
      - "29999:29999"
      - "30000:30000"
    shm_size: 1G
    environment:
      ALLUXIO_JAVA_OPTS: >
        -Dalluxio.master.hostname=alluxio-master-1
        -Dalluxio.master.embedded.journal.addresses=alluxio-master-1:19200,alluxio-master-2:19200,alluxio-master-3:19200
        -Dalluxio.master.rpc.addresses=alluxio-master-1:19998,alluxio-master-2:19998,alluxio-master-3:19998
        -Dalluxio.worker.hostname=alluxio-worker
        -Dalluxio.worker.ramdisk.size=1G
        -Dalluxio.master.mount.table.root.option.alluxio.underfs.hdfs.configuration=/etc/hadoop/conf/core-site.xml:/etc/hadoop/conf/hdfs-site.xml
        -Dalluxio.security.authorization.permission.enabled=false
      SERVICE_PRECONDITION: "alluxio-master-1:19998 alluxio-master-2:19998 alluxio-master-3:19998"
    volumes:
      - ./core-site.xml:/etc/hadoop/conf/core-site.xml
      - ./hdfs-site.xml:/etc/hadoop/conf/hdfs-site.xml
    command: worker

Please make sure to docker cp namenode:/${HADOOP_HOME}/etc/core-site.xml ./core-site.xml and docker cp namenode:/${HADOOP_HOME}/etc/hdfs-site.xml ./hdfs-site.xml before you deploy these services.
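
Under those assumptions, the full bring-up sequence might look like the following sketch. Note that stock Hadoop keeps these files under ${HADOOP_HOME}/etc/hadoop/, so adjust the source paths to wherever your namenode image actually stores them:

# Assumes HADOOP_HOME is set in the host shell; otherwise substitute the
# literal path (e.g. /opt/hadoop-3.2.1, as in the resourcemanager image above).
docker cp namenode:${HADOOP_HOME}/etc/hadoop/core-site.xml ./core-site.xml
docker cp namenode:${HADOOP_HOME}/etc/hadoop/hdfs-site.xml ./hdfs-site.xml
docker-compose up -d alluxio-master-1 alluxio-master-2 alluxio-master-3 alluxio-worker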
