Copying the data doesn't go through short-circuit when using `load` command.
Alluxio Version: v2.0.0
Describe the bug
To Reproduce
Copy the same data twice on the same node (a virtual machine). I expected the second copy to go through short-circuit, but the data is still read from remote nodes in Alluxio.
- Deploy the master, workers, and FUSE in Kubernetes.
- Go to the master and run distributedLoad on the whole directory in Alluxio:
time /opt/alluxio/bin/alluxio fs distributedLoad --replication 1 /training-data/images
...../training-data/images/train-00304-of-01024 loaded
..../training-data/images/train-00475-of-01024 loaded
/training-data/images loaded
real 151m11.859s
user 0m44.141s
sys 0m15.720s
- After that, check the persistence status. All the data is persisted.
bash-4.4# /opt/alluxio/bin/alluxio fs ls /training-data/images| wc -l
1152
bash-4.4# /opt/alluxio/bin/alluxio fs ls /training-data/images| grep PERSIST|wc -l
1152
bash-4.4# /opt/alluxio/bin/alluxio fs ls /training-data/images| grep -v PERSIST|wc -l
0
- I’ve checked that the data is fully loaded into Alluxio and that it is spread across different nodes.
bash-4.4# /opt/alluxio/bin/alluxio fsadmin report ufs
Alluxio under storage system information:
oss://imagenet-huabei5/images on /training-data/images (oss, capacity=-1B, used=-1B, read-only, not shared, properties={fs.oss.accessKeySecret=******, fs.oss.accessKeyId=******, fs.oss.endpoint=oss-cn-internal.aliyuncs.com})
/opt/alluxio-2.0.0/underFSStorage on / (local, capacity=4843.27GB, used=-1B(0%), not read-only, not shared, properties={})
bash-4.4# /opt/alluxio/bin/alluxio fsadmin report capacity
Capacity information for all workers:
Total Capacity: 9.38TB
Tier: MEM Size: 1600.00GB
Tier: SSD Size: 7.81TB
Used Capacity: 143.67GB
Tier: MEM Size: 143.67GB
Tier: SSD Size: 0B
Used Percentage: 1%
Free Percentage: 99%
Worker Name Last Heartbeat Storage Total MEM SSD
192.168.0.117 0 capacity 600.00GB 100.00GB 500.00GB
used 32.07GB (5%) 0B 0B
192.168.0.118 0 capacity 600.00GB 100.00GB 500.00GB
used 39.50GB (6%) 0B 0B
192.168.0.119 0 capacity 600.00GB 100.00GB 500.00GB
used 37.40GB (6%) 0B 0B
192.168.0.120 0 capacity 600.00GB 100.00GB 500.00GB
used 34.70GB (5%) 0B 0B
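As a rough check on where the data sits, the per-worker "used" figures from the report above can be turned into shares. This is only a sketch: it assumes the used capacity consists entirely of the loaded dataset and that, with `--replication 1`, each block lives on exactly one worker. The `local_share` helper is a name introduced here for illustration.

```python
# Per-worker "used" values copied from the capacity report above (GB).
used_gb = {
    "192.168.0.117": 32.07,
    "192.168.0.118": 39.50,
    "192.168.0.119": 37.40,
    "192.168.0.120": 34.70,
}

# Should match the "Used Capacity" line of the report (143.67GB).
total_gb = sum(used_gb.values())


def local_share(worker: str) -> float:
    """Fraction of the cached data held by the given worker (illustrative helper)."""
    return used_gb[worker] / total_gb


for worker, gb in used_gb.items():
    print(f"{worker}: {gb:.2f} GB ({local_share(worker):.1%} of cached data)")
print(f"total: {total_gb:.2f} GB")
```

Under those assumptions, 192.168.0.120 holds roughly a quarter of the cached data, so a client on that node can read at most about that fraction locally unless reading also caches a local copy.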
- I went to the node (192.168.0.120) and copied the data for the first time. I think the data should also be downloaded to the local node.
time cp -r /alluxio-fuse/training-data/images /test
real 10m23.516s
user 0m1.190s
sys 2m40.581s
- Check the metrics. Most of the data comes from remote nodes in Alluxio, which is as expected.
bash-4.4# ./bin/alluxio fsadmin report metrics
Total IO:
Short-circuit Read 0B
Short-circuit Read (Domain Socket) 1232.86GB
From Remote Instances 220.87GB
Under Filesystem Read 29.00MB
Alluxio Write 0B
Alluxio Write (Domain Socket) 718.40GB
Under Filesystem Write 0B
Total IO Throughput (Last Minute):
Short-circuit Read 0B
Short-circuit Read (Domain Socket) 19.93MB
From Remote Instances 69.23MB
Under Filesystem Read 0B
Alluxio Write 0B
Alluxio Write (Domain Socket) 0B
Under Filesystem Write 0B
Cache Hit Rate (Percentage):
Alluxio Local 0.00
Alluxio Remote 99.99
Miss 0.01
- I went to the node (192.168.0.120) and copied the data a second time. I think the data should now be read locally.
time cp -r /alluxio-fuse/training-data/images /test
real 10m23.516s
user 0m1.190s
sys 2m40.581s
- But the metrics show that the data is still downloaded from remote nodes in Alluxio.
# ./bin/alluxio fsadmin report metrics
Total IO:
Short-circuit Read 0B
Short-circuit Read (Domain Socket) 1266.88GB
From Remote Instances 325.68GB
Under Filesystem Read 29.00MB
Alluxio Write 0B
Alluxio Write (Domain Socket) 718.40GB
Under Filesystem Write 0B
Total IO Throughput (Last Minute):
Short-circuit Read 0B
Short-circuit Read (Domain Socket) 13.62MB
From Remote Instances 37.96MB
Under Filesystem Read 0B
Alluxio Write 0B
Alluxio Write (Domain Socket) 0B
Under Filesystem Write 0B
Cache Hit Rate (Percentage):
Alluxio Local 0.00
Alluxio Remote 99.99
Miss 0.01
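To quantify how much of the second copy was served remotely, the cumulative "Total IO" counters from the two metrics reports can be subtracted. This is a minimal sketch that assumes the counters are cluster-wide totals and that no other client read data between the two reports; the dictionary keys and the `delta` helper are names chosen here for illustration.

```python
# Cumulative "Total IO" counters (GB) copied from the two metrics reports above.
after_first_copy = {"domain_socket": 1232.86, "remote": 220.87}
after_second_copy = {"domain_socket": 1266.88, "remote": 325.68}


def delta(before: dict, after: dict) -> dict:
    """Per-counter growth between two reports (illustrative helper)."""
    return {key: after[key] - before[key] for key in before}


# Traffic attributed to the second copy, under the assumptions stated above.
second_copy = delta(after_first_copy, after_second_copy)
total_read = sum(second_copy.values())
remote_fraction = second_copy["remote"] / total_read

print(f"short-circuit (domain socket): {second_copy['domain_socket']:.2f} GB")
print(f"from remote instances:         {second_copy['remote']:.2f} GB")
print(f"remote fraction:               {remote_fraction:.1%}")
```

Under those assumptions, about three quarters of the bytes read during the second copy still came from remote instances, which suggests the first copy did not leave a local replica of the remote blocks on 192.168.0.120.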
Issue Analytics
- State:
- Created 4 years ago
- Comments:7 (7 by maintainers)
Top GitHub Comments
It’s not a blocking issue, but it’s still critical. For performance-sensitive distributed applications, loading data over the network also consumes a large share of the network bandwidth, which should be reserved for the application’s own inter-node communication.
@cheyang @zrss has this been addressed in recent versions?