Alluxio worker on K8s does not support shared storage
Is your feature request related to a problem? Please describe.
As of Alluxio 2.5, the Alluxio workers assume the tiered storage directories (disks) are exclusively owned by the Alluxio worker. For example, if the tier 0 dir is `/data/alluxio`, then the Alluxio worker writes to `/data/alluxio/alluxioworker`.
For that reason, Alluxio workers on K8s only support the PV types `local` and `hostPath`, because we want the worker tiered storage to be exclusive (not shared by other workers). Also, the worker pods are deployed with a `DaemonSet`, which means we cannot define separate PVs for each worker pod. We can only use one `local` or `hostPath` PV, which each worker translates to its own local path.
In K8s there is a use case where the Alluxio workers run on physical machines that do not have local storage. In that case each machine is backed by storage PVs from Ceph, which are read-write-many. Under this circumstance it is hard to define a tiered storage because:
- If we point the workers at this shared Ceph PV, each worker will think the shared disk is dedicated to it, so the behavior is undefined. Every worker assumes that whatever is under `alluxioworker/` belongs to it alone, but in fact all the workers are writing files under `alluxioworker/`.
- It is hard to define a dedicated PV for each worker, because the workers are deployed with a `DaemonSet`, and all pods created from the `DaemonSet` share the same PVCs.
Describe the solution you’d like
We can enable the worker tiered storage to be shared directories by figuring out a way to distinguish the `alluxioworker/` dir between workers. One naive idea is to change the data dir to `alluxioworker-UUID/`.
There may be confusion in handling lost workers and their recreation.
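A minimal sketch of that naive idea (the class and method names below are hypothetical, not Alluxio's actual code): each worker derives its own subdirectory under the shared tier path instead of the fixed `alluxioworker/` name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

public class SharedTierLayout {
    // Today (Alluxio 2.5): every worker writes to the same fixed subdirectory,
    // e.g. /data/alluxio/alluxioworker, which collides on a read-write-many PV.
    static Path currentWorkerDir(String tierPath) {
        return Paths.get(tierPath, "alluxioworker");
    }

    // Naive idea from this issue: suffix the directory with a per-worker UUID,
    // e.g. /data/alluxio/alluxioworker-3f8a..., so workers on a shared PV do not clash.
    // Open question: the UUID must survive pod restarts, or lost-worker cleanup must
    // garbage-collect orphaned alluxioworker-<UUID>/ dirs after recreation.
    static Path uniqueWorkerDir(String tierPath) {
        return Paths.get(tierPath, "alluxioworker-" + UUID.randomUUID());
    }
}
```

The generated UUID would likely need to be persisted somewhere (for example on the shared PV itself) so a restarted worker can find its old directory rather than always creating a new one.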
Describe alternatives you’ve considered
Urgency MEDIUM
This will enable many more deployments in K8s. Worker locality is lost, but that can be acceptable for use cases with low locality requirements.
Additional context
Top GitHub Comments
@jiacheliu3, you are right. Ceph clusters are managed by Rook within the same Kubernetes cluster where Alluxio is running. In terms of data locality, we only lose node-level locality; the data is still within the same data center, on other nodes in the infrastructure where disks are available. With around 100Gbps of network, not having data local to the node is not going to have much impact.
In the use case for @nirav-chotai, the PV is from Ceph and it can be resized. I think Ceph is managing a pool of PVs (Nirav, please correct me if I’m wrong).
Yes, a loss of locality is inevitable in this setup, but without local storage on the worker physical nodes, that’s the best one can do I guess.